Ubuntu Landscape and MotD integration kills Gitlab SSH performance

Posted in Computing, Infrastructure Management, Systems Administration on June 15th, 2016 by Jeff

I had nothing to do with this discovery but my colleague Lance Johnston, who did, felt that we should share it because of a lack of information about the issue on the Internet.

I lead a team at work that, among other things, manages our version control system. As usage increased, we started to have performance issues with our Gitlab instance, and once they started to impact users we decided to restart the Gitlab services as a first step.

When we checked before stopping the services we observed a pretty high load average, around 10-12, which we expected to drop once the services were shut down. However, with the services off the load did not go down, which was very curious.

Lance investigated, and as he watched ‘top’ he observed batches of inbound ssh connections, as one would expect. But each time the connections arrived, he immediately saw another batch of processes named ‘landscape-sysinfo’.

A little digging turned up some information indicating that whenever a shell is spawned, such as when there’s an ssh connection, the Message of the Day is presented. The MotD runs the ‘landscape-sysinfo’ program to collect metrics that are presented to users when they log in. We have literally hundreds of ssh connections at any given time as Jenkins and developers do their jobs, so this program was producing a consistently high load average.

Since the vast majority of ssh connections are not interactive, we disabled the Message of the Day and the load dropped immediately to 0.01 with the Gitlab services off. When they were turned back on the load stabilized around 0.7, and during the work day it doesn’t go over 5 during usage spikes.
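
If you want to check which MotD scripts a server would run at login, a small Python sketch like the one below can help. It is only an illustration and not the exact steps we took: it assumes the stock Ubuntu layout, where the dynamic MotD is built from the executable scripts in /etc/update-motd.d, and the function name is made up for this example.

```python
import os
import stat


def executable_motd_scripts(directory="/etc/update-motd.d"):
    """Return the sorted names of executable regular files in `directory`.

    The default path is an assumption about a stock Ubuntu install.
    """
    names = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        try:
            # os.stat follows symlinks, so a linked script is judged by its target.
            mode = os.stat(path).st_mode
        except OSError:
            continue  # broken symlink or unreadable entry
        is_executable = mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        if stat.S_ISREG(mode) and is_executable:
            names.append(entry)
    return names


if __name__ == "__main__":
    if os.path.isdir("/etc/update-motd.d"):
        for name in executable_motd_scripts():
            print(name)
```

On a server with Landscape’s sysinfo integration installed, you would expect a landscape-sysinfo entry to show up in that list. Only executable scripts are run, so clearing the execute bit on an entry is one common way to switch a single script off without touching the others.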

Store Time Machine Backups on an Ubuntu Server

Posted in Infrastructure Management, Personal Computing, Systems Administration, Uncategorized on April 19th, 2014 by Jeff

I found this concise article (author’s claim verified) on setting up Mac OS X Time Machine backups on a network drive. I tried using SMB/CIFS to no avail, but setting up a Netatalk share did the trick!

Note that I did not modify the Avahi configuration since it wasn’t necessary to make the share usable for backups.

Processing files from S3 with Cascading

Posted in Big Data, Computing, Data Management, Software Development on August 10th, 2013 by Jeff

Cascading is a Hadoop ecosystem framework that provides a higher-level abstraction over MapReduce. I recently worked on a Cascading prototype that would read log files from an Amazon Web Services S3 bucket, do a minor transform, land the output in HDFS, then move the files to another S3 bucket configured for archiving.
read more »

Netalyzr – Network debugging tool

Posted in Infrastructure Management on January 1st, 2012 by Jeff

I’ve had a transient issue with my Internet access randomly “going away”. It’s annoying but generally clears up within a minute or two. I came across a tool called Netalyzr by a group within UC Berkeley. Netalyzr is a Java application available as either an in-browser Applet or a command line utility. It runs a number of network connectivity tests and provides a detailed report hosted on their web site that uses a simple red/yellow/green motif to show problems and their relative importance.

While Netalyzr didn’t clearly show what was going on with my Internet connection, it did raise a red flag about network buffers that might be the issue. Unfortunately, that’s a router configuration issue on the part of my ISP, so I’m not hopeful for a resolution. But I can always gather data and then open a trouble ticket with the vendor.

Regardless, Netalyzr looks like a great tool for troubleshooting connectivity issues.

Prey Project, ping and Cygwin

Posted in Computing, Personal Computing on September 10th, 2011 by Jeff

File this in the obscure issues department…

The Prey Project looked like a nice system for tracking stolen devices and has gotten a lot of good press recently, so I decided to try it out. After getting everything set up and working I noticed a lot of Cygwin bash shells running the ping command. The commands accumulated and eventually degraded system performance, which is when I noticed.

Prey contains a partial UNIX environment (MinGW) and consists of shell scripts wrapping a number of UNIX utilities compiled for Windows. I say partial because it doesn’t include the “ping” command, which is a dependency for the software. And the shell scripts apparently don’t take into account the possibility that a user has other UNIX-like environments installed (Cygwin also has a bash shell and a ping command, but there are others as well). So what was happening is that the script (pull) naively checks which operating system it is installed on, looks for a ping command, and issues what it believes are the correct command line arguments. For Windows it’s this:

ping -n 1 www.google.com

This doesn’t work because Cygwin’s ping.exe doesn’t have a “-n” switch. But for some reason it doesn’t fail when it encounters the invalid option. Rather, it tries to ping the IP address 0.0.0.1. That doesn’t work, of course, but the ping command tries forever, so new instances of the bash shell and ping keep piling up until it kills your computer.
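
For illustration, here is a rough Python sketch of a connectivity check that avoids both problems. It is not Prey’s code and not the change I made to the script. It rests on two assumptions: that the native Windows ping lives at System32\ping.exe under the system root, so calling it by full path sidesteps whatever other ping is on the PATH, and that non-Windows pings take “-c” for the count. The function names, the default system root, and the five second timeout are all made up for the example.

```python
import ntpath
import subprocess


def build_ping_command(host, system, system_root=r"C:\Windows"):
    """Build a one-shot ping command for the given platform name."""
    if system == "Windows":
        # Use the native ping by absolute path so a Cygwin (or other)
        # ping found earlier on the PATH is never picked up by accident.
        ping = ntpath.join(system_root, "System32", "ping.exe")
        return [ping, "-n", "1", host]
    # Assumption: Unix-style pings accept -c for the packet count.
    return ["ping", "-c", "1", host]


def is_reachable(host, system, timeout_seconds=5.0, runner=subprocess.run):
    """Run a single ping; treat a hang, a missing binary, or a failure as 'no'."""
    command = build_ping_command(host, system)
    try:
        result = runner(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,  # the child is killed if it runs too long
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0
```

Calling it would look like is_reachable("www.google.com", platform.system()). The timeout is the important part: even if the wrong ping is invoked and never returns, the check gives up after a few seconds instead of leaving processes behind.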

Anyway, I hard-coded a change to the script on my system and filed a bug with the Prey developers.

I also submitted an email to the Cygwin mailing list describing the Cygwin ping issue.