Standalone Sysadmin

A blog for IT Admins who do everything by an IT Admin who does everything
Updated: 5 hours 40 min ago

Thank you.

6 hours 36 min ago

If you’re reading this, you’re probably an IT administrator of some sort (or want to be one). So thank you. Thank you for making your own part of the internet go. Sure, I’m an administrator, but I also use the internet, and without people like you, I wouldn’t be able to write this page, or make a telephone call, or talk to people on IM or twitter, or do any of the countless things that I take for granted every day.

Thank you to all of the system, network, telephony, storage, application, and general IT administrators out there who make our modern life possible.


Etsy Now Sponsoring SysAdmin Appreciation Day Event in NYC!

Thu, 07/29/2010 - 12:15pm

Last night, I got an email out of the blue. It was from Chris Munns, sysadmin at Etsy, the home to a huge online community of people who make and sell things. The email basically asked if there was any way that Etsy could help sponsor the SysAdmin Appreciation Day event! Excellent.

The only question in my mind was, what kind of sponsorship would our event need? In the end, it winds up being a bunch of system administrators sitting around drinking, swapping war stories. I told them as much, and Chris responded:

Hey Matt,
Chad Dickerson who is the CTO here at Etsy was actually the one who wanted to us to help sponsor/participate. We were wondering if maybe we could just throw some money in for drinks on behalf of Etsy?

- Chris

Pick up some of the bar tab? Well, ok!

After some more discussion, we’ve got it settled down, and I am happy to say that Etsy is contributing a very significant amount towards our bar tab tomorrow. I’m not going to say how much just yet, because I haven’t worked out how it’s going to be handled, but I’ll be surprised if anyone ends up paying for a drink themselves.

A huge(!) thank you to Etsy! And if you’re wondering why a site largely dedicated to crafting cares this much about the community of System Administrators, you should read their blog, Code As Craft. They believe strongly in Dev/Ops cooperation, and they spend a lot of time on that blog discussing their infrastructure. If you’re interested in Hadoop installations and continuous deployments, I recommend you check it out.

If you were holding back because you didn’t want to spend the dough on drinks, then don’t be afraid any longer. Check out the event page, then register!



If the sysadmin of the year is for good work…

Wed, 07/28/2010 - 4:26am

…do we have an appropriate award for doing bad work?

I’m only asking, because today on reddit, I came across an amazing post.

There is a subreddit called IAMA, where you can submit a thread allowing people to ask you questions because you are, in some way, unusual or interesting. The thread I found was called “IAMA Wildly Incompetent Network Security Admin and have no business in my job”.

The job? He’s network security for a Vegas casino.


When you actually click on the thread, it gets way, way worse. There’s a summary at the top, so I’m stealing some of it and pasting it here. This is all copyright of reddit user throwawayscared; I don’t want it.

Since alot of people are asking this question: The reason I dont spend time learning the job is partly due to laziness. I mean it’s awesome spending all day playing battlefieldheroes or transformice.

I refuse to wear my ID badge so people dont stop and ask me questions. I’ve been reprimanded and even warranted the CEO sending out a memo that stated ‘EVERYONE HAS TO WEAR THEIR BADGE’ and I still dont do it. I just changed my schedule to leave earlier than any execs and get in after they do so they never see me without it.
also working at a casino means you get free lunches too. we’re only supposed to eat once, but i go several times throughout the day. I once changed the settings on the turnstyle applicatoin to allow me unlimited cafeteria entries. Everyone else was set at 1. The benefits of admin passwords

To further prove how much I should be fired, I’d like to share a quick story with you. I have stolen every bit of computer shit I can get my hands on. When the security team started cracking down on thieving employees and searching us on the way out, I just started mailing the shit to my house through the mailroom. Then I just started listing shit on ebay and sending it to the buyers right through the same mailroom. I also convinced the mailroom dude that I should’t pay for postage. I’m not proud, but I’m certainly not ashamed.

Wow. It’s like a trainwreck.

Please, don’t be this guy.


1 week to SysAdmin Appreciation Day

Fri, 07/23/2010 - 2:52pm

Just a quick reminder that if you’re in the New York City area next Friday, then you should come celebrate System Administrator Appreciation Day with us at The Gingerman. It’s on the east side of Manhattan, on 36th street. It’s an easy walk from Grand Central Terminal, and not too far from Penn Station, either.

Come one, come all. Raise a pint to…well…ourselves!


Note: You don’t have to sign up to show up, but it helps me keep track of how many people will be there. Significant Others Welcome!


Replacement Dell PowerEdge R410 Motherboards Compromised

Wed, 07/21/2010 - 11:09am

This is probably not quite the news that Dell wanted to get…

According to an article at The Register, Dell service has shipped replacement motherboards that contained spyware, presumably placed there at the manufacturing site.

The original post on the Dell Community Forums has this quote from a Dell rep:

As part of Dell’s quality process, we have identified a potential issue with our service mother board stock, like the one you received for your PowerEdge R410, and are taking preventative action with our customers accordingly. The potential issue involves a small number of PowerEdge server motherboards sent out through service dispatches that may contain malware. This malware code has been detected on the embedded server management firmware as you indicated.

We take matters of information security very seriously and believe that any impact to a customer’s information security is unlikely. To date we have received no customer reports related to data security. Systems running non-Windows operating systems are not vulnerable to this malware and this issue is not present on motherboards shipped new with PowerEdge systems.

We have assembled a customer list and are directly contacting customers like you through a call campaign. On the call, you should be provided a phone number to call if you have additional questions. Hopefully you received this on your call. If not, let me know and we’ll get it to you as soon as possible so you have all of the follow-up information needed.

Dell’s apparently being proactive about it…but what other option do they have? “Our factory-supplied boards come enhanced with spyware” isn’t exactly the ideal sales pitch.

If you have recently gotten a replacement motherboard for a new gen Dell PowerEdge, you might want to call your rep to make sure you’re not affected.


Introduction to SNMP

Sat, 07/10/2010 - 12:03am

Introduction to what? This isn’t going to be a “how to configure SNMP for your server” kind of introduction. I’m no great expert there, but if there’s call for it, I can share my configuration bits to help. This is more of a “what the heck is SNMP” introduction. Hopefully it’ll be more valuable, since there are reams of existing documentation on how to actually configure the services, and not so many on why you should care.

Really, the system administration world is divided into two camps: those of us who want to monitor our servers and network gear and collect performance metrics so that we can trend future usage, and those of us who don’t yet know that we want those things. The former group uses SNMP. The latter group will probably get something out of this post.

If you’re new to the idea of SNMP, bear with me for a second. Suppose that it would be handy to remotely query all of your network devices and retrieve stats from them. If you’re familiar with the concept of logging into, say, a router, you know that you can get information that way. If you buy intelligent switches, you know that you can telnet, ssh, or web-browse to the switch interface and check out what’s going on that way. Likewise, you can log into your servers and check out the stats there, but overall, there are as many different ways of getting this information as there are devices that you want it from. That’s no good, because no one wants to script that many possible interactive sessions.

This is the problem that SNMP was meant to address. SNMP stands for Simple Network Management Protocol, and it is just a well-agreed-upon language (protocol) that almost all network devices speak. By using SNMP, you can effectively move beyond the normal administrative interface of your network device and just query it for information. Sounds great, right? It is!

Well, ok, it can be. From this pristine dream of a one-ness of network devices, we muddy the waters a bit when it comes to the specifics. As of right now, there are three different versions of the SNMP protocol, with the primary differences between v1 and v2 being capabilities, and the primary differences between v2 and v3 being security.

Starting with v1, and continuing with v2, SNMP didn’t actually have “usernames” and “passwords”, so much. It instead had “community strings”, which function as passwords, but without all of those messy account details getting in the way. Typically speaking, there was a community string for reading data (the default was “public”), and a community string for writing settings, when the device supported that (with a default of, yep, you guessed it, “private”). It’s hard to imagine why anyone thought this was an insecure protocol, but apparently some people were uncomfortable with the idea of all of their machines being monitored remotely with no accountability whatsoever. Weird, I know.

That brought about the idea of SNMP v3, which packs as many security features into it as the previous versions lacked. In fact, that’s pretty much all it does. The actual requests are still essentially the v2 ones, wrapped in extra security layers. SNMP v3 replaces community strings with user accounts, and at its strictest security level (authPriv) each message is both authenticated with an HMAC (based on MD5 or SHA-1) to guarantee that it wasn’t altered in transit, and encrypted (DES in the original standard, with AES added later and 3DES offered by some vendors) to protect the data. Lower security levels let you drop the encryption, or both the encryption and the authentication. Because yeah, that’s not overkill for me querying the number of bits transmitted since the last time I asked.

Anyway, to use the universal car analogy, you can either have the jeep with no roll cage (v1/2) or the armored tank (v3).

Honestly, I use SNMP v2, and as much as I hate to admit it, I have a nearly universal read-only community string that I use for it. It’s not “public”, and I disable the write-access community string, but I run old hardware. A lot of it doesn’t work with v3. In fact, some of it doesn’t even work with v2, but for everything that does, I use v2. It is noticeably faster, and as far as security is concerned, 99% of my things are internal on a private IP-based switched network. If someone is sniffing my packets, I have bigger issues than my read-only community string being compromised. You, on the other hand, may want to check things over the internet. In that case, use SNMP v3. The encryption will be worth the time you invest.

So that’s an introduction to what the versions are, but that’s not much of an explanation of what SNMP *IS*. SNMP is a logical tree.

Imagine that you’re an snmp server in the mid 1990s. You don’t have a lot of RAM, but you have a lot of data to keep track of. Strange remote machines will be querying you to access this data. What method do you use to keep track of the data that they want?

In the case of SNMP, they used a tree. Every branch of the tree is separated from the parent and child branches by a period. Taken together, this string of numbers is called an OID, or Object IDentifier. The very top of the tree (or very bottom, depending on how you look at it) is the most abstracted…and you’re almost always going to see it start with a 1, which has been assigned to the International Organization for Standardization, or ISO. In fact, a lot of the OIDs that you run into will start with 1.3.6.1, which maps to ISO.identified-organization.dod.internet. You can browse the entire registered OID tree at http://www.oid-info.com, if you’re really bored.
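
If it helps to see the tree idea in code, here is a minimal sketch (mine, not anything that ships with SNMP) that treats an OID as a tuple of integers and checks whether one OID lives underneath another. The function names parse_oid and is_under are made up for illustration.

```python
def parse_oid(text):
    """Turn a dotted OID string such as '1.3.6.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))


def is_under(oid, subtree):
    """Return True if oid sits at or below subtree in the OID tree."""
    return oid[:len(subtree)] == subtree


# 1.3.6.1 is iso.identified-organization.dod.internet
INTERNET = parse_oid("1.3.6.1")
if_out_octets = parse_oid("1.3.6.1.2.1.2.2.1.16")

print(is_under(if_out_octets, INTERNET))         # an OID below 1.3.6.1
print(is_under(parse_oid("1.2.840"), INTERNET))  # an OID on another branch
```

Storing the OID as a tuple means "is this a child of that?" is nothing more than a prefix comparison.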

Alright, so imagine that you’ve browsed all the way down to 1.3.6.1.2.1.2.2.1.16. Great. What the heck does that mean, though?

The other great tree of numbers strung together with dots, IP addresses, had the same problem a long time ago, and so DNS was invented, to give IP addresses names that people can actually remember. For a very similar reason, there is a Management Information Base, or MIB, that maps OIDs to useful names. That 1.3.6.1.2.1.2.2.1.16 monstrosity above? Yeah, it actually means ifOutOctets, shorthand for interface output octets. It’s a 32-bit counter that shows the number of octets which have been output by each interface. When I query it (more on that shortly) on a machine with 5 interfaces, I get the following output:

IF-MIB::ifOutOctets.1 = Counter32: 2766014067
IF-MIB::ifOutOctets.2 = Counter32: 3209623655
IF-MIB::ifOutOctets.3 = Counter32: 3606918534
IF-MIB::ifOutOctets.4 = Counter32: 2521574893
IF-MIB::ifOutOctets.5 = Counter32: 0
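
Since these are 32-bit counters, they roll over to zero once they pass 2^32 - 1, so anything that graphs them has to subtract two readings carefully. Here is a rough sketch of that arithmetic. The first reading is the ifOutOctets.1 value above; the second reading and the 60 second interval are assumptions I made up for the example, and the function names are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # a Counter32 wraps back to 0 after 2**32 - 1


def counter32_delta(earlier, later):
    """Octets sent between two readings, allowing for a single wrap."""
    return (later - earlier) % COUNTER32_MODULUS


def bits_per_second(earlier, later, seconds):
    """Average output rate over the polling interval."""
    return counter32_delta(earlier, later) * 8 / seconds


first = 2766014067    # ifOutOctets.1 from the output above
second = 2766027795   # hypothetical later reading
interval = 60         # hypothetical seconds between the two polls

print(counter32_delta(first, second))
print(bits_per_second(first, second, interval))

# A reading that is smaller than the previous one means the counter wrapped.
print(counter32_delta(4294967000, 704))
```

This only copes with one wrap between polls, which is why busy interfaces need to be polled often.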

There are some very standard OIDs that are universal across pretty much all devices. On the other hand, many devices have specialized OIDs that you probably wouldn’t otherwise find (and certainly wouldn’t know what they meant!) unless you had the specific MIB for that device. For this reason, many manufacturers have made their MIBs available for download, but there are also websites that archive MIBs and make them searchable by the public. This can be a huge help if you want to know how many VPN users are currently logged in, or really anything else that is non-standard or hard to find.

Think of the MIB files as a map to the information you want to look for.
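
To make the "map" idea concrete, here is a toy lookup table in Python. It is only a sketch: a real MIB file is far richer than a dictionary, and this table holds just a handful of well-known entries that I typed in by hand. The names MIB_NAMES and translate are made up for the example.

```python
# A tiny, hand-written stand-in for a MIB: numeric OID -> friendly name.
MIB_NAMES = {
    (1, 3, 6, 1, 2, 1, 1, 1): "sysDescr",
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 2): "ifDescr",
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 10): "ifInOctets",
    (1, 3, 6, 1, 2, 1, 2, 2, 1, 16): "ifOutOctets",
}


def translate(dotted):
    """Replace the longest known OID prefix with its name, keep the rest."""
    oid = tuple(int(part) for part in dotted.strip(".").split("."))
    for length in range(len(oid), 0, -1):
        name = MIB_NAMES.get(oid[:length])
        if name is not None:
            suffix = ".".join(str(number) for number in oid[length:])
            return f"{name}.{suffix}" if suffix else name
    return dotted  # nothing in our little table matched


print(translate("1.3.6.1.2.1.2.2.1.16.1"))
print(translate("1.3.6.1.4.1.9"))
```

If you have net-snmp installed, its snmptranslate tool does this job properly, using the actual MIB files.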

Now, how to actually get that information out of the device…

If you want to query by hand (certainly only a temporary measure), in the Unix/Linux world, I recommend net-snmp. It includes a suite of tools to poke and prod SNMP-enabled devices, but the two things that I use the most are snmpwalk and snmpget.

The block of results above was retrieved using snmpwalk. What I did was issue the following command:

snmpwalk -v 2c -c CommunityString servername 1.3.6.1.2.1.2.2.1.16

If you notice, the output from that command returned 5 lines, with the first field of each line ending in “ifOutOctets.#”, where # is the number of the interface. That’s because the actual OID of each of those values was 1.3.6.1.2.1.2.2.1.16.#! If I try to use ‘snmpget’ (which, unlike snmpwalk, only returns one result), it fails:

snmpget -v 2c -c CommunityString servername 1.3.6.1.2.1.2.2.1.16
IF-MIB::ifOutOctets = No Such Instance currently exists at this OID

However, specifying the correct OID does the trick:

snmpget -v 2c -c CommunityString servername 1.3.6.1.2.1.2.2.1.16.1
IF-MIB::ifOutOctets.1 = Counter32: 2766027795

What ‘snmpwalk’ actually does is walk the tree. I specified ‘1.3.6.1.2.1.2.2.1.16’, so it said “alright, I’m going to start at that OID and keep asking the device for whatever comes next: ‘.1’, then ‘.2’, and so on,” until the device hands back something outside of that branch (or says that there isn’t anything left). By this method, you can actually query a huge part (or even all) of the tree.
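
Under the hood, snmpwalk does this with SNMP GETNEXT requests: "give me whatever comes after this OID". Here is a small simulation of that loop against a pretend device, so you can see where the walk stops. This is a sketch of the idea, not net-snmp’s code; the AGENT_DATA contents are made up (the .17 entry and its value are only there to give the walk something to run into), and the helper names are hypothetical.

```python
import bisect

# A pretend device: OID tuple -> value. Values are for illustration only.
IF_TABLE = (1, 3, 6, 1, 2, 1, 2, 2, 1)
AGENT_DATA = {
    IF_TABLE + (2, 1): "lo",
    IF_TABLE + (2, 2): "eth0",
    IF_TABLE + (16, 1): 2766014067,
    IF_TABLE + (16, 2): 3209623655,
    IF_TABLE + (17, 1): 12345,  # made-up value in the next column over
}


def get_next(sorted_oids, oid):
    """Return the first OID that sorts strictly after oid, or None."""
    index = bisect.bisect_right(sorted_oids, oid)
    return sorted_oids[index] if index < len(sorted_oids) else None


def walk(data, root):
    """Collect every (oid, value) under root by repeated get_next calls."""
    sorted_oids = sorted(data)
    results = []
    current = get_next(sorted_oids, root)
    # Stop when the device runs out of OIDs or steps outside our branch.
    while current is not None and current[:len(root)] == root:
        results.append((current, data[current]))
        current = get_next(sorted_oids, current)
    return results


for oid, value in walk(AGENT_DATA, IF_TABLE + (16,)):
    print(".".join(str(number) for number in oid), "=", value)
```

Python compares tuples element by element, which happens to be the same ordering SNMP uses for OIDs, so a plain sort is enough here.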

In this case, I knew I had 5 interfaces, numbered 1-5 (according to the OID results from snmpwalk), but I didn’t know which interface was registered as which number…I did know, however, that one of the interfaces was called ‘eth0’, so I shaved some numbers off of the OID, and executed this snmpwalk:

snmpwalk -v 2c -c CommunityString servername 1.3.6.1.2.1.2.2 | grep eth0
IF-MIB::ifDescr.2 = STRING: eth0

Excellent. At this point, I know that ifDescr is the name (registered in the MIB) that holds the interface descriptions. So I just execute an snmpwalk against that:

snmpwalk -v 2c -c CommunityString servername ifDescr
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifDescr.3 = STRING: eth1
IF-MIB::ifDescr.4 = STRING: bond0
IF-MIB::ifDescr.5 = STRING: sit0

Easy as pie.
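
If you want to line the two walks up, so that each counter is labelled with its interface name instead of a bare index, a few lines of Python will do it. This is a sketch that parses the text output shown in this post; the regular expression assumes lines formatted exactly like the ones above, and parse_walk is a name I made up.

```python
import re

# Matches lines like "IF-MIB::ifDescr.2 = STRING: eth0"
LINE = re.compile(r"^[\w-]+::(\w+)\.(\d+) = (\w+): (.*)$")

DESCR_OUTPUT = """\
IF-MIB::ifDescr.1 = STRING: lo
IF-MIB::ifDescr.2 = STRING: eth0
IF-MIB::ifDescr.3 = STRING: eth1
IF-MIB::ifDescr.4 = STRING: bond0
IF-MIB::ifDescr.5 = STRING: sit0
"""

OCTETS_OUTPUT = """\
IF-MIB::ifOutOctets.1 = Counter32: 2766014067
IF-MIB::ifOutOctets.2 = Counter32: 3209623655
IF-MIB::ifOutOctets.3 = Counter32: 3606918534
IF-MIB::ifOutOctets.4 = Counter32: 2521574893
IF-MIB::ifOutOctets.5 = Counter32: 0
"""


def parse_walk(text):
    """Return {interface index: value string} from snmpwalk text output."""
    rows = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            rows[int(match.group(2))] = match.group(4)
    return rows


descr = parse_walk(DESCR_OUTPUT)
octets = parse_walk(OCTETS_OUTPUT)

# Join the two tables on the interface index.
by_name = {descr[i]: int(octets[i]) for i in sorted(descr) if i in octets}

for name, count in by_name.items():
    print(name, count)
```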

Of course, you don’t always want to query by hand…in fact, it’s probably the exception, rather than the rule. You want monitoring software to do all that stuff for you. Pretty much every piece of monitoring software known to man can query snmp directly (and if it can’t, you know how to query it via the command line now, so you can write a script to do it, if it’s absolutely necessary). Most of the graphing solutions like Cacti, MRTG, and everything else include code to query, and even Nagios has a check_snmp plugin (which I highly recommend using, rather than creatively solving the problem yourself).
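
If you really do end up writing that script yourself, the simplest approach is probably to wrap the same snmpget command line used earlier in this post. Here is a minimal sketch, assuming net-snmp’s snmpget is installed and on the PATH. The function names are hypothetical, and the host and community string are the same placeholders used in the commands above.

```python
import subprocess


def build_snmpget(host, community, oid, version="2c"):
    """Build the same snmpget command line used in this post, as a list."""
    return ["snmpget", "-v", version, "-c", community, host, oid]


def snmpget(host, community, oid, timeout=10):
    """Run snmpget and return its output line; raises if the command fails."""
    completed = subprocess.run(
        build_snmpget(host, community, oid),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout.strip()


# Show the command without running it, since "servername" is a placeholder.
print(" ".join(build_snmpget("servername", "CommunityString",
                             "1.3.6.1.2.1.2.2.1.16.1")))
```

Passing the arguments as a list rather than one big string keeps the shell out of the picture, so an odd character in a community string can’t turn into a surprise.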

This really only leaves one stone unturned. SNMP Traps. Essentially, SNMP traps are a way of letting the SNMP server stop being passively queried and start actively letting someone know that something is wrong. Configuring a trap involves specifying a remote server (or servers) to alert when something goes horribly awry.

The remote server specified needs to be listening for SNMP traps. In Unix/Linux, it’s not too difficult to get net-snmp to listen for them, and on Windows, there is software available to do the same thing. Here’s one I found with a quick search. I’m sure there are more, so if you have a favorite, please let us know what it is in the comments.

The only thing left is to tie your notification system into the trap server, but I’ll leave that as an exercise for the reader.

Thanks for reading, and hopefully you got something out of it. If you have a favorite SNMP tip or trick (or I screwed something up), let us know in the comments!


Progress report and vacation next week

Fri, 07/09/2010 - 8:59am

It appears that long-laid plans are finally coming to fruition.

That link was to a post written on June 4, 2009, the first time that I mentioned that I wanted to try puppet. And over a year later, here I am rolling puppet onto my production servers. It took forever, but there’s a lot of underlying infrastructure, too. The RPM building environment and skillset were the biggest hurdle. Compared to that, the repo and subversion repository were cake!

I do want to thank everyone who has given me a hand with my questions throughout the process. I feel like I’ve bugged R.I.Pienaar, Jordan Sissel, and Ben Cotton the most, but I appreciate everyone’s help.

You’re not going to hear too much out of me next week. I’m taking some vacation time and heading to the Damariscotta River Association’s Archaeological Field School in Maine. I’ll spend a week learning how to dig a hole.

By far, the most frequent question that I’ve gotten when I tell people about this is, “How did you find out about that?” As it turns out, there are some really good places online that list digs that you can attend. You do have to pay for them, typically, and you have to show up for a minimum amount of time, but you can go and learn how archaeology works. I used this database at Archaeological.org to find mine, but there are spots all over the world. If you’re into it, go find yourself one!

Also, in case you’re not sick of hearing about it, the last Friday of this month is the SysAdmin Appreciation Day Meetup in NYC. Remember, also, if you’re in the San Francisco Bay area, OpenDNS is throwing their own. There’s no reason that you can’t throw your own if you can’t come to ours. If you want to organize one, drop me an email and I’ll mention it on the blog.

Everybody have a good week!