Friday 24 May 2013

Creating a check for a device with SNMP and Nagios

I'm assuming your monitoring software is based on Nagios.

The first step is to see whether a check already exists in your monitoring system.

If not, check http://exchange.nagios.org/. Download the script and read it. Make sure you understand it, then test it.

If you can't find one, you have two options: create one from scratch or use check_snmp.

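To use check_snmp you need to know the correct OIDs. Contact the vendor of the device or check the documentation; sometimes they have all of this in one document. Otherwise use snmpwalk:
snmpwalk -v2c -c communityname 192.168.1.10
You'll have to go through all the OIDs and find the value you are interested in monitoring. With no OID argument, snmpwalk only walks the standard mib-2 subtree, so for vendor-specific values you may need to give it a starting OID such as 1.3.6.1.4.1 (enterprises). Once you have the OID, you can set up check_snmp with it.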

snmpwalk -v2c -c communityname 192.168.1.10 1.3.6.1.4.1.20632.5.14
SNMPv2-SMI::enterprises.20632.5.14 = STRING: "42.0 degrees C"

That OID "1.3.6.1.4.1.20632.5.14" is for CPU temp.
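
Going through a full walk by hand gets tedious, so it can help to filter the output for text you expect to see in the value. This is only a minimal sketch, assuming the net-snmp output format shown above ("OID = TYPE: value"). filter_walk is a made-up helper name, and the community and address are the placeholders from the earlier example.

```shell
# filter_walk KEYWORD: read snmpwalk output on stdin and print only the
# lines whose value part (after " = ") contains KEYWORD, ignoring case.
filter_walk() {
    awk -v kw="$1" '
        BEGIN { kw = tolower(kw) }
        {
            pos = index($0, " = ")
            if (pos == 0) next                    # not an "OID = value" line
            value = tolower(substr($0, pos + 3))  # text after " = "
            if (index(value, kw) > 0) print
        }'
}

# Example (needs a reachable device):
# snmpwalk -v2c -c communityname 192.168.1.10 1.3.6.1.4.1 | filter_walk degrees
```

Searching on the value rather than the OID is deliberate: without the vendor MIB loaded, the OID names are just numbers, but the values often contain a readable unit.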

Let's try it with check_snmp:
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14
SNMP OK - "43.0 degrees C" |

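With no other options, this check returns OK as long as the device answers with a value, whatever that value is. That's fine for confirming the OID responds, but it's no good for alerting on a value that changes. For that you can use the -r switch, which matches the returned string against a regular expression.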

This means the check will be OK so long as the temperature is in the 40s (40 to 49 degrees):
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14 -r "4[0123456789]"
SNMP OK - "42.0 degrees C" |

If I change the pattern to match 50 to 59, it alerts, because 42 no longer matches:
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14 -r "5[0123456789]"
SNMP CRITICAL - *"42.0 degrees C"* |
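
Before putting a pattern into the check, it can be worth trying it against a few sample strings. The sketch below uses grep -E as a stand-in for the matching that -r does. That the two behave the same for a simple pattern like this is an assumption, and regex_state is a made-up helper name.

```shell
# regex_state PATTERN VALUE: print OK if VALUE matches PATTERN, else CRITICAL.
regex_state() {
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo OK
    else
        echo CRITICAL
    fi
}

# Try the 40-49 pattern against a few made-up readings.
for value in '"39.0 degrees C"' '"42.0 degrees C"' '"50.0 degrees C"' '"140.0 degrees C"'; do
    printf '%s -> %s\n' "$value" "$(regex_state '4[0123456789]' "$value")"
done
```

Two things show up here. A reading below 40 fails the pattern as well, and because the pattern is not anchored, 140.0 matches too. If what you want is really "anything under 50", a pattern along the lines of '^"?[0-4]?[0-9]\.' may be closer, but try it against your device's real output first.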

If you want to write a check from scratch, it's a good idea to look at some of the checks already on http://exchange.nagios.org/. You'll need to collect all the OIDs you need, and you'll also have to figure out what each value means. This can be a lot of work: forcing certain situations (unplugging cables etc.) and checking the values returned. This is why most people just use check_snmp with a simple ok / not ok check.
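
To give an idea of what a from-scratch check involves, here is a rough sketch for the temperature OID used above. It is not a finished plugin. temp_state is a made-up function name, the 45 and 50 degree thresholds are invented for illustration, and it assumes the value comes back as a string like "42.0 degrees C". The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
# temp_state VALUE WARN CRIT
# VALUE is the raw SNMP string, e.g. "42.0 degrees C".
# Prints a status line with perfdata and returns the Nagios exit code.
temp_state() {
    # Pull out the first number in the string (integer or decimal).
    temp=$(printf '%s\n' "$1" |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\(\.[0-9]*\)\{0,1\}\).*/\1/p')
    if [ -z "$temp" ]; then
        echo "TEMP UNKNOWN - could not parse '$1'"
        return 3
    fi
    # awk does the comparison because the shell only handles integers.
    level=$(awk -v t="$temp" -v w="$2" -v c="$3" 'BEGIN {
        if (t + 0 >= c + 0) print 2
        else if (t + 0 >= w + 0) print 1
        else print 0
    }')
    case "$level" in
        0) echo "TEMP OK - $temp degrees C | temp=$temp;$2;$3" ;;
        1) echo "TEMP WARNING - $temp degrees C | temp=$temp;$2;$3" ;;
        2) echo "TEMP CRITICAL - $temp degrees C | temp=$temp;$2;$3" ;;
    esac
    return "$level"
}

# Example use in a plugin script (needs a reachable device):
# value=$(snmpget -v2c -c cudaSNMP -Oqv 10.7.11.219 1.3.6.1.4.1.20632.5.14)
# temp_state "$value" 45 50
# exit $?
```

If snmpget fails and the value comes back empty, the function falls through to UNKNOWN rather than reporting OK, which is the behaviour you want when the device stops answering.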

