
Friday, 24 May 2013

creating a check for a device with SNMP and Nagios

I'm assuming your monitoring software is based on Nagios.

The first step is to see whether a suitable check already exists in your monitoring system.

If not, look on http://exchange.nagios.org/. Download the script, read it until you understand it, and test it.

If you can't find one, you have two options: create one from scratch or use check_snmp.

To use check_snmp you need to know the correct OIDs. Contact the vendor of the device or check the documentation; sometimes all the OIDs are listed in one document. Otherwise use snmpwalk:
snmpwalk -v2c -c communityname 192.168.1.10
You'll have to go through all the OIDs and find the value you are interested in monitoring. You can then set up check_snmp with that OID.

snmpwalk -v2c -c communityname 192.168.1.10 1.3.6.1.4.1.20632.5.14
SNMPv2-SMI::enterprises.20632.5.14 = STRING: "42.0 degrees C"

That OID "1.3.6.1.4.1.20632.5.14" is for CPU temp.
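If you want the number on its own (for example to compare it against a threshold later), you can strip it out of the snmpwalk line. This is a minimal sketch, assuming the value always looks like the sample above; extract_temp is a hypothetical helper name, not part of Net-SNMP.

```shell
#!/bin/sh
# extract_temp: hypothetical helper. Reads snmpwalk output lines on stdin
# and prints the leading number found inside the quoted STRING value.
extract_temp() {
    sed -n 's/.*STRING: "\([0-9][0-9.]*\).*/\1/p'
}

# Sample line copied from the snmpwalk output above
line='SNMPv2-SMI::enterprises.20632.5.14 = STRING: "42.0 degrees C"'
printf '%s\n' "$line" | extract_temp   # prints 42.0
```

Lines that don't contain a quoted numeric STRING produce no output, so an empty result is a hint that the OID returns something else.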

Let's try it with check_snmp:
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14
SNMP OK - "43.0 degrees C" |

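With no other options, check_snmp returns OK as long as the device answers the query, whatever the temperature is. That only proves SNMP is working. To alert on the value itself you can use the -r switch, which checks the returned string against a regular expression.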

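With this pattern the check is OK while the temperature is in the 40s (40 to 49 degrees). Any value that doesn't match, including one below 40, is reported as CRITICAL: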
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14 -r "4[0123456789]"
SNMP OK - "42.0 degrees C" |

If I change the pattern to match 50 to 59, the current reading no longer matches and the check alerts:
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14 -r "5[0123456789]"
SNMP CRITICAL - *"42.0 degrees C"* |
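
You can try a pattern offline before handing it to -r. This is a rough sketch that uses grep -E as a stand-in for check_snmp's matching; I'm assuming the two agree for simple patterns like these. would_match is a hypothetical helper.

```shell
#!/bin/sh
# would_match: hypothetical helper. Prints OK if the reading matches the
# pattern and CRITICAL if it does not, mirroring the -r results above.
would_match() {
    pattern=$1
    reading=$2
    if printf '%s\n' "$reading" | grep -Eq -- "$pattern"; then
        echo "OK"
    else
        echo "CRITICAL"
    fi
}

would_match '4[0123456789]' '42.0 degrees C'   # reading is in the 40s
would_match '4[0123456789]' '39.0 degrees C'   # cooler, but still no match
would_match '5[0123456789]' '42.0 degrees C'   # pattern expects the 50s
```

The second case is the one to watch: a pattern for the 40s also rejects 39, so -r suits "the value must look like this" checks better than "the value must stay below N" checks.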

If you want to write a check from scratch, it's a good idea to look at some of the checks already on http://exchange.nagios.org/. You'll need to collect all the OIDs involved and figure out what each value means. This can be a lot of work: forcing certain situations (unplugging cables etc.) and checking the values returned. This is why most people just use check_snmp with a simple ok / not ok check.
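
As a starting point for a from-scratch check, here is a minimal sketch of the decision part of a plugin. It relies on the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name temp_status and the thresholds 50 and 60 are made up for illustration, and fetching the value over SNMP is left out.

```shell
#!/bin/sh
# temp_status: hypothetical helper. Takes a temperature plus warning and
# critical thresholds, prints a Nagios-style status line and returns the
# matching plugin exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
temp_status() {
    temp=$1
    warn=$2
    crit=$3

    # Anything that is not a plain number is reported as UNKNOWN
    case $temp in
        ''|*[!0-9.]*) echo "TEMP UNKNOWN - bad value '$temp'"; return 3 ;;
    esac

    # awk does the decimal comparison, which plain sh arithmetic cannot
    level=$(awk -v t="$temp" -v w="$warn" -v c="$crit" \
        'BEGIN { if (t+0 >= c+0) print 2; else if (t+0 >= w+0) print 1; else print 0 }')

    case $level in
        2) echo "TEMP CRITICAL - $temp degrees C"; return 2 ;;
        1) echo "TEMP WARNING - $temp degrees C"; return 1 ;;
        *) echo "TEMP OK - $temp degrees C"; return 0 ;;
    esac
}

# Example thresholds (assumed): warn at 50, critical at 60
temp_status 42.0 50 60
```

A real plugin would finish with exit instead of return, so that Nagios sees the code.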


Wednesday, 15 May 2013

installing a new check into opsview

First, check the opsview interface to see if the check is already installed.

If not, check http://exchange.nagios.org/ or search Google.

Find a well-rated script that looks like it does the job. Read the details and make sure there are no bugs affecting your software version / setup. If you can't find a script to do the job, you will have to write one from scratch or use the default check_snmp script.

Open the script and note the OIDs it uses. Check them manually with snmpwalk. Let's say my OID is "1.3.6.1.4.1.9.9.500.1.2.1.1.6":

snmpwalk -c public -v2c 192.168.0.1 1.3.6.1.4.1.9.9.500.1.2.1.1.6

Read the script to see what the returned value means. If the script doesn't document it, you will have to get the vendor's documentation or contact their support.

After testing with snmpwalk you can copy the script to the slave. You may have to "su - nagios", chown the script to nagios, chmod the script to 755, and edit the #!/usr/bin/perl line at the top of the script to the relevant path on your system.
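
Those steps look roughly like this on the command line. This is a sketch only: set_interpreter is a hypothetical helper, and the script name and perl path are placeholders for whatever applies on your system.

```shell
#!/bin/sh
# set_interpreter: hypothetical helper that rewrites the first line of a
# script so that it points at the given interpreter path.
set_interpreter() {
    script=$1
    interp=$2
    tmp=$(mktemp) || return 1
    # Replace line 1 only if it is a #! line, then write the result back
    # with cat so the file keeps its owner and permissions
    sed "1s|^#!.*|#!$interp|" "$script" > "$tmp" && cat "$tmp" > "$script"
    rm -f "$tmp"
}

# Typical use on the slave (commented out so nothing runs by accident):
# chown nagios check_snmp_custom_check.pl
# chmod 755 check_snmp_custom_check.pl
# set_interpreter check_snmp_custom_check.pl "$(command -v perl)"
```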

Test the script by running it manually:
./check_snmp_custom_check.pl -H 192.168.0.1 -C public

If you are happy with the results, you need to import the script into the master.

Copy the script to /usr/local/nagios/libexec
Repeat the su / chown / chmod / edit #! steps you did on the slave
Go into the opsview web interface
Configuration -> Service checks
Click the Actions button -> create new service check

Fill in
Name
Description
Service group
Check period 24x7
You should be able to select the plugin check_snmp_custom_check.pl (if it's not there, try a reload)
Fill in the arguments, e.g. "-H $HOSTADDRESS$ -C $SNMP_COMMUNITY$" (look at another check for help)

Once complete, reload opsview. Now try to add the check to a host (you may need another reload for it to appear).

The check is now available to assign to other hosts in the future.



Monday, 21 May 2012

How to run a manual check from nagios slave

You may need to su to the nagios user to run some checks:
sudo su - nagios

/usr/local/nagios/libexec/check_http -H servername.domain.ie -S -w 5 -c 10
/usr/local/nagios/libexec/check_http -I 192.168.1.50 -S -w 5 -c 10
/usr/local/nagios/libexec/check_http -H 192.168.1.50 -S -w 5 -c 10

More info here: http://nagiosplugins.org/man/check_http
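
When running a check by hand, the exit code matters as much as the text, because that is what Nagios acts on. Here is a small sketch that runs any check command and labels its exit code using the standard plugin convention. run_check is a hypothetical wrapper, not something shipped with Nagios.

```shell
#!/bin/sh
# run_check: hypothetical wrapper. Runs the given command, then prints its
# exit code together with the state name the plugin convention gives it.
run_check() {
    rc=0
    "$@" || rc=$?
    case $rc in
        0) state=OK ;;
        1) state=WARNING ;;
        2) state=CRITICAL ;;
        3) state=UNKNOWN ;;
        *) state="out of range" ;;
    esac
    echo "exit code $rc = $state"
    return "$rc"
}

# Example (needs the plugin installed, so it is left commented out):
# run_check /usr/local/nagios/libexec/check_http -H servername.domain.ie -S -w 5 -c 10
```

Without a wrapper, running echo $? straight after the check shows the same number.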