Tuesday, 5 March 2013

Investigating RAM, CPU or disk space alerts on Linux servers

First, establish what is alerting (RAM, CPU or disk usage), then connect to the server.


If the alert is for RAM usage
Connect to the server and run the "top" command (type "top" and press Enter).
top is similar to Task Manager (taskmgr) on Windows.

Press Shift and M to sort the processes by memory usage, highest first.

The command "free -m" can also be useful for seeing how much memory is used/free.

This shows which process is using the most memory and which user is running it. You may need to contact the customer, application or DB team to find out whether this is expected and what the next step should be.
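
For a quick non-interactive check, the same information can be pulled with free and ps. This is a minimal sketch, assuming the procps versions of both tools. mem_used_pct is a helper name made up for this example. On older versions of free the "used" column also counts buffers and cache, so the percentage can look higher than the real memory pressure.

```shell
#!/bin/bash
# mem_used_pct (hypothetical helper): read `free -m` output on stdin and
# print used memory as a whole-number percentage of total, taken from
# columns 2 (total) and 3 (used) of the "Mem:" row.
mem_used_pct() {
    awk '/^Mem:/ { printf "%d\n", ($3 / $2) * 100 }'
}

echo "Memory used: $(free -m | mem_used_pct)%"

# Header plus the five processes with the largest memory share, biggest first.
ps -eo pid,user,%mem,rss,comm --sort=-%mem | head -n 6
```

The ps output carries the user name next to each process, which is usually what you need when deciding who to contact.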

If the alert is for CPU usage

Connect to the server and run the "top" command (similar to Task Manager on Windows).
Press Shift and P to sort the processes by CPU usage, highest first.

This shows which process is using the most CPU and which user is running it. You may need to contact the customer, application or DB team to find out whether this is expected and what the next step should be.
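
If you only want the heavy hitters, ps output can be filtered by a CPU threshold. The sketch below assumes procps ps. cpu_hogs and the 50% threshold are made up for illustration. Note that the %CPU figure from ps is an average over the life of the process, so a short recent spike shows up more clearly in top.

```shell
#!/bin/bash
# cpu_hogs (hypothetical helper): read "PID USER %CPU COMMAND" rows on
# stdin and print the header plus every row whose %CPU (column 3) is at
# or above the given threshold (default 50).
cpu_hogs() {
    local threshold="${1:-50}"
    awk -v t="$threshold" 'NR == 1 || $3 + 0 >= t + 0'
}

# List processes sorted by CPU, highest first, and keep those at 50% or more.
ps -eo pid,user,%cpu,comm --sort=-%cpu | cpu_hogs 50
```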



If the alert is for disk space usage
Connect to the server and run the "df -h" command.

This will show you the percent usage on each partition/mount
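
On a server with many mounts it can help to print only the ones that are nearly full. This is a rough sketch that parses the POSIX-format output of "df -P". over_threshold and the 90% limit are assumptions for this example, and mount points containing spaces would not be handled correctly.

```shell
#!/bin/bash
# over_threshold (hypothetical helper): read `df -P` output on stdin and
# print "mountpoint use%" for every filesystem at or above the limit.
# In df -P output, column 5 is the use percentage and column 6 the mount point.
over_threshold() {
    local limit="${1:-90}"
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit + 0) print $6, $5
    }'
}

df -P | over_threshold 90
```
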
To get further information on reads/writes to a partition, run "vmstat -p" with the partition's device name rather than its mount point,
for example "vmstat -p /dev/sda2"

Suppose we run the "df -h" command and discover that /home is at 95% usage. We can see what is using up the space with the du command. First "cd /home", then use the following du command:
"du -sm * | sort -nr | head -10" This gives the results in MB, sorts them with the largest at the top and shows only the top 10 results.

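One gap in the du command above is that "*" does not match hidden files or directories, which can hold a lot of data in a home directory. A possible variant, assuming GNU du, is sketched below. largest_entries is a name invented for this example.

```shell
#!/bin/bash
# largest_entries (hypothetical helper): print the total size of a
# directory followed by its N largest direct children, sizes in MB.
#   -x  stay on one filesystem (do not descend into other mounts)
#   -a  include files as well as directories
#   --max-depth=1  report only the directory itself and its direct children
largest_entries() {
    local dir="${1:-.}" count="${2:-10}"
    du -xam --max-depth=1 "$dir" 2>/dev/null | sort -nr | head -n "$((count + 1))"
}

# Default to /home, as in the example above, unless a path is given.
largest_entries "${1:-/home}" 10
```

The first line of output is the total for the directory itself, and the lines after it are its largest entries, hidden ones included.
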
You can contact the owners of large files and ask them if they are required. You may find that log files grow to a large size or backup files are building up. Best practice is to set up a script to remove old files.
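
A cleanup script of that kind is often built around find. The sketch below only lists candidates so that the result can be reviewed first. list_old_logs, the "*.log" pattern, the 30-day age and the /var/log/myapp path are all assumptions for illustration, so agree the real values with the file owners before deleting anything.

```shell
#!/bin/bash
# list_old_logs (hypothetical helper): print *.log files under a directory
# that were last modified more than N days ago (default 30).
# -xdev keeps find on the same filesystem as the starting directory.
list_old_logs() {
    local dir="$1" days="${2:-30}"
    find "$dir" -xdev -type f -name '*.log' -mtime +"$days" -print
}

# Dry run: /var/log/myapp is a made-up path, replace it with the real one.
target="${1:-/var/log/myapp}"
if [ -d "$target" ]; then
    list_old_logs "$target" 30
fi

# Once the list looks right, the same expression with -delete in place of
# -print removes the files:
# find "$target" -xdev -type f -name '*.log' -mtime +30 -delete
```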
Before assigning to N&S you should complete the steps above. With the information gathered you should be able to resolve the tag. If not, take screenshots of your output and attach them to the tag.

More on using the top command
press shift P - sorts processes by CPU usage, highest first
press shift M - sorts processes by memory usage, highest first
press u, type a username and press enter - shows only processes for that username (press u and enter to bring them all back)
press r and enter the PID to re-nice a process - sets a process to a higher or lower priority (be careful)
press shift N - sorts processes by PID
press shift R - reverses the current sort order (shift R again to change it back)
press c - toggles between the command name and the full command line that was used to launch the process
press k and enter the PID - will kill the selected process (be careful)
press q - will quit the top application
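
top can also run non-interactively, which is handy when you want to save its output as text and attach it to a tag. A minimal sketch, assuming the procps version of top. capture_top and the output path are made up for this example.

```shell
#!/bin/bash
# capture_top (hypothetical helper): save one snapshot of top to a file.
#   -b    batch mode, plain text output with no screen handling
#   -n 1  take a single snapshot and exit
# head keeps the summary lines and roughly the first 18 processes.
capture_top() {
    local outfile="$1"
    top -b -n 1 | head -n 25 > "$outfile"
}

# Write the snapshot to a timestamped file under /tmp.
capture_top "/tmp/top-$(date +%Y%m%d-%H%M%S).txt"
```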


Script to list largest directory or files
Just copy the script below onto the server you want to check.
It will give you the 10 largest entries under / and then the 10 largest entries inside each of the top 5.
You may need to make it executable (chmod +x), and it needs to run as root to read every directory.

#!/bin/bash
# This will give the user back a listing of the largest files/dirs on the system

# make a uniquely named tempfile and remove it when the script exits
tmpfile=$(mktemp) || exit 1
trap 'rm -f "$tmpfile"' EXIT

# get overall 10 largest entries under / (errors from unreadable paths are hidden)
du -sm /* 2>/dev/null | sort -nr | head > "$tmpfile"

echo "Directory size listings for $(hostname -s)"
echo "Run date: $(date)"
echo "All sizes on left are in MB"
echo
echo "/ dir size list"
cat "$tmpfile"
echo

# for the largest 5 of the above get the sizes in them
# (du separates size and path with a tab, so cut keeps paths with spaces intact)
head -n 5 "$tmpfile" | cut -f2- | while IFS= read -r dir
do
        echo "$dir dir size list"
        du -sm "$dir"/* 2>/dev/null | sort -nr | head
        echo
done

exit 0


