Wednesday 23 October 2013

some useful redhat commands

Check Main IP and aliases

watch /sbin/ifconfig

Check Physical/Virtual/Manufacturer/Serial

/usr/sbin/dmidecode | less

Check scheduled tasks

crontab -l

Check RedHat Version

cat /etc/redhat-release


How to check esx version
vmware -v
vimsh -n -e 'hostsvc/hostsummary' | grep fullName
cat /proc/vmware/version


Check Apache version
/usr/sbin/httpd -v


Detect Jboss version
ls -l /usr/jboss
ls /usr/ | grep jboss

Detect java version
java -version

Detect mysql version
mysql -V
"mysql  Ver 14.12 Distrib 5.0.45, for redhat-linux-gnu (i686) using readline 5.0"

Detect oracle
. oraenv
bct1
sqlplus "/ as sysdba"
select * from v$version where banner like 'Oracle%';

Detect f-prot version
/usr/local/f-prot/fpscan --version

Amazon AWS say they are not compatible with Cisco ASA versions older than 8.3

So make sure you update your ASA. This way they can't dodge support. The problem there is the difference in the way NAT works from 8.3 onwards. You are just going to have to get in there, convert the config and redo all the NATs.

Wednesday 16 October 2013

tailing the squid logs for hosts with the most sessions

tail -5000 /var/log/squid/access.log | grep 'website.name' | awk '{print $3}' | sort | uniq -c | sort -nr | more

Wednesday 18 September 2013

find unauthorized SUID and SGID system executables

The administrator should take care to ensure that no rogue set-UID programs have been introduced into the system. In addition, if possible, the administrator should attempt a Set-UID audit and reduction. To check for these run the following script:
#!/bin/bash
# Search each ext2/ext3 filesystem listed in /etc/fstab for SUID or SGID files.
for part in $(awk '($3 == "ext2" || $3 == "ext3") { print $2 }' /etc/fstab)
do
    find "$part" -xdev \( -perm -04000 -o -perm -02000 \) -type f -print
done

find unauthorized world writable files in linux

World writeable files can be modified by any user on the system. Generally 
removing write access for the "other" category (chmod o-w ) is advisable, but 
always consult the relevant documentation in order to avoid breaking any 
application dependencies on a particular file. Run the following script to print 
a list of world writeable files to screen. These files should then be reviewed 
and if possible the world writeable permissions removed. 

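#!/bin/bash
# List world-writable files on each ext2/ext3 filesystem listed in /etc/fstab.
# The pager sits after the loop so all filesystems end up in one listing.
for part in $(awk '($3 == "ext2" || $3 == "ext3") { print $2 }' /etc/fstab)
do
    find "$part" -xdev -perm -0002 -type f -print
done | less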

SELinux TFTP policy

If SELinux is running, it won't allow you to PUT (upload) files to your TFTP server. You can use "audit2allow" to create a custom SELinux policy that permits it.
To do this you need to examine your server's audit log, /var/log/audit/audit.log. This is where SELinux logs denials. If you are receiving permission denied errors when uploading files, check this log. If SELinux is causing the problem you will see log entries that look like this:

type=AVC msg=audit(1245199930.280:31): avc: denied { write } for pid=2584 comm="in.tftpd" name="tftpboot" dev=dm-0 ino=1747009 scontext=system_u:system_r:tftpd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:tftpdir_t:s0 tclass=dir
type=SYSCALL msg=audit(1245199930.280:31): arch=40000003 syscall=5 success=no exit=-13 a0=805e7a2 a1=8041 a2=1b6 a3=8041 items=0 ppid=2565 pid=2584 auid=4294967295 uid=99 gid=99 euid=99 suid=99 fsuid=99 egid=99 sgid=99 fsgid=99 tty=(none) ses=4294967295 comm="in.tftpd" exe="/usr/sbin/in.tftpd" subj=system_u:system_r:tftpd_t:s0-s0:c0.c1023 key=(null)
Using this error and the audit2allow tool we can create a policy that allows TFTP writes.

Step 1

Create some policy rules to load into SELinux. Use grep to feed the log entries that match our error from the audit log into the audit2allow tool:
$ grep tftpd_t /var/log/audit/audit.log | audit2allow -M tftplocal

NOTE!

The audit2allow tool isn't infallible, so it is worth checking that the rules in the generated module aren't too relaxed. The command above writes them to a file called tftplocal.te, alongside the compiled module tftplocal.pp. It should look something like this:
module tftplocal 1.0;

require {
        type tftpd_t;
        type tftpdir_t;
        class dir { write add_name };
        class file { write create };
}

#============= tftpd_t ==============
allow tftpd_t tftpdir_t:dir { write add_name };
allow tftpd_t tftpdir_t:file { write create };

Step 2

Import the SELinux policy module created in step 1:
$ semodule -i tftplocal.pp

checking the status of a service in linux

In this example I want to see if SMB is running

Check the status:
/etc/init.d/smb status

I can restart a service with
/etc/init.d/smb stop
/etc/init.d/smb start

or simply
/etc/init.d/smb restart

Check if the service is set to start on boot up
chkconfig --list | grep smb

SELinux

SELinux can stop Samba (and other things) from working. You can switch it to permissive mode, where denials are logged but not enforced, by running the following command as root
"setenforce 0"
This is not recommended as it removes the protection SELinux provides. A targeted policy exception (for example one built with audit2allow) is the better fix if you can create one. Enforcing mode will come back after a reboot. To make the change survive a reboot
sudo vi /etc/selinux/config
change the line SELINUX=enforcing to SELINUX=permissive
Save your changes and that should be it.
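
If you'd rather script that edit than open vi, something like the sketch below should do it. set_selinux_mode is a name I made up for illustration, and it assumes the config file has the usual SELINUX=<mode> line. It keeps a .bak copy of the file before changing anything.

```shell
#!/bin/bash
# set_selinux_mode FILE MODE
# Hypothetical helper: rewrites the SELINUX= line of an SELinux config file.
set_selinux_mode() {
    local file="$1" mode="$2"
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # Only the SELINUX= line is touched; SELINUXTYPE= does not match the pattern.
    # The original file is kept as FILE.bak.
    sed -i.bak "s/^SELINUX=.*/SELINUX=$mode/" "$file"
}
```

Run it as root, for example: set_selinux_mode /etc/selinux/config permissive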

How many CPU sockets does my server have?

$ egrep "processor|physical id|core id" /proc/cpuinfo
processor   : 0
physical id  : 0
core id        : 0
processor   : 1
physical id  : 0
core id        : 0


The output shown is for a single-socket, dual-core machine. Each core has a different processor ID but the same physical ID, indicating they are in fact on the same socket. A virtual machine will usually only show the processor line and not the physical or core IDs.
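
To get just the socket count rather than eyeballing the output, you can count the distinct physical id values. This is a minimal sketch, and count_sockets is a made-up helper name. It prints 0 when there are no physical id lines at all, as on many virtual machines.

```shell
#!/bin/bash
# count_sockets [FILE]
# Hypothetical helper: counts distinct "physical id" values in a
# cpuinfo-style file (defaults to /proc/cpuinfo).
count_sockets() {
    local file="${1:-/proc/cpuinfo}"
    awk -F: '
        /^physical id/ { seen[$2] = 1 }          # one array key per socket ID
        END { n = 0; for (id in seen) n++; print n }
    ' "$file"
}
```

Calling count_sockets with no argument reads /proc/cpuinfo on the local machine.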

Thursday 12 September 2013

subnet calculator tool

Good calculator here for dividing up larger subnets into smaller ones
http://www.davidc.net/sites/default/subnets/subnets.html

A /16 breaks up into two /17's, a /17 breaks up into two /18's and so on

As you break up a /16 you get more networks but fewer hosts per network

mask  networks  hosts per network
/16   1         65534
/17   2         32766
/18   4         16382
/19   8         8190
/20   16        4094
/21   32        2046
/22   64        1022
/23   128       510
/24   256       254
/25   512       126
/26   1024      62
/27   2048      30
/28   4096      14
/29   8192      6
/30   16384     2
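
The numbers in the table come from two formulas: a /N carved out of a /16 gives 2^(N-16) networks, each with 2^(32-N) - 2 usable hosts (the network and broadcast addresses are taken off). Here is a short sketch that prints the same table. subnet_table is a made-up function name.

```shell
#!/bin/bash
# subnet_table
# Hypothetical helper: prints, for each prefix from /16 to /30, how many
# networks of that size fit in a /16 and how many usable hosts each has.
subnet_table() {
    local prefix=16
    echo "mask networks hosts"
    while [ "$prefix" -le 30 ]; do
        # networks = 2^(prefix-16), hosts = 2^(32-prefix) - 2
        echo "/$prefix $(( 1 << (prefix - 16) )) $(( (1 << (32 - prefix)) - 2 ))"
        prefix=$(( prefix + 1 ))
    done
}
```

Piping the output through column -t lines the columns up.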

Friday 6 September 2013

memory leaks on checkpoint R70

I've had an ongoing issue with a Check Point R70: the RAM usage creeps up on the management node of the cluster and it needs to be rebooted every 4 months. The device is currently out of support contract so I can't get any support, hotfixes or updates from Check Point.

I've noticed that the nodes in the cluster also have RAM usage creeping up over a much longer period of time, about 1 year. I had no idea how to check the RAM usage on these devices so I had to find out.

The cluster runs Check Point SecurePlatform, so it's just the software installed on some HP servers. They run a sort of Linux OS.

I was able to run top on the server. I could see the cpd process was taking up most of the RAM. I assume this is the Check Point daemon.

I ran the two following commands. Check Point documentation asks you to look for failed allocations. If you see any, there is a problem. Otherwise it is most likely a memory leak.

# fw ctl pstat

Machine Capacity Summary:
  Memory used: 1% (29MB out of 1620MB) - below low watermark
  Concurrent Connections: 0% (58 out of 24900) - below low watermark
  Aggressive Aging is not active

Hash kernel memory (hmem) statistics:
  Total memory allocated: 20971520 bytes in 5115 4KB blocks using 5 pools
  Total memory bytes  used:  3217916   unused: 17753604 (84.66%)   peak:  5043180
  Total memory blocks used:     1013   unused:     4102 (80%)   peak:     1351
  Allocations: 1213799129 alloc, 0 failed alloc, 1213766746 free

System kernel memory (smem) statistics:
  Total memory  bytes  used: 43472216   peak: 55769708
    Blocking  memory  bytes   used:  1403176   peak:  1440356
    Non-Blocking memory bytes used: 42069040   peak: 54329352
  Allocations: 220680 alloc, 0 failed alloc, 219982 free, 0 failed free

Kernel memory (kmem) statistics:
  Total memory  bytes  used: 25670260   peak: 39394220
        Allocations: 1214019254 alloc, 0 failed alloc, 1213986426 free, 0 failed free

        External Allocations: 5124 for packets, 0 for SXL

# cpstat os -f memory

Total Virtual Memory (Bytes):  4271108096
Active Virtual Memory (Bytes): 1696493568
Total Real Memory (Bytes):     2123681792
Active Real Memory (Bytes):    1696399360
Free Real Memory (Bytes):      427282432
Memory Swaps/Sec:              -
Memory To Disk Transfers/Sec:  -

To clear the leak you can run "cpstop;cpstart" or reboot the device.
Make sure you have DRAC/iLO or physical access to the box.

When logging a call with Check Point support they will usually ask for a cpinfo
cpinfo -o mycpinfo.tgz

See which node is active in a cluster
cphaprob stat

Logs are usually in /var/log on the active node

From Check Point documentation ("Performing a SecurePlatform Firewall Health Check", Section 1, Physical Platform Checks, page 10):

Presence of hmem failed allocations indicates that the hash kernel memory was full. This is not a serious memory problem but indicates there is a configuration problem. The value assigned to the hash memory pool (either manually or automatically by changing the number of concurrent connections in the capacity optimization section of a firewall) determines the size of the hash kernel memory. If a low hmem limit was configured it leads to improper usage of the OS memory. See 'Capacity Optimization' in the 'Firewall Health Checks' section for further information.

Presence of smem failed allocations indicates that the OS memory was exhausted or there are large non-sleep allocations. This is symptomatic of a memory shortage. If there are failed smem allocations and the memory is less than 2 GB, upgrading to 2 GB may fix the problem. Decreasing the TCP end timeout and decreasing the number of concurrent connections can also help reduce memory consumption.

Presence of kmem failed allocations means that some applications did not get memory. This is usually an indication of a memory problem; most commonly a memory shortage. (The natural limit is 2 GB, since the kernel is 32-bit.)

Memory shortage sometimes indicates a memory leak. In order to troubleshoot memory shortage, you need to stop the load and let connections close. If the memory consumption returns back to normal, you are not dealing with a memory leak. Such shortage might happen when traffic volumes are too high for the device capacity. If the memory shortage happens after a change in the system or the environment, undo the change, and check whether kmem memory consumption goes down.

For optimum performance there should not be any failed memory allocations.
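
To pull just the failed allocation counts out of the fw ctl pstat output, a small awk filter along these lines can help. It is only a sketch: failed_allocs is a made-up name, and it assumes the output layout shown earlier in this post (a heading containing "(hmem)", "(smem)" or "(kmem)" followed by an "Allocations:" line).

```shell
#!/bin/bash
# failed_allocs
# Hypothetical helper: reads "fw ctl pstat" output on stdin and prints one
# line per memory pool in the form "<pool> <failed alloc count>".
failed_allocs() {
    awk '
        /\((hmem|smem|kmem)\) statistics:/ {
            # Remember which pool the following lines belong to.
            match($0, /\((hmem|smem|kmem)\)/)
            pool = substr($0, RSTART + 1, RLENGTH - 2)
        }
        /Allocations:/ && / failed alloc/ && pool != "" {
            # The count is the field just before the words "failed alloc".
            for (i = 2; i <= NF; i++)
                if ($i == "failed" && $(i + 1) ~ /^alloc/) {
                    print pool, $(i - 1)
                    break
                }
        }
    '
}
```

On the firewall it would be used as: fw ctl pstat | failed_allocs
Any count other than 0 is worth a closer look.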

Monday 19 August 2013

setting up a VPN to an AWS instance

When you set up your instance in AWS and want to configure a VPN to connect to it, a configuration file can be generated in the AWS console. You'll get something like the following

! Amazon Web Services
! Virtual Private Cloud
!
! AWS utilizes unique identifiers to manipulate the configuration of 
! a VPN Connection. Each VPN Connection is assigned an identifier and is 
! associated with two other identifiers, namely the 
! Customer Gateway Identifier and Virtual Private Gateway Identifier.
!
! Your VPN Connection ID                  : 
! Your Virtual Private Gateway ID         : 
! Your Customer Gateway ID                : 
!
!
! This configuration consists of two tunnels. Both tunnels must be 
! configured on your Customer Gateway. Only a single tunnel will be up at a 
! time to the VGW.
! 
! You may need to populate these values throughout the config based on your setup:
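! <outside_interface> - External interface of the ASA
! <outside_access_in> - Inbound ACL on the external interface
! <amzn_vpn_map> - Outside crypto map
! <vpc_subnet> and <vpc_subnet_mask> - VPC address range
! <local_subnet> and <local_subnet_mask> - Local subnet address range
! <sla_monitor_address> - Target address that is part of acl-amzn to run SLA monitoring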

! --------------------------------------------------------------------------------
! IPSec Tunnels
! --------------------------------------------------------------------------------
! #1: Internet Key Exchange (IKE) Configuration
!
! A policy is established for the supported ISAKMP encryption, 
! authentication, Diffie-Hellman, lifetime, and key parameters.
!
! Note that there are a global list of ISAKMP policies, each identified by 
! sequence number. This policy is defined as #201, which may conflict with
! an existing policy using the same number. If so, we recommend changing 
! the sequence number to avoid conflicts.
!
crypto isakmp identity address 
crypto isakmp enable <outside_interface>
crypto isakmp policy 201
  encryption aes
  authentication pre-share
  group 2
  lifetime 28800
  hash sha
exit
!
! The tunnel group sets the Pre Shared Key used to authenticate the 
! tunnel endpoints.
!
tunnel-group XX.XXX.XXX.40 type ipsec-l2l
tunnel-group XX.XXX.XXX.40 ipsec-attributes
   pre-shared-key somepassword
!
! This option enables IPSec Dead Peer Detection, which causes periodic
! messages to be sent to ensure a Security Association remains operational.
!
   isakmp keepalive threshold 10 retry 3
exit
!
tunnel-group XX.XXX.XXX.44 type ipsec-l2l
tunnel-group XX.XXX.XXX.44 ipsec-attributes
   pre-shared-key anotherpassword
!
! This option enables IPSec Dead Peer Detection, which causes periodic
! messages to be sent to ensure a Security Association remains operational.
!
   isakmp keepalive threshold 10 retry 3
exit

! --------------------------------------------------------------------------------
! #2: Access List Configuration
!
! Access lists are configured to permit creation of tunnels and to send applicable traffic over them.
! This policy may need to be applied to an inbound ACL on the outside interface that is used to manage control-plane traffic. 
! This is to allow VPN traffic into the device from the Amazon endpoints.
!
access-list <outside_access_in> extended permit ip host XX.XX.XX.40 host my.fw.pub.ip
access-list <outside_access_in> extended permit ip host XX.XX.XX.44 host my.fw.pub.ip
! The following access list named acl-amzn specifies all traffic that needs to be routed to the VPC. Traffic will
! be encrypted and transmitted through the tunnel to the VPC. Association with the IPSec security association
! is done through the "crypto map" command.
!
! This access list should contain a static route corresponding to your VPC CIDR and allow traffic from any subnet.
! If you do not wish to use the "any" source, you must use a single access-list entry for accessing the VPC range.
! If you specify more than one entry for this ACL without using "any" as the source, the VPN will function erratically.
! See section #4 regarding how to restrict the traffic going over the tunnel
!
!
access-list acl-amzn extended permit ip any <vpc_subnet> <vpc_subnet_mask>

!---------------------------------------------------------------------------------
! #3: IPSec Configuration
!
! The IPSec transform set defines the encryption, authentication, and IPSec
! mode parameters.
!
crypto ipsec transform-set transform-amzn esp-aes esp-sha-hmac
! The crypto map references the IPSec transform set and further defines
! the Diffie-Hellman group and security association lifetime. The mapping is created
! as #1, which may conflict with an existing crypto map using the same
! number. If so, we recommend changing the mapping number to avoid conflicts.
!
crypto map <amzn_vpn_map> 1 match address acl-amzn
crypto map <amzn_vpn_map> 1 set pfs group2
crypto map <amzn_vpn_map> 1 set peer  XX.XXX.XXX.40 XX.XXX.XXX.44
crypto map <amzn_vpn_map> 1 set transform-set transform-amzn
!
! Only set this if you do not already have an outside crypto map, and it is not applied:
!
crypto map <amzn_vpn_map> interface <outside_interface>
!
! Additional parameters of the IPSec configuration are set here. Note that
! these parameters are global and therefore impact other IPSec
! associations.
! Set security association lifetime until it is renegotiated.
crypto ipsec security-association lifetime seconds 3600
!
! This option instructs the firewall to clear the "Don't Fragment"
! bit from packets that carry this bit and yet must be fragmented, enabling
! them to be fragmented.
!
crypto ipsec df-bit clear-df 
!
! This configures the gateway's window for accepting out of order
! IPSec packets. A larger window can be helpful if too many packets
! are dropped due to reordering while in transit between gateways.
!
crypto ipsec security-association replay window-size 128
!
! This option instructs the firewall to fragment the unencrypted packets
! (prior to encryption).
!
crypto ipsec fragmentation before-encryption 
!
! This option causes the firewall to reduce the Maximum Segment Size of
! TCP packets to prevent packet fragmentation.
sysopt connection tcpmss 1387
!
! In order to keep the tunnel in an active state, the ASA needs to send traffic to the subnet 
! defined in acl-amzn. SLA monitoring can be configured to send pings to a destination in the subnet and
! keep the tunnel active. A possible destination for the ping is the VPC Gateway IP, which is the 
! first IP address in one of your subnets.
! For example: a VPC with a CIDR range of 192.168.50.0/24 will have a gateway: 192.168.50.1.
! 
! The monitor is created as #1, which may conflict with an existing monitor using the same
! number. If so, we recommend changing the sequence number to avoid conflicts.
!
sla monitor 1
   type echo protocol ipIcmpEcho <sla_monitor_address> interface <outside_interface>
   frequency 5
exit
sla monitor schedule 1 life forever start-time now
!
! The firewall must allow icmp packets to use "sla monitor" 
icmp permit any <outside_interface>

!---------------------------------------------------------------------------------------
! #4: VPN Filter
! The VPN Filter will restrict traffic that is permitted through the tunnels. By default all traffic is denied.
! The first entry provides an example to include traffic between your VPC Address space and your office.
! You may need to run 'clear crypto isakmp sa', in order for the filter to take effect.
!
! access-list amzn-filter extended permit ip <vpc_subnet> <vpc_subnet_mask> <local_subnet> <local_subnet_mask>
access-list amzn-filter extended deny ip any any
group-policy filter internal
group-policy filter attributes
vpn-filter value amzn-filter
tunnel-group XX.XXX.XXX.40 general-attributes
default-group-policy filter
exit
tunnel-group XX.XXX.XXX.44 general-attributes
default-group-policy filter
exit

!---------------------------------------------------------------------------------------
! #5: NAT Exemption
! If you are performing NAT on the ASA you will have to add a nat exemption rule.
! This varies depending on how NAT is set up.  It should be configured along the lines of:
! object network obj-SrcNet
!   subnet 0.0.0.0 0.0.0.0
! object network obj-amzn
!   subnet <vpc_subnet> <vpc_subnet_mask>
! nat (inside,outside) 1 source static obj-SrcNet obj-SrcNet destination static obj-amzn obj-amzn
! If using version 8.2 or older, the entry would need to look something like this:
! nat (inside) 0 access-list acl-amzn
! Or, the same rule in acl-amzn should be included in an existing no nat ACL.
!
!---------------------------------------------------------------------------------------
!  Additional Notes and Questions
!  - Amazon Virtual Private Cloud Getting Started Guide: 
!       http://docs.amazonwebservices.com/AmazonVPC/latest/GettingStartedGuide
!  - Amazon Virtual Private Cloud Network Administrator Guide: 
!       http://docs.amazonwebservices.com/AmazonVPC/latest/NetworkAdminGuide
!  - Troubleshooting Cisco ASA Customer Gateway Connectivity:
!       http://docs.amazonwebservices.com/AmazonVPC/latest/NetworkAdminGuide/Cisco_ASA_Troubleshooting.html
!  - XSL Version: 2009-07-15-1119716

If you have a brand new firewall you could *probably* copy paste this stuff in and it would work, but I still wouldn't advise it. Most users have a lot of config on their firewalls already. Also there are some global settings in there that you might not be comfortable with changing if you have VPNs to other customers for example.

The config generated by AWS didn't match up exactly to the 8.6 code running on my ASA.

Each entry below shows the line from the Amazon template, what I used in my config, and whether it was already present on the firewall.

Template from Amazon: crypto isakmp identity address
My config: crypto isakmp identity address
Already present: Yes

Template from Amazon: crypto isakmp enable <outside_interface>
My config: crypto ikev1 enable OUTSIDE
Already present: Yes

Template from Amazon: crypto isakmp policy 201
My config: crypto ikev1 policy 201
Already present: No, needed to be added. Keep in mind on other firewalls a policy 201 may exist already and you'll have to use a different number

Template from Amazon: tunnel-group xx.xx.xx.xx type ipsec-l2l
My config: (blank)
Already present: No, needs to be copied exactly from the template

Template from Amazon: access-list <outside_access_in> extended permit ip host XX.XXX.XX.44 host my.fw.pub.ip
My config: (blank)
Already present: No, needs to be added. You need to determine the name of the ACL applied to the outside interface.

Template from Amazon: access-list acl-amzn extended permit ip any <vpc_subnet> <vpc_subnet_mask>
My config: access-list CUSTOMER_AWS extended permit ip any int.net.ip.add 255.255.0.0
Already present: No, needs to be added. You should use a descriptive name for the ACL

Template from Amazon: crypto ipsec transform-set transform-amzn esp-aes esp-sha-hmac
My config: crypto ipsec ikev1 transform-set ESP-AES-128-SHA esp-aes esp-sha-hmac
Already present: Yes, there was already a transform set for aes 128 and sha1

Template from Amazon: crypto map <amzn_vpn_map> 1
My config: crypto map S2S 190
Already present: No, needs to be added. You need to make sure the map number is not in use. Make sure to update the ACL name and the transform set.

Template from Amazon: crypto map <amzn_vpn_map> interface <outside_interface>
My config: crypto map S2S interface OUTSIDE
Already present: Yes

Template from Amazon: Additional global settings
My config: (blank)
Already present: No, going to leave these out for now. If we have issues with the VPNs we can test.

Template from Amazon: VPN Filter
My config: (blank)
Already present: No, needs to be added. ACL, networks and filter names should be updated.

Template from Amazon: NAT exemption
My config: (blank)
Already present: No, needed to add an object for the customer private network


This is the config applied to the ASA firewall in the end. You can see I left out a lot of the global settings. So far the VPNs are working fine:

crypto ikev1 policy 201
authentication pre-share
encryption aes
hash sha
group 2
lifetime 28800

tunnel-group XX.XXX.XX.44 type ipsec-l2l
tunnel-group XX.XXX.XX.44 ipsec-attributes
ikev1 pre-shared-key *****
isakmp keepalive threshold 10 retry 3

tunnel-group XX.XXX.XX.40 type ipsec-l2l
tunnel-group XX.XXX.XX.40 ipsec-attributes
ikev1 pre-shared-key *****
isakmp keepalive threshold 10 retry 3
  
access-list outside_in extended permit ip host XX.XXX.XX.44 host my.fw.pub.ip
access-list outside_in extended permit ip host XX.XXX.XX.40 host my.fw.pub.ip

access-list CUSTOMER_AWS extended permit ip any cust.int.net.id 255.255.0.0

crypto map S2S 190 match address CUSTOMER_AWS
crypto map S2S 190 set pfs group2
crypto map S2S 190 set peer  XX.XXX.XX.44 XX.XXX.XX.40
crypto map S2S 190 set transform-set ESP-AES-128-SHA
crypto map S2S 190 set security-association lifetime seconds 3600
crypto map S2S 190 set security-association lifetime kilobytes 4608000

access-list CUSTOMER_AWS_VPN_FILTER_ACL extended permit ip cust.int.net.id 255.255.0.0 192.168.1.0 255.255.255.0
access-list CUSTOMER_AWS_VPN_FILTER_ACL extended deny ip any any
group-policy CUSTOMER_AWS_VPN_FILTER internal
group-policy CUSTOMER_AWS_VPN_FILTER attributes
vpn-filter value CUSTOMER_AWS_VPN_FILTER_ACL
tunnel-group XX.XXX.XX.44 general-attributes
default-group-policy CUSTOMER_AWS_VPN_FILTER
exit
tunnel-group XX.XXX.XX.40 general-attributes
default-group-policy CUSTOMER_AWS_VPN_FILTER

object network obj-cust.int.net.id
subnet cust.int.net.id 255.255.0.0

nat (inside,outside) 1 source static obj-192.168.1.0 obj-192.168.1.0 destination static obj-cust.int.net.id obj-cust.int.net.id

Thursday 15 August 2013

Tip on using packet tracer on Cisco ASA

I use the packet tracer tool quite often on ASAs. A Cisco engineer told me it's better to always run the traces from the inside out, because traffic coming from the VPN is encrypted and we cannot inject encrypted traffic. He also said it's a good idea to run it twice, just in case the VPN isn't up already.

packet-tracer input INSIDE tcp 192.168.10.10 22 172.30.10.10 4444 detailed

instead of

packet-tracer input OUTSIDE tcp 172.30.10.10 4444 192.168.10.10 22 detailed

When we see the following at the end of our trace

Type: VPN
Subtype: encrypt
Result: ALLOW

We know the data was encrypted and sent over the VPN

I've also seen
Phase: 7
Type: VPN
Subtype: encrypt
Result: DROP

Everything looked good on my end. The other end needed to update the proxy IDs.

Email Header Analyzer

http://www.mxtoolbox.com

Great website for testing email stuff

Allowing external access to a webserver on the usual ports on Cisco ASA

Had to restrict access to a web app, figured it would be useful to leave this here

Setup the object 

object network obj-172.20.50.50
 host 172.20.50.50

Setup the static NAT

object network obj-172.20.50.50
 nat (INSIDE,OUTSIDE) static 200.100.200.300

Setup the group of hosts who are allowed access (you can use any if you want the internet to have access but I want to restrict)

object-group network MYWEBAPP_HOSTS_ALLOWED_IN
 network-object host 80.70.60.50
 network-object host 90.100.200.50
 network-object host 100.123.123.123

Set up the group of ports you want to allow access

object-group service PORTS_80_AND_443 tcp-udp
 port-object eq 80
 port-object eq 443

Add an entry to the outside access-list
access-list OUTSIDE_IN extended permit tcp object-group MYWEBAPP_HOSTS_ALLOWED_IN host 172.20.50.50 object-group PORTS_80_AND_443


Monday 12 August 2013

can't get to servers with a static NAT from internal servers in other DMZ's

This was the NAT method I was using:
nat (INSIDE,OUTSIDE) source static obj-172.20.100.140 obj-172.20.100.140 destination static OBJ-200.100.100.200 OBJ-200.100.100.200 no-proxy-arp route-lookup

The above has worked fine for me before in many situations but I had an issue that I couldn't connect to servers with a static NAT because the firewall was trying to get to the Public IP.

Had to change to this NAT method

Make sure your object is set up

object network obj-172.20.100.140
 host 172.20.100.140

This does the static NAT
object network obj-172.20.100.140
 nat (INSIDE,OUTSIDE) static 200.100.100.200

This NAT is processed at the right time so internal servers can get to the server but it still has its static NAT.




Thursday 8 August 2013

tracking what servers are using port 25 with linux CLI tools

There was an issue with an unknown server sending out emails and getting the public IPs blacklisted. One of my colleagues came up with this line to find that server by searching the syslog.

grep 'Built outbound TCP connection' my-asa-log.log | grep '/25' | grep -v 'INSIDE:192.160.10.50' | awk -F " " '{print $15}' | awk -F "/" '{print $1}' | sort | uniq -c


grep 'Built outbound TCP connection' my-asa-log.log
search for outbound connections in the ASA syslog file

grep '/25'
Search for connections to port 25

grep -v 'INSIDE:192.160.10.50'
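Remove entries for 192.160.10.50 (the real email server)

awk -F " " '{print $15}'
Print column 15, which in this log format held the internal source as interface:address/port

awk -F "/" '{print $1}'
Split on "/" and keep the first part, which drops the port and leaves just the address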

sort
sorts the data alpha numeric

uniq -c
Only shows one instance of an IP address and shows the count of how many times it appeared
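
The same count can also be done in a single awk pass. This is a sketch, not what my colleague wrote: smtp_talkers is a made-up name, and it assumes each log line has the form "... Built outbound TCP connection <id> for OUTSIDE:<ip>/<port> (...) to INSIDE:<ip>/<port> (...)". Unlike grep '/25' it only matches when the destination port is exactly 25.

```shell
#!/bin/bash
# smtp_talkers LOGFILE [EXCLUDED_SOURCE]
# Hypothetical helper: counts outbound connections to port 25 per internal
# source and prints "<count> <source>", busiest first.
# The log layout (destination after "for", source after "to") is an assumption.
smtp_talkers() {
    local log="$1" exclude="${2:-}"
    awk -v exclude="$exclude" '
        /Built outbound TCP connection/ {
            dst = ""; src = ""
            for (i = 1; i < NF; i++) {
                if ($i == "for") dst = $(i + 1)   # e.g. OUTSIDE:203.0.113.9/25
                if ($i == "to")  src = $(i + 1)   # e.g. INSIDE:192.160.10.50/41000
            }
            if (dst !~ /\/25$/) next              # destination port must be exactly 25
            sub(/\/[0-9]+$/, "", src)             # drop the source port
            if (src != exclude) count[src]++
        }
        END { for (s in count) print count[s], s }
    ' "$log" | sort -rn
}
```

For the case above the call would be: smtp_talkers my-asa-log.log INSIDE:192.160.10.50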


Tuesday 30 July 2013

mounting screws and cage nuts for equipment racks

As far as I can see there are 2 main sizes that are used M5 and M6. M6 being bigger.

I've found most modern devices will only work with M5. The rack mounts won't fit the M6. Some older devices and heavy items like SANs and 3U servers etc will take the M6 ones. Some devices come with smaller holes for the M5 and a larger one in the middle of the mount for the M6 so you can use either.

Got these before, they run out faster than you would expect since I ping at least one off into oblivion every time.
http://www.amazon.co.uk/StarTech-com-Cage-Nuts-Server-Cabinets/dp/B00009XT0G/ref=sr_1_4?ie=UTF8&qid=1375172819&sr=8-4&keywords=cage+nuts

I've used this kind of cage nut tool and it really helps (no more stabbing yourself with a screwdriver)
http://www.amazon.co.uk/Economy-Cage-racking-cagenut-extraction/dp/B006BZDHYY/ref=sr_1_5?ie=UTF8&qid=1375172947&sr=8-5&keywords=cage+nuts+tool

I've never used one of these but looks good
http://www.amazon.co.uk/Deluxe-Cage-racking-extract-cagenut/dp/B006BZFG3O/ref=sr_1_1?ie=UTF8&qid=1375172947&sr=8-1&keywords=cage+nuts+tool



Thursday 25 July 2013

troubleshoot VPNs on a juniper device

configuring an IP on centos / redhat

I was used to using ubuntu so this was a bit different

sudo vi /etc/sysconfig/network
set hostname and default gateway

sudo vi /etc/sysconfig/network-scripts/ifcfg-eth0
set up an IP on the interface

service network restart


Tuesday 23 July 2013

WARNING: The crypto map entry is incomplete!

I've often prepared my crypto maps in advance and then pasted them in. An error that has thrown me in the past is "WARNING: The crypto map entry is incomplete!". At the time I was working with some old PIX firewalls where I was never sure if the firewall was actually going to do what it was told. Here is an example:

V1FWCL01(config)# crypto map S2S 190 match address CUSTOMER_ACL
WARNING: The crypto map entry is incomplete!
V1FWCL01(config)# crypto map S2S 190 set pfs group2
WARNING: The crypto map entry is incomplete!
V1FWCL01(config)# crypto map S2S 190 set peer  xx.xx.xx.xx xxx.xxx.xxx.xxx
WARNING: The crypto map entry is incomplete!
V1FWCL01(config)# crypto map S2S 190 set transform-set ESP-AES-128-SHA
V1FWCL01(config)# crypto map S2S 190 set security-association lifetime seconds 3600
V1FWCL01(config)# crypto map S2S 190 set security-association lifetime kilobytes 4608000

You will get this warning until the crypto map gets the 3 things it needs
  • The ACL
  • The peer address
  • The transform set
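
When preparing crypto maps in advance, a quick check of the prepared text file for those three pieces saves a surprise later. This is a rough sketch: crypto_map_missing is a made-up helper, and it assumes one command per line in the same form as the example above. It also accepts the "set ikev1 transform-set" spelling used on newer ASA code.

```shell
#!/bin/bash
# crypto_map_missing FILE MAP_NAME SEQ
# Hypothetical helper: reports which of the three required pieces of a
# crypto map entry are absent from a prepared config file.
crypto_map_missing() {
    local file="$1"
    local prefix="^[[:space:]]*crypto map $2 $3 +"
    grep -Eq "${prefix}match address " "$file" \
        || echo "missing: ACL (match address)"
    grep -Eq "${prefix}set peer " "$file" \
        || echo "missing: peer address (set peer)"
    grep -Eq "${prefix}set (ikev1 )?transform-set " "$file" \
        || echo "missing: transform set"
}
```

For example: crypto_map_missing prepared-config.txt S2S 190
No output means all three pieces are there.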

VPN troubleshooting guide from Cisco

http://www.cisco.com/en/US/products/ps6120/products_tech_note09186a00807e0aca.shtml

Monday 22 July 2013

unable to connect to VMware console of VM

Check events on the esx host
Connect to the esx host and run df or vdf

Look for volumes that have filled up.

Sometimes you have to restart the services to get them to release the space after you have cleared log files out etc.

good tools for testing website load speed

Online tool (good quick test)
http://tools.pingdom.com

Desktop tool (more detail)
http://fiddler2.com/


How long does DNS take to propagate and short bash script to test after changing DNS entries

I was changing a customer's DNS to point at new public IPs. They were concerned about how long it would take to switch over. The DNS provider told me they usually tell customers a change will be complete within their own network within 24 hours and that they can't control anything outside of that. They said a DNS change normally takes effect immediately and should have propagated across the internet within 24 hours, so the 24 hour figure covers them with customers.

Wrote a short bash script to do an nslookup so we could see the new public IP's and a curl to show the website was up.


echo "**********************************************"
echo "nslookup on my.customer.com"
echo "**********************************************"
echo " "
nslookup my.customer.com
echo " "
echo "**********************************************"
echo "curl on my.customer.com, expecting http 200 OK"
echo "**********************************************"
echo " "
curl -IL my.customer.com
echo " "

Thursday 18 July 2013

Good explanation of NAT on Cisco ASA 8.3+

http://www.tunnelsup.com/tup/2011/06/24/nat-for-cisco-asas-version-8-3


Video here:
http://www.youtube.com/watch?v=REGJodyLJEU

NAT for Cisco ASA's Version 8.3+


There are two major kinds of NAT in 8.3+: Auto NAT and Manual NAT. Auto NAT is done inside the object and cannot take into consideration the destination of the traffic. Manual NAT is done in global configuration and can NAT the source IPs, the destination IPs, or both.

Auto NAT

The new term “autoNAT” is used in 8.3. Auto NAT is when the NAT command appears INSIDE the object statement on the firewall. There are two major variants of auto NAT: dynamic and static. Auto NAT is also sometimes referenced as “Network Object NAT” because the configuration is done within the network object.
Regular Dynamic PAT
To create a many-to-one NAT where the entire inside network is getting PAT’d to a single outside IP do the following.
Old 8.2 command:
nat (inside) 1 10.0.0.0 255.255.255.0
global (outside) 1 interface
New 8.3 equivalent command:
object network inside-net
  subnet 10.0.0.0 255.255.255.0
  nat (inside,outside) dynamic interface

Note: the “interface” command is the 2nd interface in the nat statement, in this case the outside.
Static Auto-NAT
To create a one to one NAT within the object like when you have a webserver in your DMZ you can do the following NAT configuration.
object network dmz-webserver
  host 192.168.1.23
  nat (dmz,outside) static 209.165.201.28

Please note, the nat (inside,outside) part of these commands is a lot easier to read in 8.3. The first interface is the interface the traffic is coming into the ASA on and the second interface is the interface that this traffic is going out of the ASA on. So the command “nat (dmz,outside) static 209.165.201.28” should be read as “NAT the IP address 192.168.1.23 to 209.165.201.28 if the traffic is coming in on the dmz interface and going out the outside interface, or vice versa.” This will not NAT traffic coming from the inside going to the DMZ, nor will it NAT traffic coming from the DMZ going to the inside.
Using the any interface in the NAT statement
ASA 8.3 introduces the any interface when configuring NAT. For instance if you have a system on the DMZ that you wish to NAT not only to the outside interface, but to any interface you can use this command:
object network dmz-webserver
  host 192.168.1.23
  nat (dmz,any) static 200.200.200.200

This makes it so users on the inside can web to 200.200.200.200 and if traffic is routed to the firewall it will NAT it to the real IP in the DMZ.
Port forwarding using Auto NAT
Suppose you have 2 web servers in your DMZ but you only have 1 IP address. You can configure port forwarding using the auto NAT feature in the following way:
object network dmz-webserver1
  host 192.168.1.25
  nat (dmz,outside) static interface service tcp 8000 www
object network dmz-webserver2
  host 192.168.1.23
  nat (dmz,outside) static interface service tcp 8080 www

This will make it so if you go to the IP address of the outside interface over port 8000 it will take you to 192.168.1.25 port 80 but if you go there using port 8080 it will take you to 192.168.1.23 port 80.
Confused yet? I hope not because it’s about to get weird…

Manual NAT or Twice NAT or Policy NAT or Reverse NAT

The limitation of Auto NAT is that it cannot take the destination into consideration when conducting its NAT. This of course also means it cannot alter the destination address. To accomplish either of these tasks you must use “manual NAT”.
All of these terms are identical: Manual NAT, Twice NAT, Policy NAT, Reverse NAT. Don’t be confused by fancy mumbo jumbo.
Policy NAT Exemption aka NAT Zero aka No NAT
In ASA 8.3 code this is known as Policy NAT exemption. This is commonly used to not NAT traffic over a VPN tunnel.
object network inside-net
  subnet 10.0.0.0 255.255.255.0
object network vpn-subnets
  range 10.1.0.0 10.5.255.255
nat (inside,outside) source static inside-net inside-net destination static vpn-subnets vpn-subnets

Policy NAT exemption for incoming remote access VPNs
In order for a packet to come in through a firewall from a lesser security interface to a higher security interface it must have a translation and an ACL to permit it through. If you are setting up remote access VPN then the ACL is usually bypassed since it’s tunneled traffic. There still needs to be a translation. This is completed by doing the following (Note the order of the interfaces in the NAT statement):
object-group network OBJ-INSIDE-NETWORKS
  network-object 172.16.200.0 255.255.255.0
object network obj-172.16.101.0
  subnet 172.16.101.0 255.255.255.0
nat (OUTSIDE,INSIDE) source static obj-172.16.101.0 obj-172.16.101.0 destination static OBJ-INSIDE-NETWORKS OBJ-INSIDE-NETWORKS

Dynamic Policy NAT
This is when you want to specify an ACL for your NAT traffic to match on and if it matches that ACL then NAT it to something
Suppose you are trying to build a VPN tunnel to another site. The problem is that your private IP addresses are overlapping with their private IP addresses so they tell you that you MUST come from 172.27.27.27. If this was a static one to one translation it wouldn’t be so hard but in this case we have many users all needing to use that IP address.
In the pre 8.3 configuration your code would look something like this:
access-list ACL-VENDOR-VPN-NAT extended permit ip 192.168.1.0 255.255.255.0 host 172.16.75.5
nat (inside) 3 access-list ACL-VENDOR-VPN-NAT
global (outside) 3 172.27.27.27

In the new ASA 8.3 config the code looks like this:
object network inside-net
  subnet 192.168.1.0 255.255.255.0
object network vendor-vpn-nat
  host 172.16.75.5
object network translated-ip
  host 172.27.27.27
nat (inside,outside) source dynamic inside-net translated-ip destination static vendor-vpn-nat vendor-vpn-nat

Miscellaneous Notes

Use real IPs in access-lists
In ASA version 8.3 you must specify the real IP and not the translated IP. For instance, to permit traffic to the webserver through the outside ACL you must put:
access-list ACL-OUTSIDE-IN extended permit tcp any host 192.168.1.25 eq 80
This is a major change from pre 8.3 which would specify the public or NAT’d IP address.

Show commands

To view this configuration you must check two places to see what is being NAT’d.
show run object
show run nat
The command “show run object in-line” is sometimes useful when using the pipe commands.
You can also see the order of NAT and number of NAT translation hit counts with:
show nat

Optional Destination keyword in manual NAT

The destination keyword and addresses in the manual NAT command are optional. This means that both of these configurations do the same work:
object network inside-net
  subnet 10.0.0.0 255.255.255.0
  nat (inside,outside) dynamic interface
!
object network inside-net
  subnet 10.0.0.0 255.255.255.0
nat (inside,outside) source dynamic inside-net interface

NAT order and after-auto NAT’ing

The order of operation in NAT commands is documented here:
http://www.cisco.com/en/US/partner/docs/security/asa/asa83/configuration/guide/nat_overview.html#wp1118157
The NAT operation will only take place once. Once there is a match on a NAT it will stop looking down the line to see whether it needs to NAT this traffic or not. The order of operation for this is like so:
  1. Twice NAT statements
  2. Auto NAT statements
  3. After-Auto NAT statements
Let’s say you have a Manual or Twice NAT that you want to be considered AFTER all of the auto NATs. You can specify this by adding the “after-auto” keyword which would look something like this:
nat (inside,outside) after-auto source dynamic any
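
The "first match wins, section by section" behaviour can be modelled in a few lines of Python. This is a simplified sketch and not how the ASA is implemented: the rule names, the packet dictionary and the match functions are all made up for illustration, and it ignores the ASA's own ordering of auto NAT rules inside section 2.

```python
# Section numbers follow the order listed above:
# 1 = twice (manual) NAT, 2 = auto NAT, 3 = after-auto manual NAT.
SECTION_ORDER = {"twice": 1, "auto": 2, "after-auto": 3}


def first_matching_rule(rules, packet):
    """Return the name of the first rule that matches, or None.

    rules is a list of (section, name, predicate) tuples in configuration
    order; sorted() is stable, so rules keep that order within a section.
    """
    ordered = sorted(rules, key=lambda rule: SECTION_ORDER[rule[0]])
    for _section, name, matches in ordered:
        if matches(packet):
            return name  # NAT happens once; later rules are never consulted
    return None


# Hypothetical rules, deliberately listed out of section order.
rules = [
    ("after-auto", "catch-all-pat", lambda p: True),
    ("auto", "dmz-webserver", lambda p: p["src"] == "192.168.1.23"),
    ("twice", "vpn-no-nat",
     lambda p: p["src"].startswith("10.0.0.") and p["dst"].startswith("10.1.")),
]

print(first_matching_rule(rules, {"src": "10.0.0.5", "dst": "10.1.2.3"}))     # vpn-no-nat
print(first_matching_rule(rules, {"src": "192.168.1.23", "dst": "8.8.8.8"}))  # dmz-webserver
print(first_matching_rule(rules, {"src": "10.0.0.5", "dst": "8.8.8.8"}))      # catch-all-pat
```

The catch-all rule only fires for the last packet because it sits in the after-auto section, which is the point of the keyword.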

Using Descriptions

The description keyword can be added to the end of a manual NAT statement to keep things more organized like so:
nat (OUTSIDE,INSIDE) source static obj-172.16.101.0 obj-172.16.101.0 destination static OBJ-INSIDE-NETWORKS OBJ-INSIDE-NETWORKS description ANYCON-NONAT

Inactive NAT statements

You may deactivate a manual NAT statement by adding the “inactive” keyword at the end of the statement like so:
nat (OUTSIDE,INSIDE) source static obj-172.16.101.0 obj-172.16.101.0 destination static OBJ-INSIDE-NETWORKS OBJ-INSIDE-NETWORKS inactive

Cisco Documentation on NAT for 8.3

CLI NAT configuration guide for ASA 8.3: http://www.cisco.com/en/US/partner/docs/security/asa/asa83/configuration/guide/nat_overview.html
Upgrading to ASA 8.3 – What you need to know: https://supportforums.cisco.com/docs/DOC-12690
Video examples and tutorial: https://supportforums.cisco.com/docs/DOC-12324
ASA Pre-8.3 to 8.3 NAT configuration examples: https://supportforums.cisco.com/docs/DOC-9129
ASA NAT migration problems when upgrading to 8.3; Syslog “%ASA-5-305013: Asymmetric NAT rules matched for forward and reverse flows”: https://supportforums.cisco.com/docs/DOC-12569

Monday 15 July 2013

creating large test files on windows


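Use fsutil (the length is given in bytes, so this example creates a 1 GB file)
fsutil file createnew c:\testfile.bin 1073741824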

Powershell script
$file = Read-Host "Enter File Path"
$size = Read-Host "Enter File Size followed by MB or GB (Example: 10MB or 10GB)"
$objFile = [io.file]::Create($file)
$objFile.SetLength((Invoke-Expression $size))
$objFile.Close()
Write-Host "File Created: $file Size: $size"
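
For comparison, here is a rough Python equivalent that works the same way (create the file, then set its length). It is a sketch only: the helper names parse_size and create_test_file and the temp-directory path are assumptions, and it parses the size suffix itself instead of evaluating user input.

```python
import os
import tempfile

UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text):
    """Turn a string such as '10MB' or '2GB' into a number of bytes."""
    text = text.strip().upper()
    for suffix, multiplier in UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * multiplier
    return int(text)  # no suffix: treat the value as bytes


def create_test_file(path, size_text):
    """Create (or overwrite) path and set its length, like SetLength above."""
    size = parse_size(size_text)
    with open(path, "wb") as handle:
        handle.truncate(size)
    return size


if __name__ == "__main__":
    # Hypothetical target path in the temp directory.
    target = os.path.join(tempfile.gettempdir(), "testfile.bin")
    print(create_test_file(target, "10MB"), os.path.getsize(target))
```

As with the PowerShell version, no data is written, so even large files are created quickly.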

Thursday 11 July 2013

generating a self signed cert

  • openssl genrsa -des3 -out mykeyname.key 1024
  • openssl req -new -key mykeyname.key -out mykeyname.csr
  • openssl x509 -req -days 365 -in mykeyname.csr -signkey mykeyname.key -out mykeyname.cer
  • openssl pkcs12 -export -in mykeyname.cer -inkey mykeyname.key -out mykeyname.p12 -name mykeyname -CAfile mykeyname.cer -caname mykeyname -chain

The certificate (mykeyname.cer) was added into the Trusted Root Certification Authorities in the default domain policy. (Default Domain Policy -> Computer Configuration -> Windows Settings -> Public Key Policies -> Trusted Root Certification Authorities)

ioping shows disk latency in the same way as ping shows network latency

Interesting tool

https://code.google.com/p/ioping/

Wednesday 10 July 2013

python script to test VPN connectivity opens socket to IP addresses and ports

Quick script I wrote to test VPN connections. It reads in a list (data.csv) of IPs and ports, attempts to open a socket to each one, then reports the result.

#!/usr/bin/env python3

# Import some needed modules
import csv
import socket

# Function that will take input of a customer name, an IP address and a port number
# It will attempt to open a socket to that IP/port and report on success or failure

def test_connection(customer, address, port):
    # Create a TCP socket
    s = socket.socket()
    # Set the socket timeout to 15 seconds
    s.settimeout(15)
    print("Attempting to connect to customer %s on IP %s and port %s" % (customer, address, port))

    try:
        s.connect((address, port))
        print_and_log_this("Connection to customer [ %s ] on IP [ %s ] and port [ %s ] was [ OK ]" % (customer, address, port))
        # If we can connect return True
        return True
    except socket.error as e:
        # If we can't connect and get an error return False
        print_and_log_this("Connection to customer [ %s ] on IP [ %s ] and port [ %s ] has [ FAILED ] with error [ %s ]" % (customer, address, port, e))
        return False
    finally:
        # Always release the socket, whether or not the connection worked
        s.close()

# Function to print a message on screen and append it to a log file
def print_and_log_this(message):
    print(message)
    with open("testipandportlog.txt", "a") as logfile:
        logfile.write(message + "\n")


# Main function of the program (program starts here)
if __name__ == '__main__':

    # Read in lines from the csv file, each line should have the following information
    # Customer-name,IP-address,Port-number
    # For each line in the CSV use the function created above to check connectivity
    with open("data.csv", newline="") as data:
        for line in csv.reader(data):
            customer, address = line[0], line[1]
            port = int(line[2])  # convert string to int
            test_connection(customer, address, port)



Monday 8 July 2013

two datacenters one ISP

Had an issue today on the ISP end. Lost connectivity to several sites. The issue was routing between my ISP and the destinations. We have two data centers but are stuck with one ISP. In a better world we should use a datacenter that has more than one ISP. Alternatively use two separate data centers with different ISPs. This way if you hit a routing issue you can fail over to the other site and other ISP.

When pinging one of the public IP's I was trying to reach I got the message Time to live exceeded from an IP near where my traffic exits the ISP network onto the internet. From researching the message it points towards a routing loop. The ISP claimed an attack was causing the issue, but maybe someone just made a mistake.

Looking glasses can be helpful to spot BGP issues with your ISP
This one is from BT
http://lg.as2110.net/

Look up the destination IP and compare the result to your own ISP.

Base software to install on windows servers


Latest version of powershell and powershell ISE
http://windirstat.info/ - Disk usage report
http://www.7-zip.org/ - Opens lots of archives
http://www.wireshark.org/ - Network traffic capture
http://technet.microsoft.com/en-us/sysinternals/bb842062 - Sysinternals Suite lots of very useful tools
http://technet.microsoft.com/en-us/library/cc771275(v=ws.10).aspx - Telnet client
Your backup or monitoring agent if required
Your AV solution
Configure NTP
Configure Logging
Right click -> Computer -> Properties -> Advanced -> Startup and Recovery -> Settings -> Kernel memory dump

http://getgreenshot.org/ - screenshots

http://www.nirsoft.net/ - suite of tools bluescreen view being popular

RSAT - remote admin tools for windows server etc (install and then appwiz to add them)
ASDM - for Cisco
filezilla - transferring files
foxit reader - PDF
google chrome - web browser
java - required for asdm and maybe other apps
mRemoteNG - saving connections, ssh rdp, vnc, webpage external app
remote desktop connection manager (might not be needed if you have above)
keepass - password store
zenmap - gui for nmap
portqueryui - good for checking open ports and a nice screenshot
openssl (sclient) W32openssl
openssl s_client -connect www.google.com:443
openssl can convert certs too
sublime text or notepad++
sublimetext editor
winscp
grep for windows
dig for windows
my traceroute (https://winmtr.en.uptodown.com/windows)
sysinternals suite 
maybe microsoft powertoys for admins
psexec can be replaced by powershell PSSessions or Invoke-command. Psexec is still great because it runs locally on the target system. Can run as local system account.

NTRadping - Radius testing
https://community.microfocus.com/t5/OES-Tips-Information/NTRadPing-1-5-RADIUS-Test-Utility/ta-p/1777768

Wednesday 3 July 2013

servers time going out of sync

I had an issue where a monitoring slave went out of sync with the master.

Quick fix - set the date/time manually
"date MMDDhhmmYYYY", so 17:32 on 13/12/2012 would be "date 121317322012". You need to be root to run this command, so run "sudo su" or "sudo bash" first.
Or restart ntpd
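
The month-day-hour-minute-year order is easy to get wrong by hand, so here is a small Python sketch that builds the command string from a datetime. The function name date_command is just an illustration.

```python
from datetime import datetime


def date_command(when):
    """Build the 'date MMDDhhmmYYYY' command string for a datetime."""
    return "date " + when.strftime("%m%d%H%M%Y")


# 17:32 on 13 December 2012, the example used above
print(date_command(datetime(2012, 12, 13, 17, 32)))  # date 121317322012
```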

You should have an NTP server configured in your network. FYI you can configure a Cisco ASA to provide NTP. The NTP server itself should be syncing off an NTP server that is local to you, see http://www.pool.ntp.org/en/.

Some troubleshooting / information gathering steps below:

Step 1 - log on to both servers (the out of sync and the in sync)
Run "watch -n 1 date" this should highlight the difference in time

Step 2 - is the server VM or physical
sudo /usr/sbin/dmidecode | grep "Manufacturer: \|Product Name: "
If it's a VM, check that the VMware or Hyper-V tools are installed and check the time sync settings there
Check time sync settings on the VM host
If physical check ntp settings

Step 3 - check ntp config
vi /etc/ntp.conf 

Step 4 - check scheduled tasks, is ntpdate or ntpd running
sudo bash
crontab -l
*/1 * * * * /usr/bin/ntpd -q > /dev/null 2>&1
This runs ntpd every minute with -q, which quits after it has set the time. The "> /dev/null" sends the output to the bin because we don't want to see it, and the "2>&1" redirects stderr to stdout so errors go the same way.
ntpdate and ntpd are different and you shouldn't have both running at the same time

Step 5 - check service startup settings
/sbin/chkconfig --list

Step 6 - check ntpd service status
/etc/init.d/ntpd status


Thursday 20 June 2013

creating a vhd with diskpart

Start a cmd prompt as administrator

DISKPART

CREATE VDISK FILE="c:\myvhd.vhd" MAXIMUM=20000
Maximum is the size in MB

SELECT VDISK FILE="c:\myvhd.vhd"

ATTACH VDISK

CREATE PARTITION PRIMARY

ASSIGN LETTER=X

FORMAT QUICK LABEL=MYVHD

EXIT

Now you can copy stuff onto it in my computer. Then detach the vhd and attach it to your VM in hyper-v. This can be useful for moving files on and off a VM. Or just creating an OS VHD.

Monday 17 June 2013

reset a single VPN and check the VPN uptime on a Cisco ASA


Reset this site to site VPN
clear ipsec sa peer 200.200.200.100

Show the uptime on the VPN, look for duration
show vpn-sessiondb detail l2l | b 200.200.200.100

Unfortunately on the older PIX firewalls you can't do this; you have to reset all VPNs :(

Installing Cisco ASA firewalls in the rack

Attend site with all the equipment required
Laptop and charger
Console cable
Socket board with male connection
network cables
Cable testing tools
Screwdrivers
Cage nuts
Firewall power cables
Mounts etc
Reusable cable ties / velcro
Labeler

Identify the cold / warm side of the rack

Mount firewalls so hot air is blown into the warm side of the rack.

You should make sure you have the latest software image installed. Also the correct security K9 etc. The correct license should also be applied.

Run a "wr erase" to wipe out the config.

Configure interfaces.

Cisco PIX firewall not responding to arps

I was moving some app servers to new public IP addresses. After the move the websites were not available. Everything looked correct on the firewall. When I ran a capture on the firewall I saw that packets were not making it to the firewall. The provider put in some static routes as a temp fix. Later we removed the temp fix and reloaded the firewall. It didn't resolve the issue.

I found the setting "sysopt noproxyarp outside" in the config on the firewall.

I ran "no sysopt noproxyarp outside" and I was able to access the websites.

From Cisco documentation
"Proxy ARP allows the security appliance to reply to an ARP request on behalf of hosts behind it. It does this by replying to ARP requests for the static mapped addresses of those hosts. The security appliance responds to the request with its own MAC address and then forwards the IP packets on to the appropriate inside host."

I have no idea who put this setting in or why it wasn't causing an issue before. Anyway, the issue is resolved now.

Friday 24 May 2013

creating a check for a device with SNMP and Nagios

I'm assuming your monitoring software is based on nagios. 

First stop is to check if a check already exists in the monitoring system.

If not check http://exchange.nagios.org/. Download and read the script. Understand it and test it.

If you can't find one, you have two options. Create one from scratch or use check_snmp.

To use check_snmp you need to know the correct OIDs. Contact the vendor of the device or check the documentation; sometimes they have all of this in one document. Otherwise use snmpwalk:
snmpwalk -v2c -c communityname 192.168.1.10
You'll have to go through all the OIDs to find the value you are interested in monitoring. You can then set up check_snmp with that OID.

 snmpwalk -v2c -c  communityname 192.168.1.10 1.3.6.1.4.1.20632.5.14
SNMPv2-SMI::enterprises.20632.5.14 = STRING: "42.0 degrees C"

That OID "1.3.6.1.4.1.20632.5.14" is for CPU temp.

Lets try it with check snmp
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14
SNMP OK - "43.0 degrees C" |

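Without a threshold or pattern, check_snmp returns OK whenever the device answers, whatever the value, so this check tells you nothing about the temperature. To test the value you can use the -r switch, which matches the returned string against a regular expression.

This means the check will be OK as long as the temperature is between 40 and 49 degrees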
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14 -r "4[0123456789]"
SNMP OK - "42.0 degrees C" |

If I changed it to 50 - 59, it would alert
./check_snmp -H 10.7.11.219 -C cudaSNMP -o 1.3.6.1.4.1.20632.5.14 -r "5[0123456789]"
SNMP CRITICAL - *"42.0 degrees C"* |
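
To see what a regex check like this does and does not catch, here is a small Python sketch. It only imitates the idea (pattern found means OK) and is not check_snmp's code. The function names are made up, and the numeric threshold_check is an alternative added for comparison, not something from the plugin.

```python
import re


def regex_check(snmp_value, pattern):
    """Return 'OK' when pattern is found in the value, else 'CRITICAL'."""
    return "OK" if re.search(pattern, snmp_value) else "CRITICAL"


def threshold_check(snmp_value, critical_at):
    """Alternative: parse the leading number and compare it numerically."""
    number = float(re.match(r"\s*(-?\d+(?:\.\d+)?)", snmp_value).group(1))
    return "CRITICAL" if number >= critical_at else "OK"


print(regex_check("42.0 degrees C", "4[0123456789]"))  # OK
print(regex_check("42.0 degrees C", "5[0123456789]"))  # CRITICAL
print(regex_check("39.0 degrees C", "4[0123456789]"))  # CRITICAL
print(threshold_check("39.0 degrees C", 50))           # OK
```

The third line shows the catch with a 40-49 pattern: the check also goes critical when the device runs cooler than 40, which a numeric comparison avoids.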

If you want to write a check from scratch its a good idea to look at some checks already on http://exchange.nagios.org/. You'll need to get all the OID's you need. You'll also have to figure out what each value means. This can be a lot of work, forcing certain situations (unplugging cables etc) and checking the values returned. This is why most people just use check_snmp with a simple ok / not ok check.


command for finding interfaces which have not been used on cisco switches


Only available on 4500's with supervisor
# SHOW INTERFACE LINK

# show int | i proto|Last in

# show int | i proto.*notconnect|proto.*administratively down|Last in.* [6-9]w|Last in.*[0-9][0-9]w|[0-9]y|disabled|Last input never, output never, output hang never

This last command filters out text you don't need.
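
If you have saved "show int" output to a file, the same filter can be applied offline. The sketch below reuses the include pattern from the last command in Python; the sample lines are invented for illustration and real output will vary by platform.

```python
import re

# Same alternation as the "| i" filter in the last command above.
UNUSED_FILTER = re.compile(
    r"proto.*notconnect|proto.*administratively down|Last in.* [6-9]w|"
    r"Last in.*[0-9][0-9]w|[0-9]y|disabled|"
    r"Last input never, output never, output hang never"
)


def unused_port_lines(show_interface_output):
    """Keep only the lines the switch's include filter would keep."""
    return [line for line in show_interface_output.splitlines()
            if UNUSED_FILTER.search(line)]


# Hypothetical excerpt of "show int" output.
SAMPLE_LINES = [
    "GigabitEthernet1/0/1 is up, line protocol is up (connected)",
    "  Last input 00:00:01, output 00:00:00, output hang never",
    "GigabitEthernet1/0/2 is down, line protocol is down (notconnect)",
    "  Last input 8w2d, output 8w2d, output hang never",
    "GigabitEthernet1/0/3 is down, line protocol is down (notconnect)",
    "  Last input never, output never, output hang never",
]

for line in unused_port_lines("\n".join(SAMPLE_LINES)):
    print(line)
```

With this sample the active port (1/0/1) drops out and the two idle ports remain, each followed by its "Last input" line.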

Investigating high CPU usage on cisco switches

show processes cpu sorted | excl 0.00%  0.00%  0.00%

This command will show you the processes that are using the most CPU. If one is over 5% then there is a problem. Google the process name to see what it does and take it from there. View the graphs in your monitoring system to narrow down when it started. Check the logs from the switch.

Catalyst 9200 on IOS XE 16.x troubleshooting steps
https://www.cisco.com/c/en/us/support/docs/ios-nx-os-software/ios-xe-16/213549-troubleshoot-high-cpu-usage-in-catalyst.html

  • IOSd
  • LSMPI
  • FED 
  • Doppler ASIC
  • Physical interface

IOSd: This is the Cisco IOS® daemon that runs on the Linux kernel. It is run as a software process within the kernel

LSMPI: Linux Shared Memory Punt Interface

Forwarding Engine Driver (FED): This is the heart of the Cisco Catalyst switch and is responsible for all hardware programming/forwarding


  • Packet Delivery System (PDS): This is the architecture and process of how packets are delivered to and from the various subsystems. As an example, it controls how packets are delivered from the FED to the IOSd and vice versa
  • Control Plane (CP): The control plane is a generic term used to group together the functions and traffic that involve the CPU of the Catalyst Switch. This includes traffic such as Spanning Tree Protocol (STP), Hot Standby Router Protocol (HSRP), and routing protocols that are destined to the switch, or sent from the switch. This also includes application layer protocols like Secure Shell (SSH), and Simple Network Management Protocol (SNMP) that must be handled by the CPU
  • Data Plane (DP): Typically the data plane encompasses the hardware ASICs and traffic that is forwarded without assistance from the Control Plane
  • Punt: An ingress protocol control packet that is intercepted by the DP and sent to the CP for processing
  • Inject: CP generated protocol packet sent to DP to egress out on IO interface(s)

show processes cpu sorted 5min | e 0.00%  0.00%  0.00%
Look for highest execution time

show platform hardware fed switch active qos queue stats internal cpu policer

show platform software fed switch active punt cause summary
show platform software fed switch active punt cause clear
show platform software fed switch active punt cause summary

show platform software fed switch active punt cpuq rates | e 0        0        0        0        0        0

show platform software fed switch active punt rates interfaces

show platform software fed switch active punt rates interfaces 0x000001d2

show platform software fed switch active punt rates interfaces 0x000001d2 | e 0        0        0        0

show monitor capture cpuCap buffer brief

show monitor capture cpuCap buffer detailed

Packet captures on switch
https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst3850/software/release/16-3/configuration_guide/b_163_consolidated_3850_cg/b_163_consolidated_3850_cg_chapter_01001011.html

show processes cpu history
*'s show spikes, #'s are used for average


Script for Intermittent High CPU
From https://www.cisco.com/c/en/us/support/docs/ios-nx-os-software/ios-xe-16/213549-troubleshoot-high-cpu-usage-in-catalyst.html#anc28

In the event that the high CPU on the switch is intermittent, it is possible to set up a script on the switch to automatically run these commands at the time of high CPU events. The entry-val is used to determine how high the CPU is before the script triggers. The script monitors the 5 second CPU average SNMP OID. Two files are written to the flash, tac-cpu-<timestamp>.txt contains the command outputs, and tac-cpu-<timestamp>.pcap contains the CPU ingress capture. These files can then be reviewed at a later date.

config t
no event manager applet high-cpu authorization bypass
event manager applet high-cpu authorization bypass
event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.3.1 get-type next entry-op gt entry-val 80 poll-interval 1 ratelimit 300 maxrun 180
action 0.01 syslog msg "High CPU detected, gathering system information."
action 0.02 cli command "enable"
action 0.03 cli command "term exec prompt timestamp"
action 0.04 cli command "term length 0"
action 0.05 cli command "show clock"
action 0.06 regex "([0-9]|[0-9][0-9]):([0-9]|[0-9][0-9]):([0-9]|[0-9][0-9])" $_cli_result match match1
action 0.07 string replace "$match" 2 2 "."
action 0.08 string replace "$_string_result" 5 5 "."
action 0.09 set time $_string_result
action 1.01 cli command "show proc cpu sort | append flash:tac-cpu-$time.txt"
action 1.02 cli command "show proc cpu hist | append flash:tac-cpu-$time.txt"
action 1.03 cli command "show proc cpu platform sorted | append flash:tac-cpu-$time.txt"
action 1.04 cli command "show interface | append flash:tac-cpu-$time.txt"
action 1.05 cli command "show interface stats | append flash:tac-cpu-$time.txt"
action 1.06 cli command "show log | append flash:tac-cpu-$time.txt"
action 1.07 cli command "show ip traffic | append flash:tac-cpu-$time.txt"
action 1.08 cli command "show users | append flash:tac-cpu-$time.txt"
action 1.09 cli command "show platform software fed switch active punt cause summary | append flash:tac-cpu-$time.txt"
action 1.10 cli command "show platform software fed switch active cpu-interface | append flash:tac-cpu-$time.txt"
action 1.11 cli command "show platform software fed switch active punt cpuq all | append flash:tac-cpu-$time.txt"
action 2.08 cli command "no monitor capture tac_cpu"
action 2.09 cli command "monitor capture tac_cpu control-plane in match any file location flash:tac-cpu-$time.pcap"
action 2.10 cli command "monitor capture tac_cpu start" pattern "yes"
action 2.11 cli command "yes"
action 2.12 wait 10
action 2.13 cli command "monitor capture tac_cpu stop"
action 3.01 cli command "term default length"
action 3.02 cli command "terminal no exec prompt timestamp"
action 3.03 cli command "no monitor capture tac_cpu"
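
Actions 0.06 to 0.09 are the least obvious part of the applet: they pull the hh:mm:ss time out of the "show clock" output and replace the colons with dots so it can be used in a flash filename. Here is a Python sketch of that logic. The sample clock string is made up, and the pattern uses \d{1,2} rather than a literal copy of the applet's regex, because Python's alternation would otherwise stop at the first digit of the seconds.

```python
import re


def capture_suffix(show_clock_output):
    """Mimic actions 0.06-0.09: pull hh:mm:ss out and swap ':' for '.'."""
    match = re.search(r"\d{1,2}:\d{1,2}:\d{1,2}", show_clock_output)
    if match is None:
        raise ValueError("no hh:mm:ss time found")
    stamp = match.group(0)
    # "string replace <text> 2 2 ." overwrites the character at index 2,
    # and the second call does the same at index 5.
    stamp = stamp[:2] + "." + stamp[3:]
    stamp = stamp[:5] + "." + stamp[6:]
    return stamp


# Hypothetical "show clock" output.
print("flash:tac-cpu-%s.txt" % capture_suffix("*17:32:05.123 UTC Wed Jul 3 2013"))
# flash:tac-cpu-17.32.05.txt
```

Like the applet, this replaces by position, so it assumes the hour and minute are both two digits.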