30 Linux System Monitoring Tools Every SysAdmin Should Know

Started by Administrator, Jan 08, 2023, 02:57 AM

Administrator

Need to monitor Linux server performance? Try these built-in commands and a few add-on tools. Most distributions ship with plenty of monitoring tools. They provide metrics about system activity that you can use to find the possible causes of a performance problem. The commands discussed below are some of the most fundamental ones for system analysis and for debugging Linux server issues such as:


Finding out system bottlenecks
Disk (storage) bottlenecks
CPU and memory bottlenecks
Network bottlenecks




Tutorial details
Difficulty level    Intermediate
Root privileges    Yes
Requirements    Linux terminal
Category    System Management
OS compatibility    Alma • Alpine • Arch • CentOS • Debian • Fedora • Linux • Mint • openSUSE • Pop!_OS • RHEL • Rocky • Stream • SUSE • Ubuntu • WSL
Est. reading time    19 minutes
1. top – Process activity monitoring command
The top command displays Linux processes. It provides a dynamic real-time view of a running system, i.e. actual process activity. By default, it displays the most CPU-intensive tasks running on the server and refreshes the list every few seconds (the exact default depends on the top version; use the -d option to set the delay).

Commonly Used Hot Keys With top Linux monitoring tools
Here is a list of useful hot keys:

Hot Key    Usage
t    Toggles the summary information on and off.
m    Toggles the memory information on and off.
A    Sorts the display by top consumers of various system resources. Useful for quick identification of performance-hungry tasks on a system.
f    Enters an interactive configuration screen for top. Helpful for setting up top for a specific task.
o    Enables you to interactively select the ordering within top.
r    Issues the renice command.
k    Issues the kill command.
z    Toggles between color and monochrome display.

2. vmstat – Virtual memory statistics
The vmstat command reports information about processes, memory, paging, block I/O, traps, and CPU activity. For example, to print a new sample every three seconds:

vmstat 3

Sample Outputs:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b  swpd  free  buff  cache  si  so    bi    bo  in  cs us sy id wa st
 0  0      0 2540988 522188 5130400    0    0    2    32    4    2  4  1 96  0  0
 1  0      0 2540988 522188 5130400    0    0    0  720 1199  665  1  0 99  0  0
 0  0      0 2540956 522188 5130400    0    0    0    0 1151 1569  4  1 95  0  0
 0  0      0 2540956 522188 5130500    0    0    0    6 1117  439  1  0 99  0  0
 0  0      0 2540940 522188 5130512    0    0    0  536 1189  932  1  0 98  0  0
 0  0      0 2538444 522188 5130588    0    0    0    0 1187 1417  4  1 96  0  0
 0  0      0 2490060 522188 5130640    0    0    0    18 1253 1123  5  1 94  0  0
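When vmstat runs for a while, most samples are uninteresting. The sketch below filters vmstat output down to the samples that show swapping (si/so above zero) or high I/O wait. The function name vmstat_pressure and the default 20% wait limit are assumptions made for illustration, not part of vmstat itself, and the column positions assume the 17-column layout shown above.

```shell
# vmstat_pressure [WA_LIMIT]
# Reads vmstat output on stdin and prints only the samples that show
# swap-in/swap-out activity (si/so > 0) or I/O wait above WA_LIMIT percent.
# WA_LIMIT defaults to 20, an arbitrary value chosen for this example.
vmstat_pressure() {
    awk -v limit="${1:-20}" '
        # Data rows start with a number; header rows do not.
        $1 ~ /^[0-9]+$/ && NF >= 16 {
            # Columns: 7 = si, 8 = so, 16 = wa
            if ($7 > 0 || $8 > 0 || $16 > limit)
                printf "r=%s si=%s so=%s wa=%s\n", $1, $7, $8, $16
        }'
}
```

Typical use would be: vmstat 3 10 | vmstat_pressure 20. Keep in mind that the first sample printed by vmstat reports averages since boot.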
Display Memory Utilization Slabinfo
vmstat -m

Get Information About Active / Inactive Memory Pages
vmstat -a
See "How do I find out Linux Resource utilization to detect system bottlenecks?" for more info.

Find Out Which Processes Are Using Swap Space
Use the smem command:

smem

Another option is to combine the pgrep command with the grep command. First look up the PID, then read the VmSwap field from that process's status file (48440 is the PID printed by pgrep in this example):
pgrep memcached
grep --color VmSwap /proc/48440/status
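To check every process at once instead of one PID at a time, the same VmSwap field can be read in a loop. This is a minimal sketch; the function name swap_by_process and its optional directory argument are assumptions made for illustration, and it relies on the Name, Pid, and VmSwap fields that Linux exposes in /proc/PID/status.

```shell
# swap_by_process [PROC_DIR]
# Prints "VmSwap_kB PID NAME" for every process with non-zero swap usage,
# largest first. PROC_DIR defaults to /proc.
swap_by_process() {
    for status in "${1:-/proc}"/[0-9]*/status; do
        [ -r "$status" ] || continue
        # A process may exit between the glob and the read, so errors
        # from awk are discarded.
        awk '
            /^Name:/   { name = $2 }
            /^Pid:/    { pid = $2 }
            /^VmSwap:/ { swap = $2 }
            END        { if (swap > 0) print swap, pid, name }
        ' "$status" 2>/dev/null
    done | sort -rn
}
```

Run swap_by_process with no arguments on a live system. Reading the status files of other users' processes may require root privileges.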

3. w – Find out who is logged on and what they are doing
The w command displays information about the users currently on the machine and their processes. Pass a username to limit the output to that user:
w username
w vivek
17:58:47 up 5 days, 20:28,  2 users,  load average: 0.36, 0.26, 0.24
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    10.1.3.145       14:55    5.00s  0.04s  0.02s vim /etc/resolv.conf
root     pts/1    10.1.3.145       17:43    0.00s  0.03s  0.00s w
4. uptime – Tell how long the Linux system has been running
We use the uptime command to see how long the server has been running. It prints the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.

uptime
Output:

 18:02:41 up 41 days, 23:42,  1 user,  load average: 0.00, 0.00, 0.00
A load of 1 can be considered the optimal value, but the acceptable load varies from system to system. For a single-CPU system a load of 1 to 3, and for SMP systems a load of 6 to 10, might be acceptable.
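Because the acceptable load depends on how many CPUs the system has, it can help to divide the load average by the CPU count. The helper below is only a sketch; load_per_core is a name made up for this example, and the usage line assumes /proc/loadavg and the nproc command are available.

```shell
# load_per_core LOAD CORES
# Prints LOAD divided by CORES with two decimals. A result near or above
# 1.00 means there are about as many runnable tasks as CPUs, or more.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Typical call on a live system:
#   load_per_core "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
```

For example, a 1-minute load of 2.00 on a 4-CPU server works out to 0.50 per CPU.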
5. ps – Displays the Linux processes
Use the ps command to report a snapshot of the current processes under Linux. To select all processes pass the -A or -e option as follows:

ps -A

Here is what I see:

  PID TTY          TIME CMD
    1 ?        00:00:02 init
    2 ?        00:00:02 migration/0
    3 ?        00:00:01 ksoftirqd/0
    4 ?        00:00:00 watchdog/0
    5 ?        00:00:00 migration/1
    6 ?        00:00:15 ksoftirqd/1
....
.....
 4881 ?        00:53:28 java
 4885 tty1    00:00:00 mingetty
 4886 tty2    00:00:00 mingetty
 4887 tty3    00:00:00 mingetty
 4888 tty4    00:00:00 mingetty
 4891 tty5    00:00:00 mingetty
 4892 tty6    00:00:00 mingetty
 4893 ttyS1    00:00:00 agetty
12853 ?        00:00:00 cifsoplockd
12854 ?        00:00:00 cifsdnotifyd
14231 ?        00:10:34 lighttpd
14232 ?        00:00:00 php-cgi
54981 pts/0    00:00:00 vim
55465 ?        00:00:00 php-cgi
55546 ?        00:00:00 bind9-snmp-stat
55704 pts/1    00:00:00 ps
Please note that ps is similar to the top command, but it prints a one-time snapshot and can show more information per process. Let us see some more examples.

Show Long Format Output
ps -Al

To turn on extra full mode (it will show command line arguments passed to process):
ps -AlF

Display Threads (LWP and NLWP)
ps -AlFL

Watch Threads After Processes
ps -AlLm

Print All Processes On The Server
ps ax
ps axu

Want To Print A Process Tree?
ps -ejH
ps axjf
pstree

Get Security Information of Linux Process
ps -eo euser,ruser,suser,fuser,f,comm,label
ps axZ
ps -eM

Let Us Print Every Process Running As User Vivek
ps -U vivek -u vivek u

Configure ps Command Output In a User-Defined Format
ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
ps axo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
ps -eopid,tt,user,fname,tmout,f,wchan

Try To Display Only The Process IDs of Lighttpd
ps -C lighttpd -o pid=

OR
pgrep lighttpd

OR
pgrep -u vivek php-cgi

Print The Name of PID 55977
ps -p 55977 -o comm=

Top 10 Memory-Consuming Processes
ps auxf | sort -nr -k 4 | head -10

Top 10 CPU-Consuming Processes
ps auxf | sort -nr -k 3 | head -10
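Sorting raw ps output also sorts the header line along with the data, so the column names end up lost at the bottom. The sketch below keeps the header in place and sorts only the data rows. The function name top_by_column is made up for this example; the column numbers 3 (%CPU) and 4 (%MEM) assume the standard ps aux layout.

```shell
# top_by_column COLUMN [COUNT]
# Reads ps-style output on stdin, prints the header line unchanged, then
# the COUNT rows (default 10) with the highest numeric value in COLUMN.
top_by_column() {
    input=$(cat)
    printf '%s\n' "$input" | head -n 1
    printf '%s\n' "$input" | tail -n +2 | sort -rn -k "$1,$1" | head -n "${2:-10}"
}
```

Typical use would be: ps aux | top_by_column 4 10 for memory, or ps aux | top_by_column 3 10 for CPU.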

See "Show All Running Processes in Linux" for more info.

6. free – Show Linux server memory usage
The free command shows the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel.
free

Session from my Linux home server system:

            total      used      free    shared    buffers    cached
Mem:      12302896    9739664    2563232          0    523124    5154740
-/+ buffers/cache:    4061800    8241096
Swap:      1052248          0    1052248
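The numbers printed by free come from /proc/meminfo, so a script can read that file directly. The sketch below estimates the share of memory that is really in use, not counting reclaimable cache. The function name mem_used_percent is an assumption for this example, and it relies on the MemAvailable field, which older kernels (such as the one in the sample output above) may not provide; in that case it prints nothing.

```shell
# mem_used_percent [MEMINFO_FILE]
# Prints (MemTotal - MemAvailable) / MemTotal as a percentage with one
# decimal. MEMINFO_FILE defaults to /proc/meminfo.
mem_used_percent() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END {
            # Print nothing if either field is missing.
            if (total > 0 && avail != "")
                printf "%.1f\n", (total - avail) * 100 / total
        }
    ' "${1:-/proc/meminfo}"
}
```

Call mem_used_percent with no arguments to check the running system.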
See the following resources for more info:

Linux Find Out Virtual Memory PAGESIZE
Linux Limit CPU Usage Per Process
How much RAM does my Ubuntu / Fedora Linux desktop PC have?
7. iostat – Monitor Linux average CPU load and disk activity
We use the iostat command to report central processing unit (CPU) statistics and input/output statistics for devices, partitions, and network filesystems (NFS) under Linux operating systems. For example:
iostat

From my RHEL 5 server:


Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)    06/26/2009

avg-cpu:  %user  %nice %system %iowait  %steal  %idle
          3.50    0.09    0.51    0.03    0.00  95.86

Device:            tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
sda              22.04        31.88      512.03  16193351  260102868
sda1              0.00        0.00        0.00      2166        180
sda2            22.04        31.87      512.03  16189010  260102688
sda3              0.00        0.00        0.00      1615          0
See "Linux Track NFS Directory / Disk I/O Stats" for more info.

8. sar – Monitor, collect and report Linux system activity
The sar command is used to collect, report, and save system activity information. To see the network counters, enter:
sar -n DEV | more

To see the network counters from the 24th of the month:
sar -n DEV -f /var/log/sa/sa24 | more

You can also display real-time usage using sar (here, five samples taken four seconds apart):
sar 4 5

RHEL 5 server outputs:
Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)        06/26/2009

06:45:12 PM      CPU    %user    %nice  %system  %iowait    %steal    %idle
06:45:16 PM      all      2.00      0.00      0.22      0.00      0.00    97.78
06:45:20 PM      all      2.07      0.00      0.38      0.03      0.00    97.52
06:45:24 PM      all      0.94      0.00      0.28      0.00      0.00    98.78
06:45:28 PM      all      1.56      0.00      0.22      0.00      0.00    98.22
06:45:32 PM      all      3.53      0.00      0.25      0.03      0.00    96.19
Average:          all      2.02      0.00      0.27      0.01      0.00 
9. mpstat – Monitor multiprocessor usage on Linux
The mpstat command displays activity for each available processor, processor 0 being the first one. Run mpstat -P ALL to display average CPU utilization per processor:
mpstat -P ALL

Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)        06/26/2009

06:48:11 PM  CPU  %user  %nice    %sys %iowait    %irq  %soft  %steal  %idle    intr/s
06:48:11 PM  all    3.50    0.09    0.34    0.03    0.01    0.17    0.00  95.86  1218.04
06:48:11 PM    0    3.44    0.08    0.31    0.02    0.00    0.12    0.00  96.04  1000.31
06:48:11 PM    1    3.10    0.08    0.32    0.09    0.02    0.11    0.00  96.28    34.93
06:48:11 PM    2    4.16    0.11    0.36    0.02    0.00    0.11    0.00  95.25      0.00
06:48:11 PM    3    3.77    0.11    0.38    0.03    0.01    0.24    0.00  95.46    44.80
06:48:11 PM    4    2.96    0.07    0.29    0.04    0.02    0.10    0.00  96.52    25.91
06:48:11 PM    5    3.26    0.08    0.28    0.03    0.01    0.10    0.00  96.23    14.98
06:48:11 PM    6    4.00    0.10    0.34    0.01    0.00    0.13    0.00  95.42      3.75
06:48:11 PM    7    3.30    0.11    0.39    0.03    0.01    0.46    0.00  95.69    76.89
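10. pmap – Monitor process memory usage on Linux
The pmap command reports the memory map of a process, which helps when tracking down memory bottlenecks. In the output of pmap -d PID, the last line is very important: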

mapped: 933712K total amount of memory mapped to files
writeable/private: 4304K the amount of private address space
shared: 768000K the amount of address space this process is sharing with others
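When checking many processes, it is handy to pull just those three totals out of the pmap -d output. The sketch below does that with awk. The function name pmap_summary is made up for this example, and it assumes the summary line has the "mapped: ...K writeable/private: ...K shared: ...K" shape described above.

```shell
# pmap_summary
# Reads "pmap -d PID" output on stdin and prints the totals from the
# final summary line as label=value pairs (values in kilobytes).
pmap_summary() {
    awk '/^mapped:/ {
        # Fields alternate between "label:" and "valueK".
        for (i = 1; i < NF; i += 2) {
            label = $i;       sub(/:$/, "", label)
            value = $(i + 1); sub(/K$/, "", value)
            print label "=" value
        }
    }'
}
```

Typical use would be: pmap -d PID | pmap_summary, with PID replaced by the process ID you want to inspect.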

11. netstat – Linux network and statistics monitoring tool
The netstat command shows network connections, routing tables, interface statistics, masquerade connections, and multicast memberships. For instance:

# netstat -tulpn
# netstat -nat

12. ss – Network Statistics

We use the ss command to dump socket statistics. It shows information similar to netstat. Please note that netstat is mostly obsolete, so prefer the ss command. To show all TCP sockets on Linux, type:
# ss -t -a

OR, to show all UDP sockets:
# ss -u -a

Show all TCP sockets with process SELinux security contexts:
# ss -t -a -Z
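A quick way to spot trouble, such as a pile of connections stuck in one state, is to count TCP sockets per state. This is a rough sketch: socket_states is a name made up for this example, and it assumes the connection state is the first column of the output, which is the case when ss is run with only -t (plus -a and -n).

```shell
# socket_states
# Reads "ss -tan" output on stdin, skips the header line, and prints the
# number of sockets in each state, most common state first.
socket_states() {
    awk 'NR > 1 { count[$1]++ } END { for (s in count) print count[s], s }' | sort -rn
}
```

Typical use would be: ss -tan | socket_states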

See the following resources about ss and netstat commands on Linux:

  • ss: Display Linux TCP / UDP Network and Socket Information
  • Get Detailed Information About Particular IP address Connections Using netstat Command

13. iptraf – Get real-time network statistics on Linux

Use the iptraf command on Linux. It is an interactive, colorful, ncurses-based IP LAN monitor that generates various network statistics including TCP info, UDP counts, ICMP and OSPF information, Ethernet load info, node stats, IP checksum errors, and others. It can provide the following info in an easy-to-read format for Linux developers and sysadmins:

  • Network traffic statistics by TCP connection
  • IP traffic statistics by network interface
  • Network traffic statistics by protocol
  • Network traffic statistics by TCP/UDP port and by packet size
  • Network traffic statistics by Layer2 address

https://pix.cobrasoft.org/images/2023/01/08/iptraf3.webp

https://pix.cobrasoft.org/images/2023/01/08/iptraf2.webp

14. tcpdump – Detailed network traffic analysis

We use the tcpdump command to dump traffic on a network. However, you need a good understanding of the TCP/IP protocols to use this tool. For example, to display traffic info about DNS, enter:
# tcpdump -i eth1 'udp port 53'

To view all IPv4 HTTP packets to and from port 80, i.e. to print only packets that contain data and not, for example, SYN and FIN packets or ACK-only packets, enter:
# tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'



Show all FTP sessions to 202.54.1.5, enter:
# tcpdump -i eth1 'dst 202.54.1.5 and (port 21 or 20)'

Print all HTTP sessions to 192.168.1.5:
# tcpdump -ni eth0 'dst 192.168.1.5 and tcp and port http'

To capture packets to a file and then inspect them in detail with wireshark, enter:
# tcpdump -n -i eth1 -s 0 -w output.txt src or dst port 80

15. iotop – Linux I/O monitor
The iotop command monitors I/O usage information reported by the Linux kernel. It shows a table of current I/O usage sorted by processes or threads on the server. For example:
$ sudo iotop

https://pix.cobrasoft.org/images/2023/01/08/iotop-monitoring-linux-disk-read-write-IO.webp

16. htop – interactive process viewer

The htop command is a free and open source ncurses-based process viewer for Linux and Unix-like systems such as macOS, FreeBSD and more. It is an easier-to-use alternative to the top command: you can select processes for killing or renicing without typing their PIDs or leaving the htop interface. Open the terminal and then type:
$ htop