The operating system's load reflects how heavily applications are using system resources, which makes it a natural starting point for finding optimization bottlenecks.
System load average is the average number of processes that are in a runnable or an uninterruptible state.
Runnable: the process is either running on a CPU or ready and waiting to be scheduled.
Uninterruptible: the process is blocked, typically waiting for I/O to complete.
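To make the two states concrete, here is a minimal sketch that counts them from a list of process state codes. It assumes the one-letter codes printed by `ps -eo state=` (R for runnable, D for uninterruptible sleep); the function name `count_load_states` is made up for this example. It gives an instantaneous snapshot, whereas the load average reported by the kernel is smoothed over time, so the two numbers will not match exactly.

```shell
# Count processes in the two states that contribute to the load average.
# Reads one process state code per line on stdin:
#   R = running or ready to run, D = uninterruptible sleep (usually I/O).
count_load_states() {
    awk '
        { s = substr($1, 1, 1) }          # first character is the state code
        s == "R" { r++ }
        s == "D" { d++ }
        END { printf "runnable=%d uninterruptible=%d\n", r, d }
    '
}

# Usage on a live system:
#   ps -eo state= | count_load_states
```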
On Linux, the uptime command is generally used to check the load (the w and top commands work as well).
1. uptime command
$ uptime
16:33:56 up 69 days, 5:10, 1 user, load average: 0.14, 0.24, 0.29
The above information is analyzed as follows:
16:33:56 : Current time
up 69 days, 5:10 : The system has been running for 69 days, 5 hours and 10 minutes
1 user : There is currently 1 user logged into the system
load average: 0.14, 0.24, 0.29 : The average system load over the past 1 minute, 5 minutes, and 15 minutes
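For scripting, the three load values can be pulled out of the uptime line. The following is a small sketch, assuming the "load average: a, b, c" layout shown above; `parse_load_averages` is a name chosen for this example.

```shell
# Print the 1-, 5-, and 15-minute load averages from a line of uptime output.
parse_load_averages() {
    # Drop everything up to "load average: ", then remove the commas.
    sed -n 's/.*load average: *//p' | tr -d ','
}

# Usage on a live system:
#   uptime | parse_load_averages
# On Linux the same three numbers are also the first fields of /proc/loadavg:
#   cut -d' ' -f1-3 /proc/loadavg
```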
Average load analysis
View the number of logical CPU cores:
$ grep 'model name' /proc/cpuinfo | wc -l
1
The output shows that there is 1 logical CPU core. Taking a single core as an example, and assuming the CPU can handle at most 100 processes per minute:
load=0: no process needs the CPU
load=0.5: the CPU handles 50 processes
load=1: the CPU handles 100 processes; it is fully occupied, but the system still runs smoothly
load=1.5: the CPU handles 100 processes and another 50 are queued waiting for it; the CPU is overloaded
For the system to run smoothly, the load should not exceed 1.0, so that no process has to wait and every process is handled right away.
Clearly, 1.0 is the key threshold: above it, the system is no longer in optimal condition. A value of around 0.7 is generally considered ideal.
The healthy range also scales with the number of CPU cores. With 2 cores the threshold is 2.0, and so on.
When evaluating system load, the 15-minute average is generally the value to use.
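The rule of thumb above can be written as a small check that divides the load by the core count and compares the result with the 0.7 and 1.0 thresholds. This is only a sketch: the function name `classify_load` and the labels it prints are made up for this example.

```shell
# Classify a load average relative to the number of logical CPU cores.
# Usage: classify_load LOAD CORES
# Up to 0.7 per core is treated as ideal, up to 1.0 per core as fully used
# but still fine, and anything above 1.0 per core as overloaded.
classify_load() {
    awk -v avg="$1" -v ncpu="$2" 'BEGIN {
        per_core = avg / ncpu
        if (per_core <= 0.7)      status = "ideal"
        else if (per_core <= 1.0) status = "busy"
        else                      status = "overloaded"
        printf "%.2f per core: %s\n", per_core, status
    }'
}

# Usage on a live system, with the 15-minute average and the core count:
#   classify_load "$(cut -d' ' -f3 /proc/loadavg)" \
#                 "$(grep -c 'model name' /proc/cpuinfo)"
```

A load of 1.5 is overloaded on one core but only 0.75 per core on two, which is why the core count has to be part of the check.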
2. w command
$ w
17:47:40 up 69 days, 6:24, 1 user, load average: 0.46, 0.26, 0.25
USER     TTY      FROM          LOGIN@   IDLE   JCPU   PCPU  WHAT
lvinkim  pts/0    14.18.144.2   15:55    0.00s  0.02s  0.00s w
Line 1: Same as the uptime output.
Line 2 onward: The list of users currently logged in.
3. top command
$ top
top - 17:51:23 up 69 days, 6:28, 1 user, load average: 0.31, 0.30, 0.26
Tasks: 99 total, 1 running, 98 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.3%us, 0.2%sy, 0.0%ni, 97.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1922244k total, 1737480k used, 184764k free, 208576k buffers
Swap: 0k total, 0k used, 0k free, 466732k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
    1 root  20   0 19232 1004  708 S  0.0  0.1 0:01.17  init
    2 root  20   0     0    0    0 S  0.0  0.0 0:00.01  kthreadd
...
Line 1: Same as the uptime output.
Line 2: Process counts.
Tasks: 99 total: There are 99 processes in total
1 running: 1 process is occupying the CPU
98 sleeping: 98 sleeping processes
0 stopped: 0 stopped processes
0 zombie: 0 zombie processes
Line 3: CPU usage
us (user): Percentage of CPU time used by un-niced user processes
sy (system): Percentage of CPU time used by the kernel and kernel processes
ni (nice): Percentage of CPU time used by user processes whose priority has been changed (niced)
id (idle): Percentage of time the CPU is idle. If the system is slow and this value is high, the slowness is not caused by high CPU load.
wa (iowait): Percentage of time the CPU spends waiting for I/O operations to complete. This indicator helps troubleshoot disk I/O problems and is usually read together with id.
hi (hardware IRQ): Percentage of CPU time spent handling hardware interrupts
si (software interrupts): Percentage of CPU time spent handling software interrupts
st (steal): Percentage of CPU time taken from this virtual machine to run other tasks on the host
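When these fields are needed in a script, the Cpu(s) line can be split into name and value pairs. The sketch below assumes the older "2.3%us, 0.2%sy, ..." layout shown in the sample output above; newer versions of top print the line differently, so the pattern would need adjusting there. `parse_top_cpu` is a name chosen for this example.

```shell
# Print one "name value" pair per field of top's CPU summary line.
# Assumes a line such as:
#   Cpu(s): 2.3%us, 0.2%sy, 0.0%ni, 97.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
parse_top_cpu() {
    sed -n 's/^Cpu(s): *//p' |      # keep only the CPU line, drop its label
    tr ',' '\n' |                   # one field per line
    awk -F'%' '{ gsub(/ /, "", $0); print $2, $1 }'
}

# Usage on a live system (batch mode, one iteration):
#   top -b -n 1 | parse_top_cpu
```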
Some situations to watch for:
High us and low wa: The system is slow because processes are consuming a lot of CPU. This usually comes with a low idle percentage (id), meaning the CPU has very little idle time.
Low wa and high id: A CPU bottleneck can be ruled out.
High wa: The CPU is spending a lot of time waiting on I/O. Check swap usage first. Swap space lives on disk and is far slower than memory, so once memory is exhausted and the system starts swapping, performance suffers badly; for this reason, servers with high performance requirements are generally advised to turn swap off. If memory is plentiful but wa is still high, check which process is generating the heavy I/O.
Other load patterns have to be judged case by case in practice.
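The three cases can be sketched as a simple decision function. The article gives no numeric cut-offs, so the values used here (20, 70, 30, and 50 percent) are assumptions picked purely for illustration, as are the function name `diagnose_cpu` and its output labels.

```shell
# Usage: diagnose_cpu US WA ID   (percentages as reported by top)
# NOTE: the thresholds below are illustrative assumptions, not fixed rules.
diagnose_cpu() {
    awk -v us="$1" -v wa="$2" -v id="$3" 'BEGIN {
        if (wa >= 20)                 print "io-wait"    # check swap and I/O-heavy processes
        else if (us >= 70 && id < 30) print "cpu-bound"  # user processes are consuming the CPU
        else if (id >= 50)            print "cpu-ok"     # CPU is not the bottleneck
        else                          print "unclear"    # look at other indicators
    }'
}
```

With the values from the top sample above, `diagnose_cpu 2.3 0.0 97.4` falls into the "cpu-ok" branch.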
4. iostat command
The iostat command shows the I/O usage of the system's disks and partitions.
$ iostat
Linux 2.6.32-573.22.1.el6.x86_64 (sgs02) 01/20/2017 _x86_64_ (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.29    0.00    0.25    0.04    0.00   97.41

Device:   tps   Blk_read/s   Blk_wrtn/s   Blk_read    Blk_wrtn
vda      1.15         3.48        21.88   21016084   131997520
Some noteworthy IO indicators:
Device: disk name
tps: I/O transfer requests per second
Blk_read/s: How many blocks are read per second. For iostat, a block is a 512-byte sector on kernels 2.4 and later (the file system block size reported by tune2fs is a separate value)
Blk_wrtn/s: How many blocks are written per second
Blk_read: How many blocks are read in total
Blk_wrtn: How many blocks are written in total
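Block counts are easier to read as throughput. The sketch below converts the per-second columns to KiB/s. It assumes the older column layout shown above (a header row starting with "Device:") and a default block size of 512 bytes, which is what the iostat man page gives for kernels 2.4 and later; the block size can be overridden as the first argument. `iostat_kib` is a name chosen for this example.

```shell
# Print device name, read KiB/s, and write KiB/s from iostat's device table.
# Usage: iostat | iostat_kib [BLOCK_BYTES]   (BLOCK_BYTES defaults to 512)
iostat_kib() {
    awk -v bs="${1:-512}" '
        $1 == "Device:" { in_table = 1; next }   # header row starts the table
        in_table && NF >= 6 {
            # $3 = Blk_read/s, $4 = Blk_wrtn/s
            printf "%s %.2f %.2f\n", $1, $3 * bs / 1024, $4 * bs / 1024
        }
    '
}
```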
5. iotop command
The iotop command is similar to the top command, but it displays the I/O status of each process, which is useful for locating processes with heavy I/O operations.
# iotop
Total DISK READ: 0.00 B/s | Total DISK WRITE: 774.52 K/s
  TID  PRIO  USER     DISK READ   DISK WRITE  SWAPIN     IO>    COMMAND
  272  be/3  root     0.00 B/s    0.00 B/s    0.00 %   4.86 %   [jbd2/vda1-8]
 9072  be/4  mysql    0.00 B/s  268.71 K/s    0.00 %   0.00 %   mysqld
 5058  be/4  lvinkim  0.00 B/s    3.95 K/s    0.00 %   0.00 %   php-fpm: pool www
    1  be/4  root     0.00 B/s    0.00 B/s    0.00 %   0.00 %   init
You can see the reading and writing intensity of different tasks.
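To narrow a long listing down to the tasks that are actually reading or writing, the output can be filtered. This is a rough sketch that assumes the column layout of the sample above (TID, PRIO, USER, read rate and unit, write rate and unit, and so on); `active_io` is a hypothetical helper name. iotop also has its own `-o` option that shows only tasks doing I/O.

```shell
# Keep only tasks whose DISK READ or DISK WRITE rate is non-zero.
# Expects lines shaped like:
#   9072 be/4 mysql 0.00 B/s 268.71 K/s 0.00 % 0.00 % mysqld
active_io() {
    awk '$1 ~ /^[0-9]+$/ && ($4 + 0 > 0 || $6 + 0 > 0) {
        cmd = $12                                   # command may contain spaces
        for (i = 13; i <= NF; i++) cmd = cmd " " $i
        printf "%s %s read=%s %s write=%s %s\n", $1, cmd, $4, $5, $6, $7
    }'
}

# Usage on a live system (batch mode, one iteration, typically as root):
#   iotop -b -n 1 | active_io
```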
6. sysstat tool
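Often, when a high-load condition is detected, or is known to have occurred in the past, you need to replay historical monitoring data. This is where the sar command comes in. sar is also part of the sysstat package; it records the system's CPU load, I/O status, and memory usage so that historical data can be replayed later.
The sysstat configuration file is /etc/sysconfig/sysstat, and the historical logs are stored in /var/log/sa.
Statistics are recorded every 10 minutes, and the statistics file is split at 23:59 each day. The frequency of both operations is configured in /etc/cron.d/sysstat.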
7. sar command
Use the sar command to view today's CPU usage:
$ sar
Linux 2.6.32-431.23.3.el6.x86_64 (szs01) 01/20/2017 _x86_64_ (1 CPU)

10:50:01 AM  CPU  %user  %nice  %system  %iowait  %steal  %idle
11:00:01 AM  all   0.45   0.00     0.22     0.40    0.00  98.93
Average:     all   0.45   0.00     0.22     0.40    0.00  98.93
Use the sar command to view today's memory usage:
$ sar -r
Linux 2.6.32-431.23.3.el6.x86_64 (szs01) 01/20/2017 _x86_64_ (1 CPU)

10:50:01 AM  kbmemfree  kbmemused  %memused  kbbuffers  kbcached  kbcommit  %commit
11:00:01 AM      41292     459180     91.75      44072    164620    822392   164.32
Average:         41292     459180     91.75      44072    164620    822392   164.32
Use the sar command to view today's I/O statistics:
$ sar -b
Linux 2.6.32-431.23.3.el6.x86_64 (szs01) 01/20/2017 _x86_64_ (1 CPU)

10:50:01 AM   tps  rtps  wtps  bread/s  bwrtn/s
11:00:01 AM  3.31  2.14  1.17    37.18    16.84
Average:     3.31  2.14  1.17    37.18    16.84
For more sar usage, see man sar.
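To replay an earlier day rather than today, sar can read a saved data file with its -f option. The sketch below builds the file path for a given day of the month. It assumes the default saDD naming inside /var/log/sa (some sysstat configurations use a longer date in the file name); `sa_file` is a helper name made up for this example.

```shell
# Build the path of the sysstat data file for a given day of the month.
# Usage: sa_file DAY [DIR]   (DIR defaults to /var/log/sa)
sa_file() {
    day=${1#0}                      # strip a leading zero so 08 is not read as octal
    printf '%s/sa%02d\n' "${2:-/var/log/sa}" "$day"
}

# Usage on a live system:
#   sar -q -f "$(sa_file 19)"       # run queue and load averages for the 19th
#   sar -u -s 10:00:00 -e 12:00:00 -f "$(sa_file 19)"   # CPU usage, 10:00 to 12:00
```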
The above is the detailed content of How to check system load in Linux. For more information, please follow other related articles on the PHP Chinese website!