This tutorial introduces Linux perf and summarizes how to use it.
Introduction
perf is a performance analysis tool provided with Linux. It is built on a kernel subsystem called "Performance counters" and supports performance analysis at both the hardware level (the CPU and its PMU, Performance Monitoring Unit) and the software level (software counters, tracepoints).
Events in perf
Like other performance tuning tools, perf samples the monitored target and infers the behavior of the whole program from the distribution of the sampling points. The perf list command shows that perf supports many sampling events, such as branch-misses and cpu-clock. The predefined events belong to different types, such as hardware-generated events (cache misses, branch misses) and software-generated events (context switches, page faults).
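As a rough illustration of how events fall into types, the Python sketch below groups lines written in the style of perf list output by the bracketed type tag at the end of each line. The sample text is a hand-written excerpt, not captured output; the exact layout and the set of events vary with the perf version and the hardware. The helper name group_events is hypothetical.

```python
import re
from collections import defaultdict

# Hand-written excerpt in the style of `perf list` output (assumption:
# real output differs by perf version and by the machine's PMU).
SAMPLE = """\
  branch-misses                                      [Hardware event]
  cache-misses                                       [Hardware event]
  cpu-clock                                          [Software event]
  context-switches OR cs                             [Software event]
  page-faults OR faults                              [Software event]
  sched:sched_switch                                 [Tracepoint event]
"""

# Primary event name, optional "OR alias" parts, then the type tag in brackets.
LINE_RE = re.compile(r"^\s*(\S+)(?:\s+OR\s+\S+)*\s+\[(.+?)\]\s*$")


def group_events(text):
    """Map each event type tag to the list of primary event names."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            groups[match.group(2)].append(match.group(1))
    return dict(groups)


if __name__ == "__main__":
    for kind, names in group_events(SAMPLE).items():
        print(f"{kind}: {', '.join(names)}")
```

On a real system the same function could be fed the text produced by running perf list, provided the output keeps this one-event-per-line layout.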
tracepoint
Tracepoints are hooks defined in the Linux kernel. Once enabled, they fire when specific code paths execute, which lets other tools obtain information such as the system's internal running state. perf is one such tool: it records and counts tracepoint events and generates analysis reports.
Usage
The specific usage of the perf tool is as follows:
perf [--version] [--help] COMMAND [ARGS]
The list of available COMMANDs can be viewed by running perf --help. Several commonly used commands are described below.
perf stat
perf stat runs a command and collects various counters while it executes, giving an overall picture of how the program behaved. For example:
user@localhost:~$ perf stat hostname
localhost

 Performance counter stats for 'hostname':

          0.313464      task-clock (msec)         #    0.481 CPUs utilized
                 2      context-switches          #    0.006 M/sec
                 0      cpu-migrations            #    0.000 K/sec
               153      page-faults               #    0.488 M/sec
           896,723      cycles                    #    2.861 GHz
           620,709      instructions              #    0.69  insn per cycle
           121,143      branches                  #  386.465 M/sec
             6,247      branch-misses             #    5.16% of all branches

       0.000651441 seconds time elapsed
In the above example, the hostname command is run through perf stat, which summarizes and displays several indicators collected during the run, such as task-clock and context-switches. By default, perf stat outputs statistics for several commonly used events:
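task-clock (labeled task-clock-msecs by some perf versions): CPU time consumed by the task, in milliseconds; the "CPUs utilized" figure beside it is the CPU utilization
context-switches: number of context switches the process went through
page-faults: number of page faults
cpu-migrations: number of times the scheduler moved the process from one CPU to another while it was running
cycles: processor clock cycles; a single machine instruction may take several cycles
instructions: number of machine instructions executed
branches: number of branch instructions encountered
branch-misses: number of mispredicted branch instructions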
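The figures that perf stat prints after the # sign are derived from the raw counters. The Python sketch below recomputes a few of them from the values in the hostname run shown above. The variable names are illustrative, and the formulas are the natural readings of each label rather than a copy of perf's own source.

```python
# Raw values copied from the `perf stat hostname` run shown above.
task_clock_ms = 0.313464     # CPU time consumed by the task, in milliseconds
elapsed_s = 0.000651441      # wall-clock time, in seconds
cycles = 896_723
instructions = 620_709
branches = 121_143
branch_misses = 6_247

task_clock_s = task_clock_ms / 1000.0

# CPU time divided by wall-clock time: fraction of one CPU kept busy.
cpus_utilized = task_clock_s / elapsed_s

# Instructions retired per clock cycle.
ipc = instructions / cycles

# Share of branch instructions that were mispredicted.
branch_miss_pct = 100.0 * branch_misses / branches

# Rates are expressed per second of CPU time (task-clock), not wall time.
clock_ghz = cycles / task_clock_s / 1e9
branches_m_per_s = branches / task_clock_s / 1e6

if __name__ == "__main__":
    print(f"CPUs utilized:  {cpus_utilized:.3f}")
    print(f"insn per cycle: {ipc:.2f}")
    print(f"branch misses:  {branch_miss_pct:.2f}% of all branches")
    print(f"clock rate:     {clock_ghz:.3f} GHz")
    print(f"branches:       {branches_m_per_s:.3f} M/sec")
```

The results agree with the annotated column in the report, which is a quick way to confirm what each derived figure means.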
In addition, we can use the -e parameter to specify the events we are interested in, such as:
user@localhost:~$ perf stat -e cache-misses hostname
localhost

 Performance counter stats for 'hostname':

               682      cache-misses

       0.000646676 seconds time elapsed
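When the counters feed a script rather than a person, the report can be parsed. The sketch below is one possible approach: it pulls event names and values out of the human-readable report from the cache-misses run shown above. The function name parse_counters and the regular expression are assumptions made for illustration, and the layout of the report may differ between perf versions.

```python
import re

# Report text of `perf stat -e cache-misses hostname` as shown above.
# perf writes this report to stderr, separately from the command's own output.
REPORT = """\
 Performance counter stats for 'hostname':

               682      cache-misses

       0.000646676 seconds time elapsed
"""

# A counter line starts with a number (with optional thousands separators
# and decimals) followed by the event name.
COUNTER_RE = re.compile(r"^\s*([\d,]+(?:\.\d+)?)\s+([A-Za-z][\w:.-]*)")


def parse_counters(report):
    """Return {event_name: value} for every counter line in the report."""
    counters = {}
    for line in report.splitlines():
        match = COUNTER_RE.match(line)
        # Skip the trailing "seconds time elapsed" line, which also
        # begins with a number.
        if match and not line.strip().endswith("time elapsed"):
            counters[match.group(2)] = float(match.group(1).replace(",", ""))
    return counters


if __name__ == "__main__":
    print(parse_counters(REPORT))
```

Lines that do not start with a number, such as the header, are ignored, so the same function also works on the longer default report.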
perf top
perf top displays the system's current performance statistics in real time. perf stat, described above, analyzes one specific program, but sometimes we do not know which program is hurting system performance. In that case, perf top can be used to find suspicious programs. For example:
Samples: 775 of event 'cpu-clock', Event count (approx.): 92931021
Overhead  Shared Object      Symbol
   8.93%  [kernel]           [k] vsnprintf
   7.73%  perf               [.] rb_next
   5.92%  [kernel]           [k] kallsyms_expand_symbol.clone.0
   5.07%  [kernel]           [k] format_decode
   4.59%  [kernel]           [k] number
   3.40%  perf               [.] symbols__insert
   3.03%  libslang.so.2.2.1  [.] SLtt_smart_puts
The above example shows perf top counting cpu-clock events and sorting the symbols by their share of the samples. As with perf stat, the -e parameter selects other events. For example, perf top -e context-switches helps find the processes that switch context most often.
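To turn a per-symbol listing into a per-program view, the overhead can be summed by shared object. The Python sketch below does this for the rows of the perf top snapshot shown above; overhead_by_object is a hypothetical helper, and it assumes each row has the form "percentage, shared object, marker, symbol".

```python
from collections import defaultdict

# Rows copied from the perf top snapshot shown above.
ROWS = """\
8.93%  [kernel]           [k] vsnprintf
7.73%  perf               [.] rb_next
5.92%  [kernel]           [k] kallsyms_expand_symbol.clone.0
5.07%  [kernel]           [k] format_decode
4.59%  [kernel]           [k] number
3.40%  perf               [.] symbols__insert
3.03%  libslang.so.2.2.1  [.] SLtt_smart_puts
"""


def overhead_by_object(rows):
    """Sum the overhead percentages per shared object, largest first."""
    totals = defaultdict(float)
    for line in rows.splitlines():
        fields = line.split()
        # Expect: percentage, shared object, [k] or [.] marker, symbol.
        if len(fields) < 4 or not fields[0].endswith("%"):
            continue
        totals[fields[1]] += float(fields[0].rstrip("%"))
    ranked = [(obj, round(pct, 2)) for obj, pct in totals.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for obj, pct in overhead_by_object(ROWS):
        print(f"{pct:6.2f}%  {obj}")
```

In the Symbol column, [k] marks a kernel symbol and [.] a user-space symbol, so in this snapshot the kernel accounts for the largest share of the listed samples, followed by perf itself.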
perf record & perf report
perf record is similar to perf stat: it runs a command and collects statistics. However, perf record does not display the results; it writes them to a file instead (perf.data by default). The file generated by perf record can be parsed with perf report.
perf record also accepts the -g parameter, which records call graphs during profiling and helps attribute the cost to higher-level logic.
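The record-then-report workflow can be scripted. The sketch below is a minimal Python wrapper under stated assumptions: the helper names record_cmd, report_cmd and profile are hypothetical, perf must be installed and permitted to profile, and the options used (-o, -g, -i, --stdio) are the standard perf record and perf report options for the output file, call graphs, the input file and plain-text output.

```python
import shutil
import subprocess


def record_cmd(target, output="perf.data", call_graph=False):
    """Build the argv for `perf record`; `target` is the command to profile."""
    cmd = ["perf", "record", "-o", output]
    if call_graph:
        cmd.append("-g")  # also record call graphs
    # "--" separates perf's own options from the profiled command.
    return cmd + ["--"] + list(target)


def report_cmd(data="perf.data"):
    """Build the argv for a plain-text `perf report` of a recorded file."""
    return ["perf", "report", "-i", data, "--stdio"]


def profile(target, output="perf.data"):
    """Record `target` with call graphs, then return the text report."""
    if shutil.which("perf") is None:
        raise RuntimeError("perf is not installed or not on PATH")
    subprocess.run(record_cmd(target, output, call_graph=True), check=True)
    done = subprocess.run(report_cmd(output), check=True,
                          capture_output=True, text=True)
    return done.stdout


if __name__ == "__main__":
    # Print the commands only; call profile(["hostname"]) to actually run them.
    print(" ".join(record_cmd(["hostname"], call_graph=True)))
    print(" ".join(report_cmd()))
```

Keeping the command construction separate from execution makes it easy to inspect exactly what will be run before profiling a real workload.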
Others
The examples show that the Symbol column in perf results displays the names of C functions. For Java, code generated by JIT compilation appears as raw symbols rather than Java method names, which makes problems harder to locate. Additional means are needed to map those symbols to the Java program's symbol table; that mapping will be discussed in detail later.