Explore the secrets of IO performance optimization in Linux systems-LINUX-php.cn

Home

System Tutorial

LINUX

Explore the secrets of IO performance optimization in Linux systems

王林

Feb 09, 2024 pm 03:21 PM

linux linux tutorial linux system linux command shell script embeddedlinux Getting started with linux linux learning

Explore the secrets of IO performance optimization in Linux systems

In today’s context of big data and artificial intelligence, IO performance is crucial for any computer system. For Linux systems, we need to have an in-depth understanding of its IO performance model and its optimization strategies. This article will introduce in detail the Linux system IO model and performance optimization methods for different IO operations.

The current mainstream third-party IO testing tools include [neiqian]fio[/neiqian], [neiqian]iometer[/neiqian] and [neiqian]Orion[/neiqian]. Each of these three tools has its own merits.

fio is more convenient to use under Linux system, iometer is more convenient to use under window system, Orion is Oracle's IO testing software, which can simulate the reading and writing of Oracle database scenarios without installing Oracle database.

The following is an IO test on SAN storage using the fio tool on a Linux system.

1. Install fio

Method 1: Download the fio-2.1.10.tar file from the fio official website. After decompression, you can use fio after ./configure, make, and make install.

Method 2: Install through yum under Linux system, yum install -y fio

2. [neiqian]fio[/neiqian] parameter explanation

You can use fio -help to view each parameter. For specific parameters, you can view the how to document on the official website. The following is a description of several common parameters

filename=/dev/emcpowerb 支持文件系统或者裸设备，-filename=/dev/sda2或-filename=/dev/sdb
direct=1 测试过程绕过机器自带的buffer，使测试结果更真实
rw=randwread 测试随机读的I/O
rw=randwrite 测试随机写的I/O
rw=randrw 测试随机混合写和读的I/O
rw=read 测试顺序读的I/O
rw=write 测试顺序写的I/O
rw=rw 测试顺序混合写和读的I/O
bs=4k 单次io的块文件大小为4k
bsrange=512-2048 同上，提定数据块的大小范围
size=5g 本次的测试文件大小为5g，以每次4k的io进行测试
numjobs=30 本次的测试线程为30
runtime=1000 测试时间为1000秒，如果不写则一直将5g文件分4k每次写完为止
ioengine=psync io引擎使用pync方式，如果要使用libaio引擎，需要yum install libaio-devel包
rwmixwrite=30 在混合读写的模式下，写占30%
group_reporting 关于显示结果的，汇总每个进程的信息
此外
lockmem=1g 只使用1g内存进行测试
zero_buffers 用0初始化系统buffer
nrfiles=8 每个进程生成文件的数量

Copy after login

3. Detailed explanation of fio test scenarios and report generation

testing scenarios:

100% random, 100% read, 4K

fio -filename=/dev/emcpowerb -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -

bs=4k -size=1000G -numjobs=50 -runtime=180 -group_reporting -name=rand_100read_4k

Copy after login

100% random, 100% written, 4K

fio -filename=/dev/emcpowerb -direct=1 -iodepth 1 -thread -rw=randwrite -ioengine=psync -

bs=4k -size=1000G -numjobs=50 -runtime=180 -group_reporting -name=rand_100write_4k

Copy after login

100% sequence, 100% read, 4K

fio -filename=/dev/emcpowerb -direct=1 -iodepth 1 -thread -rw=read -ioengine=psync -

bs=4k -size=1000G -numjobs=50 -runtime=180 -group_reporting -name=sqe_100read_4k

Copy after login

100% order, 100% writing, 4K

fio -filename=/dev/emcpowerb -direct=1 -iodepth 1 -thread -rw=write -ioengine=psync -

bs=4k -size=1000G -numjobs=50 -runtime=180 -group_reporting -name=sqe_100write_4k

Copy after login

100% random, 70% read, 30% write 4K

fio -filename=/dev/emcpowerb -direct=1 -iodepth 1 -thread -rw=randrw -rwmixread=70 -

ioengine=psync -bs=4k -size=1000G -numjobs=50 -runtime=180 -group_reporting -name=randrw_70read_4k

Copy after login

Result report view:

[root@rac01-node02]# fio -filename=/dev/sdc4 -direct=1 -iodepth 1 -thread -rw=randrw -rwmixre
ad=70 -ioengine=psync -bs=4k -size=1000G -numjobs=50 -runtime=180 -group_reporting -name=r
andrw_70read_4k_local
randrw_70read_4k_local: (g=0): rw=randrw, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
...
fio-2.1.10
Starting 50 threads
Jobs: 21 (f=21): [mm_m_m__mmmmmm__mm_m_mmm_mm__m_m_m] [3.4% done] [7004KB/2768KB/0KB /s]
 [1751/692/0 iops] [eta 01h:27m:00s]
randrw_70read_4k_local: (groupid=0, jobs=50): err= 0: pid=13710: Wed May 31 10:23:31 2017
read : io=1394.2MB, bw=7926.4KB/s, iops=1981, runt=180113msec

clat (usec): min=39, max=567873, avg=24323.79, stdev=25645.98
lat (usec): min=39, max=567874, avg=24324.23, stdev=25645.98
clat percentiles (msec):
| 1.00th=[ 3], 5.00th=[ 5], 10.00th=[ 6], 20.00th=[ 7],
| 30.00th=[ 9], 40.00th=[ 12], 50.00th=[ 16], 60.00th=[ 21],
| 70.00th=[ 27], 80.00th=[ 38], 90.00th=[ 56], 95.00th=[ 75],
| 99.00th=[ 124], 99.50th=[ 147], 99.90th=[ 208], 99.95th=[ 235],
| 99.99th=[ 314]
bw (KB /s): min= 15, max= 537, per=2.00%, avg=158.68, stdev=38.08

write: io=615280KB, bw=3416.8KB/s, iops=854, runt=180113msec

clat (usec): min=167, max=162537, avg=2054.79, stdev=7665.24
lat (usec): min=167, max=162537, avg=2055.38, stdev=7665.23
clat percentiles (usec):
| 1.00th=[ 201], 5.00th=[ 227], 10.00th=[ 249], 20.00th=[ 378],
| 30.00th=[ 548], 40.00th=[ 692], 50.00th=[ 844], 60.00th=[ 996],
| 70.00th=[ 1160], 80.00th=[ 1304], 90.00th=[ 1720], 95.00th=[ 3856],
| 99.00th=[40192], 99.50th=[58624], 99.90th=[98816], 99.95th=[123392],
| 99.99th=[148480]
bw (KB /s): min= 6, max= 251, per=2.00%, avg=68.16, stdev=29.18
lat (usec) : 50=0.01%, 100=0.03%, 250=3.15%, 500=5.00%, 750=5.09%
lat (usec) : 1000=4.87%
lat (msec) : 2=9.64%, 4=4.06%, 10=21.42%, 20=18.08%, 50=19.91%
lat (msec) : 100=7.24%, 250=1.47%, 500=0.03%, 750=0.01%

cpu : usr=0.07%, sys=0.21%, ctx=522490, majf=0, minf=7
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%

submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=356911/w=153820/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
READ: io=1394.2MB, aggrb=7926KB/s, minb=7926KB/s, maxb=7926KB/s, mint=180113msec,
 maxt=180113msec
WRITE: io=615280KB, aggrb=3416KB/s, minb=3416KB/s, maxb=3416KB/s, mint=180113msec, 
maxt=180113msec

Disk stats (read/write):
sdc: ios=356874/153927, merge=0/10, ticks=8668598/310288, in_queue=8978582, util=99.99%
io=执行了多少M的IO

bw=平均IO带宽
iops=IOPS
runt=线程运行时间
slat=提交延迟
clat=完成延迟
lat=响应时间
bw=带宽
cpu=利用率
IO depths=io队列
IO submit=单个IO提交要提交的IO数
IO complete=Like the above submit number, but for completions instead.
IO issued=The number of read/write requests issued, and how many of them were short.
IO latencies=IO完延迟的分布

io=总共执行了多少size的IO
aggrb=group总带宽
minb=最小.平均带宽.
maxb=最大平均带宽.
mint=group中线程的最短运行时间.
maxt=group中线程的最长运行时间.

ios=所有group总共执行的IO数.
merge=总共发生的IO合并数.
ticks=Number of ticks we kept the disk busy.
io_queue=花费在队列上的总共时间.
util=磁盘利用率

Copy after login

4. Extended IO queue depth

At a certain moment, there are N inflight IO requests, including IO requests in the queue and IO requests being processed by the disk. N is the queue depth.
Increasing the hard disk queue depth is to make the hard disk work continuously and reduce the idle time of the hard disk.
Increase the queue depth -> Improve utilization -> Obtain peak IOPS and MBPS -> Note that the response time is within an acceptable range,
There are many ways to increase the queue depth. Using asynchronous IO and initiating multiple IO requests at the same time is equivalent to having multiple IO requests in the queue. Multi-threads initiating synchronous IO requests is equivalent to having multiple IO requests in the queue.
Increase the application IO size. After reaching the bottom layer, it will become multiple IO requests, which is equivalent to multiple IO requests in the queue. The queue depth increases.
As the queue depth increases, the waiting time of IO in the queue will also increase, resulting in longer IO response time, which requires a trade-off.

Why do we need to parallelize disk I/O? The main purpose is to improve the performance of the application. This is particularly important for virtual disks (or LUNs) composed of multiple physical disks.
If I/O is submitted one at a time, although the response time is shorter, the system throughput is very small.
In comparison, submitting multiple I/Os at one time not only shortens the head movement distance (through the elevator algorithm), but also improves IOPS.
If an elevator can only take one person at a time, then once everyone takes the elevator, they can reach their destination quickly (response time), but it will take a longer waiting time (queue length).
Submitting multiple I/Os to the disk system at once balances throughput and overall response time.

Linux system to view the default queue depth:

[root@qsdb ~]# lsscsi -l
[0:0:0:0] disk DGC VRAID 0533 /dev/sda

state=running queue_depth=30 scsi_level=5 type=0 device_blocked=0 timeout=30
[0:0:1:0] disk DGC VRAID 0533 /dev/sdb

state=running queue_depth=30 scsi_level=5 type=0 device_blocked=0 timeout=30
[2:0:0:0] disk DGC VRAID 0533 /dev/sdd

state=running queue_depth=30 scsi_level=5 type=0 device_blocked=0 timeout=30
[2:0:1:0] disk DGC VRAID 0533 /dev/sde

state=running queue_depth=30 scsi_level=5 type=0 device_blocked=0 timeout=30
[4:2:0:0] disk IBM ServeRAID M5210 4.27 /dev/sdc

state=running queue_depth=256 scsi_level=6 type=0 device_blocked=0 timeout=90
[9:0:0:0] cd/dvd Lenovo SATA ODD 81Y3677 IB00 /dev/sr0

state=running queue_depth=1 scsi_level=6 type=5 device_blocked=0 timeout=30

Copy after login

Use the dd command to set bs=2M for testing:

dd if=/dev/zero of=/dev/sdd bs=2M count=1000 oflag=direct

Copy after login

Recorded the read of 1000 0. Recorded the write of 1000 0. 2097152000 bytes (2.1 GB) copied, 10.6663 seconds, 197 MB/second

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util

sdd 0.00 0.00 0.00 380.60 0.00 389734.40 1024.00 2.39 6.28 2.56 97.42

Copy after login

It can be seen that after 2MB IO reaches the bottom layer, it will become multiple 512KB IOs. The average queue length is 2.39. The utilization rate of this hard disk is 97%, and MBPS reaches 197MB/s.
(Why does it become 512KB IO? You can use Google to check the meaning and usage of the kernel parameter max_sectors_kb.) In other words, increasing the queue depth can test the peak value of the hard disk.

5. Detailed explanation of viewing IO command iostat in Linux system

[root@rac01-node01 /]# iostat -xd 3
Linux 3.8.13-16.2.1.el6uek.x86_64 (rac01-node01) 05/27/2017 _x8664 (40 CPU)
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.05 0.75 2.50 0.50 76.59 69.83 48.96 0.00 1.17 0.47 0.14
scd0 0.00 0.00 0.02 0.00 0.11 0.00 5.25 0.00 21.37 20.94 0.05
dm-0 0.00 0.00 2.40 1.24 75.88 69.83 40.00 0.01 1.38 0.38 0.14
dm-1 0.00 0.00 0.02 0.00 0.14 0.00 8.00 0.00 0.65 0.39 0.00
sdc 0.00 0.00 0.01 0.00 0.11 0.00 10.20 0.00 0.28 0.28 0.00
sdb 0.00 0.00 0.01 0.00 0.11 0.00 10.20 0.00 0.15 0.15 0.00
sdd 0.00 0.00 0.01 0.00 0.11 0.00 10.20 0.00 0.25 0.25 0.00
sde 0.00 0.00 0.01 0.00 0.11 0.00 10.20 0.00 0.14 0.14 0.00

Copy after login

Output parameter description:

rrqms：每秒这个设备相关的读取请求有多少被Merge了（当系统调用需要读取数据的时候，VFS将请求发到各个FS，如果FS发现不同的读取请求读取的是相同Block的数据，FS会将这个请求合并Merge）
wrqm/s：每秒这个设备相关的写入请求有多少被Merge了。
rsec/s：The number of sectors read from the device per second.
wsec/s：The number of sectors written to the device per second.
rKB/s：The number of kilobytes read from the device per second.
wKB/s：The number of kilobytes written to the device per second.
avgrq-sz：平均请求扇区的大小,The average size (in sectors) of the requests that were issued to the device.
avgqu-sz：是平均请求队列的长度。毫无疑问，队列长度越短越好,The average queue length of the requests that were issued to the device.

await：每一个IO请求的处理的平均时间（单位是微秒毫秒）。这里可以理解为IO的响应时间，一般地系统IO响应时间应该低于5ms，如果大于10ms就比较大了。
这个时间包括了队列时间和服务时间，也就是说，一般情况下，await大于svctm，它们的差值越小，则说明队列时间越短，反之差值越大，队列时间越长，说明系统出了问题。
svctm：表示平均每次设备I/O操作的服务时间（以毫秒为单位）。如果svctm的值与await很接近，表示几乎没有I/O等待，磁盘性能很好。
如果await的值远高于svctm的值，则表示I/O队列等待太长，系统上运行的应用程序将变慢。
%util： 在统计时间内所有处理IO时间，除以总共统计时间。例如，如果统计间隔1秒，该设备有0.8秒在处理IO，而0.2秒闲置，那么该设备的%util = 0.8/1 = 80%，
所以该参数暗示了设备的繁忙程度，一般地，如果该参数是100%表示磁盘设备已经接近满负荷运行了（当然如果是多磁盘，即使%util是100%，因为磁盘的并发能力，所以磁盘使用未必就到了瓶颈）。

Copy after login

Through the exploration and experiments of this article, we can see that Linux system IO performance optimization is not a problem that can be solved by simply improving the system hardware configuration, but requires comprehensive consideration and optimization for specific application scenarios and IO operations. . We can use various methods and tools to tune IO performance in Linux systems, such as using IO scheduler, using RAID array, using hard disk cache, etc. We hope that our exploration can be enlightening and helpful to the majority of users, so that your Linux system IO performance can be improved to a higher level.

The above is the detailed content of Explore the secrets of IO performance optimization in Linux systems. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

R.E.P.O. Best Graphic Settings

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Assassin's Creed Shadows: Seashell Riddle Solution

3 weeks ago By DDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

2 weeks ago By DDD

Will R.E.P.O. Have Crossplay?

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Where is the login entrance for gmail email?

7560

CakePHP Tutorial

1384

What is the format of the account name of steam

win11 activation key permanent

nyt connections hints and answers

Related knowledge

Difference between centos and ubuntu Apr 14, 2025 pm 09:09 PM

The key differences between CentOS and Ubuntu are: origin (CentOS originates from Red Hat, for enterprises; Ubuntu originates from Debian, for individuals), package management (CentOS uses yum, focusing on stability; Ubuntu uses apt, for high update frequency), support cycle (CentOS provides 10 years of support, Ubuntu provides 5 years of LTS support), community support (CentOS focuses on stability, Ubuntu provides a wide range of tutorials and documents), uses (CentOS is biased towards servers, Ubuntu is suitable for servers and desktops), other differences include installation simplicity (CentOS is thin)

How to install centos Apr 14, 2025 pm 09:03 PM

CentOS installation steps: Download the ISO image and burn bootable media; boot and select the installation source; select the language and keyboard layout; configure the network; partition the hard disk; set the system clock; create the root user; select the software package; start the installation; restart and boot from the hard disk after the installation is completed.

Centos options after stopping maintenance Apr 14, 2025 pm 08:51 PM

CentOS has been discontinued, alternatives include: 1. Rocky Linux (best compatibility); 2. AlmaLinux (compatible with CentOS); 3. Ubuntu Server (configuration required); 4. Red Hat Enterprise Linux (commercial version, paid license); 5. Oracle Linux (compatible with CentOS and RHEL). When migrating, considerations are: compatibility, availability, support, cost, and community support.

How to use docker desktop Apr 15, 2025 am 11:45 AM

How to use Docker Desktop? Docker Desktop is a tool for running Docker containers on local machines. The steps to use include: 1. Install Docker Desktop; 2. Start Docker Desktop; 3. Create Docker image (using Dockerfile); 4. Build Docker image (using docker build); 5. Run Docker container (using docker run).

Detailed explanation of docker principle Apr 14, 2025 pm 11:57 PM

Docker uses Linux kernel features to provide an efficient and isolated application running environment. Its working principle is as follows: 1. The mirror is used as a read-only template, which contains everything you need to run the application; 2. The Union File System (UnionFS) stacks multiple file systems, only storing the differences, saving space and speeding up; 3. The daemon manages the mirrors and containers, and the client uses them for interaction; 4. Namespaces and cgroups implement container isolation and resource limitations; 5. Multiple network modes support container interconnection. Only by understanding these core concepts can you better utilize Docker.

What computer configuration is required for vscode Apr 15, 2025 pm 09:48 PM

VS Code system requirements: Operating system: Windows 10 and above, macOS 10.12 and above, Linux distribution processor: minimum 1.6 GHz, recommended 2.0 GHz and above memory: minimum 512 MB, recommended 4 GB and above storage space: minimum 250 MB, recommended 1 GB and above other requirements: stable network connection, Xorg/Wayland (Linux)

What to do after centos stops maintenance Apr 14, 2025 pm 08:48 PM

After CentOS is stopped, users can take the following measures to deal with it: Select a compatible distribution: such as AlmaLinux, Rocky Linux, and CentOS Stream. Migrate to commercial distributions: such as Red Hat Enterprise Linux, Oracle Linux. Upgrade to CentOS 9 Stream: Rolling distribution, providing the latest technology. Select other Linux distributions: such as Ubuntu, Debian. Evaluate other options such as containers, virtual machines, or cloud platforms.

What to do if the docker image fails Apr 15, 2025 am 11:21 AM

Troubleshooting steps for failed Docker image build: Check Dockerfile syntax and dependency version. Check if the build context contains the required source code and dependencies. View the build log for error details. Use the --target option to build a hierarchical phase to identify failure points. Make sure to use the latest version of Docker engine. Build the image with --t [image-name]:debug mode to debug the problem. Check disk space and make sure it is sufficient. Disable SELinux to prevent interference with the build process. Ask community platforms for help, provide Dockerfiles and build log descriptions for more specific suggestions.

See all articles