Summary of knowledge related to Linux disk cache

Preface

I recently ran into a disk-related production incident, so I want to summarize what I learned about the Linux disk cache, a topic I did not know much about before.

Broadly, the disk cache exists for two reasons. First, accessing the disk is far slower than accessing memory, so caching disk content in memory speeds up access. Second, by the principle of program locality, data that has been accessed once is likely to be accessed again within a short time, so caching disk content in memory makes programs run faster.

Principle of locality

Principle of program locality: A running program exhibits locality, that is, over any given period its execution is confined to a certain part of the program. Correspondingly, the storage it accesses is confined to a certain region of memory. Locality usually takes two forms: temporal locality and spatial locality.

Temporal locality: A memory location that has been referenced once is likely to be referenced again in the near future.

Spatial locality: If a memory location is referenced, nearby locations are likely to be referenced in the near future as well.
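
To make the effect of locality concrete, here is a minimal sketch that replays two access patterns against a small LRU cache and compares hit rates. The function name hit_rate, the block numbers, and the cache size of 16 are all assumptions chosen for illustration; nothing here models the real kernel.

```python
import random
from collections import OrderedDict


def hit_rate(accesses, capacity):
    """Fraction of accesses served by an LRU cache holding `capacity` blocks."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # drop the least recently used block
    return hits / len(accesses)


rng = random.Random(0)
# Pattern with locality: keep sweeping over a working set of 8 neighbouring blocks.
local = [i % 8 for i in range(10_000)]
# Pattern without locality: uniformly random blocks out of 10,000.
scattered = [rng.randrange(10_000) for _ in range(10_000)]

print("local    :", hit_rate(local, capacity=16))
print("scattered:", hit_rate(scattered, capacity=16))
```

With the same small cache, the pattern that keeps returning to a few blocks is served almost entirely from the cache, while the scattered pattern almost never hits. That gap is why caching disk content in memory pays off for typical programs.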

Page Cache

To reduce disk I/O, Linux caches disk content that has been accessed in physical memory, turning later disk accesses into memory accesses and effectively speeding up programs. This use of physical memory to cache on-disk content is called the page cache.

The page cache is made up of physical pages in memory, and its contents correspond to physical blocks on disk. Its size is adjusted dynamically according to how much memory is free: it can grow by taking up free memory, and it can shrink to relieve memory pressure.

Before virtual memory was introduced, operating systems used block-based caches. With virtual memory, the operating system manages I/O at a coarser granularity, so it adopted the page cache, a page-based, file-oriented caching mechanism.

Reading the page cache

When Linux reads a file, it first looks for the content in the page cache. If the content is not there, the system first reads it from disk into the page cache, and then returns it from the page cache.

The general process is as follows:

1. The process calls the library function read to issue a file read request.
2. The kernel checks the open file table and calls the read interface provided by the file system.
3. The file system finds the inode for the file and works out which pages need to be read.
4. The page cache is looked up through the inode. 1) On a hit, the file content is returned directly. 2) On a miss (which surfaces as a page fault when the file is memory-mapped), the system allocates a new empty page in the cache, reads the file content from disk into it, and then repeats step 4.
5. The file content is read and returned.

So every ordinary (buffered) file read, whether or not it initially hits the page cache, is ultimately served from the page cache.
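
The read path above can be sketched as a toy model. This is only an illustration under loud assumptions: the class PageCache, the dictionary standing in for the disk, and the 4096-byte page size are hypothetical and do not mirror the kernel's actual data structures.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PageCache:
    """Toy page cache keyed by (inode, page index)."""

    def __init__(self, disk):
        self.disk = disk      # hypothetical backing store: inode number -> file bytes
        self.pages = {}       # (inode, page index) -> cached page content
        self.disk_reads = 0   # how many pages had to be fetched from "disk"

    def _load_page(self, inode, index):
        self.disk_reads += 1
        start = index * PAGE_SIZE
        return self.disk[inode][start:start + PAGE_SIZE]

    def read(self, inode, offset, length):
        end = min(offset + length, len(self.disk[inode]))
        out = bytearray()
        pos = offset
        while pos < end:
            index = pos // PAGE_SIZE          # which page holds this byte
            key = (inode, index)
            if key not in self.pages:         # miss: fill the cache first
                self.pages[key] = self._load_page(inode, index)
            page = self.pages[key]            # hit or miss, data comes from the cache
            start = pos % PAGE_SIZE
            take = min(PAGE_SIZE - start, end - pos)
            out += page[start:start + take]
            pos += take
        return bytes(out)


disk = {7: bytes(range(256)) * 40}       # one 10240-byte "file" with inode 7
cache = PageCache(disk)
first = cache.read(7, 4000, 200)         # spans two pages, both loaded from disk
second = cache.read(7, 4000, 200)        # same range again, served from the cache
print(cache.disk_reads)
```

In this sketch the second read does not increase disk_reads, and even the first read returns bytes taken from the cached pages rather than from the backing store directly.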

Writing to the page cache

Because of the page cache, when a process calls write, the update is written only to the file's page cache and the affected pages are marked dirty; at that point the call is complete. The Linux kernel periodically writes dirty pages back to disk and then clears their dirty flags.

Since a write only puts the changes into the page cache, the process does not block waiting for disk I/O. If the machine crashes at this point, the changes may never reach the disk. For writes with strict durability requirements, such as those in database systems, the program must therefore call fsync or a similar operation itself to flush the changes to disk in time. Reads are different: read usually blocks until the data is available to the process. To reduce read latency, Linux also uses read-ahead: when it reads data from disk, the kernel reads additional pages into the page cache.
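
As an illustration of forcing a write to disk, the following sketch writes a record and calls fsync before returning. The helper name durable_write and the file name record.log are hypothetical; os.write and os.fsync are the standard Python wrappers around the corresponding system calls.

```python
import os
import tempfile


def durable_write(path, data):
    """Write `data` to `path` and return only after fsync has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)   # lands in the page cache, pages become dirty
            view = view[written:]          # os.write may write fewer bytes than asked
        os.fsync(fd)                       # ask the kernel to flush this file's dirty pages
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "record.log")   # hypothetical file name
    durable_write(target, b"committed\n")
    with open(target, "rb") as f:
        content = f.read()

print(content)
```

Without the os.fsync call the function would return sooner, but the data would sit in dirty pages until the kernel decided to write them back.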

Writeback thread

Writeback of the page cache is carried out by dedicated kernel threads. A writeback thread writes dirty pages back in the following three situations:

When free memory falls below a threshold. When free memory runs short, part of the cache has to be released. Only clean pages can be released, so dirty pages must first be written back to disk to turn them into clean, reclaimable pages.
When dirty pages have stayed in memory longer than a threshold. This ensures that dirty pages do not remain in memory indefinitely, which reduces the risk of data loss.
When a user process calls the sync or fsync system call. This gives user processes a way to force writeback for scenarios with strict writeback requirements.
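
To observe how much data is waiting for writeback, one option is to read the Cached, Dirty, and Writeback fields of /proc/meminfo. The sketch below assumes a Linux system; the helper parse_meminfo is hypothetical, and the SAMPLE text contains made-up numbers that are used only when /proc/meminfo cannot be read.

```python
def parse_meminfo(text):
    """Return {field: kilobytes} for the /proc/meminfo lines that carry a kB unit."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if len(parts) == 2 and parts[1] == "kB":
            fields[name.strip()] = int(parts[0])
    return fields


# Made-up sample in the /proc/meminfo layout, used only as a fallback.
SAMPLE = """\
MemTotal:       16303428 kB
Cached:          5123456 kB
Dirty:               812 kB
Writeback:             0 kB
HugePages_Total:       0
"""

try:
    with open("/proc/meminfo") as f:
        text = f.read()
except OSError:
    text = SAMPLE

info = parse_meminfo(text)
for key in ("Cached", "Dirty", "Writeback"):
    print(key, info.get(key), "kB")
```

Running this before and after a large write, and again after calling sync, should show Dirty rise and then fall as the writeback threads do their work. The thresholds themselves are exposed as sysctls under /proc/sys/vm, for example dirty_background_ratio and dirty_expire_centisecs.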

Implementation of the writeback thread

Name	Version	Description
bdflush	Before version 2.6	The bdflush kernel thread runs in the background, and there is only one in the system. It is woken up when free memory falls below a specific threshold, while kupdated runs periodically to write back dirty pages. Because there is only a single bdflush thread, under heavy writeback load it can block on the I/O of one disk, so writeback to other disks is not carried out in time.
pdflush	Introduced in version 2.6	The number of pdflush threads is dynamic and depends on the system's I/O load. The threads are global and serve all disks in the system. Because of that, several pdflush threads can all end up blocked on one congested disk, which again delays writeback to other disks.
flusher thread	Introduced in version 2.6.32	There are multiple flusher threads, and they are not global: each flusher thread is tied to one disk.

Reclaiming the page cache

Page cache replacement in Linux uses a modified LRU, also known as the two-list strategy. Instead of a single LRU list, Linux maintains two: an active list and an inactive list. Pages on the active list are considered "hot" and are not evicted, while pages on the inactive list can be evicted. A page is placed on the active list only if it is accessed while it is already on the inactive list. Both lists follow pseudo-LRU rules: pages are added at the tail and removed from the head, like a queue. The two lists are kept in balance: if the active list grows larger than the inactive list, pages at the head of the active list are moved back to the inactive list, where they can be reclaimed again. The two-list strategy solves the problem that pages accessed only once cause for classic LRU, and it makes pseudo-LRU semantics easier to implement. This approach is also called LRU/2; the more general form, with n lists, is called LRU/n.
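
The two-list idea can be sketched in a few lines. This is a simplified model, not the kernel's implementation: the class TwoListCache, its capacity parameter, and the page names are hypothetical, and a hit on an already active page simply leaves the page where it is.

```python
from collections import OrderedDict


class TwoListCache:
    """Toy two-list (active/inactive) page replacement."""

    def __init__(self, capacity):
        self.capacity = capacity        # total pages across both lists (at least 1)
        self.inactive = OrderedDict()   # head = first item, tail = last item
        self.active = OrderedDict()

    def access(self, page):
        if page in self.active:
            pass                            # already hot; simplification: keep position
        elif page in self.inactive:
            del self.inactive[page]
            self.active[page] = True        # accessed again while inactive: promote
        else:
            self.inactive[page] = True      # first access: tail of the inactive list
        self._balance()
        self._evict()

    def _balance(self):
        # If the active list outgrows the inactive list, demote from its head.
        while len(self.active) > len(self.inactive):
            page, _ = self.active.popitem(last=False)
            self.inactive[page] = True

    def _evict(self):
        # Only inactive pages are evicted, starting from the head.
        while len(self.active) + len(self.inactive) > self.capacity:
            self.inactive.popitem(last=False)


cache = TwoListCache(capacity=6)
pattern = ["c1", "c2", "h1", "h2", "h1", "h2"] + [f"s{i}" for i in range(10)]
for page in pattern:
    cache.access(page)

print("active  :", list(cache.active))
print("inactive:", list(cache.inactive))
```

Here h1 and h2 are touched twice and then followed by a scan of ten pages that are each read once. In this sketch the scan only churns the inactive list, so the two hot pages survive, whereas a single LRU list of the same capacity would have pushed them out.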

Summary

The root cause of the production incident was that the business logic used temporary files as a cache. If a temporary file is deleted shortly after it is created, all operations on it happen in the page cache and nothing is actually written back to disk. When the program ran into a problem and its responses slowed down, the temporary files lived longer, which made it likely that they were written back to disk, putting excessive pressure on the disk and affecting the whole system.
