For tables that use the InnoDB storage engine, storage is managed in units of pages, and the page is the basic unit exchanged between memory and disk. Loading a page from disk into memory requires disk I/O, and the cost of that I/O has a large effect on overall performance. If the page can instead be read directly from memory, the disk I/O is avoided and efficiency improves considerably. This is the motivation for the Buffer Pool, so next, let's talk about the Buffer Pool in InnoDB.
Some people may think that since the buffer pool is so good, why not just store all the data in it? That does not work: the buffer pool is a contiguous block of memory requested from the operating system, and memory is much smaller and more expensive than disk. So how much memory is allocated to the buffer pool? By default, 128MB.
Of course, if your machine has plenty of memory, you can set the startup option innodb_buffer_pool_size in the configuration file to a larger value. The unit is bytes, and the value cannot be less than 5MB.
The buffer pool divides the contiguous memory it receives from the operating system into pages (buffer pages) with a default size of 16KB. At this point, no disk page is actually cached in the buffer pool yet. When we swap a page from disk into the buffer pool, how do we decide where it goes? Some control information is needed to identify each buffer page. This control information is stored in a memory area called a control block, and control blocks correspond to buffer pages one-to-one. The size of a control block is fixed as well. Because the memory region is rarely an exact multiple of one control block plus one buffer page, a leftover piece that is too small to use, a fragment, will usually remain. In summary, the buffer pool consists of the control blocks, the buffer pages they describe, and possibly a small unused fragment.
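To make the fragment concrete, here is a small arithmetic sketch. The 16KB page size comes from the text above; the control block size of 800 bytes and the function name layout are assumptions chosen only for illustration, and the real layout inside InnoDB may differ.

```python
PAGE_SIZE = 16 * 1024        # default buffer page size mentioned above
CONTROL_BLOCK_SIZE = 800     # hypothetical size in bytes, for illustration only


def layout(region_bytes, page_size=PAGE_SIZE, block_size=CONTROL_BLOCK_SIZE):
    """Return (buffer pages that fit, leftover bytes) for one contiguous region."""
    slot = page_size + block_size          # each buffer page needs one control block
    pages = region_bytes // slot
    leftover = region_bytes - pages * slot  # too small for another slot: the fragment
    return pages, leftover


pages, leftover = layout(128 * 1024 * 1024)
print(pages, leftover)
```

With these assumed numbers the leftover is smaller than one control block plus one page, so it cannot hold another buffer page.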
Each control block contains linked list node information, so what are the linked lists used for? They make the pages in the buffer pool easier to manage. The lists link control blocks, and because control blocks correspond one-to-one to buffer pages, linking the control blocks effectively links the pages.
The free linked list links the control blocks of all free buffer pages.
The problem it solves: when a page is swapped from disk into the buffer pool, how do we know which buffer pages are free? With the free linked list, a free buffer page is taken directly from the list, the information about the disk page is filled into the control block of that buffer page, and the control block is then removed from the free linked list.
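The free linked list can be sketched roughly as follows. This is a simplified model, not InnoDB's actual data structure: the class names ControlBlock and FreeList and the fields frame_no, space_id and page_no are hypothetical, and a deque stands in for the linked list.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlBlock:
    frame_no: int                    # which buffer page this block describes
    space_id: Optional[int] = None   # tablespace of the cached disk page, if any
    page_no: Optional[int] = None    # page number of the cached disk page, if any


class FreeList:
    def __init__(self, num_pages):
        # Initially every buffer page is free.
        self._blocks = deque(ControlBlock(i) for i in range(num_pages))

    def __len__(self):
        return len(self._blocks)

    def allocate(self, space_id, page_no):
        """Take a free buffer page and record which disk page it now holds."""
        if not self._blocks:
            return None              # nothing free: the caller has to evict a page first
        block = self._blocks.popleft()   # the block leaves the free list
        block.space_id, block.page_no = space_id, page_no
        return block


free_list = FreeList(2)
first = free_list.allocate(space_id=0, page_no=7)
```

Once allocate returns None, the buffer pool has no free page left, which is where the LRU linked list described later comes in.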
If the data of a buffer page in the buffer pool is modified so that it no longer matches the data on disk, the page is called a dirty page. The control blocks of all dirty pages are linked to form the flush linked list, and at some later time the data of those buffer pages is flushed to disk based on this list.
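A minimal sketch of the idea, under a few assumptions: the class name FlushList and the write_page callback are hypothetical, pages are identified by a plain id, and dirty pages are written back in the order they were first modified.

```python
class FlushList:
    def __init__(self):
        # A dict keeps insertion order, so the page dirtied earliest comes first.
        self._dirty = {}

    def __len__(self):
        return len(self._dirty)

    def mark_dirty(self, page_id):
        # A page that is already dirty keeps its original position.
        if page_id not in self._dirty:
            self._dirty[page_id] = None

    def flush(self, write_page, max_pages):
        """Write up to max_pages dirty pages to disk and return their ids."""
        batch = list(self._dirty)[:max_pages]
        for page_id in batch:
            write_page(page_id)       # hypothetical callback that writes the page
            del self._dirty[page_id]  # the page is clean again
        return batch


flush_list = FlushList()
for page_id in (3, 5, 3):             # page 3 is modified twice
    flush_list.mark_dirty(page_id)
written = []
flushed = flush_list.flush(written.append, max_pages=1)
```

Modifying page 3 twice adds it to the list only once, and flushing in batches means a write to a page does not have to wait for disk I/O.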
The size of the buffer pool is limited. If the cached pages fill the buffer pool, there are no free buffer pages left. When a new page then needs to enter the buffer pool, the LRU strategy is used to remove an old buffer page from the buffer pool before the new page is added. Since the LRU linked list involves a lot of content, we will introduce it separately below.
Read-ahead is an I/O optimization mechanism in InnoDB. As the name suggests, read-ahead asynchronously loads into the buffer pool pages that are expected to be needed soon. These requests bring in all the pages of a range, relying on the so-called principle of locality. The purpose is to reduce disk I/O.
Before looking at the read-ahead mechanism, let's review the InnoDB logical storage units: tablespace → segment → extent → page. The extent deserves special mention because it will be used later: an extent is 64 pages that are contiguous in physical location, so with 16KB pages the size of an extent is 1MB.
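The arithmetic behind these numbers can be checked with a short sketch. The helper names extent_of and pages_in_extent are hypothetical, and the page numbering is a simplified model in which pages are numbered consecutively from 0.

```python
PAGE_SIZE = 16 * 1024                        # 16KB per page
PAGES_PER_EXTENT = 64                        # contiguous pages per extent
EXTENT_SIZE = PAGE_SIZE * PAGES_PER_EXTENT   # 1MB


def extent_of(page_no):
    """Index of the extent that contains the given page number."""
    return page_no // PAGES_PER_EXTENT


def pages_in_extent(extent_no):
    """All page numbers that belong to the given extent."""
    start = extent_no * PAGES_PER_EXTENT
    return range(start, start + PAGES_PER_EXTENT)


print(EXTENT_SIZE // (1024 * 1024), "MB per extent")
```

A read-ahead request that covers a whole extent would, in this model, load every page in pages_in_extent(extent_of(page_no)).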
The read-ahead mechanism can be subdivided into two types: linear read-ahead and random read-ahead.
With a plain LRU linked list, when a page is accessed (that is, it becomes the most recently used page) and the page is already in the buffer pool, its control block is moved to the head of the LRU linked list. This approach has drawbacks. Pages read ahead into the buffer pool are placed at the head of the LRU linked list, but many of them may never be read.
Loading many rarely accessed pages into the buffer pool also evicts frequently accessed pages from it, for example during a full table scan.
To address this, the LRU linked list is divided into a hot data area and a cold data area. The two areas occupy different proportions of the list, and the startup option innodb_old_blocks_pct controls the proportion taken by the cold data area.
When a page is loaded into the buffer pool for the first time, the control block at the tail of the cold data area is evicted first (that is, its corresponding page is evicted), and then the control block of the new page is placed at the head of the cold data area.
Let me state the conclusion first: this alone does not solve the problem well. Taking a full table scan as an example, the reason is as follows.
A page visited for the first time is placed at the head of the cold data area, but a subsequent access moves it to the head of the hot data area, which still crowds out pages with a higher access frequency.
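The behaviour of the hot and cold areas can be sketched with two deques. This is a simplified model rather than InnoDB's implementation: the class name MidpointLRU is hypothetical, old_pct only mimics the role of innodb_old_blocks_pct, and when the hot area overflows its tail is assumed to be demoted to the head of the cold area.

```python
from collections import deque


class MidpointLRU:
    def __init__(self, capacity, old_pct):
        self.capacity = capacity
        # Pages the hot area may hold; the remainder is the cold area.
        self.hot_capacity = capacity - capacity * old_pct // 100
        self.hot = deque()    # index 0 is the head
        self.cold = deque()   # index 0 is the head, the right end is the tail

    def access(self, page_id):
        """Touch a page and return the id of the evicted page, if any."""
        if page_id in self.hot:                 # already hot: move to the hot head
            self.hot.remove(page_id)
            self.hot.appendleft(page_id)
            return None
        if page_id in self.cold:                # later access: promote to the hot head
            self.cold.remove(page_id)
            self.hot.appendleft(page_id)
            while len(self.hot) > self.hot_capacity:
                self.cold.appendleft(self.hot.pop())   # demote the hot tail
            return None
        evicted = None
        if len(self.hot) + len(self.cold) >= self.capacity:
            # Evict from the cold tail, falling back to the hot tail if cold is empty.
            evicted = self.cold.pop() if self.cold else self.hot.pop()
        self.cold.appendleft(page_id)           # first load: head of the cold area
        return evicted


lru = MidpointLRU(capacity=4, old_pct=50)
for page in (1, 2, 1):             # page 1 is touched twice, so it becomes hot
    lru.access(page)
for page in (10, 11, 12, 13):      # a scan that touches each page once
    lru.access(page)
```

After the scan, page 1 is still in the hot area because the scanned pages only passed through the cold area. If the scan touches pages 13 and 12 a second time, they are promoted and page 1 is pushed out of the hot area, which is the weakness described above.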
The buffer pool tries to keep frequently used data in memory. When MySQL reads a page, it first checks whether the page is in the buffer pool. If it is, the page is read directly. If it is not, the page is loaded from disk into the buffer pool and then read.
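The read path just described can be sketched in a few lines. The class name BufferPool, the read_from_disk callback and the dict that plays the role of the disk are all hypothetical, and eviction is left out to keep the sketch short.

```python
class BufferPool:
    def __init__(self, read_from_disk):
        self._pages = {}                      # page id -> cached page data
        self._read_from_disk = read_from_disk
        self.disk_reads = 0                   # counts how often disk I/O was needed

    def get_page(self, page_id):
        if page_id in self._pages:            # hit: read directly from memory
            return self._pages[page_id]
        data = self._read_from_disk(page_id)  # miss: load from disk first
        self.disk_reads += 1
        self._pages[page_id] = data
        return data


disk = {1: b"page-1", 2: b"page-2"}           # stand-in for the data files
pool = BufferPool(disk.__getitem__)
pool.get_page(1)
pool.get_page(1)                              # served from memory this time
```

The second call for page 1 does not touch the disk, which is the saving the buffer pool is meant to provide.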