The following is the second part of "Comprehensive Analysis of Memcached".
Published date: 2008/7/9
Original link: http://gihyo.jp/dev/feature/01/memcached/0002
The link to this series of articles is here:
I am Maesaka from the research and development team at mixi, Inc. The previous article introduced memcached as a distributed cache server. This time we will look at memcached's internal structure and how it manages memory. We will also explain the weaknesses that arise from that internal structure.
Recent versions of memcached use a mechanism called the Slab Allocator by default to allocate and manage memory. Before this mechanism existed, memory was allocated by simply calling malloc and free for every record. However, that approach causes memory fragmentation and increases the burden on the operating system's memory manager; in the worst case, the operating system can become slower than the memcached process itself. The Slab Allocator was born to solve this problem.
Let’s take a look at the principle of Slab Allocator. The following is the goal of the slab allocator from the memcached documentation:
The primary goal of the slabs subsystem in memcached was to eliminate memory fragmentation issues totally by using fixed-size memory chunks coming from a few predetermined size classes.
In other words, the basic principle of the Slab Allocator is to divide the allocated memory into blocks of predetermined sizes, which completely avoids the memory fragmentation problem.
The principle of Slab Allocation is quite simple: divide the allocated memory into chunks of various predetermined sizes, and group chunks of the same size together (into sets of chunks) (Figure 1).
Figure 1 Slab Allocation construction diagram
Moreover, the Slab Allocator is also designed to reuse allocated memory. In other words, allocated memory is never released; it is reused.
Page
The memory space allocated to a slab class; 1MB by default. After being allocated to a slab class, it is split into chunks according to that class's chunk size.
Chunk
Memory space used for caching records.
Slab Class
A group of chunks of a specific size.
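To make the relationship between these terms concrete, here is a minimal sketch in C of how a slab class groups same-sized chunks and tracks its free list. The struct and field names are illustrative only, not taken from the actual memcached source.

/* Illustrative sketch -- these names are hypothetical and do not
 * come from the memcached source code. */
typedef struct {
    size_t chunk_size;   /* size of every chunk in this class           */
    size_t perslab;      /* chunks per page, roughly 1MB / chunk_size   */
    void **free_chunks;  /* currently unused chunks, ready for reuse    */
    size_t free_count;   /* number of entries in free_chunks            */
    void **pages;        /* the 1MB pages assigned to this class        */
    size_t page_count;   /* number of pages assigned so far             */
} slab_class_t;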
The following explains how memcached selects a slab class and caches the data sent by the client into a chunk.
Memcached selects the slab class that best fits the size of the received data (Figure 2). Memcached keeps a list of free chunks for each slab class, selects a chunk from that list, and caches the data in it.
Figure 2 Method of selecting the group to store records
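As a rough sketch of that selection step, reusing the slab_class_t structure from the earlier sketch, choosing a class amounts to finding the smallest chunk size that can hold the record. This is a simplification for illustration; the real logic in memcached's slabs.c differs in detail.

/* Return the index of the smallest slab class whose chunks can hold
 * item_size bytes, or -1 if the item is too large to cache.
 * Simplified sketch; not the actual memcached implementation. */
int class_for_size(const slab_class_t *classes, int nclasses, size_t item_size)
{
    for (int i = 0; i < nclasses; i++) {
        if (item_size <= classes[i].chunk_size)
            return i;      /* e.g. 100 bytes falls into the 128-byte class */
    }
    return -1;
}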
In fact, the Slab Allocator has disadvantages as well as advantages. Its shortcomings are introduced below.
Slab Allocator solved the original memory fragmentation problem, but the new mechanism also brought new problems to memcached.
The problem is that because memory is allocated in chunks of specific, fixed lengths, the allocated memory cannot be used effectively. For example, if 100 bytes of data are cached in a 128-byte chunk, the remaining 28 bytes are wasted (Figure 3).
Figure 3 Usage of chunk space
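To spell out the arithmetic behind the example above, the tiny snippet below simply computes the per-chunk waste; the 128-byte chunk and 100-byte record are the figures from the text.

#include <stdio.h>

/* A 100-byte record stored in a 128-byte chunk leaves 28 bytes unused. */
int main(void)
{
    size_t chunk_size = 128;
    size_t record_size = 100;
    size_t wasted = chunk_size - record_size;

    printf("%zu bytes wasted per chunk (%.1f%% of the chunk)\n",
           wasted, 100.0 * wasted / chunk_size);   /* 28 bytes, 21.9% */
    return 0;
}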
There is currently no perfect solution to this problem, but a fairly effective approach is described in the documentation.
The most efficient way to reduce the waste is to use a list of size classes that closely matches (if that's at all possible) common sizes of objects that the clients of this particular installation of memcached are likely to store.
That is, if the common sizes of the data sent by clients are known in advance, or if only data of roughly the same size will be cached, waste can be reduced by using a list of size classes suited to those sizes.
Unfortunately, this kind of tuning (specifying the size classes directly) is not yet possible, so we can only look forward to future versions. However, we can adjust the differences between slab class sizes, using the growth factor option explained next.
Memcached accepts a growth factor at startup (via the -f option) that controls, to some extent, the size differences between slab classes. The default value is 1.25. Before this option existed, however, the factor was fixed at 2, the so-called "powers of 2" strategy.
Let's try starting memcached in verbose mode with that old setting:
$ memcached -f 2 -vv
The following is the verbose output after startup:
slab class   1: chunk size    128 perslab  8192
slab class   2: chunk size    256 perslab  4096
slab class   3: chunk size    512 perslab  2048
slab class   4: chunk size   1024 perslab  1024
slab class   5: chunk size   2048 perslab   512
slab class   6: chunk size   4096 perslab   256
slab class   7: chunk size   8192 perslab   128
slab class   8: chunk size  16384 perslab    64
slab class   9: chunk size  32768 perslab    32
slab class  10: chunk size  65536 perslab    16
slab class  11: chunk size 131072 perslab     8
slab class  12: chunk size 262144 perslab     4
slab class  13: chunk size 524288 perslab     2
As you can see, starting from the 128-byte class, each class is twice the size of the previous one. The problem with this setting is that the gaps between slab classes are quite large, which in some cases wastes a great deal of memory. To minimize that waste, the growth factor option was added two years ago.
Let's take a look at the output with the current default setting (f=1.25) (to save space, only the first 10 classes are shown here):
slab class   1: chunk size     88 perslab 11915
slab class   2: chunk size    112 perslab  9362
slab class   3: chunk size    144 perslab  7281
slab class   4: chunk size    184 perslab  5698
slab class   5: chunk size    232 perslab  4519
slab class   6: chunk size    296 perslab  3542
slab class   7: chunk size    376 perslab  2788
slab class   8: chunk size    472 perslab  2221
slab class   9: chunk size    592 perslab  1771
slab class  10: chunk size    744 perslab  1409
As you can see, the gaps between classes are much smaller than with a factor of 2, which is more suitable for caching records of a few hundred bytes. You may also notice from the output above that some of the sizes seem slightly off; these "errors" are deliberate, made to keep the chunk sizes byte-aligned.
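As a rough cross-check, the sketch below reproduces the f=1.25 table above. It assumes, based on the printed output rather than a reading of the source, that the first chunk is 88 bytes on this 32-bit build and that each subsequent size is rounded up to an 8-byte boundary, which is exactly the alignment adjustment mentioned above.

#include <stdio.h>

/* Reproduce (approximately) the chunk-size table printed by -vv.
 * Assumptions: 88-byte starting chunk, 8-byte alignment, 1MB pages. */
int main(void)
{
    const double factor = 1.25;            /* value passed with -f      */
    const size_t page_size = 1024 * 1024;  /* one slab page = 1MB       */
    double size = 88.0;                    /* assumed starting size     */

    for (int cls = 1; cls <= 10; cls++) {
        size_t chunk = ((size_t)size + 7) & ~(size_t)7;  /* 8-byte align */
        printf("slab class %2d: chunk size %6zu perslab %6zu\n",
               cls, chunk, page_size / chunk);
        size = chunk * factor;
    }
    return 0;
}

Setting factor to 2.0 and the starting size to 128 (and letting the loop run to 13 classes) makes the same sketch print the earlier powers-of-2 table.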
When introducing memcached into production, or before deploying it with the default values, it is best to recalculate the expected average length of your data and adjust the growth factor to obtain the most appropriate setting. Memory is a precious resource, and wasting it would be a shame.
Next, let’s introduce how to use memcached’s stats command to view slab utilization and other various information.
Memcached has a command called stats that can be used to obtain a variety of information. There are many ways to issue it, and using telnet is the simplest:
$ telnet host port
After connecting to memcached, enter stats and press Enter to obtain various information, including resource utilization. In addition, entering "stats slabs" or "stats items" returns information about the cached records. Enter quit to end the session.
For detailed information on these commands, please refer to the protocol.txt document in the memcached software package.
$ telnet localhost 11211
Trying ::1...
Connected to localhost.
Escape character is '^]'.
stats
STAT pid 481
STAT uptime 16574
STAT time 1213687612
STAT version 1.2.5
STAT pointer_size 32
STAT rusage_user 0.102297
STAT rusage_system 0.214317
STAT curr_items 0
STAT total_items 0
STAT bytes 0
STAT curr_connections 6
STAT total_connections 8
STAT connection_structures 7
STAT cmd_get 0
STAT cmd_set 0
STAT get_hits 0
STAT get_misses 0
STAT evictions 0
STAT bytes_read 20
STAT bytes_written 465
STAT limit_maxbytes 67108864
STAT threads 4
END
quit
In addition, if you install libmemcached, a client library for C/C++, a command called memstat is installed along with it. It is very easy to use: it obtains the same information as telnet in fewer steps, and it can also query multiple servers at once.
$ memstat --servers=server1,server2,server3,...
libmemcached can be obtained from:
memcached-tool, a Perl script written by Brad (the author of memcached), makes it easy to check slab usage (it organizes memcached's return values into an easy-to-read format). The script can be obtained from the following address:
Its usage is also extremely simple:
$ memcached-tool host:port option
There is no need to specify an option when checking slab usage, so just use the following command:
$ memcached-tool host:port
The information obtained is as follows:
 #  Item_Size    Max_age   1MB_pages     Count  Full?
 1     104 B   1394292 s        1215  12249628  yes
 2     136 B   1456795 s          52    400919  yes
 3     176 B   1339587 s          33    196567  yes
 4     224 B   1360926 s         109    510221  yes
 5     280 B   1570071 s          49    183452  yes
 6     352 B   1592051 s          77    229197  yes
 7     440 B   1517732 s          66    157183  yes
 8     552 B   1460821 s          62    117697  yes
 9     696 B   1521917 s         143    215308  yes
10     872 B   1695035 s         205    246162  yes
11     1.1 kB  1681650 s         233    221968  yes
12     1.3 kB  1603363 s         241    183621  yes
13     1.7 kB  1634218 s          94     57197  yes
14     2.1 kB  1695038 s          75     36488  yes
15     2.6 kB  1747075 s          65     25203  yes
16     3.3 kB  1760661 s          78     24167  yes
The meaning of each column is:

#          slab class number
Item_Size  chunk size
Max_age    age of the oldest record in the LRU
1MB_pages  number of pages allocated to the slab class
Count      number of records in the slab class
Full?      whether the slab class contains free chunks

The information obtained from this script is very convenient for tuning, and it is highly recommended.
This time I briefly explained memcached's caching mechanism and how to tune it. I hope readers now understand how memcached manages its memory, along with the advantages and disadvantages of that approach.
Next time, we will explain how LRU and Expire work, as well as memcached's latest development direction: its pluggable architecture.
Copyright statement: This article may be freely reprinted, but the original author (charlee), the original link, and this statement must be included when reprinting.