Let's talk about the String type in Redis data structures

This article walks through the String type among the Redis data structures and the KV storage structure of Redis. I hope it will be helpful to you!

Redis is often used as a distributed KV cache. Many people use it without knowing how it works underneath, and a lot happens there that is worth understanding.

String type

String is the most basic data type supported by Redis. Let's first look at its data structure and how it is stored.

Defining SDS to store String

As we all know, Redis is written in C, and C has no String type, only char[]. A char[] must be given a size when it is created, and that size cannot be changed afterwards. To support dynamic growth and expansion, as needed by commands such as incr and append, Redis defines and maintains its own SDS (Simple Dynamic String).

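Let's first look at the data structure defined in the Redis source code (sds.h). There are 5 header types (sdshdr5, sdshdr8, sdshdr16, sdshdr32 and sdshdr64), so that a short string can use a smaller header and save space.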

[Figure: SDS header struct definitions from the Redis source code]

1. len: the length of the stored string. With a plain char[] you have to traverse the array to get the length, so len(char[]) is O(n); SDS records the length in the header instead.
2. alloc: the space allocated for buf, excluding the header and the null terminator. C has no String type, only char[], and a char[] must have its length allocated up front, so it has to be expanded once the data outgrows it; alloc records how much is currently allocated.
3. flags: always occupies one byte. The lowest 3 bits indicate the header type. There are 5 header types, with constants defined in sds.h.
4. buf[]: the char array that holds the data. A plain C string uses '\0' to mark the end, so binary data such as pictures or audio, which may contain '\0', cannot be stored in it safely. SDS uses len rather than '\0' to find the end of the data, which is why Redis describes the SDS it implements as a binary-safe string.
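
To make the four fields concrete, here is a minimal Python sketch that packs bytes the way the smallest commonly used header, sdshdr8, lays them out. The helper names (make_sds8, sds_len, c_strlen) are hypothetical and exist only for this illustration; only the field order and the type constant follow sds.h.

```python
import struct
from typing import Optional

SDS_TYPE_8 = 1          # header type value for sdshdr8 in sds.h
SDS_TYPE_MASK = 0b111   # the lowest 3 bits of flags hold the header type


def make_sds8(data: bytes, alloc: Optional[int] = None) -> bytes:
    """Hypothetical helper: lay out len, alloc, flags, buf and a null terminator."""
    if alloc is None:
        alloc = len(data)
    if not (len(data) <= alloc <= 255):
        raise ValueError("sdshdr8 stores len and alloc in one byte each")
    header = struct.pack("<BBB", len(data), alloc, SDS_TYPE_8)
    # Unused capacity followed by the trailing null terminator.
    padding = b"\0" * (alloc - len(data) + 1)
    return header + data + padding


def sds_len(sds: bytes) -> int:
    """O(1): read the length from the header instead of scanning buf."""
    return sds[0]


def c_strlen(buf: bytes) -> int:
    """O(n): mimic strlen(), which stops at the first null byte."""
    return buf.index(b"\0")


payload = b"ab\0cd"                 # binary data with an embedded null byte
s = make_sds8(payload)
print(sds_len(s))                   # length taken from the header
print(c_strlen(s[3:]))              # a C-style scan stops early
print(s[2] & SDS_TYPE_MASK)         # header type stored in flags
```

The header reports the full 5 bytes, while the C-style scan stops after 2, which is the binary-safety difference described in item 4.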

SDS improvements over the native C char array

1. The SDS implemented by Redis supports expansion
2. It stores the length len, so getting the length is O(1)
3. Space pre-allocation
4. Lazy space release (discussed below)

Advantages and disadvantages of SDS

Advantages

Supports expansion
Stores the length len, so getting the length is O(1)
Space pre-allocation

Disadvantages

Requires additional memory to be allocated
Frequent allocation and release can cause efficiency problems

jemalloc, the memory allocation library used by Redis

When jemalloc allocates memory, it rounds the requested size N up to one of its fixed size classes; for small requests this is commonly described as the smallest power of 2 that is not less than N. Rounding up leaves some slack in each block, which reduces the number of reallocations. For example, if you request 6 bytes, jemalloc actually allocates 8 bytes; if you request 24 bytes, jemalloc allocates 32 bytes. This is why a dictEntry structure, which holds three 8-byte fields (24 bytes), actually occupies 32 bytes.
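
The rounding described above can be sketched in a few lines of Python. This follows the article's simplified power-of-2 description only; real jemalloc uses a table of size classes that are not all powers of 2, and next_power_of_two is a hypothetical helper, not a jemalloc API.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of 2 that is not less than n (simplified model)."""
    if n < 1:
        raise ValueError("n must be a positive number of bytes")
    # (n - 1).bit_length() is the number of bits needed to represent n - 1,
    # so shifting 1 by that amount yields the first power of 2 >= n.
    return 1 << (n - 1).bit_length()


for requested in (6, 24):
    print(requested, "->", next_power_of_two(requested))
```

The two requests from the text, 6 and 24 bytes, round up to 8 and 32 bytes respectively.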

Space pre-allocation

Space pre-allocation optimizes string growth in SDS: when the SDS API modifies an SDS and the SDS needs to be expanded, the program allocates not only the space required by the modification but also additional unused space.

The amount of additional unused space is determined by the following rules:

If, after the modification, the length of the SDS (the value of the len attribute) will be less than 1 MB, the program allocates unused space of the same size as len, so the unused space equals len. (Older SDS versions record the unused space in a free attribute; in the current header it is alloc minus len.) For example, if len will become 13 bytes after the modification, the program also allocates 13 bytes of unused space, and the actual length of the buf array becomes 13 + 13 + 1 = 27 bytes (the extra byte holds the null character).
If, after the modification, the length of the SDS will be greater than or equal to 1 MB, the program allocates 1 MB of unused space. For example, if len will become 30 MB after the modification, the program allocates 1 MB of unused space, and the actual length of the buf array becomes 30 MB + 1 MB + 1 byte.

With the space pre-allocation strategy, Redis reduces the number of memory reallocations needed when string growth operations are performed repeatedly.
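
The two rules above reduce to a short calculation. The following Python sketch reproduces the arithmetic from the examples; SDS_MAX_PREALLOC is the constant name used in sds.h, while preallocated_buf_size is a hypothetical function written for this illustration.

```python
SDS_MAX_PREALLOC = 1024 * 1024  # 1 MB threshold, named as in sds.h


def preallocated_buf_size(new_len: int) -> int:
    """Size of buf after growth: used bytes + unused bytes + null terminator."""
    if new_len < SDS_MAX_PREALLOC:
        unused = new_len            # below 1 MB: reserve as much again
    else:
        unused = SDS_MAX_PREALLOC   # 1 MB or more: reserve a flat 1 MB
    return new_len + unused + 1     # +1 for the trailing null character


print(preallocated_buf_size(13))                 # 13 + 13 + 1
print(preallocated_buf_size(30 * 1024 * 1024))   # 30 MB + 1 MB + 1 byte
```

Doubling small strings keeps the number of reallocations low for repeated appends, while the 1 MB cap avoids wasting large amounts of memory on strings that are already big.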

Lazy space release

Lazy space release optimizes string shortening in SDS: when the SDS API needs to shorten the string saved in an SDS, the program does not immediately reallocate memory to reclaim the bytes that are no longer used. Instead, it uses the free attribute to record the number of these bytes and keeps them for future use.
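
A toy Python model can show the effect of lazy release. The LazySds class below is hypothetical and much simpler than the real SDS (it does not pre-allocate, for example); it only tracks len and alloc and counts how often the buffer has to grow.

```python
class LazySds:
    """Toy model: shortening keeps the buffer, so a later append can reuse it."""

    def __init__(self, data: bytes):
        self.buf = bytearray(data)
        self.len = len(data)      # bytes in use
        self.alloc = len(data)    # bytes allocated
        self.reallocations = 0    # how many times the buffer had to grow

    @property
    def free(self) -> int:
        """Unused bytes kept for future use."""
        return self.alloc - self.len

    def shorten(self, new_len: int) -> None:
        if not 0 <= new_len <= self.len:
            raise ValueError("new_len must be between 0 and the current length")
        self.len = new_len        # buf and alloc stay untouched (lazy release)

    def append(self, data: bytes) -> None:
        needed = self.len + len(data)
        if needed > self.alloc:   # only grow when the kept space is not enough
            self.buf.extend(b"\0" * (needed - self.alloc))
            self.alloc = needed
            self.reallocations += 1
        self.buf[self.len:needed] = data
        self.len = needed

    def value(self) -> bytes:
        return bytes(self.buf[:self.len])


s = LazySds(b"hello world")
s.shorten(5)              # keeps 6 unused bytes instead of freeing them
s.append(b" redis")       # fits into the kept space, no reallocation
print(s.value(), s.free, s.reallocations)
```

Because the 6 bytes released by the shortening are kept, the following 6-byte append completes without growing the buffer.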

KV storage structure of Redis

In Redis, all data is stored as KV key-value pairs. K is a string, stored as an SDS; V may be a string, list, hash, or any other data structure supported by Redis. V is not stored directly as a specific type; it is wrapped in a redisObject, whose ptr pointer points to the actual stored data structure.

Furthermore, to save space, Redis uses the ptr pointer in different ways. When a Long integer is saved, the pointer in the redisObject is assigned the integer value directly (the int encoding), so no additional pointer to the integer is needed and the space overhead of the pointer is saved.

When string data is saved and the string is less than or equal to 44 bytes, the metadata, the pointer and the SDS in the redisObject occupy one contiguous memory area, which avoids memory fragmentation. This layout is called the embstr encoding.

When the string is larger than 44 bytes, Redis no longer lays out the SDS and the redisObject together. Instead, it allocates separate space for the SDS and uses the pointer to point to the SDS structure. This layout is called the raw encoding. The layouts are shown in the figure.

[Figure: redisObject memory layouts for the string encodings]

embstr encoding
Stores a short string with a single memory allocation;
It is read-only: if the content is modified, it becomes raw encoding (even if it does not exceed 44 bytes);
raw encoding
Stores long strings larger than 44 bytes; memory is allocated more than once, because the redisObject and the SDS are allocated separately.

If the length of a raw-encoded SDS is reduced to less than 44 bytes, will it be converted back to embstr encoding?
No. Once the underlying encoding in Redis has changed, it is not reversed (it will not be rolled back).
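
The choice between the three string encodings for a newly stored value can be sketched as follows. This is a simplified Python model, not the Redis implementation: initial_encoding and is_long are hypothetical names, and the size constants assume a 64-bit build where the redisObject takes 16 bytes, an sdshdr8 header takes 3 bytes, and the whole embstr object fits in one 64-byte allocation.

```python
ROBJ_SIZE = 16      # assumed size of redisObject on a 64-bit build
SDSHDR8_SIZE = 3    # len + alloc + flags, one byte each
NULL_TERM = 1       # trailing null character
EMBSTR_LIMIT = 64 - ROBJ_SIZE - SDSHDR8_SIZE - NULL_TERM  # 44 bytes


def is_long(value: bytes) -> bool:
    """True if value is the canonical decimal text of a signed 64-bit integer."""
    try:
        n = int(value.decode("ascii"))
    except ValueError:
        return False
    # Reject forms such as b"012" or b"+5" that do not round-trip exactly.
    return str(n).encode("ascii") == value and -2**63 <= n < 2**63


def initial_encoding(value: bytes) -> str:
    """Encoding a freshly stored string value would get in this model."""
    if is_long(value):
        return "int"      # the integer is stored in the pointer itself
    if len(value) <= EMBSTR_LIMIT:
        return "embstr"   # redisObject and SDS in one contiguous allocation
    return "raw"          # SDS allocated separately from the redisObject


for sample in (b"12345", b"a" * 44, b"a" * 45):
    print(len(sample), initial_encoding(sample))
```

The model only covers the encoding chosen when a value is first stored. As noted above, a value that has been modified becomes raw and does not return to embstr, which this function does not track.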

Summary

Redis is a commonly used caching middleware. We should understand its data structures and storage clearly, so that we can choose a more appropriate data structure and estimate memory usage when using it.

Redis memory calculation page: http://www.redis.cn/redis_memory/
