This article walks through the String type among the Redis data structures and explains how Redis stores key-value (KV) pairs. I hope you find it helpful!
Redis is often used as a distributed KV cache. Many people simply use it without knowing what goes on underneath.
String is the most basic data type supported by Redis. Let's start by looking at its data structure and how it is stored.
As we all know, Redis is written in C. C has no String type, only char[], and the size of a char[] must be specified when it is initialized and cannot be changed afterwards. To support dynamic growth, as required by commands such as INCR and APPEND, Redis defines and maintains its own string type, SDS (Simple Dynamic String).
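Let's first look at the SDS data structure defined in the Redis source code. To save space, there are 5 header types (sdshdr5, sdshdr8, sdshdr16, sdshdr32 and sdshdr64), one for each range of string lengths. The fields are: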
1. len: the number of bytes currently in use. With a plain char[], getting the length means scanning the array up to the terminator, which is O(n); storing len in the header makes it O(1).
2. alloc: the allocated capacity of buf. C has no String type, only char[], and a char[] must have its space allocated up front. Because the space is pre-allocated, it has to be expanded once the data outgrows it, and alloc records how much room there currently is.
3. flags: always occupies one byte. The lowest 3 bits indicate the header type. There are 5 header types, defined as constants in sds.h.
4. buf[]: the character array that holds the data. A plain C char array uses '\0' to mark the end, so it cannot hold binary data that contains '\0', which causes problems when storing pictures, audio and so on. Because SDS relies on len rather than on the terminator, Redis describes its SDS as a binary-safe string.
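To make the fields above more concrete, here is a minimal Python sketch of two ideas: picking a header type from the string length and reading it back from the low 3 bits of flags, and why a stored len is binary-safe while a terminator scan is not. The constant names mirror those in sds.h; the helper names sds_req_type, header_type and c_strlen are chosen for this illustration, and the code is a model of the logic rather than the Redis implementation.

```python
# Header type constants as defined in sds.h.
SDS_TYPE_5, SDS_TYPE_8, SDS_TYPE_16, SDS_TYPE_32, SDS_TYPE_64 = range(5)
SDS_TYPE_MASK = 0b111  # the lowest 3 bits of flags hold the header type


def sds_req_type(length: int) -> int:
    """Pick the smallest header type whose len field can hold `length`."""
    if length < 1 << 5:
        return SDS_TYPE_5
    if length < 1 << 8:
        return SDS_TYPE_8
    if length < 1 << 16:
        return SDS_TYPE_16
    if length < 1 << 32:
        return SDS_TYPE_32
    return SDS_TYPE_64


def header_type(flags: int) -> int:
    """Recover the header type from the flags byte."""
    return flags & SDS_TYPE_MASK


def c_strlen(buf: bytes) -> int:
    """Model of C strlen: O(n) scan up to the first NUL byte.

    Assumes buf contains at least one NUL byte, as a C string would.
    """
    n = 0
    while buf[n] != 0:
        n += 1
    return n


# Binary data with an embedded NUL byte, plus the trailing terminator.
payload = b"ab\x00cd"
stored = payload + b"\x00"

scanned_length = c_strlen(stored)  # stops at the embedded NUL
sds_length = len(payload)          # what the len field would record
```

Here scanned_length is 2 while sds_length is 5: the terminator scan silently truncates the data, whereas the stored length does not care what bytes buf contains.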
1. The SDS implemented by Redis supports expansion
2. It stores the length len, so getting the length is O(1)
3. Space pre-allocation
4. Lazy space release (discussed below)
Advantages
Disadvantages
When jemalloc allocates memory, it rounds the requested number of bytes N up to a fixed size. Roughly speaking, it picks the power of 2 that is greater than or equal to N and closest to N, which reduces the number of reallocations. For example, if you request 6 bytes, jemalloc actually allocates 8 bytes; if you request 24 bytes, jemalloc allocates 32 bytes. This is why a dictEntry structure, which holds three 8-byte pointers (24 bytes in total), actually occupies 32 bytes.
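The rounding rule described above can be sketched in a few lines of Python. This is a simplified model of the power-of-2 rule as stated in this article, not jemalloc's real size-class table, which is finer-grained; the function name rounded_alloc_size is made up for this example.

```python
def rounded_alloc_size(n: int) -> int:
    """Round a request of n bytes up to the nearest power of 2 (simplified model)."""
    if n <= 0:
        raise ValueError("requested size must be positive")
    size = 1
    while size < n:
        size *= 2
    return size


six_bytes = rounded_alloc_size(6)          # the 6-byte example from the text
dict_entry_bytes = rounded_alloc_size(24)  # three 8-byte pointers
```

With this model, six_bytes is 8 and dict_entry_bytes is 32, matching the two examples in the text.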
Space pre-allocation optimizes string growth operations on SDS: when the SDS API modifies an SDS and the SDS has to be expanded, the program allocates not only the space required for the modification but also additional unused space.
The amount of additional unused space is determined by the following rule: if the length of the SDS after the modification is less than 1 MB, the program allocates unused space equal to that length, so the capacity becomes twice the new length; if it is 1 MB or more, the program allocates a fixed 1 MB of additional unused space.
Through the space pre-allocation strategy, Redis can reduce the number of memory reallocations required to continuously perform string growth operations.
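As a rough illustration of the pre-allocation strategy, the sketch below models the growth rule used by sdsMakeRoomFor in sds.c: below SDS_MAX_PREALLOC (1 MB) the required length is doubled, and above it a fixed 1 MB is added. The Python function name and signature are assumptions made for this example; only the rule itself comes from Redis.

```python
SDS_MAX_PREALLOC = 1024 * 1024  # 1 MB, as defined in sds.h


def capacity_after_growth(length: int, alloc: int, add_len: int) -> int:
    """Return the capacity an SDS would have after making room for add_len bytes.

    length: bytes currently in use (len)
    alloc:  bytes currently allocated for buf (alloc)
    """
    avail = alloc - length
    if avail >= add_len:
        return alloc  # enough unused space already, no reallocation
    new_len = length + add_len
    if new_len < SDS_MAX_PREALLOC:
        return new_len * 2  # small strings: double the required length
    return new_len + SDS_MAX_PREALLOC  # large strings: add a fixed 1 MB


first = capacity_after_growth(5, 5, 6)        # "hello" + " world"
second = capacity_after_growth(11, first, 6)  # a further 6-byte append
```

The first append needs 11 bytes and gets a capacity of 22; the second append fits into the 11 unused bytes, so second is still 22 and no reallocation happens.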
Lazy space release optimizes string shortening operations on SDS: when the SDS API needs to shorten the string held by an SDS, the program does not immediately reallocate memory to reclaim the bytes that are no longer used. Instead it keeps them as unused space for future use. Older versions record this amount in a free attribute; with the header described above it is simply alloc minus len.
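A small Python model may help show the effect of lazy release. The class LazySds and its reallocations counter are made up for this illustration and are not part of Redis; the point is only that shortening changes len while the capacity stays the same, so a later append can reuse the leftover bytes.

```python
class LazySds:
    """Toy model of an SDS that keeps its buffer when the string is shortened."""

    def __init__(self, data: bytes):
        self.buf = bytearray(data)  # len(self.buf) plays the role of alloc
        self.len = len(data)        # bytes actually in use
        self.reallocations = 0      # how many times the buffer had to grow

    @property
    def alloc(self) -> int:
        return len(self.buf)

    @property
    def avail(self) -> int:
        return self.alloc - self.len

    def shorten(self, new_len: int) -> None:
        """Shorten the string; the buffer itself is left untouched."""
        if not 0 <= new_len <= self.len:
            raise ValueError("new_len out of range")
        self.len = new_len

    def append(self, data: bytes) -> None:
        """Append data, growing the buffer only if the unused space is too small."""
        if len(data) > self.avail:
            self.buf.extend(bytes(len(data) - self.avail))
            self.reallocations += 1
        self.buf[self.len:self.len + len(data)] = data
        self.len += len(data)

    def value(self) -> bytes:
        return bytes(self.buf[:self.len])


s = LazySds(b"hello world")
s.shorten(5)          # now "hello", with 6 bytes kept as unused space
s.append(b" redis")   # exactly 6 bytes, fits without growing the buffer
```

After these calls s.value() is b"hello redis" and s.reallocations is still 0. In Redis itself the unused space can be reclaimed explicitly when needed, for example through sdsRemoveFreeSpace in sds.c.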
In Redis, all data is stored as KV key-value pairs. K is a string, that is, an SDS; V may be a string, list, hash, or any other data structure supported by Redis. V is not stored directly as a specific type but is wrapped in a redisObject, whose ptr pointer points to the actual stored data structure.
Furthermore, to save space, Redis uses ptr in different ways depending on the value. When a long integer is saved, the pointer field in redisObject holds the integer value itself, so no additional pointer to the integer is needed and the overhead of a separate allocation is avoided.
When string data is saved and the string is 44 bytes or shorter, the metadata, the pointer and the SDS are laid out in one contiguous memory area, which avoids memory fragmentation. This layout is called the embstr encoding.
When the string is longer than 44 bytes, Redis no longer lays out the SDS and the redisObject together. Instead, it allocates separate space for the SDS and uses the pointer to refer to the SDS structure. This layout is called the raw encoding.
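The choice between the three encodings can be sketched as follows. The 44-byte limit matches the figure given above and fits the usual arithmetic: a 64-byte allocation minus a 16-byte redisObject, a 3-byte sdshdr8 header and 1 terminating byte leaves 44 bytes. The function name string_encoding and the integer check are simplifications written for this example; Redis has further rules (such as shared integers) that are not modelled here.

```python
import re

EMBSTR_SIZE_LIMIT = 44  # longest string still stored as embstr

# Canonical decimal integers only: no leading zeros, no "+", no "-0".
_LONG_RE = re.compile(rb"0|-?[1-9][0-9]*")
_LONG_MIN, _LONG_MAX = -(2 ** 63), 2 ** 63 - 1


def string_encoding(value: bytes) -> str:
    """Return the encoding a string value would get in this simplified model."""
    if len(value) <= 20 and _LONG_RE.fullmatch(value):
        if _LONG_MIN <= int(value) <= _LONG_MAX:
            return "int"  # the integer is stored in the pointer field itself
    if len(value) <= EMBSTR_SIZE_LIMIT:
        return "embstr"   # redisObject and SDS in one contiguous allocation
    return "raw"          # SDS allocated separately, referenced through ptr


encodings = [
    string_encoding(b"12345"),
    string_encoding(b"hello"),
    string_encoding(b"a" * 44),
    string_encoding(b"a" * 45),
]
```

The list comes out as int, embstr, embstr, raw. On a real server the same thing can be checked with the OBJECT ENCODING command.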
If the length of a raw-encoded SDS is reduced to less than 44 bytes, will it be converted back to the embstr encoding?
No. Once the underlying encoding of a Redis object has changed, it is not rolled back.
Redis is a commonly used caching middleware. We need a clear understanding of its data structures and storage so that we can choose the most appropriate data structure and estimate memory usage when using it.
Redis memory calculator: http://www.redis.cn/redis_memory/