The underlying implementation principle of PHP arrays is: 1. Hash table, a data structure that maps different keywords to different units; 2. Linked list, which is a type of data composed of different linked list nodes Structure; 3. PHP array, using linking method to resolve hash conflicts.
1. Hash table
Hash table, as the name suggests, maps different keywords A data structure to different units. The method of mapping different keywords to different units is called a hash function
Ideally, after processing by the hash function, the keywords and units will correspond one-to-one; but if the keyword value is enough In many cases, it is easy for multiple keywords to be mapped to the same unit, that is, a hash conflict.
The solution to the hash conflict is to use either the linking method or the open addressing method
Link method
That is, when different keywords are mapped to the same unit, use a linked list to save these keywords in the same unit
Open addressing method
That is, when inserting data, if it is found that there is data in the unit to which the keyword is mapped, it means that a conflict has occurred, and then continue to search for the next unit until an available unit is found
Because the open addressing method occupies the location of other keyword mapping units, subsequent keywords are more likely to have hash conflicts, so performance degradation is prone to occur
2. Linked list
Since linked lists were mentioned above, here we briefly talk about the basics of linked lists. Linked lists are divided into many types. Commonly used data structures include: queues, stacks, two-way linked lists, etc.
A linked list is a data structure composed of different linked list nodes. Linked list nodes generally consist of elements pointing to pointers to the next node. The doubly linked list, as the name suggests, is composed of a pointer element pointing to the previous node and a pointer pointing to the next node.
We will not expand on the content of the data structure. We will have special content to introduce it in detail later. Data structure
3. PHP array
The way PHP solves hash conflicts is to use the linking method, so PHP arrays are implemented by hash table linked lists, to be precise Said, it is implemented by a hash table doubly linked list
4. Internal structure-hash table
HashTable结构体主要用来存放哈希表的基本信息 typedef struct _hashtable { uint nTableSize; // hash Bucket的大小,即哈希表的容量,最小为8,以2x增长。 uint nTableMask; // nTableSize-1 , 索引取值的优化 uint nNumOfElements; // hash Bucket中当前存在的元素个数,count()函数会直接返回此值 ulong nNextFreeElement; // 下一个可使用的数字键值 Bucket *pInternalPointer; // 当前遍历的指针(foreach比for快的原因之一) Bucket *pListHead; // 存储整个哈希表的头元素指针 Bucket *pListTail; // 存储整个哈希表的尾元素指针 Bucket **arBuckets; // 存储hash数组 dtor_func_t pDestructor; // 在删除元素时执行的回调函数,用于资源的释放 zend_bool persistent; //指出了Bucket内存分配的方式。如果persisient为TRUE,则使用操作系统本身的内存分配函数为Bucket分配内存,否则使用PHP的内存分配函数。 unsigned char nApplyCount; // 标记当前hash Bucket被递归访问的次数(防止多次递归) zend_bool bApplyProtection;// 标记当前hash桶允许不允许多次访问,不允许时,最多只能递归3次 #if ZEND_DEBUG int inconsistent; #endif } HashTable;
The Bucket structure is used to save the specific content of the data
typedef struct bucket { ulong h; // 对char *key进行hash后的值,或者是用户指定的数字索引值 uint nKeyLength; // hash关键字的长度,如果数组索引为数字,此值为0 void *pData; // 指向value,一般是用户数据的副本,如果是指针数据,则指向pDataPtr void *pDataPtr; // 如果是指针数据,此值会指向真正的value,同时上面pData会指向此值 struct bucket *pListNext; // 指向整个哈希表的该单元的下一个元素 struct bucket *pListLast; // 指向整个哈希表的该单元的上一个元素 struct bucket *pNext; // 指向由于哈希冲突导致存放在同一个单元的链表中的下一个元素 struct bucket *pLast; // 指向由于哈希冲突导致存放在同一个单元的链表中的上一个元素 // 保存当前值所对于的key字符串,这个字段只能定义在最后,实现变长结构体 char arKey[1]; } Bucket;
Related learning recommendations: PHP programming from entry to proficiency
The above is the detailed content of What is the underlying implementation principle of PHP arrays?. For more information, please follow other related articles on the PHP Chinese website!