Traditionally, reference-counting memory mechanisms, such as the one PHP used previously, cannot handle memory leaks caused by circular references. As of PHP 5.3.0, however, PHP implements the synchronous algorithm from the paper "Concurrent Cycle Collection in Reference Counted Systems" to address this problem.
A complete explanation of the algorithm is beyond the scope of this section, so only the basics are covered here. First, we need to establish some ground rules. If a reference count is increased, the variable container is still in use and is therefore not garbage. If a reference count is decreased and reaches zero, the variable container can be freed. This means that a garbage cycle can only be created when a reference count is decreased to a non-zero value. Second, within a garbage cycle, it is possible to find out which parts are garbage by tentatively decreasing each reference count by 1 and then checking which variable containers end up with a reference count of zero.
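The first ground rule can be observed from userland. The following is a minimal sketch, assuming a PHP version with return type declarations (7.1 or later); the class `Node` and the function `makeCycle()` are hypothetical names introduced only for this illustration. Two objects refer to each other, so when the function returns, each reference count drops to a non-zero value and plain reference counting frees nothing.

```php
<?php
// Hypothetical class, used only for this illustration.
class Node
{
    public $peer = null;
}

function makeCycle(): void
{
    $a = new Node();
    $b = new Node();
    $a->peer = $b;   // $a refers to $b
    $b->peer = $a;   // $b refers back to $a, closing the cycle
    // On return the local variables disappear, which lowers each
    // reference count from 2 to 1. Neither count reaches zero.
}

gc_enable();              // make sure possible roots are being recorded
gc_collect_cycles();      // start from an empty root buffer
makeCycle();

// Only the cycle collector can reclaim the two orphaned objects.
$collected = gc_collect_cycles();
echo "collected: ", $collected, "\n";
```

The exact number reported may differ between PHP versions, so the sketch only relies on it being greater than zero.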
To avoid having to run the garbage-cycle check on every possible decrease of a reference count, the algorithm instead puts all possible roots (zval variable containers) into the root buffer, marking them "purple". It also ensures that each possible garbage root appears in the buffer only once. Only when the root buffer is full does the collection mechanism run for all the variable containers inside it. See step A in the figure above.
In step B, the algorithm runs a depth-first search from every possible root and decrements by 1 the reference count of each variable container it finds. To make sure the same variable container is not processed twice, those already handled are marked "gray". In step C, the algorithm again runs a depth-first search from each root node, checking the reference count of each variable container. If the reference count is 0, the variable container is marked "white" (blue in the figure). If it is greater than 0, the algorithm reverts the decrement (that is, increments the reference count by 1 again) using a depth-first search from that point on, and marks those containers "black" again. In the last step, D, the algorithm walks over the root buffer, removing the variable container roots (zval roots) from it, and at the same time checks which variable containers were marked "white" in the previous step. Every variable container marked "white" is freed.
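Steps B to D can be sketched as a toy simulation in userland PHP. This is not the engine's implementation: the class `ToyContainer` and all function names are hypothetical, reference counts are plain integers maintained by hand, and the "purple" marking and root-buffer bookkeeping from step A are left out for brevity.

```php
<?php
// Toy stand-in for a zval container; every name here is hypothetical.
class ToyContainer
{
    public $id;
    public $refcount = 0;
    public $color = 'black';   // black = in use, or not examined yet
    public $children = [];

    public function __construct($id) { $this->id = $id; }

    // Record an edge and count it as one reference to the child.
    public function pointTo(ToyContainer $child): void
    {
        $this->children[] = $child;
        $child->refcount++;
    }
}

// Step B: subtract the references that originate inside the graph.
function markGray(ToyContainer $c): void
{
    if ($c->color === 'gray') {
        return;                      // already handled, do not repeat
    }
    $c->color = 'gray';
    foreach ($c->children as $child) {
        $child->refcount--;
        markGray($child);
    }
}

// Revert step B for everything reachable from a container still in use.
function scanBlack(ToyContainer $c): void
{
    $c->color = 'black';
    foreach ($c->children as $child) {
        $child->refcount++;
        if ($child->color !== 'black') {
            scanBlack($child);
        }
    }
}

// Step C: a count of zero means only internal references existed.
function scan(ToyContainer $c): void
{
    if ($c->color !== 'gray') {
        return;
    }
    if ($c->refcount > 0) {
        scanBlack($c);
        return;
    }
    $c->color = 'white';
    foreach ($c->children as $child) {
        scan($child);
    }
}

// Step D: gather every white container reachable from a root.
function collectWhite(ToyContainer $c, array &$freed): void
{
    if ($c->color !== 'white') {
        return;
    }
    $c->color = 'black';             // guards against a second visit
    foreach ($c->children as $child) {
        collectWhite($child, $freed);
    }
    $freed[] = $c->id;               // a real collector would free it here
}

function collectCycles(array $roots): array
{
    foreach ($roots as $r) { markGray($r); }
    foreach ($roots as $r) { scan($r); }
    $freed = [];
    foreach ($roots as $r) { collectWhite($r, $freed); }
    sort($freed);
    return $freed;
}

// A <-> B is an unreachable cycle. L <-> M is a cycle that is still
// referenced from outside, simulated by one extra count on L.
$a = new ToyContainer('A'); $b = new ToyContainer('B');
$a->pointTo($b); $b->pointTo($a);
$l = new ToyContainer('L'); $m = new ToyContainer('M');
$l->pointTo($m); $m->pointTo($l);
$l->refcount++;

$freed = collectCycles([$a, $l]);
echo implode(',', $freed), "\n";     // prints "A,B"
```

In this run A and B end up white and are reported as freed, while L and M are turned black again with their original counts (2 and 1) restored.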
Now that you have a basic understanding of the algorithm, let’s look at how it is integrated into PHP. By default, PHP's garbage collector is turned on. There is, however, a php.ini setting that allows you to change this: zend.enable_gc.
When the garbage collection mechanism is turned on, the cycle-finding algorithm described above is executed whenever the root buffer becomes full. The root buffer has a fixed size and can store 10,000 possible roots. You can change this value by modifying the constant GC_ROOT_BUFFER_MAX_ENTRIES in the PHP source file Zend/zend_gc.c and then recompiling PHP. When garbage collection is turned off, the cycle-finding algorithm never runs. However, possible roots are always recorded in the root buffer, regardless of whether garbage collection has been activated in the configuration.
If the root buffer fills up with possible roots while the garbage collection mechanism is turned off, further possible roots are simply not recorded. Possible roots that are not recorded will never be analyzed by the algorithm. If they are part of a circular reference, they will never be cleaned up, which causes a memory leak.
The reason possible roots are recorded even when garbage collection is disabled is that recording them is faster than checking whether garbage collection is on every time a possible root is found. The garbage collection and analysis mechanism itself, however, can take a considerable amount of time.
In addition to changing the zend.enable_gc configuration setting, you can also turn the garbage collection mechanism on and off by calling the gc_enable() and gc_disable() functions respectively. Calling these functions has the same effect as turning the mechanism on or off with the configuration setting. It is also possible to force cycle collection even when the root buffer is not yet full, by calling the gc_collect_cycles() function. This function returns the number of cycles collected by the algorithm.
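The calls named above can be exercised as follows. This is a minimal sketch, assuming the script is allowed to change the setting at runtime; gc_enabled() is used only to read back the current state, and the variable names are illustrative.

```php
<?php
gc_enable();                        // same effect as zend.enable_gc = 1
$stateAfterEnable = gc_enabled();   // true

gc_disable();                       // same effect as zend.enable_gc = 0
$stateAfterDisable = gc_enabled();  // false

// Force a collection run regardless of how full the root buffer is.
// The return value is an integer count of what was collected.
$collected = gc_collect_cycles();

gc_enable();                        // switch the collector back on
```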
The rationale for being able to turn garbage collection on and off, and to initiate cycle collection yourself, is that some parts of your application may be highly time-sensitive. In those parts, you probably do not want the garbage collection mechanism to kick in. Of course, turning off garbage collection for certain parts of your application carries the risk of memory leaks, since some possible roots may not fit into the limited root buffer. Therefore, it is probably wise to call gc_collect_cycles() just before you call gc_disable(), to free the memory held by possible roots that are already recorded in the root buffer. This leaves an empty buffer, so there is more space for storing possible roots while the garbage collection mechanism is turned off.
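The recommendation above can be wrapped in a small helper. This is only a sketch under stated assumptions: `withoutCycleCollection()` is a hypothetical function, not part of PHP, and the callback stands in for whatever time-sensitive work the application performs.

```php
<?php
// Hypothetical helper: runs $critical with the cycle collector switched off.
function withoutCycleCollection(callable $critical)
{
    $wasEnabled = gc_enabled();
    gc_collect_cycles();     // empty the root buffer first
    gc_disable();            // no collection runs during the critical part
    try {
        return $critical();
    } finally {
        if ($wasEnabled) {
            gc_enable();     // restore the previous state
        }
    }
}

gc_enable();                 // known starting state for this example

$result = withoutCycleCollection(function () {
    // Time-sensitive work would go here; this placeholder only
    // reports the collector state seen inside the callback.
    return gc_enabled() ? 'on' : 'off';
});

echo $result, "\n";          // prints "off"
```

Restoring the earlier state in a `finally` block means the collector is switched back on even if the time-sensitive code throws.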