Reference Counting
Python's primary garbage collection mechanism is reference counting: every object maintains an ob_refcnt field. Its advantage is simplicity. When a new reference points to the object, the count is increased by 1. When a reference to the object is destroyed, the count is decreased by 1. As soon as the count reaches 0, the object is reclaimed immediately and the memory it occupied is released. One disadvantage is the extra space needed to store the counts, but the main problem is that reference counting cannot handle "cyclic references".
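The counting rules above can be observed in CPython with sys.getrefcount. This is a minimal sketch, not a definitive test: the Node class and variable names are made up for illustration, and the absolute number reported by sys.getrefcount includes a temporary reference held by the call itself, so only the differences are meaningful.

```python
import sys
import weakref


class Node:
    """Hypothetical placeholder class used only for this illustration."""


obj = Node()
base = sys.getrefcount(obj)          # absolute value includes the call's own temporary reference

alias = obj                          # a new reference: count goes up by 1
after_alias = sys.getrefcount(obj)

del alias                            # the reference is destroyed: count goes down by 1
after_del = sys.getrefcount(obj)

probe = weakref.ref(obj)             # a weak reference does not keep the object alive
del obj                              # last strong reference removed: count reaches 0
freed = probe() is None              # in CPython the object is reclaimed immediately

print(after_alias - base, after_del - base, freed)
```

The immediate reclamation in the last step is CPython behavior; other Python implementations may free the object later.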
What is a circular reference? It occurs when A and B refer to each other and there is no external reference to either of them. Each still has a reference count of 1, yet neither can be reached, so both should obviously be reclaimed. Example:
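    a = {}        # a's reference count is 1
    b = {}        # b's reference count is 1
    a['b'] = b    # b's count increases by 1; b's count is now 2
    b['a'] = a    # a's count increases by 1; a's count is now 2
    del a         # a's count decreases by 1; a's count is now 1
    del b         # b's count decreases by 1; b's count is now 1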
In this example, each del statement removes a variable name and decreases the reference count of the object it referred to by 1. However, because the two objects still hold references to each other, neither count drops to zero, even though neither object can be reached by name any longer. Under reference counting alone, these objects would never be destroyed and would stay in memory, which is a memory leak. To solve the circular reference problem, Python introduced two supplementary GC mechanisms: mark-sweep and generational collection.
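The leak and its cleanup can be reproduced with the standard gc module. The sketch below is illustrative only: it uses a hypothetical Node class instead of dicts because plain dicts cannot be weakly referenced, and it disables automatic collection temporarily so that the intermediate state is observable.

```python
import gc
import weakref


class Node:
    """Hypothetical container class; each instance refers to another one."""


gc.disable()                      # stop automatic cycle collection for the demonstration

a = Node()
b = Node()
a.other = b                       # a refers to b
b.other = a                       # b refers to a: a reference cycle
probe = weakref.ref(a)            # lets us check whether the object still exists

del a, b                          # no names left, but each count is still 1
alive_before = probe() is not None

collected = gc.collect()          # run the cycle collector explicitly
alive_after = probe() is not None

gc.enable()                       # restore normal behavior
print(alive_before, alive_after, collected)
```

Reference counting alone leaves the pair alive after the del statements; only the explicit gc.collect() call reclaims them.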
Mark-Sweep
Mark-sweep is a garbage collection algorithm based on tracing. Objects are connected through references (pointers) and form a directed graph: the objects are the nodes, and the reference relationships are the edges. Starting from the root objects, the collector traverses the graph along its edges. Reachable objects are marked as live, and unreachable objects are the ones to be cleared. The root objects are global references and references on the function call stack; the objects they refer to must not be deleted.
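The graph traversal described above can be sketched in a few lines. This is a toy model under stated assumptions, not CPython's actual implementation: the heap is a hypothetical dict that maps each object name to the names it references, and mark and sweep are illustrative helper functions.

```python
def mark(heap, roots):
    """Return the set of objects reachable from the roots along reference edges."""
    reachable = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in reachable:
            continue                  # already marked
        reachable.add(name)
        stack.extend(heap[name])      # follow outgoing references
    return reachable


def sweep(heap, reachable):
    """Keep only the marked objects; everything else is discarded."""
    return {name: refs for name, refs in heap.items() if name in reachable}


# Toy heap: each key is an object, each value lists the objects it refers to.
heap = {
    "root": ["a"],
    "a": ["b"],
    "b": [],
    "c": ["d"],   # c and d refer to each other but nothing reachable refers to them
    "d": ["c"],
}

reachable = mark(heap, roots=["root"])
garbage = sorted(set(heap) - reachable)
survivors = sweep(heap, reachable)
print(garbage)
```

Note that the cycle between c and d is cleared even though each object is still referenced once, which is exactly the case reference counting misses.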
As Python's auxiliary garbage collection technique, the mark-sweep algorithm deals mainly with container objects such as lists, dicts, tuples, and class instances, because strings and numeric objects cannot hold references to other objects and therefore cannot take part in a reference cycle. Python organizes these container objects in a doubly linked list.
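Whether a given object is watched by the cycle collector can be checked with gc.is_tracked. The sketch below uses only a few representative values; the exact tracking rules for some containers (for example empty or atom-only dicts and tuples) are an implementation detail that can differ between CPython versions, so they are left out here.

```python
import gc

# Containers that can hold references to other objects are tracked.
list_tracked = gc.is_tracked([])
dict_tracked = gc.is_tracked({"items": []})

# Strings and numbers cannot refer to other objects, so they are not tracked.
str_tracked = gc.is_tracked("text")
int_tracked = gc.is_tracked(42)

print(list_tracked, dict_tracked, str_tracked, int_tracked)
```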
Generational Collection
Generational collection is an approach that trades space for time. Python groups objects into sets according to how long they have survived, and each set is called a generation. There are 3 generations: the young generation (generation 0), the middle generation (generation 1), and the old generation (generation 2), each backed by its own linked list. The longer objects have survived, the less often their generation is collected. Newly created objects are placed in the young generation. When the number of objects in the young generation's list reaches its threshold, garbage collection is triggered: objects that can be reclaimed are reclaimed, and the survivors are moved to the middle generation, and so on. Objects in the old generation have survived the longest and may live for the entire lifetime of the program. Generational collection itself is built on top of the mark-sweep technique.
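The per-generation thresholds and counters can be inspected through the gc module. This is a small sketch rather than a tuning recipe: the concrete numbers depend on the CPython version and on what the program has allocated so far, so none are shown as expected output.

```python
import gc

thresholds = gc.get_threshold()   # one threshold per generation (0, 1, 2)
counts_before = gc.get_count()    # current collection counters per generation

found = gc.collect(0)             # collect only the young generation (generation 0)
counts_after = gc.get_count()

print("thresholds:", thresholds)
print("counts before:", counts_before)
print("counts after:", counts_after)
print("unreachable objects found:", found)
```

If needed, the thresholds can be adjusted with gc.set_threshold, for example to make young-generation collections less frequent in allocation-heavy code.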
Like mark-sweep, generational collection is an auxiliary garbage collection technique, and it applies only to container objects.