JVM memory management------Exquisite explanation of GC algorithm (copy algorithm and marking/collation algorithm)-javaTutorial-php.cn

Copy Algorithm

Let’s first take a look at the practice of the copy algorithm. The copy algorithm divides the memory into two intervals. At any point in time, all dynamically allocated objects can only be allocated in one of the intervals. (called the active interval), while the other interval (called the idle interval) is idle.
When the effective memory space is exhausted, the JVM will pause the program and start the copy algorithm GC thread. Next, the GC thread will copy all the surviving objects in the active interval to the free interval, and arrange them strictly according to the memory address. At the same time, the GC thread will update the memory reference address of the surviving object to point to the new memory address.
At this time, the free interval has been exchanged with the active interval, and all garbage objects have now remained in the original active interval, which is the current free interval. In fact, when the active interval is converted into a space interval, the garbage objects have been recycled all at once.
Does it sound complicated?
In fact, it is not complicated at all. With the foundation of the previous chapter, I believe it will not take too much effort for you to understand this algorithm. LZ draws a picture for you to illustrate the problem, as shown below.

JVM memory management------Exquisite explanation of GC algorithm (copy algorithm and marking/collation algorithm)

In fact, this picture is still the example of the previous chapter, but now the memory is divided into two parts by the copy algorithm. Let’s take a look at the GC thread processing of the copy algorithm. What the two areas will look like after that is shown below.

JVM memory management------Exquisite explanation of GC algorithm (copy algorithm and marking/collation algorithm)

You can see that objects 1 and 4 have been cleared, while objects 2, 3, 5, and 6 are regularly arranged in the free interval just now, as well. It is within the current activity range. At this time, the left half has become an idle interval. It is not difficult to imagine that after the next GC, the left half will become an active interval again.
Obviously, the copy algorithm makes up for the shortcomings of the chaotic memory layout in the mark/clear algorithm. But at the same time, its shortcomings are also quite obvious.
1. It wastes half of the memory, which is terrible.
2. If the survival rate of the object is very high, we can go to the extreme. Assuming 100% survival, then we need to copy all objects and reset all reference addresses. The time spent copying this work will become non-negligible when the object survival rate reaches a certain level.
So it is not difficult to see from the above description that if the replication algorithm is to be used, at least the survival rate of the object must be very low, and the most important thing is that we must overcome the waste of 50% of memory.

Marking/organizing algorithm

The marking/organizing algorithm is very similar to the marking/clearing algorithm. It is also divided into two stages: marking and sorting. Next, LZ will introduce to you what has been done in these two stages.
Marking: Its first phase is exactly the same as the mark/clear algorithm, which traverses GC Roots and then marks the surviving objects.
Organization: Move all surviving objects and arrange them in order of memory addresses, and then recycle all memory after the end memory address. Therefore, the second stage is called the finishing stage.
The diagram before and after GC is very similar to the diagram of the copy algorithm, except that there is no difference between the active interval and the free interval, and the process is very similar to the mark/clear algorithm. Let’s look at the status of the objects in the memory before GC. with the layout as shown below.

JVM memory management------Exquisite explanation of GC algorithm (copy algorithm and marking/collation algorithm)

This picture is actually exactly the same as the mark/clear algorithm, except that in order to facilitate the continuous arrangement of memory rules, LZ added a rectangle to represent the memory area. If the GC thread starts working at this time, then the marking phase begins immediately. This stage is the same as the marking stage of the mark/clear algorithm. We look at the state of the object after the marking stage, as shown below.

JVM memory management------Exquisite explanation of GC algorithm (copy algorithm and marking/collation algorithm)

There is nothing to explain. Next, it should be the finishing stage. Let's take a look at what the memory layout looks like after the sorting phase is completed, as shown below.

JVM memory management------Exquisite explanation of GC algorithm (copy algorithm and marking/collation algorithm)

As you can see, the marked surviving objects will be organized and arranged in order according to the memory address, while the unmarked memory will be cleaned up. In this way, when we need to allocate memory to a new object, the JVM only needs to hold the starting address of the memory, which is obviously much less overhead than maintaining a free list.
It is not difficult to see that the mark/sort algorithm can not only make up for the shortcomings of scattered memory areas in the mark/clear algorithm, but also eliminate the high cost of halving the memory in the copy algorithm. It can be said to kill two birds with one stone, kill two birds with one stone, and kill two birds with one stone. ,one. . . . One woman and two men?
However, any algorithm will have its shortcomings. The only shortcoming of the marking/organizing algorithm is that it is not very efficient. It not only needs to mark all surviving objects, but also sort out the reference addresses of all surviving objects. In terms of efficiency, the marking/collation algorithm is lower than the copying algorithm.

Algorithm Summary

Here LZ will summarize the common points of the three algorithms and their respective advantages and disadvantages, so that you can compare them, and it will definitely be clearer.
What they have in common are mainly the following two points.
1. The three algorithms are all based on the root search algorithm to determine whether an object should be recycled. The theoretical basis that supports the normal operation of the root search algorithm is the relevant content of the variable scope in the grammar. Therefore, the most fundamental way to prevent memory leaks is to master the variable scope, instead of using the C/C++-style memory management method mentioned in the previous chapter on memory management.
2. When the GC thread is started, or when the GC process starts, they must pause the application (stop the world).
LZ will show you the difference between them according to the following points. (> means the former is better than the latter, = means the two have the same effect)
Efficiency: Copy algorithm>Mark/organize algorithm>Mark/clear algorithm (The efficiency here is just a simple comparison of time complexity, This may not necessarily be the case).
Memory neatness: copy algorithm = mark/organize algorithm> mark/clear algorithm.
Memory utilization: [b] Mark/Collate algorithm = Mark/Clear algorithm > Copy algorithm. [/b]
It can be seen that the mark/clear algorithm is a relatively backward algorithm, but the latter two algorithms are built on this basis. As the saying goes, "Don't forget the well digger when you are in trouble", so you should not Forget about the mark/sweep algorithm predecessors. Also, at some point, mark/sweep comes into play.

Conclusion

Now we have a clear understanding of the three algorithms. It can be seen that in terms of efficiency, the replication algorithm is the leader, but it wastes too much memory. Try to take into account the three indicators mentioned above. The marking/organizing algorithm is relatively smoother, but the efficiency is still unsatisfactory. It has one more marking stage than the copy algorithm and one more stage than the marking/clearing algorithm. The process of organizing memory.
Isn’t there an optimal algorithm?
Of course not. The world is fair and everything has two sides. Just imagine, how could you find someone who is beautiful, hard-working, rich, reasonable, has the right personality, the right family background, height, appearance, etc. Wait, a suitable woman? Even if you find one, there is at least one thing that this woman will definitely not be satisfied with, that is, she probably won’t happen to fall in love with any of the hard-working ape friends who are similar to LZ. Do you want to say that you are much better than LZ? Then LZ just wants to tell you that Gao Fushuai will not crawl in front of the computer to read technical articles, 0.0.
But the ancients are awesome. The ancients said that when looking for a wife, you don’t have to find the best one, but the most suitable one. After listening to this sentence, I instantly felt that the world was much better.
The algorithm is the same. There is no best algorithm, only the most appropriate algorithm.
Since these three algorithms have their own flaws, experts will naturally not allow this to happen. Therefore, experts have proposed that different algorithms can be used to process according to different characteristics of the object, similar to the principle that each carrot and cabbage has its own preferences. So a miracle happened, and the experts finally found the god-level algorithm among GC algorithms-----the generational collection algorithm.
As for how this god-level algorithm is processed, LZ will discuss it with all ape friends in the next chapter. This time it will end here. I hope you will gain something.

The above is the content of JVM memory management------GC algorithm (copy algorithm and marking/collation algorithm). For more related content, please pay attention to the PHP Chinese website (www.php.cn)!