Why is it recommended to use multi-process instead of multi-threading in python? --Reposted from an article by a colleague
Recently, I have been reading about multi-threading in Python. We often hear veterans say: "Multi-threading in python is useless, it is recommended to use multi-process!", but why? What do you mean?
You need to know what it is, and you also need to know why it is so. So with the following in-depth research:
First of all, emphasize the background:
1. What is GIL? The full name of GIL is Global Interpreter Lock (global interpreter lock). The source is the consideration at the beginning of python design and the decision made for data security.
2. Each CPU can only execute one thread at the same time (multi-threading under a single-core CPU is actually only concurrency, not parallelism. From a macro perspective, concurrency and parallelism both process multiple requests at the same time. The concept. But there is a difference between concurrency and parallelism. Parallelism means that two or more events occur at the same time; concurrency means that two or more events occur at the same time interval)
<.>Under Python multi-threading, the execution method of each thread:
1. Obtain GIL
2. Execute the code until sleep or the python virtual machine suspends it.
3. Release the GIL
It can be seen that if a thread wants to execute, it must first get the GIL. We can think of the GIL as a "pass", and in a python process, there is only one GIL. Threads that cannot obtain a pass are not allowed to enter the CPU for execution.
In python2.x, the release logic of GIL is that the current thread encounters an IO operation or the tick count reaches 100 (ticks can be regarded as a counter of python itself, specifically Used for GIL, it is reset to zero after each release. This count can be adjusted through sys.setcheckinterval) before release.
Every time the GIL lock is released, threads compete for locks and switch threads, which consumes resources. And because of the GIL lock, a process in Python can only execute one thread at the same time (the thread that has obtained the GIL can execute). This is why Python's multi-threading efficiency is not high on multi-core CPUs.
So is python’s multi-threading completely useless?
Here we have a classified discussion:
1. CPU-intensive code (various loop processing, counting, etc.). In this case, due to a lot of calculation work, the tick count is very The threshold will be reached soon, and then the release and re-competition of GIL will be triggered (switching back and forth between multiple threads certainly consumes resources), so multi-threading under Python is not friendly to CPU-intensive codes.
2. For IO-intensive codes (file processing, web crawlers, etc.), multi-threading can effectively improve efficiency (if there are IO operations under a single thread, IO waiting will be performed, causing unnecessary waste of time, and multi-threading is turned on It can automatically switch to thread B while thread A is waiting, without wasting CPU resources, thereby improving program execution efficiency). Therefore, Python's multi-threading is more friendly to IO-intensive code.
In python3. The program is more friendly, but it still does not solve the problem caused by GIL that only one thread can be executed at the same time, so the efficiency is still unsatisfactory.
Please note: Multi-core multi-threading is worse than single-core multi-threading. The reason is that under single-core multi-threading, every time the GIL is released, the thread that wakes up can get it. GIL lock, so it can be executed seamlessly, but under multi-core, after CPU0 releases the GIL, the threads on other CPUs will compete, but the GIL may be immediately obtained by CPU0, causing the awakened threads on several other CPUs to Waiting awake until the switching time and then entering the pending state will cause thread thrashing, resulting in lower efficiency
Back to the original question: We often hear veterans say: "If you want to make full use of multi-core CPUs in Python, use multi-processes." What is the reason?
The reason is: each process has its own independent GIL and does not interfere with each other, so that it can be executed in parallel in a true sense. Therefore, in python, the execution efficiency of multi-process is better than that of multi-threading (only for multi-core for CPU).
So here’s the conclusion: under multi-core, if you want to improve efficiency in parallel, a more common method is to use multiple processes, which can effectively improve execution efficiency
http://www.bkjia.com/PHPjc/1117253.htmlwww.bkjia.comtruehttp: //www.bkjia.com/PHPjc/1117253.htmlTechArticleWhy is it recommended to use multi-process instead of multi-threading in python? --Reposted from an article by a colleague. Recently, I have been reading about multi-threading in Python. We often hear veterans say: "Multi-threading under python is a waste...