Table of Contents
Why GIL is needed
Implementation of GIL
Some notes
GIL optimization
The consistency of user data cannot rely on GIL

What is the GIL in Python

May 14, 2023 pm 02:40 PM
python gil

Why GIL is needed

The GIL is essentially a lock. Anyone who has studied operating systems knows that locks exist to prevent the data inconsistencies caused by concurrent access. CPython defines many global variables outside of functions, such as usable_arenas and usedpools in the memory manager. If multiple threads request memory at the same time, these variables may be modified concurrently and end up corrupted. In addition, Python's garbage collection is based on reference counting. Every object has an ob_refcnt field that records how many references currently point to it. Operations such as variable assignment and argument passing increase the reference count; leaving a scope or returning from a function decreases it. Likewise, if multiple threads modify the reference count of the same object at the same time, ob_refcnt can drift from its true value. This may cause a memory leak, where objects that are no longer used are never reclaimed, or, more seriously, an object that is still referenced may be reclaimed, crashing the Python interpreter.
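The effect of assignment on the reference count can be observed from Python itself. The following is a minimal sketch that assumes CPython; refcount_delta is a hypothetical helper name introduced only for illustration. The absolute numbers reported by sys.getrefcount include a temporary reference held by the call itself, so only the differences are meaningful.

```python
import sys


def refcount_delta():
    """Return how the reference count changes when an alias is added and removed."""
    obj = object()
    before = sys.getrefcount(obj)
    alias = obj                        # a second name bound to the same object
    with_alias = sys.getrefcount(obj)
    del alias                          # dropping the name releases that reference
    after = sys.getrefcount(obj)
    return with_alias - before, after - before


print(refcount_delta())
```

Each of these increments and decrements is a plain read-modify-write on ob_refcnt, which is why unsynchronized access from several threads could corrupt it.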

Implementation of GIL

The definition of GIL in CPython is as follows

struct _gil_runtime_state {
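    unsigned long interval; // Stored in microseconds (default 5000, i.e. 5 ms). A thread that has requested the GIL for this long without success signals the holder to release it
    _Py_atomic_address last_holder; // Thread that last held the GIL; used when forcing a thread switch
    _Py_atomic_int locked; // Whether the GIL is currently held by some thread
    unsigned long switch_number; // Number of times the GIL has changed holder
    // Condition variable and mutex; these generally come in pairs
    PyCOND_T cond;
    PyMUTEX_T mutex;
    // Condition variable used to force a thread switch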
    PyCOND_T switch_cond;
    PyMUTEX_T switch_mutex;
};

The essential part is the locked field, protected by mutex, which indicates whether the GIL is currently held. The other fields exist to optimize the GIL. A thread calls take_gil() to request the GIL and drop_gil() to release it. To avoid starvation, a thread that has waited for interval (5 milliseconds by default) without obtaining the GIL sends a signal to the thread holding it. The holder checks for this signal at suitable points and, if it finds that another thread is waiting, is forced to release the GIL. What counts as a suitable point differs between versions. Early versions checked every 100 instructions; Python 3.10.4 checks at the end of a conditional statement, at the end of each iteration of a loop body, and at the end of a function call.
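The interval is exposed to Python code through sys.getswitchinterval() and sys.setswitchinterval(), both of which work in seconds. The sketch below shows one way to change it temporarily; with_switch_interval is a hypothetical helper name, not part of the standard library.

```python
import sys


def with_switch_interval(seconds, func):
    """Run func() with a temporary switch interval, then restore the old one."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        sys.setswitchinterval(old)


# Current interval in seconds (0.005 unless it has been changed).
print(sys.getswitchinterval())
```

A larger interval means fewer forced switches and less overhead, at the cost of longer waits for threads that want the GIL.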

The function take_gil() that applies for GIL is simplified as follows

static void take_gil(PyThreadState *tstate)
{
    ...
    // Acquire the mutex
    MUTEX_LOCK(gil->mutex);
    // If the GIL is free, take it directly
    if (!_Py_atomic_load_relaxed(&gil->locked)) {
        goto _ready;
    }
    // Otherwise wait for it
    while (_Py_atomic_load_relaxed(&gil->locked)) {
        unsigned long saved_switchnum = gil->switch_number;
        unsigned long interval = (gil->interval >= 1 ? gil->interval : 1);
        int timed_out = 0;
        COND_TIMED_WAIT(gil->cond, gil->mutex, interval, timed_out);
        if (timed_out &&  _Py_atomic_load_relaxed(&gil->locked) && gil->switch_number == saved_switchnum) {
            SET_GIL_DROP_REQUEST(interp);
        }
    }
_ready:
    MUTEX_LOCK(gil->switch_mutex);
    _Py_atomic_store_relaxed(&gil->locked, 1);
    _Py_ANNOTATE_RWLOCK_ACQUIRED(&gil->locked, /*is_write=*/1);

    if (tstate != (PyThreadState*)_Py_atomic_load_relaxed(&gil->last_holder)) {
        _Py_atomic_store_relaxed(&gil->last_holder, (uintptr_t)tstate);
        ++gil->switch_number;
    }
    // Signal the condition variable that a forcibly switched-out thread is waiting on
    COND_SIGNAL(gil->switch_cond);
    MUTEX_UNLOCK(gil->switch_mutex);
    if (_Py_atomic_load_relaxed(&ceval2->gil_drop_request)) {
        RESET_GIL_DROP_REQUEST(interp);
    }
    else {
        COMPUTE_EVAL_BREAKER(interp, ceval, ceval2);
    }
    ...
    // Release the mutex
    MUTEX_UNLOCK(gil->mutex);
}

To ensure atomicity, the function body acquires the mutex gil->mutex at the start and releases it at the end. If the GIL is free, the thread takes it directly. Otherwise it waits on the condition variable gil->cond for up to interval microseconds (never less than 1). If the wait times out and no GIL switch occurred in the meantime, the thread sets gil_drop_request to ask the thread holding the GIL to give it up; otherwise it simply keeps waiting. Once the GIL has been obtained, the thread updates gil->locked, gil->last_holder and gil->switch_number, signals the condition variable gil->switch_cond, and releases the mutex gil->mutex.
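The timed wait and the drop request can be modeled in pure Python. This is only a toy sketch of the logic described above, not CPython's implementation: ToyGIL and demo are hypothetical names, a single threading.Condition stands in for gil->mutex plus gil->cond, the interval is in seconds, and the switch_cond handshake is left out.

```python
import threading
import time


class ToyGIL:
    def __init__(self, interval=0.005):
        self.interval = interval            # seconds here, for convenience
        self.cond = threading.Condition()   # plays the role of mutex + cond
        self.locked = False
        self.last_holder = None
        self.switch_number = 0
        self.drop_request = False

    def take(self):
        me = threading.get_ident()
        with self.cond:
            while self.locked:
                saved = self.switch_number
                notified = self.cond.wait(timeout=self.interval)
                # Timed out, still locked, and nobody switched meanwhile:
                # ask the holder to give the lock up.
                if not notified and self.locked and self.switch_number == saved:
                    self.drop_request = True
            self.locked = True
            if self.last_holder != me:
                self.last_holder = me
                self.switch_number += 1
            self.drop_request = False

    def drop(self):
        with self.cond:
            self.locked = False
            self.cond.notify()


def demo():
    gil = ToyGIL()
    gil.take()                              # the main thread holds the toy GIL

    def worker():
        gil.take()
        gil.drop()

    t = threading.Thread(target=worker)
    t.start()
    deadline = time.monotonic() + 5
    # The holder polls for a drop request, as the interpreter loop would.
    while not gil.drop_request and time.monotonic() < deadline:
        time.sleep(0.001)
    saw_request = gil.drop_request
    gil.drop()
    t.join()
    return saw_request, gil.switch_number, gil.locked
```

In demo(), the worker times out after one interval, raises the drop request, and obtains the lock only after the main thread notices the request and calls drop().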

The function drop_gil() that releases GIL is simplified as follows

static void drop_gil(struct _ceval_runtime_state *ceval, struct _ceval_state *ceval2,
         PyThreadState *tstate)
{
    ...
    if (tstate != NULL) {
        _Py_atomic_store_relaxed(&gil->last_holder, (uintptr_t)tstate);
    }
    MUTEX_LOCK(gil->mutex);
    _Py_ANNOTATE_RWLOCK_RELEASED(&gil->locked, /*is_write=*/1);
    // Release the GIL
    _Py_atomic_store_relaxed(&gil->locked, 0);
    // Wake a thread that is waiting for the GIL
    COND_SIGNAL(gil->cond);
    MUTEX_UNLOCK(gil->mutex);
    if (_Py_atomic_load_relaxed(&ceval2->gil_drop_request) && tstate != NULL) {
        MUTEX_LOCK(gil->switch_mutex);
        // Wait until a thread switch has actually happened, to avoid starving other threads
        if (((PyThreadState*)_Py_atomic_load_relaxed(&gil->last_holder)) == tstate)
        {
            assert(is_tstate_valid(tstate));
            RESET_GIL_DROP_REQUEST(tstate->interp);
            COND_WAIT(gil->switch_cond, gil->switch_mutex);
        }
        MUTEX_UNLOCK(gil->switch_mutex);
    }
}

The thread first releases the GIL under the protection of gil->mutex and then wakes another thread that is waiting for it. In a multi-CPU environment, the thread that has just released the GIL has a high probability of reacquiring it immediately. To avoid starving the other threads, the current thread is therefore forced to wait on the condition variable gil->switch_cond, and it is woken only once another thread has obtained the GIL.
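The forced-switch handshake can also be sketched in Python. This is a toy model under stated assumptions, not the real code: SwitchHandshake and demo are hypothetical names, a plain threading.Lock stands in for the GIL, the drop request is reset in take() only, and wait_for() re-checks its predicate in a loop where the C code performs a single COND_WAIT.

```python
import threading


class SwitchHandshake:
    def __init__(self):
        self.gil = threading.Lock()                # stand-in for the GIL itself
        self.switch_cond = threading.Condition()   # switch_cond + switch_mutex
        self.last_holder = None
        self.drop_request = False

    def take(self):
        self.gil.acquire()
        with self.switch_cond:
            self.last_holder = threading.get_ident()
            self.drop_request = False
            self.switch_cond.notify_all()          # release anyone awaiting a switch

    def drop(self):
        me = threading.get_ident()
        self.gil.release()
        if self.drop_request:
            with self.switch_cond:
                # Block until some other thread has become the holder.
                self.switch_cond.wait_for(lambda: self.last_holder != me)


def demo():
    hs = SwitchHandshake()
    hs.take()

    def worker():
        hs.take()
        hs.drop()

    t = threading.Thread(target=worker)
    t.start()
    hs.drop_request = True   # pretend the waiting thread timed out and asked
    hs.drop()
    switched = hs.last_holder != threading.get_ident()
    t.join()
    return switched
```

Without the wait in drop(), the main thread could loop around and call take() again before the worker ever ran; with it, drop() cannot return until the worker has held the lock.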

Some notes

GIL optimization

Code constrained by the GIL cannot run in parallel, which lowers overall performance. To keep the loss small, Python actively releases the GIL while performing I/O or CPU-intensive computation that does not touch Python objects, which makes the GIL finer-grained. Examples include:

  • Reading and writing files

  • Network access

  • Encrypted data/Compressed data

So strictly speaking, multiple Python threads within a single process can execute at the same time. For example, one thread can run ordinary Python code while another compresses data.
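A quick way to see the GIL being released is to let several threads block at once. The sketch below uses time.sleep as a stand-in for blocking I/O (an assumption made to keep the example self-contained); timed_parallel_sleep is a hypothetical helper name.

```python
import threading
import time


def timed_parallel_sleep(n_threads=2, seconds=0.3):
    """Start n_threads that each block for `seconds`; return the wall time."""
    threads = [
        threading.Thread(target=time.sleep, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


print(f"{timed_parallel_sleep():.2f} s")
```

If the GIL were held during the blocking call, the total would approach n_threads times seconds; because it is released, the waits overlap.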

The consistency of user data cannot rely on GIL

The GIL is a lock that exists to keep the Python interpreter's internal variables consistent; the consistency of user data is not its responsibility. The GIL does protect user data to some extent. For example, in Python 3.10.4, instructions that involve no jumps or function calls execute atomically under the GIL. The consistency of data in business logic, however, must be guaranteed by the user with explicit locking.

The following code uses two threads to simulate a user collecting fragments and earning rewards

from threading import Thread

def main():
    stat = {"piece_count": 0, "reward_count": 0}
    t1 = Thread(target=process_piece, args=(stat,))
    t2 = Thread(target=process_piece, args=(stat,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(stat)

def process_piece(stat):
    for i in range(10000000):
        if stat["piece_count"] % 10 == 0:
            reward = True
        else:
            reward = False
        if reward:
            stat["reward_count"] += 1
        stat["piece_count"] += 1

if __name__ == "__main__":
    main()

Suppose the user earns a reward for every 10 fragments collected and each thread collects 10,000,000 fragments. A thread running on its own would earn 1,000,000 rewards (the check also fires when piece_count is 0), so in total 20,000,000 fragments should be collected and 2,000,000 rewards earned. The first run on my computer, however, produced the following

{'piece_count': 20000000, 'reward_count': 1999987}

The total number of fragments matches expectations, but the number of rewards falls short. The fragment count is correct because in Python 3.10.4 stat["piece_count"] += 1 executes atomically under the GIL. Because the running thread may be switched at the end of each loop iteration, thread t1 may raise piece_count to 100 at the end of one iteration, and before its next iteration evaluates the modulo-10 check, the interpreter switches to thread t2, which raises piece_count to 101. The reward for reaching 100 is then never granted.
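One way to close this gap is to make the check and the update a single critical section. The sketch below is a variant of the example with a threading.Lock added; the run helper and the smaller iteration count are assumptions made so that it finishes quickly.

```python
from threading import Lock, Thread


def process_piece(stat, lock, n):
    for _ in range(n):
        # Check and update under one lock so no other thread can
        # change piece_count between the modulo test and the increment.
        with lock:
            if stat["piece_count"] % 10 == 0:
                stat["reward_count"] += 1
            stat["piece_count"] += 1


def run(n_threads=2, n=50_000):
    stat = {"piece_count": 0, "reward_count": 0}
    lock = Lock()
    threads = [
        Thread(target=process_piece, args=(stat, lock, n))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stat


if __name__ == "__main__":
    print(run())
```

With the lock held, the check sees every value of piece_count from 0 to 99,999 exactly once, so two threads of 50,000 iterations each always yield 10,000 rewards.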

Appendix: How to avoid being affected by the GIL

After all of this, an article that offered no solutions would be a mere explainer with little practical use. If the GIL is this troublesome, is there a way around it? Let's look at the available options.

Use multiprocessing instead of threading

The multiprocessing library exists largely to make up for the inefficiency that the GIL imposes on the threading library. It replicates the interface provided by threading to make migration easy. The only difference is that it uses multiple processes instead of multiple threads. Each process has its own independent GIL, so there is no GIL contention between processes.

Of course, multiprocessing is not a panacea. Introducing it makes data communication and synchronization between the program's workers harder. Take a counter as an example. If we want multiple threads to accumulate into the same variable, with threading we declare a global variable and wrap three lines in a threading.Lock context. With multiprocessing, processes cannot see each other's data, so we must declare a Queue in the main process and put and get through it, or use shared memory. This extra implementation cost makes concurrent programming, which is already painful, even more so. Where exactly do the difficulties lie? Interested readers can read further in this article
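The shared-memory variant of the counter can be sketched as follows. It uses multiprocessing.Value, whose built-in lock is obtained with get_lock(). The names add_many and run_counter are hypothetical, and the "fork" start method is an assumption that limits this sketch to POSIX systems; with the "spawn" start method the process-creating code must sit behind an if __name__ == "__main__" guard.

```python
import multiprocessing as mp


def add_many(counter, n):
    for _ in range(n):
        # counter.value += 1 is a read-modify-write, so it needs the lock.
        with counter.get_lock():
            counter.value += 1


def run_counter(n_procs=2, n=10_000):
    ctx = mp.get_context("fork")      # assumption: POSIX only
    counter = ctx.Value("i", 0)       # a C int living in shared memory
    procs = [
        ctx.Process(target=add_many, args=(counter, n))
        for _ in range(n_procs)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(run_counter())
```

Compared with the threaded version, the shared state has to be declared explicitly and passed to every worker, which is the extra cost described above.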

Use other interpreters

As mentioned before, the GIL is a product of CPython only, so are other interpreters better? Yes: interpreters such as Jython and IronPython do not need a GIL because of the nature of their implementation languages. However, by implementing the interpreter in Java or C#, they also lose access to the community's many useful C extension modules. These interpreters have therefore always remained relatively niche. After all, between functionality and performance, everyone chooses the former in the early stages. Done is better than perfect.

So it’s hopeless?

Of course not. The Python community keeps working hard to improve the GIL and has even tried to remove it, and each minor version has brought many improvements. Interested readers can read further in this slide deck

Another improvement: Reworking the GIL

– Change the switching granularity from based on opcode counting to based on time slice counting

– Prevent the thread that recently released the GIL lock from being scheduled again immediately

– Added thread priority function (high-priority threads can force other threads to release the GIL lock they hold)
