Introducing BlockBuster: is my asyncio event loop blocked?

Patricia Arquette
Release: 2025-01-09 06:29:43

Python 3.4 introduced the asyncio library, and Python 3.5 added the async/await syntax, offering asynchronous I/O as an alternative to threads for handling concurrency. Because asyncio does not spawn memory-intensive operating system threads, it uses fewer resources and scales better. Furthermore, in asyncio, scheduling points are clearly defined by the await syntax, whereas in thread-based concurrency the GIL may be released at unpredictable points in the code. As a result, asyncio-based concurrent systems are easier to understand and debug. Finally, asyncio tasks can be cancelled, which is not easy to do with threads.

However, to truly benefit from these advantages, it is important to avoid blocking calls in async coroutines. Blocking calls can be network calls, file system calls, sleep calls, and so on. They are harmful because, under the hood, asyncio uses a single-threaded event loop to run coroutines concurrently. A blocking call in one coroutine therefore blocks the entire event loop and every other coroutine, degrading the overall performance of your application.

The following is an example of a blocking call that prevents code from executing concurrently:

<code class="language-python">import asyncio
import datetime
import time

async def example(name):
    print(f"{datetime.datetime.now()}: {name} start")
    time.sleep(1)  # time.sleep is a blocking function
    print(f"{datetime.datetime.now()}: {name} stop")

async def main():
    await asyncio.gather(example("1"), example("2"))

asyncio.run(main())</code>

The output is similar to:

<code>2025-01-07 18:50:15.327677: 1 start
2025-01-07 18:50:16.328330: 1 stop
2025-01-07 18:50:16.328404: 2 start
2025-01-07 18:50:17.333159: 2 stop</code>

As you can see, the two coroutines are not running concurrently.

To overcome this problem, you need to use a non-blocking equivalent or defer execution to a thread pool:

<code class="language-python">import asyncio
import datetime

async def example(name):
    print(f"{datetime.datetime.now()}: {name} start")
    await asyncio.sleep(1)  # replace the blocking time.sleep call with the non-blocking asyncio.sleep coroutine
    print(f"{datetime.datetime.now()}: {name} stop")

async def main():
    await asyncio.gather(example("1"), example("2"))

asyncio.run(main())</code>

The output is similar to:

<code>2025-01-07 18:53:53.579738: 1 start
2025-01-07 18:53:53.579797: 2 start
2025-01-07 18:53:54.580463: 1 stop
2025-01-07 18:53:54.580572: 2 stop</code>

Here the two coroutines run concurrently.
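When no non-blocking equivalent exists, the other option is to defer the blocking call to a thread pool. The following is a minimal sketch of that approach, assuming the loop's default executor is acceptable; the 0.3 second sleep and the timing code are only there for illustration.

```python
import asyncio
import time


async def example(name):
    loop = asyncio.get_running_loop()
    # Passing None selects the event loop's default ThreadPoolExecutor.
    # time.sleep still blocks, but it blocks a worker thread, not the loop.
    await loop.run_in_executor(None, time.sleep, 0.3)
    return name


async def main():
    start = time.perf_counter()
    names = await asyncio.gather(example("1"), example("2"))
    elapsed = time.perf_counter() - start
    return names, elapsed


if __name__ == "__main__":
    names, elapsed = asyncio.run(main())
    print(names, f"{elapsed:.2f}s")
```

Both calls sleep at the same time in separate worker threads, so the total elapsed time should be close to one sleep duration instead of two.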

The problem is that it is not always easy to tell whether a function is blocking, especially if the code base is large or uses third-party libraries. Sometimes blocking calls are buried deep in the call stack.

For example, does this code block?

<code class="language-python">import blockbuster
from importlib.metadata import version

async def get_version():
    return version("blockbuster")</code>

Does Python load package metadata into memory on startup? Is it done when the blockbuster module is loaded? Or when we call version()? Are the results cached, so that subsequent calls do not block? The correct answer is that it happens when version() is called, which involves reading the installed package's METADATA file, and the results are not cached. Therefore, version() is a blocking call and should always be deferred to a thread. This is hard to know without digging into importlib's code.
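As a sketch of what deferring version() to a thread could look like, the coroutine below uses asyncio.to_thread (available since Python 3.9). The package parameter and the choice to return None for a missing package are assumptions made for this illustration.

```python
import asyncio
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


async def get_version(package: str) -> Optional[str]:
    """Look up a package version without blocking the event loop."""
    try:
        # version() reads the METADATA file from disk, so run it in a worker thread.
        return await asyncio.to_thread(version, package)
    except PackageNotFoundError:
        return None  # the package is not installed


if __name__ == "__main__":
    print(asyncio.run(get_version("blockbuster")))
```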

One way to detect blocking calls is to activate asyncio's debug mode, which logs blocking calls that take too long. But this is not the most effective approach: many blocking calls shorter than the trigger threshold still hurt performance, and blocking times in test or development environments may differ from those in production. For example, database calls may take longer in production if the database must fetch a large amount of data.
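To make the limitation concrete, here is a small sketch of debug mode in action. It relies on asyncio.run(..., debug=True) and on the loop's default slow_callback_duration of 0.1 seconds; the ListHandler class and the slow_callback_warnings helper are hypothetical names introduced only to capture what asyncio logs.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects warning messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def short_block():
    time.sleep(0.01)  # below the 0.1 s default threshold: not reported


async def long_block():
    time.sleep(0.2)  # above the threshold: reported by debug mode


def slow_callback_warnings(coro_func):
    """Run coro_func in debug mode and return asyncio's slow-callback warnings."""
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        asyncio.run(coro_func(), debug=True)
    finally:
        logger.removeHandler(handler)
    # Slow-callback warnings start with "Executing" and report the time taken.
    return [m for m in handler.messages if m.startswith("Executing")]


if __name__ == "__main__":
    print(len(slow_callback_warnings(short_block)))
    print(len(slow_callback_warnings(long_block)))
```

The short blocking call goes unnoticed even though it still stalls every other coroutine for its duration, which is exactly the gap described above.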

This is where BlockBuster comes in! When activated, BlockBuster patches several blocking Python standard library functions so that they raise an error if they are called from the asyncio event loop. The functions patched by default include ones from the os, io, time, socket, and sqlite3 modules. For the complete list of functions detected by BlockBuster, see the project README. You can then activate BlockBuster in unit tests or in development mode to catch blocking calls and fix them. If you know the awesome BlockHound library for the JVM, it's the same principle, but for Python. BlockHound was a great source of inspiration for BlockBuster; thanks to its creators.
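The general idea can be illustrated with a few lines of plain Python. This is only a sketch of the principle, not BlockBuster's actual implementation: BlockingCallError, raise_if_in_event_loop, and guarded_sleep are hypothetical names, and the real library patches the original functions in place rather than creating wrapped copies.

```python
import asyncio
import functools
import time


class BlockingCallError(Exception):
    """Hypothetical error raised when a guarded function runs in the event loop."""


def raise_if_in_event_loop(func):
    """Wrap func so that it raises when called from a running event loop."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No event loop is running in this thread: the call is allowed.
            return func(*args, **kwargs)
        raise BlockingCallError(f"Blocking call to {func.__name__}")

    return wrapper


guarded_sleep = raise_if_in_event_loop(time.sleep)


async def coroutine_with_blocking_call():
    guarded_sleep(0.01)


def demo():
    guarded_sleep(0.01)  # fine: called outside any event loop
    try:
        asyncio.run(coroutine_with_blocking_call())
    except BlockingCallError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo())
```

Because the check only looks at the current thread, the same guarded function can still be called from a worker thread, which is where blocking calls belong.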

Let’s see how to use BlockBuster on the above blocking code snippet.

First, we need to install the blockbuster package:

<code class="language-shell">pip install blockbuster</code>

We can then use a pytest fixture and the blockbuster_ctx() context manager to activate BlockBuster at the beginning of each test and deactivate it during teardown.

<code class="language-python">import time

import pytest
from blockbuster import blockbuster_ctx

@pytest.fixture(autouse=True)
def blockbuster():
    # Active for the duration of each test, deactivated on teardown
    with blockbuster_ctx() as bb:
        yield bb

# Assumes an async-capable pytest plugin (e.g. pytest-asyncio) runs this test
async def test_time_sleep():
    time.sleep(0.1)  # blocking call made from the event loop</code>

If you run this with pytest, the test fails: BlockBuster raises an error because time.sleep is called from within the asyncio event loop.

Note: Typically, in a real project, the blockbuster() fixture will be set up in a conftest.py file.

Conclusion

I believe BlockBuster is very useful in asyncio projects. It has helped me detect many blocking-call issues in projects I've worked on. But it's not a panacea. In particular, some third-party libraries do not use Python standard library functions to interact with the network or file system, but wrap C libraries instead. For these libraries, you can add rules in your test setup to detect blocking calls made through them. BlockBuster is also open source: contributions that add rules for your favorite libraries to the core project are very welcome. If you see issues or areas for improvement, I'd love to receive your feedback in the project's issue tracker.

Some links:

  • GitHub project
  • Issue tracker
  • Package on PyPI
