Since Python 3.4, the asyncio module has provided asynchronous I/O as an alternative to threads for handling concurrency, and Python 3.5 added the async/await syntax to support it. The advantage of asynchronous I/O, and of the asyncio implementation in Python, is that it does not spawn memory-intensive operating system threads, so the system uses fewer resources and scales better. Furthermore, in asyncio, scheduling points are clearly defined by the await syntax, whereas in thread-based concurrency the GIL may be released at unpredictable points in the code. As a result, asyncio-based concurrent systems are easier to understand and debug. Finally, asyncio tasks can be canceled, which is not easy to do with threads.
However, to truly benefit from these advantages, it is important to avoid blocking calls in async coroutines. Blocking calls can be network calls, file system calls, sleep calls, and so on. They are harmful because, under the hood, asyncio uses a single-threaded event loop to run coroutines concurrently. If you make a blocking call in a coroutine, it blocks the entire event loop and every other coroutine, which hurts the overall performance of your application.
The following is an example of a blocking call that prevents code from executing concurrently:
<code class="language-python">import asyncio
import datetime
import time


async def example(name):
    print(f"{datetime.datetime.now()}: {name} start")
    time.sleep(1)  # time.sleep is a blocking function
    print(f"{datetime.datetime.now()}: {name} stop")


async def main():
    await asyncio.gather(example("1"), example("2"))


asyncio.run(main())</code>
The output looks similar to:
<code>2025-01-07 18:50:15.327677: 1 start
2025-01-07 18:50:16.328330: 1 stop
2025-01-07 18:50:16.328404: 2 start
2025-01-07 18:50:17.333159: 2 stop</code>
As you can see, the two coroutines are not running concurrently.
To overcome this problem you need to use a non-blocking equivalent or defer execution to the thread pool:
<code class="language-python">import asyncio
import datetime


async def example(name):
    print(f"{datetime.datetime.now()}: {name} start")
    # Replace the blocking time.sleep call with the non-blocking asyncio.sleep coroutine
    await asyncio.sleep(1)
    print(f"{datetime.datetime.now()}: {name} stop")


async def main():
    await asyncio.gather(example("1"), example("2"))


asyncio.run(main())</code>
The output looks similar to:
<code>2025-01-07 18:53:53.579738: 1 start
2025-01-07 18:53:53.579797: 2 start
2025-01-07 18:53:54.580463: 1 stop
2025-01-07 18:53:54.580572: 2 stop</code>
Here the two coroutines run concurrently.
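The other option mentioned above, deferring execution to a thread pool, could look roughly like the following sketch. It assumes Python 3.9 or later for asyncio.to_thread; blocking_work is a hypothetical stand-in for a blocking call that has no async equivalent.

```python
import asyncio
import time


def blocking_work(name):
    """Hypothetical stand-in for a blocking call with no async equivalent."""
    time.sleep(0.2)
    return name


async def main():
    start = time.perf_counter()
    # asyncio.to_thread runs each call in the default thread pool,
    # so the event loop stays free while the worker threads sleep.
    results = await asyncio.gather(
        asyncio.to_thread(blocking_work, "1"),
        asyncio.to_thread(blocking_work, "2"),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

Because the two calls run in separate worker threads, the total time should be close to one sleep rather than two.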
The problem is that it is not always easy to tell whether a method is blocking, especially if the code base is large or uses third-party libraries. Sometimes blocking calls are made deep inside the code.
For example, does this code block?
<code class="language-python">import blockbuster
from importlib.metadata import version


async def get_version():
    return version("blockbuster")</code>
Does Python load package metadata into memory on startup? Is it done when the blockbuster module is loaded? Or when we call version()? Are the results cached, so that subsequent calls are non-blocking? The answer is that the work is done when version() is called, which involves reading the installed package's METADATA file, and the result is not cached. Therefore, version() is a blocking call and should always be deferred to a thread. It is hard to know this without digging into importlib's code.
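Deferring version() to a thread might look like the sketch below. It assumes Python 3.9 or later for asyncio.to_thread; the package parameter and the None fallback for a missing package are illustrative choices, not part of the original snippet.

```python
import asyncio
from importlib.metadata import PackageNotFoundError, version


async def get_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        # version() reads metadata from disk, so run it in a worker thread
        # instead of on the event loop.
        return await asyncio.to_thread(version, package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(asyncio.run(get_version("blockbuster")))
```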
One way to detect blocking calls is to activate asyncio's debug mode, which logs blocking calls that take too long. But this is not the most effective approach: many blocking calls that are shorter than the logging threshold still hurt performance, and blocking times in test or development may differ from those in production. For example, database calls may take longer in production if the database must fetch a large amount of data.
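To illustrate that limitation, here is a small sketch that runs a blocking coroutine in debug mode and collects the warnings asyncio emits. It relies on the default threshold of 0.1 seconds (loop.slow_callback_duration); ListHandler, blocking_step and slow_callback_warnings are helper names made up for this example.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects log messages in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def blocking_step(seconds):
    time.sleep(seconds)  # deliberately blocks the event loop


def slow_callback_warnings(seconds):
    """Run blocking_step in debug mode and return the slow-callback warnings."""
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        # debug=True makes the loop warn about steps slower than
        # loop.slow_callback_duration (0.1 seconds by default).
        asyncio.run(blocking_step(seconds), debug=True)
    finally:
        logger.removeHandler(handler)
    return [message for message in handler.messages if "took" in message]


if __name__ == "__main__":
    print(slow_callback_warnings(0.3))    # long enough to be reported
    print(slow_callback_warnings(0.001))  # blocks too, but stays under the threshold
```

The second call also blocks the event loop, yet nothing is logged, which is exactly the gap described above.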
This is where BlockBuster comes in! When activated, BlockBuster patches several blocking methods of the Python standard library so that they raise an error if they are called from the asyncio event loop. The methods patched by default include methods of the os, io, time, socket, and sqlite3 modules. For a complete list of the methods detected by BlockBuster, see the project README. You can then activate BlockBuster in unit tests or in development mode to catch blocking calls and fix them. If you know the awesome BlockHound library for the JVM, it is the same principle, but for Python. BlockHound was a great source of inspiration for BlockBuster, so thanks to its creators.
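The principle can be illustrated with a deliberately simplified sketch that guards only time.sleep. This is not BlockBuster's actual implementation; BlockingCallError, forbid_sleep_in_event_loop and the other names are hypothetical and exist only for this example.

```python
import asyncio
import contextlib
import time


class BlockingCallError(Exception):
    """Hypothetical error type for this sketch (not BlockBuster's own class)."""


def _in_event_loop():
    """Return True if the current thread is running an asyncio event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


@contextlib.contextmanager
def forbid_sleep_in_event_loop():
    """Temporarily replace time.sleep with a version that refuses to block the loop."""
    original = time.sleep

    def guarded_sleep(seconds):
        if _in_event_loop():
            raise BlockingCallError("time.sleep called from the event loop")
        return original(seconds)

    time.sleep = guarded_sleep
    try:
        yield
    finally:
        time.sleep = original  # always restore the real function


async def blocking_coroutine():
    time.sleep(0.01)


def check():
    """Return True if the guard catches the blocking call inside the event loop."""
    with forbid_sleep_in_event_loop():
        time.sleep(0.01)  # allowed: no event loop is running in this thread
        try:
            asyncio.run(blocking_coroutine())
        except BlockingCallError:
            return True
    return False
```

Since the running loop is tracked per thread, a call deferred to a worker thread would pass through the guard unchanged, which is the behavior you want.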
Let’s see how to use BlockBuster on the above blocking code snippet.
First, we need to install the blockbuster package:
<code>pip install blockbuster</code>
We can then use a pytest fixture and the blockbuster_ctx() method to activate BlockBuster at the beginning of each test and deactivate it during teardown.
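A minimal sketch of such a setup is shown below, as a test-configuration fragment rather than a standalone script. It assumes that blockbuster_ctx can be imported from the blockbuster package and used as a context manager, and that pytest is installed; test_example and its use of asyncio.run are illustrative choices.

```python
# Test module (the fixture could equally live in conftest.py).
import asyncio
import time

import pytest
from blockbuster import blockbuster_ctx  # assumed import path


@pytest.fixture(autouse=True)
def blockbuster():
    # Activate BlockBuster before each test and deactivate it on teardown.
    with blockbuster_ctx() as bb:
        yield bb


async def example(name):
    time.sleep(1)  # blocking call made from the event loop


def test_example():
    # Expected to fail while BlockBuster is active, because time.sleep
    # is called from inside the running event loop.
    asyncio.run(example("1"))
```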
If you run the blocking example as a test under pytest with this fixture active, the test fails: BlockBuster raises an error at the time.sleep call, because it is made from the asyncio event loop.
Note: Typically, in a real project, the blockbuster() fixture is set up in a conftest.py file.
I believe BlockBuster is very useful in asyncio projects. It has helped me detect many blocking call issues in projects I have worked on. But it is not a panacea. In particular, some third-party libraries do not use the patched Python methods to interact with the network or the file system, but wrap C libraries instead. For these libraries, you can add rules in your test setup so that BlockBuster triggers on blocking calls into them. BlockBuster is also open source: contributions that add rules for your favorite libraries to the core project are very welcome. If you see issues or areas for improvement, I would love to receive your feedback in the project issue tracker.