Comparing Python and ArkScript asynchronous models

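Python has been getting a lot of attention lately. The 3.13 release, planned for October this year, will begin the huge work of removing the GIL. A prerelease is already out for curious users who want to try a (nearly) GIL-less Python.

All this hype made me dig into my own language, ArkScript, since I also had a Global VM Lock in the past (added in version 3.0.12 in 2020, removed in 3.1.3 in 2022). It pushed me to compare the two and to dig deeper into the how and why of Python's GIL.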

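Definitions

To get started, let's define what a GIL (global interpreter lock) is:

A global interpreter lock (GIL) is a mechanism used in computer-language interpreters to synchronize the execution of threads so that only one native thread (per process) can execute basic operations (such as memory allocation and reference counting) at a time.

Wikipedia, Global interpreter lock

Concurrency is when two or more tasks can start, run, and complete in overlapping time periods, which does not mean they are all running at the same instant.
Parallelism is when tasks literally run at the same time, for example on a multi-core processor.

For an in-depth explanation, check this Stack Overflow answer.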

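Python's GIL

The GIL can increase the speed of single-threaded programs because you don't have to acquire and release locks on every data structure: the entire interpreter is locked, so you are safe by default.

However, since there is one GIL per interpreter, parallelism is limited: to use more than one core, you need to spawn a whole new interpreter in a separate process (using the multiprocessing module instead of threading). This costs more than spawning a new thread, because you now have to think about inter-process communication, which adds a non-negligible overhead (see GeekPython's "GIL become Optional in Python 3.13" for benchmarks).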

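How does it affect Python's async?

In Python's case, the reason is that the main implementation, CPython, does not have thread-safe memory management. Without the GIL, the following scenario would produce a race condition:

create a shared variable count = 5
thread 1: count *= 2
thread 2: count += 1

If thread 1 runs first, count will be 11 (count * 2 = 10, then count + 1 = 11).

If thread 2 runs first, count will be 12 (count + 1 = 6, then count * 2 = 12).

The order of execution matters, but even worse can happen: if both threads read count at the same time, one will erase the result of the other, and count will be either 10 or 6!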
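The four outcomes above can be checked with a small deterministic simulation. This is only a sketch: it does not use real threads, it splits each update into an explicit read step and write step and replays every valid interleaving. The helper `run` and the schedule representation are made up for illustration.

```python
from itertools import permutations

def run(schedule):
    """Replay one interleaving of the two threads' read/write steps."""
    shared = {"count": 5}
    local = {}  # value each thread read, as if held in a register
    ops = {1: lambda v: v * 2, 2: lambda v: v + 1}
    for thread, step in schedule:
        if step == "read":
            local[thread] = shared["count"]
        else:  # "write": store the result computed from the earlier read
            shared["count"] = ops[thread](local[thread])
    return shared["count"]

steps = [(1, "read"), (1, "write"), (2, "read"), (2, "write")]
# Keep only the schedules where each thread reads before it writes.
schedules = [
    s for s in permutations(steps)
    if s.index((1, "read")) < s.index((1, "write"))
    and s.index((2, "read")) < s.index((2, "write"))
]
outcomes = {run(s) for s in schedules}
print(sorted(outcomes))  # [6, 10, 11, 12]
```

Only two of the six interleavings give the "expected" 11 or 12; the other four lose one of the updates.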

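Overall, having a GIL makes the (CPython) implementation simpler and faster in the general case:

faster in the single-threaded case (no need to acquire/release a lock for every operation)
faster in the multi-threaded case for IO-bound programs (because the IO waits happen outside the GIL)
faster in the multi-threaded case for CPU-bound programs that do their compute-intensive work in C (because the GIL is released before calling the C code)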
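The IO-bound case can be illustrated with a rough sketch in which `time.sleep` stands in for a blocking IO call (it releases the GIL while waiting). The function name `fake_io` and the delays are arbitrary choices for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(delay):
    # Sleeping releases the GIL, much like waiting on a socket or a file.
    time.sleep(delay)
    return delay

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, [0.2] * 4))
elapsed = time.perf_counter() - start

# The four waits overlap, so the total should be well under the 0.8 s
# that running them one after the other would take.
print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

Replacing the sleep with a pure-Python CPU loop would not show the same overlap on a GIL-enabled build, since only one thread can run Python bytecode at a time.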

It also makes wrapping C libraries easier, because the GIL guarantees thread safety.

The downside is that your code is asynchronous as in concurrent, but not parallel.

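[!NOTE]
Python 3.13 is starting to remove the GIL!

PEP 703 added a build configuration, --disable-gil, so a Python 3.13+ built with it (the free-threaded build, still experimental in 3.13) can benefit from performance improvements in multithreaded programs.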
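To find out which kind of interpreter you are running, something like the following sketch should work. It relies on the `Py_GIL_DISABLED` build variable and on `sys._is_gil_enabled()`, which only exists on Python 3.13 and later, so the code falls back to "GIL enabled" on older versions. The helper name `gil_status` is made up for the example.

```python
import sys
import sysconfig

def gil_status():
    """Return (free_threaded_build, gil_enabled) for the running interpreter."""
    # Py_GIL_DISABLED is set to 1 only when CPython was built with --disable-gil.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on Python 3.13+; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return free_threaded_build, gil_enabled

free_threaded_build, gil_enabled = gil_status()
print(f"free-threaded build: {free_threaded_build}, GIL enabled: {gil_enabled}")
```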

Python async / await model

In Python, functions have to take a color: they are either "normal" or "async". What does this mean in practice?

>>> def foo(call_me):
...     print(call_me())
... 
>>> async def a_bar():
...     return 5
... 
>>> def bar():
...     return 6
... 
>>> foo(a_bar)
<coroutine object a_bar at 0x10491f480>
<stdin>:2: RuntimeWarning: coroutine 'a_bar' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
>>> foo(bar)
6

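Because an async function does not return its value immediately but returns a coroutine object instead, it can't be used as a callback everywhere, unless the function you are calling is designed to take async callbacks.

We get a hierarchy of functions, because a "normal" function has to be made async itself to use the await keyword, which is needed to call async functions: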

         can call
normal -----------> normal

         can call
async -+-----------> normal
       |
       .-----------> async
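The hierarchy can be seen in a few lines of Python. This sketch reuses `bar` and `a_bar` from the earlier session; `async_caller` and `normal_caller` are names invented for the example.

```python
import asyncio

async def a_bar():
    return 5

def bar():
    return 6

async def async_caller():
    # async -> normal and async -> async both work directly.
    return bar() + await a_bar()

def normal_caller():
    # normal -> async cannot use await; it has to hand the coroutine
    # to an event loop, here through asyncio.run().
    return asyncio.run(a_bar())

print(asyncio.run(async_caller()))  # 11
print(normal_caller())              # 5
```

Note that `asyncio.run` only works when no event loop is already running in the current thread, so it is an entry point rather than a general way for normal functions to call async ones.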

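Apart from trusting the caller, a function has no way to know from the call alone whether a callback is async or not (unless you inspect the callback at runtime, or call it first inside a try/except block and check what comes back, but that's ugly).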
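For what it's worth, one way to cope with a callback of unknown color is to call it and inspect what it returned, using `inspect.iscoroutine` from the standard library. The helper `call_any` below is hypothetical and only a sketch: it assumes no event loop is already running in the current thread.

```python
import asyncio
import inspect

async def a_bar():
    return 5

def bar():
    return 6

def call_any(callback):
    """Hypothetical helper: call a callback of unknown color, return its value."""
    result = callback()
    if inspect.iscoroutine(result):
        # The callback was async: drive the coroutine to completion.
        return asyncio.run(result)
    return result

print(call_any(a_bar))  # 5
print(call_any(bar))    # 6
```

It works, but every function that accepts callbacks would need this kind of check, which is the coloring problem again in a different shape.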

ArkScript parallelism

In the beginning, ArkScript was using a Global VM Lock (akin to Python's GIL), because the http.arkm module (used to create HTTP servers) was multithreaded and it caused problems with ArkScript's VM by altering its state through modifying variables and calling functions on multiple threads.

Then in 2021, I started working on a new model to handle the VM state so that we could parallelize it easily, and wrote an article about it. It was later implemented by the end of 2021, and the Global VM Lock was removed.

ArkScript async/await

ArkScript does not assign a color to async functions, because they do not exist in the language: you either have a function or a closure, and both can call each other without any additional syntax (in this language, a closure is a poor man's object: a function holding a mutable state).

Any function can be made async at the call site (instead of declaration):

(let foo (fun (a b c)
    (+ a b c)))

(print (foo 1 2 3))  # 6

(let future (async foo 1 2 3))
(print future)          # UserType<0, 0x0x7f0e84d85dd0>
(print (await future))  # 6
(print (await future))  # nil

Using the async builtin, we are spawning a std::future under the hood (leveraging std::async and threads) to run our function given a set of arguments. Then we can call await (another builtin) and get a result whenever we want, which will block the current VM thread until the function returns.

Thus, it is possible to await from any function, and from any thread.
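For readers more at home in Python, the closest analogy is probably `concurrent.futures`: an ordinary function is submitted to a thread at the call site, and the returned future blocks on `result()`. This is only an analogy, not how ArkScript is implemented, and it differs in one detail: a Python future can be read several times, while the second `await` in the ArkScript example returns nil.

```python
from concurrent.futures import ThreadPoolExecutor

def foo(a, b, c):
    return a + b + c

with ThreadPoolExecutor(max_workers=1) as pool:
    # "async" is decided at the call site: foo itself is a plain function.
    future = pool.submit(foo, 1, 2, 3)
    # Blocks the calling thread until foo returns, like (await future).
    result = future.result()

print(result)  # 6
```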

The specificities

All of this is possible because we have a single VM that operates on a state contained inside an Ark::internal::ExecutionContext, which is tied to a single thread. The VM is shared between the threads, not the contexts!

        .---> thread 0, context 0
        |            ^
VM <----+       can't interact
        |            v
        .---> thread 1, context 1

When creating a future by using async, we are:

copying all the arguments to the new context,
creating a brand new stack and scopes,
and finally creating a separate thread.

This forbids any sort of synchronization between threads since ArkScript does not expose references or any kind of lock that could be shared (this was done for simplicity reasons, as the language aims to be somewhat minimalist but still usable).
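The effect of those steps can be mimicked in Python, as a rough sketch rather than a description of the ArkScript VM. `spawn_isolated` is a made-up helper that deep-copies its arguments before starting a thread, so the spawned function cannot touch the caller's data.

```python
import copy
import threading

def spawn_isolated(func, *args):
    """Hypothetical helper: copy the arguments, then run func in a new thread."""
    private_args = copy.deepcopy(args)  # copy all the arguments
    box = {}                            # fresh storage owned by this call

    def runner():
        box["value"] = func(*private_args)

    thread = threading.Thread(target=runner)  # separate thread
    thread.start()
    return thread, box

def append_and_sum(items):
    items.append(100)  # mutates only the private copy
    return sum(items)

data = [1, 2, 3]
thread, box = spawn_isolated(append_and_sum, data)
thread.join()
print(box["value"])  # 106
print(data)          # [1, 2, 3]
```

Because nothing is shared, there is nothing to race on, at the price of one copy per argument and per call.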

However, this approach isn't better (nor worse) than Python's: we create a new thread per call, which is a bit costly, and the number of threads a CPU can run at once is limited. Luckily, I don't see that as a problem to tackle, as one should never create hundreds or thousands of threads simultaneously, nor call hundreds or thousands of async Python functions simultaneously: both would result in a huge slowdown of your program.

In the first case, this would slow down your process (or even your computer) as the OS juggles to give time to every thread; in the second case, it is Python's scheduler that would have to juggle between all of your coroutines.

[!NOTE]
Out of the box, ArkScript does not provide mechanisms for thread synchronization. However, when we pass a UserType (which is a wrapper on top of type-erased C++ objects) to a function, the underlying object isn't copied.

With some careful coding, one could create a lock using the UserType construct, that would allow synchronization between threads.
(let lock (module:createLock))
(let foo (fun (lock i) {
  (lock true)
  (print (str:format "hello {}" i))
  (lock false) }))
(async foo lock 1)
(async foo lock 2)
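For comparison, the same idea in Python would use a `threading.Lock` shared by both threads. This is an analogy only: the ArkScript `module:createLock` above is hypothetical, while `threading.Lock` is part of Python's standard library.

```python
import threading

lock = threading.Lock()
lines = []

def foo(lock, i):
    # Same role as (lock true) ... (lock false) in the ArkScript sketch.
    with lock:
        lines.append(f"hello {i}")

threads = [threading.Thread(target=foo, args=(lock, i)) for i in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(lines))  # ['hello 1', 'hello 2']
```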

Conclusion

ArkScript and Python use two very different kinds of async / await: the first one requires the use of async at the call site and spawns a new thread with its own context, while the latter requires the programmer to mark functions as async to be able to use await, and those async functions are coroutines, running in the same thread as the interpreter.
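The Python half of that statement is easy to check: a coroutine driven by `asyncio.run` reports the same thread identifier as the code that started it, while a function run in a `threading.Thread` (closer to what ArkScript's async does) reports a different one. The function names in this sketch are arbitrary.

```python
import asyncio
import threading

main_id = threading.get_ident()

async def coroutine_thread_id():
    return threading.get_ident()

def worker(out):
    out.append(threading.get_ident())

coro_id = asyncio.run(coroutine_thread_id())

out = []
thread = threading.Thread(target=worker, args=(out,))
thread.start()
thread.join()

print(coro_id == main_id)  # True: the coroutine ran in the calling thread
print(out[0] == main_id)   # False: the worker ran in its own thread
```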