Why Use memory_order_seq_cst to Set the Stop Flag but memory_order_relaxed to Check It?
In his discussion on atomic operations, Herb Sutter presents several example uses of atomics, one of which is a stop flag. The main thread sets the flag with the default memory_order_seq_cst, while worker threads poll it with memory_order_relaxed:
while (!stop.load(std::memory_order_relaxed)) {
    // Do stuff.
}
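For context, the whole pattern might look roughly like the sketch below. It is only an illustration under assumptions: the helper name run_until_stopped, the exited counter, the yield() stand-in for real work, and the 10 ms delay are all hypothetical and do not come from the talk.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical helper: start n workers that poll a stop flag, then ask them
// to stop. Returns how many workers observed the flag and left their loop.
int run_until_stopped(int n) {
    assert(n >= 0);
    std::atomic<bool> stop{false};
    std::atomic<int> exited{0};
    std::vector<std::thread> workers;

    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&] {
            // Relaxed check, as in the loop above.
            while (!stop.load(std::memory_order_relaxed)) {
                std::this_thread::yield();  // stand-in for "do stuff"
            }
            exited.fetch_add(1, std::memory_order_relaxed);
        });
    }

    // Arbitrary delay so the workers get to spin for a while.
    std::this_thread::sleep_for(std::chrono::milliseconds(10));

    stop = true;  // operator= performs a memory_order_seq_cst store

    for (auto& t : workers) {
        t.join();  // join() synchronizes with the end of each worker
    }
    return exited.load(std::memory_order_relaxed);
}
```

Calling run_until_stopped(4) should return 4 once every worker has seen the flag and been joined.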
Reasons for Not Using relaxed on Store Operation
Herb suggests that memory_order_relaxed is acceptable for checking the flag because a short delay in noticing it is not critical. In fact, relaxed would be fine for the store as well: stronger memory orders bring no meaningful latency benefit, even if latency were a priority.
Why his example does not use relaxed on the store remains unclear; it may be an oversight or a personal preference.
Latency Considerations
The ISO C++ standard does not specify how quickly a store must become visible to other threads, nor does it offer any way to influence that. This holds for every atomic operation, relaxed included. The standard does, however, encourage implementations to make stored values visible to atomic loads within a reasonable amount of time.
In practice, specific latency is determined by the implementation, with hardware cache coherence mechanisms typically allowing visibility within tens of nanoseconds in best-case scenarios and sub-microsecond intervals in near-worst-case scenarios.
Memory Order Implications
Choosing a different memory order for a store or load does not make the store happen any sooner in real time; it only controls whether later operations can become globally visible while the store is still pending.
In essence, stronger orders and barriers do not make anything happen faster in absolute terms; they delay other operations until the store or load is complete. This holds for all real-world CPUs, which already try to make stores visible to other cores as quickly as they can.
Hence, a stronger order such as seq_cst on the store does not make the change to the stop flag visible to worker threads any sooner, and so does not speed up shutdown. It only makes the storing thread wait before it performs later operations.
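One way to see this for yourself is to time how long a spinning worker takes to notice the store under different memory orders. The sketch below is a rough illustration, not a rigorous benchmark: the helper name stop_latency_ns and the ready handshake are assumptions made for the example, and the measured values include clock and scheduling noise.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: nanoseconds from just before the store of the stop
// flag until a spinning worker observes it, for a given store order.
long long stop_latency_ns(std::memory_order store_order) {
    // Only these orders are valid for a store.
    assert(store_order == std::memory_order_relaxed ||
           store_order == std::memory_order_release ||
           store_order == std::memory_order_seq_cst);

    using clock = std::chrono::steady_clock;
    std::atomic<bool> stop{false};
    std::atomic<bool> ready{false};
    clock::time_point seen;  // written by the worker, read after join()

    std::thread worker([&] {
        ready.store(true, std::memory_order_release);
        while (!stop.load(std::memory_order_relaxed)) {
            // spin until the flag is seen
        }
        seen = clock::now();
    });

    // Wait until the worker is actually running before taking the start time.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }

    const clock::time_point before = clock::now();
    stop.store(true, store_order);
    worker.join();  // makes the worker's write to `seen` safe to read

    return std::chrono::duration_cast<std::chrono::nanoseconds>(seen - before)
        .count();
}
```

Running stop_latency_ns(std::memory_order_relaxed) and stop_latency_ns(std::memory_order_seq_cst) many times and comparing the distributions should show no systematic advantage for seq_cst, though the exact numbers depend on the hardware and the scheduler.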
Benefits of relaxed Check
Using memory_order_relaxed for the check keeps ordering constraints out of the worker loop, which helps instruction-level parallelism and memory bandwidth utilization.
Additional Considerations
Herb correctly identifies that relaxed is also acceptable for the dirty flag, because thread.join already provides the necessary synchronization. Note, however, that dirty must still be atomic: without atomicity, simultaneous writes, even of the same value, are a data race under the ISO C++ standard.
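A minimal sketch of the dirty flag pattern follows. It assumes that workers set dirty when they change something and that the main thread reads it only after joining them; the helper name any_worker_dirtied and the rule that even-numbered workers set the flag are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: n workers may each set a shared dirty flag; the main
// thread reads it only after joining all of them.
bool any_worker_dirtied(int n) {
    assert(n >= 0);
    std::atomic<bool> dirty{false};
    std::vector<std::thread> workers;

    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&dirty, i] {
            // Several workers may store the same value concurrently.
            // With a plain bool this would be a data race; with
            // std::atomic<bool> a relaxed store is enough.
            if (i % 2 == 0) {
                dirty.store(true, std::memory_order_relaxed);
            }
        });
    }

    for (auto& t : workers) {
        t.join();  // provides the synchronization the relaxed accesses lack
    }

    // Safe with relaxed: every possible writer has already been joined.
    return dirty.load(std::memory_order_relaxed);
}
```

Here any_worker_dirtied(4) returns true because workers 0 and 2 set the flag, while any_worker_dirtied(0) returns false because no worker runs.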
In conclusion, setting the stop flag with memory_order_seq_cst does not make it visible to worker threads any sooner than a relaxed store would, so relaxed is sufficient for both the store and the load. On the load side, memory_order_relaxed also offers advantages in instruction-level parallelism and memory bandwidth utilization, making it the preferred choice in such scenarios.