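In Java, the thread pool size is commonly set to the number of CPU cores + 1. Section 8.2 of "Java Concurrency in Practice" contains this passage:
For compute-intensive tasks, on a system with N processors, a thread pool of size N+1 usually achieves optimal utilization. (Even when a compute-intensive thread occasionally pauses because of a page fault or some other reason, this extra thread ensures that CPU cycles are not wasted.)
btw: I'm not very familiar with Java. This is quoted from the web and I have not verified it in practice.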
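并发编程网 (a Chinese site on concurrent programming) also has a related article. Its key points are:
For a CPU-intensive application, set the thread pool size to N+1
For an IO-intensive application, set the thread pool size to 2N+1
Optimal number of threads = ((thread wait time + thread CPU time) / thread CPU time) * number of CPUs
The larger the share of time a thread spends waiting, the more threads are needed. The larger the share of time it spends on the CPU, the fewer threads are needed.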
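To make the formula above concrete, here is a minimal Python sketch. The function name `optimal_thread_count` and its parameters are my own, not taken from the quoted article, and the wait and CPU times are assumed to be measured per task in the same unit (for example milliseconds).

```python
import math
import os


def optimal_thread_count(wait_time, cpu_time, cpu_count=None):
    """Apply: threads = ((wait + cpu) / cpu) * number of CPUs.

    wait_time and cpu_time are per-task averages in the same unit.
    cpu_count defaults to os.cpu_count() (hypothetical convenience).
    """
    if cpu_time <= 0:
        raise ValueError("cpu_time must be positive")
    if wait_time < 0:
        raise ValueError("wait_time must not be negative")
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Multiply first, divide last, then round up to a whole thread.
    return max(1, math.ceil((wait_time + cpu_time) * cpu_count / cpu_time))


if __name__ == "__main__":
    # A task that waits 90 ms and computes for 10 ms, on 4 CPUs.
    print(optimal_thread_count(90, 10, cpu_count=4))
    # A task that never waits needs only one thread per CPU.
    print(optimal_thread_count(0, 10, cpu_count=4))
```

With a 90/10 split between waiting and computing, the formula gives 10 threads per CPU. With no waiting at all it collapses to one thread per CPU.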
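Question: for a processor with n cores and 2n hardware threads, is there anything to watch out for?
All of the above is quoted from Java sources. There is comparatively little material on the Python side, so I would like to discuss it.
Because of the GIL in CPython, only one thread can execute Python bytecode at any given moment, so I am not discussing compute-bound tasks here, only IO-bound ones. How should a Python thread pool be sized to be reasonable? (The best approach for IO is async. I mainly want to discuss cases where async is not supported.)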
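One practical way to approach the sizing question is to measure it. The sketch below is only an illustration: it runs the same blocking tasks through `concurrent.futures.ThreadPoolExecutor` with different `max_workers` values and times each run. `fake_blocking_io`, `run_in_pool`, and `timed_run` are hypothetical helpers, and `time.sleep` stands in for a real blocking call (sleeping releases the GIL, much as a blocking socket read does).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_blocking_io(item, delay=0.05):
    """Stand-in for a blocking network call; sleeping releases the GIL."""
    time.sleep(delay)
    return item * 2


def run_in_pool(items, max_workers):
    """Run the blocking task over items; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_blocking_io, items))


def timed_run(items, max_workers):
    """Return (results, elapsed seconds) for one pool size."""
    start = time.perf_counter()
    results = run_in_pool(items, max_workers)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    items = list(range(32))
    for workers in (1, 4, 16, 32):
        _, elapsed = timed_run(items, workers)
        print(f"max_workers={workers:2d} elapsed={elapsed:.2f}s")
```

For purely blocking work like this, elapsed time should keep dropping as workers are added until the pool is as large as the number of concurrent tasks. Real workloads flatten out earlier because of the remote service, bandwidth, or memory. For reference, when `max_workers` is omitted, recent CPython versions pick roughly `min(32, CPU count + 4)`.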
How to estimate the number of processes that need to be set up for srv?
Principle
The sum of the memory occupied by each process needs to be less than the total memory
IO-intensive
If the business logic involves some blocking network communication, the number of processes can be increased, for example to 3 times the number of CPU cores. If it involves a lot of blocking network overhead, you can raise the number further, for example to 5 times the number of CPU cores or even higher.
CPU-intensive
That is, there is no external network IO overhead, or no blocking network IO overhead. For example, if asynchronous IO is used to read network resources and the process is never blocked by business code, the number of processes can be set equal to the number of CPU cores.
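The rules of thumb above (memory ceiling, a multiple of the core count for blocking IO, one process per core otherwise) can be combined into a rough estimate. This is a sketch under stated assumptions, not a definitive rule: `estimate_process_count`, `memory_budget_mb`, and `per_process_memory_mb` are hypothetical names, and the memory budget is assumed to be whatever you are willing to give the workers, kept below total memory.

```python
import os


def estimate_process_count(io_bound, memory_budget_mb, per_process_memory_mb,
                           io_multiplier=3, cpu_count=None):
    """Estimate a worker process count.

    io_bound: True if requests block on network IO.
    memory_budget_mb: memory reserved for workers (assumed < total memory).
    per_process_memory_mb: measured memory footprint of one worker.
    io_multiplier: processes per core for blocking IO (3 to 5 or higher).
    """
    if per_process_memory_mb <= 0:
        raise ValueError("per_process_memory_mb must be positive")
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # CPU-bound or fully asynchronous: one process per core.
    by_cpu = cpu_count * io_multiplier if io_bound else cpu_count
    # All processes together must fit inside the memory budget.
    by_memory = int(memory_budget_mb // per_process_memory_mb)
    return max(1, min(by_cpu, by_memory))


if __name__ == "__main__":
    # 4 cores, blocking IO, plenty of memory: limited by the multiplier.
    print(estimate_process_count(True, 8192, 200, cpu_count=4))
    # Same machine with a tight memory budget: limited by memory.
    print(estimate_process_count(True, 1000, 200, io_multiplier=5, cpu_count=4))
```

Taking the minimum of the two limits means the memory principle always wins when the two disagree.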
The central question is whether the bottleneck of your response is IO or CPU.
If your response bottleneck is CPU
If your response bottleneck is in IO (such as: network IO)