Detailed explanation of the principles of thread pool and Executor in Java

This article explains in detail how Java thread pools and the Executor framework work, with examples and an analysis of the underlying principles. Readers who need this material can use it as a reference.

Detailed analysis of the principles of Java thread pool and Executor

The role and basics of thread pools

Before we begin, let's discuss the concept of a "thread pool". As the name suggests, a thread pool is a cache of threads: a collection of one or more threads to which users can simply hand the tasks they need run, without worrying about the details of execution. So what does a thread pool do for us, and what advantages does it have over using Thread directly? I briefly summarize them in the following points:

Reduce the consumption caused by thread creation and destruction

I analyzed the implementation of Java Thread in an earlier post on this blog. On Linux, a Java Thread maps 1:1 to a kernel thread. In addition, a Thread carries a lot of member data in both the Java layer and the C++ layer, so a Java Thread is actually fairly heavyweight. Creating and destroying one requires a lot of work from both the OS and the JVM, so caching and reusing Java Threads yields a measurable efficiency gain.
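As a rough illustration of thread reuse, here is a minimal sketch (the class and method names are made up for this example). It runs several tasks one after another on a pool with a single thread and records which thread ran each one. All tasks should report the same worker thread, meaning one Thread was created and then reused.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class: shows that a pool reuses its worker thread.
class ThreadReuseDemo {

    // Runs `taskCount` tasks one after another on a one-thread pool and
    // returns the names of the distinct threads that executed them.
    static Set<String> workerNames(int taskCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Callable<String> whoAmI = () -> Thread.currentThread().getName();
        Set<String> names = new HashSet<>();
        try {
            for (int i = 0; i < taskCount; i++) {
                // submit() returns a Future; get() waits for the task's result.
                names.add(pool.submit(whoAmI).get());
            }
        } finally {
            pool.shutdown();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Five tasks were run, but the set should hold a single thread name.
        System.out.println(workerNames(5));
    }
}
```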

More convenient and transparent implementation of computing resource control

This point is easier to discuss with an example. Take the well-known web server Nginx, which is known for its strong concurrency and low resource consumption. To meet these demanding requirements, Nginx strictly limits the number of workers (Nginx workers are processes, and their number is generally set equal to the number of CPU cores). The point of this design is to reduce the performance lost to context switching. The same optimization applies to Java. If a new Thread is created for every task, program resources become hard to control (a single feature can saturate the CPU), and overall execution also tends to be slower. Java provides FixedThreadPool, which you can use to cap the maximum number of threads.
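To make the resource-control point concrete, the following sketch (class and method names are assumptions for illustration) submits many short tasks to a fixed pool and tracks how many run at the same moment. The observed peak should never exceed the pool size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: measures how many tasks run at the same time.
class BoundedConcurrencyDemo {

    // Submits `taskCount` short tasks to a fixed pool of `poolSize` threads and
    // returns the highest number of tasks observed running simultaneously.
    static int peakConcurrency(int poolSize, int taskCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < taskCount; i++) {
                futures.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max); // remember the maximum
                    try {
                        Thread.sleep(10); // simulate a little work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every task to finish
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        // With 4 threads, the peak stays at or below 4 however many tasks arrive.
        System.out.println(peakConcurrency(4, 20));
    }
}
```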

With that background out of the way, let's turn to how Java implements thread pools. Java offers several thread pool implementations:

cached ThreadPool

The cached thread pool keeps previously created threads around, so newly submitted tasks can run on a cached thread. This delivers the first advantage mentioned above.

fixed ThreadPool

One characteristic of the cached ThreadPool is that if no idle thread is available to run a newly submitted task, it creates a new thread. The fixed ThreadPool does not do this: it queues the task and runs it once a thread becomes idle. In other words, it delivers the second advantage mentioned above.
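The difference can be observed directly. In this sketch (GrowthDemo and burst are hypothetical names), each pool receives a burst of tasks that all block on a latch, and the pool is then asked how many threads it has and how many tasks are waiting in its queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical demo class: compares how the two pools react to a burst of tasks.
class GrowthDemo {

    // Submits `tasks` blocking tasks and returns {poolSize, queuedTasks}
    // as observed while every task is still held back by the latch.
    static int[] burst(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] { pool.getPoolSize(), pool.getQueue().size() };
        } finally {
            release.countDown(); // let the tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] cached = burst((ThreadPoolExecutor) Executors.newCachedThreadPool(), 6);
        int[] fixed = burst((ThreadPoolExecutor) Executors.newFixedThreadPool(2), 6);
        System.out.printf("cached: threads=%d queued=%d%n", cached[0], cached[1]);
        System.out.printf("fixed:  threads=%d queued=%d%n", fixed[0], fixed[1]);
    }
}
```

With six blocked tasks, the cached pool should report six threads and an empty queue, while the fixed pool of two should report two threads and four queued tasks.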

scheduled ThreadPool

The scheduled ThreadPool supports task scheduling, such as delayed execution and periodic execution of tasks.
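A short sketch of both kinds of scheduling follows, using the standard ScheduledExecutorService API. The class name, method names, and the delays chosen are assumptions for illustration only.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: one delayed task and one periodic task.
class ScheduleDemo {

    // Schedules a one-shot task and returns how long (in ms) the result took to arrive.
    static long delayedRun(long delayMillis) throws Exception {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        try {
            long start = System.nanoTime();
            ScheduledFuture<String> future =
                    pool.schedule(() -> "done", delayMillis, TimeUnit.MILLISECONDS);
            future.get(); // blocks until the delay has passed and the task has run
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            pool.shutdown();
        }
    }

    // Runs a counter task at a fixed rate for about `totalMillis` and
    // returns how many times it ran.
    static int periodicRuns(long periodMillis, long totalMillis) throws Exception {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        AtomicInteger runs = new AtomicInteger();
        try {
            pool.scheduleAtFixedRate(runs::incrementAndGet, 0, periodMillis, TimeUnit.MILLISECONDS);
            Thread.sleep(totalMillis);
        } finally {
            pool.shutdownNow(); // periodic tasks repeat until the pool is stopped
        }
        return runs.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("delayed task took about " + delayedRun(100) + " ms");
        System.out.println("periodic task ran " + periodicRuns(50, 400) + " times");
    }
}
```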

In addition to these three, Java also provides newWorkStealingPool, which is based on the Fork/Join framework. I haven't looked into it yet, so I will leave it aside. In Java's concurrency support, the various thread pools are exposed through the Executor interface. The name "executor" is quite apt: a thread pool is simply something that executes tasks.
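One practical consequence of this packaging is that calling code can be written against the interface and stay unaware of which pool it is given. The sketch below (AnyPoolDemo and sumOfSquares are hypothetical names) runs the same work on three different pools through ExecutorService.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: the same code works with any kind of pool.
class AnyPoolDemo {

    // Sums the squares of 1..n by handing one small task per number to `pool`.
    // The method only sees the ExecutorService interface, not the pool type.
    static long sumOfSquares(ExecutorService pool, int n) throws Exception {
        List<Callable<Long>> tasks = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            final long value = i;
            tasks.add(() -> value * value);
        }
        long total = 0;
        try {
            for (Future<Long> f : pool.invokeAll(tasks)) { // waits for all tasks
                total += f.get();
            }
        } finally {
            pool.shutdown();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(Executors.newCachedThreadPool(), 10));
        System.out.println(sumOfSquares(Executors.newFixedThreadPool(4), 10));
        System.out.println(sumOfSquares(Executors.newWorkStealingPool(), 10));
    }
}
```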

1. Implementation of cached ThreadPool and fixed ThreadPool

As the descriptions above suggest, these two thread pools are very similar. In fact, they share the same implementation. To see this, let's look at a practical example:

ThreadPoolExecutor executor1 = (ThreadPoolExecutor)Executors.newCachedThreadPool();

ThreadPoolExecutor executor2 = (ThreadPoolExecutor)Executors.newFixedThreadPool(4);

These two ways of creating a pool look very similar. If you are not convinced, here is the source:

public static ExecutorService newCachedThreadPool() {
  return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                 60L, TimeUnit.SECONDS,
                 new SynchronousQueue<Runnable>());
}

public static ExecutorService newFixedThreadPool(int nThreads) {
  return new ThreadPoolExecutor(nThreads, nThreads,
                 0L, TimeUnit.MILLISECONDS,
                 new LinkedBlockingQueue<Runnable>());
}
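The settings passed by the two factory methods can also be read back at runtime. This is a small sketch, assuming (as the casts earlier in this article do) that both factories return a ThreadPoolExecutor; PoolConfigDemo and describe are names invented for the example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: prints the settings behind each factory method.
class PoolConfigDemo {

    // Formats the values that the factory method passed to the constructor.
    static String describe(ThreadPoolExecutor pool) {
        return String.format("core=%d max=%d keepAlive=%ds queue=%s",
                pool.getCorePoolSize(),
                pool.getMaximumPoolSize(),
                pool.getKeepAliveTime(TimeUnit.SECONDS),
                pool.getQueue().getClass().getSimpleName());
    }

    public static void main(String[] args) {
        ThreadPoolExecutor cached = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        ThreadPoolExecutor fixed = (ThreadPoolExecutor) Executors.newFixedThreadPool(4);
        System.out.println("cached: " + describe(cached));
        System.out.println("fixed:  " + describe(fixed));
        cached.shutdown();
        fixed.shutdown();
    }
}
```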

Yes, they call the same constructor, just with slightly different arguments. So let's look at what these parameters mean and how the two sets of arguments differ. First, here is the constructor of ThreadPoolExecutor.

public ThreadPoolExecutor(int corePoolSize,
             int maximumPoolSize,
             long keepAliveTime,
             TimeUnit unit,
             BlockingQueue<Runnable> workQueue) {
  this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
     Executors.defaultThreadFactory(), defaultHandler);
}

To keep things uncluttered, I won't paste the next constructor in the chain, which does little more than assign fields. The signature here already tells us a lot. I have to say the naming in the JDK code is really good; it reads almost like comments.

maximumPoolSize is the maximum number of threads in the pool. For the cached ThreadPool this value is Integer.MAX_VALUE, which is effectively unlimited: no machine can run two billion threads. For the fixed ThreadPool this value is the number of threads requested by the user.
keepAliveTime and unit determine how long an idle thread stays cached. For the cached ThreadPool this is one minute: a worker thread that has had nothing to do for one minute is terminated to save resources. The fixed ThreadPool passes 0, but its workers never expire, because every worker is a core thread and core threads are not subject to the timeout by default.

corePoolSize is the number of threads the pool keeps alive even when they are idle, in effect its minimum size. For the cached ThreadPool this value is 0, because with no tasks at all the cached ThreadPool really does end up with no threads. For the fixed ThreadPool, the word "fixed" already indicates that corePoolSize equals the total number of threads.
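The effect of the keep-alive and core-size settings can be seen in a small experiment. This sketch builds one pool configured like the cached ThreadPool, but with a 100 ms keep-alive so the demo finishes quickly, and one configured like the fixed ThreadPool. KeepAliveDemo, threadsAfterIdle, and the timings are assumptions chosen for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: shows which idle threads are kept and which expire.
class KeepAliveDemo {

    // Runs one task, lets the pool sit idle for `idleMillis`, then returns
    // how many worker threads the pool still holds.
    static int threadsAfterIdle(ThreadPoolExecutor pool, long idleMillis) throws Exception {
        try {
            pool.submit(() -> { }).get(); // forces the pool to create one worker
            Thread.sleep(idleMillis);
            return pool.getPoolSize();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Cached-style pool, but with a 100 ms keep-alive instead of 60 s.
        ThreadPoolExecutor cachedLike = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 100L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        // Fixed-style pool: core size equals maximum size.
        ThreadPoolExecutor fixedLike = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        System.out.println(threadsAfterIdle(cachedLike, 1000));
        System.out.println(threadsAfterIdle(fixedLike, 1000));
    }
}
```

After one second of idling, the cached-style pool should have dropped back to 0 threads, while the fixed-style pool should still hold the 1 core thread it created for the task.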
Next, let's walk through how the cached ThreadPool works, using a simple usage example.

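import java.util.Date;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;

public class Task implements Callable<String> {

  private final String name;

  public Task(String name) {
    this.name = name;
  }

  // Runs on a pool worker thread; the return value is delivered through the Future.
  @Override
  public String call() throws Exception {
    System.out.printf("%s: Starting at : %s\n", this.name, new Date());
    return "hello, world";
  }

  public static void main(String[] args) {
    ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newCachedThreadPool();
    Task task = new Task("test");
    // submit() hands the task to the pool and returns immediately.
    Future<String> result = executor.submit(task);
    try {
      // get() blocks until call() has finished on the worker thread.
      System.out.printf("%s\n", result.get());
    } catch (InterruptedException | ExecutionException e) {
      e.printStackTrace();
    }
    // Without shutdown(), the idle worker would keep the JVM alive for up to 60 seconds.
    executor.shutdown();
    System.out.printf("Main ends at : %s\n", new Date());
  }
}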

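First, look at executor.submit(task). It wraps the task in a FutureTask and then calls ThreadPoolExecutor.execute(Runnable command), whose code is shown below. The logic runs as follows. First it checks whether the pool has fewer than corePoolSize threads; if so, it creates a new thread directly and hands command to it. If there are already enough threads, or the creation fails (for example, because several threads are adding workers concurrently), it tries to put command into the queue with workQueue.offer(command). If the queue does not accept it, it tries to start an additional thread, up to maximumPoolSize, and only if that also fails is command rejected. That is the overall logic. The code contains a lot of design for thread safety, which I will skip here to stay on topic.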

public void execute(Runnable command) {
  if (command == null)
    throw new NullPointerException();
  int c = ctl.get();
  if (workerCountOf(c) < corePoolSize) {
    if (addWorker(command, true))
      return;
    c = ctl.get();
  }
  if (isRunning(c) && workQueue.offer(command)) {
    int recheck = ctl.get();
    if (! isRunning(recheck) && remove(command))
      reject(command);
    else if (workerCountOf(recheck) == 0)
      addWorker(null, false);
  }
  else if (!addWorker(command, false))
    reject(command);
}
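The order of these decisions can be observed from the outside with a deliberately tiny pool. In this sketch (ExecuteFlowDemo, trace, and the pool dimensions are assumptions for illustration), the pool has one core thread, at most two threads, and a queue with room for one task. Every submitted task blocks, so each new submission has to take the next branch of execute().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: makes the branches of execute() visible.
class ExecuteFlowDemo {

    // Feeds blocking tasks to a tiny pool (1 core thread, 2 threads at most,
    // queue capacity 1) and records what happened to each submission.
    static List<String> trace(int submissions) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        List<String> log = new ArrayList<>();
        try {
            for (int i = 1; i <= submissions; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // keep the worker busy
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                    log.add(String.format("task%d: threads=%d queued=%d",
                            i, pool.getPoolSize(), pool.getQueue().size()));
                } catch (RejectedExecutionException e) {
                    // The default handler throws when neither queue nor pool has room.
                    log.add(String.format("task%d: rejected", i));
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        trace(4).forEach(System.out::println);
    }
}
```

With four submissions, the first task should start the core thread, the second should land in the queue, the third should start a second thread because the queue is full, and the fourth should be rejected.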

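At this point, the overall design of the thread pool does not look all that complicated. But one question remains: an ordinary Thread exits automatically once its run method returns. So how does the pool keep its Worker threads working, and keep them alive even when there are no tasks? The answer is simple: a Worker runs a big loop. The method a Worker actually runs is as follows: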

final void runWorker(Worker w) {
  Thread wt = Thread.currentThread();
  Runnable task = w.firstTask;
  w.firstTask = null;
  w.unlock(); // allow interrupts
  boolean completedAbruptly = true;
  try {
    while (task != null || (task = getTask()) != null) {
      w.lock();
  /***/
      try {
        beforeExecute(wt, task);
        Throwable thrown = null;
        try {
          task.run();
        /***/
        } finally {
          afterExecute(task, thrown);
        }
      } finally {
        task = null;
        w.completedTasks++;
        w.unlock();
      }
    }
    completedAbruptly = false;
  } finally {
    processWorkerExit(w, completedAbruptly);
  }
}

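The key is the condition of this while loop. When there is no work, getTask() blocks waiting on the queue. For a thread that is subject to the keep-alive timeout, getTask() returns null once that time expires, and the Worker then exits, ending its term of service. When there is a task, the Worker takes it and calls task.run() to execute it, and the result reaches the client thread through the Future (that is, future.get() returns). That completes one simple pass through the thread pool.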
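The same idea can be reproduced in a few lines. The following toy worker is a simplified sketch, not the JDK implementation: it loops on a timed poll() of a BlockingQueue, runs whatever it receives, and exits when nothing arrives within the keep-alive time. A FutureTask carries the result back to the caller. MiniWorkerDemo and its method names are invented for this example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: a hand-rolled worker loop, far simpler than runWorker().
class MiniWorkerDemo {

    // A toy worker: keeps taking tasks until none arrives within keepAliveMillis.
    static Thread startWorker(BlockingQueue<Runnable> queue, long keepAliveMillis) {
        Thread worker = new Thread(() -> {
            try {
                Runnable task;
                // poll() returns null on timeout, which ends the loop and the thread.
                while ((task = queue.poll(keepAliveMillis, TimeUnit.MILLISECONDS)) != null) {
                    task.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "mini-worker");
        worker.start();
        return worker;
    }

    // Hands one task to the worker and returns its result via a FutureTask.
    static String runOne() throws Exception {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        Thread worker = startWorker(queue, 200);
        FutureTask<String> future = new FutureTask<>(() -> "hello, world");
        queue.put(future);            // a FutureTask is itself a Runnable
        String result = future.get(); // blocks until the worker has run the task
        worker.join();                // the idle worker exits after the timeout
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runOne());
    }
}
```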

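Of course, I have not analyzed much of the essence of the thread pool, namely its thread-safety design. If you are interested, you can analyze it yourself or discuss it with me. I also have not analyzed the scheduled ThreadPool here. Its key point is scheduling, which is driven mainly by a min-heap ordered by time.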
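To give a feel for the time-ordered min-heap idea, here is a simplified sketch that uses java.util.PriorityQueue in place of the JDK's internal queue. The Timed class, the task names, and the trigger times are all made up for the example, and no real waiting takes place. It only shows that the heap always releases the task with the earliest trigger time first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical demo class: orders pending tasks by their trigger time.
class TimeHeapDemo {

    // A pending task: a name plus the time (ms from now) at which it should run.
    static final class Timed {
        final String name;
        final long triggerAtMillis;

        Timed(String name, long triggerAtMillis) {
            this.name = name;
            this.triggerAtMillis = triggerAtMillis;
        }
    }

    // Returns task names in the order a time-ordered min-heap would release them.
    static List<String> releaseOrder(List<Timed> pending) {
        PriorityQueue<Timed> heap =
                new PriorityQueue<>(Comparator.comparingLong((Timed t) -> t.triggerAtMillis));
        heap.addAll(pending);
        List<String> order = new ArrayList<>();
        while (!heap.isEmpty()) {
            order.add(heap.poll().name); // poll() always removes the earliest task
        }
        return order;
    }

    public static void main(String[] args) {
        List<Timed> pending = new ArrayList<>();
        pending.add(new Timed("report", 500));
        pending.add(new Timed("heartbeat", 100));
        pending.add(new Timed("cleanup", 300));
        System.out.println(releaseOrder(pending)); // [heartbeat, cleanup, report]
    }
}
```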
