This article shows how to use Node.js "multi-threading" to handle high-concurrency tasks.
Moore's Law was proposed by Intel co-founder Gordon Moore in 1965. In its commonly quoted form, it says that the number of components that fit on an integrated circuit doubles every 18 to 24 months, and that performance doubles along with it. In other words, processor (CPU) performance doubles roughly every two years.
More than 50 years have passed since Moore’s Law was proposed. Today, as chip components get closer to the scale of a single atom, it becomes increasingly difficult to keep up with Moore's Law.
In 2019, NVIDIA CEO Jensen Huang (Jen-Hsun Huang) said at the CES exhibition: "Moore's Law used to grow 10 times every 5 years and 100 times every 10 years. But today, Moore's Law can only grow by a few percentage points every year, maybe only 2 times every 10 years. Therefore, Moore's Law is over."
The performance of a single processor (CPU) is getting closer and closer to its limit. To break through this bottleneck, you need to make full use of multi-threading technology, which allows one or more CPUs to execute multiple threads at the same time and so complete tasks faster.
We all know that JavaScript is a single-threaded language. Node.js builds on this characteristic of JavaScript and uses an event-driven model to implement asynchronous I/O, and behind that asynchronous I/O is multi-thread scheduling.
For Node's implementation of asynchronous I/O, you can refer to Pu Ling's "In-depth introduction to Node.js".
In the Go language, you can explicitly start a concurrent task by creating a goroutine (a lightweight thread scheduled by the Go runtime), and limit how many operating system threads execute Go code simultaneously through the environment variable GOMAXPROCS.
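In Node, ordinary application code does not create threads explicitly for this kind of work (the worker_threads module exists, but it targets CPU-bound work and is not used in this article). Instead, Node provides asynchronous I/O APIs such as fs.readFile and http.request. Under the hood, this asynchronous I/O is handed off to libuv, which uses a thread pool (for file system operations, for example) or the operating system's non-blocking I/O (for network sockets), and the event-driven model then delivers the results back to your code.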
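To make the event-driven behaviour concrete, here is a minimal sketch that records the order of events around one fs.readFile call. The temporary file, the `events` array, and the `readDemoFile` helper are assumptions made for this illustration only.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Create a small temporary file to read (the location is arbitrary for this demo).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "async-io-demo-"));
const file = path.join(dir, "hello.txt");
fs.writeFileSync(file, "hello");

// Records the order in which things happen.
const events = [];

// Wrap the callback-style fs.readFile in a Promise.
function readDemoFile() {
  return new Promise((resolve, reject) => {
    events.push("readFile requested");
    fs.readFile(file, "utf8", (err, text) => {
      if (err) return reject(err);
      // Runs later, once the read has finished off the main thread.
      events.push("readFile callback");
      resolve(text);
    });
    // fs.readFile returns immediately, before the file has been read.
    events.push("readFile returned");
  });
}

const pending = readDemoFile();
events.push("main script continues");

pending.then((text) => {
  console.log(events.join(" -> "));
  console.log(`file content: ${text}`);
});
```

The main script keeps running while the read is in progress, and the callback only fires afterwards. That is what lets a single JavaScript thread keep many I/O operations in flight at once.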
Server-side development and tool development often call for multi-threaded processing: for example, handling complex crawler tasks, handling concurrent requests, or processing files in parallel.
When we use multi-threading, we must control the maximum number of tasks running at the same time. If the maximum concurrency is not controlled, we may run into errors caused by file descriptor exhaustion, network errors caused by insufficient bandwidth, errors caused by port restrictions, and so on.
Node has no API or environment variable for limiting how many asynchronous tasks run at once, so next we will implement this ourselves with a few simple lines of code.
Let's first assume the following scenario. I have a crawler that needs to crawl 100 Juejin articles every day. Crawling them one at a time is too slow, while crawling all 100 at once causes many requests to fail outright because there are too many network connections.
So we can request 10 articles at a time and finish in 10 rounds. This can make the job up to roughly 10 times faster while still keeping it stable.
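The "10 articles at a time, in 10 rounds" idea can be sketched in its simplest form as fixed batches. This is only an illustration: `chunk`, `fakeRequest`, and `runInBatches` are hypothetical names, and `fakeRequest` stands in for a real network call by using a timer.

```javascript
// Split a list into consecutive chunks of at most `size` items.
function chunk(list, size) {
  const chunks = [];
  for (let i = 0; i < list.length; i += size) {
    chunks.push(list.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical stand-in for a network request; resolves after a short delay.
function fakeRequest(id) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`article-${id}`), 5);
  });
}

// Run the requests batch by batch: a batch starts only after
// every request in the previous batch has finished.
async function runInBatches(ids, batchSize) {
  const results = [];
  for (const batch of chunk(ids, batchSize)) {
    const replies = await Promise.all(batch.map((id) => fakeRequest(id)));
    results.push(...replies);
  }
  return results;
}

const ids = Array.from({ length: 100 }, (_, i) => i);
const batchRun = runInBatches(ids, 10);
batchRun.then((results) => console.log(`fetched ${results.length} articles`));
```

One drawback of fixed batches is that each round waits for its slowest request before the next round starts. The concurrentRun method in this article avoids that by starting a new task as soon as an earlier one finishes.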
Let’s take a look at a single request task. The code is implemented as follows:
    const axios = require("axios");

    async function singleRequest(article_id) {
      // Here we simply use the axios library to send the request
      const reply = await axios.post(
        "https://api.juejin.cn/content_api/v1/article/detail",
        {
          article_id,
        }
      );

      return reply.data;
    }
To keep the demonstration simple, we request the same address 100 times. Let's create 100 request tasks. The code is as follows:
    // List of request tasks
    const requestFnList = new Array(100)
      .fill("6909002738705629198")
      .map((id) => () => singleRequest(id));
Next, let's implement the concurrent request method. It runs multiple asynchronous tasks at the same time and limits the maximum concurrency. When a task in the pool finishes, a new asynchronous task is started in its place, which keeps the pool busy. The code is as follows:
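    const chalk = require("chalk"); // require() works with chalk 4.x; chalk 5 is ESM-only
    const { log } = require("console");

    /**
     * Run multiple asynchronous tasks with a concurrency limit
     * @param {Array<Function>} fnList list of tasks, each a function returning a Promise
     * @param {number} max maximum concurrency
     * @param {string} taskName task name used in the log output
     */
    async function concurrentRun(fnList = [], max = 5, taskName = "unnamed") {
      if (!fnList.length) return [];

      log(chalk.blue(`Starting asynchronous tasks, max concurrency: ${max}`));
      const replyList = []; // collects the task results
      const count = fnList.length; // total number of tasks
      const startTime = new Date().getTime(); // task start time

      let current = 0; // number of finished tasks
      // Task runner: runs task `index`, then continues with task `index + max`.
      // A plain async function is used (no `new Promise(async ...)` wrapper),
      // so a failing task rejects concurrentRun instead of hanging forever.
      const schedule = async (index) => {
        const fn = fnList[index];
        if (!fn) return;

        // Run the current asynchronous task
        const reply = await fn();
        replyList[index] = reply;
        log(`${taskName} progress ${((++current / count) * 100).toFixed(2)}% `);

        // After the current task, continue with the remaining tasks in the pool
        await schedule(index + max);
      };

      // Task pool: start `max` runners
      const scheduleList = new Array(max)
        .fill(0)
        .map((_, index) => schedule(index));
      // Use Promise.all to wait for all runners
      await Promise.all(scheduleList);

      const cost = (new Date().getTime() - startTime) / 1000;
      log(chalk.green(`Done, max concurrency: ${max}, elapsed: ${cost}s`));
      return replyList;
    }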
As the code above shows, the key to concurrent requests in Node is Promise.all, which lets us start several asynchronous tasks together and wait for all of them to finish.
In the code above, an array of length max is created, and each slot holds one running chain of asynchronous tasks. Promise.all then waits on all of these chains at the same time. When a single asynchronous task completes, its chain immediately takes the next task (the one max positions further down the list) and continues, so the pool stays as full as possible.
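The scheduling just described can be checked with a small dependency-free sketch that counts how many tasks overlap. It is a simplified variant and not the article's exact code: `runLimited`, `makeTask`, `running`, and `peak` are names invented for this illustration, and timers stand in for real requests.

```javascript
// Simplified, dependency-free variant of the scheduling idea.
async function runLimited(fnList, max) {
  const results = [];
  // Each "lane" runs task index, index + max, index + 2 * max, ...
  const lane = async (index) => {
    for (let i = index; i < fnList.length; i += max) {
      results[i] = await fnList[i]();
    }
  };
  // Start `max` lanes and wait for all of them.
  await Promise.all(Array.from({ length: max }, (_, index) => lane(index)));
  return results;
}

let running = 0; // tasks currently in flight
let peak = 0; // highest value `running` ever reached

// Hypothetical task: takes 5 to 15 ms and tracks how many tasks overlap.
const makeTask = (n) => async () => {
  running += 1;
  peak = Math.max(peak, running);
  await new Promise((resolve) => setTimeout(resolve, 5 + (n % 3) * 5));
  running -= 1;
  return n * 2;
};

const tasks = Array.from({ length: 20 }, (_, n) => makeTask(n));
const limitedRun = runLimited(tasks, 4).then((results) => {
  console.log(`peak concurrency: ${peak}`);
  return results;
});
```

With 20 tasks and a limit of 4, the counter never rises above 4, and the results stay in the same order as the task list because each result is stored at its task's index.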
Next, we test it with the following code:
    (async () => {
      const requestFnList = new Array(100)
        .fill("6909002738705629198")
        .map((id) => () => singleRequest(id));

      const reply = await concurrentRun(requestFnList, 10, "Fetch Juejin articles");
    })();
The final execution result is as shown below:
At this point, our concurrent request is complete! Next, let's test the speed at different concurrency levels. The first is a concurrency of 1, that is, no concurrency (as shown below)
It takes 11.462 seconds! Without concurrency, the task takes a very long time. Next, let's see how long it takes at other concurrency levels (as shown below)
As you can see from the figure above, the task runs faster and faster as the concurrency increases! This is the advantage of high concurrency: in some cases it can improve efficiency several times over, or even dozens of times!
If we look more closely at the timings above, we will find that the elapsed time bottoms out at a threshold as the concurrency increases, and does not keep shrinking in proportion. This is because Node does not open a thread for each task; the JavaScript itself still runs on a single thread, and only the asynchronous I/O is handled off that thread. Therefore, Node is well suited to I/O-intensive tasks, but not to CPU-intensive (computing) tasks.
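The difference between I/O-intensive and CPU-intensive work can be seen in a rough timing sketch. The task functions here are assumptions for illustration: `ioTask` simulates waiting on I/O with a timer, and `cpuTask` is a busy loop that occupies the JavaScript thread.

```javascript
// Simulated I/O task: the 50 ms wait happens off the JavaScript thread (a timer here).
const ioTask = () => new Promise((resolve) => setTimeout(resolve, 50));

// CPU task: a busy loop that occupies the JavaScript thread for about 50 ms.
const cpuTask = async () => {
  const end = Date.now() + 50;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread meanwhile
  }
};

// Start `n` tasks together and measure the total elapsed time in milliseconds.
async function measure(task, n) {
  const start = Date.now();
  await Promise.all(Array.from({ length: n }, () => task()));
  return Date.now() - start;
}

const timings = (async () => {
  const io = await measure(ioTask, 5);
  const cpu = await measure(cpuTask, 5);
  console.log(`5 I/O tasks: ${io} ms, 5 CPU tasks: ${cpu} ms`);
  return { io, cpu };
})();
```

The five simulated I/O tasks overlap and finish in roughly the time of one, while the five CPU tasks run one after another and take about five times as long, even though both groups were started through Promise.all.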
This concludes our introduction to using Node "multi-threading" to handle high-concurrency tasks. To make the program more robust, you also need to consider task timeouts and fault tolerance. If you are interested, you can implement these yourself.
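As one possible starting point for timeouts and fault tolerance, here is a minimal sketch, not a complete solution. `withTimeout`, `tolerant`, and the demo tasks are hypothetical helpers introduced for this example.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Wrap a task so that it never rejects: failures become { ok: false, error }.
function tolerant(fn, ms) {
  return async () => {
    try {
      return { ok: true, value: await withTimeout(fn(), ms) };
    } catch (error) {
      return { ok: false, error: error.message };
    }
  };
}

// Hypothetical tasks: one succeeds, one fails, one takes too long.
const sleep = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));
const demoTasks = [
  () => sleep(5, "fast"),
  async () => {
    throw new Error("boom");
  },
  () => sleep(500, "slow"),
].map((fn) => tolerant(fn, 50));

const settled = Promise.all(demoTasks.map((fn) => fn()));
settled.then((results) => console.log(results));
```

Tasks wrapped this way could be passed to concurrentRun in place of the raw request functions, so one failed or slow request no longer aborts the whole run. Note that the timeout only stops waiting for the task; it does not cancel the underlying request.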