Let's talk about multi-process and multi-threading in Node.js
Node.js is widely known to be single-threaded, but it also ships modules for running work in multiple processes and threads, which can speed up certain kinds of tasks. This article walks through those mechanisms; I hope you find it helpful!
Node.js uses a single-threaded, event-driven, asynchronous I/O model. As a consequence, a single Node.js process cannot take advantage of multiple CPU cores, and it is not well suited to non-I/O work (such as executing scripts, AI computation, or image processing). To address this, Node.js provides conventional multi-process and multi-thread solutions (for a discussion of processes and threads, see the author's other article, Node.js and Concurrency Model). This article introduces those mechanisms.
child_process
We can use the child_process module to create Node.js child processes that carry out special tasks (such as executing scripts). The module mainly provides the exec, execFile, fork, and spawn methods, whose usage is briefly introduced below.
exec
const { exec } = require('child_process');

exec('ls -al', (error, stdout, stderr) => {
  console.log(stdout);
});
This method runs the command string through the executable specified by options.shell, buffers the output while the command is running, and, once the command has finished, passes the result to the callback function as arguments.
The parameters of this method are as follows:
- command: the command to execute (such as ls -al);
- options: settings (optional), with the following properties:
  - cwd: the current working directory of the child process; defaults to the value of process.cwd();
  - env: environment variables (an object of key-value pairs); defaults to process.env;
  - encoding: character encoding; defaults to utf8;
  - shell: the executable that processes the command string; defaults to /bin/sh on Unix and to the value of process.env.ComSpec on Windows (cmd.exe if that is empty). For example:
    const { exec } = require('child_process');
    exec("print('Hello World!')", { shell: 'python' }, (error, stdout, stderr) => {
      console.log(stdout);
    });
    Running the example above prints Hello World!, because the child process effectively executes python -c "print('Hello World!')". When using this property, note that the specified executable must support running statements through the -c option.
    Note: Node.js also supports a -c option, but it is equivalent to --check, which only checks the specified script for syntax errors and does not execute it.
  - signal: terminates the child process with the specified AbortSignal. This property is available since v14.17.0. For example:
    const { exec } = require('child_process');
    const ac = new AbortController();
    exec('ls -al', { signal: ac.signal }, (error, stdout, stderr) => {});
    In the example above, calling ac.abort() terminates the child process early.
  - timeout: the timeout of the child process, in milliseconds. If the value is greater than 0, the signal specified by the killSignal property is sent to the child process once it has run longer than the specified time; defaults to 0;
  - maxBuffer: the maximum amount of data (in bytes) allowed on stdout or stderr. If it is exceeded, the child process is killed and any output is truncated; defaults to 1024 * 1024;
  - killSignal: the signal used to terminate the child process; defaults to SIGTERM;
  - uid: the user identity (uid) under which the child process runs;
  - gid: the group identity (gid) under which the child process runs;
  - windowsHide: whether to hide the console window of the child process, which is typically created on Windows; defaults to false;
- callback: the callback function, which takes three parameters, error, stdout, and stderr:
  - error: null if the command ran successfully; otherwise an instance of Error, where error.code is the exit code of the child process and error.signal is the signal that terminated it;
  - stdout and stderr: the stdout and stderr of the child process, decoded according to the encoding property. If encoding is buffer, or is not a recognized character encoding, Buffer objects are passed instead.
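To make the callback parameters more concrete, here is a minimal sketch that wraps exec in a Promise and reports the exit code and signal. The helper name run and the shape of its result object are illustrative assumptions, not part of Node.js; the commands simply launch the current Node.js binary so that the sketch does not depend on ls being available.

```javascript
const { exec } = require('child_process');

// Hypothetical helper (not part of Node.js): wraps exec in a Promise and
// always resolves with a plain result object instead of rejecting.
function run(command, options = {}) {
  return new Promise((resolve) => {
    exec(command, options, (error, stdout, stderr) => {
      resolve({
        ok: error === null,
        // error.code holds the exit code; error.signal is set if the child was killed.
        code: error ? error.code : 0,
        signal: error ? error.signal : null,
        stdout,
        stderr,
      });
    });
  });
}

// Quote the path of the current Node.js binary so the shell treats it as one word.
const node = `"${process.execPath}"`;

async function main() {
  // A command that succeeds and writes to stdout.
  const success = await run(`${node} -e "console.log('hi')"`);
  console.log(success.ok, JSON.stringify(success.stdout));

  // A command that fails with exit code 3; timeout is given in milliseconds.
  const failure = await run(`${node} -e "process.exit(3)"`, { timeout: 5000 });
  console.log(failure.ok, failure.code);
}

main();
```

Resolving in both cases keeps the error details in one place; a real application might prefer to reject on failure instead.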
execFile
const { execFile } = require('child_process');

execFile('ls', ['-al'], (error, stdout, stderr) => {
  console.log(stdout);
});
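This method works much like exec. The only difference is that, by default, execFile runs the specified executable (the file parameter) directly instead of going through a shell, which makes it slightly more efficient than exec (although, looking at how the shell is handled, the author feels the difference is negligible).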
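The parameters of this method are as follows:
- file: the name or path of the executable;
- args: the list of arguments for the executable;
- options: settings (optional), with the following properties:
  - shell: when false, the specified executable (the file parameter) is run directly; when true or a string, it behaves like shell in exec; defaults to false;
  - windowsVerbatimArguments: when true, no quoting or escaping of arguments is done on Windows; ignored on Unix; defaults to false;
  - the properties cwd, env, encoding, timeout, maxBuffer, killSignal, uid, gid, windowsHide, and signal were introduced above and are not repeated here;
- callback: the callback function, identical to callback in exec and not described again here.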
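One practical consequence of skipping the shell is that arguments reach the child process literally. The sketch below illustrates this under a few assumptions: the helper name echoArgument is hypothetical, and the child is the current Node.js binary running a one-line script that prints its first extra argument.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: starts a Node.js child that prints its first extra
// argument, and resolves with exactly what the child received.
function echoArgument(value) {
  return new Promise((resolve, reject) => {
    execFile(
      process.execPath,
      ['-e', 'console.log(process.argv[1])', value],
      (error, stdout) => {
        if (error) {
          reject(error);
          return;
        }
        // Strip the single trailing newline added by console.log.
        resolve(stdout.replace(/\r?\n$/, ''));
      }
    );
  });
}

// With shell: false (the default), shell metacharacters are not interpreted.
const tricky = '$HOME; echo injected';
echoArgument(tricky).then((received) => {
  console.log(received === tricky);
});
```

With exec, the same string would be expanded and split by the shell, which is why execFile is usually the safer choice when arguments come from user input.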
fork
const { fork } = require('child_process');

const echo = fork('./echo.js', { silent: true });

echo.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});
echo.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});
echo.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});
This method creates a new Node.js instance to run the specified Node.js script, and communicates with the parent process over IPC.
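The parameters of this method are as follows:
- modulePath: the path of the Node.js script to run;
- args: the list of arguments passed to the Node.js script;
- options: settings (optional), with the following properties:
  - detached: see the description of options.detached under spawn below;
  - execPath: the executable used to create the child process;
  - execArgv: the list of string arguments passed to the executable; defaults to the value of process.execArgv;
  - serialization: the serialization type used for messages between processes; possible values are json and advanced; defaults to json;
  - silent: if true, the stdin, stdout, and stderr of the child process are piped to the parent process; otherwise they are inherited from the parent process; defaults to false;
  - stdio: see the description of options.stdio under spawn below. Note that:
    - if this property is specified, the value of silent is ignored;
    - it must contain exactly one entry with the value ipc (for example [0, 1, 2, 'ipc']), otherwise an exception is thrown;
  - the properties cwd, env, uid, gid, windowsVerbatimArguments, signal, timeout, and killSignal were introduced above and are not repeated here.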
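The IPC channel that fork sets up can be sketched as follows. Everything specific here is an assumption made for illustration: the helper name greet, the message shape, and the temporary file that holds the child script (written at runtime only so that the sketch fits in one file).

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a tiny child script to a temporary file so the sketch is self-contained.
const childPath = path.join(os.tmpdir(), `fork-echo-${process.pid}.js`);
fs.writeFileSync(childPath, `
  process.on('message', (msg) => {
    // Reply over the IPC channel, then close it so the child can exit.
    process.send({ reply: 'Hello ' + msg.name, pid: process.pid }, () => {
      process.disconnect();
    });
  });
`);

// Remove the temporary script when the parent process exits.
process.on('exit', () => {
  try { fs.unlinkSync(childPath); } catch {}
});

// Hypothetical helper: forks the child, sends one message, resolves with the reply.
function greet(name) {
  return new Promise((resolve, reject) => {
    const child = fork(childPath);
    child.once('message', resolve);
    child.once('error', reject);
    child.send({ name });
  });
}

greet('Tom').then((msg) => {
  // The reply comes from a different process than the parent.
  console.log(msg.reply, msg.pid !== process.pid);
});
```

Messages are serialized according to the serialization option (json by default), so only data that survives that serialization can be exchanged.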
spawn
const { spawn } = require('child_process');

const ls = spawn('ls', ['-al']);

ls.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});
ls.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});
ls.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});
This is the foundational method of the child_process module: exec, execFile, and fork all end up calling spawn to create the child process.
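The parameters of this method are as follows:
- command: the name or path of the executable;
- args: the list of arguments passed to the executable;
- options: settings (optional), with the following properties:
  - argv0: the value sent to the child process as argv[0]; defaults to the value of the command parameter;
  - detached: whether the child process may run independently of the parent process (that is, keep running after the parent exits); defaults to false. When it is true, the effect on each platform is as follows:
    - on Windows, the child process can keep running after the parent exits, and it has its own console window (once enabled for a child process, this cannot be disabled);
    - on non-Windows systems, the child process becomes the leader of a new process group and session; on these systems the child process can keep running after the parent exits whether or not it is detached.
    Note that if the child process needs to run a long task and you want the parent process to exit early, all of the following must hold:
    - the unref method of the child process is called, which removes the child process from the parent's event loop;
    - detached is set to true;
    - stdio is set to ignore.
    For example:
    // hello.js
    const fs = require('fs');
    let index = 0;
    function run() {
      setTimeout(() => {
        fs.writeFileSync('./hello', `index: ${index}`);
        if (index < 10) {
          index += 1;
          run();
        }
      }, 1000);
    }
    run();

    // main.js
    const { spawn } = require('child_process');
    const child = spawn('node', ['./hello.js'], { detached: true, stdio: 'ignore' });
    child.unref();
  - stdio: the standard input/output configuration of the child process; defaults to pipe. The value is a string or an array:
    - a string is converted to an array of three entries (for example, pipe becomes ['pipe', 'pipe', 'pipe']); possible values are pipe, overlapped, ignore, and inherit;
    - in an array, the first three entries configure stdin, stdout, and stderr respectively. Each entry may be pipe, overlapped, ignore, inherit, ipc, a Stream object, a positive integer (a file descriptor that is open in the parent process), null, or undefined; null and undefined are equivalent to pipe for the first three entries and to ignore for any others;
  - the properties cwd, env, uid, gid, serialization, shell (a boolean or a string), windowsVerbatimArguments, windowsHide, signal, timeout, and killSignal were introduced above and are not repeated here.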
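As an illustration of the array form of stdio, the following sketch pipes stdin and stdout between parent and child while letting the child share the parent's stderr. The helper name pipeThroughChild is hypothetical, and the child is simply the current Node.js binary echoing its input.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: feeds `input` to a child's stdin and collects its stdout.
function pipeThroughChild(input) {
  return new Promise((resolve, reject) => {
    const child = spawn(
      process.execPath,
      ['-e', 'process.stdin.pipe(process.stdout)'],
      // stdin and stdout are pipes to the parent; stderr is inherited from it.
      { stdio: ['pipe', 'pipe', 'inherit'] }
    );

    let output = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => {
      output += chunk;
    });
    child.on('error', reject);
    // 'close' fires after the process has exited and its stdio streams are closed.
    child.on('close', (code) => resolve({ code, output }));

    // Write the input and close stdin so the child can finish.
    child.stdin.end(input);
  });
}

pipeThroughChild('hello from the parent\n').then(({ code, output }) => {
  console.log(`exit code ${code}, output: ${output.trim()}`);
});
```

Because stderr is set to inherit, child.stderr is null in the parent; only the entries configured as pipe are exposed as streams.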
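Recap
The sections above briefly introduced the main methods of the child_process module. execSync, execFileSync, and spawnSync are the synchronous versions of exec, execFile, and spawn (fork has no synchronous version). They take essentially the same options and return the result directly instead of passing it to a callback, so they are not covered again here.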
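For completeness, here is a small sketch of how the synchronous variants hand back their results. It assumes nothing beyond the current Node.js binary, which is used as the child so that the commands behave the same on every platform.

```javascript
const { spawnSync, execFileSync } = require('child_process');

// spawnSync blocks until the child exits and returns a result object.
const result = spawnSync(process.execPath, ['-e', 'console.log(1 + 1)'], {
  encoding: 'utf8',
});
console.log(result.status, result.stdout.trim());

// execFileSync returns stdout directly and throws if the exit code is non-zero;
// the thrown error carries the exit code in its status property.
let failure = null;
try {
  execFileSync(process.execPath, ['-e', 'process.exit(2)'], { stdio: 'pipe' });
} catch (error) {
  failure = error;
}
console.log(failure.status);
```

These calls block the event loop while the child runs, so they are best kept to scripts and startup code rather than request handlers.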
cluster
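With the cluster module we can create a cluster of Node.js processes. A cluster lets us make fuller use of multiple cores by distributing work across processes, which improves the program's throughput. The following example shows how to use the cluster module: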
const http = require('http');
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;

if (cluster.isPrimary) {
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`${process.pid}\n`);
  }).listen(8000);
}
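The example above branches on the cluster.isPrimary property (that is, on whether the current process is the primary process):
- when it is true, the code calls cluster.fork once per CPU core to create the corresponding number of child processes;
- when it is false, the code creates an HTTP server, and every HTTP server listens on the same port (8000 here).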
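Run the example above and visit http://localhost:8000/ in a browser. The pid returned varies from one visit to the next, which shows that requests really are distributed across the child processes. The default load-balancing policy in Node.js is round-robin (on all platforms except Windows); it can be changed through the NODE_CLUSTER_SCHED_POLICY environment variable or the cluster.schedulingPolicy property: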
NODE_CLUSTER_SCHED_POLICY=rr // or none
cluster.schedulingPolicy = cluster.SCHED_RR; // or cluster.SCHED_NONE
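Also note that although every child process creates an HTTP server and all of them listen on the same port, the child processes do not compete freely for incoming requests, because that could not guarantee a balanced load across them. Instead, under the round-robin policy, the primary process listens on the port and forwards each request to a particular child process according to the distribution policy.
Because processes are isolated from one another, they generally communicate through mechanisms such as shared memory, message passing, or pipes. Node.js uses message passing for communication between parent and child processes, as in the following example: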
const http = require('http');
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;

if (cluster.isPrimary) {
  for (let i = 0; i < numCPUs; i++) {
    const worker = cluster.fork();
    worker.on('message', (message) => {
      console.log(`I am primary(${process.pid}), I got message from worker: "${message}"`);
      worker.send(`Send message to worker`);
    });
  }
} else {
  process.on('message', (message) => {
    console.log(`I am worker(${process.pid}), I got message from primary: "${message}"`);
  });
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`${process.pid}\n`);
    process.send('Send message to primary');
  }).listen(8000);
}
Run the example above, visit http://localhost:8000/, and then check the terminal. The output looks similar to the following:
I am primary(44460), I got message from worker: "Send message to primary"
I am worker(44461), I got message from primary: "Send message to worker"
I am primary(44460), I got message from worker: "Send message to primary"
I am worker(44462), I got message from primary: "Send message to worker"
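With this mechanism we can monitor the state of each child process, so that when one of them fails unexpectedly we can step in promptly and keep the service available.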
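One way to act on such monitoring is to replace a worker that exits unexpectedly. The sketch below is only an illustration under several assumptions: it requires Node.js 16 or later (for cluster.setupPrimary), the worker script is written to a temporary file so that the example fits in one file, SHOULD_CRASH is a made-up environment variable that simulates a failure, and the supervise helper and its restart limit are hypothetical.

```javascript
const cluster = require('cluster');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Worker script written to a temporary file so the sketch is self-contained.
const workerPath = path.join(os.tmpdir(), `cluster-worker-${process.pid}.js`);
fs.writeFileSync(workerPath, `
  if (process.env.SHOULD_CRASH === '1') process.exit(1); // simulated failure
  process.send('ready');
  process.on('message', () => {}); // stay alive until the primary disconnects us
`);
process.on('exit', () => {
  try { fs.unlinkSync(workerPath); } catch {}
});

// Hypothetical supervisor: restarts crashed workers up to maxRestarts times and
// resolves once a worker has been shut down on purpose (or the limit is hit).
function supervise(maxRestarts) {
  return new Promise((resolve) => {
    let restarts = 0;
    cluster.setupPrimary({ exec: workerPath });

    cluster.on('exit', (worker, code, signal) => {
      // exitedAfterDisconnect is true when the primary asked the worker to stop.
      if (worker.exitedAfterDisconnect || restarts >= maxRestarts) {
        resolve({ restarts });
        return;
      }
      restarts += 1;
      cluster.fork(); // replace the crashed worker with a healthy one
    });

    cluster.on('message', (worker, message) => {
      // End the demo once a healthy worker reports in.
      if (message === 'ready') worker.disconnect();
    });

    cluster.fork({ SHOULD_CRASH: '1' }); // the first worker crashes on startup
  });
}

const supervised = supervise(3);
supervised.then(({ restarts }) => {
  console.log(`workers restarted: ${restarts}`);
});
```

A production supervisor would usually add a delay or back-off between restarts so that a worker that crashes on startup does not trigger a tight restart loop.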
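The interface of the cluster module is very simple. To save space, only a few notes on the cluster.setupPrimary method are given here; for the other methods, see the official documentation:
- after cluster.setupPrimary is called, the settings are stored in the cluster.settings property, and each call builds on the current value of cluster.settings;
- calling cluster.setupPrimary has no effect on child processes that are already running; it only affects subsequent cluster.fork calls;
- calling cluster.setupPrimary does not affect the env argument passed to subsequent cluster.fork calls;
- cluster.setupPrimary can only be used in the primary process.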
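The first note can be checked with a short sketch. It assumes Node.js 16 or later (for cluster.setupPrimary), and the argument values are purely illustrative; no worker is actually forked.

```javascript
const cluster = require('cluster');

// First call: set the script arguments for future workers.
cluster.setupPrimary({ args: ['--use', 'https'] });

// Second call: only changes `silent`; `args` from the first call is kept
// because each call builds on the current cluster.settings.
cluster.setupPrimary({ silent: true });

console.log(cluster.settings.args, cluster.settings.silent);
```

Only workers created by later cluster.fork calls would pick up these settings.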
worker_threads
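The previous section introduced the cluster module, which lets us create a cluster of Node.js processes to improve throughput. However, cluster is based on a multi-process model: because switching between processes is expensive and resources are isolated per process, system resources can easily become strained as the number of child processes grows, leaving the service unable to respond. To address this, Node.js provides worker_threads. The following example gives a brief introduction to the module: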
// server.js
const http = require('http');
const { Worker } = require('worker_threads');

http.createServer((req, res) => {
  const httpWorker = new Worker('./http_worker.js');
  httpWorker.on('message', (result) => {
    res.writeHead(200);
    res.end(`${result}\n`);
    // Stop the worker once it has replied so that threads do not pile up per request.
    httpWorker.terminate();
  });
  httpWorker.postMessage('Tom');
}).listen(8000);

// http_worker.js
const { parentPort } = require('worker_threads');

parentPort.on('message', (name) => {
  parentPort.postMessage(`Welcome ${name}!`);
});
The example above shows basic usage of worker_threads. When using worker_threads, keep the following points in mind:
- A Worker instance is created through worker_threads.Worker. The Worker script can be either a standalone JavaScript file or a string; for example, the code above could be changed to:
  const code = "const { parentPort } = require('worker_threads'); parentPort.on('message', (name) => { parentPort.postMessage(`Welcome ${name}!`); })";
  const httpWorker = new Worker(code, { eval: true });
- When creating a Worker instance through worker_threads.Worker, the initial data of the Worker thread can be set through workerData, for example:
  // server.js
  const { Worker } = require('worker_threads');
  const httpWorker = new Worker('./http_worker.js', { workerData: { name: 'Tom' } });

  // http_worker.js
  const { workerData } = require('worker_threads');
  console.log(workerData);
- When creating a Worker instance through worker_threads.Worker, SHARE_ENV can be set so that the Worker thread and the main thread share environment variables, for example:
  const { Worker, SHARE_ENV } = require('worker_threads');
  const worker = new Worker('process.env.SET_IN_WORKER = "foo"', { eval: true, env: SHARE_ENV });
  worker.on('exit', () => {
    console.log(process.env.SET_IN_WORKER);
  });
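- Unlike the inter-process communication mechanism in cluster, worker_threads uses MessageChannel for communication between threads:
  - a Worker thread sends messages to the main thread with the parentPort.postMessage method and handles messages from the main thread by listening for the message event on parentPort;
  - the main thread sends messages to a Worker thread with the postMessage method of the Worker instance (httpWorker here, which stands for the Worker thread from now on) and handles messages from the Worker thread by listening for the message event on httpWorker.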
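Besides the built-in channel between a Worker and its parent, a MessageChannel can also be created explicitly and one of its ports handed to the Worker. The following is a minimal sketch of that idea; the helper name welcome and the message shapes are assumptions made for illustration, and the Worker source is passed as a string only to keep the example in one file.

```javascript
const { Worker, MessageChannel } = require('worker_threads');

// Worker source passed as a string (eval: true) to keep the sketch in one file.
const workerSource = `
  const { parentPort } = require('worker_threads');
  // The first message on parentPort carries a dedicated port for this conversation.
  parentPort.once('message', ({ port }) => {
    port.on('message', (name) => {
      port.postMessage('Welcome ' + name + '!');
    });
  });
`;

// Hypothetical helper: starts a worker, talks to it over a private channel,
// and resolves with the worker's reply.
function welcome(name) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    const { port1, port2 } = new MessageChannel();

    worker.once('error', reject);
    port1.once('message', (reply) => {
      port1.close();
      worker.terminate(); // the conversation is over, so stop the thread
      resolve(reply);
    });

    // port2 is moved (not copied) to the worker through the transfer list.
    worker.postMessage({ port: port2 }, [port2]);
    port1.postMessage(name);
  });
}

welcome('Tom').then((reply) => console.log(reply));
```

A dedicated channel like this is mainly useful when several independent conversations share one Worker, since each gets its own pair of ports.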
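In Node.js, both the child processes created by cluster and the Worker threads created by worker_threads have their own V8 instance and event loop. The differences are:
- the memory spaces of child processes are isolated from one another, whereas Worker threads live in the memory space of the process they belong to and can share memory explicitly (for example, through a SharedArrayBuffer);
- switching between child processes costs far more than switching between Worker threads.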
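The memory-sharing difference can be illustrated with a SharedArrayBuffer, which is shared rather than copied when passed through workerData. This is a sketch under stated assumptions: the helper name countInWorkers and the counter layout are made up for the example.

```javascript
const { Worker } = require('worker_threads');

// Each worker increments the same shared 32-bit counter.
const workerSource = `
  const { workerData } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic, so concurrent workers do not lose updates
  }
`;

// Hypothetical helper: runs several workers against one shared buffer and
// resolves with the final counter value once all of them have exited.
function countInWorkers(workerCount, increments) {
  const shared = new SharedArrayBuffer(4); // room for one 32-bit integer
  const counter = new Int32Array(shared);

  const jobs = Array.from({ length: workerCount }, () =>
    new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, increments },
      });
      worker.once('error', reject);
      worker.once('exit', resolve);
    })
  );

  return Promise.all(jobs).then(() => Atomics.load(counter, 0));
}

countInWorkers(2, 1000).then((total) => console.log(total));
```

Ordinary objects in workerData are still cloned for each Worker; only the SharedArrayBuffer's memory is common to all threads, which is why Atomics is used for the updates.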
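Although Worker threads look more efficient than child processes, they have a drawback: cluster provides load balancing out of the box, whereas with worker_threads we have to design and implement load balancing ourselves.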
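As a rough idea of what such a hand-written load balancer could look like, here is a minimal round-robin pool. It is a sketch, not a production design: the class name RoundRobinPool, the task message shape, and the demo function are all assumptions, and error handling and back-pressure are left out.

```javascript
const { Worker } = require('worker_threads');

// Worker source passed as a string (eval: true) to keep the sketch in one file.
const workerSource = `
  const { parentPort, threadId } = require('worker_threads');
  parentPort.on('message', ({ id, name }) => {
    parentPort.postMessage({ id, threadId, text: 'Welcome ' + name + '!' });
  });
`;

// Hypothetical pool: a fixed set of workers that receive tasks in turn.
class RoundRobinPool {
  constructor(size) {
    this.workers = [];
    this.pending = new Map(); // task id -> resolve function
    this.nextId = 0;
    this.nextWorker = 0;
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      worker.on('message', (result) => {
        const resolve = this.pending.get(result.id);
        this.pending.delete(result.id);
        resolve(result);
      });
      this.workers.push(worker);
    }
  }

  // Send a task to the next worker in line and resolve with its result.
  run(name) {
    const worker = this.workers[this.nextWorker];
    this.nextWorker = (this.nextWorker + 1) % this.workers.length;
    return new Promise((resolve) => {
      const id = this.nextId++;
      this.pending.set(id, resolve);
      worker.postMessage({ id, name });
    });
  }

  // Stop all workers so the process can exit.
  close() {
    return Promise.all(this.workers.map((worker) => worker.terminate()));
  }
}

async function demo() {
  const pool = new RoundRobinPool(2);
  const names = ['Tom', 'Ann', 'Bob', 'Eve'];
  const results = await Promise.all(names.map((name) => pool.run(name)));
  await pool.close();
  return results;
}

demo().then((results) => {
  for (const { threadId, text } of results) {
    console.log(`thread ${threadId}: ${text}`);
  }
});
```

Reusing a fixed set of workers avoids paying the thread start-up cost on every request; a more elaborate pool might track how busy each worker is instead of simply taking turns.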
Summary
This article introduced three Node.js modules: child_process, cluster, and worker_threads. With them we can take full advantage of multi-core CPUs and use multiple processes or threads to handle special tasks (such as AI computation or image processing) efficiently. Each module has its own applicable scenarios. This article only covers basic usage; how to apply them effectively to your own problems is left for you to explore. Finally, if you find any mistakes in this article, corrections are welcome. Happy coding!