ParallelJS: Elegant Web Worker Solution
ParallelJS provides an elegant solution to the problems that can arise when using Web Workers, offering a practical API with convenient abstractions and helper tools. The Worker interface introduced with HTML5 allows the creation of functions with long run times and high computational demands, and several workers can run simultaneously to keep a website responsive. ParallelJS allows parallelization of JavaScript code, leveraging simultaneous multithreading (SMT) to use modern CPUs more efficiently. The ParallelJS library provides methods such as spawn, map and reduce, which are used to perform calculations, process data, and aggregate partial results in parallel.
One of the coolest new possibilities brought by HTML5 is the Worker interface of the Web Workers API. Before that, we had to resort to tricks to present users with a responsive website. The Worker interface allows us to create functions with long run times and high computational demands. Additionally, Worker instances can run simultaneously, which lets us spawn as many of these workers as we need. In this article, I will discuss why multithreading is important and how to implement it in JavaScript using ParallelJS.
Why do you need multi-threading?
This is a reasonable question. Historically, the ability to spawn threads provided an elegant way to divide up the work within a process. The operating system is responsible for scheduling the time available to each thread, so that threads with higher priority and more work are preferred over low-priority idle threads. Over the past few years, simultaneous multithreading (SMT) has become essential for accessing the computing power of modern CPUs. The reason is simple: Moore's law is still valid regarding the number of transistors per unit area. However, frequency scaling had to stop for a variety of reasons, so the available transistors had to be used in other ways. It was decided that architectural improvements (such as SIMD) and multiple cores were the best choice.
To use SMT, we need to write parallel code, that is, code that runs in parallel to obtain a single result. We usually need to consider special algorithms, because most sequential code is either difficult to parallelize or parallelizes very inefficiently. The reason is Amdahl's law, which states that the speedup S is given by the following formula:
$$S(N) = \frac{1}{(1 - P) + \frac{P}{N}}$$
where N is the number of parallel workers (such as processors, cores, or threads) and P is the parallel fraction of the program. In the future, many-core architectures that rely even more on parallel algorithms may be used. In the field of high-performance computing, GPU systems and special architectures such as Intel Xeon Phi represent such platforms. Finally, we should distinguish between general concurrent applications or algorithms and parallel execution. Parallelism is the simultaneous execution of (possibly related) computations. Concurrency, in contrast, is the composition of independently executing processes.
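To get a feel for Amdahl's law, the following sketch evaluates the standard form of the speedup, S = 1 / ((1 - P) + P / N), for a few worker counts. The function name amdahlSpeedup is purely illustrative and not part of any library.

```javascript
// Speedup predicted by Amdahl's law for a parallel fraction p (between 0 and 1)
// running on n workers. The name amdahlSpeedup is illustrative only.
function amdahlSpeedup(p, n) {
  if (p < 0 || p > 1) {
    throw new RangeError('p must be between 0 and 1');
  }
  if (n < 1) {
    throw new RangeError('n must be at least 1');
  }
  return 1 / ((1 - p) + p / n);
}

// With 90% parallel code the speedup can never exceed 1 / (1 - 0.9) = 10,
// no matter how many workers are added.
[1, 2, 4, 8, 1024].forEach(function (n) {
  console.log(n + ' workers: ' + amdahlSpeedup(0.9, n).toFixed(2) + 'x');
});
```

The sequential part (1 - P) quickly dominates, which is why adding workers gives diminishing returns unless the algorithm itself is restructured.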
Multi-threading in JavaScript
In JavaScript, we already know how to write concurrent programs, namely by using callback functions. This knowledge can now be transferred to writing parallel programs! By design, JavaScript is executed in a single thread mediated by an event loop (usually following the reactor pattern). This gives us, for example, a good abstraction for handling asynchronous requests to (external) resources. It also guarantees that previously defined callbacks are always fired in the same execution thread. There are no cross-thread exceptions, race conditions, or other thread-related problems. However, this doesn't bring us any closer to SMT in JavaScript. With the introduction of the Worker interface, an elegant solution has been found. From the main application's point of view, the code in a Web Worker should be treated as a task that runs concurrently. Communication also works in this way. We use the messaging API, which is also available for communication from a contained website to its hosting page. For example, the following code responds to an incoming message by sending a message back to the originator.
    // Reply to every incoming message, addressing the sender's origin
    window.addEventListener('message', function (event) {
        event.source.postMessage('Howdy Cowboy!', event.origin);
    }, false);
In theory, a Web Worker can also spawn another Web Worker. In practice, however, most browsers forbid this. Therefore, the only way for Web Workers to communicate with each other is through the main application. Communication through messages is concurrent, so it is always asynchronous (non-blocking). This may feel strange to program at first, but it brings many advantages. Most importantly, our code should be free of race conditions! Let's look at a simple example that computes a sequence of prime numbers in the background, using two parameters to denote the start and end of the sequence. First, we create a file called prime.js with the following content:
    // prime.js: posts every prime in [start, end) back to the main application
    onmessage = function (event) {
        var args = JSON.parse(event.data);
        run(args.start, args.end);
    };

    function run (start, end) {
        var n = Math.max(start, 2); // 0 and 1 are not prime

        while (n < end) {
            var k = Math.sqrt(n);
            var found = false;

            // trial division: look for a divisor between 2 and sqrt(n)
            for (var i = 2; !found && i <= k; i++) {
                found = n % i === 0;
            }

            if (!found) {
                postMessage(n.toString());
            }

            n++;
        }
    }
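Because the worker only talks through messages, its trial-division logic is awkward to check in isolation. As a minimal sketch, the same test can be written as plain functions that run without a worker. The names isPrime and primesBetween are hypothetical, and the guard for values below 2 is an addition, since 0 and 1 are not prime.

```javascript
// Trial division: n is prime if no integer from 2 up to sqrt(n) divides it.
function isPrime(n) {
  if (n < 2) {
    return false; // 0 and 1 are not prime
  }
  var limit = Math.sqrt(n);
  for (var i = 2; i <= limit; i++) {
    if (n % i === 0) {
      return false;
    }
  }
  return true;
}

// Collects all primes in the half-open range [start, end), mirroring the
// start/end parameters that the worker receives in its message.
function primesBetween(start, end) {
  var primes = [];
  for (var n = start; n < end; n++) {
    if (isPrime(n)) {
      primes.push(n);
    }
  }
  return primes;
}

console.log(primesBetween(100, 130));
```

Keeping the computation in pure functions like these makes it easy to compare the messages a worker posts against a sequential reference.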
Now, we just need to use the following code in the main application to start the background worker.
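    // Only start the worker if the browser supports the Worker interface
    if (typeof Worker !== 'undefined') {
        var w = new Worker('prime.js');

        w.onmessage = function(event) {
            console.log(event.data); // each message carries one prime as a string
        };

        var args = { start : 100, end : 10000 };
        w.postMessage(JSON.stringify(args));
    }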
Quite a lot of work. What's especially annoying is using another file. This produces a nice separation, but seems completely redundant for smaller tasks. Fortunately, there is a solution. Consider the following code:
    var fs = (function () {
        /* code for the worker */
    }).toString();
    var blob = new Blob(
        [fs.substr(13, fs.length - 14)],
        { type: 'text/javascript' }
    );
    var url = window.URL.createObjectURL(blob);
    var worker = new Worker(url);
    // Now setup communication and rest as before
Of course, we would like a better solution than such magic numbers (13 and 14), and depending on the browser, fallbacks for Blob and createObjectURL may be required. If you are not a JavaScript expert: fs.substr(13, fs.length - 14) extracts the function body. We do this by converting the function to a string (by calling toString()) and stripping off the signature of the function itself.
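One way to avoid the magic numbers is to locate the braces of the function body instead of counting characters. The sketch below is only one possible approach: the helper names extractFunctionBody and createInlineWorker are hypothetical, and it assumes a plain function expression whose parameter list contains no braces (no destructuring or object defaults).

```javascript
// Returns the source text between the first "{" and the last "}" of a
// function, so no hard-coded offsets are needed.
function extractFunctionBody(fn) {
  var source = fn.toString();
  var open = source.indexOf('{');
  var close = source.lastIndexOf('}');
  if (open === -1 || close <= open) {
    throw new TypeError('Expected a function with a braced body');
  }
  return source.slice(open + 1, close).trim();
}

// Builds a worker from a function body. This part is only meaningful in a
// browser that provides Blob, URL.createObjectURL and Worker.
function createInlineWorker(fn) {
  var blob = new Blob([extractFunctionBody(fn)], { type: 'text/javascript' });
  return new Worker(URL.createObjectURL(blob));
}

console.log(extractFunctionBody(function () { postMessage('ready'); }));
```

In a browser, createInlineWorker could then be called with a function whose body contains the worker code, followed by the same message setup as before.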
Can ParallelJS help?
This is where ParallelJS comes into play. It provides a nice, convenient API on top of Web Workers, and it includes many helpers and very useful abstractions. We start by supplying some data to work with.
    var p = new Parallel([1, 2, 3, 4, 5]);
    console.log(p.data);
The data field yields the provided array. No "parallel" operation has been invoked yet. However, the instance p contains a set of methods, such as spawn, which creates a new Web Worker. It returns a promise-style object whose result is consumed with then, which makes using the result a breeze.
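A minimal sketch of such a spawn call might look as follows. The squaring operation and the names squareAll and squareInOneWorker are assumptions chosen for illustration. The Parallel constructor is passed in as an argument so that the sketch does not assume how the library was loaded.

```javascript
// Squares every entry of the array. This function runs inside the single
// worker created by spawn, so it must not rely on outer-scope variables.
function squareAll(numbers) {
  return numbers.map(function (n) {
    return n * n;
  });
}

// `Parallel` is the constructor exported by ParallelJS (hypothetical wiring:
// it is injected here instead of being required at the top of the file).
function squareInOneWorker(Parallel, numbers) {
  var p = new Parallel(numbers);

  // spawn hands the whole array to one worker; then receives the result
  return p.spawn(squareAll).then(function (squares) {
    console.log(squares);
  });
}
```

In Node.js this could be invoked as squareInOneWorker(require('paralleljs'), [1, 2, 3, 4, 5]). In the browser, the global Parallel constructor would be passed instead.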
The problem with a single spawn call is that the computation is not really parallel. It creates only one background worker, which processes the entire data array in one go, and the result is available only after the complete array has been processed. A better solution is to use the map function of the Parallel instance.
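As a hedged sketch of the map variant, the function passed to map is applied to one element at a time, which lets the library distribute the elements across several workers. The names square and squareInParallel are illustrative, and the Parallel constructor is again passed in rather than loaded in the snippet.

```javascript
// Applied to a single element; ParallelJS can run many of these calls in
// different workers at the same time.
function square(n) {
  return n * n;
}

// `Parallel` is the ParallelJS constructor, injected as an argument.
function squareInParallel(Parallel, numbers) {
  var p = new Parallel(numbers);

  // then is called once all elements have been processed
  return p.map(square).then(function (squares) {
    console.log(squares);
  });
}
```

Compared with spawn, the per-element function is smaller and stateless, which is what makes splitting the work possible.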
A kernel passed to map (the function applied to each element) can be very simple, possibly too simple. In a real example, many operations and functions would be involved. We can use the require function to make additional functions available inside the workers.
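The following sketch shows how require might be combined with map. The factorial helper and the series term 10^n / n! are illustrative choices, and termsInParallel is a hypothetical wrapper that receives the Parallel constructor as an argument.

```javascript
// Helper that the kernel depends on. It has to be registered with require,
// because worker code cannot see functions from the main script.
function factorial(n) {
  return n < 2 ? 1 : n * factorial(n - 1);
}

// Kernel: the n-th term of the power series of e^10.
function seriesTerm(n) {
  return Math.pow(10, n) / factorial(n);
}

// `Parallel` is the ParallelJS constructor, injected as an argument.
function termsInParallel(Parallel, exponents) {
  var p = new Parallel(exponents);

  p.require(factorial); // make factorial available inside every worker

  return p.map(seriesTerm).then(function (terms) {
    console.log(terms);
  });
}
```

Passing a named function to require is what allows the kernel to call it by that name inside the worker.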
The reduce function helps to aggregate the fragmented results into a single result. It provides a convenient abstraction for collecting the partial results and performing some operation once all of them are known.
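A small sketch of map followed by reduce is given below. It relies on the ParallelJS convention that the reducer receives a two-element array (the accumulated value and the next value) rather than two separate arguments. The names addPair and sumOfSquares are illustrative, and the Parallel constructor is passed in as an argument.

```javascript
// Kernel for map: applied to every element independently.
function square(n) {
  return n * n;
}

// Reducer for ParallelJS: receives one array holding two values to combine.
function addPair(pair) {
  return pair[0] + pair[1];
}

// `Parallel` is the ParallelJS constructor, injected as an argument.
function sumOfSquares(Parallel, numbers) {
  return new Parallel(numbers)
    .map(square)      // partial results, computed in parallel
    .reduce(addPair)  // combined into a single value
    .then(function (total) {
      console.log(total);
    });
}
```

Because addition is associative, the order in which partial results are combined does not affect the total, which is the property a reducer needs in order to be applied in parallel.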
Conclusion
ParallelJS gives us an elegant way to avoid the problems that can arise when using Web Workers. Additionally, we get a nice API with some useful abstractions and helpers. Further improvements could be integrated in the future. Besides being able to use SMT in JavaScript, we may also want to use vectorization. SIMD.js seems to be a viable approach, where it is supported. In the (hopefully not too distant) future, using GPUs for computation may also be a valid option. There is a wrapper for CUDA (a parallel computing architecture) in Node.js, but executing raw JavaScript code on the GPU is still not possible. Until then, ParallelJS is our best choice for taking advantage of multi-core CPUs in long-running computations. And you? How do you use JavaScript to unleash the power of modern hardware?
Frequently Asked Questions (FAQ) about Parallel JavaScript with ParallelJS
What is ParallelJS and how does it work?
ParallelJS is a JavaScript library that lets you parallelize data processing by leveraging multi-core processors. You create a new Parallel object and pass it an array of data. The data can then be processed in parallel using the .map() method, which applies the specified function to each item in the array and delivers the results in a new array.
How do I install ParallelJS?
ParallelJS can be installed using npm (the Node.js package manager). Simply run the command "npm install paralleljs" in a terminal. Once the installation is complete, you can load it in your JavaScript file using "var Parallel = require('paralleljs');".
What are the benefits of using ParallelJS?
ParallelJS lets your data processing tasks make the most of multi-core processors, which can greatly reduce processing time on large datasets. It also provides a simple and intuitive API that makes parallelizing code easy.
Can I use ParallelJS in the browser?
Yes, ParallelJS can be used in the browser. You can include it in an HTML file using a script tag that points to the URL of the ParallelJS file. Once it is included, you can use the Parallel object just as in Node.js.
How do I use the .map() method in ParallelJS?
The .map() method in ParallelJS applies a function to each item of the data array. The function is passed to .map() and must be self-contained, because its source is handed to the workers and it cannot access variables from the surrounding scope. The results are delivered in a new array. For example, "var p = new Parallel([1, 2, 3]); p.map(function (n) { return n * 2; });" produces the new array [2, 4, 6], which can be read in a .then() callback.
How do I use the .reduce() method in ParallelJS?
The .reduce() method in ParallelJS reduces the data array to a single value using the specified function. The function is passed to .reduce() and receives an array containing the two values to combine. For example, "var p = new Parallel([1, 2, 3]); p.reduce(function (d) { return d[0] + d[1]; });" produces the value 6, which can be read in a .then() callback.
Can I chain methods in ParallelJS?
Yes, methods in ParallelJS can be chained. For example, you can use the .map() method to process the data and then the .reduce() method to combine the results into a single value.
How do I handle errors in ParallelJS?
Errors in ParallelJS can be handled by passing a second (failure) callback to the .then() method. This function is called if an error occurs during processing, and the error object is passed to it.
Can I use ParallelJS with other JavaScript libraries?
Yes, ParallelJS can be used with other JavaScript libraries. However, you need to make the library available in the worker context using the .require() method.
Is ParallelJS suitable for all types of data processing tasks?
While ParallelJS can greatly reduce processing time on large datasets, it may not be the best choice for every task. For small datasets, the overhead of creating workers and transferring data may outweigh the benefits of parallelization. It is best to measure ParallelJS against your specific use case to see whether it actually provides a performance advantage.