Learning how to create a thread in Node.js is one of the recommended steps toward boosting the performance of your Node.js applications.
The worker_threads module enables multithreading in Node.js. First, import the Worker constructor in one file.
const { Worker } = require('worker_threads')
Then import the parentPort object in a separate file.
const { parentPort } = require('worker_threads')
Next, do the heavy computation in the file containing the parentPort object and send the result with parentPort.postMessage(). The Worker instance in the other file receives it through its message event.
Alternatively, you can keep the main thread responsive by splitting the work into chunks with the setImmediate API, or run work in parallel using clustering and child processes.
It would be best to understand processes and threads, and single-threading versus multithreading, before learning how to create a thread in Node.js. Let's get started.
Understanding processes and threads
The first step towards knowing how to create a thread in Node.js is to distinguish a process from a thread.
A process is a program under execution, while a thread is the basic unit of CPU utilization within a process. A program can run as multiple processes, and a single process can contain many threads.
A thread has an ID, a program counter, a register set, and a stack. A multithreaded process can perform many tasks at a time by assigning each thread a task. Each thread gets its own stack and registers but shares the process's code section, data, and files with the other threads.
Multithreading increases concurrency by using the available CPU cores. It keeps applications responsive because a long-running task in one thread does not stall the others, and it is economical because threads share the resources of their process.
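The relationship between a process and its threads can be observed from Node.js itself. The following is a minimal sketch, assuming the built-in worker_threads module; the inspectWorker helper and the inline worker source (started with the eval option so the sketch fits in one file) are illustrative choices, not part of the original tutorial. It reports that a worker runs in the same process as the main thread but under a different thread ID.

```javascript
const { Worker, threadId } = require('worker_threads')

// Worker source kept inline (eval: true) only so the sketch fits in one file.
const workerSource = `
  const { parentPort, threadId } = require('worker_threads')
  // Report which process and which thread this code is running in.
  parentPort.postMessage({ pid: process.pid, threadId })
`

// Hypothetical helper: starts a worker and resolves with what it reports.
const inspectWorker = () => new Promise((resolve, reject) => {
  const worker = new Worker(workerSource, { eval: true })
  worker.once('message', resolve)
  worker.once('error', reject)
})

inspectWorker().then(info => {
  console.log(`main:   pid ${process.pid}, thread ${threadId}`)
  console.log(`worker: pid ${info.pid}, thread ${info.threadId}`)
})
```

Both lines should show the same process ID, while the thread IDs differ.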
Why knowing how to create a thread in Node.js is necessary
It would help to know the difference between JavaScript and Node.js before doing Nodejs multithreading.
Before the creation of Node.js, JavaScript code ran almost entirely in the browser. The code is handed to the JavaScript engine embedded in the browser, and it communicates with the browser through APIs that are accessible on the window object.
The window object, in turn, exposes objects such as document that let the code interact with the page. The JavaScript engine processes requests on a single thread.
Later, Node.js was created by embedding Chrome's V8 JavaScript engine in a C++ program. You can now write JavaScript programs that communicate with the operating system outside the browser.
With the Node.js runtime environment came the ability to take on more complex roles with JavaScript. For example, apart from creating servers, you can read and write files and build CLI tools.
However, Node.js continues to rely on a single-threaded, asynchronous I/O architecture, which suits I/O-bound work better than CPU-bound work. The architecture is made possible by callback functions, events, and promises.
A single thread starts a request. But instead of waiting for the response to arrive, it continues to run other code. When the response arrives, the thread delivers it to the code that asked for it, then proceeds with the rest of the program.
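This flow can be sketched with a timer standing in for a slow request. The order array and the 10 ms delay are assumptions made for illustration only.

```javascript
// Records the order in which the single thread runs each piece of code.
const order = []

order.push('request started')

// The timer stands in for a slow request; its callback runs when the "response" arrives.
setTimeout(() => {
  order.push('response handled')
  console.log(order.join(' -> '))
}, 10)

// The thread does not wait for the timer; it carries on with other code.
order.push('other work done')
```

The other work is recorded before the response is handled, even though the request was started first.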
The problem arises when a task is CPU-intensive and has no asynchronous API. The single thread stays busy until the task finishes, so nothing else in the program can run in the meantime. That leads to unresponsive applications.
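A rough way to see the effect is to schedule a short timer and then keep the thread busy. In this sketch, blockFor is a hypothetical helper that simulates CPU-intensive work with a busy loop, and the 10 ms and 200 ms figures are arbitrary.

```javascript
const start = Date.now()
let timerDelay = null

// Ask for a callback after 10 ms.
setTimeout(() => {
  timerDelay = Date.now() - start
  console.log(`timer fired after ${timerDelay} ms instead of about 10 ms`)
}, 10)

// Hypothetical stand-in for CPU-intensive work: spin until the time is up.
const blockFor = ms => {
  const end = Date.now() + ms
  while (Date.now() < end) {}
}

// The timer callback cannot run until this synchronous call returns.
blockFor(200)
```

The timer is ready long before the loop ends, but its callback has to wait because the only thread is occupied.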
That is where knowing how to create a thread in Node.js comes to your rescue.
Use Nodejs multithreading strategies.
Although your JavaScript code runs on a single thread by default, Node.js gives you ways to keep that thread responsive or to move work off it. Here are two recommended ways to achieve the desired effect.
How to avoid blocking the main thread using the setImmediate API
The setImmediate API lets you split a CPU-intensive task into smaller chunks. It receives a callback function and schedules it to run on a later turn of the event loop, after the currently running code has finished.
If you process one chunk per callback and schedule the next chunk with setImmediate, the event loop gets a chance to serve other work between chunks. No new thread is created; the long task simply stops monopolizing the main thread.
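The scheduling behavior can be checked with a small sketch. The steps array is an illustrative name; the point is only that setImmediate callbacks run after the current synchronous code, in the order they were scheduled.

```javascript
const steps = []

// Both callbacks are queued now but run only after the synchronous code below.
setImmediate(() => steps.push('first setImmediate callback'))
setImmediate(() => {
  steps.push('second setImmediate callback')
  console.log(steps.join(' -> '))
})

// This line runs before either callback, even though it appears last.
steps.push('synchronous code')
```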
Example
Assume we want to manipulate the photos from JSONPlaceholder. We can fetch the data and run some calculations on the large list of photo IDs without blocking the main thread.
Using a fetch.js file, fetch the photos and write the data to a data.json file.
const fs = require('fs')
// Requires Node.js 18 or later, where fetch is available globally.
const fetchAndSaveData = async () => {
  const response = await fetch('https://jsonplaceholder.typicode.com/photos')
  const theData = await response.json()
  // Save the photos to data.json, pretty-printed with two spaces.
  await fs.promises.writeFile('data.json', JSON.stringify(theData, null, 2))
}
fetchAndSaveData().catch(console.error)
Import the data.json file's content into a new main.js file.
const data = require('./data.json')
Loop through the data array and push each photo's ID into a dataIDs array.
const dataIDs = []
for (const photo of data) { dataIDs.push(photo.id) }
Next, process the dataIDs array one chunk at a time, as follows.
const processHugeData = () => {
  if (dataIDs.length === 0) {
    console.log('Done!')
    return
  }
  // Take the next 50 IDs off the front of the array and process them.
  const newIDs = dataIDs.splice(0, 50)
  for (const id of newIDs) {
    console.log(`${id} => ${Math.floor(id * 2.9 - id * 23 + 159000)}`)
  }
  // Yield to the event loop before handling the next chunk.
  setImmediate(processHugeData)
}
processHugeData()
We partition the huge data into chunks of 50 and then do some calculations on each chunk before printing the output.
After each chunk, we hand the rest of the data back to the event loop with setImmediate, so it gets processed on a later turn of the loop and other pending work can run in between. When there is no data left to manipulate, we print Done!
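The same chunking idea can be wrapped in a reusable function. The following is a sketch, not a definitive implementation: processInChunks, its parameters, and the generated ids array are hypothetical names introduced here, and chunkSize is assumed to be a positive integer. Unlike the example above, it collects the results and reports completion through a promise instead of logging each value.

```javascript
// Hypothetical helper: applies handleItem to every item, chunkSize items per
// event-loop turn, and resolves with the collected results.
const processInChunks = (items, chunkSize, handleItem) =>
  new Promise((resolve, reject) => {
    const results = []
    let index = 0

    const runChunk = () => {
      try {
        const end = Math.min(index + chunkSize, items.length)
        for (; index < end; index++) results.push(handleItem(items[index]))
      } catch (err) {
        return reject(err)
      }
      // Yield to the event loop between chunks, or finish when nothing is left.
      if (index < items.length) setImmediate(runChunk)
      else resolve(results)
    }

    runChunk()
  })

// Usage with generated IDs and the same formula as the tutorial.
const ids = Array.from({ length: 5000 }, (_, i) => i + 1)
const transform = id => Math.floor(id * 2.9 - id * 23 + 159000)

processInChunks(ids, 50, transform)
  .then(out => console.log(`Processed ${out.length} IDs`))
```

Leaving the input array untouched and returning a promise makes the helper easier to combine with other asynchronous code, at the cost of holding all results in memory.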
The main drawback of chunking work with the setImmediate() API is that many tasks cannot easily be split into chunks, and the work still runs on the main thread.
One solution would be forking the current process using the built-in child_process module and running the CPU-intensive task in the child process.
You then pass the input and output between the processes as messages because a forked process does not share memory with the original process.
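The challenge is that forking a process is comparatively slow and resource-hungry because each child is a full Node.js process with its own memory. Worse yet, if a child process is killed in the middle of a task, its in-progress work is lost. The recommended solution to the problem is to use worker threads.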
How to create a thread in Node.js using the worker_threads module
The worker_threads module is the preferred way to do Node.js multithreading. It is simpler to use and less resource-intensive than forking processes.
The first step is to import the parentPort object and the Worker constructor from the built-in worker_threads module in two separate files. The worker file imports parentPort:
const { parentPort } = require('worker_threads')
The main file imports Worker:
const { Worker } = require('worker_threads')
Next, you use the objects as shown below.
Example
Assume we want to count through 5 billion iterations without blocking the main thread. We do the heavy task in a compute.js file, where we have imported the parentPort object.
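const { parentPort } = require('worker_threads')
// Count to 5 billion; this would block the main thread if run there.
let sum = 0
for (let i = 0; i < 5000000000; i++) {
  sum++
}
// Send the result back to the main thread.
parentPort.postMessage(sum)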
We then pass the sum to the postMessage() method of the parentPort object.
Lastly, we switch to the main thread, housed in the main.js file, and pass the name of the worker file (compute.js) into a new Worker instance.
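const { Worker } = require('worker_threads')
// Start compute.js in a separate thread (a relative path is resolved against the current working directory).
const worker = new Worker('./compute.js')
worker.on('message', data => console.log(data))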
The worker instance listens for the incoming message event using the on() method.
When the message event fires, the callback function receives the data sent by the worker. We print the data once the computation in the worker thread is complete.
Although the computation can take some time, it does not block the main thread's event loop because the worker_threads module runs the CPU-intensive computation in a separate thread.
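To see that the main thread stays responsive, the worker can be wrapped in a promise while a timer keeps ticking. This is a sketch under stated assumptions: countInWorker, the heartbeat timer, and the iteration count are hypothetical, and the worker source is kept inline with the eval option only so the sketch fits in one file. The two-file layout above works the same way with a file path. The iteration count is passed through workerData, and the error and exit events are handled in addition to message.

```javascript
const { Worker } = require('worker_threads')

// Inline worker source; in the two-file layout this would live in compute.js.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads')
  let sum = 0
  for (let i = 0; i < workerData.iterations; i++) sum++
  parentPort.postMessage(sum)
`

// Hypothetical helper: runs the count in a worker and resolves with the result.
const countInWorker = iterations => new Promise((resolve, reject) => {
  const worker = new Worker(workerSource, { eval: true, workerData: { iterations } })
  worker.once('message', resolve)
  worker.once('error', reject)
  worker.once('exit', code => {
    if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`))
  })
})

// The heartbeat keeps printing while the worker is busy counting.
const heartbeat = setInterval(() => console.log('main thread is still responsive'), 100)

countInWorker(200000000)
  .then(total => console.log(`Worker counted to ${total}`))
  .catch(console.error)
  .finally(() => clearInterval(heartbeat))
```

If the same loop ran on the main thread, the heartbeat would not print until the loop had finished.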
Conclusion
The first step towards knowing how to create a thread in Node.js is to understand Node.js programs, processes, and threads.
Next, you should know how Nodejs scales up applications using its single-threaded architecture and the architecture's drawbacks.
Lastly, you can try out possible solutions before settling for the worker_threads module, as shown in this tutorial.