Overview
Node.js has an event-driven architecture. This means Node.js responds only when an event is initiated. That is, everything that happens in Node is a reaction to an event.
In reality, a simple GET API call to a Node.js server traverses a cascade of callbacks. These are abstracted away from developers and handled by a library called libuv, which provides a mechanism called the event loop.
The event loop is the fundamental concept for understanding Node's asynchronous, non-blocking I/O.
Explain how the event loop works, the phases of the event loop, and the actions executed in each phase
Understand the difference between blocking and non-blocking I/O
Process vs threads
Let's dive into the concept of a process. Every application has a container. The process is the top-level container of the application. When you run an application with Node.js, e.g. node index.js, you are creating a Node.js process that the application runs in.
Every process has a dedicated memory pool that is shared by all threads in the process. This means you can create a variable in one thread and read it from another one.
To understand the concept better, imagine going to a restaurant that has a dedicated waitress who asks you for your order and takes the order to a chef. While your food is being cooked, the waitress cannot take anybody else's order since they are dedicated to you. This means they are blocked from any other actions! Other customers come in as well, but nobody goes to take their orders, since the waitress is still waiting to serve your food.
The easiest solution to this problem would be to hire more people to take orders. After some time, the number of waitresses would be equal to the number of clients who are dining. This is the classic example of how applications like Apache server process requests where a "waitress" is a thread. Every request gets its own thread. This outlines the concept of multithreading.
This solution may not be memory-efficient, because it depends on memory being allocated and resources being released on time. Because threads share the limited resources allocated to the application process, creating and releasing those resources can add significant overhead and, in turn, hurt performance.
On top of this, multithreading contributes to software complexity as threads need to communicate with each other to stay synchronized with the main thread and other threads. For example, a thread needs to notify the main thread when an operation is finished.
Multithreading can introduce a "Race Condition", a bug that happens due to lack of synchronization between two threads, and our inability to know which thread will access a shared variable first.
To avoid these complexities, Node.js is single-threaded. This means that all operations execute in a single thread. In other words, Node.js applications have a single call stack.
A call stack is a last-in, first-out (LIFO) structure. During execution, when an application steps into a function (executes the function), it pushes the function onto the stack. When the application steps out of the function (the function returns), that function is popped off the call stack. A call stack records where in the program structure we are at any given time.
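As a rough illustration of the push and pop behaviour described above, the sketch below records when each function is entered and left. The function names (first, second, third) and the trace array are hypothetical and exist only for this example.

```javascript
// Hypothetical functions used only to illustrate call stack order.
const trace = [];

function third() {
  trace.push("enter third"); // third is now on top of the stack
  trace.push("leave third"); // third returns and is popped first
}

function second() {
  trace.push("enter second");
  third(); // pushes third on top of second
  trace.push("leave second");
}

function first() {
  trace.push("enter first");
  second(); // pushes second on top of first
  trace.push("leave first");
}

first();
console.log(trace.join("\n"));
```

The function that was entered last is the first one to leave, which is the last-in, first-out order of a stack.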
In the scenario where a slow, or processing-heavy, function is added to the stack, we cannot move out of it and on to the next function until the current function has finished its execution.
This can cause blocking, and slow execution of the application. While the stack is blocked, users cannot interact with the application as the Node.js runtime has one thread and can only do one thing at a time... or can it?
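The blocking behaviour described above can be sketched with a synchronous busy loop. This is a minimal illustration, not a benchmark: blockFor, BLOCK_MS and timerDelay are hypothetical names chosen for the example, and the 200 ms duration is arbitrary.

```javascript
// Sketch: a synchronous busy loop keeps the single thread occupied,
// so a timer that is already due cannot run until the loop finishes.
const BLOCK_MS = 200; // arbitrary duration chosen for the demo

const scheduledAt = Date.now();

// Resolves with the number of milliseconds the 0 ms timer really waited.
const timerDelay = new Promise((resolve) => {
  setTimeout(() => {
    resolve(Date.now() - scheduledAt);
  }, 0);
});

// Hypothetical helper that occupies the call stack for `ms` milliseconds.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait: nothing else can run on this thread meanwhile
  }
}

blockFor(BLOCK_MS);

timerDelay.then((delay) => {
  console.log(`0 ms timer actually fired after about ${delay} ms`);
});
```

Even though the timer was scheduled with a delay of 0 ms, its callback can only run after blockFor() has returned and the call stack is empty.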
Behind the scenes, there are C/C++ APIs that provide asynchronous input/output (I/O) and interaction with the operating system (OS). They allow code execution similar to multithreading without the same memory shortcomings. The event loop was implemented to assist with the interactions between these asynchronous components and the main application thread. The event loop is implemented as part of the libuv library, which provides cross-platform asynchronous I/O in Node.js.
Node is Single-threaded
Node.js is a single-threaded runtime, which means that only one piece of JavaScript runs at a given time, but thanks to its asynchronous nature, tasks can make progress concurrently. The event loop plays a major role in executing asynchronous functions: it schedules tasks, delegates operations to the OS, and waits for the outcome. Understanding the event loop therefore helps us understand Node's asynchronous processing and its non-blocking I/O nature.
Phases of an event loop
The phases help us understand which work is done at which point. The event loop has six phases, which are repeated for as long as the application still has code that needs to be executed.
The event loop starts at the moment Node.js begins to execute index.js, the entry point of your application.
The six phases make up one cycle, also known as a tick. A Node.js process exits when there is no more pending work in the event loop or when process.exit() is called. The program runs as long as there are tasks queued in the event loop or in the call stack.
Timers:
Everything that was scheduled via setTimeout() or setInterval() will be processed here.
At the beginning of this phase, the Event Loop updates its own time. Then it checks a queue, or pool, of timers. This queue consists of all timers that are currently set. The Event Loop takes the timer with the shortest wait time and compares it with the Event Loop's current time. If the wait time has elapsed, then the timer's callback is queued to be called once the call stack is empty.
However, the execution of the callbacks is controlled by the Poll phase of the event loop.
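The timer handling described above can be sketched with a few setTimeout() calls. The delays and the names fired and timersDone are arbitrary choices for this example. Note that a delay is the minimum time before the callback runs, not an exact time.

```javascript
// Sketch: timers fire in order of their delay, not in the order
// in which they were registered.
const fired = [];

const timersDone = new Promise((resolve) => {
  setTimeout(() => fired.push("100 ms timer"), 100);
  setTimeout(() => fired.push("10 ms timer"), 10);
  setTimeout(() => {
    fired.push("300 ms timer");
    resolve(fired); // the longest timer fires last
  }, 300);
});

timersDone.then((order) => console.log(order.join(" -> ")));
// prints: 10 ms timer -> 100 ms timer -> 300 ms timer
```

The 10 ms timer was registered second but fires first, because the event loop compares each timer's expiry with its current time.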
I/O Callbacks:
This is a phase of non-blocking I/O. The event loop in Node.js executes system-related callbacks.
When your application is waiting for a file to be read, it does not have to wait until the system gets back to it with the content of the file. It can continue executing code and receive the file content asynchronously when it is ready.
This is what a non-blocking I/O interface allows us to do. The asynchronous I/O request is pushed to the queue and the main stack can continue working as expected.
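const fs = require("fs");
fs.readFile("/file.md", (err, data) => {
  if (err) throw err;
  // data holds the file contents once the read completes
});
newMethod(); // hypothetical function, defined elsewhere; runs before the callback above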
The fs.readFile() operation is a classic I/O operation. Node.js passes the request to read the file to the filesystem of your OS. Code execution then immediately continues past the fs.readFile() call to newMethod(). When the I/O operation completes, or errors out, its callback is placed in the pending queue and is processed during the I/O callbacks phase of the Event Loop.
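To see the non-blocking behaviour described above, the following sketch records the order in which things happen around an fs.readFile() call. It writes its own temporary file so that it can run anywhere. The file name and the names demoFile, steps and readDone are assumptions made for this example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical temporary file so the example is self-contained.
const demoFile = path.join(os.tmpdir(), `event-loop-demo-${process.pid}.md`);
fs.writeFileSync(demoFile, "# hello\n");

const steps = [];

const readDone = new Promise((resolve, reject) => {
  steps.push("before readFile");

  fs.readFile(demoFile, "utf8", (err, data) => {
    if (err) return reject(err);
    // Runs later, once the read has completed.
    steps.push(`callback received ${data.length} characters`);
    fs.unlinkSync(demoFile); // clean up the temporary file
    resolve(steps);
  });

  // Runs immediately: readFile() did not block the thread.
  steps.push("after readFile");
});

readDone.then((order) => console.log(order.join("\n")));
```

The "after readFile" step is recorded before the callback runs, which shows that the read request was handed off and the main thread carried on.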
Idle / Waiting / Preparation:
During this phase, the event loop does not do much. It is idle and generally gathers information and plans what needs to be executed during the next phase. No mechanism can guarantee code execution during this phase.
I/O Polling:
Most of the I/O-related callbacks we write are executed here. Depending on the code, a callback may execute immediately or it may add something to the queue to be executed during a future tick of the event loop.
During this phase, the Event Loop manages the I/O workload, calls the functions in the queue until the queue is empty, and calculates how long it should wait before moving to the next phase. All callbacks in this phase are called synchronously in the order that they were added to the queue, from oldest to newest.
Note: this phase is optional. It may not happen on every tick, depending on the state of your application.
If any setImmediate() callbacks are scheduled, Node.js ends this phase once the queue is empty and moves on to the setImmediate() phase instead of waiting. If there are no functions in the queue and no timers, the application waits for callbacks to be added to the queue and executes them immediately, until the internal timeout that is set at the beginning of this phase is up. At that point, it moves on to the next phase. The length of this timeout also depends on the state of the application.
Set Immediate:
Node.js has a special timer, setImmediate(), and its callbacks are executed during this phase. This phase runs as soon as the poll phase becomes idle. If setImmediate() is scheduled within the I/O cycle, it will always be executed before other timers, regardless of how many timers are present.
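The ordering inside an I/O cycle can be sketched as follows. fs.stat() on the temporary directory is used here only as a convenient I/O operation. The names order and ioCycleDone are assumptions made for this example.

```javascript
const fs = require("fs");
const os = require("os");

const order = [];

const ioCycleDone = new Promise((resolve, reject) => {
  // Any asynchronous I/O call would do; its callback runs in the I/O cycle.
  fs.stat(os.tmpdir(), (err) => {
    if (err) return reject(err);

    // Both are scheduled from inside an I/O callback.
    setTimeout(() => {
      order.push("setTimeout");
      resolve(order);
    }, 0);
    setImmediate(() => order.push("setImmediate"));
  });
});

ioCycleDone.then((result) => console.log(result.join(" -> ")));
// prints: setImmediate -> setTimeout
```

Because both callbacks are scheduled from within an I/O callback, the setImmediate() phase is reached before the event loop returns to the timers phase.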
Close events:
This phase executes the callbacks of all close events, for example the close event callback of a web socket, or when process.exit() is called. This is when the Event Loop is wrapping up one cycle and is ready to move to the next one. It is primarily used to clean up the state of the application.
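A close event can be sketched with a plain TCP socket from the net module. This example assumes that the local loopback interface (127.0.0.1) is available. The names closeLog and closeDone are hypothetical. The server destroys the connection abruptly, which triggers the socket's close callback, and that callback is used to clean up by closing the server.

```javascript
const net = require("net");

const closeLog = [];

const closeDone = new Promise((resolve, reject) => {
  const server = net.createServer((socket) => {
    socket.on("error", () => {}); // ignore connection resets in this demo

    socket.on("close", () => {
      closeLog.push("socket closed");
      // Clean up: stop the server once the connection is gone.
      server.close(() => {
        closeLog.push("server closed");
        resolve(closeLog);
      });
    });

    socket.destroy(); // close the connection abruptly
  });

  server.on("error", reject);

  // Port 0 lets the OS pick any free port.
  server.listen(0, "127.0.0.1", () => {
    const client = net.connect(server.address().port, "127.0.0.1");
    client.on("error", () => {}); // the server drops us on purpose
  });
});

closeDone.then((log) => console.log(log.join(" -> ")));
```

Once both the socket and the server are closed, no work is left in the event loop and the process can exit.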