Node.js cleverly implements hot update of web application code

Background

Anyone who has developed a web application with Node.js has probably been bothered by the fact that modified code only takes effect after the Node.js process is restarted. Developers accustomed to PHP find this especially hard to get used to. No wonder PHP is the best programming language in the world. Restarting the process by hand is tedious, repetitive work, and as the application grows, the startup time becomes impossible to ignore.

Of course, no programmer, whatever the language, puts up with this kind of torture for long. The most direct and universal fix is to watch for file modifications and restart the process. Many mature tools already do this, such as the no-longer-maintained node-supervisor, the now popular PM2, and the relatively lightweight node-dev. All of them are based on this idea.

This article offers a different approach. With only small modifications, you can hot update code with truly zero restarts and get rid of the annoying code update problem when developing web applications with Node.js.

General idea

When it comes to hot code updates, the best-known implementation today is that of the Erlang language. Erlang is designed for high-concurrency, distributed programming, and its main application scenarios are fields such as securities trading and game servers. Such services more or less need to be operated and maintained while they are running, and hot code updating is an important part of that, so let's first take a brief look at Erlang's approach.

Since I have never used Erlang, the following content is all hearsay. If you want to have an in-depth and accurate understanding of Erlang's code hot update implementation, it is best to consult the official documentation.

Erlang's code loading is managed by a module called code_server. Apart from some code that is required at startup, most code is loaded by code_server.
When code_server finds that a module's code has been updated, it reloads the module. New requests from then on are executed with the new module, while requests that are still in progress continue to run on the old module.
After the new module is loaded, the old module is labeled old and the new module is labeled current. At the next hot update, Erlang scans for and kills anything still executing the old module, and then goes on updating modules according to the same logic.
Not all code in Erlang may be hot updated. Basic modules such as kernel, stdlib and compiler are not allowed to be updated by default.

Node.js also has a module similar to code_server, namely the require system, so Erlang's approach should be worth trying in Node.js as well. From Erlang's approach we can roughly summarize the key problems to solve for hot code updates in Node.js:

How to update module code
How to use the new module to handle requests
How to release resources of old modules

Let's analyze these problems one by one.

How to update module code

To solve the problem of updating module code, we need to read the implementation of the Node.js module manager, which lives in module.js. A quick read shows that the core logic is in Module._load. A simplified version of the code follows.

// Check the cache for the requested file.
// 1. If a module already exists in the cache: return its exports object.
// 2. If the module is native: call `NativeModule.require()` with the
//    filename and return the result.
// 3. Otherwise, create a new module for the file and save it to the cache.
//    Then have it load the file contents before returning its exports
//    object.
Module._load = function(request, parent, isMain) {
  var filename = Module._resolveFilename(request, parent);

  var cachedModule = Module._cache[filename];
  if (cachedModule) {
    return cachedModule.exports;
  }

  var module = new Module(filename, parent);
  Module._cache[filename] = module;
  module.load(filename);

  return module.exports;
};

require.cache = Module._cache;

You can see that the heart of it is Module._cache. As long as a module's entry in this cache is cleared, the module manager will load the latest code the next time the module is required.

Let's write a small program to verify this.

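// main.js
function cleanCache(modulePath) {
  // Resolve to the absolute filename, which is the key used by require.cache
  var path = require.resolve(modulePath);
  // Delete the entry (rather than assigning null) so that the next
  // require() treats the module as uncached and reloads it from disk
  delete require.cache[path];
}

setInterval(function () {
  cleanCache('./code.js');
  var code = require('./code.js');
  console.log(code);
}, 5000);

// code.js
module.exports = 'hello world';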

Run main.js and edit the contents of code.js while it is running. The console output shows that our code is successfully updated to the latest version.
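
The same experiment can also be run without editing a file by hand. The following is only a sketch under assumptions: the temporary directory and the v1/v2 contents are made up for illustration and are not part of the original program. It requires a file, rewrites it on disk, and shows that the stale value is returned until the cache entry is deleted.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Work in a throwaway directory (hypothetical) so no project file is touched.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hot-reload-'));
var file = path.join(dir, 'code.js');

fs.writeFileSync(file, "module.exports = 'v1';");
var first = require(file);

// Change the file on disk; the cached module is still returned.
fs.writeFileSync(file, "module.exports = 'v2';");
var stale = require(file);

// Drop the cache entry; the next require() reads the file again.
delete require.cache[require.resolve(file)];
var fresh = require(file);

console.log(first, stale, fresh); // v1 v1 v2
```

The second require returns the old value even though the file has changed, which is exactly the behaviour that forces a restart when the cache is left alone.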

So the problem of getting the module manager to update code is solved. Next, let's look at how to make the new module actually get executed in a web application.

How to use the new module to handle requests

To stay close to common usage, we will use Express as the example here. In fact, the same idea can be applied to most web applications.

First of all, if our service looks like the Express demo, with all the code in a single module, we cannot hot load that module.

var express = require('express');
var app = express();

app.get('/', function (req, res) {
  res.send('hello world');
});

app.listen(3000);

To achieve hot loading, just as Erlang has basic libraries that may not be updated, we need some basic code that is never hot updated to control the update process. Besides, if an operation like app.listen were executed again, it would differ little from restarting the Node.js process. So we need some clever code to isolate the frequently updated business code from the rarely updated basic code.

// app.js: basic code
var express = require('express');
var app = express();
var router = require('./router.js');

app.use(router);

app.listen(3000);

// router.js: business code
var express = require('express');
var router = express.Router();

// Middleware loaded here can also be updated automatically
router.use(express.static('public'));

router.get('/', function (req, res) {
  res.send('hello world');
});

module.exports = router;

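Unfortunately, although this successfully separates out the core code, router.js still cannot be hot updated. First, there is no mechanism to trigger an update, so the service has no way of knowing when to update the module. Second, app.use keeps holding on to the old router.js module, so even after the module is updated, requests are still handled by the old module rather than the new one.

So let's improve it further. We adjust app.js slightly: a file watcher serves as the trigger, and a closure solves the problem of app.use caching the router.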

// app.js
var express = require('express');
var fs = require('fs');
var app = express();

var router = require('./router.js');

app.use(function (req, res, next) {
  // The closure looks up the latest router object on every request,
  // so app.use does not keep the old router object cached
  router(req, res, next);
});

app.listen(3000);

// Watch the file for modifications and reload the code
fs.watch(require.resolve('./router.js'), function () {
  cleanCache(require.resolve('./router.js'));
  try {
    router = require('./router.js');
  } catch (ex) {
    // Keep serving with the previous router if the new code fails to load
    console.error('module update failed', ex);
  }
});

function cleanCache(modulePath) {
  // Delete the cache entry so that the next require() reloads the file
  delete require.cache[modulePath];
}

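Now try modifying router.js again and you will find that our hot code update has begun to take shape: new requests use the latest router.js code. Besides changing what router.js returns, you can also try changing the routes themselves, and they update as expected too.

Of course, a complete hot update solution needs further improvements tailored to your own setup. First, regarding middleware: middleware that does not need hot updating, or that should not be executed again on every update, can be declared with app.use, while middleware that you want to be able to change flexibly can be declared with router.use. Second, the file watcher must watch not just the route file but every file that needs hot updating. Besides file watching, you can also use an editor extension that, on save, sends a signal to the Node.js process or requests a specific URL to trigger the update.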
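
When more than one file needs hot updating, one possible approach is to clear every cached module under a directory in one go. The helper below is a hypothetical sketch, not part of the original solution: the name cleanCacheUnder, the optional cache parameter and the /srv/app paths are all assumptions made for illustration. A fake cache object is passed in here so the behaviour can be checked without touching the real require.cache.

```javascript
var path = require('path');

// Hypothetical helper: drop every cached module whose file lives under `dir`.
// `cache` defaults to the real require.cache; a plain object can be passed
// in instead for experiments.
function cleanCacheUnder(dir, cache) {
  cache = cache || require.cache;
  var prefix = path.resolve(dir) + path.sep;
  var removed = [];
  Object.keys(cache).forEach(function (filename) {
    if (filename.indexOf(prefix) === 0) {
      delete cache[filename];
      removed.push(filename);
    }
  });
  return removed;
}

// Illustration with made-up paths standing in for cached modules.
var fakeCache = {};
fakeCache[path.resolve('/srv/app/routes/index.js')] = {};
fakeCache[path.resolve('/srv/app/routes/user.js')] = {};
fakeCache[path.resolve('/srv/app/node_modules/express/index.js')] = {};

var removed = cleanCacheUnder('/srv/app/routes', fakeCache);
console.log(removed.length, Object.keys(fakeCache).length); // 2 1
```

Such a helper could be called from an fs.watch callback, from a signal handler, or from a development-only URL, depending on which trigger suits your setup. After it runs, the entry module still has to be required again, as app.js does for router.js.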

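How to release resources of old modules

To explain clearly how the resources of old modules are released, we would first need to understand the memory reclamation mechanism of Node.js. This article does not describe it in detail. Many articles and books explain it, and interested readers can read further on their own. In short, when an object is no longer referenced by any other object, it is marked as reclaimable, and its memory is freed during the next GC pass.

So our task is to make sure that, after a module's code is updated, no object keeps a reference to the old module. First, taking the code from the section How to update module code as an example, let's see what goes wrong when the old module's resources are not reclaimed. To make the result more obvious, we modify code.js.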

// code.js
var array = [];

// Build a large array so that every leaked copy of this module is visible
for (var i = 0; i < 10000; i++) {
  array.push('mem_leak_when_require_cache_clean_test_item_' + i);
}

module.exports = array;

// app.js
function cleanCache(modulePath) {
  var path = require.resolve(modulePath);
  delete require.cache[path];
}

// Load and immediately discard the module, over and over
setInterval(function () {
  var code = require('./code.js');
  cleanCache('./code.js');
}, 10);

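Well, we have used a very clumsy but effective way to inflate the memory footprint of the code.js module. Run the program again and you will see memory usage soar. Before long, Node.js reports process out of memory. Yet looking at the code of the two files, we cannot find anywhere that keeps a reference to the old module.

With a profiling tool such as node-heapdump we can quickly locate the problem: in module.js, Node.js automatically adds a reference for every module.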

function Module(id, parent) {
  this.id = id;
  this.exports = {};
  this.parent = parent;
  if (parent && parent.children) {
    parent.children.push(this);
  }

  this.filename = null;
  this.loaded = false;
  this.children = [];
}

Accordingly, we can adjust the cleanCache function so that this reference is removed as well when the module is updated.

// app.js
function cleanCache(modulePath) {
  var cached = require.cache[modulePath];
  if (!cached) {
    return; // nothing is cached for this path
  }
  // remove reference in module.parent
  if (cached.parent) {
    var index = cached.parent.children.indexOf(cached);
    // splice(-1, 1) would remove the wrong child, so check the index first
    if (index !== -1) {
      cached.parent.children.splice(index, 1);
    }
  }
  delete require.cache[modulePath];
}

setInterval(function () {
  var code = require('./code.js');
  cleanCache(require.resolve('./code.js'));
}, 10);

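Run it again. This time it is much better: memory grows only slightly, which shows that the resources held by the old module are now released correctly.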

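With the new cleanCache function, ordinary usage is fine, but that does not mean we can rest easy. In Node.js, besides the references added by the require system, listening for events through EventEmitter is a feature everyone uses, and EventEmitter is a prime suspect for creating references between modules. So can EventEmitter release its resources correctly? The answer is yes.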

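// code.js
var EventEmitter = require('events').EventEmitter;
// EventEmitter is a constructor and must be called with `new`
var moduleA = new EventEmitter();

moduleA.on('whatever', function () {
});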

When the code.js module is updated and all references to it are removed, moduleA is released automatically too, together with its internal event listeners, as long as it is not referenced by another module that has not been released.

There is only one pathological EventEmitter usage that this scheme cannot cope with: code.js adds a listener to a global object every time it is executed, so listeners keep piling up on the global object. Node.js will also soon warn that too many listeners have been detected and that this may be a memory leak.
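
The pile-up is easy to reproduce in isolation. In the sketch below, globalBus stands in for a long-lived global object and loadCode stands in for the body of code.js being executed once per reload. Both names, and the dispose hook, are assumptions made for illustration. The article's scheme does not include such a hook. It is shown only as one possible way to avoid the problem.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stands in for a long-lived global object that survives every reload.
var globalBus = new EventEmitter();

// Stands in for the body of code.js; each call simulates one (re)load.
function loadCode() {
  function onWhatever() {}
  globalBus.on('whatever', onWhatever);
  // Hypothetical convention: the module exposes a dispose() hook.
  return {
    dispose: function () {
      globalBus.removeListener('whatever', onWhatever);
    }
  };
}

// Naive reloads: one more listener stays on the global object each time.
loadCode();
loadCode();
loadCode();
var leaked = globalBus.listenerCount('whatever');

globalBus.removeAllListeners('whatever');

// Reloads that call dispose() on the old instance before loading again.
var current = loadCode();
for (var i = 0; i < 3; i++) {
  current.dispose();
  current = loadCode();
}
var disposed = globalBus.listenerCount('whatever');

console.log(leaked, disposed); // 3 1
```

Each listener closure keeps the old module's scope alive, so in the naive case the old code can never be reclaimed, no matter how carefully the require cache is cleaned.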

At this point you can see that, once the references that Node.js adds automatically in the require system are dealt with, reclaiming the resources of old modules is not a big problem. We cannot, as Erlang does, scan for leftover old modules at the next hot update and control them at a fine granularity, but we can solve the resource release problem through sensible avoidance.

In web applications there is another reference problem: unreleased modules or core modules may hold references to modules that need hot updating, as app.use does. As a result, the resources of the old module cannot be released, and new requests are not handled by the new module. The solution is to control the entry points exposed through global variables or references, and to update those entry points manually when a hot update is performed. The wrapping of router in the section How to use the new module to handle requests is one example. With the entry point under control, no matter how router.js references other modules, they are all released when the entry point is released.
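
The difference between a captured reference and a controlled entry point can be shown without Express at all. In this sketch, routerV1 and routerV2 are made-up stand-ins for two successive versions of router.js, and entry plays the role of the wrapper function passed to app.use.

```javascript
// Hypothetical stand-ins for two successive versions of router.js.
function routerV1(req) { return 'v1 handled ' + req; }
function routerV2(req) { return 'v2 handled ' + req; }

var router = routerV1;

// Registered once with long-lived code (for example via app.use).
// It looks `router` up on every call instead of capturing its value.
function entry(req) {
  return router(req);
}

// What a direct registration such as app.use(router) would have captured.
var captured = router;

// The hot update swaps the single entry point.
router = routerV2;

console.log(entry('/'));    // v2 handled /
console.log(captured('/')); // v1 handled /
```

The captured reference both serves stale code and keeps routerV1 alive, while the wrapper leaves only one variable pointing at the old version, and that variable is overwritten on update.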

Another thing that can prevent resources from being released is an operation like setInterval, which keeps the objects it references alive so that they cannot be released. However, web applications rarely use this kind of technique, so our scheme pays no attention to it.

Epilogue

So far we have solved the three major problems of hot code update for Node.js web applications. However, because Node.js lacks an effective mechanism for scanning retained objects, we cannot completely rule out cases, such as those caused by setInterval, in which the resources of an old module cannot be released. Because of this limitation, in the YOG2 framework that we currently provide, the technique is mainly used during development and debugging, where hot updates speed up development. Code updates in production still rely on restarts or PM2's hot reload feature to keep online services stable.

Since hot update is closely tied to the framework and the business architecture, this article does not give a general solution. For reference, here is a brief description of how we use the technique in the YOG2 framework. YOG2 itself supports splitting front-end and back-end subsystems into apps, so our strategy is to update code at app granularity. Because operations like fs.watch have compatibility issues, and alternatives such as fs.watchFile cost more in performance, we build on YOG2's test machine deployment feature: uploading and deploying new code notifies the framework that the app's code needs updating. Along with the module cache, the route cache and template cache are also updated at app granularity, which completes the code update.

If you use a framework such as Express or Koa, you only need to follow the approach in this article and adapt the main route to your own business needs to put this technique to good use.