Understanding Readable Streams in NodeJS

青灯夜游
Release: 2020-11-20 17:45:57

What is a readable stream

A readable stream is a stream that produces data for a program to consume. Common ways of producing data include reading files from disk and reading the content of network requests. Recall the example from the earlier introduction to streams:

const rs = fs.createReadStream(filePath);

rs is a readable stream; it produces data by reading a file from disk. The familiar process.stdin is also a readable stream:

process.stdin.pipe(process.stdout);

This single statement echoes console input back to the console. process.stdin produces data by reading what the user types on the console.

To restate the definition: a readable stream is a stream that produces data for a program to consume.

Custom readable stream

In addition to the fs.createReadStream provided by Node itself, we also often use the src method provided by gulp or vinyl-fs:

gulp.src(['*.js', 'dist/**/*.scss'])

If we want to produce data in our own way and hand it to a program for consumption, where do we start?

It takes two simple steps:

  1. Inherit from the Readable class of the stream module
  2. Override the _read method and call this.push to put the produced data into the queue to be read

The Readable class already does most of the work of a readable stream. We only need to inherit from it and write the data-producing logic in the _read method to implement a custom readable stream.

Suppose we want to implement a stream that produces a random number every 100 milliseconds (not a useful one, admittedly):

const Readable = require('stream').Readable;

class RandomNumberStream extends Readable {
    constructor(max) {
        super()
    }

    _read() {
        const ctx = this;
        setTimeout(() => {
            const randomNumber = Math.floor(Math.random() * 10000);

            // Only strings or Buffers can be pushed; append a newline for readable output
            ctx.push(`${randomNumber}\n`);
        }, 100);
    }
}

module.exports = RandomNumberStream;

The class inheritance part of the code is straightforward. The part to look at is the implementation of the _read method, where several points are worth noting:

  1. The Readable class has a default _read method, but it only reports a "not implemented" error and produces no data; what we do is override it
  2. The _read method takes a size parameter that suggests how much data should be read and handed to the read method. It is only advisory, and many implementations ignore it. We ignore it here too and will cover it in detail later
  3. Data is pushed into the buffer through this.push. The concept of the buffer will be covered later; for now, think of it as data squeezed into a water pipe, ready to be consumed
  4. Outside object mode, the content passed to push can only be a string or a Buffer, not a number
  5. The push method takes a second parameter, encoding, which specifies the encoding when the first parameter is a string
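
To make the encoding parameter of push concrete, here is a minimal sketch. HexGreetingStream is a hypothetical class invented for this illustration; it pushes a hex-encoded string and relies on the second argument of push to decode it into bytes.

```javascript
const { Readable } = require('stream');

// Hypothetical demo class: emits one greeting, then ends.
class HexGreetingStream extends Readable {
    _read() {
        // The string is hex text; the second argument tells push() how to decode it.
        this.push('68656c6c6f', 'hex'); // the bytes of "hello"
        this.push(null);                // no more data
    }
}

const received = [];
const greeting = new HexGreetingStream();

greeting.on('data', chunk => {
    // chunk arrives as a Buffer holding the decoded bytes
    received.push(chunk.toString('utf8'));
});
greeting.on('end', () => console.log(received));
```

Without the second argument, the string would be treated as UTF-8 text and the consumer would receive the ten characters of the hex string itself.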

Run it to see the effect:

const RandomNumberStream = require('./RandomNumberStream');

const rns = new RandomNumberStream();

rns.pipe(process.stdout);

Numbers now appear on the console continuously. We have implemented a readable stream that produces random numbers, but a few small problems remain.

How to stop

We push a number into the buffer every 100 milliseconds. Just as reading a local file eventually finishes, how do we stop and signal that all the data has been read?

Just push null into the buffer. Let's modify the code so that the consumer can define how many random numbers are needed:

const Readable = require('stream').Readable;

class RandomNumberStream extends Readable {
    constructor(max) {
        super()
        this.max = max;
    }

    _read() {
        const ctx = this;

        setTimeout(() => {
            if (ctx.max) {
                const randomNumber = Math.floor(Math.random() * 10000);

                // Only strings or Buffers can be pushed; append a newline for readable output
                ctx.push(`${randomNumber}\n`);
                ctx.max -= 1;
            } else {
                ctx.push(null);
            }
        }, 100);
    }
}

module.exports = RandomNumberStream;

We use a max field to let the consumer specify how many random numbers are needed; it is set at instantiation:

const RandomNumberStream = require('./RandomNumberStream');

const rns = new RandomNumberStream(5);

rns.pipe(process.stdout);

Now the console prints only 5 numbers.

Why setTimeout instead of setInterval

Careful readers may have noticed that to produce a random number every 100 milliseconds we did not call setInterval but setTimeout. It only delays once and never repeats, so why is the result still correct?

To answer this we need to understand the two modes in which a stream works:

  1. Flowing mode: data is read from the underlying system and provided to the application as quickly as possible
  2. Paused mode: the read() method must be called explicitly to read chunks of data

A stream starts out in paused mode, which means the program has to call read() explicitly. In our example we got data without calling it, because the pipe() method switched the stream to flowing mode. In that mode _read() is called again automatically after each chunk, until the data is exhausted, so each _read() call only needs to produce data once.
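
The repeated calls to _read() can be observed directly. The following is a small sketch, not part of the original example: CountingStream and its readCalls field are hypothetical names, and the stream pushes synchronously so that it finishes immediately.

```javascript
const { Readable } = require('stream');

// Hypothetical demo class: pushes the numbers 1..limit, one per _read() call.
class CountingStream extends Readable {
    constructor(limit) {
        super();
        this.limit = limit;
        this.current = 0;
        this.readCalls = 0; // how many times the stream asked us for data
    }

    _read() {
        this.readCalls += 1;
        if (this.current < this.limit) {
            this.current += 1;
            this.push(`${this.current}\n`); // exactly one chunk per call
        } else {
            this.push(null); // signal the end
        }
    }
}

const counter = new CountingStream(3);
const seen = [];

// Adding a data listener switches the stream to flowing mode.
counter.on('data', chunk => seen.push(chunk.toString().trim()));
counter.on('end', () => {
    console.log('chunks:', seen.join(','), '| _read calls:', counter.readCalls);
});
```

Each call produces a single chunk, yet all three numbers arrive, because the stream keeps asking for more until null is pushed.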

Switching between flowing mode and paused mode

A stream can be switched from the default paused mode to flowing mode in the following ways:

  1. Add a data event listener
  2. Call the resume() method
  3. Call the pipe() method to send the data to a writable stream

There are two ways to switch from flowing mode back to paused mode:

  1. If the stream has no pipe() destinations, call the pause() method
  2. If the stream has pipe() destinations, remove all data event listeners and then call the unpipe() method
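
One way to watch these switches happen is the readableFlowing property of a readable stream, which the article does not mention. It is null before any consumer is attached, true in flowing mode, and false after an explicit pause. A minimal sketch, using a stream whose read() does nothing so that only the mode changes are visible:

```javascript
const { Readable } = require('stream');

// A stream that never produces data; we only inspect its mode.
const rs = new Readable({ read() {} });
const states = [];

states.push(rs.readableFlowing); // initial state, no consumer yet

rs.on('data', () => {});         // adding a data listener starts the flow
states.push(rs.readableFlowing);

rs.pause();                      // no pipe destinations, so pause() is enough
states.push(rs.readableFlowing);

rs.resume();                     // back to flowing mode
states.push(rs.readableFlowing);

console.log(states);
```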

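The data event

Once pipe() is used, data flows from the readable stream into the writable stream, but to us it looks like a black box. How does the data actually flow? The discussion of switching modes mentioned two important terms:

  1. The data event, which corresponds to flowing mode
  2. The read() method, which corresponds to paused mode

These two mechanisms are what let us drive the flow of data. Let's look at the data event of flowing mode first. As soon as we listen for the data event of a readable stream, the stream enters flowing mode. We can rewrite the code that consumes the stream above: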

const RandomNumberStream = require('./RandomNumberStream');

const rns = new RandomNumberStream(5);

rns.on('data', chunk => {
  console.log(chunk);
});

The console then prints something like the following:

<Buffer 39 35 37 0a>
<Buffer 31 30 35 37 0a>
<Buffer 38 35 31 30 0a>
<Buffer 33 30 35 35 0a>
<Buffer 34 36 34 32 0a>

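The data event fires whenever the readable stream has produced data that is ready to be consumed. Once a data listener is attached, data is delivered as quickly as possible. The listener receives the Buffer passed by the readable stream as its first argument, which is the chunk we print. To display it as a number, call the Buffer's toString() method.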
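
As a sketch of the toString() conversion, the example below uses FixedNumberStream, a hypothetical stand-in for RandomNumberStream that emits fixed values without a delay, so the result is predictable.

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for RandomNumberStream with fixed values.
class FixedNumberStream extends Readable {
    constructor(numbers) {
        super();
        this.numbers = numbers.slice();
    }

    _read() {
        if (this.numbers.length > 0) {
            this.push(`${this.numbers.shift()}\n`);
        } else {
            this.push(null);
        }
    }
}

const fns = new FixedNumberStream([957, 1057, 8510]);
const lines = [];

fns.on('data', chunk => {
    // chunk is a Buffer such as <Buffer 39 35 37 0a>; decode it and drop the trailing newline
    const text = chunk.toString('utf8').trim();
    lines.push(Number(text));
    console.log(text);
});
```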

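When all the data has been processed, an end event fires as well. Because stream processing is not a synchronous call, we need to listen for this event if we want to do something once everything is finished. Append the following to the end of the code: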

rns.on('end', () => {
  console.log('done');
});

This prints 'done' once all the data has been received.

If an error occurs during processing, an error event fires. We can listen for it too and handle the exception:

rns.on('error', (err) => {
  console.log(err);
});
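
RandomNumberStream never fails, so the listener above is never triggered. To see an error event actually fire, here is a sketch with a hypothetical FailingStream that reports a made-up failure through destroy(err), the stream method for signalling an error.

```javascript
const { Readable } = require('stream');

// Hypothetical demo class: fails on the first request for data.
class FailingStream extends Readable {
    _read() {
        // destroy(err) reports a failure from inside a stream
        this.destroy(new Error('disk unplugged'));
    }
}

const failing = new FailingStream();
const events = [];

failing.on('data', () => events.push('data'));
failing.on('end', () => events.push('end'));
failing.on('error', err => {
    events.push(`error: ${err.message}`);
    console.log(events);
});
```

A stream that fails this way does not emit end, so cleanup that must always run should not rely on the end event alone.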

read(size)

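In paused mode the program must call read() explicitly to get data. The read() method pulls some data out of the internal buffer and returns it; when no more data is available it returns null.

When read() is called with a size argument, it returns that many bytes. If size bytes are not available it returns null, unless the stream has ended, in which case it returns whatever remains in the buffer. If no size is given, it returns all the data in the internal buffer.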
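
These rules can be checked with a small sketch. It assumes a stream whose read() implementation does nothing and fills the buffer by hand with push(), which keeps the example deterministic; the variable names are illustrative only.

```javascript
const { Readable } = require('stream');

// A stream whose read() does nothing: the buffer is filled by hand.
const manual = new Readable({ read() {} });
manual.push('abcdef'); // 6 bytes now sit in the internal buffer

const firstFour = manual.read(4); // enough data: a 4-byte Buffer
const tooMany = manual.read(4);   // only 2 bytes left and the stream has not ended: null
const rest = manual.read();       // no size: whatever is buffered

console.log(firstFour.toString(), tooMany, rest.toString());
```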

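This raises a dilemma. In flowing mode the stream produces data and then fires the data event to notify the program, which is convenient. In paused mode the program has to fetch the data itself, so it may try to read before the data is ready, and polling would be rather inefficient.

NodeJS provides a readable event, which fires when the readable stream has data ready. We listen for this event first and read only after being notified that data is available: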

const rns = new RandomNumberStream(5);

rns.on('readable', () => {
  let chunk;
  while((chunk = rns.read()) !== null){
    console.log(chunk);
  }
});

This also lets us read the data. Note that not every call to read() returns data: as mentioned above, it returns null if the available data does not reach size, which is why the code checks for null.

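Can data be lost?

When I first started using flowing mode, I often worried about one thing: in the code above the readable stream starts producing data as soon as it is created, so could it produce some data and fire the readable event before we have attached our listener? In an extreme case, wouldn't the data at the beginning be lost?

In fact it is not. Under the NodeJS event loop, creating the stream and attaching the listener run in the same tick, while producing data involves an asynchronous operation and so happens in a later tick. However slow we are to attach the listener, we are still faster than data production, so no data is lost. In addition, a Readable does not call _read until a consumer starts reading, so nothing is produced at construction time.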
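
This can be checked empirically. The sketch below uses a hypothetical LazyStream that counts its _read() calls, and deliberately attaches the data listener 50 milliseconds after construction. Since a Readable does not invoke _read() until something starts consuming it, the count should still be zero at that point and no chunk should be missing.

```javascript
const { Readable } = require('stream');

// Hypothetical demo class: records how often _read() is invoked.
class LazyStream extends Readable {
    constructor() {
        super();
        this.items = ['a', 'b', 'c'];
        this.readCalls = 0;
    }

    _read() {
        this.readCalls += 1;
        this.push(this.items.length > 0 ? this.items.shift() : null);
    }
}

const lazy = new LazyStream();
const collected = [];
let readCallsBeforeListening = null;

// Attach the listener well after construction, on a later turn of the event loop.
setTimeout(() => {
    readCallsBeforeListening = lazy.readCalls;
    lazy.on('data', chunk => collected.push(chunk.toString()));
    lazy.on('end', () => {
        console.log('before listening:', readCallsBeforeListening, 'collected:', collected);
    });
}, 50);
```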

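At this point you probably still have questions about when the data and readable events fire, how much data each read() call returns, and when it returns null, because so far the stream is still a black box to us. After we introduce writable streams, we will explain these internals in detail, with reference to the source code, in the section on the back pressure mechanism. Stay tuned.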

source:cnblogs.com