This article shows how to use the string_decoder module in Node.js to convert a Buffer into a string. It is intended as a reference for anyone who needs it.
The string_decoder module is used to convert a Buffer into the corresponding string. You obtain the string for a buffer by calling stringDecoder.write(buffer).
What makes it special is that when the incoming buffer is incomplete (for example, only two bytes of a three-byte character are passed in), it keeps an internal buffer that caches the incomplete bytes. It then waits until the user calls stringDecoder.write(buffer) again with the remaining bytes, at which point the complete character is assembled.
This effectively avoids errors caused by incomplete buffers and is useful in many scenarios, such as parsing packet bodies in network requests.
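To make the difference concrete, here is a minimal sketch that splits the bytes of 你好 into two-byte chunks, roughly the way a network stream might deliver them, and decodes them both ways. The helper names (splitIntoChunks, decodeNaively, decodeWithStringDecoder) and the chunk size are assumptions chosen for illustration; they are not part of the string_decoder module.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: cut a buffer into fixed-size chunks to mimic
// data arriving in pieces (for example, from a socket).
function splitIntoChunks(buffer, chunkSize) {
  const chunks = [];
  for (let i = 0; i < buffer.length; i += chunkSize) {
    chunks.push(buffer.subarray(i, i + chunkSize));
  }
  return chunks;
}

// Decodes every chunk on its own, so a character split across two
// chunks is never seen as a whole.
function decodeNaively(chunks) {
  return chunks.map((chunk) => chunk.toString('utf8')).join('');
}

// Lets StringDecoder cache incomplete bytes between write() calls.
function decodeWithStringDecoder(chunks) {
  const decoder = new StringDecoder('utf8');
  let text = '';
  for (const chunk of chunks) {
    text += decoder.write(chunk);
  }
  return text + decoder.end();
}

// Buffer.from('你好') => <Buffer e4 bd a0 e5 a5 bd>, split as
// [e4 bd] [a0 e5] [a5 bd]
const chunks = splitIntoChunks(Buffer.from('你好'), 2);

console.log(decodeWithStringDecoder(chunks)); // 你好
console.log(decodeNaively(chunks) === '你好'); // false
```

The naive version fails because no chunk contains a complete three-byte character, while the decoder carries the leftover bytes over to the next write() call.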
The following examples demonstrate the usage of the two main APIs, decoder.write(buffer) and decoder.end([buffer]).
decoder.write(buffer) is called with the Buffer object <Buffer e4 bd a0> and returns the corresponding string, 你:
const StringDecoder = require('string_decoder').StringDecoder;
const decoder = new StringDecoder('utf8');
// Buffer.from('你') => <Buffer e4 bd a0>
const str = decoder.write(Buffer.from([0xe4, 0xbd, 0xa0]));
console.log(str); // 你
When decoder.end([buffer]) is called, whatever remains in the internal buffer is returned all at once. If a buffer argument is passed, the call is equivalent to calling decoder.write(buffer) followed by decoder.end().
const StringDecoder = require('string_decoder').StringDecoder;
const decoder = new StringDecoder('utf8');
// Buffer.from('你好') => <Buffer e4 bd a0 e5 a5 bd>
let str = decoder.write(Buffer.from([0xe4, 0xbd, 0xa0, 0xe5, 0xa5]));
console.log(str); // 你
str = decoder.end(Buffer.from([0xbd]));
console.log(str); // 好
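The equivalence described above can be checked directly. The sketch below feeds the same bytes to two decoders, one finishing with end(buffer) and the other with write(buffer) followed by end(); the variable names are illustrative only.

```javascript
const { StringDecoder } = require('string_decoder');

// 你 plus the first two of the three bytes of 好
const head = Buffer.from([0xe4, 0xbd, 0xa0, 0xe5, 0xa5]);
// The last byte of 好
const tail = Buffer.from([0xbd]);

// Variant A: pass the final bytes straight to end().
const decoderA = new StringDecoder('utf8');
const viaEnd = decoderA.write(head) + decoderA.end(tail);

// Variant B: write the final bytes, then call end() with no argument.
const decoderB = new StringDecoder('utf8');
const viaWriteThenEnd =
  decoderB.write(head) + decoderB.write(tail) + decoderB.end();

console.log(viaEnd); // 你好
console.log(viaEnd === viaWriteThenEnd); // true
```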
The next example shows how the string_decoder module handles bytes that arrive in separate pieces.
First, <Buffer e4 bd a0 e5 a5> is passed in. 好 is still one byte short, so decoder.write(xx) returns only 你.
Then decoder.write(Buffer.from([0xbd])) is called with the remaining byte, and 好 is returned successfully.
const StringDecoder = require('string_decoder').StringDecoder;
const decoder = new StringDecoder('utf8');
// Buffer.from('你好') => <Buffer e4 bd a0 e5 a5 bd>
let str = decoder.write(Buffer.from([0xe4, 0xbd, 0xa0, 0xe5, 0xa5]));
console.log(str); // 你
str = decoder.write(Buffer.from([0xbd]));
console.log(str); // 好
When decoder.end(xx) is called and the bytes passed in are still incomplete, � is returned; the corresponding buffer is <Buffer ef bf bd>.
const StringDecoder = require('string_decoder').StringDecoder;
// Buffer.from('好') => <Buffer e5 a5 bd>
let decoder = new StringDecoder('utf8');
let str = decoder.end(Buffer.from([0xe5]));
console.log(str); // �
console.log(Buffer.from(str)); // <Buffer ef bf bd>
The official documentation explains this case as follows (it says very little). It is essentially a convention: when a UTF-8 byte sequence is invalid or incomplete, it is replaced with the replacement character U+FFFD, whose UTF-8 encoding is ef bf bd.
Returns any remaining input stored in the internal buffer as a string. Bytes representing incomplete UTF-8 and UTF-16 characters will be replaced with substitution characters appropriate for the character encoding.
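One practical consequence is that a non-empty result from decoder.end() with no argument means some bytes were left unfinished. The sketch below uses that to flag truncated input; decodeChunks and its return shape are hypothetical names introduced for illustration, not part of string_decoder, and the check only covers incomplete bytes at the very end of the input.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: decode a list of Buffers and report whether the
// input ended in the middle of a character.
function decodeChunks(chunks) {
  const decoder = new StringDecoder('utf8');
  let text = '';
  for (const chunk of chunks) {
    text += decoder.write(chunk);
  }
  // end() returns '' when nothing is cached, or substitution
  // characters when incomplete bytes were still waiting.
  const flushed = decoder.end();
  return { text: text + flushed, truncated: flushed.length > 0 };
}

const complete = decodeChunks([Buffer.from([0xe4, 0xbd, 0xa0])]);
console.log(complete.truncated); // false

// 你 followed by only the first byte of 好
const cutOff = decodeChunks([
  Buffer.from([0xe4, 0xbd, 0xa0]),
  Buffer.from([0xe5]),
]);
console.log(cutOff.truncated); // true
```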
Related link: A UTF-8 character you should remember, "EF BF BD": http://liudanking.com/golang/utf-8_replacement_character/