최근 '대규모 WEB 서비스 개발 기술'이라는 책을 읽었습니다. 이 책에서는 데이터를 압축하고 디스크 IO를 줄이기 위해 데이터를 압축하는 "가변 길이 바이트코드 알고리즘" 알고리즘을 언급합니다.
가변 길이 바이트코드 알고리즘:
모든 바이트의 최상위 비트(첨자 7)는 플래그 비트로만 사용되며 바이트 위치에 해당하는 128의 거듭제곱을 곱해야 합니다.
신중한 연구 끝에 PHP 버전으로 번역했습니다:이것은 그의 의사 코드입니다. 🎜 >
<code><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><span><?php function codeNumber<span>(<span>$n</span>)</span>{ <span>$bytes</span> = []; while <span>(true)</span>{ array_unshift<span>(<span>$bytes</span>, bcmod<span>(<span>$n</span>, <span>128</span>)</span>)</span>; if<span>(<span>$n</span> 128</span>)</span>{ break; }else{ <span>$n</span> = intval<span>(<span>$n</span>/<span>128</span>)</span>; } } <span>$bytes</span>[count<span>(<span>$bytes</span>)</span> - <span>1</span>] += <span>128</span>; return <span>$bytes</span>; } function encode<span>(<span>$numbers</span>)</span>{ <span>$bytestream</span> = []; foreach <span>(<span>$numbers</span> as <span>$n</span>)</span>{ <span>$bytestream</span> = array_merge<span>(<span>$bytestream</span>, codeNumber<span>(<span>$n</span>)</span>)</span>; } return <span>$bytestream</span>; } function decode<span>(<span>$bytestream</span>)</span>{ <span>$numbers</span> = []; <span>$n</span> = <span>0</span>; for <span>(<span>$i</span> = <span>0</span>; <span>$i</span> (<span>$bytestream</span>)</span>; <span>$i</span>++)</span>{ if<span>(<span>$bytestream</span>[<span>$i</span>] 128</span>)</span>{ <span>$n</span> = <span>128</span> * <span>$n</span> + <span>$bytestream</span>[<span>$i</span>]; }else{ <span>$n</span> = <span>128</span> * <span>$n</span> + <span>(<span>$bytestream</span>[<span>$i</span>] - <span>128</span>)</span>; array_push<span>(<span>$numbers</span>, <span>$n</span>)</span>; <span>$n</span> = <span>0</span>; } } return <span>$numbers</span>; } <span>$a</span> = encode<span>([<span>5</span>, <span>130</span>, <span>288</span>])</span>; var_dump<span>(<span>$a</span>)</span>; var_dump<span>(decode<span>(<span>$a</span>)</span>)</span>; 打印出来的内容是: array<span>(<span>5</span>)</span> { [<span>0</span>]=> int<span>(<span>133</span>)</span> [<span>1</span>]=> string<span>(<span>1</span>)</span><span>"1"</span> [<span>2</span>]=> int<span>(<span>130</span>)</span> [<span>3</span>]=> string<span>(<span>1</span>)</span><span>"2"</span> [<span>4</span>]=> int<span>(<span>160</span>)</span> } array<span>(<span>3</span>)</span> { [<span>0</span>]=> int<span>(<span>5</span>)</span> [<span>1</span>]=> int<span>(<span>130</span>)</span> [<span>2</span>]=> int<span>(<span>288</span>)</span> } //写二进制 <span>$h</span> = fopen<span>(<span>'ejz3.txt'</span>, <span>'wb'</span>)</span>; foreach <span>(<span>$a</span> as <span>$k</span> => <span>$v</span>)</span> { <span>$str3</span> = pack<span>(<span>'H*'</span>, sprintf<span>(<span>"%02x"</span>, <span>$v</span>)</span>)</span>; fwrite<span>(<span>$h</span>, <span>$str3</span>)</span>; } fclose<span>(<span>$h</span>)</span>; //读二进制 <span>$str2</span> = file_get_contents<span>(<span>'ejz3.txt'</span>)</span>; <span>$str2</span> = unpack<span>(<span>"H*"</span>, <span>$str2</span>)</span>; <span>$value</span> = str_split<span>(<span>$str2</span>[<span>1</span>], <span>2</span>)</span>; foreach <span>(<span>$value</span> as <span>$k</span> => <span>$v</span>)</span> { <span>$value</span>[<span>$k</span>] = base_convert<span>(<span>$v</span>, <span>16</span>, <span>10</span>)</span>; } </span></span></span></span></span></span></span></span></span></span></span></span></span></span></code>