How to decode UTF-8 Base64 strings in JavaScript with `atob` while avoiding encoding errors


Mary-Kate Olsen

Released: 2024-10-31 21:08:29


How do you decode UTF-8 base64 strings in JavaScript using `atob` while avoiding encoding errors?

Decoding Base64 from common text sources with atob

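Using `atob` to decode an API response string from a service that produces UTF-8 output can result in errors or a garbled string. This is due to a limitation in how JavaScript handles Base64: `btoa` and `atob` treat every character as a single byte.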

<code class="js">const notOK = "✓";
console.log(btoa(notOK)); // throws InvalidCharacterError: "✓" is above 0xFF</code>
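
The two failure modes described above can be reproduced with a short sketch. It assumes an environment where `btoa` and `atob` are globals (browsers and recent Node.js versions); the variable names are illustrative only.

```javascript
// Failure mode 1: btoa rejects any character above 0xFF.
let encodeError = null;
try {
  btoa("✓");
} catch (err) {
  encodeError = err.name;
}
console.log(encodeError); // name of the thrown exception

// Failure mode 2: atob returns one character per byte, so UTF-8 text
// comes back as one character per byte instead of one per code point.
const garbled = atob("4pyTIMOgIGxhIG1vZGU="); // Base64 of the UTF-8 bytes of "✓ à la mode"
console.log(garbled.length); // 14 (bytes), while "✓ à la mode" has 11 characters
console.log(garbled === "✓ à la mode"); // false
```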


The Unicode problem

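Even after the errors in this case were addressed in ECMAScript, the "Unicode problem" remains: Base64 is a binary format, and `btoa` assumes that each character of its input occupies exactly one byte. Many Unicode characters need more than one byte to encode, so encoding them can fail.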

Source: MDN (2021)

<code class="js">const ok = "a";
console.log(ok.codePointAt(0).toString(16)); // "61": fits in 1 byte

const notOK = "✓";
console.log(notOK.codePointAt(0).toString(16)); // "2713": needs more than 1 byte (2 in UTF-16, 3 in UTF-8)</code>
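
To make the byte counts concrete, the following sketch compares the UTF-8 length and the UTF-16 length of a few characters. It assumes `TextEncoder` is available as a global (browsers and recent Node.js versions); the sample characters are arbitrary.

```javascript
const encoder = new TextEncoder(); // TextEncoder always produces UTF-8

for (const ch of ["a", "à", "✓", "😀"]) {
  const codePoint = ch.codePointAt(0).toString(16);
  const utf8Bytes = encoder.encode(ch).length; // bytes needed in UTF-8
  const utf16Units = ch.length;                // 16-bit code units in a JS string
  console.log(`U+${codePoint}: ${utf8Bytes} UTF-8 byte(s), ${utf16Units} UTF-16 code unit(s)`);
}
// "a" takes 1 byte, "à" 2, "✓" 3, and "😀" 4 bytes (and 2 UTF-16 code units)
```

Only the first of these fits the one-character-per-byte assumption that `btoa` makes about its input.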


Solution with binary interoperability

If you are not sure which solution to choose, this is probably the one you want. Read on for the ASCII Base64 solution.

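Consider a binary approach: convert the string to a binary representation before Base64-encoding it, and reverse the conversion after decoding. Note that the functions in this section serialize the string's UTF-16 code units as bytes, so the resulting Base64 is not the Base64 of the UTF-8 text, and both sides must use this same pair of functions.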

Encoding UTF-8 ⇢ binary

<code class="js">function toBinary(string) {
  // Copy each UTF-16 code unit of the string into a 16-bit array
  const codeUnits = new Uint16Array(string.length);
  for (let i = 0; i < codeUnits.length; i++) {
    codeUnits[i] = string.charCodeAt(i);
  }
  // View the same buffer as bytes so every character passed to btoa fits in 1 byte
  return btoa(String.fromCharCode(...new Uint8Array(codeUnits.buffer)));
}
const encoded = toBinary("✓ à la mode"); // "EycgAOAAIABsAGEAIABtAG8AZABlAA=="</code>


Decoding binary ⇢ UTF-8

<code class="js">function fromBinary(encoded) {
  // atob yields one character per byte; copy those bytes into a buffer
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  // View the bytes as UTF-16 code units and rebuild the string
  return String.fromCharCode(...new Uint16Array(bytes.buffer));
}
const decoded = fromBinary(encoded); // "✓ à la mode"</code>
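
One practical caveat: spreading a large typed array into `String.fromCharCode(...)` passes every element as a separate argument, which may throw a `RangeError` for long inputs in some engines. The sketch below is a hypothetical chunked variant of the same technique. The names `toBinaryChunked` and `fromBinaryChunked` and the chunk size are assumptions made for illustration and are not part of the original answer.

```javascript
// Hypothetical variant of toBinary that builds the byte string in chunks.
function toBinaryChunked(string, chunkSize = 0x8000) {
  const codeUnits = new Uint16Array(string.length);
  for (let i = 0; i < codeUnits.length; i++) {
    codeUnits[i] = string.charCodeAt(i);
  }
  const bytes = new Uint8Array(codeUnits.buffer);
  let binary = "";
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // apply() accepts array-like objects, including typed arrays
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}

// Hypothetical variant of fromBinary with the same chunking on the way back.
function fromBinaryChunked(encoded, chunkSize = 0x8000) {
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  const codeUnits = new Uint16Array(bytes.buffer);
  let result = "";
  for (let i = 0; i < codeUnits.length; i += chunkSize) {
    result += String.fromCharCode.apply(null, codeUnits.subarray(i, i + chunkSize));
  }
  return result;
}

const longText = "✓😀".repeat(100000); // 300,000 UTF-16 code units
const roundTripped = fromBinaryChunked(toBinaryChunked(longText));
console.log(roundTripped === longText); // true
```

Because the bytes are read straight from the `Uint16Array` buffer, their order follows the platform's endianness, which is little-endian on common hardware.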


Solution with ASCII Base64 interoperability

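A different approach keeps the text as UTF-8. It is recommended when you need ASCII Base64 interoperability: it works around the "Unicode problem" while staying compatible with plain-text Base64 strings, because the Base64 it produces is the Base64 of the UTF-8 bytes of the text.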

Encoding UTF-8 ⇢ ASCII Base64

<code class="js">function b64EncodeUnicode(str) {
    // Percent-encode the string as UTF-8, then turn each %XX escape into a single-byte character
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function(match, p1) {
            return String.fromCharCode(parseInt(p1, 16));
        }));
}
b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="</code>
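
To see why the percent-encoding trick works, it can help to look at the intermediate values. The sketch below walks through the same steps on a shorter input; the variable names are illustrative only.

```javascript
const text = "✓ à";

// Step 1: encodeURIComponent writes non-ASCII characters as UTF-8 percent escapes.
const percentEncoded = encodeURIComponent(text);
console.log(percentEncoded); // "%E2%9C%93%20%C3%A0"

// Step 2: replace each %XX escape with the character that has that byte value.
const byteString = percentEncoded.replace(/%([0-9A-F]{2})/g, (match, hex) =>
  String.fromCharCode(parseInt(hex, 16))
);
console.log(byteString.length); // 6: one character per UTF-8 byte

// Step 3: every character is now at most 0xFF, so btoa accepts the string.
const base64 = btoa(byteString);
console.log(base64); // "4pyTIMOg"
```

ASCII characters that `encodeURIComponent` leaves unescaped are already single bytes, so they pass through step 2 unchanged.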


Decoding ASCII Base64 ⇢ UTF-8

<code class="js">function b64DecodeUnicode(str) {
    // Convert byte array to percent-encoding, then decode
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}
b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"</code>
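
Where `TextEncoder` and `TextDecoder` are available (an assumption about the runtime; both exist in current browsers and Node.js), the same UTF-8 conversion can be sketched without the percent-encoding detour. The function names `utf8ToBase64` and `base64ToUtf8` are hypothetical and are not taken from the original answer.

```javascript
// Hypothetical helper: text -> UTF-8 bytes -> one character per byte -> Base64
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Hypothetical helper: Base64 -> one character per byte -> UTF-8 bytes -> text
function base64ToUtf8(encoded) {
  const binary = atob(encoded);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  // fatal: true makes decode() throw on invalid UTF-8 instead of inserting U+FFFD
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

const encodedText = utf8ToBase64("✓ à la mode");
console.log(encodedText);               // "4pyTIMOgIGxhIG1vZGU="
console.log(base64ToUtf8(encodedText)); // "✓ à la mode"
```

The output matches `b64EncodeUnicode` for this input, so the two approaches should be interchangeable on the wire.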


TypeScript support

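<code class="ts">function b64EncodeUnicode(str: string): string {
    // Percent-encode as UTF-8, then turn each %XX escape into a single-byte character
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match: string, p1: string) {
        return String.fromCharCode(parseInt(p1, 16));
    }));
}
function b64DecodeUnicode(str: string): string {
    // Turn each byte back into a %XX escape, then decode the escapes as UTF-8
    return decodeURIComponent(atob(str).split('').map(function(c: string) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);
    }).join(''));
}</code>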
