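Could someone who knows this well explain, in detail but in plain terms,
how unicode, utf-8, decode, and encode relate to one another in Python 2?
My understanding of this is still fuzzy, so any help would be appreciated. Thanks!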
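ASCII and Unicode are character sets: each assigns a number (a code point) to every character. UTF-8 is an encoding, that is, a rule for turning those code points into bytes.
UTF-8 is one of several encodings of the Unicode character set. In Python 2, a str holds encoded bytes and a unicode object holds code points: decode turns bytes into unicode, and encode turns unicode back into bytes.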
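To make the difference between a character set and an encoding concrete, here is a small sketch. It encodes the same two code points (U+4F60 and U+597D, the characters 你好) with three different encodings and gets three different byte sequences. The variable names are only illustrative, and the snippet is written with u'' and b'' literals so that it should run on both Python 2.7 and Python 3.

```python
# The same Unicode text, written with escapes so the source file stays ASCII.
text = u'\u4f60\u597d'

# Each encoding maps the same two code points to a different byte sequence.
utf8_bytes = text.encode('utf-8')       # 6 bytes: e4 bd a0 e5 a5 bd
gbk_bytes = text.encode('gbk')          # 4 bytes: c4 e3 ba c3
utf16_bytes = text.encode('utf-16-le')  # 4 bytes: 60 4f 7d 59

# Decoding with the matching encoding always recovers the same text.
assert utf8_bytes.decode('utf-8') == text
assert gbk_bytes.decode('gbk') == text
assert utf16_bytes.decode('utf-16-le') == text

print(len(text), len(utf8_bytes), len(gbk_bytes), len(utf16_bytes))
```

The text has two characters no matter how it is stored. Only the byte counts change with the encoding.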
<code class="language-python">In [1]: a = '你好'

In [2]: a
Out[2]: '\xe4\xbd\xa0\xe5\xa5\xbd'

In [3]: b = a.decode('utf-8')

In [4]: b
Out[4]: u'\u4f60\u597d'

In [5]: type(b)
Out[5]: unicode

In [6]: type(a)
Out[6]: str

In [7]: c = b.encode('utf-8')

In [8]: c
Out[8]: '\xe4\xbd\xa0\xe5\xa5\xbd'

In [9]: c == a
Out[9]: True
</code>
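The session above only works because the bytes really are UTF-8 and are decoded as UTF-8. As a rough sketch of what happens otherwise, the snippet below decodes the same bytes with three codecs. The helper try_decode is a hypothetical name made up for this example, not a library function. It should run on Python 2.7 and Python 3.

```python
# The UTF-8 bytes from the session above.
raw = b'\xe4\xbd\xa0\xe5\xa5\xbd'


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(raw, 'utf-8')      # correct codec: two characters
as_ascii = try_decode(raw, 'ascii')     # bytes >= 0x80 are not ASCII: None
as_latin1 = try_decode(raw, 'latin-1')  # no error, but six wrong characters

# Encoding with the same codec that was used to decode restores the bytes.
assert as_utf8.encode('utf-8') == raw
assert as_latin1.encode('latin-1') == raw
```

A wrong codec can fail loudly (ascii) or quietly produce garbage (latin-1), so the encoding has to be known rather than guessed. A related Python 2 pitfall: calling encode on a str that contains non-ASCII bytes first decodes it implicitly with the ascii codec, so a.encode('utf-8') in the session above would raise UnicodeDecodeError.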