Declaring UTF-8 Strings in Python Source Code
Consider the following code in Python 2:
<code class="python">u = unicode('d…')
s = u.encode('utf-8')
print s</code>
Running this code raises a SyntaxError: the source file contains a non-ASCII character, and Python 2 assumes ASCII source unless told otherwise. To fix this, declare the encoding in a comment on the first or second line of the file:
<code class="python"># -*- coding: utf-8 -*-
....</code>
This declaration tells Python to decode the source file as UTF-8, so the file itself must also be saved as UTF-8. Once it is in place, non-ASCII characters can appear in string literals and comments. For example:
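<code class="python"># -*- coding: utf-8 -*-
u = 'idzie wąż wąską dróżką'  # byte string holding the UTF-8 bytes from the file
uu = u.decode('utf8')         # decode those bytes into a unicode object
s = uu.encode('cp1250')       # re-encode the text as Windows-1250 bytes
print(s)</code>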
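To check which encoding a given declaration selects, the detection step can be reproduced with the standard library. The sketch below assumes Python 3, where `tokenize.detect_encoding` is available; the helper name `declared_encoding` and the sample byte strings are made up for illustration.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the source encoding Python would pick for these raw file bytes.

    Hypothetical helper for illustration; it wraps tokenize.detect_encoding,
    which reads at most the first two lines looking for a coding declaration.
    """
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


# Illustrative sources: one with a declaration, one without.
with_declaration = b"# -*- coding: cp1250 -*-\nx = 1\n"
without_declaration = b"x = 1\n"

print(declared_encoding(with_declaration))     # cp1250
print(declared_encoding(without_declaration))  # utf-8 (the Python 3 default)
```

Because only the first two lines are inspected, a declaration placed further down the file has no effect.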
In Python 3, UTF-8 is the default source encoding, so non-ASCII characters can be used without any declaration, provided the file is saved as UTF-8.
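As a rough Python 3 counterpart to the example above, the sketch below assumes the file is saved as UTF-8. The string literal is already Unicode text, so no `decode` step is needed before encoding; the variable names are chosen here for illustration.

```python
# No coding declaration needed: Python 3 reads source files as UTF-8 by default.
text = 'idzie wąż wąską dróżką'       # str is Unicode text in Python 3

utf8_bytes = text.encode('utf-8')      # each Polish letter takes 2 bytes
cp1250_bytes = text.encode('cp1250')   # each character takes 1 byte

print(len(text), len(utf8_bytes), len(cp1250_bytes))  # 22 29 22

# Decoding with the matching codec recovers the original text.
assert cp1250_bytes.decode('cp1250') == text
```

Unlike Python 2, `print(cp1250_bytes)` here would show a `bytes` literal rather than the text, so keep the `str` for display and encode only when writing bytes out.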