Using UTF-8 Encoding in Python Source Code
You may encounter errors about non-ASCII characters when a Python source file contains Unicode text such as accented letters. This happens because Python 2 assumes source files are ASCII unless an encoding is declared, and it raises a SyntaxError when it meets a non-ASCII byte.
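As a small illustration of why an ASCII default fails, the sketch below is written for Python 3 so that it runs on a current interpreter. The sample word and the variable names are illustrative only. It shows that the UTF-8 bytes of non-ASCII characters fall outside the ASCII range, so an ASCII decoder rejects them.

```python
# Illustrative sketch (Python 3): non-ASCII characters are saved in a UTF-8
# file as bytes >= 0x80, which the ASCII codec cannot decode.
raw = "wąż".encode("utf-8")  # the bytes an editor would write to disk

try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(raw)
print(ascii_ok)
```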
Declaring UTF-8 Strings
In Python 3, UTF-8 is the default source encoding, so you can directly use Unicode characters without any special declaration. However, in Python 2, you need to explicitly declare the UTF-8 encoding in the source file header using the following syntax:
# -*- coding: utf-8 -*-
Place this line on the first or second line of your Python 2 source file.
For example, consider the following Python 2 code:
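<code class="python"># -*- coding: utf-8 -*-
u = 'idzie wąż wąską dróżką'   # Python 2 str: the UTF-8 bytes stored in the file
uu = u.decode('utf8')          # bytes -> unicode object
s = uu.encode('cp1250')        # unicode -> CP1250-encoded bytes
print(s)</code>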
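Thanks to the declaration, Python 2 accepts the non-ASCII literal, which it stores in u as a UTF-8 encoded byte string. The code decodes those bytes into a Unicode string (uu) and then encodes that as a CP1250 byte string (s) for printing; the output displays correctly on a terminal that uses CP1250.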
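For comparison, here is a rough Python 3 counterpart of the example above, offered as a sketch rather than a drop-in replacement. In Python 3 a string literal is already Unicode text, so the decode step disappears. The variable names are illustrative.

```python
# Python 3 sketch: str literals are Unicode, so only the encode step remains.
text = 'idzie wąż wąską dróżką'

utf8_bytes = text.encode('utf-8')      # how the literal is stored in a UTF-8 file
cp1250_bytes = text.encode('cp1250')   # single-byte Central European encoding

# Decoding the CP1250 bytes again gives back the original text.
round_trip = cp1250_bytes.decode('cp1250')

print(round_trip == text)
print(len(text), len(utf8_bytes), len(cp1250_bytes))
```

CP1250 uses one byte per character, while UTF-8 needs two bytes for each of the accented letters here, so the UTF-8 form is the longer of the two.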
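Declaring the encoding tells Python how to decode the bytes of the source file, so non-ASCII characters are read correctly instead of raising an error. The declaration must appear on the first or second line of the file; the second line is permitted so that a #! interpreter line can come first. The file itself must also be saved as UTF-8, since the declaration describes the bytes on disk rather than converting them.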
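The placement rule can be checked with a short sketch. It assumes Python 3, whose standard-library function tokenize.detect_encoding looks for the same declaration on the first two lines only. The helper name declared_encoding and the sample sources are hypothetical. The samples declare cp1250 rather than utf-8 because Python 3 already defaults to UTF-8, which would make a detected declaration indistinguishable from the fallback.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python 3's tokenizer detects for raw source bytes.

    Hypothetical helper, for illustration only.
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


on_line_1 = b"# -*- coding: cp1250 -*-\nx = 1\n"
on_line_2 = b"#!/usr/bin/env python\n# -*- coding: cp1250 -*-\nx = 1\n"
on_line_3 = b"#!/usr/bin/env python\n# a comment\n# -*- coding: cp1250 -*-\nx = 1\n"

print(declared_encoding(on_line_1))  # cp1250
print(declared_encoding(on_line_2))  # cp1250
print(declared_encoding(on_line_3))  # utf-8 (declaration too late; default used)
```

Under Python 2 the fallback in the third case would be ASCII rather than UTF-8, which is why a misplaced declaration there leads back to the non-ASCII character error.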