Decoding UTF-8 Encoded URLs in Python
In Python, decoding a percent-encoded URL whose escaped bytes are UTF-8 is straightforward. Consider a URL string like "example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0" that needs to be decoded to "example.com?title=правовая+защита".
The key is to recognize how the string was encoded. Here the text was first encoded to UTF-8 bytes, and those bytes were then escaped with URL quoting (percent-encoding). To decode it, use Python's urllib.parse.unquote() function, which reverses both steps: it turns the percent escapes back into bytes and then decodes those bytes as text, using UTF-8 by default.
<code class="python">from urllib.parse import unquote

url = unquote(url)</code>
This code will decode the URL to its intended form:
example.com?title=правовая+защита
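The + in the result is left untouched because unquote() only reverses percent escapes. If the query string came from a form-style encoder, where + stands for a space, urllib.parse.unquote_plus() also converts it. A minimal sketch, assuming the percent-encoded form of the example URL (the variable names are illustrative only):

```python
from urllib.parse import unquote, unquote_plus

# Percent-encoded sample URL; the variable names here are illustrative.
url = (
    "example.com?title="
    "%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F"
    "+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0"
)

kept_plus = unquote(url)      # '+' is left as-is
as_space = unquote_plus(url)  # '+' is converted to a space

print(kept_plus)  # example.com?title=правовая+защита
print(as_space)   # example.com?title=правовая защита
```

Which variant is appropriate depends on how the URL was produced, so it is worth checking the source of the data before choosing one.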
For Python 2, the equivalent function is urllib.unquote(), but this returns a bytestring that requires manual decoding:
<code class="python">from urllib import unquote

url = unquote(url).decode('utf8')</code>
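For comparison, the same two explicit steps can be reproduced in Python 3 with urllib.parse.unquote_to_bytes(), which stops at the raw bytes. This is only a sketch to show what the one-call form does conceptually, not a replacement for it; the sample URL and variable names are assumptions for illustration.

```python
from urllib.parse import unquote, unquote_to_bytes

# Percent-encoded sample URL; the variable names here are illustrative.
url = (
    "example.com?title="
    "%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F"
    "+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0"
)

raw = unquote_to_bytes(url)  # step 1: percent escapes -> bytes
text = raw.decode("utf-8")   # step 2: UTF-8 bytes -> str

# For valid UTF-8 input, the two-step result matches the one-call form.
print(text == unquote(url))  # True
```

One difference to keep in mind: bytes.decode() is strict by default and raises UnicodeDecodeError on invalid UTF-8, whereas unquote() defaults to errors='replace' and substitutes a replacement character instead.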
By following these steps, you can effectively decode UTF-8 encoded URLs in Python, allowing you to access and utilize the intended data.