How to Decode UTF-8 URL Encoded Strings in Python 2.7?

Barbara Streisand
Release: 2024-11-04 07:33:02

URL Decode UTF-8 in Python

Problem: Given a URL whose non-ASCII characters are percent-encoded as UTF-8 bytes, how can it be decoded to its intended string representation in Python 2.7?

Solution:

The problem stems from the presence of UTF-8 encoded bytes that are escaped with URL quoting. To correctly decode this data, a two-step process is required:

  1. URL Decoding: Use urllib.unquote() in Python 2 (or urllib.parse.unquote() in Python 3) to convert the percent-escapes back to the bytes they represent.
  2. UTF-8 Decoding: In Python 2, unquote() returns a bytestring when given a str, so it must be explicitly converted to a text string using decode('utf8'). In Python 3, urllib.parse.unquote() already decodes the bytes as UTF-8 by default and returns text.
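<code class="python">from urllib.parse import unquote  # Python 3; in Python 2.7 use: from urllib import unquote

url = 'example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0'
# Python 3: unquote() percent-decodes and decodes the UTF-8 bytes in one call.
# Python 2.7: unquote() returns a bytestring, so use unquote(url).decode('utf8').
decoded_url = unquote(url)

# unquote() leaves '+' untouched; unquote_plus() would turn it into a space.
print(decoded_url)  # Output: example.com?title=правовая+защита</code>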

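In Python 3, the single unquote() call handles both steps: from percent-encoded data to UTF-8 bytes, and from those bytes to text. In Python 2.7, the second step is the explicit decode('utf8') on the bytestring that unquote() returns.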
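The two steps can also be written out explicitly so the same code runs on Python 2.7 and Python 3. The following is a minimal sketch, not a definitive implementation: the helper name url_decode_utf8 and its plus_as_space flag are hypothetical, and it assumes the input is a native str containing only ASCII characters and percent-escapes.

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3: unquote_to_bytes() stops after step 1 and returns bytes.
    from urllib.parse import unquote_to_bytes as _unquote_to_bytes
else:
    # Python 2.7: unquote() on a str already returns a bytestring.
    from urllib import unquote as _unquote_to_bytes


def url_decode_utf8(value, plus_as_space=False):
    """Hypothetical helper: percent-decode, then decode the bytes as UTF-8."""
    if plus_as_space:
        # In query strings '+' conventionally stands for a space. Replace it
        # before unquoting so an escaped plus (%2B) survives as a literal '+'.
        value = value.replace('+', ' ')
    raw = _unquote_to_bytes(value)  # step 1: percent-escapes -> raw bytes
    return raw.decode('utf-8')      # step 2: UTF-8 bytes -> text


url = 'example.com?title=%D0%BF%D1%80%D0%B0%D0%B2%D0%BE%D0%B2%D0%B0%D1%8F+%D0%B7%D0%B0%D1%89%D0%B8%D1%82%D0%B0'
decoded = url_decode_utf8(url)
decoded_with_spaces = url_decode_utf8(url, plus_as_space=True)
```

Here decoded keeps the '+' exactly as unquote() does, while decoded_with_spaces mirrors what unquote_plus() would return. Keeping the byte step separate also makes it easy to swap in a different codec if the escapes turn out not to be UTF-8.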

source:php.cn