Why are UTF-8 Characters Corrupted When Using `file_get_contents()`?

Susan Sarandon
Release: 2024-12-09 22:42:13

file_get_contents() Breaks UTF-8 Characters

The issue arises when loading HTML from an external server with UTF-8 encoding. Characters like ľ, š, č, ť, ž are corrupted and replaced with invalid characters.

The Root of the Problem

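The file_get_contents() function does not interpret or convert encodings; it returns the response body as a raw byte string. The corruption appears later, when those bytes are handled under a wrong assumption: the remote page may actually be served in an encoding other than UTF-8, or the script may output valid UTF-8 without declaring it, so the browser falls back to a different charset.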
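
To see what the returned string actually holds, a small sketch can help. It assumes the mbstring extension is loaded and uses a hard-coded two-byte string as a stand-in for a fetched body (the variable names are illustrative, not from the original question):

```php
<?php
// Stand-in for a fetched response body: the two bytes that encode "ľ" (U+013E) in UTF-8.
// In the article's scenario this string would come from file_get_contents().
$body = "\xC4\xBE";

// The string holds raw bytes; PHP attaches no encoding to it.
var_dump(strlen($body));                      // int(2): two bytes
var_dump(mb_strlen($body, 'UTF-8'));          // int(1): one character when read as UTF-8
var_dump(mb_check_encoding($body, 'UTF-8'));  // bool(true): the bytes are valid UTF-8

// Reading the same bytes as ISO-8859-1 and re-encoding them produces mojibake.
$misread = mb_convert_encoding($body, 'UTF-8', 'ISO-8859-1');
var_dump($misread === "\u{C4}\u{BE}");        // bool(true): "Ä¾" instead of "ľ"
```

The bytes never change; only the encoding they are read as does, which is why the fix belongs in the conversion or output step rather than in the fetch itself.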

Proposed Solution

To resolve this, make sure the data is converted to UTF-8 where necessary and declared as UTF-8 on output. The following approaches cover the common cases.

1. Manual Encoding Conversion

Use the mb_convert_encoding() function to convert the fetched HTML to UTF-8:

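$html = file_get_contents('http://example.com/foreign.html');
// Candidate list is an assumption (adjust it to the source site); fall back to UTF-8 if detection fails.
$encoding = mb_detect_encoding($html, ['UTF-8', 'ISO-8859-2'], true) ?: 'UTF-8';
$utf8_html = mb_convert_encoding($html, 'UTF-8', $encoding);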
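
Detection by guessing is unreliable for single-byte encodings, so it may be preferable to use the charset the server declares. The following is a minimal sketch under that assumption. The helper names declaredCharset() and toUtf8() are hypothetical, and the header array and body are hard-coded stand-ins for a real response:

```php
<?php
/**
 * Hypothetical helper: find the charset a response declares.
 * Checks Content-Type header lines first, then a <meta ... charset=...> tag.
 */
function declaredCharset(array $headers, string $html): ?string
{
    foreach ($headers as $line) {
        if (preg_match('/^Content-Type:.*?charset=["\']?([\w.:-]+)/i', $line, $m)) {
            return strtoupper($m[1]);
        }
    }
    // Covers <meta charset="..."> and the http-equiv form (content="...; charset=...").
    if (preg_match('/<meta[^>]+charset=["\']?([\w.:-]+)/i', $html, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

/** Hypothetical helper: convert a body to UTF-8 using the declared charset. */
function toUtf8(string $html, array $headers): string
{
    // Assumption: treat the body as UTF-8 when nothing is declared.
    $charset = declaredCharset($headers, $html) ?? 'UTF-8';
    if ($charset === 'UTF-8') {
        return $html;
    }
    return mb_convert_encoding($html, 'UTF-8', $charset);
}

// Stand-in data: "ľš" encoded as ISO-8859-2 (bytes 0xB5 and 0xB9).
$headers = ['HTTP/1.1 200 OK', 'Content-Type: text/html; charset=iso-8859-2'];
$body = "<p>\xB5\xB9</p>";
echo toUtf8($body, $headers), PHP_EOL;
```

With a real request, the header lines would come from the HTTP wrapper (for example the $http_response_header variable it populates after file_get_contents()). Note that mb_convert_encoding() rejects charset names mbstring does not know, so a production version would need a fallback for those.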

2. Output Encoding

Ensure the browser reads the output as UTF-8 by sending the following header before any other output:

header('Content-Type: text/html; charset=UTF-8');

3. HTML Entity Conversion

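Convert the fetched HTML to HTML entities before outputting it. Note that htmlentities() also escapes the markup characters (<, >, &, "), so the page is shown as source text rather than rendered, and that it returns an empty string if the input is not valid UTF-8: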

$html = file_get_contents('http://example.com/foreign.html');
$html_entities = htmlentities($html, ENT_COMPAT, 'UTF-8');
echo $html_entities;
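
If the goal is to keep the markup intact and only make the non-ASCII characters encoding-independent, one alternative (a different technique from the htmlentities() call above, not part of the original answer) is mb_encode_numericentity(). The sketch below assumes the input is already valid UTF-8 and uses a hard-coded string in place of the fetched page:

```php
<?php
// Stand-in for fetched UTF-8 HTML containing "ľš".
$html = "<p>\u{13E}\u{161}</p>";

// htmlentities() escapes the tags too, so they show up as literal text.
$escaped = htmlentities($html, ENT_COMPAT, 'UTF-8');

// Convert only code points U+0080 and above to numeric entities; tags stay intact.
// Convmap format: [first code point, last code point, offset, mask].
$convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
$entities = mb_encode_numericentity($html, $convmap, 'UTF-8');

echo $escaped, PHP_EOL;
echo $entities, PHP_EOL; // <p>&#318;&#353;</p>
```

Because the result is pure ASCII, it renders the same regardless of which charset the browser assumes.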

4. JSON Decoding

If the external server returns the HTML wrapped in JSON, decode it with the json_decode() function (the input must be valid UTF-8):

$json = file_get_contents('http://example.com/foreign.html');
$decoded_json = json_decode($json, true);
$html = $decoded_json['html'];
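
json_decode() returns null on failure, which is easy to miss. A slightly more defensive sketch might look like this. The helper name htmlFromJson() is hypothetical, the 'html' key follows the snippet above, and the JSON string is a hard-coded stand-in for the fetched body:

```php
<?php
/** Hypothetical helper: pull the "html" field out of a JSON response body. */
function htmlFromJson(string $json): string
{
    // JSON_THROW_ON_ERROR (PHP 7.3+) raises JsonException on malformed or non-UTF-8 input.
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data) || !isset($data['html']) || !is_string($data['html'])) {
        throw new UnexpectedValueException('Response has no string "html" field');
    }
    return $data['html'];
}

// Stand-in for the fetched body; \u013e is the JSON escape for "ľ".
$json = '{"html":"<p>\u013e</p>"}';
echo htmlFromJson($json), PHP_EOL;
```

A JsonException here usually means the body is not UTF-8, in which case it should be converted before decoding.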

With these techniques, you can avoid the encoding problems that surface after fetching content with file_get_contents() and make sure UTF-8 characters display correctly.

source:php.cn