Garbled Chinese text in web pages fetched with curl
At the moment I use this function to convert the encoding:
function safeEncoding($str){
    $code=mb_detect_encoding($str,array('ASCII','GB2312','GBK','UTF-8')); // detect the string's encoding
    if($code=="CP936"){
        $result=$str;
    }
    else{
        //$result=mb_convert_encoding($str,'UTF-8',$code); // convert from encoding $code to UTF-8
        $result=iconv($code,"UTF-8",$str);
    }
    return $result;
}
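But there are still problems: some pages whose charset clearly says gb2312 still come back garbled. Could an expert explain how to convert curl output reliably, that is, how to turn common web page encodings such as GBK and GB2312 into UTF-8?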
------Solution--------------------
Try converting the encoding of your file!
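------Solution--------------------
EUC-CN is the most common encoded form of GB 2312. The "GB2312" entry in a browser's encoding menu usually means the EUC-CN form.
But that is not the key point!
What you need is to convert common web page encodings such as GBK and GB2312 to UTF-8,
and that is exactly what your code fails to do: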
if (!mb_check_encoding($str, 'utf-8')) {
    $str = mb_convert_encoding($str, 'UTF-8', 'gbk');
}
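
For context, here is a minimal sketch of how that check might sit inside a complete curl fetch. The function names toUtf8() and fetchAsUtf8() are hypothetical and do not come from the thread. The sketch assumes the mbstring and curl extensions are loaded, and that any body which is not valid UTF-8 is GBK (a superset of GB2312), as in the reply above. Setting CURLOPT_ENCODING to an empty string is an extra precaution, not something the thread mentions: it asks curl to decompress gzip/deflate responses, since a compressed body that is never decoded also looks like garbage.

```php
<?php
// Sketch only: toUtf8() and fetchAsUtf8() are hypothetical helper names.
// Requires the mbstring and curl extensions.

/** Return $str as UTF-8, assuming any non-UTF-8 input is GBK/GB2312. */
function toUtf8(string $str): string
{
    // Already valid UTF-8 (this includes plain ASCII): leave it untouched.
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str;
    }
    // Assumption: everything else is GBK, which also covers GB2312 pages.
    return mb_convert_encoding($str, 'UTF-8', 'GBK');
}

/** Fetch a page with curl and normalise the body to UTF-8. */
function fetchAsUtf8(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_ENCODING, '');         // let curl decode compressed responses
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl request failed: ' . curl_error($ch));
    }
    return toUtf8($body);
}
```

Unlike safeEncoding() in the question, this never returns GBK bytes unchanged: the input is either already UTF-8 or it gets converted. Pages in other legacy encodings (Big5, Shift_JIS and so on) would need their declared charset read from the Content-Type header or the meta tag, which this sketch does not attempt.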
