PHP iconv() encoding conversion error: Detected an illegal character
Prototype: string iconv ( string $in_charset , string $out_charset , string $str )
Pay particular attention to the second parameter, $out_charset: the output charset.
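When iconv() is asked to convert a character that the output encoding does not support, for example iconv('utf-8', 'gb2312', $str) where $str contains a character outside GB2312, it reports: Notice: iconv() [function.iconv]: Detected an illegal character in input string ...
GB2312 covers Simplified Chinese only. It does not include the more complex Chinese characters (traditional forms, for instance) or a number of special characters, so the conversion fails. There are two ways around this:
1. Widen the output encoding, e.g. iconv('utf-8', 'gbk', $str). This converts correctly because GBK covers a much larger character range.
2. Append "//ignore" to the output encoding, e.g. iconv('utf-8', 'gb2312//ignore', $str). This skips the characters that cannot be converted, which avoids the error but means those characters are missing from the output (they come out blank).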
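The two options above can be sketched as follows. The sample character "龍" is an assumption chosen for illustration (a traditional character that GBK includes and GB2312 does not); it is not taken from the original article. The result of the //IGNORE call is deliberately not asserted, because it depends on the iconv library PHP was built against.

```php
<?php
// Assumption: "龍" stands in for any character that GB2312 lacks but GBK has.
$input = '龍';

// Option 1: widen the target charset. GBK can represent the character.
$gbk = iconv('UTF-8', 'GBK', $input);
var_dump(strlen($gbk));        // GBK stores each non-ASCII character in 2 bytes

// Converting back recovers the original UTF-8 string.
$back = iconv('GBK', 'UTF-8', $gbk);
var_dump($back === $input);

// Option 2: keep GB2312 and ask iconv to skip what it cannot represent.
// The "@" hides the notice; depending on the underlying iconv library the
// call may return the string without the character, or false.
$ignored = @iconv('UTF-8', 'GB2312//IGNORE', 'a' . $input . 'b');
var_dump($ignored);
```

Option 1 keeps every character, so it is the safer choice whenever the consumer of the output can accept GBK.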
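Here is how to deal with the iconv() "detected an illegal character in input string" error:
$str = iconv('utf-8', 'gbk//ignore', unescape(isset($_GET['str']) ? $_GET['str'] : ''));
(unescape() is not a PHP built-in; it is assumed here to be a user-defined decoding helper.)
In my local tests, //ignore skipped the characters it did not recognize and carried on converting without reporting an error, whereas //translit cut the string off at the unrecognized character, dropped everything after it, and reported an error. //ignore is what I needed.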
I then found the following article online, which shows that mb_convert_encoding also works, although it is slower than iconv. It compares the two functions for converting string encodings:
iconv: convert string to requested character encoding (PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding: convert character encoding (PHP 4 >= 4.0.6, PHP 5)
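Usage:
string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )
The mbstring extension must be enabled first: in php.ini, remove the leading ; from the line ; extension=php_mbstring.dll
string iconv ( string in_charset, string out_charset, string str )
Note: besides naming the target encoding, the second parameter can carry one of two suffixes, //translit or //ignore. //translit replaces a character that cannot be converted directly with one or more similar characters; //ignore drops characters that cannot be converted. Without either suffix, the string is cut off at the first illegal character.
Returns the converted string, or false on failure.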
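Because a failed conversion only raises a notice and returns false, it is easy to miss. A minimal sketch of one way to make failures explicit is shown below; convert_strict() is a hypothetical helper name introduced here, not part of PHP or of the original article.

```php
<?php
/**
 * Hypothetical helper: convert $str with iconv() and throw an exception
 * instead of quietly returning false.
 */
function convert_strict(string $str, string $from, string $to): string
{
    // Temporarily turn any notice or warning raised by iconv() into an exception.
    set_error_handler(static function (int $severity, string $message) {
        throw new RuntimeException($message);
    });

    try {
        $result = iconv($from, $to, $str);
    } finally {
        // Always put the previous error handler back, even when we throw.
        restore_error_handler();
    }

    if ($result === false) {
        throw new RuntimeException("iconv() failed converting from $from to $to");
    }

    return $result;
}

// Plain ASCII converts unchanged.
var_dump(convert_strict('abc', 'UTF-8', 'GBK'));

// "\xff" is never valid in UTF-8, so this input is rejected.
try {
    convert_strict("\xff", 'UTF-8', 'GBK');
} catch (RuntimeException $e) {
    echo 'Conversion failed: ', $e->getMessage(), PHP_EOL;
}
```

Throwing keeps a bad conversion from being stored or displayed as an empty or truncated string further down the line.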
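In practice:
1. iconv fails when converting the dash character "-" to gb2312 (presumably a full-width dash such as U+2014 rather than the ASCII hyphen, which converts without problems). Without the ignore suffix, everything after that character is lost. Either way, the dash itself is never converted and never appears in the output. mb_convert_encoding does not have this bug.
2. mb_convert_encoding accepts several candidate input encodings and picks one automatically based on the content, but it is much slower than iconv. For example: $str = mb_convert_encoding($str, "euc-jp", "ascii,jis,euc-jp,sjis,utf-8"); Note that the order of the list "ascii,jis,euc-jp,sjis,utf-8" affects the result.
3. In general, use iconv. Reach for mb_convert_encoding only when the source encoding cannot be determined, or when the output of iconv does not display correctly.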
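The rule of thumb in point 3 can be sketched as a small wrapper that tries iconv first and only falls back to mb_convert_encoding when iconv reports failure. convert_with_fallback() is a hypothetical name introduced for this illustration, and the sketch assumes both the iconv and mbstring extensions are loaded.

```php
<?php
/**
 * Hypothetical helper: prefer iconv(), fall back to mb_convert_encoding().
 */
function convert_with_fallback(string $str, string $from, string $to): string
{
    // "@" hides the notice; failure is detected through the false return value.
    $result = @iconv($from, $to, $str);
    if ($result !== false) {
        return $result;
    }

    // Note the different argument order: target encoding first, source second.
    $fallback = mb_convert_encoding($str, $to, $from);
    if ($fallback === false) {
        throw new RuntimeException("Could not convert from $from to $to");
    }

    return $fallback;
}

// Assumption: "龍" is just a sample character that GBK can represent.
$gbk  = convert_with_fallback('龍', 'UTF-8', 'GBK');
$utf8 = convert_with_fallback($gbk, 'GBK', 'UTF-8');
var_dump($utf8 === '龍');
```

The fallback path does not fail on unconvertible input the way iconv does; mbstring substitutes a placeholder character instead, so the result should still be checked before it is relied on.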
from_encoding is specified by character code name before conversion. It can be an array or a string containing a comma-separated list. If it is not specified, the internal encoding will be used.
$str = mb_convert_encoding($str, "ucs-2le", "jis, eucjp-win, sjis-win");
$str = mb_convert_encoding($str, "euc-jp", "auto");