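Uploading files in PHP is about as basic as it gets, but there are still plenty of problems to solve once you dig in. Case in point: after uploading a file with a Chinese name, the file name came out garbled.
Here is the code that has the problem. It is very simple: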
    <html>
    <body>

    <form action="upload_file.php" method="post"
    enctype="multipart/form-data">
    <label for="file">Filename:</label>
    <input type="file" name="file" id="file" />
    <br />
    <input type="submit" name="submit" value="Submit" />
    </form>

    </body>
    </html>
    <?php
    if ($_FILES["file"]["error"] > 0)
    {
        echo "Return Code: " . $_FILES["file"]["error"] . "<br />";
    }
    else
    {
        echo "Upload: " . $_FILES["file"]["name"] . "<br />";
        echo "Type: " . $_FILES["file"]["type"] . "<br />";
        echo "Size: " . ($_FILES["file"]["size"] / 1024) . " Kb<br />";
        echo "Temp file: " . $_FILES["file"]["tmp_name"] . "<br />";

        if (file_exists("upload/" . $_FILES["file"]["name"]))
        {
            echo $_FILES["file"]["name"] . " already exists. ";
        }
        else
        {
            move_uploaded_file($_FILES["file"]["tmp_name"],
                "upload/" . $_FILES["file"]["name"]);
        }
    }
I uploaded a file named "测试数据.txt". Uh-oh: the file was uploaded, but its name was garbled.
A search online turned up a fix: change

    move_uploaded_file($_FILES["file"]["tmp_name"], "upload/" . $_FILES["file"]["name"]);

to

    move_uploaded_file($_FILES["file"]["tmp_name"], "upload/" . iconv("UTF-8", "gbk", $_FILES["file"]["name"]));

It turned out that iconv returned false.
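Checking the function manual, I found that the second parameter has a special usage. Roughly translated: you can append //TRANSLIT or //IGNORE to the target encoding. The former transliterates characters that cannot be represented into the closest matching ones; the latter simply drops the characters that cannot be converted.
Let's try it: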
    var_dump(iconv("UTF-8", "gbk//TRANSLIT", $_FILES["file"]["name"]));
    var_dump(iconv("UTF-8", "gbk//IGNORE", $_FILES["file"]["name"]));

Result:

    bool(false)
    string(4) ".txt"
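In other words, none of the Chinese characters could be converted, not even to something close. So the method described online is not a cure-all.
My next guess was that my system might garble Chinese names whenever it creates a file, so I changed the code to:

    move_uploaded_file($_FILES["file"]["tmp_name"], "upload/测试数据.txt");

The file was created with the correct name, no garbling... so the system is not the problem.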
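Thinking it over: my PHP file itself is saved as UTF-8, so the statement

    move_uploaded_file($_FILES["file"]["tmp_name"], "upload/测试数据.txt");

must be passing a UTF-8 string. Since that name came out fine, the uploaded file name that got garbled earlier must not be UTF-8. That makes the following statement wrong, because the source string is not UTF-8 in the first place:

    iconv("UTF-8", "gbk//TRANSLIT", $_FILES["file"]["name"]);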
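One way to confirm this reasoning is to check whether the bytes are valid UTF-8 at all. The following is only a minimal sketch, assuming the mbstring extension is loaded and the script is saved as UTF-8. `$gbkName` is a hypothetical stand-in for `$_FILES["file"]["name"]`, built by converting a UTF-8 literal to GBK to imitate what the browser sent.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is available.
$utf8Name = "测试数据.txt";

// Simulate the name as it arrived in the upload: the same text encoded as GBK.
$gbkName = mb_convert_encoding($utf8Name, "GBK", "UTF-8");

// The GBK bytes do not form a valid UTF-8 sequence, so declaring the input
// as UTF-8 leaves iconv with nothing it can convert.
var_dump(mb_check_encoding($gbkName, "UTF-8")); // bool(false)
var_dump(mb_check_encoding($gbkName, "GBK"));   // bool(true)

// Converting in the other direction restores the original text.
var_dump(mb_convert_encoding($gbkName, "UTF-8", "GBK") === $utf8Name); // bool(true)
```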
Use a function to check the encoding of the source string:

    $e = mb_detect_encoding($_FILES["file"]["name"], array('UTF-8', 'GBK', 'gb2312'));
    echo $e;

The result is CP936, which means the source string is encoded as GBK.
Let's try

    move_uploaded_file($_FILES["file"]["tmp_name"], "upload/" . iconv("gbk", "UTF-8", $_FILES["file"]["name"]));

Problem solved: the file name is no longer garbled.
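The conversion can be made a little more defensive by converting only when the name is not already UTF-8. This is a minimal sketch, not a definitive implementation: `toUtf8FileName` is a hypothetical helper (not part of PHP), it needs the mbstring extension, and it assumes that any name which is not valid UTF-8 was sent as GBK.

```php
<?php
// Hypothetical helper (not part of PHP): return an uploaded file name as UTF-8.
// Assumption: a name that is not already valid UTF-8 was sent as GBK.
function toUtf8FileName(string $name): string
{
    if (mb_check_encoding($name, "UTF-8")) {
        return $name; // already UTF-8 (or plain ASCII), leave it untouched
    }
    return mb_convert_encoding($name, "UTF-8", "GBK");
}

// Usage in upload_file.php would look like:
//   move_uploaded_file($_FILES["file"]["tmp_name"],
//       "upload/" . toUtf8FileName($_FILES["file"]["name"]));
```

The check matters because converting unconditionally from GBK would mangle a name that already arrives as UTF-8, for example when the form page declares a UTF-8 charset.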
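There is actually another solution: add the following inside the head tag of the HTML file

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

With the page declared as UTF-8, the browser submits the form, file name included, as UTF-8. The encoding then stays consistent from end to end and no conversion is needed.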
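If the form page is itself served by PHP, the charset can also be declared in the HTTP header. The sketch below assumes a hypothetical `upload_form.php` that replaces the static HTML page; `uploadFormHtml` is an illustrative name, not something from the original code.

```php
<?php
// Hypothetical upload_form.php: serves the upload form with an explicit
// UTF-8 charset so the browser encodes the submitted file name as UTF-8.
function uploadFormHtml(): string
{
    return <<<HTML
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<form action="upload_file.php" method="post" enctype="multipart/form-data">
<label for="file">Filename:</label>
<input type="file" name="file" id="file" />
<br />
<input type="submit" name="submit" value="Submit" />
</form>
</body>
</html>
HTML;
}

// Declare the charset in the HTTP header too, if headers can still be sent.
if (!headers_sent()) {
    header("Content-Type: text/html; charset=utf-8");
}
echo uploadFormHtml();
```

With this in place, `upload_file.php` can use `$_FILES["file"]["name"]` directly, as long as the target file system expects UTF-8 names as it did in the test above.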