For example, say you want to download the file at this address:
http://torrent.google.com.btba.xinqiys.com/upload/2016/08/26/【BT吧】[720p]-龙狼血战-6.56GB.torrent
If you request it directly, you get a 404.
It has to be converted to
http://torrent.google.com.btba.xinqiys.com/upload/2016/08/26/%E3%80%90BT%E5%90%A7%E3%80%91%5B720p%5D-%E9%BE%99%E7%8B%BC%E8%A1%80%E6%88%98-6.56GB.torrent
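before it can be fetched.
In other words, an address containing Chinese characters does not work; the Chinese has to be encoded first.
Is there a convenient way to do this conversion?
I did this in PHP before. Now I'm on nodejs and it feels like a hassle; a ready-made module would be ideal.
Update: I wrote one myself: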
function url_encode(url){
    // Percent-encode everything (as UTF-8), then restore the URL separators.
    url = encodeURIComponent(url);
    url = url.replace(/\%3A/g, ":");
    url = url.replace(/\%2F/g, "/");
    url = url.replace(/\%3F/g, "?");
    url = url.replace(/\%3D/g, "=");
    url = url.replace(/\%26/g, "&");
    return url;
}
Here is the PHP solution I used before:
function cnurl($url){
    global $_G;
    if(ischinese($url) != 'encn') return $url;
    $_G['cn_charset'] = $_G['cn_charset'] ? $_G['cn_charset'] : $_G['cache']['evn_milu_pick']['charset'];
    if(!$_G['cn_charset']){
        $content = get_contents($url);
        $_G['cn_charset'] = strtoupper(get_charset($content));
    }
    $url = url_unescape($url);
    $url_info = parse_url($url);
    $url_query = $url_info['query'];
    parse_str($url_query, $url_arr);
    $args_arr = array();
    if($url_arr){
        foreach((array)$url_arr as $k => $v){
            $v = cnurl_format($v);
            $args_arr[] = $k.'='.$v;
        }
        $args_str = implode('&', $args_arr);
        $url = str_replace($url_query, $args_str, $url);
    }else{
        return cnurl_format($url);
    }
    return $url;
}
function cnurl_format($str){
    global $_G;
    $str = trim($str);
    if(!$str) return;
    $str = url_unescape($str);
    if(ischinese($str) == 'allen') return $str;
    $str = piconv($str, CHARSET, $_G['cn_charset']);
    return preg_replace(array('/\%3A/i', '/\%2F/i' , '/\%3F/i', '/\%3D/i', '/\%26/i'), array(':', '/', '?', '=', '&'), rawurlencode($str) );
}
function url_unescape($str) {
    $str = rawurldecode($str);
    preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U",$str,$r);
    $ar = $r[0];
    foreach($ar as $k=>$v) {
        if(substr($v,0,2) == "%u"){
            $ar[$k] = iconv("UCS-2","GB2312",pack("H4",substr($v,-4)));
        }elseif(substr($v,0,3) == "&#x"){
            $ar[$k] = iconv("UCS-2","UTF-8",pack("H4",substr($v,3,-1)));
        }elseif(substr($v,0,2) == "&#") {
            $ar[$k] = iconv("UCS-2","UTF-8",pack("n",substr($v,2,-1)));
        }
    }
    return join("",$ar);
}
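It felt complicated in PHP because if the target site's encoding differs from your own, the converted Chinese comes out wrong. For example, if you urlencode UTF-8 Chinese text and then request that address, it may fail, because the other site may be using GBK.
So if you don't know the other side's encoding, this problem is quite tricky.
But when I tried it in JS, it didn't seem as complicated as in PHP; at least no charset handling was needed. I don't fully understand why yet.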
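One likely reason JS feels simpler here: encodeURI() and encodeURIComponent() always percent-encode the UTF-8 bytes of a string, so there is no charset to choose. That does not help if the target server expects GBK-encoded URLs; a separate charset conversion would still be needed in that case. A minimal sketch of the UTF-8 behavior, assuming Node.js (the helper name percentEncodeUtf8 is made up for illustration):

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of a string by hand.
function percentEncodeUtf8(str) {
  return Array.from(Buffer.from(str, 'utf8'))
    .map((byte) => '%' + byte.toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

const sample = '龙狼血战';

// Both lines print the same escape sequence, because the built-in
// function works on the UTF-8 bytes of the string.
console.log(encodeURIComponent(sample));
console.log(percentEncodeUtf8(sample));
```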
There are two functions in JS that encode URIs: encodeURI() and encodeURIComponent(). You can just use the encodeURI() function directly. The encodeURIComponent() function also escapes the punctuation marks used to separate the parts of a URI, for example '/'.
Looking at your format, it seems to be urlencode. In JS, use encodeURIComponent.
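The difference between the two functions can be seen on the URL from the question. This is only a sketch, assuming Node.js and an input that is not already percent-encoded (an existing % would be escaped again as %25):

```javascript
// URL from the question, with raw Chinese characters in the file name.
const raw = 'http://torrent.google.com.btba.xinqiys.com/upload/2016/08/26/【BT吧】[720p]-龙狼血战-6.56GB.torrent';

// encodeURI keeps the URL separators (: / ? = &) and escapes the rest as UTF-8.
const wholeUrl = encodeURI(raw);

// encodeURIComponent escapes the separators too, so it suits a single
// path segment or query value rather than a whole URL.
const fileName = raw.slice(raw.lastIndexOf('/') + 1);
const encodedName = encodeURIComponent(fileName);

console.log(wholeUrl);
console.log(encodedName);
console.log(encodeURIComponent('a/b')); // the slash is escaped here
```

For a whole URL like this one, encodeURI() alone appears to be enough; encodeURIComponent() is the safer choice when building a URL from separate pieces.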
Convert it using a regular expression.
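One way to read the regular-expression suggestion is to escape only the runs of non-ASCII characters and leave everything else alone. A rough sketch under that assumption (the function name encodeNonAscii and the example.com URL are made up for illustration):

```javascript
// Hypothetical helper: percent-encode only runs of non-ASCII characters.
// ASCII characters, including existing %XX escapes, are left untouched.
function encodeNonAscii(url) {
  return url.replace(/[^\x00-\x7F]+/g, (run) => encodeURIComponent(run));
}

// The query value is already escaped and is not encoded a second time.
const mixed = 'http://example.com/dir/龙狼.torrent?k=%E6%88%98';
console.log(encodeNonAscii(mixed));
```

Unlike encodeURI(), this leaves ASCII characters such as '[', ']' and spaces as they are, so it may not be enough for servers that reject those characters.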