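Requirement:
<code>The client sends a string, and I need to match every URL in it, including everything after the domain name or IP address (the port, path and query parameters).</code>
Sample URLs:
<code>http://127.0.0.1/metinfo/img/img.php?class1=1&serch_sql=%201=if%28ascii%28substr%28user%28%29,1,1%29%29=114,1,2%29%23 or http://www.baidu.com/metinfo/img/img.php?class1=1&serch_sql=%201=if%28ascii%28substr%28user%28%29,1,1%29%29=114,1,2%29%23 </code>
Simple URLs must of course be matched as well.
What regex would do this?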
First match all candidate URLs with a fairly permissive regex, for example:
<code>https?:\/\/\S+</code>
Then run each of these candidates through the parse_url function in turn.
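The two steps above can be sketched in Java, the language used elsewhere in this thread. This is only an illustration under assumptions: parse_url is a PHP function, and java.net.URI stands in for it here; the class name UrlCandidates and the method extract are hypothetical names, not part of any library.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: permissive regex first, real URI parser second.
class UrlCandidates {
    // Step 1: everything from the scheme up to the next whitespace character.
    private static final Pattern CANDIDATE = Pattern.compile("https?://\\S+");

    /** Returns the candidates that java.net.URI accepts and that have a host. */
    static List<URI> extract(String text) {
        List<URI> result = new ArrayList<>();
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            try {
                // Step 2: let the parser validate the candidate and split it
                // into host, port, path and query.
                URI uri = new URI(m.group());
                if (uri.getHost() != null) {
                    result.add(uri);
                }
            } catch (URISyntaxException e) {
                // Not a well-formed URI: drop this candidate.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "a http://127.0.0.1:6666/ b https://www.baidu.com/img.php?class1=1 c";
        for (URI uri : extract(text)) {
            System.out.println(uri.getHost() + " port=" + uri.getPort() + " query=" + uri.getRawQuery());
        }
    }
}
```

The permissive pattern keeps the regex simple, and the parser takes over the harder job of deciding what is a valid host, port and query. Note that `\S+` only stops at whitespace, so text glued directly onto the end of a URL stays part of the candidate.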
<code>^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$ http://regexlib.com/Search.aspx?k=url&c=-1&m=5&ps=20</code>
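Before relying on the pattern above, it may help to check it against the question's samples. The sketch below copies the pattern into a Java string (every backslash doubled) and tests it; the class name AnchoredUrlCheck and the method isWholeUrl are hypothetical. Two properties show up: the `^` and `$` anchors make it validate a whole string rather than find URLs inside longer text, and the host part requires a 2 to 3 letter suffix, so an IP host such as 127.0.0.1 is rejected.

```java
import java.util.regex.Pattern;

// Hypothetical helper that wraps the regexlib pattern quoted above.
class AnchoredUrlCheck {
    private static final Pattern WHOLE_URL = Pattern.compile(
            "^(http|https|ftp)\\://[a-zA-Z0-9\\-\\.]+\\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?"
            + "([a-zA-Z0-9\\-\\._\\?\\,\\'/\\\\\\+&%\\$#\\=~])*$");

    /** True only if the entire input is a URL according to the pattern. */
    static boolean isWholeUrl(String s) {
        return WHOLE_URL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Domain host: accepted.
        System.out.println(isWholeUrl("http://www.baidu.com/metinfo/img/img.php?class1=1"));
        // IP host: rejected, because no dot is followed by 2 to 3 letters.
        System.out.println(isWholeUrl("http://127.0.0.1/metinfo/img/img.php?class1=1"));
        // URL embedded in other text: rejected because of the anchors.
        System.out.println(isWholeUrl("see http://www.baidu.com/ here"));
    }
}
```

To extract URLs from a longer string, the anchors would have to be removed and the host part loosened to accept IP addresses.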
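In Java it would look roughly like this:
<code class="java">import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "the received string";
// Match the scheme, then any run of characters that may appear in the URL.
String regex = "https?://[A-Za-z0-9,:._?%&+\\-=/#]*";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    String url = matcher.group();
    System.out.println(url);
}</code>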
The following string passes the test:
<code class="java">// The Chinese filler ("or", "haha", "6666 are all correct") is deliberate non-URL text around the URLs.
String str = "http://127.0.0.1:6666/ "
        + "https://www.baidu.com/ "
        + "http://127.0.0.1/metinfo/img/img.php?class1=1&serch_sql=%201=if%28ascii%28substr%28user%28%29,1,1%29%29=114,1,2%29%23\n"
        + "或者\n"
        + "哈哈http://www.baidu.com:85676/metinfo/img/img.php?class1=1&serch_sql=%201=if%28ascii%28substr%28user%28%29,1,1%29%29=114,1,2%29%23 6666都是对的";</code>
Output:
<code class="java">http://127.0.0.1:6666/
https://www.baidu.com/
http://127.0.0.1/metinfo/img/img.php?class1=1&serch_sql=%201=if%28ascii%28substr%28user%28%29,1,1%29%29=114,1,2%29%23
http://www.baidu.com:85676/metinfo/img/img.php?class1=1&serch_sql=%201=if%28ascii%28substr%28user%28%29,1,1%29%29=114,1,2%29%23</code>
What, you were asking about PHP? Sorry, I don't know PHP...
The regex is the same, so you should be able to adapt it yourself.