php中实现的一个curl批处理的实例 - 猿客
curl是利用URL语法在命令行方式下工作的开源文件传输工具
本文在php中实现了的一个curl批处理的实例。
代码如下:
<span style="color: #008080;"> 1</span> <span style="color: #008080;">header</span>("Content-Type:text/html;charset=utf8"<span style="color: #000000;">); </span><span style="color: #008080;"> 2</span> <span style="color: #008080;"> 3</span> <span style="color: #008000;">/*</span><span style="color: #008000;"> 先获取两个页面的所有a标签 </span><span style="color: #008000;">*/</span> <span style="color: #008080;"> 4</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 初始化两个简单处理句柄</span> <span style="color: #008080;"> 5</span> <span style="color: #800080;">$ch1</span> =<span style="color: #000000;"> curl_init(); </span><span style="color: #008080;"> 6</span> <span style="color: #800080;">$ch2</span> =<span style="color: #000000;"> curl_init(); </span><span style="color: #008080;"> 7</span> curl_setopt_array(<span style="color: #800080;">$ch1</span>,<span style="color: #0000ff;">array</span><span style="color: #000000;">( </span><span style="color: #008080;"> 8</span> CURLOPT_URL => 'http://www.sina.com.cn', <span style="color: #008080;"> 9</span> CURLOPT_HEADER => 0, <span style="color: #008080;"> 10</span> CURLOPT_RETURNTRANSFER => 1, <span style="color: #008080;"> 11</span> <span style="color: #000000;">)); </span><span style="color: #008080;"> 12</span> curl_setopt_array(<span style="color: #800080;">$ch2</span>,<span style="color: #0000ff;">array</span><span style="color: #000000;">( </span><span style="color: #008080;"> 13</span> CURLOPT_URL => 'http://www.baidu.com/', <span style="color: #008080;"> 14</span> CURLOPT_HEADER => 0, <span style="color: #008080;"> 15</span> CURLOPT_RETURNTRANSFER => 1, <span style="color: #008080;"> 16</span> <span style="color: #000000;">)); </span><span style="color: #008080;"> 17</span> <span style="color: #008080;"> 18</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 初始化批处理句柄,并添加简单处理句柄</span> <span style="color: #008080;"> 19</span> <span style="color: #800080;">$mh</span> =<span style="color: #000000;"> curl_multi_init(); </span><span style="color: #008080;"> 20</span> curl_multi_add_handle(<span style="color: #800080;">$mh</span>,<span style="color: #800080;">$ch1</span><span style="color: #000000;">); </span><span style="color: #008080;"> 21</span> curl_multi_add_handle(<span style="color: #800080;">$mh</span>,<span style="color: #800080;">$ch2</span><span style="color: #000000;">); </span><span style="color: #008080;"> 22</span> <span style="color: #008080;"> 23</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 初始化执行状态</span> <span style="color: #008080;"> 24</span> <span style="color: #800080;">$state</span> = <span style="color: #0000ff;">null</span><span style="color: #000000;">; </span><span style="color: #008080;"> 25</span> <span style="color: #008080;"> 26</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 执行批处理</span> <span style="color: #008080;"> 27</span> <span style="color: #0000ff;">do</span><span style="color: #000000;">{ </span><span style="color: #008080;"> 28</span> <span style="color: #800080;">$mc</span> = curl_multi_exec(<span style="color: #800080;">$mh</span>,<span style="color: #800080;">$state</span><span style="color: #000000;">); </span><span style="color: #008080;"> 29</span> }<span style="color: #0000ff;">while</span>(<span style="color: #800080;">$mc</span> ==<span style="color: #000000;"> CURLM_CALL_MULTI_PERFORM); </span><span style="color: #008080;"> 30</span> <span style="color: #0000ff;">while</span>(<span style="color: #800080;">$mc</span> == CURLM_OK && <span style="color: #800080;">$state</span><span style="color: #000000;">) { </span><span style="color: #008080;"> 31</span> <span style="color: #0000ff;">while</span> (curl_multi_exec(<span style="color: #800080;">$mh</span>, <span style="color: #800080;">$state</span>) ===<span style="color: #000000;"> CURLM_CALL_MULTI_PERFORM); </span><span style="color: #008080;"> 32</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 经过实验,发现curl_multi_select($mh)总是返回-1,意味着一下代码不会执行</span> <span style="color: #008080;"> 33</span> <span style="color: #0000ff;">if</span>(curl_multi_select(<span style="color: #800080;">$mh</span>) != -1<span style="color: #000000;">) { </span><span style="color: #008080;"> 34</span> <span style="color: #0000ff;">do</span><span style="color: #000000;">{ </span><span style="color: #008080;"> 35</span> <span style="color: #800080;">$mc</span> = curl_multi_exec(<span style="color: #800080;">$mh</span>,<span style="color: #800080;">$state</span><span style="color: #000000;">); </span><span style="color: #008080;"> 36</span> }<span style="color: #0000ff;">while</span>(<span style="color: #800080;">$mc</span> ==<span style="color: #000000;"> CURLM_CALL_MULTI_PERFORM); </span><span style="color: #008080;"> 37</span> <span style="color: #000000;"> } </span><span style="color: #008080;"> 38</span> <span style="color: #000000;">} </span><span style="color: #008080;"> 39</span> <span style="color: #008080;"> 40</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 获取内容</span> <span style="color: #008080;"> 41</span> <span style="color: #800080;">$text</span> = curl_multi_getcontent(<span style="color: #800080;">$ch1</span><span style="color: #000000;">); </span><span style="color: #008080;"> 42</span> <span style="color: #800080;">$text</span> .= curl_multi_getcontent(<span style="color: #800080;">$ch2</span><span style="color: #000000;">); </span><span style="color: #008080;"> 43</span> <span style="color: #008080;"> 44</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 找到页面中所有的a标签,保存到$matches</span> <span style="color: #008080;"> 45</span> <span style="color: #800080;">$matches</span> = <span style="color: #0000ff;">null</span><span style="color: #000000;">; </span><span style="color: #008080;"> 46</span> <span style="color: #008080;">preg_match_all</span>("/<a.>(.*?)/",<span style="color: #800080;">$text</span>,<span style="color: #800080;">$matches</span><span style="color: #000000;">); </span><span style="color: #008080;"> 47</span> <span style="color: #008080;"> 48</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 关闭各个句柄</span> <span style="color: #008080;"> 49</span> curl_multi_remove_handle(<span style="color: #800080;">$mh</span>,<span style="color: #800080;">$ch1</span><span style="color: #000000;">); </span><span style="color: #008080;"> 50</span> curl_multi_remove_handle(<span style="color: #800080;">$mh</span>,<span style="color: #800080;">$ch2</span><span style="color: #000000;">); </span><span style="color: #008080;"> 51</span> curl_multi_close(<span style="color: #800080;">$mh</span><span style="color: #000000;">); </span><span style="color: #008080;"> 52</span> <span style="color: #008080;"> 53</span> <span style="color: #008000;">/*</span><span style="color: #008000;">在找到的连接中继续查找title标签 </span><span style="color: #008000;">*/</span> <span style="color: #008080;"> 54</span> <span style="color: #008080;"> 55</span> <span style="color: #800080;">$handle</span> = <span style="color: #0000ff;">array</span>(); <span style="color: #008000;">//</span><span style="color: #008000;"> 存储简单处理句柄的数组</span> <span style="color: #008080;"> 56</span> <span style="color: #800080;">$mhandle</span> = curl_multi_init(); <span style="color: #008000;">//</span><span style="color: #008000;">批处理句柄 </span><span style="color: #008080;"> 57</span> <span style="color: #008000;">// 处理100个页面</span> <span style="color: #008080;"> 58</span> <span style="color: #0000ff;">foreach</span>(<span style="color: #008080;">array_slice</span>(<span style="color: #800080;">$matches</span>[1],0,100) <span style="color: #0000ff;">as</span> <span style="color: #800080;">$href</span><span style="color: #000000;">) { </span><span style="color: #008080;"> 59</span> <span style="color: #800080;">$tmp_h</span> =<span style="color: #000000;"> curl_init(); </span><span style="color: #008080;"> 60</span> curl_setopt_array(<span style="color: #800080;">$tmp_h</span>,<span style="color: #0000ff;">array</span><span style="color: #000000;">( </span><span style="color: #008080;"> 61</span> CURLOPT_URL => <span style="color: #800080;">$href</span>, <span style="color: #008080;"> 62</span> CURLOPT_HEADER => 0, <span style="color: #008080;"> 63</span> CURLOPT_RETURNTRANSFER => 1, <span style="color: #008080;"> 64</span> <span style="color: #000000;"> )); </span><span style="color: #008080;"> 65</span> curl_multi_add_handle(<span style="color: #800080;">$mhandle</span>,<span style="color: #800080;">$tmp_h</span><span style="color: #000000;">); </span><span style="color: #008080;"> 66</span> <span style="color: #800080;">$handle</span>[] = <span style="color: #800080;">$tmp_h</span><span style="color: #000000;">; </span><span style="color: #008080;"> 67</span> <span style="color: #000000;">} </span><span style="color: #008080;"> 68</span> <span style="color: #0000ff;">do</span><span style="color: #000000;">{ </span><span style="color: #008080;"> 69</span> <span style="color: #800080;">$mrc</span> = curl_multi_exec(<span style="color: #800080;">$mhandle</span>,<span style="color: #800080;">$active</span><span style="color: #000000;">); </span><span style="color: #008080;"> 70</span> }<span style="color: #0000ff;">while</span>(<span style="color: #800080;">$mrc</span> ==<span style="color: #000000;"> CURLM_CALL_MULTI_PERFORM); </span><span style="color: #008080;"> 71</span> <span style="color: #0000ff;">while</span>(<span style="color: #800080;">$mrc</span> == CURLM_OK && <span style="color: #800080;">$active</span><span style="color: #000000;">) { </span><span style="color: #008080;"> 72</span> <span style="color: #0000ff;">while</span>(curl_multi_exec(<span style="color: #800080;">$mhandle</span>,<span style="color: #800080;">$active</span>) ==<span style="color: #000000;"> CURLM_CALL_MULTI_PERFORM); </span><span style="color: #008080;"> 73</span> <span style="color: #0000ff;">if</span>(curl_multi_select(<span style="color: #800080;">$mhandle</span>) != -1<span style="color: #000000;">) { </span><span style="color: #008080;"> 74</span> <span style="color: #0000ff;">do</span><span style="color: #000000;">{ </span><span style="color: #008080;"> 75</span> <span style="color: #800080;">$mrc</span> = curl_multi_exec(<span style="color: #800080;">$mhandle</span>,<span style="color: #800080;">$active</span><span style="color: #000000;">); </span><span style="color: #008080;"> 76</span> }<span style="color: #0000ff;">while</span>(<span style="color: #800080;">$mrc</span> ==<span style="color: #000000;"> CURLM_CALL_MULTI_PERFORM); </span><span style="color: #008080;"> 77</span> <span style="color: #000000;"> } </span><span style="color: #008080;"> 78</span> <span style="color: #000000;">} </span><span style="color: #008080;"> 79</span> <span style="color: #008080;"> 80</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 获取这些页面的内容</span> <span style="color: #008080;"> 81</span> <span style="color: #800080;">$mtext</span> = <span style="color: #0000ff;">null</span><span style="color: #000000;">; </span><span style="color: #008080;"> 82</span> <span style="color: #0000ff;">foreach</span>(<span style="color: #800080;">$handle</span> <span style="color: #0000ff;">as</span> <span style="color: #800080;">$tmp_h</span><span style="color: #000000;">) { </span><span style="color: #008080;"> 83</span> <span style="color: #800080;">$mtext</span> .= curl_multi_getcontent(<span style="color: #800080;">$tmp_h</span><span style="color: #000000;">); </span><span style="color: #008080;"> 84</span> curl_multi_remove_handle(<span style="color: #800080;">$mhandle</span>, <span style="color: #800080;">$tmp_h</span><span style="color: #000000;">); </span><span style="color: #008080;"> 85</span> <span style="color: #000000;">} </span><span style="color: #008080;"> 86</span> <span style="color: #800080;">$mmatches</span> = <span style="color: #0000ff;">array</span><span style="color: #000000;">(); </span><span style="color: #008080;"> 87</span> <span style="color: #008080;">preg_match_all</span>("/<title>(.*?)/",<span style="color: #800080;">$mtext</span>, <span style="color: #800080;">$mmatches</span><span style="color: #000000;">); </span><span style="color: #008080;"> 88</span> <span style="color: #008080;"> 89</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 编码转换</span> <span style="color: #008080;"> 90</span> mb_detect_order('GB2312,GBK,BIG5,GB18030,UNICODE ,CP936'<span style="color: #000000;">); </span><span style="color: #008080;"> 91</span> <span style="color: #0000ff;">foreach</span>(<span style="color: #800080;">$mmatches</span>[1] <span style="color: #0000ff;">as</span> <span style="color: #800080;">$key</span> => <span style="color: #800080;">$val</span><span style="color: #000000;">) { </span><span style="color: #008080;"> 92</span> <span style="color: #800080;">$encoding</span> = mb_detect_encoding(<span style="color: #800080;">$val</span><span style="color: #000000;">); </span><span style="color: #008080;"> 93</span> <span style="color: #0000ff;">if</span>(<span style="color: #800080;">$encoding</span> != 'UTF-8' && <span style="color: #800080;">$encoding</span> != 'CP936' && <span style="color: #800080;">$encoding</span> != 'GB18030' && <span style="color: #800080;">$encoding</span> !=''<span style="color: #000000;">) { </span><span style="color: #008080;"> 94</span> <span style="color: #800080;">$mmatches</span>[1][<span style="color: #800080;">$key</span>] = <span style="color: #008080;">iconv</span>(<span style="color: #800080;">$encoding</span>,'UTF-8//IGNORE',<span style="color: #800080;">$val</span><span style="color: #000000;">); </span><span style="color: #008080;"> 95</span> <span style="color: #000000;"> } </span><span style="color: #008080;"> 96</span> <span style="color: #000000;">} </span><span style="color: #008080;"> 97</span> <span style="color: #008080;"> 98</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 打印title信息</span> <span style="color: #008080;"> 99</span> <span style="color: #008080;">var_dump</span>(<span style="color: #800080;">$mmatches</span>[1<span style="color: #000000;">]); </span><span style="color: #008080;">100</span> <span style="color: #008080;">101</span> <span style="color: #008000;">//</span><span style="color: #008000;"> 关闭批处理句柄</span> <span style="color: #008080;">102</span> curl_multi_close(<span style="color: #800080;">$mhandle</span>);</title></a.>
登录后复制
本站声明
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn

热AI工具

Undresser.AI Undress
人工智能驱动的应用程序,用于创建逼真的裸体照片

AI Clothes Remover
用于从照片中去除衣服的在线人工智能工具。

Undress AI Tool
免费脱衣服图片

Clothoff.io
AI脱衣机

AI Hentai Generator
免费生成ai无尽的。

热门文章
仓库:如何复兴队友
1 个月前
By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O.能量晶体解释及其做什么(黄色晶体)
2 周前
By 尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island冒险:如何获得巨型种子
1 个月前
By 尊渡假赌尊渡假赌尊渡假赌
击败分裂小说需要多长时间?
4 周前
By DDD
R.E.P.O.保存文件位置:在哪里以及如何保护它?
4 周前
By DDD

热工具

记事本++7.3.1
好用且免费的代码编辑器

SublimeText3汉化版
中文版,非常好用

禅工作室 13.0.1
功能强大的PHP集成开发环境

Dreamweaver CS6
视觉化网页开发工具

SublimeText3 Mac版
神级代码编辑软件(SublimeText3)