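Why is my single-threaded curl scraper faster than the multi-threaded one?
I wrote a simple curl scraper, but when I run it, the single-threaded version turns out to be much faster than the multi-threaded one.
Is there something wrong with how I wrote it?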
<code>$images = [
    "http://pic.91taojin.com.cn/data/attachment/image/20140415/20140415151923_73502.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20140415/20140415151826_52170.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20140415/20140415152035_59698.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20140507/20140507143708_26688.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20140417/20140417095153_61993.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20140426/20140426094716_96396.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20130730/20130730160625_21437.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20130731/20130731170502_90104.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20130731/20130731165147_80414.jpg",
    "http://pic.91taojin.com.cn/data/attachment/image/20140415/20140415151923_73502.jpg",
];
</code>
This is the single-threaded function:
<code>function getImg($url = "", $filename = "")
{
    $ch = curl_init();
    $opt[CURLOPT_URL] = $url;
    $opt[CURLOPT_HEADER] = true;
    $opt[CURLOPT_CONNECTTIMEOUT] = 10;
    $opt[CURLOPT_TIMEOUT] = 60;
    $opt[CURLOPT_AUTOREFERER] = true;
    $opt[CURLOPT_USERAGENT] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.47 Safari/536.11';
    $opt[CURLOPT_RETURNTRANSFER] = true;
    // $opt[CURLOPT_FOLLOWLOCATION] = true; // follow redirects
    // $opt[CURLOPT_MAXREDIRS] = 10;
    curl_setopt_array($ch, $opt);
    $r = curl_exec($ch);
    if (false === $r) {
        $errno = curl_errno($ch);
        $err = curl_error($ch);
        curl_close($ch);
        return false;
    }
    // Check the header: write the file only on a 200 response.
    // The split is limited to 2 parts so that a "\r\n\r\n" sequence
    // inside the binary image body does not truncate the file.
    $header = explode("\r\n\r\n", $r, 2);
    if (strpos($header[0], 'HTTP/1.1 200') === 0) {
        file_put_contents($filename, $header[1]);
    }
    curl_close($ch);
    return true;
}
</code>
I also tried the curl_multi family of functions, but I only read the manual and did not fully understand them:
<code>// Multi-threaded scraping
function getImgMulti($url = [], $filename = [])
{
    // Create the cURL multi handle
    $mh = curl_multi_init();
    // n=10 threads could be added here
    foreach ($url as $k => $v) {
        $ch[$k] = curl_init();
        $opt[CURLOPT_URL] = $v;
        $opt[CURLOPT_HEADER] = true;
        $opt[CURLOPT_CONNECTTIMEOUT] = 10;
        $opt[CURLOPT_TIMEOUT] = 60;
        $opt[CURLOPT_AUTOREFERER] = true;
        $opt[CURLOPT_USERAGENT] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.47 Safari/536.11';
        $opt[CURLOPT_RETURNTRANSFER] = true;
        // $opt[CURLOPT_FOLLOWLOCATION] = true; // follow redirects
        // $opt[CURLOPT_MAXREDIRS] = 10;
        curl_setopt_array($ch[$k], $opt);
        // Add one handle
        curl_multi_add_handle($mh, $ch[$k]);
    }
    $running = null;
    // Run the multi handle
    do {
        curl_multi_exec($mh, $running);
    } while ($running > 0);
    for ($i=0; $i </code>
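Results: calling the single-threaded function in a loop finishes in about 1.7 seconds, while the curl_multi version takes 3.5 seconds.
I may not have fully understood how these functions are meant to be used. Could someone explain the cause?
---- Follow-up ----
I tested this on Windows. Could PHP multi-threading have some other problem on Windows?
I also looked at a PHP class that someone else wrote: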
http://blog.eiodesign.com/archives/86
I ran the scrape again with this library, and the outcome was the same, only slower:
<code>// Test scraping with the library
require("libs/class_curl_multi.php");
$mp = new MultiHttpRequest();
// Save remote images locally
$mp->set_urls($images);
$images_result = $mp->start();
foreach ((array)$images_result as $image_key => $image_value) {
    if (!empty($image_key)) {
        _flush("store image:".$image_key."<br>");
        file_put_contents('pics/'.$image_key.'.jpg', $image_value);
    }
}
</code>
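Elapsed time: 4.05 seconds.
Is this gap caused by my misunderstanding of PHP multi-threading, or by something else? Multi-threading does not seem to have improved scraping efficiency; if anything, it made things worse.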
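One point that may be relevant: curl_multi does not create threads. It drives several transfers from a single PHP thread, and the `do { curl_multi_exec(...) } while ($running > 0)` loop above calls `curl_multi_exec` continuously without waiting for network activity. The PHP manual pairs `curl_multi_exec` with `curl_multi_select` so the script sleeps until a socket is ready. Whether this accounts for the 1.7 s versus 3.5 s difference is not established by the post; the target server may also handle concurrent connections from one client slowly. The following is only a sketch under those assumptions. The function name `fetchAll`, its parameters, and its return shape are hypothetical, and it assumes PHP 7 or later with the curl extension.

```php
<?php
// Sketch: fetch several URLs concurrently with curl_multi, blocking on
// socket activity instead of spinning in a tight loop.
// fetchAll() is a hypothetical helper, not code from the original post.
function fetchAll(array $urls, int $timeout = 60): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt_array($ch, [
            CURLOPT_URL            => $url,
            CURLOPT_RETURNTRANSFER => true, // body is read later via curl_multi_getcontent()
            CURLOPT_CONNECTTIMEOUT => 10,
            CURLOPT_TIMEOUT        => $timeout,
        ]);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            // Wait up to 1 second for activity on any transfer.
            if (curl_multi_select($mh, 1.0) === -1) {
                usleep(1000); // select reported failure; back off briefly
            }
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        // Headers are not included in the output here, so the status
        // code is read from curl_getinfo() instead of parsing the response.
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $results[$key] = ($code === 200) ? curl_multi_getcontent($ch) : false;
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $results; // same keys as $urls; false where the request did not return 200
}
```

With the `$images` array from the question, a call such as `$bodies = fetchAll($images);` would return one entry per URL, and each non-false entry could be passed to `file_put_contents`. Timing both approaches several times, rather than once, would give a fairer comparison, since a single run can be skewed by DNS lookups and server-side caching.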
