In this question, the objective was to speed up extracting all images at least 200 pixels wide and 200 pixels tall from a given URL. The initial approach, which parsed the page with file_get_html() and then called getimagesize() on each remote image URL, took 48.64 seconds: getimagesize() on a remote URL downloads the file over its own blocking HTTP request, so the requests ran strictly one after another.
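For reference, the slow baseline presumably looked something like the sketch below. The question's exact code is not shown, so this structure is an assumption; the key point is that each getimagesize() call triggers its own sequential download.

```php
require 'simple_html_dom.php';

// Hypothetical reconstruction of the slow baseline described above:
// getimagesize() on a remote URL fetches the whole file over one
// blocking HTTP request, so N images cost N sequential round trips.
$html = file_get_html('http://www.huffingtonpost.com');
$res  = array();

foreach ($html->find('img') as $element) {
    $size = @getimagesize($element->src); // array(width, height, ...) or false
    if ($size !== false && $size[0] >= 200 && $size[1] >= 200) {
        $res[] = $element->src;
    }
}
print_r($res);
```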
Improved Approach:
To cut the total time, the rework replaces the per-image remote getimagesize() calls with parallel downloads, in three steps (implemented in the sample code below):
- Parse the page once with simple_html_dom, collect every img src, and normalize relative paths into absolute URLs.
- Download all of the images concurrently through curl_multi, so the HTTP requests overlap instead of running one after another.
- Write each successful response to a temporary file, run getimagesize() on the local copy, keep files that are at least 200×200 pixels, and delete the rest.
Sample Code:
```php
require 'simple_html_dom.php';

$url   = 'http://www.huffingtonpost.com';
$html  = file_get_html($url);
$nodes = array();
$start = microtime(true); // float timestamp, so elapsed time is a simple subtraction

// Collect image URLs, normalizing relative paths to absolute ones.
if ($html->find('img')) {
    foreach ($html->find('img') as $element) {
        if (startsWith($element->src, "/")) {
            $element->src = $url . $element->src;
        }
        if (!startsWith($element->src, "http")) {
            $element->src = $url . "/" . $element->src;
        }
        $nodes[] = $element->src;
    }
}

echo "<pre>";
print_r(imageDownload($nodes, 200, 200));
echo "<h1>", microtime(true) - $start, "</h1>";

function imageDownload($nodes, $maxHeight = 0, $maxWidth = 0)
{
    // Register one easy handle per URL on a multi handle,
    // so all downloads run concurrently.
    $mh = curl_multi_init();
    $curl_array = array();
    foreach ($nodes as $i => $url) {
        $curl_array[$i] = curl_init($url);
        curl_setopt($curl_array[$i], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($curl_array[$i], CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729)');
        curl_setopt($curl_array[$i], CURLOPT_CONNECTTIMEOUT, 5);
        curl_setopt($curl_array[$i], CURLOPT_TIMEOUT, 15);
        curl_multi_add_handle($mh, $curl_array[$i]);
    }

    // Drive the transfers until every handle has finished.
    $running = null;
    do {
        usleep(10000);
        curl_multi_exec($mh, $running);
    } while ($running > 0);

    $res = array();
    foreach ($nodes as $i => $url) {
        if (curl_errno($curl_array[$i]) === 0) {
            $info = curl_getinfo($curl_array[$i]);
            if ($info['content_type'] !== null) {
                // Write the body to a temp file so getimagesize() runs locally,
                // without any further network traffic.
                $ext  = getExtension($info['content_type']);
                $temp = "temp/img" . md5(mt_rand()) . $ext;
                file_put_contents($temp, curl_multi_getcontent($curl_array[$i]));
                if ($maxHeight == 0 || $maxWidth == 0) {
                    $res[] = $temp;
                } else {
                    $size = getimagesize($temp); // array(width, height, ...) or false
                    if ($size !== false && $size[0] >= $maxWidth && $size[1] >= $maxHeight) {
                        $res[] = $temp;
                    } else {
                        unlink($temp); // too small, or not a valid image
                    }
                }
            }
        }
        curl_multi_remove_handle($mh, $curl_array[$i]);
        curl_close($curl_array[$i]);
    }
    curl_multi_close($mh);
    return $res;
}

// Map a Content-Type header to a file extension.
function getExtension($type)
{
    switch (strtolower($type)) {
        case "image/gif":
            return ".gif";
        case "image/png":
            return ".png";
        case "image/jpeg":
            return ".jpg";
        default:
            return ".img";
    }
}

// Case-insensitive prefix check.
function startsWith($str, $prefix)
{
    return strtolower(substr($str, 0, strlen($prefix))) === strtolower($prefix);
}
```
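One practical note: the script writes downloaded files into a temp/ directory relative to the working directory, and file_put_contents() will not create that directory for you. A small guard before calling imageDownload() avoids silently losing every download (the directory name comes from the script above; adjust it if you store files elsewhere):

```php
// Ensure the output directory used by the script above exists before downloading.
if (!is_dir('temp')) {
    mkdir('temp', 0775, true); // recursive create; adjust permissions to your environment
}
```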
Because the downloads now overlap, this reworked approach retrieved and filtered the images in just 0.076 seconds, compared to the original 48.64 seconds — a speedup of roughly 640×.