PHP批量采集下载美女图片的实现代码_PHP
考虑到单纯的采集一个网页的图片,太麻烦,所以直接采集他的列表页,获取列表的url然后在一一采集,但是用php匹配列表页的url太麻烦,第一列表页有很多无效url这对我这个正则小菜鸟实在是个问题,看了一下列表页的结构,果断采用jquery获取url,jquery的万能选择器又再次强大起来了。
jquery获取url,然后ajax传递url—>对应PHP文件,遍历url参数—->单页面采集保存图片
jquery程序
复制代码 代码如下:
<script> <BR>$(document).ready(function(){ <BR>var hrefs =''; <BR>$('.f_folder>a').each(function(i){ <BR>var href = $('.f_folder:eq('+i+')>a:eq(0)').attr('href'); <BR>if(href!='undefined'){ <BR>hrefs +=href+','; <BR>} <BR>}) <BR>$.getJSON("http://www.****.com/365/getimg.php?hrefs="+hrefs+"&callback=?", function(data){ <BR>//alert(data.info); <BR>}); <BR>}); <BR></script>
这里把url拼接成‘,'分割的字符串传递url,使用getjson是为了跨域需要,关于getjson常见的几个问题可以参看
PHP采集程序
复制代码 代码如下:
// 抓起365图片
error_reporting(E_ALL ^ E_NOTICE);
set_time_limit(0);//设置PHP超时时间
/**
* 得到当前时间
*/
function getMicrotime() {
list ($usec, $sec) = explode(" ", microtime());
return ((float) $usec + (float) $sec);
}
$stime = getMicrotime();
$callback = $_GET['callback'];
$hrefs = $_GET['hrefs'];
$urlarray = explode(',',$hrefs);
//获取指定url的所有图片
function getimgs($url){
$dirname = basename($url,".php");
if(!file_exists($dirname)){
mkdir('365/'.$dirname.'');
}
clearstatcache();
$data = file_get_contents($url);
preg_match_all("/(href|src)=(["|']?)([^ "'>]+.(jpg|png|PNG|JPG|gif))\2/i", $data, $matches);
//$matches[3] = array_unique($matches[3]);
unset($data);
$i=0;
if(count($matches[3])>0){
foreach($matches[3] as $k=>$v){
//简单判断是否是标准url,而不是相对路径
if(substr($v,0,4)=='http'){
$ext = pathinfo($v,PATHINFO_EXTENSION);//图片扩展
if(!file_exists('365/'.$dirname.'/'.$k.'.'.$ext)){
file_put_contents('365/'.$dirname.'/'.$k.'.'.$ext,file_get_contents($v));
$i++;
}else{
unset($v);
}
clearstatcache();
}else{
unset($v);
}
}
unset($matches);
return $i;
}
}
foreach($urlarray as $k=>$v){
if($v!=''){
$j +=getimgs($v);
}
}
$etime = getMicrotime();
echo "合计采集了".$j."张图片";
echo "用时".($etime-$stime)."秒";
考虑到性能问题:在getimgs方法中所用的变量都是使用后便注销(unset)了,以便释放内存。
设计到的几个知识点
判断是否是标准有效图片url
if(substr($v,0,4)=='http')这个只是简单的判断一下匹配到的图片url是否是标准的url,因为采集的图片可能是相对路径的,这里我直接放弃这种图片的采集,当然你也可以把这种图片还原成标准图片路径,还有一个问题就是即使是标准url格式,这样的图片也未必可以采集,因为你不知道这个图片是否还有,也许这个图片url已经无效了,如果你想更严格的判断这个图片url是否真实有效可以推荐看我之前的《PHP判断远程url是否有效的几种方法》有三种方法可以验证是否是有效url。
获取图片格式
$ext = pathinfo($v,PATHINFO_EXTENSION);//图片扩展
这里使用了pathinfo的方法,总结有7种方法可以获取到文件的格式,推荐文章:《PHP判断图片格式的七种方法》
下载保存到本地
file_put_contents('365/'.$dirname.'/'.$k.'.'.$ext,file_get_contents($v));
file_put_contents() 函数把一个字符串写入文件中。
与依次调用 fopen(),fwrite() 以及 fclose() 功能一样。
file_get_contents() 函数把整个文件读入一个字符串中。
因为服务器支持file_get_contents,如果服务器把这个函数禁用了,可以使用curl,这个工具要比file_get_contents更加强大,推荐学习《CURL的学习和应用(附多线程)》,可以使用curl的多线程下载存储,效果更牛逼
清除文件操作缓存
clearstatcache() 函数清除文件状态缓存。clearstatcache() 函数会缓存某些函数的返回信息,以便提供更高的性能。但是有时候,比如在一个脚本中多次检查同一个文件,而该文件在此脚本执行期间有被删除或修改的危险时,你需要清除文件状态缓存,以便获得正确的结果。要做到这一点,就需要使用 clearstatcache() 函数。官方手册:
程序执行时间计算
复制代码 代码如下:
/**
* 得到当前时间
*/
function getMicrotime() {
list ($usec, $sec) = explode(" ", microtime());
return ((float) $usec + (float) $sec);
}
可以参考本博客文章;《获取php页面执行时间,数据库读写次数,函数调用次数等【THINKPHP】》
最后看一下效果;

409秒采集了214张图片,大概2秒下载保存了一张图片,图片总大小约62M,这样看来:
一个小时60*60可以大约下载1800张美女图片。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

With the continuous development of social media, Xiaohongshu has become a platform for more and more young people to share their lives and discover beautiful things. Many users are troubled by auto-save issues when posting images. So, how to solve this problem? 1. How to solve the problem of automatically saving pictures when publishing on Xiaohongshu? 1. Clear the cache First, we can try to clear the cache data of Xiaohongshu. The steps are as follows: (1) Open Xiaohongshu and click the "My" button in the lower right corner; (2) On the personal center page, find "Settings" and click it; (3) Scroll down and find the "Clear Cache" option. Click OK. After clearing the cache, re-enter Xiaohongshu and try to post pictures to see if the automatic saving problem is solved. 2. Update the Xiaohongshu version to ensure that your Xiaohongshu

With the popularity of Douyin short videos, user interactions in the comment area have become more colorful. Some users wish to share images in comments to better express their opinions or emotions. So, how to post pictures in TikTok comments? This article will answer this question in detail and provide you with some related tips and precautions. 1. How to post pictures in Douyin comments? 1. Open Douyin: First, you need to open Douyin APP and log in to your account. 2. Find the comment area: When browsing or posting a short video, find the place where you want to comment and click the "Comment" button. 3. Enter your comment content: Enter your comment content in the comment area. 4. Choose to send a picture: In the interface for entering comment content, you will see a "picture" button or a "+" button, click

Apple's recent iPhones capture memories with crisp detail, saturation and brightness. But sometimes, you may encounter some issues that may cause the image to look less clear. While autofocus on iPhone cameras has come a long way, allowing you to take photos quickly, the camera can mistakenly focus on the wrong subject in certain situations, making the photo blurry in unwanted areas. If your photos on your iPhone look out of focus or lack sharpness overall, the following post should help you make them sharper. How to Make Pictures Clearer on iPhone [6 Methods] You can try using the native Photos app to clean up your photos. If you want more features and options

In PowerPoint, it is a common technique to display pictures one by one, which can be achieved by setting animation effects. This guide details the steps to implement this technique, including basic setup, image insertion, adding animation, and adjusting animation order and timing. Additionally, advanced settings and adjustments are provided, such as using triggers, adjusting animation speed and order, and previewing animation effects. By following these steps and tips, users can easily set up pictures to appear one after another in PowerPoint, thereby enhancing the visual impact of the presentation and grabbing the attention of the audience.

Are you also using Foxit PDF Reader software? So do you know how Foxit PDF Reader converts pdf documents into jpg images? The following article brings you how Foxit PDF Reader converts pdf documents into jpg images. For those who are interested in the method of converting jpg images, please come and take a look below. First start Foxit PDF Reader, then find "Features" on the top toolbar, and then select the "PDF to Others" function. Next, open a web page called "Foxit PDF Online Conversion". Click the "Login" button on the upper right side of the page to log in, and then turn on the "PDF to Image" function. Then click the upload button and add the pdf file you want to convert into an image. After adding it, click "Start Conversion"

How to use JavaScript to implement the drag and zoom function of images? In modern web development, dragging and zooming images is a common requirement. By using JavaScript, we can easily add dragging and zooming functions to images to provide a better user experience. In this article, we will introduce how to use JavaScript to implement this function, with specific code examples. HTML structure First, we need a basic HTML structure to display pictures and add

When using WPS office software, we found that not only one form is used, tables and pictures can be added to the text, pictures can also be added to the table, etc. These are all used together to make the content of the entire document look richer. , if you need to insert two pictures into the document and they need to be arranged side by side. Our next course can solve this problem: how to place two pictures side by side in a wps document. 1. First, you need to open the WPS software and find the picture you want to adjust. Left-click the picture and a menu bar will pop up, select "Page Layout". 2. Select "Tight wrapping" in text wrapping. 3. After all the pictures you need are confirmed to be set to "Tight text wrapping", you can drag the pictures to the appropriate position and click on the first picture.

Overview of advanced functions of how to use HTML, CSS and jQuery to implement image merge display: In web design, image display is an important link, and image merge display is one of the common techniques to improve page loading speed and enhance user experience. This article will introduce how to use HTML, CSS and jQuery to implement advanced functions of image merging and display, and provide specific code examples. 1. HTML layout: First, we need to create a container in HTML to display the merged images. You can use di
