PHP采撷利器:Snoopy 试用心得
PHP采集利器:Snoopy 试用心得
?
$url = "http://www.taoav.com"; include("snoopy.php"); $snoopy = new Snoopy; $snoopy->fetch($url); //获取所有内容 echo $snoopy->results; //显示结果 $snoopy->fetchtext //获取文本内容(去掉html代码) $snoopy->fetchlinks //获取链接 $snoopy->fetchform //获取表单
$formvars["username"] = "admin"; $formvars["pwd"] = "admin"; $action = "http://www.taoav.com";//表单提交地址 $snoopy->submit($action,$formvars);//$formvars为提交的数组 echo $snoopy->results; //获取表单提交后的 返回的结果 $snoopy->submittext; //提交后只返回 去除html的 文本 $snoopy->submitlinks;//提交后只返回 链接
$formvars["username"] = "admin"; $formvars["pwd"] = "admin"; $action = "http://www.taoav.com"; include "snoopy.php"; $snoopy = new Snoopy; $snoopy->cookies["PHPSESSID"] = 'fc106b1918bd522cc863f36890e6fff7'; //伪装sessionid $snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)"; //伪装浏览器 $snoopy->referer = "http://www.only4.cn"; //伪装来源页地址 http_referer $snoopy->rawheaders["Pragma"] = "no-cache"; //cache 的http头信息 $snoopy->rawheaders["X_FORWARDED_FOR"] = "127.0.0.101"; //伪装ip $snoopy->submit($action,$formvars); echo $snoopy->results;
-
原来我们可以伪装session 伪装浏览器 ,伪装ip, haha 可以做很多事情了。
$snoopy->proxy_host = "www.only4.cn"; $snoopy->proxy_port = "8080"; //使用代理 $snoopy->maxredirs = 2; //重定向次数 $snoopy->expandlinks = true; //是否补全链接 在采集的时候经常用到 // 例如链接为 /images/taoav.gif 可改为它的全链接 http://www.taoav.com/images/taoav.gif,这个地方其实可以在最后输出的时候用ereg_replace函数自己替换 $snoopy->maxframes = 5 //允许的最大框架数 //注意抓取框架的时候 $snoopy->results 返回的是一个数组 $snoopy->error //返回报错信息
//echo var_dump($_SERVER); include("Snoopy.class.php"); $snoopy = new Snoopy; $snoopy->agent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5 FirePHP/0.2.1";//这项是浏览器信息,前面你用什么浏览器查看cookie,就用那个浏览器的信息(ps:$_SERVER可以查看到浏览器的信息) $snoopy->referer = "http://bbs.phpchina.com/index.php"; $snoopy->expandlinks = true; $snoopy->rawheaders["COOKIE"]="__utmz=17229162.1227682761.29.7.utmccn=(referral)|utmcsr=phpchina.com|utmcct=/html/index.html|utmcmd=referral; cdbphpchina_smile=1D2D0D1; cdbphpchina_cookietime=2592000; __utma=233700831.1562900865.1227113506.1229613449.1231233266.16; __utmz=233700831.1231233266.16.8.utmccn=(referral)|utmcsr=localhost:8080|utmcct=/test3.php|utmcmd=referral; __utma=17229162.1877703507.1227113568.1231228465.1231233160.58; uchome_loginuser=sinopf; xscdb_cookietime=2592000; __utmc=17229162; __utmb=17229162; cdbphpchina_sid=EX5w1V; __utmc=233700831; cdbphpchina_visitedfid=17; cdbphpchinaO766uPYGK6OWZaYlvHSuzJIP22VpwEMGnPQAuWCFL9Fd6CHp2e%2FKw0x4bKz0N9lGk; xscdb_auth=8106rAyhKpQL49eMs%2FyhLBf3C6ClZ%2B2idSk4bExJwbQr%2BHSZrVKgqPOttHVr%2B6KLPg3DtWpTMUI4ttqNNVpukUj6ElM; cdbphpchina_onlineusernum=3721"; $snoopy->fetch("http://bbs.phpchina.com/forum-17-1.html"); $n=ereg_replace("href=\"","href=\"http://bbs.phpchina.com/",$snoopy->results ); echo ereg_replace("src=\"","src=\"http://bbs.phpchina.com/",$n); ?>
$_SERVER['HTTP_USER_AGENT']后边的内容复制下来,粘在$snoopy->agent的地方,然后就是要查看自己的
COOKIE了,用自己在论坛的账号登陆论坛后,在浏览器地址栏里输入
javascript:document.write(document.cookie),回车,就可以看到自己的cookie信息,复制粘贴
到$snoopy->rawheaders["COOKIE"]=的后边。(我的cookie信息为了安全起见已经删除了一段内容)
?

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

Many users will choose the Huawei brand when choosing smart watches. Among them, Huawei GT3pro and GT4 are very popular choices. Many users are curious about the difference between Huawei GT3pro and GT4. Let’s introduce the two to you. . What are the differences between Huawei GT3pro and GT4? 1. Appearance GT4: 46mm and 41mm, the material is glass mirror + stainless steel body + high-resolution fiber back shell. GT3pro: 46.6mm and 42.9mm, the material is sapphire glass + titanium body/ceramic body + ceramic back shell 2. Healthy GT4: Using the latest Huawei Truseen5.5+ algorithm, the results will be more accurate. GT3pro: Added ECG electrocardiogram and blood vessel and safety

HTTP status code 520 means that the server encountered an unknown error while processing the request and cannot provide more specific information. Used to indicate that an unknown error occurred when the server was processing the request, which may be caused by server configuration problems, network problems, or other unknown reasons. This is usually caused by server configuration issues, network issues, server overload, or coding errors. If you encounter a status code 520 error, it is best to contact the website administrator or technical support team for more information and assistance.

Why Snipping Tool Not Working on Windows 11 Understanding the root cause of the problem can help find the right solution. Here are the top reasons why the Snipping Tool might not be working properly: Focus Assistant is On: This prevents the Snipping Tool from opening. Corrupted application: If the snipping tool crashes on launch, it might be corrupted. Outdated graphics drivers: Incompatible drivers may interfere with the snipping tool. Interference from other applications: Other running applications may conflict with the Snipping Tool. Certificate has expired: An error during the upgrade process may cause this issu simple solution. These are suitable for most users and do not require any special technical knowledge. 1. Update Windows and Microsoft Store apps

Understand the meaning of HTTP 301 status code: common application scenarios of web page redirection. With the rapid development of the Internet, people's requirements for web page interaction are becoming higher and higher. In the field of web design, web page redirection is a common and important technology, implemented through the HTTP 301 status code. This article will explore the meaning of HTTP 301 status code and common application scenarios in web page redirection. HTTP301 status code refers to permanent redirect (PermanentRedirect). When the server receives the client's

How to use NginxProxyManager to implement automatic jump from HTTP to HTTPS. With the development of the Internet, more and more websites are beginning to use the HTTPS protocol to encrypt data transmission to improve data security and user privacy protection. Since the HTTPS protocol requires the support of an SSL certificate, certain technical support is required when deploying the HTTPS protocol. Nginx is a powerful and commonly used HTTP server and reverse proxy server, and NginxProxy

HTTP status code 403 means that the server rejected the client's request. The solution to http status code 403 is: 1. Check the authentication credentials. If the server requires authentication, ensure that the correct credentials are provided; 2. Check the IP address restrictions. If the server has restricted the IP address, ensure that the client's IP address is restricted. Whitelisted or not blacklisted; 3. Check the file permission settings. If the 403 status code is related to the permission settings of the file or directory, ensure that the client has sufficient permissions to access these files or directories, etc.

Solution: 1. Check the Content-Type in the request header; 2. Check the data format in the request body; 3. Use the appropriate encoding format; 4. Use the appropriate request method; 5. Check the server-side support.

Quick Application: Practical Development Case Analysis of PHP Asynchronous HTTP Download of Multiple Files With the development of the Internet, the file download function has become one of the basic needs of many websites and applications. For scenarios where multiple files need to be downloaded at the same time, the traditional synchronous download method is often inefficient and time-consuming. For this reason, using PHP to download multiple files asynchronously over HTTP has become an increasingly common solution. This article will analyze in detail how to use PHP asynchronous HTTP through an actual development case.
