
PHP cURL Example: Simulating a Forum Login and Collecting Data

May 25, 2016, 04:44 PM
php Data collection

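To simulate a browser visiting a website, first learn to observe how the browser sends its HTTP messages and what the web server returns to it. I recommend installing HttpWatch, an HTTP traffic viewer; note that some of its features are unavailable unless you have the full edition. Once installed, it is embedded in IE. Start Record, type a URL into the address bar, and press Enter, and it captures all communication between the browser and the server so that you can inspect every request and response. How to use the tool itself is not covered in this article.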

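When developing a simulated browser login, the most critical step is getting past the login verification. cURL supports HTTPS as well as HTTP; the only difference is the additional layer of SSL-encrypted transport. If you need to log in to an HTTPS site, make sure your PHP installation supports OpenSSL. Let's start by analyzing an example; the code is as follows: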

<?php
$discuz_url = 'http://127.0.0.1/discuz/'; // forum URL
$login_url = $discuz_url . 'logging.php?action=login'; // login page URL
$post_fields = array();
// The following two fields do not need to be changed
$post_fields['loginfield'] = 'username';
$post_fields['loginsubmit'] = 'true';
// Username and password (required)
$post_fields['username'] = 'tianxin';
$post_fields['password'] = '111111';
// Security question
$post_fields['questionid'] = 0;
$post_fields['answer'] = '';
// @todo CAPTCHA
$post_fields['seccodeverify'] = '';
// Fetch the login page and read the form's formhash
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$contents = curl_exec($ch);
curl_close($ch);
preg_match('/<input\s*type="hidden"\s*name="formhash"\s*value="(.*?)"\s*\/>/i', $contents, $matches);
if (!empty($matches)) {
    $formhash = $matches[1];
} else {
    die('formhash not found.');
}
// Submit the formhash together with the other login fields
$post_fields['formhash'] = $formhash;
// POST the login data and save the cookies; the cookie file is stored in the site's temp directory
$cookie_file = tempnam('./temp', 'cookie');
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_exec($ch);
curl_close($ch);
// With the cookie file saved, use it to simulate posting a thread; fid is the ID of the forum board
$send_url = $discuz_url . "post.php?action=newthread&fid=2";
$ch = curl_init($send_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
$contents = curl_exec($ch);
curl_close($ch);
// The regex differs slightly from the login page: here the hidden input has an extra id attribute
preg_match('/<input\s*type="hidden"\s*name="formhash"\s*id="formhash"\s*value="(.*?)"\s*\/>/i', $contents, $matches);
if (!empty($matches)) {
    $formhash = $matches[1];
} else {
    die('formhash not found.');
}
$post_data = array();
// Thread title
$post_data['subject'] = 'test2';
// Thread body
$post_data['message'] = 'test2';
$post_data['topicsubmit'] = "yes";
$post_data['extra'] = '';
// Thread tags
$post_data['tags'] = 'test';
// The thread's formhash is essential: without it, Discuz warns that the request came from an invalid page
$post_data['formhash'] = $formhash;
$ch = curl_init($send_url);
curl_setopt($ch, CURLOPT_REFERER, $send_url); // spoof the Referer header
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0); // 0 = print the server's response directly
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);
$contents = curl_exec($ch);
curl_close($ch);
// Clean up the cookie file
unlink($cookie_file);
?>
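The two formhash patterns in the listing each depend on the exact attribute order of the hidden input, which is why the posting page needs its own regex. As a minimal sketch, assuming the attributes are double-quoted, the lookup could instead check each `<input>` tag's attributes independently. The helper name `extract_formhash` is hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper (an assumption, not from the original script): returns
// the value of the hidden "formhash" input, or null if the page has none.
// Only double-quoted attributes are handled in this sketch.
function extract_formhash(string $html): ?string
{
    // Collect every <input ...> tag first.
    if (!preg_match_all('/<input\b[^>]*>/i', $html, $tags)) {
        return null;
    }
    foreach ($tags[0] as $tag) {
        // Test name and value separately, so attribute order and extra
        // attributes such as id="formhash" do not matter.
        if (preg_match('/\bname\s*=\s*"formhash"/i', $tag)
            && preg_match('/\bvalue\s*=\s*"([^"]*)"/i', $tag, $m)) {
            return $m[1];
        }
    }
    return null;
}
```

With a helper like this, both the login page and the posting page could be handled by the same call, for example `$formhash = extract_formhash($contents);`, followed by a `null` check in place of the `empty($matches)` test.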

Simulating a website login with cURL; the code is as follows:

<?php
$cookie_file = tempnam('./temp', 'cookie');
// Note: cURL needs an absolute URL, so prepend the forum's scheme and host to the paths below
$login_url = '/bbs/logging.php?action=login&loginsubmit=yes';
// Replace YOUR_USERNAME and YOUR_PASSWORD; formhash must match the value in the site's login form
$post_fields = 'username=YOUR_USERNAME&password=YOUR_PASSWORD&referer=index.php&formhash=24eca8af&loginfield=username&questionid=0&loginsubmit=登录';
// Log in and save the session cookies to the cookie file
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_exec($ch);
curl_close($ch);
// Request a page as the logged-in user by sending the saved cookies
$url = '/bbs';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
$contents = curl_exec($ch);
echo $contents;
curl_close($ch);
?>
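The second listing writes the POST body by hand as a single string. When `CURLOPT_POSTFIELDS` receives a string, cURL sends it unchanged as `application/x-www-form-urlencoded`, so any value containing `&`, `=`, or non-ASCII text needs to be URL-encoded first. (When it receives an array, as in the first listing, the request is sent as `multipart/form-data` instead.) A minimal sketch of building the string with `http_build_query` follows; the field values are placeholders chosen for illustration, not real credentials.

```php
<?php
// Placeholder values for illustration only (assumptions, not real data).
$fields = array(
    'username'    => 'YOUR_USERNAME',
    'password'    => 'p@ss&word=1',   // contains characters that must be escaped
    'loginfield'  => 'username',
    'questionid'  => 0,
    'loginsubmit' => '登录',           // non-ASCII value, percent-encoded below
);

// http_build_query URL-encodes every key and value and joins the pairs with "&".
$body = http_build_query($fields, '', '&');

echo $body, PHP_EOL;
```

The resulting `$body` could then be passed to `curl_setopt($ch, CURLOPT_POSTFIELDS, $body);` in place of the hand-written string.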

