The code is as follows:
<code>$cookie_file = tempnam('./temp', 'cookie');
$login_url = 'http://211.64.47.129/default_ysdx.aspx';
// STUDENT_ID and PASSWORD are placeholders for the real credentials
$post_fields = '__VIEWSTATE=dDw1MjQ2ODMxNzY7Oz7xlHJHd0KfeVRA2p7BXNto118wbQ==&TextBox1=STUDENT_ID&TextBox2=PASSWORD';

// Step 1: submit the login form and save the cookies to $cookie_file
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_exec($ch);
curl_close($ch);

// Step 2: request the main page using the saved cookies
$url = 'http://211.64.47.129/xs_main.aspx?xh=STUDENT_ID';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$contents = curl_exec($ch);
preg_match("/<li> (.*)/", $contents, $arr);
echo $arr[1];
curl_close($ch);</code>
But in the end I am sent back to the login page. I'm a beginner here; could someone more experienced please explain why?
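Based on the OP's requirements, I wrote the code below using my own HttpClient class. I don't have a test account, so I tested with test as both the username and the password, and the result was a login failure. The OP should only need to change the username and password in the code for it to work. Cookies are handled automatically by HttpClient/cURL during the HTTP requests.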
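To make "handled automatically" a bit more concrete, here is a minimal sketch of the idea, assuming the cURL extension is loaded. The helper name make_session_handle is hypothetical and not part of the class in this answer. The point is that one handle with both CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR set keeps cookies between requests, so the login cookie is sent again on the next request.

```php
<?php
// Hypothetical helper (illustration only): create one reusable cURL handle
// whose cookies persist across every request made with it.
function make_session_handle(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects after login
        CURLOPT_COOKIEFILE     => $cookieFile, // enable the cookie engine, load saved cookies
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies back when the handle is released
    ]);
    return $ch;
}

// The temp directory here is an assumption; the thread uses './temp'.
$cookieFile = tempnam(sys_get_temp_dir(), 'cookie');
$ch = make_session_handle($cookieFile);
// Any sequence of curl_setopt($ch, CURLOPT_URL, ...) + curl_exec($ch) calls
// on this handle now shares the same cookies.
```

Reusing a single handle avoids depending on the cookie file being written and re-read between two separate handles.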
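The OP's code looks mostly fine; my guess is that the login fails because the RadioButtonList1 and Button1 fields are not submitted. One side note: the second request sets CURLOPT_RETURNTRANSFER to 0, so curl_exec() prints the page directly and returns true instead of the HTML, which means preg_match never sees the page content.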
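As a rough sketch of what the complete form body could look like, the snippet below builds it with http_build_query, which also URL-encodes the values (the OP's hand-written string does not). The helper name build_login_body is hypothetical. The field names and the values for RadioButtonList1 and Button1 are the ones used in this answer; which byte encoding the server expects for the Chinese values is not stated in the thread, so this simply sends them in whatever encoding the script file is saved in.

```php
<?php
// Hypothetical helper (illustration only): assemble the login form body,
// including the two fields that the OP's request leaves out.
function build_login_body(string $viewState, string $user, string $pass): string
{
    return http_build_query([
        '__VIEWSTATE'      => $viewState,
        'TextBox1'         => $user,     // username / student ID
        'TextBox2'         => $pass,     // password
        'RadioButtonList1' => '学生',     // identity type ("student")
        'Button1'          => ' 登录 ',   // value of the submit button ("log in")
    ]);
}

// 'test' / 'test' are the dummy credentials mentioned in this answer.
$body = build_login_body('dDw1MjQ2ODMxNzY7Oz7xlHJHd0KfeVRA2p7BXNto118wbQ==', 'test', 'test');

// Decode it again to check that every field survives the round trip.
parse_str($body, $fields);
print_r($fields);
```

The resulting string can be passed to CURLOPT_POSTFIELDS as-is.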
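Also, although __VIEWSTATE looks like a fixed value, to be on the safe side it should be read from the login page first and then submitted with the login request.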
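A small sketch of that extraction step, which can be tried without touching the network. The helper name extract_viewstate and the sample markup are assumptions for illustration; the real login page may order or quote the attributes differently, in which case the pattern would need adjusting.

```php
<?php
// Hypothetical helper (illustration only): pull the __VIEWSTATE value out of
// the login page HTML, or return null when the hidden field is not found.
function extract_viewstate(string $html): ?string
{
    if (preg_match('/name="__VIEWSTATE"\s+value="([^"]*)"/', $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Sample markup is assumed; the value is the one from the OP's request.
$sample = '<input type="hidden" name="__VIEWSTATE" '
        . 'value="dDw1MjQ2ODMxNzY7Oz7xlHJHd0KfeVRA2p7BXNto118wbQ==" />';

var_dump(extract_viewstate($sample));          // the captured value
var_dump(extract_viewstate('<html></html>'));  // no hidden field present
```

Returning null instead of exiting lets the caller decide how to report the failure.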
Code attached:
<code><?php
class HttpClient
{
    private $ch;

    function __construct($cookie_jar)
    {
        $this->ch = curl_init();
        // User-Agent
        curl_setopt($this->ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; QQDownload 685; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)');
        curl_setopt($this->ch, CURLOPT_TIMEOUT, 40);
        curl_setopt($this->ch, CURLOPT_FOLLOWLOCATION, TRUE);
        curl_setopt($this->ch, CURLOPT_AUTOREFERER, true);
        curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, TRUE);
        // Read and write cookies through the same file
        curl_setopt($this->ch, CURLOPT_COOKIEJAR, $cookie_jar);
        curl_setopt($this->ch, CURLOPT_COOKIEFILE, $cookie_jar);
    }

    function __destruct()
    {
        curl_close($this->ch);
    }

    final public function setReferer($ref = '')
    {
        if ($ref != '') {
            curl_setopt($this->ch, CURLOPT_REFERER, $ref);
        }
    }

    final public function Get($url, $header = false, $nobody = false)
    {
        // Switch the reused handle back to GET after a previous POST
        curl_setopt($this->ch, CURLOPT_HTTPGET, true);
        curl_setopt($this->ch, CURLOPT_URL, $url);
        curl_setopt($this->ch, CURLOPT_HEADER, $header);
        curl_setopt($this->ch, CURLOPT_NOBODY, $nobody);
        return curl_exec($this->ch);
    }

    final public function Post($url, $data = array(), $header = false, $nobody = false)
    {
        curl_setopt($this->ch, CURLOPT_URL, $url);
        curl_setopt($this->ch, CURLOPT_HEADER, $header);
        curl_setopt($this->ch, CURLOPT_NOBODY, $nobody);
        curl_setopt($this->ch, CURLOPT_POST, true);
        curl_setopt($this->ch, CURLOPT_POSTFIELDS, http_build_query($data));
        return curl_exec($this->ch);
    }
}

const Login_URL = 'http://211.64.47.129/default_ysdx.aspx';

$http = new HttpClient(tempnam('./temp', 'cookie'));

// Request the login page first to obtain __VIEWSTATE
$html = $http->Get(Login_URL);
preg_match('/name="__VIEWSTATE" value="(.+?)"/', $html, $vs);
if (count($vs) !== 2) {
    echo 'Failed to get __VIEWSTATE';
    exit();
}

// Build the login form data
$data = array(
    '__VIEWSTATE'      => $vs[1],
    'TextBox1'         => 'username', // change to your username
    'TextBox2'         => 'password', // and password
    'RadioButtonList1' => '学生',      // identity type ("student")
    'Button1'          => ' 登录 '     // submit button value ("log in")
);
$html = $http->Post(Login_URL, $data);

// Check for an error; if the page contains an alert, print its message and exit
preg_match('/language=\'javascript\'>alert\(\'(.+?)\'\);/', $html, $err);
if (count($err) === 2) {
    echo $err[1];
    exit();
}

$sn = '123123'; // student ID
$html = $http->Get('http://211.64.47.129/xs_main.aspx?xh=' . $sn);
preg_match('/<li>\s*(.*)/', $html, $result);
var_dump($result);</code>
https://github.com/lndj/Lcrawl/tree/dev
An elegant crawler for the Zhengfang academic affairs system.