How to scrape real-time content from a web page

WBOY
Release: 2016-06-23 14:09:04

# URL: http://data.shishicai.cn/cqssc/haoma/
# Demo:
<?php
/* Created on [2013-5-1] Author[Newton] Filename[action.php] */

# Encoding conversion
function convToUtf8($str) {
	if (mb_detect_encoding($str, "UTF-8, ISO-8859-1, GBK") != "UTF-8") {
		return iconv("GBK", "utf-8", $str);
	} else {
		return $str;
	}
}

header("content-type:text/html;charset:utf-8");
error_reporting(E_ERROR);
$pages = file_get_contents('http://data.shishicai.cn/cqssc/haoma/');
//$pages = htmlspecialchars($pages);
$pages = convToUtf8($pages);
echo "pages-->>".print_r($pages);
echo PHP_EOL;
$doc = new DOMDocument();
$new_doc = new DOMDocument('1.0', 'utf-8');
echo "doc-->>".print_r($doc);
echo PHP_EOL;
$dom = $doc->getElementsByTagName('table');
$newdoc = $new_doc->loadhtml($dom->item(2)->nodeValue);
$table = $new_doc->saveHTML();
echo "table-->>{$table}".PHP_EOL;

# result:
# ……garbled text……
# pages-->>1 DOMDocument Object ( ) doc-->>1 table-->>
# table is empty……
?>
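One thing visible in the demo itself: `$doc` is created but the fetched `$pages` string is never loaded into it, so `getElementsByTagName('table')` has nothing to search. `print_r()` also prints its argument and returns `true`, which is where the stray `1` after `pages-->>` and `doc-->>` comes from. A minimal sketch of the load-then-query step follows; `extractTableHtml` and the sample markup are hypothetical, and it only helps for tables that are actually present in the fetched HTML.

```php
<?php
// Hypothetical helper: returns the outer HTML of the table at $index,
// or null when the document has no table at that position.
function extractTableHtml(string $html, int $index): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    // The XML declaration prefix hints that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="utf-8"?>' . $html);
    libxml_clear_errors();

    $table = $doc->getElementsByTagName('table')->item($index);
    // Passing a node to saveHTML() serializes just that subtree.
    return $table === null ? null : $doc->saveHTML($table);
}

// Made-up markup standing in for the fetched page.
$sample = '<html><body><table><tr><td>a</td></tr></table>'
        . '<table><tr><td>b</td></tr></table></body></html>';

echo extractTableHtml($sample, 1), PHP_EOL;
```

Checking for `null` before using the node avoids the fatal error that `$dom->item(2)->nodeValue` would raise when fewer than three tables exist.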


Replies and discussion (solutions)

The content I want to get is:

The corresponding code snippet:

The page data is filled in by JS. You need to crawl that JS script.
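A first step toward finding that script could be to list the external scripts the page references. This is only a sketch: `listScriptSources` is a hypothetical helper and the sample markup is made up, not taken from the real page.

```php
<?php
// Hypothetical helper: collects the src attribute of every <script> tag.
function listScriptSources(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $sources = [];
    foreach ($doc->getElementsByTagName('script') as $script) {
        // getAttribute() returns an empty string for inline scripts.
        $src = $script->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Made-up markup with one external and one inline script.
$sample = '<html><head><script src="/script/example.js"></script>'
        . '<script>var inline = 1;</script></head><body></body></html>';

print_r(listScriptSources($sample));
```

Each returned URL can then be fetched and read to see where the data comes from.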

Isn't that a rather tedious way to do it?

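It looks like a frame is embedded inside the tbody, and JS code is then used to build the HTML.
After opening http://datacache.shishicai.cn/script/2f67117ba1b58074.js,
searching for 'frame' gives 6 results.
With my skills I can't work out the frame's link from it.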
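Rather than searching the script by hand, one option is to pull out its quoted string literals and filter them by a keyword. The sketch below is an assumption-heavy illustration: `findStringLiterals` is a hypothetical helper, the regex only handles simple literals without escaped quotes, and the sample JS is invented.

```php
<?php
// Hypothetical helper: returns the distinct quoted string literals
// in $js that contain $needle (case-insensitive).
function findStringLiterals(string $js, string $needle): array
{
    // Matches simple '...' or "..." literals with no nested quotes.
    preg_match_all('/([\'"])([^\'"\r\n]*)\1/', $js, $matches);

    $hits = [];
    foreach ($matches[2] as $literal) {
        if (stripos($literal, $needle) !== false) {
            $hits[] = $literal;
        }
    }
    return array_values(array_unique($hits));
}

// Invented script text, standing in for the downloaded .js file.
$sampleJs = 'var api = "/handler/example.ashx"; var label = \'plain\'; '
          . 'post("/handler/example.ashx");';

print_r(findStringLiterals($sampleJs, '.ashx'));
```

Searching for path-like fragments such as a file extension tends to surface request URLs faster than searching for markup keywords like 'frame'.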

The OP looks like an expert too, with such a high tech score. Looking up in admiration.

http://data.shishicai.cn/handler/kuaikai/data.ashx
post: lottery=4&date=2013-05-06
Scrape this one.
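The endpoint above expects a POST rather than a plain GET. A minimal sketch of sending those two fields with PHP's stream wrapper follows; `buildPostContext` and `postForm` are hypothetical helpers, and the response format is not shown in this thread, so nothing is assumed about it.

```php
<?php
// Hypothetical helper: builds a stream context for a form-encoded POST.
function buildPostContext(array $fields, int $timeoutSeconds = 10)
{
    return stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            // Explicit '&' separator so ini settings cannot change the body.
            'content' => http_build_query($fields, '', '&'),
            'timeout' => $timeoutSeconds,
        ],
    ]);
}

// Hypothetical helper: sends the POST and returns the raw body,
// or false when the request fails.
function postForm(string $url, array $fields)
{
    return file_get_contents($url, false, buildPostContext($fields));
}

// Fields taken from the reply above.
$fields = ['lottery' => 4, 'date' => '2013-05-06'];

// Show the encoded request body without touching the network.
echo http_build_query($fields, '', '&'), PHP_EOL;
```

Calling `postForm('http://data.shishicai.cn/handler/kuaikai/data.ashx', $fields)` would send the request. Whether the server also requires particular headers or cookies is not stated in the thread.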

Fetching the link from the reply above returns a blank result……

source:php.cn