
Scraping Baidu hot search keywords with PHP (http://top.baidu.com/buzz/top10.htm)

WBOY
Release: 2016-06-06 19:39:47


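While developing a PHP site a while ago, I had to build a navigation page that shows Baidu hot keywords, that is, the TOP50 of the Baidu search ranking.

A for loop can pull the 50 entries out of the page.
Any of the URLs listed below can be scraped. The scraping relies on simple_html_dom.php.

Search Baidu for simple_html_dom.php, download it, and put it in the same directory as your code.
I am using ThinkPHP, so I placed it in the same directory as the Action.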
 
//http://top.baidu.com/buzz/top10.html
//http://top.baidu.com/buzz?b=1&c=513
//http://top.baidu.com/buzz?b=1&fr=topcategory_c513

ThinkPHP

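$now_url = 'http://top.baidu.com/buzz.php?p=top10';
$content = '';
if (function_exists('curl_init')) {
    // Preferred path: fetch the page with cURL
    $ch = curl_init($now_url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30); // time limit so a stalled request cannot hang the script
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");
    // curl_setopt($ch, CURLOPT_USERAGENT,
    //     "Baiduspider+(+http://www.baidu.com/search/spider.htm)");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $content = curl_exec($ch);
    curl_close($ch);
} elseif (function_exists('file_get_contents')) {
    // Fallback when the cURL extension is not available
    $content = file_get_contents($now_url);
} else {
    exit('Your server supports neither cURL nor file_get_contents, so scraping cannot start!');
}
// Both fetch functions return false on failure; stop before parsing nothing
if ($content === false || $content === '') {
    exit('Failed to fetch ' . $now_url);
}
include_once('simple_html_dom.php');
// Create a new DOM instance
$html = new simple_html_dom();
// Load the fetched page from the string
$html->load($content);
// Text nodes under .list-title, inside .keyword, inside a table
$new1 = $html->find('table .keyword .list-title text');
$keyArray = array();
// 20 entries here; raise the limit to 50 for the full TOP50 list.
// min() keeps the loop from reading past the end of the result list.
$limit = min(20, count($new1));
for ($i = 0; $i < $limit; $i++) {
    // The code assumes the page is GB2312-encoded and converts each keyword to UTF-8
    $item = iconv("GB2312", "UTF-8", $new1[$i] . '');
    $keyArray[] = $item;
}
// Pass the keyword list to the ThinkPHP template
$this->assign('keyArray', $keyArray);
// Release the memory held by the parser
$html->clear();
unset($html);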
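
If simple_html_dom.php is not at hand, the same extraction step can probably be done with PHP's built-in DOMDocument and DOMXPath classes. The sketch below is only an illustration under stated assumptions: the function name extractKeywords and the sample markup in $sampleHtml are made up for this example, and the real Baidu page may be structured differently. The XPath query is meant to mirror the selector 'table .keyword .list-title' used above, and the limit argument plays the role of the for loop bound.

```php
<?php
// Hypothetical sample markup for illustration only; the real page may differ.
$sampleHtml = <<<HTML
<table>
  <tr><td class="keyword"><a class="list-title" href="#">first keyword</a></td></tr>
  <tr><td class="keyword"><a class="list-title" href="#">second keyword</a></td></tr>
  <tr><td class="keyword"><a class="list-title" href="#">third keyword</a></td></tr>
</table>
HTML;

/**
 * Return at most $limit keyword strings found in $html.
 * (Hypothetical helper, not part of simple_html_dom or ThinkPHP.)
 */
function extractKeywords(string $html, int $limit): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // XPath counterpart of the CSS selector "table .keyword .list-title".
    $query = '//table'
        . '//*[contains(concat(" ", normalize-space(@class), " "), " keyword ")]'
        . '//*[contains(concat(" ", normalize-space(@class), " "), " list-title ")]';

    $keywords = [];
    foreach ($xpath->query($query) as $node) {
        if (count($keywords) >= $limit) {
            break; // stop once the requested number of entries is reached
        }
        $keywords[] = trim($node->textContent);
    }
    return $keywords;
}

// Take the first two keywords from the sample markup.
$keywords = extractKeywords($sampleHtml, 2);
print_r($keywords);
```

With a real page, the fetched $content would be passed in place of $sampleHtml, and a GB2312 page would still need the iconv conversion shown earlier.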
source:php.cn