
A simple web page scraper in PHP using fopen

Jun 02, 2016, 09:13 AM

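This scraper is a very simple program. In my opinion it is not suited to collecting large amounts of data, although fetching a single page is no problem, because the fopen function is far from ideal for remote file access and for multithreaded use. The author wrote it just for fun. The code is as follows: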

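<?php
// Assumption: the original snippet shows only private methods that are called
// through $this, so they are wrapped in a class here. The class name
// PageCollector is a placeholder and does not come from the original article.
class PageCollector
{
    /**
     * Fetch the content of a web page by URL.
     *
     * @param string $url link address
     * @return string page content, or an empty string if the URL cannot be opened
     */
    private function fetchbyurl($url) {
        // Opening a remote URL requires allow_url_fopen to be enabled.
        $handle = fopen($url, 'r');
        if ($handle === false) {
            return '';
        }
        $content = '';
        while (!feof($handle)) {
            $line = fgets($handle, 10000);
            if ($line === false) {
                break; // read error or end of stream
            }
            $content .= $line;
        }
        fclose($handle);
        return $content;
        // For GBK pages, convert the result instead:
        // return $content ? $this->utf8_iconv($content) : '';
    }

    /**
     * Convert GBK-encoded content to UTF-8.
     *
     * @param string $content GBK-encoded content
     * @return string
     */
    private function utf8_iconv($content) {
        return iconv('GBK', 'UTF-8', $content);
    }

    /**
     * Get all matching content.
     *
     * @param string $str content
     * @param string $start start delimiter
     * @param string $end end delimiter
     * @return array
     */
    private function strCutAll($str, $start, $end) {
        $content = explode($start, $str);
        $matches = array();
        $sum = count($content);
        // Index 0 holds the text before the first start delimiter, so skip it.
        for ($i = 1; $i < $sum; $i++) {
            $tmp = explode($end, $content[$i]);
            $matches[] = $tmp[0];
        }
        return $matches;
    }

    /**
     * Get the first matching content.
     *
     * @param string $str content
     * @param string $start start delimiter
     * @param string $end end delimiter
     * @return string
     */
    private function strCut($str, $start, $end) {
        $content = strstr($str, $start);
        if ($content === false) {
            return ''; // start delimiter not found
        }
        // Drop the start delimiter, then cut at the first end delimiter.
        $content = substr($content, strlen($start));
        $pos = strpos($content, $end);
        return $pos === false ? $content : substr($content, 0, $pos);
    }
}
?>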
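
The delimiter-based extraction can be tried without any network access. The sketch below restates the same explode-based idea as plain functions. The names cut_all and cut_first and the sample HTML are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical standalone versions of the delimiter helpers (names are illustrative).

/** Return every substring found between $start and $end. */
function cut_all(string $str, string $start, string $end): array
{
    $parts = explode($start, $str);
    $matches = [];
    // Skip $parts[0]: it is the text before the first $start.
    for ($i = 1; $i < count($parts); $i++) {
        $matches[] = explode($end, $parts[$i])[0];
    }
    return $matches;
}

/** Return the first substring between $start and $end, or '' if $start is missing. */
function cut_first(string $str, string $start, string $end): string
{
    $all = cut_all($str, $start, $end);
    return $all[0] ?? '';
}

// Made-up sample page, so no request is needed.
$html = '<html><head><title>Demo page</title></head>'
      . '<body><div class="context">Hello</div><div class="betterrelated">x</div></body></html>';

$titles = cut_all($html, '<title>', '</title>');
$body   = cut_first($html, '<div class="context">', '<div class="betterrelated">');

print_r($titles);     // Array ( [0] => Demo page )
echo $body, PHP_EOL;  // Hello</div>
```

Note that the extracted body still ends with the closing </div> of the first block, because the cut stops at the next opening tag. This is why a further filtering pass is usually needed.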
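/* Scraper: runs inside a method of the class that defines the helpers above */
header("Content-Type: text/html; charset=utf-8");
//$nr = file_get_contents('/webback/php/php-yi-ju-hua-hou-men-zhuan');
$nr = $this->fetchbyurl('/webback/php/php-yi-ju-hua-hou-men-zhuan');
// Recommended; cURL can also be used.
// Note: dump() is not a PHP built-in and is assumed to be a framework helper;
// var_dump() is the built-in alternative.
dump($this->strCut($nr, '<div class="context">', '<div class="betterrelated">'));
// Gets the content. Use preg_match_all if it needs further filtering.
dump($this->strCutAll($nr, '<title>', '</title>'));
// Gets the title.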
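
The comment above suggests preg_match_all for further filtering. A minimal sketch of that step follows, assuming the extracted fragment is HTML and that link targets and link texts are what is wanted. The function name extract_links, the pattern, and the sample fragment are illustrative assumptions. A regular expression like this only handles simple, well-formed anchors with double-quoted href attributes.

```php
<?php
/**
 * Hypothetical helper: collect href/text pairs from simple <a> tags.
 */
function extract_links(string $fragment): array
{
    $links = [];
    // PREG_SET_ORDER groups the captures per match: [full match, href, inner HTML].
    $pattern = '/<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/is';
    if (preg_match_all($pattern, $fragment, $found, PREG_SET_ORDER)) {
        foreach ($found as $match) {
            $links[] = [
                'href' => $match[1],
                // Remove nested tags such as <b> and surrounding whitespace.
                'text' => trim(strip_tags($match[2])),
            ];
        }
    }
    return $links;
}

// Made-up fragment standing in for the output of the content extraction step.
$fragment = '<p>See <a href="/a.html">First</a> and '
          . '<a class="x" href="/b.html"><b>Second</b></a>.</p>';

$links = extract_links($fragment);
print_r($links);
```

For pages with irregular markup, a real HTML parser such as PHP's DOMDocument is likely to be more robust than a regular expression.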