Home > php教程 > php手册 > body text

简单的PHP伪缓存并定时抓取某页面内容

WBOY
Release: 2016-06-13 09:38:21
Original
982 people have browsed it

需求:要抓取某个页面的一部分内容,然后iframe到别的页面去。iframe的时候,不需求每次都访问源页面,而是每天只读取源页面一次,并生成文件,iframe的时候只访问该暂时文件,也就是伪缓存啦。这么做适合访问量不大的页面,降低数据库访问压力。

程序设计如下:

<?php
function get_page_content()
{
	$url = "http://www.bkjia.com/";
	$contents = file_get_contents($url);
	//如果出现中文乱码使用下面代码
	//$getcontent = iconv("gb2312″, "utf-8″,$contents);
	//echo $contents;
	//$pos = strstr($contents, '<div class="hot_news">');
	//print_r($pos);
	$array = explode('<div class="hot_news">', $contents);
	$htmlarray = explode('<div class="car_tab border4">', $array[0]); // HTML部分
	$cssarray = explode('<div class="hometop">', $htmlarray[0]);
	$css_rem_inner = explode('<!--[if !IE]>导航<![endif]-->', $cssarray[0]);
	$css_min = explode('<script type="text/javascript" src="http://www.bkjia.com/ad_comm_t.js">', $css_rem_inner[0]);
	$str_css = $css_min[0];
	$head = '<base target="_blank"></base></head> ';
	$str_1 = '<div class="car_tab border4">';
	$str_html = $htmlarray[1]; 
	$content = $str_css.$head.$str_1.$str_html;
	return $content;
}
$cache_file = "tmp.html";
$cache_time = 60*60*24;
/**
ob_start();
echo $content;
file_put_contents($cacheFile,ob_get_contents());
ob_end_flush();
**/
echo date("Y-m-d H:i:s", time());
echo '<br />';
echo date("Y-m-d H:i:s", floor(@filemtime($cache_file)));
if(time() - $cache_time > floor(@filemtime($cache_file)) )
{
	$content = get_page_content();
	file_put_contents($cacheFile, $content);
	header('Location: http://www.bkjia.com/tmp.html');
}
else
{
	header('Location: http://www.bkjia.com/tmp.html');
}
?>
Copy after login

解释下:

$cache_time = 60*60*24; 缓存时间为一天。

if(time() - $cache_time > floor(@filemtime($cache_file)) ) 如果当前时间减去一天大于暂时文件的修改时间。

$content = get_page_content(); 就读取页面内容并重新生成暂时文件。

就这么简单。

Related labels:
source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Recommendations
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template