
Scraping Google keyword rankings with PHP

Jun 13, 2016 am 10:46 AM

 

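The idea: use PHP's curl functions and store cookies. Google's search pages cannot be fetched with file_get_contents; the request has to imitate a browser completely. Baidu is different: you can fetch the page directly with file_get_contents and then process it with a regular expression, so Baidu is not covered here.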
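A minimal sketch of the cookie part of that idea, not the author's code. `CURLOPT_COOKIEJAR` writes cookies to a file when the handle is closed, while `CURLOPT_COOKIEFILE` is what makes cURL send them back on later requests, so a browser-like session usually needs both. The helper name `browserLikeOptions` and the cookie path in the usage comment are assumptions made for illustration.

```php
<?php
// Hypothetical helper: cURL options that make a request look like a browser visit.
function browserLikeOptions(string $url, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5',
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies saved by earlier requests
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies back when the handle closes
    ];
}

// Usage (performs a real HTTP request, so it is left commented out):
// $ch = curl_init();
// curl_setopt_array($ch, browserLikeOptions('http://www.google.com/search?q=php', __DIR__ . '/temp/google.txt'));
// $html = curl_exec($ch);
// curl_close($ch);
```

Returning a plain options array keeps the configuration easy to inspect before any request is sent.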

 

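<?php
header("Content-Type: text/html;charset=utf-8");

// Look for $url_s in Google's results for $keyword, one page (10 results) at a time,
// and print the ranking once it is found. Gives up after page 10.
function ggsearch($url_s, $keyword, $page = 1) {
        $enKeyword = urlencode($keyword);
        $rsState = false;
        $page_num = ($page - 1) * 10; // "start" offset of the first result on this page

        if ($page <= 10) {
                $interface = "eth0:" . rand(1, 4); // rotate outgoing interfaces so Google is less likely to block one IP
                $cookie_file = dirname(__FILE__) . "/temp/google.txt"; // cookie storage
                $url = "http://www.google.com/search?q=$enKeyword&hl=en&prmd=imvns&ei=JPnJTvLFI8HlggeXwbRl&start=$page_num&sa=N";
                $ch = curl_init();
                curl_setopt($ch, CURLOPT_URL, $url);
                //curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // reuse the visitor's user agent
                curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5");
                curl_setopt($ch, CURLOPT_INTERFACE, $interface); // outgoing interface / IP address
                curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
                curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
                curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file); // send cookies saved earlier (COOKIEJAR alone only writes them)
                curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
                $contents = curl_exec($ch);
                curl_close($ch);

                // The HTML tags inside the two patterns below were stripped from the published page.
                // They are reconstructed on the ASSUMPTION that results sat in <div id="search">
                // and each title in <h3 class="r"><a href="...">; Google's current markup differs.
                $match = '!<div\s*id="search">(.*)</div>\s+<div\s*id="foot">!s';
                preg_match_all($match, (string) $contents, $line);
                foreach ($line[0] as $v) { // each() was removed in PHP 8
                        preg_match_all('!<h3\s+class="r"><a[^>]+>(.*?)</a>!', $v, $title);
                        $num = count($title[1]);
                        for ($i = 0; $i < $num; $i++) {
                                if (strstr($title[0][$i], $url_s)) {
                                        $rsState = true;
                                        $j = $i + 1; // position on this page
                                        $sum = $j + (($page) * 10 - 10); // overall rank
                                        //echo $contents;
                                        // The markup in these echo strings is also reconstructed (assumed <br>, <b>, <a>, <hr>).
                                        echo "Keyword: " . $keyword . "<br>" . "Rank: " . '<b>' . $sum . '</b>' . " #### " . "page " . '<b>' . $page . '</b>' . ", position " . '<b>' . $j . '</b>' . " " . $title[0][$i] . "<br>";
                                        echo "<a href='" . htmlspecialchars($url) . "' target='_blank'>" . "Open the search results" . "</a>" . "<br>";
                                        echo "<hr>";
                                        break;
                                }
                        }
                }
                unset($contents);
                if ($rsState === false) {
                        ggsearch($url_s, $keyword, ++$page); // not found on this page: try the next one
                }
        } else {
                echo 'Keyword ' . $keyword . ': the site does not rank within the first 10 pages' . '<br>';
                echo "<hr>";
        }
}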

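if (!empty($_POST['submit'])) {
        $time = explode(' ', microtime());
        $start = $time[0] + $time[1];
        $more_key = trim($_POST['textarea']);
        $url_s = trim($_POST['url']);
        if (!empty($more_key) && !empty($url_s)) {
                /* Work out how the keywords are separated */
                if (strstr($more_key, "\n")) {
                        $exkey = explode("\n", $more_key);
                }
                if (strstr($more_key, "|")) {
                        $exkey = explode("|", $more_key);
                }
                if (!strstr($more_key, "\n") && !strstr($more_key, "|")) {
                        $exkey = array($more_key);
                }
                /* Check for a leading http:// or www */
                // The comparison was cut off in the source; "<= 2" is assumed (a bare domain such as example.com).
                if (count(explode('.', $url_s)) <= 2) {
                        // ltrim() takes a character list, not a prefix, so strip the scheme and "www." with a regex,
                        // and store the result in $url_s, which is the variable actually passed on.
                        $url_s = preg_replace('#^(https?://)?(www\.)?#i', '', $url_s);
                        $url_s = 'www.' . $url_s;
                }
                foreach ($exkey as $keyword) {
                        ggsearch($url_s, trim($keyword)); // trim() removes a stray "\r" from Windows line endings
                }
                $endtime = explode(' ', microtime());
                $end = $endtime[0] + $endtime[1];
                echo '<hr>';
                echo 'Run time (seconds): ';
                echo $end - $start;
                //die();
        }
}
?>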
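The ranking arithmetic in `ggsearch` can be checked without touching the network. The sketch below is an illustration rather than part of the original script: `findRank` is a hypothetical helper that takes the result links already extracted from one page and returns the overall rank, or `null` when the site is not on that page. It assumes 10 results per page, as the script does.

```php
<?php
// Hypothetical helper: overall rank of $site among one page of result links.
function findRank(array $links, string $site, int $page, int $perPage = 10): ?int
{
    foreach (array_values($links) as $i => $link) {
        // Substring test like the script's strstr(), but case-insensitive.
        if (stripos($link, $site) !== false) {
            return ($page - 1) * $perPage + $i + 1;
        }
    }
    return null; // not on this page
}

$links = ['http://a.example/', 'http://www.example.com/page', 'http://b.example/'];
var_dump(findRank($links, 'www.example.com', 3)); // int(22)
var_dump(findRank($links, 'missing.example', 3)); // NULL
```

Separating the arithmetic from the fetching also makes it easier to swap the extraction step when the result markup changes.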

Fetch rankings

Keywords:

Format, for example: keyword1|keyword2|keyword3

or one keyword per line:

keyword1

keyword2

keyword3
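Both input formats can be handled in one step. A minimal sketch, assuming the hypothetical helper name `splitKeywords`: split on either `|` or a line break, trim each piece, and drop empty entries, which also copes with Windows line endings and mixed separators.

```php
<?php
// Hypothetical helper: turn the textarea contents into a clean keyword list.
function splitKeywords(string $input): array
{
    // Split on "|" or on any line break (\R matches \n, \r\n and \r).
    $parts = preg_split('/\||\R/', $input);
    $parts = array_map('trim', $parts);
    // Drop empty strings, then reindex from 0.
    return array_values(array_filter($parts, fn($p) => $p !== ''));
}

print_r(splitKeywords("keyword1|keyword2|keyword3"));
print_r(splitKeywords("keyword1\r\nkeyword2\nkeyword3"));
```

Both calls print the same three-element array, so the caller no longer needs to detect which separator was used. The arrow function requires PHP 7.4 or later.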

 

 

 

URL:

 

                       


 

 

 

Source: Shine的圣天堂-〃敏〃
