PHP website crawling library
<?php
header("Content-Type: text/html; charset=UTF-8");
require("phpQuery.php"); // phpQuery DOM parser; the QueryList class must also be loaded if this file does not define it
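// Grab the text of every ".unit h1" element on the page.
$hj = QueryList::Query('http://mobile.csdn.net/',array("title"=>array('.unit h1','text')));
// print_r($hj->data);

// Collect the src attribute of every <img> on the page.
$data = QueryList::Query('http://cms.querylist.cc/bizhi/453.html',array(
    'image' => array('img','src')
    ))->data;

// Collect the href attribute of every <a> on the list page.
$data = QueryList::Query('http://cms.querylist.cc/google/list_1.html',array(
    'link' => array('a','href')
    ))->data;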
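
// Scrape one article page: title, date, and body (with its images saved locally).
$page = 'http://cms.querylist.cc/news/566.html';
$reg = array(
    'title' => array('h1','text'),
    // '-span -a' strips those elements first; the callback then keeps the first space-separated token.
    'date' => array('.pt_info','text','-span -a',function($content){
        $arr = explode(' ',$content);
        return $arr[0];
    }),
    'content' => array('.post_content','html','a -.content_copyright -script',function($content){
        // Download each image and point its src at the local copy.
        // The target directory 'w/' must already exist and be writable.
        $doc = phpQuery::newDocumentHTML($content);
        $imgs = pq($doc)->find('img');
        foreach ($imgs as $img) {
            $src = 'http://cms.querylist.cc'.pq($img)->attr('src');
            $localSrc = 'w/'.md5($src).'.jpg';
            $stream = file_get_contents($src);
            if ($stream === false) {
                continue; // download failed: leave the original src untouched
            }
            file_put_contents($localSrc,$stream);
            pq($img)->attr('src',$localSrc);
        }
        return $doc->htmlOuter();
    })
    );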
// Apply the rules only inside elements matching this range selector.
$rang = '.content';
$ql = QueryList::Query($page,$reg,$rang);
$data = $ql->getData();
print_r($data);

This library supports crawling websites and scraping their content. It is an open-source, server-side project written in PHP that lets developers process DOM document content easily, for example to extract the headlines from a news website. It also borrows the idea behind jQuery: you select and process page content with selectors, just as you would in jQuery, to get the information you want.
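
To make the selector-based idea concrete without depending on any third-party package, here is a minimal sketch that uses only PHP's built-in DOM extension (DOMDocument and DOMXPath). It is not the library's own implementation; the helper name extractText and the sample markup are assumptions made up for illustration. It returns the text of every element with a given tag name and class, which is roughly what a rule such as array('h1.headline','text') would ask for.

```php
<?php
// Hypothetical helper (not part of any library): return the trimmed text of
// every element that has the given tag name and CSS class.
function extractText(string $html, string $tag, string $class): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally, since real-world HTML is often imperfect.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // XPath equivalent of the CSS selector "tag.class".
    $query = sprintf(
        '//%s[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $tag,
        $class
    );

    $results = [];
    foreach ($xpath->query($query) as $node) {
        $results[] = trim($node->textContent);
    }
    return $results;
}

// Inline sample markup, so the sketch runs without network access.
$html = '<html><body>'
      . '<div class="unit"><h1> First headline </h1></div>'
      . '<h1 class="headline big">Second headline</h1>'
      . '<h1 class="headline">Third headline</h1>'
      . '</body></html>';

$titles = extractText($html, 'h1', 'headline');
print_r($titles);
```

The class test is written out in XPath because DOMXPath does not accept CSS selectors; libraries in the jQuery style typically perform a translation of this kind for you.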
