Using the Workerman timer in ThinkPHP 5 to crawl site content on a schedule (code)

不言
Release: 2023-04-03 12:34:01

This article shows how to use the Workerman timer in ThinkPHP 5 to crawl news items from a site at regular intervals, with the code for each step.

1. First, install Workerman through Composer. The ThinkPHP 5 complete development manual covers this in detail under Extensions > Composer packages > Workerman:

# Run the following command in the project root directory
composer require topthink/think-worker

2. Create the service startup file server.php in the project root directory:
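
The listing for this step is not present in this copy of the article. As a rough sketch only, a startup file for think-worker on ThinkPHP 5.0 usually looks something like the following. The bound name server/Worker is inferred from step 3, and the paths assume the default ThinkPHP 5.0 layout with application/ and thinkphp/ in the project root.

```php
#!/usr/bin/env php
<?php
// Sketch only: assumes the default ThinkPHP 5.0 directory layout.
define('APP_PATH', __DIR__ . '/application/');
// Bind the CLI entry to the Worker controller of the server module
// (name inferred from step 3, not taken from the original listing).
define('BIND_MODULE', 'server/Worker');
// Load the framework bootstrap file
require __DIR__ . '/thinkphp/start.php';
```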


3. Create a server module under the application directory, and in that module create the controller Worker.php, which calls add_timer() to register the timer:

add_timer();
    }


}
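
Only the tail of this listing is visible. For orientation, a controller of this kind could be sketched as below. This is not the author's code: it assumes think-worker 1.x for ThinkPHP 5.0 (whose think\worker\Server base class registers onWorkerStart as a Workerman callback), a Collection class in the same namespace that provides add_timer() (step 4), and a hypothetical listen address.

```php
<?php
// Sketch only. Assumes think-worker 1.x (ThinkPHP 5.0) installed via Composer.
namespace app\server\controller;

use think\worker\Server;

class Worker extends Server
{
    // Hypothetical listen address; the base class uses it to create the worker.
    protected $socket = 'websocket://0.0.0.0:2346';

    // Called once in each worker process when it starts.
    public function onWorkerStart($worker)
    {
        // add_timer() is assumed to wrap a call such as
        // \Workerman\Lib\Timer::add($seconds, [$collection, 'index']),
        // which re-runs the crawl every $seconds seconds (Workerman 3.x class name).
        $collection = new Collection();
        $collection->add_timer();
    }
}
```

Registering the timer inside onWorkerStart rather than at file load time matters because Workerman timers only run inside a started worker's event loop.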

4. Create the Collection.php class, which fetches the news items, validates them, and saves them:

get_jinse();
        return json(['msg'=>"Collected $total records in this run.",'total'=>$total]);
    }

  
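    /**
     * Fetch the latest news flashes from 金色财经 (Jinse Finance)
     * and return the number of items saved.
     */
    public function get_jinse(){
        $url="https://api.jinse.com/v4/live/list?limit=20";
        $data=json_decode($this->get_curl($url));

        // Stop early if the request failed or the response has an unexpected shape
        if(empty($data->list[0]->lives)){
            return 0;
        }
        $data=$data->list[0]->lives;

        $validate=validate('Article');
        $items=[];

        foreach ($data as $v){
            // Each item reads "【title】body"; skip items that do not match
            if(!preg_match('/【(.+?)】(.+)/u',$v->content,$content)){
                continue;
            }

            $list=array(
                'source_id'=>$v->id,
                'source'=>'金色财经',
                // Drop everything up to the last "|" in front of the title
                'title'=>trim(preg_replace('/.*\|/','',$content[1])),
                'content'=>$content[2],
            );
            // Keep only items that pass the Article validator
            if($validate->check($list)){
                $items[]=$list;
            }
        }
        if($items){
            // Save in reverse of the order received
            krsort($items);
            $model=new ArticleModel();
            $model->saveAll($items);
        }
        return count($items);
    }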
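    public function get_curl($url){
        $ch=curl_init();
        // Certificate verification is switched off here; enable it outside of testing
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($ch,CURLOPT_URL,$url);
        curl_setopt($ch,CURLOPT_HEADER,0);
        curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
        // Give up after 10 seconds so a hung request cannot block the timer
        curl_setopt($ch,CURLOPT_TIMEOUT,10);
        $output = curl_exec($ch);

        if($output === FALSE ){
            echo "CURL Error:".curl_error($ch);
        }
        // Release the cURL handle
        curl_close($ch);

        return $output;
    }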
  
}
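
To make the title/body split in get_jinse() concrete, here is a small standalone sketch that applies the same two regular expressions to a made-up string. The helper name split_flash() and the sample text are illustrative assumptions and are not part of the original class.

```php
<?php
// Illustrative helper (hypothetical name): splits "【title】body" the way get_jinse() does.
function split_flash(string $content): ?array
{
    // Group 1: the text inside the 【】 brackets; group 2: everything after them.
    if (!preg_match('/【(.+?)】(.+)/u', $content, $m)) {
        return null; // No bracketed title, so the item would be skipped.
    }
    return [
        // Remove everything up to and including the last "|" in the title, then trim.
        'title'   => trim(preg_replace('/.*\|/', '', $m[1])),
        'content' => $m[2],
    ];
}

$result = split_flash('【Markets | Example headline】Example body text.');
echo $result['title'], "\n";   // Example headline
echo $result['content'], "\n"; // Example body text.
var_dump(split_flash('No bracketed title here')); // NULL
```

The lazy `(.+?)` stops at the first closing bracket, so a stray 】 later in the body does not end up in the title.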

5. Start the service from the project root directory: php server.php start (append -d to run it in the background as a daemon)

source:php.cn