Using simple_html_dom to crawl and display the entire novel in laravel

L先生

May 07, 2020 pm 02:14 PM

laravel

As mentioned in "Programmers also read novels with advertisements", many novel websites are cluttered with very annoying advertisements, or wrap the whole page's div in a link, so that an accidental tap jumps to some other website, sometimes in an endless loop. Some mobile apps are also full of ads. This article applies the approach from that article to the Laravel framework. It is best to read the previous article first and then deploy this yourself.

1. Introduce third-party classes into Laravel

1. In the app directory under the project root, create a new folder named Lib (you can choose your own name).

2. If you bring in many third-party libraries, you can create several subdirectories under Lib to organize them. Since only one class is introduced here, no extra folder is created. (Decide for yourself based on how many classes you import.)

Copy simple_html_dom.php into Lib.

3. Find the composer.json file in the project root and add the path of the third-party class to the classmap under autoload so that it is loaded automatically:

"autoload": {
    "classmap": [
        "database/seeds",
        "database/factories",
        "app/Lib/simple_html_dom.php"
    ]
},

4. In a command-line console, switch to the project root directory and run:

composer dump-autoload

5. Use this class in the controller:

use simple_html_dom;

$html = new simple_html_dom();

2. Create the route

Route::get('/novel_list', 'index\Spnovel@index');

3. Create the controller Spnovel.php

<?php
namespace App\Http\Controllers\index;

use simple_html_dom;
use Illuminate\Http\Request;
use App\Http\Controllers\Controller;

class Spnovel extends Controller
{
	// Fetch the chapter-list page and pass the parsed chapters to the view.
	public function index(){
		$url = "https://www.7kzw.com/85/85445/";
		$list_html = mySpClass::getCurl($url);
		$data['List'] = self::getList($list_html);
		return view('index.spnovel.index',$data);
	}

	// Parse the list page into an array of [chapter name, chapter address].
	private static function getList($list_html){
		$html = new simple_html_dom();
		@$html->load($list_html);
		$list = $html->find('#list dd a');
		$content = [];
		$p1 = '/<a .*?>(.*?)<\/a>/i';             // captures the chapter name
		$p2 = '/<a .*? href="(.*?)">.*?<\/a>/i';  // captures the chapter address
		foreach ($list as $v) {
			$arr1 = $arr2 = [];
			// Skip any link that does not match both patterns.
			if (preg_match($p1,$v->outertext,$arr1) && preg_match($p2,$v->outertext,$arr2)) {
				$content[] = [$arr1[1], $arr2[1]];
			}
		}
		array_splice($content,0,12); // discard the first 12 entries
		return $content;
	}
}

class mySpClass{
	// Send a simple GET request to the server and return the response body.
	public static function getCurl($url,$header=null){
		// 1. Initialize with the address to request
		$ch = curl_init($url);
		// 2. Set options
		curl_setopt($ch,CURLOPT_RETURNTRANSFER,1); // return the response as a string instead of printing it (required)
		curl_setopt($ch,CURLOPT_TIMEOUT,10); // timeout in seconds (required)
		curl_setopt($ch,CURLOPT_HEADER,0); // 1 includes the response headers in the output, 0 leaves them out
		curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false); // do not verify the certificate
		curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,0); // do not verify the certificate host name
		if(!empty($header)){
			curl_setopt($ch,CURLOPT_HTTPHEADER,$header); // set the request headers
		}else{
			$_head = [
				'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0'
			];
			curl_setopt($ch,CURLOPT_HTTPHEADER,$_head);
		}
		// 3. Execute
		$res = curl_exec($ch);
		// 4. Close
		curl_close($ch);
		return $res;
	}
}

Explanation of the above code: First of all, you need to be familiar with the Laravel framework and PHP classes.

After the route above is accessed, the index method of the Spnovel.php controller runs. $url is the address of a novel's chapter list. It is passed to the getCurl method of the custom class mySpClass, which returns the HTML document of that page as a string. Next, the getList method of the controller runs with that HTML string as its parameter. This method is private; it parses the document with simple_html_dom and uses regular expressions to extract each chapter's name and URL address, then returns them as an array. Finally, return view('index.spnovel.index',$data); renders index/spnovel/index.blade.php, which is shown in the next section.
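To make the extraction step concrete, here is a minimal standalone sketch that runs on made-up anchor strings rather than a live page. The sample markup, the helper name extractChapter, and the single combined pattern are assumptions for illustration and are not part of the controller. One thing the sketch brings out: the article's second pattern expects some attribute before href, so a plain <a href="..."> does not match it, while the more tolerant pattern below accepts both forms.

```php
<?php
// Hypothetical helper: pull the link text and href out of one <a> element.
// Returns [title, path] in the same order the controller stores them,
// or null when the string is not an anchor with a double-quoted href.
function extractChapter(string $outerHtml): ?array
{
    // [^>]*? skips any attributes before href; the "s" modifier lets the
    // link text span line breaks.
    $pattern = '/<a\b[^>]*?\bhref="([^"]*)"[^>]*>(.*?)<\/a>/is';
    if (!preg_match($pattern, $outerHtml, $m)) {
        return null;
    }
    return [trim($m[2]), $m[1]];
}

// Made-up sample links, shaped like the chapter paths used in this article.
$plain  = '<a href="/85/85445/27248645.html">Chapter 1</a>';
$styled = '<a style="" href="/85/85445/27248646.html">Chapter 2</a>';

// The article's second pattern, for comparison.
$p2 = '/<a .*? href="(.*?)">.*?<\/a>/i';

var_dump(extractChapter($plain));
var_dump(extractChapter($styled));
var_dump(preg_match($p2, $plain));   // int(0): no attribute precedes href
var_dump(preg_match($p2, $styled));  // int(1)
```

If the source page always places another attribute before href, the original patterns are sufficient; the tolerant pattern is only a safeguard in case the markup differs.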

4. Create the view index.blade.php

<!DOCTYPE html>
<html>
<head>
	<title>Crawled novel list</title>
	<style type="text/css">
	body{padding:0px;margin:0px;}
	#lists{width:100%;padding:30px 50px;box-sizing:border-box;}
	ul{margin:0;padding: 0;overflow:hidden;}
	ul li{list-style:none;display:inline-block;float:left;width:25%;color:#444;}
	ul li:hover{color:#777;cursor: pointer;}
	img {z-index:-1;width:100%;height:100%;position:fixed;}
	</style>
</head>
<body>
	<img src="/static/img/index/novelbg.jpg" alt="">
	<div id="lists">
		<ul>
			@foreach($List as $item)
			<li>
			<a href="/novel_con{{$item[1]}}">{{$item[0]}}</a>
			</li>
			@endforeach
		</ul>
	</div>
</body>
</html>

Explanation of the above code: The CSS here is kept simple, and the img serves as the background image. Inside the ul, the loop outputs one li per chapter: {{$item[1]}} is the chapter's address parameter, and {{$item[0]}} is the chapter name. The final effect is shown in the Run step.
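For reference, the $List array handed to the view has roughly the following shape. This is a sketch with made-up chapter names; only the path format follows the example URLs in this article. The loop is a plain-PHP stand-in for the Blade @foreach, with htmlspecialchars assumed as an approximation of the escaping that {{ }} applies.

```php
<?php
// Made-up sample of what getList() returns: one [title, path] pair per chapter.
$List = [
    ['Chapter 1', '/85/85445/27248645.html'],
    ['Chapter 2', '/85/85445/27248646.html'],
];

// Plain-PHP equivalent of the @foreach loop in the view.
$links = [];
foreach ($List as $item) {
    // The stored path already starts with "/", so it is appended directly.
    $href = '/novel_con' . $item[1];
    $links[] = '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
             . htmlspecialchars($item[0], ENT_QUOTES, 'UTF-8') . '</a>';
}

echo implode("\n", $links), "\n";
```

Each generated href has the form /novel_con/{a}/{b}/{c}, which is the shape that the chapter route later in the article expects.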


5. Run

[Screenshot: the rendered chapter list]

The following covers the content of each chapter.

Look at the route first:

Route::get('/novel_con/{a}/{b}/{c}', 'index\Spnovel@get_nContent');

This route matches the URL parameters of each chapter. For example, the address of one chapter is: novel_con/85/85445/27248645.html
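As a rough sketch of how the three route parameters are used, the helper below rebuilds the source address from a, b, and c and derives the link to the following chapter. The function name chapterUrls is hypothetical. The next-chapter link rests on the assumption, also made by the get_nContent method in this article, that chapter file names are consecutive integers, which may not hold for every chapter.

```php
<?php
// Hypothetical helper mirroring what get_nContent does with {a}/{b}/{c}.
function chapterUrls(string $a, string $b, string $c): array
{
    // Address of the chapter on the source site.
    $source = 'https://www.7kzw.com/' . $a . '/' . $b . '/' . $c;

    // (int) keeps the leading digits of "27248645.html" and drops the rest.
    $nextId = (int) $c + 1;

    // Local route for the following chapter (assumes consecutive IDs).
    $next = '/novel_con/' . $a . '/' . $b . '/' . $nextId . '.html';

    return ['source' => $source, 'next' => $next];
}

$urls = chapterUrls('85', '85445', '27248645.html');
echo $urls['source'], "\n"; // https://www.7kzw.com/85/85445/27248645.html
echo $urls['next'], "\n";   // /novel_con/85/85445/27248646.html
```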

Write the get_nContent method:

	public function get_nContent(Request $req){
		// Rebuild the chapter address on the source site from the three route parameters.
		$url1 = $req->a.'/'.$req->b.'/'.$req->c;
		$url = "https://www.7kzw.com/".$url1;
		$res = mySpClass::getCurl($url); // fetch the chapter page
		// Start parsing
		$data['artic'] = self::getContent($res);
		// Next chapter: (int) keeps the leading digits of a value such as "27248645.html"
		$next = (int)$req->c;
		$next = $next+1;
		$data['artic']['next'] = "/novel_con/".$req->a.'/'.$req->b.'/'.$next.'.html';
		return view('index.spnovel.ncontent',$data);
	}

	private static function getContent($get_html){
		$html = new simple_html_dom();
		@$html->load($get_html);
		// Defaults, in case the expected elements are missing from the page
		$artic = ['title' => '', 'content' => ''];
		$content = '';
		$h1 = $html->find('.bookname h1');
		foreach ($h1 as $k=>$v) {
			$artic['title'] = $v->innertext;
		}
		// Find the body text of the chapter
		$divs = $html->find('#content');
		foreach ($divs as $k=>$v) {
			$content = $v->innertext;
		}
		// Use a regex replacement to strip the unwanted parts
		$pattern = "/(<p>.*?<\/p>)|(<div .*?>.*?<\/div>)/";
		$artic['content'] = preg_replace($pattern,'',$content);
		return $artic;
	}

Explanation: $req->a, $req->b, and $req->c are the three route parameters. They are joined into a complete address for the requested chapter, and mySpClass::getCurl fetches that chapter's HTML string. getContent in this class then parses the page: as in the previous article, it extracts the chapter's title and content, writes them into an array, and removes the leftover text-ad markup. $next holds the address of the next chapter and is used for the jump link on the chapter page.
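The cleanup step can be tried in isolation. The sketch below applies the same pattern to a made-up content string (the ad markup is invented for illustration). It also shows a variant with the s modifier, because without it the dot does not match line breaks, so a <div> that spans several lines would be left in place.

```php
<?php
// Made-up chapter HTML: body text plus two inline ad fragments.
$content = 'First line.<br />'
         . '<p>Visit our site!</p>'
         . 'Second line.<br />'
         . "<div class=\"ad\">\nBuy now\n</div>"
         . 'Third line.';

// Pattern from getContent(): removes <p>...</p> and <div ...>...</div>.
$pattern = "/(<p>.*?<\/p>)|(<div .*?>.*?<\/div>)/";
$singleLine = preg_replace($pattern, '', $content);

// Same pattern with the "s" modifier so "." also matches newlines.
$multiLine = preg_replace($pattern . 's', '', $content);

echo $singleLine, "\n"; // the multi-line <div> is still present
echo $multiLine, "\n";  // both ad fragments are gone
```

Whether the modifier is needed depends on how the source site formats its markup; if the ad blocks always sit on one line, the original pattern behaves the same.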

The view ncontent.blade.php:

<!DOCTYPE html>
<html>
<head>
	<title>{{$artic['title']}}</title>
	<style type="text/css">
	h2{text-align:center;padding-top:30px;}
	div{margin:20px 50px;font-size:20px;}
	img {z-index:-1;width:100%;height:100%;position:fixed;}
	.next {position:fixed;right:10px;bottom:20px;background:coral;border-radius:3px;padding:4px;}
	.next:hover{color:#fff;}
	</style>
</head>
<body>
	<img src="/static/img/index/novelbg.jpg" alt="">
	<h2 id="artic-title">{{$artic['title']}}</h2>
	<a href="{{$artic['next']}}" class="next">Next chapter</a>
	<div>
		{!!$artic['content']!!}
	</div>
</body>
</html>

Explanation: Because there is only one chapter here, no loop is needed. {{$artic['title']}} is the chapter title, and it is also written into the page's title tag. The {!!$artic['content']!!} form outputs the chapter content without escaping it; otherwise the HTML markup inside the content would be displayed as literal characters. The address for the next-chapter button is passed in directly. position:fixed pins the button in place, so you can jump to the next chapter at any time.
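To illustrate the difference between the two Blade output forms outside of Laravel, the sketch below uses plain PHP. It assumes that {{ }} behaves like htmlspecialchars and {!! !!} like a raw echo; the sample string is made up. Because the content comes from a third-party site, printing it raw also prints any script it contains, so the last step shows one possible precaution with strip_tags. That precaution is a suggestion and not something the original code does.

```php
<?php
// Made-up fragment standing in for $artic['content'].
$content = 'Line one.<br />Line two.<script>alert(1)</script>';

// Roughly what {{ $content }} prints: markup shown as literal text.
$escaped = htmlspecialchars($content, ENT_QUOTES, 'UTF-8');

// Roughly what {!! $content !!} prints: markup passed through untouched.
$raw = $content;

// Optional precaution: keep <br> tags only. strip_tags removes the
// <script> tags themselves but leaves the text that was between them.
$filtered = strip_tags($content, '<br>');

echo $escaped, "\n";
echo $raw, "\n";
echo $filtered, "\n";
```

With the escaped form the reader would see the tags as text, which is why the view uses the unescaped form for the chapter body.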

Run:

[Screenshot: the rendered chapter page]

Summary: The most important part of this article is introducing a third-party class and being able to apply it, along with the basics of Laravel. I am more used to working with just the controller and the view; if you use a model, please write and verify that part yourself.

This is enough for a single novel. Of course, it can be extended to list the novels of the entire site; passing the appropriate parameters along would make it even more complete.
