Do programmers still read novels with advertisements?

May 06, 2020 pm 06:41 PM
programmer

Some people like to read novels online and dip into a few chapters now and then. The pages usually turn up through a Baidu search, and almost all of them carry very annoying advertisements. Some wrap the whole div in a link, so an accidental tap jumps to another website or even into an endless redirect loop. Some mobile apps are full of ads as well. So, with some spare time on my hands, I wrote a small program to get away from the ads.

This article uses PHP curl to fetch the page and simple_html_dom to parse it, so that the ads are actually removed.

Pick a book on any novel website. The site used here is particularly troublesome on mobile phones because of the problems described above:

We will operate on this novel. (Disclaimer: this is definitely not promotion, and it will be removed in case of infringement.)

1. Understand the get method of curl

curl is a command line tool that uploads or downloads data through a specified URL and displays it. The c in curl stands for client, and URL is the address being requested.

With cURL in PHP you can send both GET and POST requests.

A simple crawl of a novel only needs GET.

The following sample code fetches the HTML of the novel's first chapter page through a GET request. To fetch another page, you only need to change the URL.

The steps are: initialize, set options (including certificate verification), execute, close.

<?php
header("Content-Type:text/html;charset=utf-8");
$url = "https://www.7kzw.com/85/85445/27248636.html";
// 1. Initialize
$ch = curl_init($url);
// 2. Set options
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response as a string instead of printing it (required)
curl_setopt($ch, CURLOPT_TIMEOUT, 10); // timeout in seconds (required)
curl_setopt($ch, CURLOPT_HEADER, 0); // 1 includes the response headers in the output, 0 omits them
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // do not verify the certificate
// 3. Execute
$res = curl_exec($ch);
// 4. Close
curl_close($ch);
print_r($res);
?>

The comments walk through each step of sending a curl GET request. For a POST request, you would additionally set the POST option and pass the parameters. Finally the script prints what it fetched: the page HTML, shown without any CSS styling.
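
As a rough illustration of the difference between the two request types, the sketch below builds the option array for either one. The helper name build_curl_options and the form fields are made up for this example and are not part of the original script; it assumes the PHP curl extension is loaded so that the CURLOPT_* constants exist.

```php
<?php
// Hypothetical helper (not from the original article): returns the cURL
// options for a GET or POST request as an array for curl_setopt_array().
function build_curl_options(string $method, array $fields = []): array
{
    $options = [
        CURLOPT_RETURNTRANSFER => true,  // return the body as a string
        CURLOPT_TIMEOUT        => 10,    // give up after 10 seconds
        CURLOPT_HEADER         => false, // do not include response headers
    ];
    if (strtoupper($method) === 'POST') {
        // The two extra settings a POST request needs.
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($fields);
    }
    return $options;
}

// Example field names are assumptions, chosen only for illustration.
$options = build_curl_options('POST', ['keyword' => 'novel', 'page' => 2]);
```

To send the request, the array would be passed to curl_setopt_array($ch, $options) on a handle created with curl_init(), followed by curl_exec($ch) as in the GET example.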

2. Parse the page

The fetched page contains a lot of unnecessary content. To extract only what we need, such as the title and the text of each chapter, we have to parse the page.

There are many ways to parse a page. Simple_html_dom is used here. You need to download and include the simple_html_dom.php class, instantiate an object, and call its methods. For the available methods, check the official website or other documentation.

First inspect the source of the novel page and find the elements that hold the chapter title and the chapter content.

The title is in the h1 under the element with the class bookname.

The content is under the div with the id content.

simple_html_dom provides a find method that takes jQuery-like selectors to locate elements. For example:

find('.bookname h1'); // find the h1 title element under the class bookname

find('#content'); // find the chapter content in the element with the id content
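
For readers who want to try the same two lookups without downloading anything, here is a minimal sketch that swaps in PHP's built-in DOMDocument and DOMXPath instead of simple_html_dom. It is not the method this article uses. The sample markup is invented to mirror the structure described above (a bookname class containing an h1, and a div with the id content), and it assumes the dom extension is enabled.

```php
<?php
// Made-up sample standing in for a downloaded chapter page.
$sample = '<html><body>'
    . '<div class="bookname"><h1>Chapter 1</h1></div>'
    . '<div id="content">First paragraph.<p>ad text</p></div>'
    . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; silence warnings
$doc->loadHTML($sample);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// XPath equivalent of the selector '.bookname h1'
$titleNode = $xpath->query(
    '//*[contains(concat(" ", normalize-space(@class), " "), " bookname ")]//h1'
)->item(0);

// XPath equivalent of the selector '#content'
$contentNode = $xpath->query('//*[@id="content"]')->item(0);

$title   = $titleNode !== null ? trim($titleNode->textContent) : '';
$content = $contentNode !== null ? $doc->saveHTML($contentNode) : '';
```

Note that saveHTML($contentNode) returns the element including its own div tag, whereas innertext in simple_html_dom returns only what is inside it.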

Add the following after the curl code from section 1:

include "simple_html_dom.php";
$html = new simple_html_dom();
@$html->load($res);
$artic = ['title' => '', 'content' => ''];
$content = '';
// Find the chapter title
$h1 = $html->find('.bookname h1');
foreach ($h1 as $k=>$v) {
	$artic['title'] = $v->innertext;
}
// Find the chapter content
$divs = $html->find('#content');
foreach ($divs as $k=>$v) {
	$content = $v->innertext;
}
// Strip the unwanted parts (ad blocks) with a regular expression
$pattern = "/(<p>.*?<\/p>)|(<div .*?>.*?<\/div>)/";
$artic['content'] = preg_replace($pattern,'',$content);
echo $artic['title'].'<br>';
echo $artic['content'];

The find method returns an array, so foreach is used to read its elements. A regular-expression replacement removes the text advertisements embedded in the chapter, and the title and chapter content are stored in the $artic array. That completes the simplest version.
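
To see what the regular expression actually strips, here is a small sketch that runs it on an invented string. The sample text is an assumption, and the s modifier is an addition not present in the original pattern: it lets the dot match newlines, in case an ad block spans several lines.

```php
<?php
// Made-up chapter content with two embedded ad blocks.
$content = 'Line one of the story.<br><p>Visit ad-site for more!</p>'
    . 'Line two.<div class="ad"><a href="#">Click here</a></div>Line three.';

// Same alternation as in the article: drop <p>...</p> and <div ...>...</div>.
// The trailing "s" modifier is added here so "." also matches newlines.
$pattern = '/(<p>.*?<\/p>)|(<div .*?>.*?<\/div>)/s';

$clean = preg_replace($pattern, '', $content);

echo $clean; // Line one of the story.<br>Line two.Line three.
```

Because the match is lazy, a div that contains another div is only removed up to the first closing tag, so nested ad markup would need a different approach.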

Of course, code written this way is awkward to work with, so you can wrap it in functions and a class. The following is an example I wrote myself. It certainly has shortcomings, but it can serve as a starting point to extend.

<?php 
include "simple_html_dom.php";
include "mySpClass.php";
header("Content-Type:text/html;charset=utf-8");
$get_html = get_html($_GET['n']);
$artic = getContent($get_html);
echo $artic['title'].'<br>';
echo $artic['content'];
/**
* Fetch the HTML of one chapter page from www.7kzw.com
* @param int $num chapter number, starting from 1
* @return string the page HTML
*/
function get_html($num){
	$start = 27248636; // page id of the first chapter
	$real_num = $num+$start-1;
	$url = 'https://www.7kzw.com/85/85445/'.$real_num.'.html';
	$header = [
	'User-Agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0'
	]; 
	return mySpClass()->getCurl($url,$header);
}
/**
* Extract the chapter title and content from a www.7kzw.com page
* @param string $get_html the HTML of one chapter page
* @return array $artic, ['title'=>'','content'=>'']
*/
function getContent($get_html){
	$artic = ['title' => '', 'content' => ''];
	$content = '';
	$html = new simple_html_dom();
	@$html->load($get_html);
	// Find the chapter title
	$h1 = $html->find('.bookname h1');
	foreach ($h1 as $k=>$v) {
		$artic['title'] = $v->innertext;
	}
	// Find the chapter content
	$divs = $html->find('#content');
	foreach ($divs as $k=>$v) {
		$content = $v->innertext;
	}
	// Strip the unwanted parts (ad blocks) with a regular expression
	$pattern = "/(<p>.*?<\/p>)|(<div .*?>.*?<\/div>)/";
	$artic['content'] = preg_replace($pattern,'',$content);
	return $artic;
}
?>
The mySpClass.php file included above:

<?php
class mySpClass{
	// The singleton instance
    private static $ins = null;
    /**
     * Return the singleton instance, creating it on first use
     */
    public static function exec()
    {
        if (self::$ins) {
            return self::$ins;
        }
        return self::$ins = new self();
    }
    
    /**
     * Forbid cloning the object
     */
    public function __clone()
    {
        throw new Exception('Error: this object cannot be cloned');
    }
	// Send the simplest possible GET request to the server
	public static function getCurl($url,$header){
		// 1. Initialize
		$ch = curl_init($url);   // the address to request
		// 2. Set options
		curl_setopt($ch,CURLOPT_RETURNTRANSFER,1); // return the response as a string instead of printing it (required)
		curl_setopt($ch,CURLOPT_TIMEOUT,10); // timeout in seconds (required)
		curl_setopt($ch, CURLOPT_HEADER,0); // 1 includes the response headers in the output, 0 omits them
		curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false); // do not verify the certificate
		curl_setopt($ch,CURLOPT_SSL_VERIFYHOST,false); // do not verify the host name in the certificate
		if(!empty($header)){
			curl_setopt($ch,CURLOPT_HTTPHEADER,$header); // set the request headers
		}
		// 3. Execute
		$res = curl_exec($ch);
		// 4. Close
		curl_close($ch);
		return $res;
	}
}
// Define the mySpClass() helper function if it does not exist yet
if (!function_exists('mySpClass')) {
    function mySpClass() {
        return mySpClass::exec();
    }
}
?>

To run the final example, pass the chapter number through $_GET['n'], for example by appending ?n=1 to the script URL.
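
Since $_GET['n'] comes straight from the query string, it may be worth validating it before building the URL. The sketch below shows one way to do that. The function name chapter_url and the two constants are hypothetical names for this example. Like get_html above, it assumes that chapter page ids are consecutive and start at 27248636.

```php
<?php
// Values taken from the article's get_html(); the constant names are made up.
const FIRST_CHAPTER_ID = 27248636;
const BOOK_BASE_URL = 'https://www.7kzw.com/85/85445/';

/**
 * Hypothetical helper: map a chapter number (1-based) to its page URL.
 * Returns null when the input is not a positive integer.
 */
function chapter_url($n): ?string
{
    $num = filter_var($n, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    if ($num === false) {
        return null;
    }
    // Same arithmetic as get_html(): chapter 1 maps to the first page id.
    return BOOK_BASE_URL . (FIRST_CHAPTER_ID + $num - 1) . '.html';
}

echo chapter_url('3'); // https://www.7kzw.com/85/85445/27248638.html
```

In the script above, the result of chapter_url($_GET['n'] ?? null) could be checked for null before any request is sent.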

Summary:

Knowledge points: curl (tip: with the curl module, PHP can fetch any web page), regular expressions, and the simple_html_dom parsing tool.

Although the code has been improved somewhat, it works best when deployed on your own server. Otherwise you can only read on a computer, which is not very convenient, and you might then prefer to put up with the advertisements.

The above are the details of using php curl to collect pages and using simple_html_dom to parse them. For more information, please pay attention to other related articles on the php Chinese website!
