This article explains, with an example, how to parse web page content for display in a WeChat mini program. It covers a problem that came up while crawling complex web pages and the workaround used to solve it.
Parsing web page content for a WeChat mini program
I have been writing a crawler recently and need to parse web pages so that a WeChat mini program can display them. Text and images are easy to parse, and the mini program has corresponding text and image tags to present them. More complex content, such as tables, is harder. Both parsing on the server and rendering in the mini program take a lot of effort, and it is difficult to cover every case. So I decided that converting the HTML code for a table into an image would be a workable alternative.
Here we use the node-webshot module, a thin wrapper around PhantomJS that makes it easy to save web pages as screenshots.
First install Node.js and PhantomJS, then create a new JavaScript file and load the node-webshot module (published on npm as webshot):
const webshot = require('webshot');
Define options:
const options = {
  // Browser window size
  screenSize: { width: 755, height: 25 },
  // Region of the page document to capture
  shotSize: { height: 'all' },
  // Type of input: an HTML string
  siteType: 'html'
};
Here, the width of the browser window should be chosen to suit the web page, while its height can be set to a very small value. The height of the captured document region must be set to 'all', and its width defaults to the window width. This way the whole table is captured at the smallest possible size.
Next, define the html string:
let html = "target rich text html code, eg: <table>...</table>";
Note that, because the HTML is written here as a double-quoted string literal, its line breaks must be removed and its double quotes replaced with single quotes.
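The clean-up described above can be done in code rather than by hand. The following is a minimal sketch, not part of node-webshot: the helper name flattenHtml and the sample table are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of node-webshot: removes line breaks and
// swaps double quotes for single quotes in an HTML fragment.
function flattenHtml(source) {
  return source
    .replace(/\r?\n/g, '')  // drop Unix and Windows line breaks
    .replace(/"/g, "'");    // double quotes -> single quotes
}

// Sample input written as a template literal, so it may span several lines.
const raw = `<table class="grid">
  <tr><td>1</td></tr>
</table>`;

const flattened = flattenHtml(raw);
console.log(flattened);
```

This blanket replacement assumes the HTML contains no attribute values that themselves include single quotes. If it does, those values would need proper escaping instead.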
Finally, take the screenshot:
webshot(html, 'demo.png', options, (err) => {
  if (err) console.log(`Webshot error: ${err.message}`);
});
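If the screenshot is one step in a longer crawler pipeline, the callback-style call above may be easier to chain when wrapped in a Promise. The sketch below uses Node's built-in util.promisify. The helper name renderToFile is hypothetical, and a stand-in function replaces webshot so that the example runs without PhantomJS; in real use you would pass require('webshot') instead.

```javascript
const { promisify } = require('util');

// Hypothetical helper. `shoot` is assumed to be a function with the same
// (input, outputPath, options, callback) shape as the webshot call above.
function renderToFile(shoot, html, outputPath, options) {
  return promisify(shoot)(html, outputPath, options);
}

// Stand-in for webshot so the sketch runs on its own; it only reports
// success and does not create any file.
const fakeShoot = (html, outputPath, options, callback) => callback(null);

renderToFile(fakeShoot, '<table>...</table>', 'demo.png', { siteType: 'html' })
  .then(() => console.log('Screenshot step finished'))
  .catch((err) => console.log(`Webshot error: ${err.message}`));
```

Passing the screenshot function as an argument is only a convenience for trying the wrapper in isolation. Calling promisify on webshot directly would work the same way.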
In this way, the HTML code is converted into a local image, which can then be uploaded to Qiniu Cloud or a similar storage service. With the table delivered as an image, neither server-side parsing nor rendering in the mini program poses any difficulty.