Mini program development and parsing web content

巴扎黑
Release: 2017-08-23 16:10:40


WeChat Mini Program: a detailed look at parsing web page content

I have been writing a crawler recently and need to parse web pages for use in a WeChat Mini Program. Text and images are straightforward to parse, and the mini program has corresponding text and image tags to display them. More complex structures, such as tables, are harder: both server-side parsing and mini program rendering take a lot of effort, and it is difficult to cover every case. So a workaround came to mind: convert the HTML code of the table into an image.

Here we use the node-webshot module, a light wrapper around PhantomJS that makes it easy to save web pages as screenshots.

First install Node.js and PhantomJS, then create a new js file and load the node-webshot module:

const webshot = require('webshot');

Define options:

const options = {
  // Browser window
  screenSize: {
    width: 755,
    height: 25
  },
  // Page document area to capture
  shotSize: {
    height: 'all'
  },
  // Page type
  siteType: 'html'
};

Here, the width of the browser window should be set to suit the web page, while the height can be set to a very small value. The height of the page document area must be set to 'all', and its width defaults to the window width. This way the whole table is captured at the smallest possible size.

Next, define the html string:

let html = "target rich text html code, eg: <table>...</table>";

Note that newlines must be removed from the HTML code inside the string, and double quotes must be replaced with single quotes.
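The two rules above can be applied with a small helper instead of by hand. This is only a sketch: toInlineHtml is a hypothetical name, not part of node-webshot, and it does nothing beyond dropping newlines and swapping the quote characters.

```javascript
// Hypothetical helper (not part of node-webshot): applies the two rules
// from the text to an HTML fragment.
function toInlineHtml(fragment) {
  return fragment
    .replace(/\r?\n/g, '')  // remove newlines (both \n and \r\n)
    .replace(/"/g, "'");    // replace double quotes with single quotes
}

const raw = '<table class="demo">\n<tr><td>1</td></tr>\n</table>';
console.log(toInlineHtml(raw));
// prints: <table class='demo'><tr><td>1</td></tr></table>
```

The result can then be assigned to the html variable in place of a hand-edited string.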

Finally, screenshot:

webshot(html, 'demo.png', options, (err) => {
  if (err)
    console.log(`Webshot error: ${err.message}`);
});
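If the screenshot is one step in a longer crawler pipeline, the callback call can be wrapped in a Promise. The sketch below assumes the same four-argument call used in this article. screenshotHtml is a hypothetical name, and the webshot function is passed in as a parameter so the helper has no dependencies of its own.

```javascript
// Hypothetical wrapper: `shoot` is expected to be the function returned by
// require('webshot'); it is injected as an argument rather than required here.
function screenshotHtml(shoot, html, outputPath, options) {
  return new Promise((resolve, reject) => {
    shoot(html, outputPath, options, (err) => {
      if (err) reject(err);      // pass the webshot error on to the caller
      else resolve(outputPath);  // resolve with the path of the saved image
    });
  });
}

// Usage (assumes node-webshot and PhantomJS are installed):
// const webshot = require('webshot');
// screenshotHtml(webshot, html, 'demo.png', options)
//   .then((file) => console.log(`Saved ${file}`))
//   .catch((err) => console.log(`Webshot error: ${err.message}`));
```

Resolving with the output path lets a later step, such as an upload, be chained on directly.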

In this way, the HTML code is converted into a local image, which can then be uploaded to Qiniu Cloud or a similar service. With the table delivered as an image, neither server-side parsing nor mini program presentation is difficult any more.
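On the server side, the crawler still has to locate the table markup in each page before it can be screenshotted. Here is a minimal sketch of that step, assuming tables are never nested. extractTables is a hypothetical helper, and a regular expression stands in for a real HTML parser, which would be more robust.

```javascript
// Hypothetical helper: collects every <table>...</table> fragment in a page.
// Assumes tables are not nested; nested tables would be cut at the first
// closing tag, so a real HTML parser is the safer choice for messy pages.
function extractTables(pageHtml) {
  return pageHtml.match(/<table\b[\s\S]*?<\/table>/gi) || [];
}

const page =
  '<p>intro</p><table><tr><td>a</td></tr></table>' +
  '<p>more text</p><TABLE border=1></TABLE>';
const tables = extractTables(page);
console.log(tables.length); // prints: 2
```

Each fragment in the returned array can then be passed to webshot as the html argument, one image per table.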

source:php.cn