Translator's note: This project is more than two years old. It lets you use jQuery selectors in Node.js to work with HTML/XML on the back end the same way you work with the DOM on the front end. With the browser-compatibility code removed, operations are about 8 times faster than JSDOM. We have mentioned before that JSDOM has serious performance issues: Debug Node.JS: How do we locate memory leaks and infinite loops
cheerio
A fast, flexible implementation of core jQuery for the server side.
Introduction
Test your server-side HTML:
Install
npm install cheerio
Features
❤ Familiar syntax: Cheerio implements a subset of core jQuery. It strips all the DOM inconsistencies and browser-compatibility code out of the jQuery library, revealing its truly gorgeous API.
ϟ Extremely fast: Cheerio uses a very simple, consistent DOM model. This results in incredible performance gains for parsing, manipulating and rendering. Preliminary end-to-end benchmarks show that Cheerio is approximately 8 times faster than JSDOM.
❁ Incredibly flexible: Cheerio is compatible with the htmlparser2 API and can parse almost any HTML or XML document.
What about JSDOM?
I wrote Cheerio because I had become increasingly frustrated with JSDOM. For me, three major problems kept coming up again and again:
• JSDOM's built-in parser is too strict: The HTML parser bundled with JSDOM currently cannot handle many popular websites.
• JSDOM is too slow: When parsing large websites, JSDOM shows noticeable delays.
• JSDOM feels too heavy: The goal of JSDOM is to provide a DOM environment identical to the one we see in the browser (translator's note: including the ability to execute JavaScript). I have never really needed any of that; I just want a simple, familiar way to manipulate HTML.
When to use JSDOM
Cheerio cannot solve all your problems. If I need to work in a browser-like environment, I'll still use JSDOM, especially if I want to do automated functional testing on the server.
API
Sample HTML code we will use:
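A small fruit list works well as sample markup for the calls in this section. The snippet below is an assumed stand-in: the `fruitsHtml` name and the markup itself are illustrative, held in a JavaScript string so they can be passed straight to Cheerio.

```javascript
// Assumed sample markup: a short list with one class per item.
const fruitsHtml = [
  '<ul id="fruits">',
  '  <li class="apple">Apple</li>',
  '  <li class="orange">Orange</li>',
  '  <li class="pear">Pear</li>',
  '</ul>',
].join('\n');

console.log(fruitsHtml);
```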
Loading
First, you need to load the HTML. jQuery does this step implicitly because it always operates on the live DOM that the browser provides. With Cheerio, we have to pass the HTML document in ourselves.
This is the preferred method:
Alternatively, you can pass in HTML as a string parameter:
Or pass it in as the root node:
You can also override the default parsing options by passing an extra argument to .load():
These parsing options come directly from htmlparser2, so any option that htmlparser2 accepts is also valid in Cheerio. The default options are:
Selectors
Cheerio's selectors are almost identical to jQuery's, so the API is very similar.
$( selector, [context], [root] )
The selector is searched within the context scope, which in turn is searched within the root scope; both context and root are optional. The selector and the context can each be a string expression, a DOM element, or an array of DOM elements. The root is typically the HTML document.
This selector method is the starting point for traversing and manipulating the document. As in jQuery, it is the primary way to select elements from a document, but unlike jQuery it is built on the CSSSelect library, which implements most of the Sizzle selectors.
Attributes
Methods for getting and modifying attributes.
.attr( name, value )
Gets or sets attributes. Getting returns the attribute value of only the first matched element. Setting an attribute's value to null removes the attribute. As in jQuery, you can also pass in a map or a function.
.data( name, value )
Gets or sets data attributes. Both getting and setting apply only to the first matched element.
.val( [value] )
Gets or sets the value of input, select, and textarea elements. Note: passing a map is supported; passing a function has not been added yet.
For more APIs, please visit the official website