Analysis of Sizzle, the selector engine in jQuery


不言

Jul 14, 2018 am 09:30 AM

javascript jquery Selector

This article analyzes Sizzle, the selector engine used by jQuery.

The analysis is based on reading the Sizzle source code, version 2.3.3.

Element query methods natively supported by browsers:

Method name	Method description	Compatibility description
getElementsByClassName	Query elements based on their class	IE9, Firefox 3, Chrome 4, Safari 3.1
getElementsByName	Query elements based on the element name attribute	IE10 (not supported or incomplete below IE10), Firefox 23, Chrome 29, Safari 6
querySelector	Query element based on selector	IE9 (IE8 partially supported), Firefox 3.5, Chrome 4, Safari 3.1
querySelectorAll	Query elements based on selector	IE9 (IE8 partially supported), Firefox 3.5, Chrome 4, Safari 3.1

For performance reasons, Sizzle prefers the browser's native methods whenever it can. Of the native methods in the table, all are used in Sizzle except querySelector.

For cases where the results cannot be obtained directly using native methods, Sizzle needs to perform lexical analysis, decompose the complex CSS selector, and then query and filter item by item to obtain the elements that finally meet the query conditions.

Several techniques speed up this lower-level query:

Right to left: a traditional selector engine works from left to right. For the selector #box .cls a, it first finds the element with id=box, then searches that element's descendants for elements with the class cls, and then searches each of those for a elements. When one search finishes, it goes back up a level and continues with the next .cls element, and so on until done. The drawback is that many elements that cannot match are traversed along the way. In right-to-left order, Sizzle first finds all a elements and then uses the rest of the selector, #box .cls, to keep only the a elements that satisfy it. This limits the scope of the query, so it is usually faster. Note that right to left does not suit every selector and is not always faster than left to right, but it covers the vast majority of queries.
Limited seed set: if there is only one selector group, that is, no comma-separated alternatives, the last node of the selector is looked up first and the filtering then works on that set.
Limited query scope: if the leading part of the selector is just an ID with no other restriction, as in #box a, the query scope is narrowed to that element.
Cached data: there are three main caches: tokenCache, compilerCache and classCache.
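The right-to-left idea can be sketched on a mock tree. This is only an illustration, not Sizzle's code: the `node` and `matchesRightToLeft` helpers and the plain-object tree are hypothetical stand-ins for DOM elements, and the sketch handles descendant combinators only.

```javascript
// Hypothetical stand-in for a DOM element: a plain object with a parent link.
function node(tag, attrs, parent) {
    return { tag: tag, id: attrs.id || "", cls: attrs.cls || "", parent: parent || null };
}

var root = node("body", {});
var box  = node("div", { id: "box" }, root);
var item = node("p", { cls: "cls" }, box);
var a1   = node("a", {}, item);   // inside #box .cls
var a2   = node("a", {}, root);   // outside #box

// The three parts of "#box .cls a", listed left to right.
var steps = [
    function (el) { return el.id === "box"; },
    function (el) { return el.cls === "cls"; },
    function (el) { return el.tag === "a"; }
];

// Right to left: test the candidate against the last step, then walk up
// its ancestors looking for an element that satisfies each earlier step.
function matchesRightToLeft(el, steps) {
    var i = steps.length - 1;
    if (!steps[i](el)) { return false; }
    var cur = el.parent;
    for (i = i - 1; i >= 0; i--) {
        while (cur && !steps[i](cur)) { cur = cur.parent; }
        if (!cur) { return false; }
        cur = cur.parent;
    }
    return true;
}

// Seed set: every <a>, as a native tag lookup would return it.
var seed = [a1, a2];
var result = seed.filter(function (el) { return matchesRightToLeft(el, steps); });
```

Only the two a elements are ever examined; the rest of the tree is touched only along their ancestor chains.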

Sizzle queries fall into two categories:

Simple process (no positional pseudo-class)
Query with a positional pseudo-class

Simple process

In the simple process, the query follows the right-to-left order.

Overview of the simple process

[Figure: Sizzle flow chart (simplified version)]

What the simple process leaves out is mainly the logic for positional pseudo-classes such as :first and :eq().

Lexical analysis

Lexical analysis parses the selector string into a series of tokens.

First, the concept of a token. A token can be regarded as the smallest unit, one that cannot be split further. In CSS selectors, tokens generally take the form TAG, ID, CLASS, ATTR and so on. Lexical analysis turns a complex CSS selector into a series of tokens, and the final query and filtering are based on these tokens.

The following example illustrates lexical analysis by parsing the string #box .cls a:

/**
 * Core of Sizzle's lexical analysis method tokenize, lines 1670 ~ 1681
 * soFar = '#box .cls a'
 * Expr.filter is the collection of Sizzle's element filtering methods
 * Object.getOwnPropertyNames(Expr.filter) //  ["TAG", "CLASS", "ATTR", "CHILD", "PSEUDO", "ID"]
*/
for ( type in Expr.filter ) {
    // Try each filter type against the current selector string soFar; on a match, take the matched part and store it as a token
    // matchExpr holds a set of regular expressions that test whether the current selector string fits a given token syntax
    if ( (match = matchExpr[ type ].exec( soFar )) && (!preFilters[ type ] ||
        (match = preFilters[ type ]( match ))) ) {
        matched = match.shift();
        tokens.push({
            value: matched,
            type: type,
            matches: match
        });

        // Cut off the matched part and keep matching the rest (the surrounding while(soFar) loop drives this)
        // Every regex in matchExpr starts with the anchor "^", so it tests whether the string starts with a given pattern; in other words, lexical analysis peels tokens off one at a time from the start of the string, left to right
        soFar = soFar.slice( matched.length );
    }
}


After the parsing above, #box .cls a becomes an array of tokens.

[Figure: Sizzle tokens]
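To make the shape of that array concrete, here is a much-reduced tokenizer. It is a sketch under stated assumptions: the three regular expressions are simplified stand-ins for Sizzle's matchExpr entries, only ID, CLASS and TAG are covered, and preFilters are left out.

```javascript
// Simplified patterns; Sizzle's real matchExpr regexes accept far more input.
var matchExpr = {
    ID: /^#([\w-]+)/,
    CLASS: /^\.([\w-]+)/,
    TAG: /^([\w-]+|\*)/
};
var rcombinators = /^\s*([>+~]|\s)\s*/;

function tokenize(selector) {
    var soFar = selector, tokens = [], match, matched, type;
    while (soFar) {
        matched = false;

        // Combinators (" ", ">", "+", "~") become tokens of their own.
        if ((match = rcombinators.exec(soFar))) {
            matched = match.shift();
            tokens.push({ value: matched, type: match[0].replace(/\s+/, " ") });
            soFar = soFar.slice(matched.length);
        }

        // Each filter type peels one token off the front of the string.
        for (type in matchExpr) {
            if ((match = matchExpr[type].exec(soFar))) {
                matched = match.shift();
                tokens.push({ value: matched, type: type, matches: match });
                soFar = soFar.slice(matched.length);
            }
        }

        // Nothing matched at this position: the selector is not understood.
        if (!matched) { throw new Error("Syntax error: " + soFar); }
    }
    return tokens;
}

var tokens = tokenize("#box .cls a");
// tokens holds five entries, in order: ID, " ", CLASS, " ", TAG
```

Each token keeps the matched text in value, its kind in type, and the captured groups in matches, which is the same layout the demo later in this article uses.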

Compile function

Compilation itself is simple. First, the matcher cache is searched for a matcher that corresponds to the selector.

If the same selector has been queried before and its entry is still in the cache (the cache size is limited; once the limit is exceeded, the oldest entry is deleted), the cached matcher is returned directly.

If nothing is found in the cache, the final matcher is generated through the matcherFromTokens() and matcherFromGroupMatchers() methods and then cached.
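The lookup-then-build behaviour with a bounded cache can be sketched as follows. The `limit` parameter, the `compileCount` counter and the dummy matcher are assumptions made for the demonstration; the real compile() builds its matcher from tokens rather than returning a placeholder.

```javascript
// A size-limited cache: the function object itself stores the entries.
function createCache(limit) {
    var keys = [];
    function cache(key, value) {
        // The trailing space avoids collisions with built-in property names.
        if (keys.push(key + " ") > limit) {
            delete cache[keys.shift()];   // evict the oldest entry
        }
        return (cache[key + " "] = value);
    }
    return cache;
}

var compilerCache = createCache(2);
var compileCount = 0;

// Hypothetical stand-in for compile(): look up first, build only on a miss.
function compile(selector) {
    var cached = compilerCache[selector + " "];
    if (!cached) {
        compileCount++;
        cached = compilerCache(selector, function matcher() { return selector; });
    }
    return cached;
}

compile("#box a");
compile("#box a");   // served from the cache
compile(".cls");
compile("p");        // third distinct selector evicts "#box a"
compile("#box a");   // has to be compiled again
```

With a limit of two, the second call is a cache hit, while the last call is a miss because the entry was evicted in between.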

Generate matchers based on tokens (matcherFromTokens)

This step generates matchers from the tokens produced by lexical analysis.

In Sizzle, the corresponding method is matcherFromTokens.

A word of warning: this method is hard to read.

In the Sizzle source (the sizzle.js file) it spans lines 1705 ~ 1765, only about 60 lines, but it is packed with factory methods (here meaning methods whose return value is a function). Below is a simplified version of the method, with the handling of pseudo-class selectors removed.

function matcherFromTokens( tokens ) {
    var checkContext, matcher, j,
        len = tokens.length,
        leadingRelative = Expr.relative[ tokens[0].type ],
        implicitRelative = leadingRelative || Expr.relative[" "],
        i = leadingRelative ? 1 : 0,

        // The foundational matcher ensures that elements are reachable from top-level context(s)
        matchContext = addCombinator( function( elem ) {
            return elem === checkContext;
        }, implicitRelative, true ),
        matchAnyContext = addCombinator( function( elem ) {
            return indexOf( checkContext, elem ) > -1;
        }, implicitRelative, true ),
        matchers = [ function( elem, context, xml ) {
            var ret = ( !leadingRelative && ( xml || context !== outermostContext ) ) || (
                (checkContext = context).nodeType ?
                    matchContext( elem, context, xml ) :
                    matchAnyContext( elem, context, xml ) );
            // Avoid hanging onto element (issue #299)
            checkContext = null;
            return ret;
        } ];
        
    // Everything above is variable declaration

    // This for loop generates the matchers from the tokens
    for ( ; i < len; i++ ) {

        // On an ancestor/sibling combinator ('>', ' ', '+', '~'), the matchers built so far are merged
        if ( (matcher = Expr.relative[ tokens[i].type ]) ) {
            matchers = [ addCombinator(elementMatcher( matchers ), matcher) ];
        } else {
            matcher = Expr.filter[ tokens[i].type ].apply( null, tokens[i].matches );
            matchers.push( matcher );
        }
    }

    // Combine all the matchers and return a single matcher
    // Every matcher returns a boolean; if any condition fails, the current element does not match and is excluded
    return elementMatcher( matchers );
}


Question: Why do the previous matchers need to be merged when an ancestor/sibling combinator ('>', ' ', '+', '~') is encountered?

Answer: The purpose is not the merge itself, but to find the nodes related to the current node (those in the ancestor/sibling relationship '>', ' ', '+', '~') and then verify them with the previous matchers. Merging is not strictly required for that verification step, but the merged structure is clearer. For example:

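Suppose we want to buy a car and there are two brands, A and B. Brand A has four models: a1, a2, a3, a4; brand B has two: b1, b2. The cars we can buy can be written as
[a1,a2,a3,a4,b1,b2], but also as {A:[a1,a2,a3,a4],B:[b1,b2]}. Both forms list the models we can buy; the second simply shows more clearly which brand each model belongs to.

In the same way, after merging we know that the merged matcher exists to verify the nodes related to the current node.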
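A reduced sketch of what a combinator wrapper does may help here. The `dir` and `first` fields mirror the entries the article refers to as Expr.relative; the mock elements carry only a parentNode link, and the node type checks and per-element caching of the real addCombinator are left out, so treat this as an illustration rather than Sizzle's implementation.

```javascript
// Hypothetical mock elements: only parentNode and the fields being tested.
var box  = { id: "box", className: "", parentNode: null };
var wrap = { id: "", className: "cls", parentNode: box };
var link = { id: "", className: "", parentNode: wrap };

// Direction to walk for each combinator; "first" stops after one step.
var relative = {
    " ": { dir: "parentNode" },
    ">": { dir: "parentNode", first: true }
};

// Wrap a matcher so that it is applied to related nodes instead of elem itself.
function addCombinator(matcher, combinator) {
    var dir = combinator.dir, first = combinator.first;
    return function (elem) {
        while ((elem = elem[dir])) {
            if (matcher(elem)) { return true; }
            if (first) { return false; }
        }
        return false;
    };
}

var isBox = function (elem) { return elem.id === "box"; };
var insideBox  = addCombinator(isBox, relative[" "]);   // "#box <descendant>"
var childOfBox = addCombinator(isBox, relative[">"]);   // "#box > <child>"
```

Here `insideBox(link)` succeeds because the walk continues up to `box`, while `childOfBox(link)` fails because only the direct parent `wrap` is checked.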

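Generating the final matcher (matcherFromGroupMatchers)

This method mainly returns an anonymous function. Inside it, the matchers generated by matcherFromTokens are used to check the seed set (seed) and filter out the elements that satisfy the selector.
The seed set is determined first, and then each seed is run against the matchers. During matching, the tokens are checked one by one from right to left; as soon as one check fails, matching for the current seed stops and the process moves on to the next seed.

This process filters out the DOM nodes that satisfy the selector and returns them to the select method.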
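The seed-filtering loop described above can be sketched as follows. The three-argument signature is a simplification chosen for this example (Sizzle's real function takes more arguments and also handles set matchers), and the seed objects are hypothetical mock elements.

```javascript
// Simplified sketch of the function returned by matcherFromGroupMatchers.
function matcherFromGroupMatchers(elementMatchers) {
    return function superMatcher(seed, context, results) {
        var i, j, elem;
        for (i = 0; i < seed.length; i++) {
            elem = seed[i];
            // One matcher per comma-separated group; any match is enough.
            for (j = 0; j < elementMatchers.length; j++) {
                if (elementMatchers[j](elem, context)) {
                    results.push(elem);
                    break;
                }
            }
        }
        return results;
    };
}

// Hypothetical mock elements standing in for the <input> seed set.
var seed = [
    { name: "user", type: "text" },
    { name: "pass", type: "password" },
    { name: "mail", type: "email" }
];

// A single group, 'input[type="text"]', whose TAG token was used to build the seed.
var superMatcher = matcherFromGroupMatchers([
    function (elem) { return elem.type === "text"; }
]);

var matched = superMatcher(seed, null, []).map(function (elem) { return elem.name; });
```

Every seed is offered to the matchers in turn, and only the seeds that pass end up in the results array.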

Query process demo

A typical query illustrates Sizzle's query process.

Take p.cls input[type="text"] as an example:

The parsed tokens:

[
    [
        { "value": "p", "type": "TAG", "matches": ["p"] }, 
        { "value": ".cls", "type": "CLASS", "matches": ["cls"] }, 
        { "value": " ", "type": " " }, 
        { "value": "input", "type": "TAG", "matches": ["input"] }, 
        { "value": "[type=\"text\"]", "type": "ATTR", "matches": ["type", "=", "text"]}
    ]
]


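First, the selector collects every <input> as the seed set (seed), and then looks for matching nodes within that set.
While the seeds are being collected, the fourth token, { "value": "input", "type": "TAG", "matches": ["input"] }, is removed from the token list.

Matchers are then generated from the remaining tokens:

matcherByTag('p')
matcherByClass('.cls')

On reaching the descendant combinator ' ', the two matchers generated so far are merged into a new

matcher:

matcherByTag('p'),
matcherByClass('.cls')

This matcher is an anonymous function produced by addCombinator(). Following the parentNode relation, it walks up from the current seed through its ancestors and checks whether one of them satisfies the two matchers above.

On reaching the fourth remaining token, the attribute selector, this is generated:

matcherByAttr('[type="text"]')

At this point all matchers have been generated from the tokens.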

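Final matcher

matcher:

matcherByTag('p')
matcherByClass('.cls')

matcherByAttr('[type="text"]')

The last line of matcherFromTokens() performs one more step: it merges all the matchers into a single matcher through elementMatcher().
elementMatcher runs every matcher in a while loop and breaks out of the loop as soon as one of them fails.
One thing to note: the while loop in elementMatcher runs in reverse, starting from the last matcher in matchers. In the example above, the first matcher to run is matcherByAttr('[type="text"]'), which filters out every <input> element that does not satisfy type="text". The next matching condition is then applied.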
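The reverse order is easy to observe with a small trace. The elementMatcher below follows the description above; the `traced` wrapper, the `calls` log and the mock elements are hypothetical additions used only to record which matcher runs when.

```javascript
// Merge several matchers into one; the last matcher in the list runs first.
function elementMatcher(matchers) {
    return matchers.length > 1 ?
        function (elem, context, xml) {
            var i = matchers.length;
            while (i--) {
                if (!matchers[i](elem, context, xml)) { return false; }
            }
            return true;
        } :
        matchers[0];
}

// Hypothetical helper: record the name of each matcher as it is called.
var calls = [];
function traced(name, test) {
    return function (elem) {
        calls.push(name);
        return test(elem);
    };
}

// Mock elements for p.cls input[type="text"].
var p = { tag: "p", className: "cls", parentNode: null };
var textInput = { tag: "input", type: "text", parentNode: p };
var passInput = { tag: "input", type: "password", parentNode: p };

var matcher = elementMatcher([
    traced("ancestor p.cls", function (elem) {
        for (var n = elem.parentNode; n; n = n.parentNode) {
            if (n.tag === "p" && n.className === "cls") { return true; }
        }
        return false;
    }),
    traced("[type=text]", function (elem) { return elem.type === "text"; })
]);

var okText = matcher(textInput);
var callsForText = calls.slice();   // attribute check first, then the ancestor check
calls.length = 0;
var okPass = matcher(passInput);
var callsForPass = calls.slice();   // stops after the failed attribute check
```

For the password input the ancestor walk never runs, which is the point of checking the cheapest, rightmost condition first.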

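Question: Sizzle uses a lot of closures. What are they for, and what is the reasoning behind them?
Answer: The closures are used to generate matchers dynamically from the selector and to cache them. Because they are closures, the matchers stay in memory, which is what makes the caching mechanism possible.
The main goal is query performance: matchers kept in memory avoid spending resources again on lexical analysis and matcher generation. Space is traded for time to speed up queries.

Question: In matcherFromTokens, why does the matcher list built for each token sequence start with an initial matcher?
Answer: The initial matcher checks whether the element belongs to the current context.

Question: What does matcherFromGroupMatchers do?
Answer: It returns the final matcher and lets the compile function cache it. The final matcher compares the collected seed elements against the matchers and filters out the elements that satisfy the selector.

TODO: Is the compile mechanism perhaps a choice Sizzle made so that it could cache and improve performance?
Yes; a detailed answer is still to be added.

TODO: The role of outermostContext
A detail that still needs to be studied.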

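Query flow with positional pseudo-classes

A query with a positional pseudo-class runs from left to right.

Take the selector .mark li.limark:first.limark2 a span as an example.

Everything before the matchers are generated from the tokens (matcherFromTokens) is identical to the simple process.
The difference lies in matcherFromTokens(). Unlike the simple process, a selector with a positional pseudo-class is split into several parts around that pseudo-class. For the example above:

.mark li.limark : the selector before the positional pseudo-class;
:first : the positional pseudo-class itself;
.limark2 : the selector attached to the positional pseudo-class;
a span : the selector after the positional pseudo-class;

The idea is to first run the query before the positional pseudo-class, .mark li.limark, which of course uses the simple process described earlier (Sizzle(selector)). The result is then filtered by the positional pseudo-class, keeping the nodes that satisfy it. If a third part exists, it is used for one more round of filtering. The nodes that remain are then used as the context for querying the selector after the positional pseudo-class, a span.

The selector above contains only one positional pseudo-class. If there are several, they form levels from left to right, and the query proceeds level by level.

Below is how matcherFromTokens() handles positional pseudo-classes.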
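The four stages for .mark li.limark:first.limark2 a span can be walked through on a mock tree. Everything here is a hypothetical stand-in: `el`, `hasClass` and `find` imitate elements and descendant lookup, and the stages are written out by hand rather than produced by setMatcher.

```javascript
// Hypothetical mock element with a tag, a class list and child nodes.
function el(tag, classes, children) {
    return { tag: tag, classes: classes, children: children || [] };
}
function hasClass(node, name) {
    return node.classes.indexOf(name) > -1;
}
// Collect the descendants of the given contexts that pass the test.
function find(contexts, test) {
    var out = [];
    function walk(node) {
        node.children.forEach(function (child) {
            if (test(child) && out.indexOf(child) === -1) { out.push(child); }
            walk(child);
        });
    }
    contexts.forEach(walk);
    return out;
}

var s1 = el("span", ["s1"]);
var s2 = el("span", ["s2"]);
var li1 = el("li", ["limark", "limark2"], [el("a", [], [s1])]);
var li2 = el("li", ["limark", "limark2"], [el("a", [], [s2])]);
var root = el("body", [], [el("ul", ["mark"], [li1, li2])]);

// 1. Selector before the positional pseudo-class: ".mark li.limark"
var marks = find([root], function (n) { return hasClass(n, "mark"); });
var pre = find(marks, function (n) { return n.tag === "li" && hasClass(n, "limark"); });

// 2. The positional pseudo-class itself: ":first" keeps only the first match
var positioned = pre.slice(0, 1);

// 3. The selector attached to it: ".limark2"
var filtered = positioned.filter(function (n) { return hasClass(n, "limark2"); });

// 4. Selector after it, "a span", queried with the survivors as context
var anchors = find(filtered, function (n) { return n.tag === "a"; });
var result = find(anchors, function (n) { return n.tag === "span"; });
```

Because :first is applied to the whole set found in stage 1, only the span under the first li survives, even though both li elements carry the same classes.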

// This is the same for loop in matcherFromTokens discussed earlier, but one part was skipped there
for ( ; i < len; i++ ) {
        if ( (matcher = Expr.relative[ tokens[i].type ]) ) {
            matchers = [ addCombinator(elementMatcher( matchers ), matcher) ];
        } else {
            matcher = Expr.filter[ tokens[i].type ].apply( null, tokens[i].matches );

            // Return special upon seeing a positional matcher
            // This is the logic that handles positional pseudo-classes
            if ( matcher[ expando ] ) {
                // Find the next relative operator (if any) for proper handling
                j = ++i;
                for ( ; j < len; j++ ) { // find the position of the next combinator and record it in j
                    if ( Expr.relative[ tokens[j].type ] ) {
                        break;
                    }
                }
                return setMatcher( // setMatcher is the factory method that builds the positional pseudo-class query
                    i > 1 && elementMatcher( matchers ), // matcher for the part before the positional pseudo-class
                    i > 1 && toSelector(
                        // If the preceding token was a descendant combinator, insert an implicit any-element `*`
                        tokens.slice( 0, i - 1 ).concat({ value: tokens[ i - 2 ].type === " " ? "*" : "" })
                    ).replace( rtrim, "$1" ), // selector before the positional pseudo-class
                    matcher, // matcher for the positional pseudo-class itself
                    i < j && matcherFromTokens( tokens.slice( i, j ) ), // filter attached to the positional pseudo-class
                    j < len && matcherFromTokens( (tokens = tokens.slice( j )) ), // matcher for the part after the positional pseudo-class
                    j < len && toSelector( tokens ) // selector after the positional pseudo-class
                );
            }
            matchers.push( matcher );
        }
    }


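Below is the source of setMatcher(). The final matcher is generated here and returned to compile().

// Argument 1, preFilter: the pre-filter, i.e. the matcher for the part before the pseudo-class token, `.mark li.limark`
// Argument 2, selector: the selector before the pseudo-class (`.mark li.limark`)
// Argument 3, matcher: the matcher for the positional pseudo-class itself, `:first`
// Argument 4, postFilter: the filter right after the pseudo-class, `.limark2`
// Argument 5, postFinder: the post-finder, which searches the remaining rules within the set filtered so far; the matcher for ` a span`
// Argument 6, postSelector: the selector string for the post-finder, i.e. ` a span`
function setMatcher( preFilter, selector, matcher, postFilter, postFinder, postSelector ) {
    // TODO: setMatcher wraps these two in setMatcher again; not fully understood yet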
    if ( postFilter && !postFilter[ expando ] ) {
        postFilter = setMatcher( postFilter );
    }
    if ( postFinder && !postFinder[ expando ] ) {
        postFinder = setMatcher( postFinder, postSelector );
    }
    
    return markFunction(function( seed, results, context, xml ) {
        var temp, i, elem,
            preMap = [],
            postMap = [],
            preexisting = results.length,

            // Get initial elements from seed or context
            elems = seed || multipleContexts( selector || "*", context.nodeType ? [ context ] : context, [] ),

            // Prefilter to get matcher input, preserving a map for seed-results synchronization
            matcherIn = preFilter && ( seed || !selector ) ?
                condense( elems, preMap, preFilter, context, xml ) :
                elems,

            matcherOut = matcher ?
                // If we have a postFinder, or filtered seed, or non-seed postFilter or preexisting results,
                postFinder || ( seed ? preFilter : preexisting || postFilter ) ?

                    // ...intermediate processing is necessary
                    [] :

                    // ...otherwise use results directly
                    results :
                matcherIn;

        // Find primary matches
        if ( matcher ) {
            // This is where the positional pseudo-class is matched: the nodes that satisfy it are picked out
            matcher( matcherIn, matcherOut, context, xml );
        }

        // Apply postFilter
        if ( postFilter ) {
            temp = condense( matcherOut, postMap );
            postFilter( temp, [], context, xml );

            // Un-match failing elements by moving them back to matcherIn
            i = temp.length;
            while ( i-- ) {
                if ( (elem = temp[i]) ) {
                    matcherOut[ postMap[i] ] = !(matcherIn[ postMap[i] ] = elem);
                }
            }
        }

        if ( seed ) {
            if ( postFinder || preFilter ) {
                if ( postFinder ) {
                    // Get the final matcherOut by condensing this intermediate into postFinder contexts
                    temp = [];
                    i = matcherOut.length;
                    while ( i-- ) {
                        if ( (elem = matcherOut[i]) ) {
                            // Restore matcherIn since elem is not yet a final match
                            temp.push( (matcherIn[i] = elem) );
                        }
                    }
                    postFinder( null, (matcherOut = []), temp, xml );
                }

                // Move matched elements from seed to results to keep them synchronized
                i = matcherOut.length;
                while ( i-- ) {
                    if ( (elem = matcherOut[i]) &&
                        (temp = postFinder ? indexOf( seed, elem ) : preMap[i]) > -1 ) {

                        seed[temp] = !(results[temp] = elem);
                    }
                }
            }

        // Add elements to results, through postFinder if defined
        } else {
            matcherOut = condense(
                matcherOut === results ?
                    matcherOut.splice( preexisting, matcherOut.length ) :
                    matcherOut
            );
            if ( postFinder ) {
                postFinder( null, results, matcherOut, xml );
            } else {
                push.apply( results, matcherOut );
            }
        }
    });
}
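setMatcher relies on a helper, condense, that the listing above does not show. Judging only from its two call sites, condense( elems, preMap, preFilter, context, xml ) and condense( matcherOut, postMap ), it appears to drop empty slots, optionally apply a filter, and record the original indexes in a map. The sketch below is written under that assumption and is not copied from the source.

```javascript
// Assumed behaviour: keep truthy entries that pass the optional filter,
// and remember where each kept entry came from when a map is supplied.
function condense(unmatched, map, filter, context, xml) {
    var elem,
        newUnmatched = [],
        i = 0,
        len = unmatched.length,
        mapped = map != null;

    for (; i < len; i++) {
        if ((elem = unmatched[i])) {
            if (!filter || filter(elem, context, xml)) {
                newUnmatched.push(elem);
                if (mapped) { map.push(i); }
            }
        }
    }
    return newUnmatched;
}

// Strings stand in for elements; false marks a slot that was already un-matched.
var preMap = [];
var elems = ["li1", false, "li2", "li3"];
var kept = condense(elems, preMap, function (elem) { return elem !== "li2"; });
```

After the call, `kept` holds the two surviving entries and `preMap` holds their positions in `elems`, which is what lets setMatcher write results back to the matching seed slots.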



Method name	Method description	Compatibility description
getElementById	Query element based on element ID	IE6, Firefox 2, Chrome 4, Safari 3.1
getElementsByTagName	Query elements based on element name	IE6, Firefox 2, Chrome 4, Safari 3.1