javascript - Problem with the jQuery :first-child selector when crawling a web page
巴扎黑 2017-05-16 13:28:41

While crawling a website, I noticed that the h2 and h3 elements appear to have the same structure. Why does h2:first-child return data while h3:first-child does not?

In the final results, h2_1 and h2_2 are the same, as expected.
h3_1 is fine, but h3_2 is empty. Why is that?

The code is as follows:

const jsdom = require('jsdom');
const jquery = require('jquery');

const url = 'https://www.osram.com/os/news-and-events/spotlights/index.jsp';

jsdom.env(url, [], {
    defaultEncoding: 'utf-8'
}, function(err, window) {
    if (err) {
        // Pass the URL so the %s placeholder is actually filled in.
        console.error('error getting news url from page [%s]', url);
        return;
    }

    const $ = jquery(window);

    // First teaser block on the page.
    const el = $('p.col-xs-6.col-sm-7.colalign:first');

    // Same selector with and without :first-child, for h2 ...
    const h2_1 = el.find('h2.font-headline-teaser').text();
    console.log('h2_1=' + h2_1);
    const h2_2 = el.find('h2.font-headline-teaser:first-child').text();
    console.log('h2_2=' + h2_2);

    // ... and for h3.
    const h3_1 = el.find('h3.font-sub-headline').text();
    console.log('h3_1=' + h3_1);
    const h3_2 = el.find('h3.font-sub-headline:first-child').text();
    console.log('h3_2=' + h3_2);

    window.close();
});

All replies (1)
为情所困

The selector xxx:first-child selects xxx only when the first child element of xxx's parent is itself xxx. Both conditions must hold at the same time.

It does not mean "whatever element happens to be the first child of xxx's parent", nor does it mean "the first xxx among the children of xxx's parent".

The first child element of the parent of h2.font-headline-teaser is h2.font-headline-teaser, so it is selected.

The first child element of the parent of h3.font-sub-headline is not h3.font-sub-headline, so the result is empty.
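The rule above can be illustrated with a small dependency-free sketch. The page's real markup is not shown in the question, so the `teaser` object below is a hypothetical stand-in that assumes an h2 followed by an h3 under the same parent. The helper names `firstChildText` and `firstMatchText` are also made up for illustration, and the model only looks at direct children, whereas jQuery's `.find()` searches all descendants.

```javascript
// Hypothetical stand-in for the teaser block. The h2-then-h3 layout is an
// assumption, since the real page markup is not shown in the question.
const teaser = {
  tag: 'p',
  children: [
    { tag: 'h2', cls: 'font-headline-teaser', text: 'Headline' },
    { tag: 'h3', cls: 'font-sub-headline', text: 'Sub-headline' },
  ],
};

// Models "tag.cls:first-child": the element must match tag.cls AND be the
// first child of its parent. Returns '' when the first child does not match.
function firstChildText(parent, tag, cls) {
  const first = parent.children[0];
  return first && first.tag === tag && first.cls === cls ? first.text : '';
}

// Models "first matching element": the first child that matches tag.cls,
// wherever it sits among its siblings.
function firstMatchText(parent, tag, cls) {
  const hit = parent.children.find((c) => c.tag === tag && c.cls === cls);
  return hit ? hit.text : '';
}

console.log('h2_2=' + firstChildText(teaser, 'h2', 'font-headline-teaser')); // prints "h2_2=Headline"
console.log('h3_2=' + firstChildText(teaser, 'h3', 'font-sub-headline'));    // prints "h3_2="
console.log('h3 first match=' + firstMatchText(teaser, 'h3', 'font-sub-headline')); // prints "h3 first match=Sub-headline"
```

If the page really has this layout, then in jQuery itself `el.find('h3.font-sub-headline').first()` or the `:first-of-type` pseudo-class would be the usual way to ask for the first h3 rather than for an h3 that is the first child.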
