PHP cURL does not return all DOM tags when collecting reviews
P粉677684876
P粉677684876 2023-09-12 20:03:06

I want to implement some code to collect comments from a specific page DOM.

The cURL result is incomplete, and I don't know why: some child tags that are visible in the DOM do not appear in the result.

The DOM looks like this in the inspector:

I am trying to fetch the page with the following code snippet:

$domain = 'feefo.com';
$page_id = 'firebrand-promotions';

$curli = curl_init();

curl_setopt_array($curli, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_FRESH_CONNECT => true,
    CURLOPT_URL => 'https://www.' . $domain . '/en-US/reviews/' . $page_id . '?displayFeedbackType=SERVICE&timeFrame=YEAR',
    CURLOPT_HTTPHEADER => [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'Accept-Language: en-US;q=0.8,en;q=0.7',
        'Cache-control: max-age=0',
        'Referer: https://' . $domain,
        'sec-fetch-mode: navigate',
        'sec-fetch-site: none',
        'sec-fetch-dest: document',
        'sec-fetch-user: ?1',
        'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
    ]
]);

$curlResult = curl_exec($curli);

What I see in the cURL result content section is this:

<div class="container">
    <global></global>
</div>

So the tag looks empty, but it shouldn't be.

I am trying to extract the tag content using the following code:

$dom = new DOMDocument();
$dom->validateOnParse = true;
@$dom->loadHTML($curlResult);

$globals = $dom->getElementsByTagName('global');

$xmlPath = new DOMXPath($dom);

$reviews = $xmlPath->query('//global');

But I still don't see any child tags inside the <global> tag.

Can someone explain this problem to me? How can I solve it?

Thank you very much for your help, effort and time. :)


reply all(1)
P粉124070451

It is very likely that what you get with cURL is exactly the HTML the browser receives, but the browser then starts executing JavaScript that modifies the DOM.

You can't see that content with cURL because cURL cannot execute JavaScript.
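
One way to check this is to inspect the raw HTML yourself: if the element is empty in the response and the page loads scripts, the content is almost certainly injected client-side. Below is a minimal sketch of such a check. The helper names (loadRawHtml, hasServerRenderedContent, listScriptSources) and the sample markup, including the /app.js path, are made up for illustration; in practice you would pass in $curlResult.

```php
<?php
// Hypothetical helper: parse raw HTML without emitting warnings for
// non-standard tags such as <global>.
function loadRawHtml(string $html): DOMDocument
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}

// Hypothetical helper: true if the first <$tagName> element already has
// child elements or non-blank text in the HTML the server sent.
function hasServerRenderedContent(string $html, string $tagName): bool
{
    $node = loadRawHtml($html)->getElementsByTagName($tagName)->item(0);
    if ($node === null) {
        return false; // the tag is not in the response at all
    }
    return trim($node->textContent) !== ''
        || $node->getElementsByTagName('*')->length > 0;
}

// Hypothetical helper: list the src attribute of every external script,
// i.e. the code a browser would run after loading the page.
function listScriptSources(string $html): array
{
    $sources = [];
    foreach (loadRawHtml($html)->getElementsByTagName('script') as $script) {
        $src = $script->getAttribute('src');
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Sample markup standing in for $curlResult (the script path is assumed).
$sample = '<html><body><div class="container"><global></global></div>'
        . '<script src="/app.js"></script></body></html>';

var_dump(hasServerRenderedContent($sample, 'global')); // bool(false)
print_r(listScriptSources($sample));
```

If the check returns false and scripts are present, the reviews are rendered in the browser. In that case you would need either a tool that executes JavaScript (a headless browser) or the data request the page itself makes, which you can look for in the Network tab of the browser's developer tools.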
