Extracting Information from Shadow-Root Elements Using Selenium Python
In this post, we extract product information from the search page https://www.tiendasjumbo.co/buscar?q=mani. The product elements on that page sit inside a #shadow-root (open) node, so ordinary locators run against the main document do not find them.
Understanding the Shadow-Root
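A shadow root is the root of a shadow DOM tree attached to a host element. Shadow DOM encapsulates a component's markup and styles, so its elements are not matched by queries run against the main HTML document. To reach them, first locate the host element, then query through its shadowRoot property, which is only exposed when the root is open.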
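To make the host-then-shadowRoot lookup concrete, here is a minimal sketch that assembles the JavaScript expression for one or more nested open shadow roots. The helper name shadow_query_script is hypothetical and not part of Selenium; it only builds a string that could later be passed to execute_script().

```python
import json


def shadow_query_script(host_selectors, inner_selector):
    """Build a JS snippet that walks through open shadow roots.

    host_selectors: CSS selectors of the shadow hosts, outermost first.
    inner_selector: CSS selector evaluated inside the last shadow root.
    (Hypothetical helper for illustration; not a Selenium API.)
    """
    if not host_selectors:
        raise ValueError("at least one shadow host selector is required")
    expr = "document"
    for selector in host_selectors:
        # json.dumps produces a safely quoted JavaScript string literal
        expr += f".querySelector({json.dumps(selector)}).shadowRoot"
    return f"return {expr}.querySelector({json.dumps(inner_selector)});"


script = shadow_query_script(["impulse-search"], "span.formatted-text")
print(script)
```

The generated expression only works for open shadow roots; for a closed root, shadowRoot is null and the lookup fails in the browser.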
Solution: Using ShadowRoot.querySelector()
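To extract the product label, we run JavaScript from Selenium: locate the impulse-search host element, step into its shadowRoot, and call querySelector() with the label's CSS selector.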
Code Example:
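<code class="python">import time

from selenium import webdriver
from selenium.webdriver.firefox.service import Service

# Selenium 4 style; a raw string keeps the Windows path's backslashes literal
service = Service(executable_path=r"C:\Program Files (x86)\geckodriver.exe")
driver = webdriver.Firefox(service=service)

url = "https://www.tiendasjumbo.co/buscar?q=mani"
driver.maximize_window()
driver.get(url)
time.sleep(4)  # crude wait for the page's components to render

# Find the shadow host, step into its open shadow root, then query inside it
item = driver.execute_script(
    "return document.querySelector('impulse-search').shadowRoot"
    ".querySelector('div.group-name-brand h1.impulse-title span.formatted-text')"
)
print(item.text)</code>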
This code prints the label of the first matching product on the search page, because querySelector() returns only the first element that matches the selector.
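A search page usually lists several products, and its components may render some time after the page loads. The following is a minimal sketch, under those assumptions, that polls the page and collects the text of every match with querySelectorAll(). The function name wait_for_shadow_texts and the timeout and poll parameters are hypothetical; the only Selenium call it relies on is driver.execute_script(script, *args), whose extra arguments are exposed to the script as arguments[0], arguments[1], and so on.

```python
import time

# JavaScript run in the page: returns the trimmed text of every match inside
# the host's open shadow root, or an empty list if the host is not ready yet.
_COLLECT_JS = """
const host = document.querySelector(arguments[0]);
if (!host || !host.shadowRoot) { return []; }
return Array.from(host.shadowRoot.querySelectorAll(arguments[1]))
            .map(el => el.textContent.trim());
"""


def wait_for_shadow_texts(driver, host_selector, inner_selector,
                          timeout=10.0, poll=0.5):
    """Poll until at least one match exists, then return all matched texts.

    Returns an empty list if nothing matches before the timeout expires.
    (Hypothetical helper for illustration; not a Selenium API.)
    """
    deadline = time.monotonic() + timeout
    while True:
        texts = driver.execute_script(_COLLECT_JS, host_selector, inner_selector)
        if texts:
            return texts
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll)
```

With a live driver, it could be called as wait_for_shadow_texts(driver, "impulse-search", "div.group-name-brand h1.impulse-title span.formatted-text"). Polling replaces the fixed sleep, so the script returns as soon as the labels are present instead of always waiting the full interval.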