Extracting Information from a Shadow Root Using Selenium Python
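On the page https://www.tiendasjumbo.co/buscar?q=mani, the product elements sit inside a #shadow-root (open), so ordinary Selenium locators cannot reach them. The following snippet illustrates the issue: the XPath lookup on the last line does not find the title, because the h1 lives inside the shadow root rather than in the main document tree.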
<code class="python">from selenium import webdriver
import time
from random import randint

# Raw string so the backslashes in the Windows path are taken literally
driver = webdriver.Firefox(executable_path=r"C:\Program Files (x86)\geckodriver.exe")
driver.implicitly_wait(10)
time.sleep(4)
url = "https://www.tiendasjumbo.co/buscar?q=mani"
driver.maximize_window()
driver.get(url)
# This lookup fails: the h1 is inside a shadow root, which a document-level XPath does not search
driver.find_element_by_xpath('//h1[@class="impulse-title"]')</code>
Solution:
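The products on the page are encapsulated within a shadow root attached to the impulse-search element. To reach them, run JavaScript through execute_script(): select the shadow host with document.querySelector(), then call shadowRoot.querySelector() on it to search inside the shadow tree. The following code demonstrates this strategy: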
<code class="python">driver.get('https://www.tiendasjumbo.co/buscar?q=mani')
# Run JavaScript in the page: find the shadow host, then query inside its shadow root
item = driver.execute_script(
    "return document.querySelector('impulse-search').shadowRoot"
    ".querySelector('div.group-name-brand h1.impulse-title span.formatted-text')"
)
print(item.text)</code>
Output:
La especial mezcla de nueces, maní, almendras y marañones x 450 g
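Because execute_script() runs immediately and is not covered by the implicit wait, the query above can return None if the shadow content has not rendered yet. The following is a minimal sketch of one way to guard against that, by polling until the inner element exists. The helper name query_shadow and its timeout and poll parameters are hypothetical, introduced here only for illustration; the only driver call it relies on is execute_script().

```python
import time

# JavaScript run in the page. arguments[0] is the shadow host selector,
# arguments[1] is the selector to look up inside the host's shadow root.
SHADOW_QUERY_JS = """
const host = document.querySelector(arguments[0]);
if (!host || !host.shadowRoot) { return null; }
return host.shadowRoot.querySelector(arguments[1]);
"""


def query_shadow(driver, host_selector, inner_selector, timeout=10.0, poll=0.5):
    """Hypothetical helper: poll until the element inside the shadow root exists.

    Returns whatever execute_script() returns for the matched element
    (a WebElement with a real driver). Raises TimeoutError if nothing
    matches before the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        element = driver.execute_script(SHADOW_QUERY_JS, host_selector, inner_selector)
        if element is not None:
            return element
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"No match for {inner_selector!r} inside shadow root of {host_selector!r}"
            )
        time.sleep(poll)
```

With the driver from the example above, a call such as query_shadow(driver, 'impulse-search', 'div.group-name-brand h1.impulse-title span.formatted-text') would stand in for the direct execute_script() call, and the returned element's .text could be printed as before.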
Note:
Version 96 of Microsoft Edge and Google Chrome introduced changes in shadow root handling, so shadow-root code written for earlier browser versions may need to be updated.
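As one possible alternative to running JavaScript, the sketch below assumes Selenium 4.1 or newer, where a WebElement exposes a shadow_root property whose find_element() searches inside the shadow tree. The function name find_in_shadow is hypothetical, and the sketch sticks to CSS selectors on the assumption that other locator strategies may not be supported inside a shadow root.

```python
# Same value as selenium.webdriver.common.by.By.CSS_SELECTOR; written as a
# plain string so this sketch carries no import of its own.
CSS = "css selector"


def find_in_shadow(driver, host_selector, inner_selector):
    """Hypothetical helper: locate an element inside an open shadow root.

    Assumes Selenium 4.1+ (WebElement.shadow_root) and a browser version
    that supports the WebDriver shadow root commands.
    """
    # Locate the shadow host in the regular document tree
    host = driver.find_element(CSS, host_selector)
    # Search inside the host's shadow root with a CSS selector
    return host.shadow_root.find_element(CSS, inner_selector)
```

For the page discussed here, the call would look like find_in_shadow(driver, 'impulse-search', 'div.group-name-brand h1.impulse-title span.formatted-text').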