Web scraping: Missing href attribute - Need to simulate mouse clicks for web scraping?

Question

For a fun web scraping project, I want to collect NHL data from https://www.nhl.com/stats/teams. There is a clickable Excel export label, and I can find it using selenium and bs4. Unfortunately, that's where it ends: I can't seem to access the data because the tag has no href attribute. I got what I wanted by simulating mouse clicks with pynput, but that feels awkward, so I'm wondering whether it could be done differently.

P粉807471604 · Answer

There is no href attribute; the download is triggered through JavaScript instead. With selenium, find the element and call .click() on it to download the file:

driver.find_element(By.CSS_SELECTOR,'h2>a').click()

The CSS selector above matches the <a> that is a direct child of the <h2>. Alternatively, select the element directly by its class, which starts with styles__ExportIcon:

driver.find_element(By.CSS_SELECTOR,'a[class^="styles__ExportIcon"]').click()
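To make it clearer what the a[class^="styles__ExportIcon"] selector matches, here is a rough stand-alone sketch using only the standard library. The sample markup and the class suffix in it are made up for illustration and are not copied from the real page; the PrefixMatcher class is a hypothetical helper, not part of selenium or bs4.

```python
from html.parser import HTMLParser

# Hypothetical markup, loosely modelled on the description above.
# The real page's class suffix is unknown and will differ.
SAMPLE = """
<h2>
  <span>Teams</span>
  <a class="styles__ExportIcon-sc-abc123" role="button">Export</a>
</h2>
<div><a class="other" href="/x">link</a></div>
"""


class PrefixMatcher(HTMLParser):
    """Collect <a> tags whose class attribute starts with a given prefix,
    mimicking what a[class^="..."] selects."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.matches = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class") or ""
        # ^= compares against the whole attribute string, not single class names
        if tag == "a" and cls.startswith(self.prefix):
            self.matches.append(attrs)


matcher = PrefixMatcher("styles__ExportIcon")
matcher.feed(SAMPLE)
for attrs in matcher.matches:
    # Print the matched class and whether the tag carries an href at all
    print(attrs.get("class"), "href" in attrs)
```

Note that ^= tests the start of the entire class attribute value, so it would stop matching if another class name were ever placed in front of the styles__ExportIcon one. In that case a[class*="styles__ExportIcon"] (substring match) would be the more forgiving choice.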

Example

You may need to deal with the OneTrust cookie banner first, so dismiss it and then download the table.

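from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Start Chrome with a driver binary fetched by webdriver_manager
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

url = 'https://www.nhl.com/stats/teams'
driver.get(url)
# Reject cookies to dismiss the OneTrust banner before clicking anything else
driver.find_element(By.CSS_SELECTOR, '#onetrust-reject-all-handler').click()
# Click the export link (the <a> directly inside the <h2>) to start the download
driver.find_element(By.CSS_SELECTOR, 'h2>a').click()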
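Since .click() returns before the file has finished downloading, you may also want to wait for the file to appear before reading it or closing the browser. Below is a minimal sketch of one way to do that, using only the standard library. The helper names snapshot and wait_for_new_file are hypothetical, and the set of temporary suffixes is an assumption based on Chrome writing partial downloads as .crdownload files.

```python
import time
from pathlib import Path

# Suffixes assumed to mark unfinished downloads (Chrome uses .crdownload)
TEMP_SUFFIXES = {".crdownload", ".tmp"}


def snapshot(directory):
    """Return the set of file names currently in the directory."""
    return {p.name for p in Path(directory).iterdir()}


def wait_for_new_file(directory, before, timeout=30.0, poll=0.5):
    """Poll the directory until a finished file that is not in `before`
    shows up; return its path, or None if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        entries = list(Path(directory).iterdir())
        in_progress = any(p.suffix in TEMP_SUFFIXES for p in entries)
        new_files = [
            p for p in entries
            if p.name not in before and p.suffix not in TEMP_SUFFIXES
        ]
        # Only report a file once no partial download is left behind
        if new_files and not in_progress:
            return new_files[0]
        time.sleep(poll)
    return None
```

To use it, call snapshot(download_dir) right before the export .click() and wait_for_new_file(download_dir, before) right after it, where download_dir is wherever your browser saves files (often Path.home() / "Downloads", but that depends on your setup).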