Web scraping: missing href attribute - do I need to simulate a mouse click to scrape the page?

Question

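For a fun web scraping project, I want to collect NHL data from https://www.nhl.com/stats/teams. There is a clickable Excel export tag, which I can find using selenium and bs4. Unfortunately, that is where it ends: since the tag has no href attribute, I can't seem to access the data. I got what I wanted by simulating a mouse click with pynput, but I wonder: could I do this differently? It feels clumsy. -> The tag with the export icon can be found here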

P粉807471604 · Answer

There is no href attribute because the download is triggered by JS. With selenium, find your element and call .click() on it to download the file:

driver.find_element(By.CSS_SELECTOR,'h2>a').click()

This uses a CSS selector to get the <a> that is a direct child of the <h2>.

Or select it directly by its class, which starts with styles__ExportIcon:

driver.find_element(By.CSS_SELECTOR,'a[class^="styles__ExportIcon"]').click()

Example

You may need to deal with the onetrust banner, so click it away first and then download the table.

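from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Start Chrome with a driver binary fetched by webdriver_manager
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
# Poll up to 10 s when locating elements, since the page is rendered by JS
driver.implicitly_wait(10)

url = 'https://www.nhl.com/stats/teams'
driver.get(url)
# Dismiss the onetrust cookie banner so it does not cover the export link
driver.find_element(By.CSS_SELECTOR, '#onetrust-reject-all-handler').click()
# Click the export link; the page's JS then starts the download
driver.find_element(By.CSS_SELECTOR, 'h2>a').click()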
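
Two things can still go wrong with the example above: the banner may not be shown at all (in which case find_element raises), and the script may exit before the browser has finished writing the file. The following is only a rough sketch of how that could be handled, not part of the original answer. The helper names click_if_present and wait_for_download are made up for illustration, and the download directory is assumed to be whatever folder your browser saves to. The helpers use only the standard library and accept any object with a find_elements method, such as a selenium driver.

```python
import time
from pathlib import Path

# Suffixes browsers give to unfinished downloads (Chrome: .crdownload, Firefox: .part)
PARTIAL_SUFFIXES = {".crdownload", ".part"}


def click_if_present(driver, css_selector):
    """Click the first element matching css_selector; return False if there is none."""
    # "css selector" is the string value behind selenium's By.CSS_SELECTOR.
    # find_elements returns an empty list instead of raising when nothing matches.
    matches = driver.find_elements("css selector", css_selector)
    if not matches:
        return False
    matches[0].click()
    return True


def wait_for_download(directory, known_files=(), timeout=30.0, poll=0.5):
    """Return the Path of the first finished new file in directory, or None on timeout.

    known_files holds the file names that existed before the click (hypothetical
    bookkeeping done by the caller), so only newly created files are reported.
    """
    directory = Path(directory)
    known = set(known_files)
    deadline = time.monotonic() + timeout
    while True:
        for path in sorted(directory.iterdir()):
            finished = path.suffix not in PARTIAL_SUFFIXES
            if path.is_file() and path.name not in known and finished:
                return path
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

With a real driver this might be used as follows: record the names in the download folder first, call click_if_present(driver, '#onetrust-reject-all-handler'), click the export link as in the example, then call wait_for_download(download_dir, names_before) before quitting the driver. Note that click_if_present only checks once, so a banner that appears late would still be missed unless an implicit or explicit wait is in place.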