Scraping the web using Python but not sure what to do with static(?) URLs
P粉293341969
P粉293341969 2024-02-17 17:14:50

I'm trying to learn how to extract data from this url: https://denver.coloradotaxsale.com/index.cfm?folder=auctionResults&mode=preview

However, when I switch pages the URL doesn't change, so I'm not sure how to enumerate or loop over the pages. The site has about 3,000 sales records, so I'm looking for a better way to collect them.

This is my starting code. It's very simple, but I would appreciate any help or tips. I think I might need to switch to another package, but I'm not sure which one. Maybe BeautifulSoup?

import requests
import pandas as pd

url = "https://denver.coloradotaxsale.com/index.cfm?folder=auctionResults&mode=preview"

html = requests.get(url).content
df_list = pd.read_html(html, header=1)[0]
df_list = df_list.drop([0, 1, 2])  # drop the rows that aren't needed


reply all(1)
P粉600845163

To get data from more pages, you can use the following example:

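import requests
import pandas as pd
from bs4 import BeautifulSoup


# Form fields sent with each POST request; "pageNum" selects the results page.
data = {
    "folder": "auctionResults",
    "loginID": "00",
    "pageNum": "1",
    "orderBy": "AdvNum",
    "orderDir": "asc",
    "justFirstCertOnGroups": "1",
    "doSearch": "true",
    "itemIDList": "",
    "itemSetIDList": "",
    "interest": "",
    "premium": "",
    "itemSetDID": "",
}

url = "https://denver.coloradotaxsale.com/index.cfm?folder=auctionResults&mode=preview"


all_data = []

# Each iteration assigns the next page number to data["pageNum"].
for data["pageNum"] in range(1, 3):
    ...  # the loop body is not included in this copy of the answer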

Print:

|     | SEQ NUM | Tax Year | Notify | Plot ID          | Face Amount | Winning Bid | Sold To  |
|-----|---------|----------|--------|------------------|-------------|-------------|----------|
| 96  | 000094  | 2020     |        | 00031-18-001-000 | $905.98     | $81.00      | 00005517 |
| 97  | 000095  | 2020     |        | 00031-18-002-000 | $750.13     | $75.00      | 00005517 |
| 98  | 000096  | 2020     |        | 00031-18-003-000 | $750.13     | $75.00      | 00005517 |
| 99  | 000097  | 2020     |        | 00031-18-004-000 | $750.13     | $76.00      | 00005517 |
| 100 | 000098  | 2020     |        | 00031-18-007-000 | $750.13     | $76.00      | 00005517 |
| 101 | 000099  | 2020     |        | 00031-18-008-000 | $905.98     | $84.00      | 00005517 |
| 102 | 000100  | 2020     |        | 00031-19-001-000 | $1,999.83   | $171.00     | 00005517 |
| 103 | 000101  | 2020     |        | 00031-19-004-000 | $1,486.49   | $131.00     | 00005517 |
| 104 | 000102  | 2020     |        | 00031-19-006-000 | $1,063.44   | $96.00      | 00005517 |
| 105 | 000103  | 2020     |        | 00031-20-001-000 | $1,468.47   | $126.00     | 00005517 |
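
The body of the page loop is cut off in the snippet above, so here is a minimal sketch of one way such a loop could work. It is not necessarily the code the answer used. It assumes the site returns each results page when the form fields are sent with a POST request to the same URL and `pageNum` is changed, and that the results sit in plain `<td>` table rows. The names `parse_rows` and `scrape_pages` are hypothetical helpers introduced only for illustration, and the real page may need a more specific selector than "every `<tr>`".

```python
import requests
from bs4 import BeautifulSoup

URL = "https://denver.coloradotaxsale.com/index.cfm?folder=auctionResults&mode=preview"


def parse_rows(html):
    """Return the cell text of every table row that contains <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # rows made only of <th> cells (headers) yield nothing
            rows.append(cells)
    return rows


def scrape_pages(form_data, first_page, last_page, post=requests.post):
    """POST the form once per page number and collect the parsed rows.

    Assumption: the server picks the results page from the "pageNum" field.
    `post` is injectable so the function can be tried without network access.
    """
    all_rows = []
    for page in range(first_page, last_page + 1):
        # Copy the form so the caller's dict is left unchanged.
        payload = dict(form_data, pageNum=str(page))
        response = post(URL, data=payload, timeout=30)
        response.raise_for_status()
        all_rows.extend(parse_rows(response.text))
    return all_rows
```

With the `data` dictionary from the answer, `scrape_pages(data, 1, 2)` would request pages 1 and 2. The returned list of rows can then be passed to `pd.DataFrame` once the column names have been checked against the real page.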