How to scrape specific Google Weather text using BeautifulSoup?
P粉275883973 2024-04-01 14:06:14

How can I find the text "New York City, USA" in Python using BeautifulSoup?

I tried following a video tutorial for practice, but its code no longer works.

I could not find anything in the official documentation. Is my get_html_content function perhaps not working properly, with Google simply blocking me, which would explain the empty list / None?

This is my current code:

from django.shortcuts import render
import requests

def get_html_content(city):
    USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"
    LANGUAGE = "en-US,en;q=0.5"
    session = requests.Session()
    session.headers['User-Agent'] = USER_AGENT
    session.headers['Accept-Language'] = LANGUAGE
    session.headers['Content-Language'] = LANGUAGE
    city = city.replace(" ", "+")  # str.replace returns a new string, so the result must be assigned
    html_content = session.get(f"https://www.google.com/search?q=weather+in+{city}").text
    return html_content

def home(request):
    result = None
    if 'city' in request.GET: 
        city = request.GET.get('city')
        html_content = get_html_content(city)
        from bs4 import BeautifulSoup
        soup = BeautifulSoup(html_content, 'html.parser')
        soup.find_all('div', attrs={'class': 'wob_loc q8U8x'})
        # OR
        soup.find_all('div', attrs={'id': 'wob_loc'})

Both calls return an empty list (and the equivalent .find call returns None).


All replies (1)
P粉509383150

The layout of the Google results page may have changed in the meantime, so the code has to be updated to get the weather data. For example:

import requests
from bs4 import BeautifulSoup


params = {'q':'weather in New York City, New York, USA', 'hl': 'en'}
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:108.0) Gecko/20100101 Firefox/108.0'}
cookies = {'CONSENT':"YES+cb.20220419-08-p0.cs+FX+111"}

url = 'https://www.google.com/search'


soup = BeautifulSoup(requests.get(url, params=params, headers=headers, cookies=cookies).content, 'html.parser')

for t in soup.select('#wob_dp [aria-label]'):
    how = t.find_next('img')['alt']
    temp = t.find_next('span').get_text(strip=True)
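    # Column widths (5 and 20) are inferred from the alignment of the printed output below
    print('{:<5} {:<20} {}'.format(t.get_text(strip=True), how, temp))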

Print:

Mon   Sunny                8
Tue   Cloudy               7
Wed   Partly cloudy        11
Thu   Rain                 7
Fri   Mostly cloudy        8
Sat   Partly cloudy        6
Sun   Scattered showers    8
Mon   Showers              8
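
For the location text asked about in the question, a minimal sketch along the same lines might look like the following. It assumes the results page still contains an element with id wob_loc, which the question's empty results suggest may not always be the case. The names extract_location and fetch_weather_html are hypothetical helpers, not part of any library, and the sample HTML is a stand-in rather than real Google markup.

```python
from bs4 import BeautifulSoup


def extract_location(html):
    """Return the text of the element with id="wob_loc", or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.select_one("#wob_loc")  # assumption: the location element keeps this id
    return node.get_text(strip=True) if node else None


def fetch_weather_html(city):
    """Fetch the Google results page for a weather query (hypothetical helper)."""
    import requests  # imported here so extract_location can be used without network access

    # Passing the query via params lets requests URL-encode spaces and commas
    params = {"q": f"weather in {city}", "hl": "en"}
    headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:108.0) Gecko/20100101 Firefox/108.0"}
    cookies = {"CONSENT": "YES+cb.20220419-08-p0.cs+FX+111"}
    response = requests.get("https://www.google.com/search", params=params,
                            headers=headers, cookies=cookies, timeout=10)
    response.raise_for_status()  # raise on HTTP error codes instead of parsing an error page
    return response.text


# Offline check with stand-in HTML (not real Google markup)
sample = '<div id="wob_loc">New York City, USA</div>'
print(extract_location(sample))
print(extract_location("<div>no weather widget here</div>"))
```

The two print calls output New York City, USA and None. If extract_location(fetch_weather_html("New York City")) returns None on the live page, saving the returned HTML to a file and searching it for wob_loc shows whether the weather widget was served at all, which helps to tell a blocked or consent-page response apart from a changed selector.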