How to Fake a Browser Visit with Python's Requests Library
When you fetch a page programmatically with Python's Requests package or the wget command, the HTML you receive may differ from what a web browser shows for the same URL. One common reason is that websites inspect incoming requests and respond differently to clients that do not look like a browser.
One effective way around this is to send a browser-like "User-Agent" header with your request. This header names the client software and version making the request. By default, Requests identifies itself as python-requests, which makes automated traffic easy to spot. Supplying a real browser's User-Agent string makes the request look as if it came from that browser.
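To see what a server receives when no header is supplied, the short sketch below prints the default User-Agent that Requests would send. It makes no network call; the variable names `default_ua` and `session_ua` are illustrative only.

```python
import requests

# The User-Agent that Requests sends when none is supplied,
# of the form "python-requests/<installed version>".
default_ua = requests.utils.default_user_agent()
print(default_ua)

# The same value is part of the default headers of a new Session.
session_ua = requests.Session().headers["User-Agent"]
print(session_ua == default_ua)
```

A server that sees this value can tell at once that the request did not come from a browser.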
To do this with Python's Requests library, pass a headers dictionary containing the User-Agent string to requests.get(). Example code:
    import requests

    url = 'http://www.ichangtou.com/#company:data_000008.html'
    # Present the request as coming from Chrome on macOS
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

    response = requests.get(url, headers=headers)
    print(response.content)  # raw response body as bytes
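If several requests go to the same site, one option is to set the header once on a Session rather than passing it on every call. The following is a minimal sketch under that assumption. The helper name `build_session` is hypothetical, and the request is only prepared, not sent, so the header can be inspected without touching the network.

```python
import requests

BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/39.0.2171.95 Safari/537.36"
)


def build_session(user_agent=BROWSER_UA):
    """Return a Session whose requests all carry the given User-Agent."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


session = build_session()

# Prepare (but do not send) a request to check which headers would go out.
request = requests.Request("GET", "http://www.ichangtou.com/")
prepared = session.prepare_request(request)
print(prepared.headers["User-Agent"])
```

Calling `session.get(url)` afterwards would send the same header on each request.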
For reference, a comprehensive list of User-Agent strings for different browsers is available here:
[List of all Browsers](https://deviceatlas.com/blog/list-of-user-agent-strings)
Alternatively, you can use the third-party fake-useragent package (installable with pip install fake-useragent), which generates realistic User-Agent strings for you. Here is a demonstration of its usage:
    from fake_useragent import UserAgent

    ua = UserAgent()
    # ua.chrome returns a Chrome User-Agent string
    request_headers = {'User-Agent': ua.chrome}
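Because fake-useragent is an optional dependency, a script may want to keep working when it is missing or fails to load its data. The sketch below shows one possible way to handle that. The helper `pick_user_agent` and the constant `FALLBACK_UA` are hypothetical names, and falling back to a fixed string is an assumption about what is acceptable for your use case.

```python
FALLBACK_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/39.0.2171.95 Safari/537.36"
)


def pick_user_agent():
    """Return a Chrome User-Agent from fake-useragent, or a fixed fallback."""
    try:
        from fake_useragent import UserAgent
        return UserAgent().chrome
    except Exception:
        # Package not installed, or it could not load its browser data.
        return FALLBACK_UA


request_headers = {"User-Agent": pick_user_agent()}
print(request_headers)
```

The resulting `request_headers` dictionary can be passed to `requests.get(url, headers=request_headers)` in the same way as a hand-written one.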