When fetching a web page with Python's requests library, you may encounter a "403 Forbidden" error. This status means the server understood the request but refuses to fulfill it, typically because the client lacks the authorization or permission the server expects.
Consider the following code:
<code class="python">import requests

url = 'http://worldagnetwork.com/'
result = requests.get(url)
print(result.content.decode())</code>
This code attempts to retrieve and decode the content of the specified URL. However, it produces the following output:
<code class="html"><html>
<head><title>403 Forbidden</title></head>
<body bgcolor="white">
<center><h1>403 Forbidden</h1></center>
<hr><center>nginx</center>
</body>
</html></code>
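In this specific case, the problem arises because the server rejects GET requests that do not carry a browser-like User-Agent header. A User-Agent header identifies the browser or application sending the request, which helps the server decide how to handle it. By default, requests identifies itself as python-requests followed by its version number, and some servers are configured to block that value.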
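To see what the server actually receives when no header is set, you can inspect the defaults that requests would attach. This is a minimal sketch that makes no network call; the variable names are illustrative only.

```python
import requests

# Headers that requests attaches when none are supplied explicitly.
default_headers = requests.utils.default_headers()

# The default User-Agent names the library, not a browser.
default_ua = default_headers["User-Agent"]
print(default_ua)  # a string of the form python-requests/ plus the installed version
```

A server that filters on User-Agent can recognise this value and answer with 403 even though the URL itself is public.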
To resolve this issue, explicitly specify a User-Agent header in your request. Here's an example:
<code class="python">import requests

url = 'http://worldagnetwork.com/'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
result = requests.get(url, headers=headers)
print(result.content.decode())</code>
With the User-Agent header set to a browser-like value, the request looks as if it came from a browser, and the server returns the website's content, as the following output shows:
<code class="html"><!doctype html> <!--[...]--> <!--[...]--></code>
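If you make several requests to the same site, it may be more convenient to set the header once on a session instead of passing it on every call. The sketch below is one possible arrangement, not the only one: `make_session` and `fetch_html` are hypothetical helper names, the User-Agent string is simply the one from the example above, and the 10-second timeout is an arbitrary choice.

```python
import requests

# Assumption: any mainstream browser User-Agent would do; this one is
# copied from the example above.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/50.0.2661.102 Safari/537.36"
)


def make_session(user_agent=BROWSER_UA):
    """Return a Session that sends the given User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_html(session, url, timeout=10):
    """Fetch a page and return its text, raising on 4xx/5xx responses."""
    response = session.get(url, timeout=timeout)
    # Turns a 403 (or any other error status) into requests.HTTPError
    # instead of silently returning the error page.
    response.raise_for_status()
    return response.text
```

Usage would look like `html = fetch_html(make_session(), 'http://worldagnetwork.com/')`. Calling `raise_for_status()` means a remaining 403 surfaces as an exception rather than as an HTML error page in your output.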