Certificate Verification Failed: Troubleshooting SSL Errors in Scraping
When scraping websites served over HTTPS (historically Secure Sockets Layer, or SSL; today TLS), developers may encounter the "CERTIFICATE_VERIFY_FAILED" error. This error means that Python could not verify the website's certificate against a set of trusted certificate authorities.
One common example of this error occurs when attempting to scrape Wikipedia using the following Python code:
<code class="python">import re
import urllib.request

import bs4

pages = set()

def getLinks(pageUrl):
    global pages
    html = urllib.request.urlopen("https://en.wikipedia.org" + pageUrl)
    # Specify a parser explicitly to avoid a GuessedAtParserWarning
    bsObj = bs4.BeautifulSoup(html, "html.parser")
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                # We have encountered a new page
                newPage = link.attrs['href']
                print(newPage)
                pages.add(newPage)
                getLinks(newPage)

getLinks("")</code>
When running this code, you may encounter the following error:
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1049)>
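Before applying a fix, it can help to confirm where your Python build expects to find its CA certificates. A quick standard-library check (the exact paths printed will vary by installation):

```python
import ssl

# Report the CA file and directory this Python build is configured to use.
paths = ssl.get_default_verify_paths()
print("cafile:", paths.cafile)   # may be None if no bundle is configured
print("capath:", paths.capath)
```

If `cafile` is `None` or points at a location that does not exist, certificate verification will fail for every HTTPS request, which matches the error above.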
Solution for macOS Users
If you're using macOS, the fix is simple. Navigate to Macintosh HD > Applications > Python 3.6 (or whichever Python version you're using) and double-click the "Install Certificates.command" file. This script installs a bundle of trusted root certificates (via the certifi package) for that Python installation, so that `urllib` can verify HTTPS certificates.
After running this command, the "CERTIFICATE_VERIFY_FAILED" error should no longer appear when scraping Wikipedia or other SSL-secured websites.
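If running the installer script is not an option, a certificate-verifying context can also be passed to `urlopen` explicitly. This is a minimal sketch using only the standard library; the commented-out CA-bundle path is a placeholder assumption, not a real location:

```python
import ssl
import urllib.request

# Build a context that verifies certificates and hostnames using the
# CA bundle this Python installation knows about.
context = ssl.create_default_context()

# If a separate CA bundle is available (e.g. from the certifi package),
# it can be loaded explicitly; the path here is a placeholder:
# context.load_verify_locations(cafile="/path/to/cacert.pem")

def fetch(url):
    # Pass the context so the TLS handshake uses the configured CAs.
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()

# Example usage: fetch("https://en.wikipedia.org/wiki/Main_Page")
```

This keeps verification enabled rather than disabling it (e.g. with an unverified context), which would silently expose the scraper to man-in-the-middle attacks.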