Can crawler technology crawl https?

May 29, 2019 pm 01:55 PM

First of all, let’s understand what https is

HTTPS is HTTP over SSL/TLS. In short, data that plain HTTP would send in cleartext is encrypted before it is transmitted. The encryption method and the session keys are negotiated in a handshake before any application data is sent, and the server proves its identity with a certificate. As a result, traffic captured in transit cannot be read, and forged or tampered traffic is detected and rejected.

A crawler essentially imitates a browser: it sends requests to the server and takes part in the whole exchange, including the SSL/TLS handshake. HTTPS links can therefore be crawled as well, provided the crawler's client can complete that handshake, which in practice means it can verify the server's certificate against a set of trusted CA certificates.
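As a minimal sketch of what "completing the handshake" involves on the client side, the following uses only the Python standard library rather than requests. The helper names make_verified_context and fetch are hypothetical, chosen for illustration; the default context loads the system's trusted CA certificates, requires a valid server certificate, and checks the host name.

```python
import ssl
import urllib.request


def make_verified_context():
    """Build a TLS context that verifies the server certificate (hypothetical helper)."""
    # The default context trusts the system CA store, requires a certificate
    # from the server, and checks that it matches the requested host name.
    return ssl.create_default_context()


def fetch(url, timeout=10):
    """Fetch url over HTTPS with certificate verification enabled."""
    ctx = make_verified_context()
    with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
        return resp.status, resp.read()
```

Calling fetch("https://www.baidu.com/") needs network access; if the server's certificate cannot be verified, the call raises an error instead of returning data.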

Find the source of the error

When a crawler reports an SSL error at run time, the usual causes are that the local CA certificates or the related SSL library are not installed correctly, or that the server uses a certificate issued by its own CA, which is not certified by an authoritative organization.
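To narrow down which of these causes applies, it can help to check whether the failure is a certificate verification failure or some other SSL problem. The sketch below uses the standard library (Python 3.7 or later is assumed for ssl.SSLCertVerificationError); classify_tls_failure and probe are hypothetical helper names, and the returned labels are only illustrative.

```python
import ssl
import urllib.error
import urllib.request


def classify_tls_failure(exc):
    """Return a short label saying whether exc is a certificate problem."""
    # urllib wraps low-level errors in URLError; the original error is in .reason
    if isinstance(exc, urllib.error.URLError):
        exc = exc.reason
    # Check the more specific class first: it is a subclass of ssl.SSLError
    if isinstance(exc, ssl.SSLCertVerificationError):
        return "certificate verification failed"
    if isinstance(exc, ssl.SSLError):
        return "other SSL/TLS error"
    return "not an SSL error"


def probe(url, timeout=10):
    """Open url with default verification and report what went wrong, if anything."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "ok"
    except (urllib.error.URLError, ssl.SSLError) as exc:
        return classify_tls_failure(exc)
```

A "certificate verification failed" result points at the CA side (missing local CA certificates or a server certificate from an untrusted CA), while "other SSL/TLS error" is more likely a library or protocol problem.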

Solving certificate exception issues

For CA certificate problems, the following solutions are available:

1. Skip CA certificate verification and suppress the security warning

    # coding=utf-8
    import requests
    import urllib3

    # If the CA certificate is not verified, the security warning has to be suppressed.
    # Method 1: disable urllib3 warnings
    urllib3.disable_warnings()
    # Method 2: disable only InsecureRequestWarning
    from urllib3.exceptions import InsecureRequestWarning
    urllib3.disable_warnings(InsecureRequestWarning)

    # verify=False skips certificate verification for this request
    r = requests.get(url="https://www.baidu.com/", verify=False)
    print(r.elapsed.total_seconds())
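For comparison, skipping verification roughly corresponds to the following TLS settings, shown here with the standard library's ssl module instead of requests. The helper name make_unverified_context is hypothetical. The connection is still encrypted, but the server's identity is no longer checked, so this is best limited to testing or to hosts you already trust.

```python
import ssl


def make_unverified_context():
    """Build a TLS context that does not verify the server certificate (hypothetical helper)."""
    ctx = ssl.create_default_context()
    # check_hostname must be switched off before verify_mode can be set to CERT_NONE
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

Such a context can be passed as the context argument of urllib.request.urlopen.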

2. Specify the path to a certificate file, or to a directory containing certificates (the directory must have been processed with the c_rehash utility supplied with OpenSSL)

    # coding=utf-8
    import requests

    # verify takes the path to a CA certificate file or to a certificate directory
    r = requests.get(url="https://www.baidu.com/", verify='/path/to/certfile')
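The same file-or-directory choice exists in the standard library's ssl module, where a certificate file is passed as cafile and a hashed certificate directory as capath. The sketch below is only an illustration of that distinction; ca_kwargs and make_context_for are hypothetical helper names, and the path passed in is whatever CA location you use.

```python
import os
import ssl


def ca_kwargs(path):
    """Choose the keyword argument for a CA location (hypothetical helper)."""
    # A directory of hashed certificates goes to capath, a single bundle file to cafile
    if os.path.isdir(path):
        return {"capath": path}
    return {"cafile": path}


def make_context_for(path):
    """Build a verifying TLS context that trusts the CA certificates at path."""
    return ssl.create_default_context(**ca_kwargs(path))
```

For example, make_context_for('/path/to/certfile') builds a context that trusts only the certificates in that file, assuming the file exists and contains PEM-encoded certificates.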
