This article shows how to fetch HTTPS content in PHP and how to work around an HTTPS problem encountered along the way.
Problem
I ran into an HTTPS problem while experimenting with the Hacker News API. All Hacker News API endpoints are served over the encrypted HTTPS protocol rather than plain HTTP, and when I used PHP's file_get_contents() function to fetch data from the API, an error occurred.
The code I used looked like this:
<?php $data = file_get_contents("https://hacker-news.firebaseio.com/v0/topstories.json"); // example endpoint; any https:// URL triggers the warning ?>
When running the above code, the following error message is encountered:
PHP Warning: file_get_contents(): Unable to find the wrapper "https" - did you forget to enable it when you configured PHP?
Why does such an error occur?
After searching online, I found that many people have run into this error. The cause is straightforward: one setting is not enabled in PHP's configuration file. On my local machine that file is /apache/bin/php.ini, and it contains the line ;extension=php_openssl.dll. The leading semicolon needs to be removed so that the OpenSSL extension is loaded (restart the web server afterwards so the change takes effect).
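If you are not sure which php.ini your PHP build actually reads, a minimal sketch like the following can report it using the built-in php_ini_loaded_file() function. The helper name describeIniLocation() is only an illustrative choice, not something from the original setup.

```php
<?php
// Report which php.ini is loaded, so you know which file to edit.
// describeIniLocation() is a hypothetical helper name used for illustration.
function describeIniLocation()
{
    $ini = php_ini_loaded_file(); // path as a string, or false if no php.ini is loaded
    return $ini === false ? 'no php.ini loaded' : $ini;
}

echo 'php.ini: ', describeIniLocation(), "\n";
echo 'openssl loaded: ', extension_loaded('openssl') ? 'yes' : 'no', "\n";
```

Run it through the same entry point that runs your script (web server or command line), since the two may load different php.ini files.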
You can use the following script to check the configuration of your PHP environment:
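<?php
// List the stream wrappers registered in this PHP build.
$w = stream_get_wrappers();
echo 'openssl: ', extension_loaded('openssl') ? 'yes' : 'no', "\n";
echo 'http wrapper: ', in_array('http', $w) ? 'yes' : 'no', "\n";
echo 'https wrapper: ', in_array('https', $w) ? 'yes' : 'no', "\n";
// var_dump() prints its own output, so call it as a separate statement.
echo 'wrappers: ';
var_dump($w);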
Running the script above on my machine produced:
openssl: no
http wrapper: yes
https wrapper: no
wrappers: array(10) {
  [0]=> string(3) "php"
  [1]=> string(4) "file"
  [2]=> string(4) "glob"
  [3]=> string(4) "data"
  [4]=> string(4) "http"
  [5]=> string(3) "ftp"
  [6]=> string(3) "zip"
  [7]=> string(13) "compress.zlib"
  [8]=> string(14) "compress.bzip2"
  [9]=> string(4) "phar"
}
Alternatives
Finding an error and fixing it is easy. The hard part is when you find the error but cannot fix it. I wanted to deploy this script on a remote host, but I could not modify that host's PHP configuration, so the fix above was not available to me. Still, there is no need to give up after one dead end. Is there another way?
Another tool I often use to fetch content in PHP is cURL, which is more powerful than file_get_contents() and offers many optional parameters. For accessing HTTPS content, the cURL option we need is:
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
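As the option name suggests, this tells cURL to skip verification of the server's SSL certificate. That is not a good idea from a security standpoint, because the connection is no longer protected against man-in-the-middle attacks, but for ordinary scenarios it is enough.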
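Where skipping verification is not acceptable, the opposite configuration can be sketched as follows. It keeps verification on and, optionally, points cURL at a CA certificate bundle. The helper name applyVerifiedSslOptions() and the bundle path in the usage note are assumptions for illustration. Whether a bundle path is needed depends on how cURL was built on your host.

```php
<?php
// Hypothetical helper: keep certificate verification enabled on a cURL handle.
// Returns true only if every option was set successfully.
function applyVerifiedSslOptions($ch, $caBundlePath = null)
{
    $ok = curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true)  // verify the certificate chain
        && curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);   // verify the certificate matches the host name

    if ($ok && $caBundlePath !== null) {
        // Assumed path to a PEM file of trusted CA certificates.
        $ok = curl_setopt($ch, CURLOPT_CAINFO, $caBundlePath);
    }
    return $ok;
}
```

A call such as applyVerifiedSslOptions($ch, '/path/to/cacert.pem') would replace the CURLOPT_SSL_VERIFYPEER line above. The path is a placeholder.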
The following function wraps cURL so that it can fetch HTTPS content:
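function getHTTPS($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);  // skip certificate verification (insecure)
    curl_setopt($ch, CURLOPT_HEADER, false);          // do not include response headers in the result
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow HTTP redirects
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_REFERER, $url);          // send the URL itself as the Referer header
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);   // return the body instead of printing it
    $result = curl_exec($ch);                         // response body, or false on failure
    curl_close($ch);
    return $result;
}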
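Note that curl_exec() returns false on failure, so a caller of getHTTPS() cannot tell why a request failed. A minimal sketch of a variant that reports the failure through curl_errno() and curl_error() might look like this. The function name fetchHttpsOrThrow(), the timeout values, and the example endpoint are assumptions. Unlike getHTTPS(), this sketch leaves certificate verification at cURL's default, so on a host without usable CA certificates it will report a certificate error rather than return content.

```php
<?php
// Hypothetical variant of getHTTPS() that raises an exception instead of returning false.
function fetchHttpsOrThrow($url, $timeoutSeconds = 10)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);             // return the body as a string
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);             // follow HTTP redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);  // limit time spent connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);         // limit total request time

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_errno()/curl_error() describe the most recent failure on this handle.
        throw new RuntimeException(
            sprintf('cURL error %d: %s', curl_errno($ch), curl_error($ch))
        );
    }
    return $body;
}

// Usage: print the response body, or the reason the request failed.
try {
    echo fetchHttpsOrThrow('https://hacker-news.firebaseio.com/v0/topstories.json', 5), "\n";
} catch (RuntimeException $e) {
    echo 'Request failed: ', $e->getMessage(), "\n";
}
```

Throwing an exception keeps the failure reason next to the failure, so the caller does not have to compare the result against false.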