
Fetching HTTPS content in PHP

WBOY
Release: 2016-07-13 10:12:29


I recently ran into an HTTPS problem while exploring the Hacker News API. All Hacker News API endpoints are served over the encrypted HTTPS protocol rather than plain HTTP, and calling file_get_contents() in PHP to fetch the API data raised an error. The code was:

<?php
$data = file_get_contents("https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty");
......

When running the above code, the following error message is encountered:

PHP Warning:  file_get_contents(): Unable to find the wrapper "https" - did you forget to enable it when you configured PHP?


Why does this error occur?

After some searching, I found that many people have hit this error. The cause is straightforward: the OpenSSL extension is not enabled in the PHP configuration file. On my local machine, the relevant line in /apache/bin/php.ini is ;extension=php_openssl.dll, and the leading semicolon needs to be removed. You can use the following script to check the configuration of your PHP environment:

$w = stream_get_wrappers();
echo 'openssl: ', extension_loaded('openssl') ? 'yes' : 'no', "\n";
echo 'http wrapper: ', in_array('http', $w) ? 'yes' : 'no', "\n";
echo 'https wrapper: ', in_array('https', $w) ? 'yes' : 'no', "\n";
echo 'wrappers: ';
var_dump($w);

Running the above script on my machine gives:

openssl: no
http wrapper: yes
https wrapper: no
wrappers: array(10) {
  [0]=>
  string(3) "php"
  [1]=>
  string(4) "file"
  [2]=>
  string(4) "glob"
  [3]=>
  string(4) "data"
  [4]=>
  string(4) "http"
  [5]=>
  string(3) "ftp"
  [6]=>
  string(3) "zip"
  [7]=>
  string(13) "compress.zlib"
  [8]=>
  string(14) "compress.bzip2"
  [9]=>
  string(4) "phar"
}
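The same check can also run as a guard before fetching, so that a missing wrapper is handled explicitly instead of surfacing as a warning. This is only a minimal sketch; the helper names canFetchHttps() and fetchOrNull() are hypothetical and not part of the original script:

```php
<?php
// Hypothetical helper: reports whether file_get_contents() can open https:// URLs.
function canFetchHttps(): bool
{
    // Both conditions are checked, mirroring the diagnostic script above.
    return extension_loaded('openssl')
        && in_array('https', stream_get_wrappers(), true);
}

// Hypothetical helper: returns the body, or null when HTTPS is
// unavailable or the read fails.
function fetchOrNull(string $url): ?string
{
    $isHttps = stripos($url, 'https://') === 0;
    if ($isHttps && !canFetchHttps()) {
        return null; // no https wrapper registered on this host
    }
    $data = @file_get_contents($url); // @ suppresses the warning on failure
    return $data === false ? null : $data;
}
```

A caller can then test for null and fall back to another transport when the wrapper is missing.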

Alternatives

Finding an error and fixing it is easy. The hard case is when you find the error but cannot fix it. I wanted to deploy this script on a remote host, but I could not modify that host's PHP configuration, so enabling the extension was not an option. Rather than give up because one route is blocked, let's see whether there is another way.

Another tool I often use to fetch content in PHP is cURL. It is more powerful than file_get_contents() and provides many optional parameters. For accessing HTTPS content, the configuration option we need is:

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);

As the name suggests, this option skips verification of the server's SSL certificate. That removes the protection against man-in-the-middle attacks, so it is not a good idea in general, but for ordinary low-risk scenarios such as reading public data it is enough.
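Where a CA certificate bundle is available on the host, verification can stay enabled instead, since the bundle path can be set at run time without touching php.ini. This is a minimal sketch under that assumption; the function name verifiedCurlOptions() and the bundle path in the usage comment are hypothetical:

```php
<?php
// Hypothetical helper: builds cURL options that keep certificate verification on.
function verifiedCurlOptions(string $url, string $caBundlePath): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => true,          // validate the server certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,             // require the certificate to match the host name
        CURLOPT_CAINFO         => $caBundlePath, // CA bundle file (its location is an assumption)
    ];
}

// Usage sketch:
// $ch = curl_init();
// curl_setopt_array($ch, verifiedCurlOptions($url, '/path/to/cacert.pem'));
// $result = curl_exec($ch);
// curl_close($ch);
```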

The following function wraps cURL so that it can access HTTPS content:

function getHTTPS($url) {
  $ch = curl_init();
  // Skip verification of the server certificate (see the caveat above)
  curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
  // Do not include response headers in the output
  curl_setopt($ch, CURLOPT_HEADER, false);
  // Follow HTTP redirects
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_REFERER, $url);
  // Return the body as a string instead of printing it
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
  $result = curl_exec($ch);
  curl_close($ch);
  return $result;
}
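Since curl_exec() returns false when the request fails, the result is worth checking before it is decoded. This is a minimal sketch of decoding the top-stories response, assuming the body is a JSON array of item IDs; the helper name decodeStoryIds() is hypothetical:

```php
<?php
// Hypothetical helper: turns the raw response body into a list of integer IDs.
// Returns an empty array when the request failed or the body is not a JSON array.
function decodeStoryIds($body): array
{
    if (!is_string($body)) {
        return []; // curl_exec() returned false
    }
    $decoded = json_decode($body, true);
    if (!is_array($decoded)) {
        return []; // invalid JSON or not an array
    }
    return array_values(array_map('intval', $decoded));
}

// Usage sketch, with getHTTPS() as defined above:
// $ids = decodeStoryIds(getHTTPS('https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty'));
```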
That is the whole process of fetching HTTPS content in PHP. It is simple and practical, and I recommend it to anyone with the same project needs.
