cURL is a powerful library that PHP exposes through its cURL extension. With it you can crawl web pages and collect their content simply and effectively, and by setting cookies you can simulate logging in to a site. The extension provides a wealth of functions, all of which are documented in the PHP manual. This article uses a simulated login to Open Source China (oschina) as its example, for readers who need a reference.
PHP's cURL functions are relatively efficient at fetching web pages and can run several transfers concurrently through the curl_multi_* functions, while file_get_contents() is slightly less efficient. Of course, the curl extension must be enabled before you can use them.
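Before running any of the code in this article, it can help to confirm that the extension is actually loaded. The snippet below is a minimal sketch; the helper name curl_is_available() is made up for illustration. If the check fails, the extension usually has to be enabled in php.ini (for example the extension=curl line) or installed as a separate package, depending on the platform.

```php
<?php
// Hypothetical helper: reports whether the cURL extension is usable
// in the PHP build that is currently running.
function curl_is_available(): bool
{
    return extension_loaded('curl') && function_exists('curl_init');
}

echo curl_is_available()
    ? "cURL extension is loaded\n"
    : "cURL extension is missing; enable it before continuing\n";
```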
Code in practice
Let’s first look at the login part of the code:
The function login_post() first initializes a handle with curl_init(), then uses curl_setopt() to set the relevant options: the URL to submit to, the file in which cookies are saved, the POST data (user name, password, and so on), whether the response is returned, and so forth. It then runs the request with curl_exec() and finally releases the resources with curl_close(). Note that PHP's built-in http_build_query() converts an array into a URL-encoded query string.
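A minimal sketch of what login_post() might look like, following the description above. The parameter names and the boolean return value are assumptions for illustration, and CURLOPT_COOKIEJAR is used here as the option that writes received cookies to a file. The return value only says whether the HTTP request itself went through, not whether the site accepted the credentials.

```php
<?php
// Sketch of login_post(); signature and return value are assumptions.
function login_post(string $url, string $cookieFile, array $post): bool
{
    $curl = curl_init();                                 // initialize a cURL handle
    curl_setopt($curl, CURLOPT_URL, $url);               // URL the login form submits to
    curl_setopt($curl, CURLOPT_HEADER, false);           // keep response headers out of the body
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);    // return the response instead of printing it
    curl_setopt($curl, CURLOPT_COOKIEJAR, $cookieFile);  // save received cookies to this file
    curl_setopt($curl, CURLOPT_POST, true);              // send the request as POST
    // http_build_query() turns ['user' => 'a', 'pwd' => 'b'] into "user=a&pwd=b"
    curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($post));
    $result = curl_exec($curl);                          // run the request
    // Releases the handle on PHP < 8.0; a no-op on 8.0+, where the handle
    // is freed (and the cookie jar written) when $curl goes out of scope.
    curl_close($curl);
    return $result !== false;
}
```

Writing the cookies to a file is what lets a later, separate request reuse the logged-in session.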
Next, once the login has succeeded, we need to fetch a page that is only available to logged-in users.
The function get_content() likewise initializes curl, sets the relevant options, executes the request, and releases the resources. Setting CURLOPT_RETURNTRANSFER to 1 makes curl_exec() return the response as a string instead of printing it directly, and CURLOPT_COOKIEFILE reads the cookie information saved during login, so the request is sent with the logged-in session. The function finally returns the page content.
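As a rough sketch under the same assumptions as before (the parameter names are illustrative), get_content() could be written like this:

```php
<?php
// Sketch of get_content(); returns the page body as a string,
// or false if the request failed.
function get_content(string $url, string $cookieFile)
{
    $curl = curl_init();                                  // initialize a cURL handle
    curl_setopt($curl, CURLOPT_URL, $url);                // page to fetch after login
    curl_setopt($curl, CURLOPT_HEADER, false);            // body only, no response headers
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);     // return the body as a string
    curl_setopt($curl, CURLOPT_COOKIEFILE, $cookieFile);  // send the cookies saved at login
    $content = curl_exec($curl);                          // run the request
    curl_close($curl);                                    // no-op on PHP 8.0+, frees the handle on older versions
    return $content;
}
```

Because curl_exec() returns false on failure, callers should compare the result with === false before treating it as page content.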
Our ultimate goal is to obtain information that is only available after a successful login. Next, we take logging in to the mobile version of Open Source China as an example to see how to capture that information.
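Once the page has been fetched, the useful part still has to be pulled out of the returned HTML. The sketch below assumes, purely for illustration, that the wanted information is the page title; extract_page_title() is a hypothetical helper, and the URLs and form field names in the commented wiring are placeholders rather than the real oschina endpoints.

```php
<?php
// Hypothetical helper: returns the text of the first <title> element,
// or null when the HTML has none.
function extract_page_title(string $html): ?string
{
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m) !== 1) {
        return null;
    }
    // Decode entities such as &amp; and strip surrounding whitespace.
    return trim(html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
}

// Possible wiring, assuming login_post($url, $cookieFile, $post) and
// get_content($url, $cookieFile) behave as described in this article:
//
// $cookie = tempnam(sys_get_temp_dir(), 'cookie');      // temporary cookie file
// login_post('https://example.com/login', $cookie, [    // placeholder URL and fields
//     'email' => 'your-account',
//     'pwd'   => 'your-password',
// ]);
// $content = get_content('https://example.com/private-page', $cookie);
// unlink($cookie);                                      // remove the cookie file afterwards
// if ($content !== false) {
//     echo extract_page_title($content);
// }
```

A regular expression is enough for a single, well-known element; for anything more structured, an HTML parser such as DOMDocument is usually the safer choice.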
Usage summary
1. Initialize curl;
2. Use curl_setopt to set the target url, and other options;
3. curl_exec, execute curl;
4. After execution, close curl;
5. Output data.
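The five steps above can be sketched as one small function with basic error handling. The name fetch_page() and the 10-second connect timeout are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper that walks through the five steps and
// throws if the transfer fails.
function fetch_page(string $url): string
{
    $curl = curl_init();                                // 1. initialize curl
    curl_setopt($curl, CURLOPT_URL, $url);              // 2. set the target URL...
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);   //    ...and other options
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);     //    give up connecting after 10 seconds
    $body = curl_exec($curl);                           // 3. execute the request
    $error = curl_error($curl);                         //    read the error text before closing
    curl_close($curl);                                  // 4. close the handle
    if ($body === false) {
        throw new RuntimeException("cURL request failed: $error");
    }
    return $body;                                       // 5. hand the data back for output
}

// Example call (placeholder URL):
// echo fetch_page('https://example.com/');
```

Reading curl_error() before curl_close() matters on older PHP versions, where the handle is no longer usable once it has been closed.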
That is all for this article. I hope it is helpful for your study.