PHP cURL multi-threading: principles and examples explained

WBOY
Release: 2016-07-13 10:25:37

Let me introduce the principles of cURL multi-threading with some examples. Please tell me if I have got anything wrong.
I believe many people get headaches from the curl_multi family of functions, which the PHP manual explains poorly. There is little documentation, and the examples given are so simple that there is not much to learn from them. I also searched many web pages and did not find a single complete application example.
curl_multi_add_handle
curl_multi_close
curl_multi_exec
curl_multi_getcontent
curl_multi_info_read
curl_multi_init
curl_multi_remove_handle
curl_multi_select
Generally speaking, the reason to reach for these functions is that you need to request multiple URLs at the same time instead of one by one. Otherwise, it is simpler to call curl_exec in a loop yourself.
The steps are summarized as follows:
Step 1: Call curl_multi_init
Step 2: Call curl_multi_add_handle in a loop
Note that in this step the second parameter of curl_multi_add_handle is a child handle created by curl_init.
Step 3: Call curl_multi_exec repeatedly until all transfers have finished
Step 4: Call curl_multi_getcontent in a loop to obtain the results as needed
Step 5: Call curl_multi_remove_handle, and call curl_close for each child handle
Step 6: Call curl_multi_close
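
The six steps above can be sketched as one function. This is a minimal sketch rather than a definitive implementation: the name fetch_all, its parameters, and the usleep fallback are my own assumptions and do not come from the PHP manual. CURLOPT_RETURNTRANSFER is set because curl_multi_getcontent only returns the body when it is enabled.

```php
<?php
// Hypothetical helper (the name and parameters are assumptions):
// fetch several URLs concurrently and return their bodies, keyed like $urls.
function fetch_all(array $urls, int $timeout = 10): array
{
    // Step 1: create the multi handle.
    $mh = curl_multi_init();

    // Step 2: create one child handle per URL and add it to the multi handle.
    $handles = [];
    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // needed by curl_multi_getcontent
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Step 3: drive the transfers until none is still running.
    $active = 0;
    do {
        do {
            $status = curl_multi_exec($mh, $active);
        } while ($status === CURLM_CALL_MULTI_PERFORM);

        // Wait for socket activity; if select fails, sleep briefly instead of spinning.
        if ($active && $status === CURLM_OK && curl_multi_select($mh, 1.0) === -1) {
            usleep(5000);
        }
    } while ($active && $status === CURLM_OK);

    // Steps 4 and 5: read each body, then detach and close each child handle.
    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        if (PHP_VERSION_ID < 80000) {
            curl_close($ch); // has no effect from PHP 8.0 on, so it is skipped there
        }
    }

    // Step 6: close the multi handle.
    curl_multi_close($mh);

    return $results;
}
```

A call such as fetch_all(['home' => 'http://www.php.net/']) would return an array with the page body under the key 'home'.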
Here is an example from the PHP manual:


<?php
// Create a pair of cURL resources
$ch1 = curl_init();
$ch2 = curl_init();

// Set the URL and corresponding options
curl_setopt($ch1, CURLOPT_URL, "http://www.jb51.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);

// Create the batch (multi) cURL handle
$mh = curl_multi_init();

// Add the two handles
curl_multi_add_handle($mh, $ch1);
curl_multi_add_handle($mh, $ch2);

$active = null;
// Execute the batch handle
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

// Close all handles
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);
?>


The whole usage process is roughly as shown. The part that needs care is the polling loop: if curl_multi_exec is simply called over and over in a do loop until every request has finished, the loop spins for the entire duration of the URL requests and can easily drive CPU usage to 100%.
The fix relies on curl_multi_select, a function that has almost no documentation. Although the C cURL library documents select, the interface and usage in PHP differ from those in C.
The do section should be written as follows:



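The code is as follows:

do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}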

Because $active does not drop to zero (false) until the data for every URL has been received, the return value of curl_multi_exec is used to decide whether there is more data to process right away. While there is, curl_multi_exec is called repeatedly. When there is none for the moment, the loop enters the select stage, where it is woken up to continue as soon as new data arrives. The advantage is that no CPU is consumed unnecessarily.
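
One caveat with a select-based loop: curl_multi_select returns -1 when the underlying select call fails, and a loop that only calls curl_multi_exec when the result is not -1 will then retry immediately and spin again. A common workaround is a short sleep in that case. The sketch below assumes this workaround; the function name run_multi and the 5 ms pause are my own choices, not part of the PHP API.

```php
<?php
// Hypothetical helper (name and sleep length are assumptions):
// drive every transfer on $mh to completion without busy-waiting.
// Returns the last CURLM_* status code from curl_multi_exec.
function run_multi($mh, float $selectTimeout = 1.0): int
{
    $active = 0;
    while (true) {
        // Let cURL do whatever work is ready right now.
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc === CURLM_CALL_MULTI_PERFORM);

        // Stop on an error or when no transfer is still running.
        if ($mrc !== CURLM_OK || !$active) {
            break;
        }

        // Block until there is socket activity or the timeout expires.
        if (curl_multi_select($mh, $selectTimeout) === -1) {
            usleep(5000); // select failed: pause briefly instead of spinning
        }
    }

    return $mrc;
}
```

The caller still adds the child handles before calling run_multi($mh) and reads the results afterwards.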
In addition, here are some details you may run into:
To control the timeout of each request, set it with curl_setopt before calling curl_multi_add_handle:
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
To determine whether a request timed out or hit another error, call curl_error($conn[$i]) before curl_multi_getcontent, where $conn[$i] is the child handle of that request.
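
The two details above can be combined into one sketch. As far as I can tell, in recent PHP versions curl_error on a child handle is only populated once curl_multi_info_read has reported that handle, so this sketch takes the result code from curl_multi_info_read and turns it into a message with curl_strerror. The function name fetch_with_errors and the shape of the returned array are assumptions of mine.

```php
<?php
// Hypothetical helper (name and return shape are assumptions):
// fetch URLs with a per-request timeout and report an error message per URL.
function fetch_with_errors(array $urls, int $timeout = 10): array
{
    $mh = curl_multi_init();
    $conn = [];
    foreach ($urls as $i => $url) {
        $conn[$i] = curl_init($url);
        curl_setopt($conn[$i], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($conn[$i], CURLOPT_TIMEOUT, $timeout); // set before adding the handle
        curl_multi_add_handle($mh, $conn[$i]);
    }

    // Run all transfers, waiting in select instead of spinning.
    $active = 0;
    do {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc === CURLM_CALL_MULTI_PERFORM);
        if ($active && $mrc === CURLM_OK && curl_multi_select($mh, 1.0) === -1) {
            usleep(5000);
        }
    } while ($active && $mrc === CURLM_OK);

    // Collect the CURLE_* result code of every finished transfer.
    $codes = [];
    while ($info = curl_multi_info_read($mh)) {
        $i = array_search($info['handle'], $conn, true);
        $codes[$i] = $info['result'];
    }

    $out = [];
    foreach ($conn as $i => $ch) {
        if (!array_key_exists($i, $codes)) {
            $error = 'transfer did not complete';
        } elseif ($codes[$i] !== CURLE_OK) {
            $error = curl_strerror($codes[$i]); // e.g. a timeout message
        } else {
            $error = '';
        }
        // Only read the body when the transfer succeeded.
        $out[$i] = [
            'error' => $error,
            'body'  => $error === '' ? curl_multi_getcontent($ch) : null,
        ];
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $out;
}
```

An empty 'error' string means the request finished normally; anything else is the cURL error text for that URL.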

Features of this class:
It runs very stably.
Once a concurrency level is set, it always works at that level; adding tasks through the callback function does not affect it.
CPU usage is extremely low, and most of the CPU time is spent in the user's callback function.
Memory usage grows with the number of queued tasks (150,000 tasks occupy more than 256 MB), so tasks can instead be added through the callback function, in whatever quantity you choose.
It can use the available bandwidth to the fullest.
Chained tasks, such as a task that has to collect data from several different addresses, can be completed in one go through callbacks.
It can retry on cURL errors, with a configurable number of attempts (cURL errors are likely at the start because of the high concurrency, and can also be caused by network conditions or an unstable remote server).
The callback function is quite flexible, and several types of task can run at the same time (for example, downloading files, crawling web pages, and checking for 404s can all run simultaneously in one PHP process).
Custom task types are easy to define, such as checking for 404s or getting the final URL of a redirect.
A cache can be configured.
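
The class itself is not reproduced here, so the following is only a sketch of the general idea behind a fixed concurrency level with a completion callback, not that class's code. The function name rolling_fetch, its parameters, and the callback signature are all assumptions of mine. It keeps at most $concurrency transfers in flight and starts the next queued URL whenever one finishes.

```php
<?php
// Hypothetical sketch, not the class described above.
// Runs $urls with at most $concurrency transfers in flight and calls
// $onDone(string $url, int $result, ?string $body) as each one finishes.
function rolling_fetch(array $urls, int $concurrency, callable $onDone): void
{
    $mh = curl_multi_init();
    $queue = array_values($urls);
    $inFlight = 0;

    // Start one transfer from the queue, if any are left.
    $startNext = function () use ($mh, &$queue, &$inFlight): void {
        if ($queue === []) {
            return;
        }
        $url = array_shift($queue);
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_PRIVATE, $url); // remember which URL this handle is for
        curl_multi_add_handle($mh, $ch);
        $inFlight++;
    };

    // Fill the initial window.
    for ($i = 0; $i < $concurrency; $i++) {
        $startNext();
    }

    while ($inFlight > 0) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc === CURLM_CALL_MULTI_PERFORM);
        if ($mrc !== CURLM_OK) {
            break;
        }

        // Hand every finished transfer to the callback and refill its slot.
        $finished = false;
        while ($info = curl_multi_info_read($mh)) {
            $ch = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_PRIVATE);
            $body = $info['result'] === CURLE_OK ? curl_multi_getcontent($ch) : null;
            curl_multi_remove_handle($mh, $ch);
            $inFlight--;
            $onDone($url, $info['result'], $body);
            $startNext();
            $finished = true;
        }

        // Only wait when nothing finished; new handles need an exec call first.
        if (!$finished && $inFlight > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(5000);
        }
    }

    curl_multi_close($mh);
}
```

Retries, caching, and adding tasks from inside the callback are left out of this sketch to keep it short.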
Disadvantages:
It cannot make full use of a multi-core CPU (this can be worked around by running multiple processes, but you have to handle logic such as task division yourself).
The maximum concurrency is 500 (or 512?). In testing this appeared to be an internal limit of cURL; exceeding it always results in failures.
There is currently no resume function for interrupted downloads.
Tasks are currently atomic: a large file cannot be split into several parts that are downloaded by separate threads.
