


Comparison of PHP cURL and Rolling cURL concurrency methods
In real projects, or when writing your own small tools (such as news aggregation, commodity price monitoring, or price comparison), you usually need to fetch data from a third-party website or API. When you have a queue of URLs to process, you can use the curl_multi_* family of functions provided by cURL to achieve simple concurrency and improve performance.
This article discusses two specific implementations and gives a simple performance comparison between them.
1. The classic cURL concurrency mechanism and its problems
The classic cURL concurrency implementation is easy to find online. For example, the following implementation is based on the PHP online manual:
function classic_curl($urls, $delay)
{
    $queue = curl_multi_init();
    $map = array();

    foreach ($urls as $url) {
        // create cURL resources
        $ch = curl_init();

        // set URL and other appropriate options
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_TIMEOUT, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_NOSIGNAL, true);

        // add handle
        curl_multi_add_handle($queue, $ch);
        $map[$url] = $ch;
    }

    $active = null;

    // execute the handles
    do {
        $mrc = curl_multi_exec($queue, $active);
    } while ($mrc == CURLM_CALL_MULTI_PERFORM);

    while ($active > 0 && $mrc == CURLM_OK) {
        if (curl_multi_select($queue, 0.5) != -1) {
            do {
                $mrc = curl_multi_exec($queue, $active);
            } while ($mrc == CURLM_CALL_MULTI_PERFORM);
        }
    }

    $responses = array();
    foreach ($map as $url => $ch) {
        $responses[$url] = callback(curl_multi_getcontent($ch), $delay);
        curl_multi_remove_handle($queue, $ch);
        curl_close($ch);
    }

    curl_multi_close($queue);
    return $responses;
}
First, all URLs are pushed into the concurrent queue; then the concurrent transfers are executed; only after all requests have completed does subsequent processing, such as data parsing, begin. In practice, because of network transmission, some URLs return earlier than others, but classic cURL concurrency must wait for the slowest URL to return before it starts processing. Waiting means the CPU sits idle. If the URL queue is short, this idle time is still acceptable, but if the queue is long, the waiting and waste become unacceptable.
2. The improved Rolling cURL concurrency method
On closer analysis, classic cURL concurrency still has room for optimization. The idea is to process each URL's response as soon as that request completes, and to wait for the other URLs to return while processing, instead of starting processing only after the slowest interface has returned. This avoids idle CPU time. The specific implementation is as follows:
function rolling_curl($urls, $delay)
{
    $queue = curl_multi_init();
    $map = array();

    foreach ($urls as $url) {
        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_TIMEOUT, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_NOSIGNAL, true);

        curl_multi_add_handle($queue, $ch);
        // Note: (string) $ch relies on cURL handles being resources (PHP < 8).
        // On PHP 8+, where handles are objects, key the map with spl_object_id($ch).
        $map[(string) $ch] = $url;
    }

    $responses = array();
    do {
        while (($code = curl_multi_exec($queue, $active)) == CURLM_CALL_MULTI_PERFORM) ;

        if ($code != CURLM_OK) { break; }

        // a request was just completed -- find out which one
        while ($done = curl_multi_info_read($queue)) {
            // get the info and content returned on the request
            $info = curl_getinfo($done['handle']);
            $error = curl_error($done['handle']);
            $results = callback(curl_multi_getcontent($done['handle']), $delay);
            $responses[$map[(string) $done['handle']]] = compact('info', 'error', 'results');

            // remove the curl handle that just completed
            curl_multi_remove_handle($queue, $done['handle']);
            curl_close($done['handle']);
        }

        // Block for data in / output; error handling is done by curl_multi_exec
        if ($active > 0) {
            curl_multi_select($queue, 0.5);
        }
    } while ($active);

    curl_multi_close($queue);
    return $responses;
}
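The difference between the two functions can be illustrated with a small timing model. This is only a sketch under simplifying assumptions that are not from the original text: every request starts at time 0, responses are handled one at a time, each handler takes the same time, and network transfer fully overlaps with processing. The function names classic_total and rolling_total and the sample numbers are hypothetical.

```php
<?php
// Simplified timing model (an assumption, not a measurement).

// Classic: nothing is processed until the slowest response has arrived,
// then every response is processed in turn.
function classic_total(array $latencies, float $delay): float
{
    return max($latencies) + count($latencies) * $delay;
}

// Rolling: each response is processed as soon as it has arrived
// and the handler is free.
function rolling_total(array $latencies, float $delay): float
{
    sort($latencies);          // responses in arrival order
    $clock = 0.0;
    foreach ($latencies as $arrival) {
        // wait for the response if the handler is idle, then process it
        $clock = max($clock, $arrival) + $delay;
    }
    return $clock;
}

$latencies = [0.25, 0.5, 0.75, 1.0]; // hypothetical response times (seconds)
$delay = 0.25;                        // hypothetical processing time per response

printf("classic: %.2f s\n", classic_total($latencies, $delay)); // classic: 2.00 s
printf("rolling: %.2f s\n", rolling_total($latencies, $delay)); // rolling: 1.25 s
```

With a delay of 0 both functions return 1.0 in this model, so any gain from the rolling approach comes entirely from overlapping processing time with waiting time.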
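3. Performance comparison of the two concurrency implementations
The before-and-after performance comparison was run on a Linux host. The concurrent queue used in the test was as follows: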
http://a.com/item.htm?id=14392877692
http://a.com/item.htm?id=16231676302
http://a.com/item.htm?id=5522416710
http://a.com/item.htm?id=16551116403
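A brief note on the experimental design and the format of the results: to keep the results reliable, each group of experiments was repeated 20 times. In a single run, given the same set of interface URLs, the elapsed time (in seconds) of the two mechanisms was measured: Classic (the classic concurrency mechanism) and Rolling (the improved concurrency mechanism). The one with the shorter time wins (Winner), and the time saved (Excellence, in seconds) and the percentage improvement (Excel. %) are computed. To stay close to real requests while keeping the experiment simple, the returned results were processed only with a simple regular-expression match, with no other complex operations. In addition, to determine how the result-processing callback affects the comparison, usleep can be used to simulate the more complex data-processing logic found in practice (such as extraction, word segmentation, and writing to files or a database).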
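The measurement procedure described above could be sketched roughly as follows. The helper names time_call and compare_run are hypothetical, and the formula for the improvement percentage (time saved divided by the slower time) is an assumption, since the text does not define it. The two closures are stand-ins that only sleep, so the sketch runs without network access; in a real test they would call classic_curl($urls, $delay) and rolling_curl($urls, $delay).

```php
<?php
// Time a single call, in seconds.
function time_call(callable $fn): float
{
    $start = microtime(true);
    $fn();
    return microtime(true) - $start;
}

// Summarise one run from the two measured durations (seconds).
function compare_run(float $classic, float $rolling): array
{
    if ($rolling < $classic) {
        $winner = 'Rolling';
    } elseif ($classic < $rolling) {
        $winner = 'Classic';
    } else {
        $winner = 'Tie';
    }
    $excellence = abs($classic - $rolling);  // time saved, in seconds
    $slower = max($classic, $rolling);
    // Assumption: the improvement is expressed relative to the slower time.
    $percent = $slower > 0 ? $excellence / $slower * 100 : 0.0;
    return compact('winner', 'excellence', 'percent');
}

// Stand-ins for the two implementations (hypothetical sleep times).
$classic = function () { usleep(3000); };
$rolling = function () { usleep(1000); };

$wins = ['Classic' => 0, 'Rolling' => 0, 'Tie' => 0];
for ($round = 1; $round <= 20; $round++) {
    $run = compare_run(time_call($classic), time_call($rolling));
    $wins[$run['winner']]++;
    printf("%2d  %-7s  %.4f s  %.1f%%\n",
        $round, $run['winner'], $run['excellence'], $run['percent']);
}
print_r($wins);
```

Repeating the run 20 times and tallying the winners, as here, smooths out the noise of any single measurement.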
The callback function used in the performance test is:
function callback($data, $delay)
{
    preg_match_all('/
(.+)
/iU', $data, $matches);
    usleep($delay);
    return compact('data', 'matches');
}
When the data-processing callback has no delay, Rolling cURL is slightly faster, but the performance improvement is not significant.

