Using CURL in PHP to implement GET and POST requests

高洛峰
Release: 2016-10-20 14:48:08

1. What is CURL?

cURL is a tool that uses URL syntax to transfer files and data. It supports many protocols, including HTTP, FTP, and TELNET. PHP supports the cURL library as well, which makes it easy to fetch web pages from a script: you run the script, parse the pages it retrieved, and extract the data you want programmatically. Whether you need to pull part of the data behind a link, import an XML file into a database, or simply retrieve the content of a web page, PHP's cURL library can do it.


2. The cURL function library

curl_close - Close a cURL session

curl_copy_handle - Copy a cURL handle along with all of its options

curl_errno - Return the last error number of the current session

curl_error - Return a string containing the last error of the current session

curl_exec - Execute a cURL session

curl_getinfo - Get information about a specific transfer

curl_init - Initialize a cURL session

curl_multi_add_handle - Add a normal cURL handle to a cURL multi (batch) handle

curl_multi_close - Close a cURL multi handle

curl_multi_exec - Run the transfers of the current cURL multi handle

curl_multi_getcontent - Return the content of a cURL handle if CURLOPT_RETURNTRANSFER is set

curl_multi_info_read - Get information about the current transfers

curl_multi_init - Initialize a cURL multi handle

curl_multi_remove_handle - Remove a handle from a cURL multi handle

curl_multi_select - Wait for activity on any of the connections of a cURL multi handle

curl_setopt_array - Set multiple options for a cURL session from an array

curl_setopt - Set an option for a cURL session

curl_version - Get cURL version information

curl_init() initializes a cURL session. Its only parameter is optional and specifies a URL.

curl_exec() executes a cURL session. Its only parameter is the handle returned by curl_init(). It returns false on failure; on success it returns true, or the response as a string when CURLOPT_RETURNTRANSFER is set.

curl_close() closes a cURL session. Its only parameter is the handle returned by curl_init().
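The batch (curl_multi_*) functions listed above are easier to follow in context. The following is a minimal sketch, not a definitive implementation: the helper name fetch_all() and its return shape are assumptions made for illustration, and the loop follows the usual curl_multi_exec()/curl_multi_select() pattern.

```php
<?php
// Hypothetical helper: fetch several URLs concurrently with the curl_multi_* functions.
// Returns, per input key, the HTTP status code (0 if no response arrived) and the body.
function fetch_all(array $urls, int $timeout = 10): array
{
    $multi   = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // needed for curl_multi_getcontent()
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none of them is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running) {
            // Wait for activity on any connection, for at most one second.
            curl_multi_select($multi, 1.0);
        }
    } while ($running && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = [
            'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
            'body'   => curl_multi_getcontent($ch),
        ];
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);

    return $results;
}
```

A call such as fetch_all(array('a' => 'http://www.baidu.com')) would return one entry under the key 'a'. A status of 0 is treated here as "no HTTP response"; finer-grained error reporting would use curl_multi_info_read().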

3. Basic steps for setting up a CURL request in PHP

 ①: Initialization

  curl_init()

 ②: Setting attributes

  curl_setopt(). There is a long list of cURL options that can be set; they control the details of the request.

 ③: Execute and get the results

  curl_exec()

 ④: Release the handle

  curl_close()
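The four steps above can be sketched as a single function. This is an illustrative sketch only: the name fetch_page(), the timeout values, and the choice to throw an exception on failure are assumptions, not part of the original article.

```php
<?php
// Hypothetical helper that walks through the four steps with basic error handling.
function fetch_page(string $url): string
{
    // Step 1: initialization
    $curl = curl_init();

    // Step 2: set options
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5);    // assumed value, in seconds
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);          // assumed value, in seconds

    // Step 3: execute and collect the result (false signals a failure)
    $body  = curl_exec($curl);
    $error = $body === false ? curl_error($curl) : null;

    // Step 4: release the handle
    curl_close($curl);

    if ($error !== null) {
        throw new RuntimeException("cURL request failed: $error");
    }
    return $body;
}
```

Reading the error message before closing the handle matters, because curl_error() needs the handle that produced the error.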

4. Implementing GET and POST with cURL

①: Implementing a GET request

<?php
    // Initialize
    $curl = curl_init();
    // Set the URL to fetch
    curl_setopt($curl, CURLOPT_URL, 'http://www.baidu.com');
    // Include the response headers in the output
    curl_setopt($curl, CURLOPT_HEADER, 1);
    // Return the response as a string instead of printing it directly
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    // Execute the request
    $data = curl_exec($curl);
    // Close the cURL session
    curl_close($curl);
    // Display the retrieved data
    print_r($data);
?>

②: Implementing a POST request

 
<?php
    // Initialize
    $curl = curl_init();
    // Set the URL to fetch
    curl_setopt($curl, CURLOPT_URL, 'http://www.baidu.com');
    // Include the response headers in the output
    curl_setopt($curl, CURLOPT_HEADER, 1);
    // Return the response as a string instead of printing it directly
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    // Submit with the POST method
    curl_setopt($curl, CURLOPT_POST, 1);
    // Set the POST data (an array is sent as multipart/form-data)
    $post_data = array(
        "username" => "coder",
        "password" => "12345"
        );
    curl_setopt($curl, CURLOPT_POSTFIELDS, $post_data);
    // Execute the request
    $data = curl_exec($curl);
    // Close the cURL session
    curl_close($curl);
    // Display the retrieved data
    print_r($data);
?>

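③: If the response is in JSON format, use the json_decode() function to convert it into an array.

 $output_array = json_decode($output,true);

 Here $output is the response body returned by curl_exec(), with CURLOPT_HEADER disabled so that no headers precede the JSON. If you call json_decode($output) without the second argument, you get an object instead of an array.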
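The difference between the two calls can be sketched as follows. The JSON string is made-up sample data standing in for a response body; the error check is an addition for illustration.

```php
<?php
// Sample JSON body (an assumption); in practice this is the string returned by curl_exec().
$output = '{"username": "coder", "roles": ["admin", "editor"]}';

$asArray  = json_decode($output, true); // associative array
$asObject = json_decode($output);       // stdClass object

// json_decode() returns null on invalid input, so check for errors explicitly.
if ($asArray === null && json_last_error() !== JSON_ERROR_NONE) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

echo $asArray['username'], "\n"; // array access
echo $asObject->roles[1], "\n";  // property access
```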

5. A wrapper function I wrote

// Parameter 1: the URL to request; parameter 2: POST data (omit for a GET request);
// parameter 3: cookies to send; parameter 4: whether to return the cookies
function curl_request($url, $post = '', $cookie = '', $returnCookie = 0) {
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)');
        curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
        curl_setopt($curl, CURLOPT_REFERER, "http://XXX");
        if ($post) {
            curl_setopt($curl, CURLOPT_POST, 1);
            // Accept either an array of fields or an already encoded string
            curl_setopt($curl, CURLOPT_POSTFIELDS, is_array($post) ? http_build_query($post) : $post);
        }
        if ($cookie) {
            curl_setopt($curl, CURLOPT_COOKIE, $cookie);
        }
        curl_setopt($curl, CURLOPT_HEADER, $returnCookie);
        curl_setopt($curl, CURLOPT_TIMEOUT, 10);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        $data = curl_exec($curl);
        if (curl_errno($curl)) {
            // Read the error message before closing the handle
            $error = curl_error($curl);
            curl_close($curl);
            return $error;
        }
        // Total size of all header blocks, including those of followed redirects
        $headerSize = curl_getinfo($curl, CURLINFO_HEADER_SIZE);
        curl_close($curl);
        if ($returnCookie) {
            $header = substr($data, 0, $headerSize);
            $body   = substr($data, $headerSize);
            // Match Set-Cookie values, with or without trailing attributes
            preg_match_all('/^Set-Cookie:\s*([^;\r\n]*)/mi', $header, $matches);
            $info = array();
            // Return the first cookie, or an empty string if none was set
            $info['cookie']  = isset($matches[1][0]) ? $matches[1][0] : '';
            $info['content'] = $body;
            return $info;
        } else {
            return $data;
        }
}
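The wrapper above returns only the first cookie. If every cookie in the response is needed, the header block can be parsed separately. The following is a sketch under stated assumptions: the helper name extract_cookies() and the sample header block are hypothetical, and cookie attributes such as Path or HttpOnly are deliberately ignored.

```php
<?php
// Hypothetical helper: collect every cookie from a raw header block as name => value.
function extract_cookies(string $header): array
{
    $cookies = [];
    // One match per Set-Cookie line; everything after the first ';' is ignored.
    preg_match_all(
        '/^Set-Cookie:\s*([^=;\r\n]+)=([^;\r\n]*)/mi',
        $header,
        $matches,
        PREG_SET_ORDER
    );
    foreach ($matches as $match) {
        $cookies[trim($match[1])] = $match[2];
    }
    return $cookies;
}

// Made-up header block of the kind returned when CURLOPT_HEADER is enabled.
$header = "HTTP/1.1 200 OK\r\n"
        . "Set-Cookie: session=abc123; Path=/; HttpOnly\r\n"
        . "Set-Cookie: lang=en\r\n"
        . "Content-Type: text/html\r\n";

print_r(extract_cookies($header));
```

With the sample header this yields the two entries session => abc123 and lang => en.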


Appendix: description of the optional parameters

First category:

For the following options, value should be set to a bool:

Option

Optional value

Remarks

CURLOPT_AUTOREFERER

When redirecting based on Location:, the Referer: information in the header is automatically set.

  CURLOPT_BINARYTRANSFER

  When CURLOPT_RETURNTRANSFER is enabled, return raw (Raw) output.

  CURLOPT_COOKIESESSION

When enabled, this request is marked as a new cookie session: libcurl ignores any session cookies it would otherwise load from a previous session. By default, libcurl stores and loads all cookies, whether or not they are session cookies. Session cookies are cookies without an expiry date, meant to live only for the current session.

  CURLOPT_CRLF

  When enabled, convert Unix line feed characters into carriage return and line feed characters.

  CURLOPT_DNS_USE_GLOBAL_CACHE

When enabled, a global DNS cache is used. This option is not thread-safe and is enabled by default.

  CURLOPT_FAILONERROR

When enabled, the request fails if the HTTP status code returned is greater than or equal to 400. The default behavior is to return the page normally and ignore the status code.

CURLOPT_FILETIME

When enabled, cURL tries to retrieve the modification time of the remote document. The result is returned through the CURLINFO_FILETIME option of curl_getinfo().

 CURLOPT_FOLLOWLOCATION

When enabled, cURL follows any "Location:" header that the server sends in its response. This is recursive: every "Location:" header is followed unless CURLOPT_MAXREDIRS limits the number of redirects.

 CURLOPT_FORBID_REUSE

Force the connection to close once the transfer has finished, so that it cannot be reused.

 CURLOPT_FRESH_CONNECT

 Force to obtain a new connection to replace the connection in the cache.

 CURLOPT_FTP_USE_EPRT

 When enabled, use the EPRT (or LPRT) command when FTP downloads. When set to FALSE disables EPRT and LPRT, using the PORT command only.

  CURLOPT_FTP_USE_EPSV

 When enabled, the EPSV command is first tried before reverting to PASV mode during FTP transfers. Disables EPSV commands when set to FALSE.

  CURLOPT_FTPAPPEND

  When enabled append writes to the file instead of overwriting it.

  CURLOPT_FTPASCII

 An alias for CURLOPT_TRANSFERTEXT.

  CURLOPT_FTPLISTONLY

  When enabled, only the name of the FTP directory will be listed.

  CURLOPT_HEADER

When enabled, the response headers are included in the output.

CURLINFO_HEADER_OUT

When enabled, the request string sent by the handle is tracked.

Available starting from PHP 5.1.3. The CURLINFO_ prefix is intentional.

  CURLOPT_HTTPGET

When enabled, the HTTP request method is reset to GET. Because GET is the default, this is only needed after the method has been changed.

  CURLOPT_HTTPPROXYTUNNEL

  When enabled, it will be transmitted through HTTP proxy.

  CURLOPT_MUTE

When enabled, the cURL functions are completely silent.

  CURLOPT_NETRC

 After the connection is established, access the ~/.netrc file to obtain the username and password information to connect to the remote site.

  CURLOPT_NOBODY

When enabled, the response body is excluded from the output and the request method is set to HEAD.

  CURLOPT_NOPROGRESS

  Turn off the progress bar of curl transmission when enabled. The default setting of this item is enabled.

 Note:

  PHP automatically sets this option to TRUE, this option should only be changed for debugging purposes.

  CURLOPT_NOSIGNAL

When enabled, cURL functions that would cause a signal to be sent to the PHP process are ignored. This is enabled by default in multi-threaded SAPIs so that timeout options can still be used.

 Added in cURL 7.10.

  CURLOPT_POST

When enabled, a regular POST request will be sent, type: application/x-www-form-urlencoded, just like the form submission.

  CURLOPT_PUT

When enabled, a file is uploaded with HTTP PUT. The file must be specified with CURLOPT_INFILE and CURLOPT_INFILESIZE.

CURLOPT_RETURNTRANSFER

When enabled, curl_exec() returns the transfer as a string instead of outputting it directly.

CURLOPT_SSL_VERIFYPEER

When disabled, cURL stops verifying the server's certificate. Use the CURLOPT_CAINFO option to specify certificates to verify against, or the CURLOPT_CAPATH option to specify a certificate directory. If CURLOPT_SSL_VERIFYPEER is disabled, CURLOPT_SSL_VERIFYHOST (default 2) may also need to be changed.

The default is TRUE since cURL 7.10. A default certificate bundle is installed as of cURL 7.10.

  CURLOPT_TRANSFERTEXT

  When enabled, use ASCII mode for FTP transfers. For LDAP, it retrieves plain text information rather than HTML. On Windows systems, the system does not set STDOUT to binary mode.

  CURLOPT_UNRESTRICTED_AUTH

When enabled, the username and password keep being sent when following redirects (with CURLOPT_FOLLOWLOCATION), even if the host name has changed.

  CURLOPT_UPLOAD

  Allow file uploads when enabled.

  CURLOPT_VERBOSE

When enabled, verbose information is written to STDERR or to the file specified with CURLOPT_STDERR.


Second category:

For the optional parameters of the following options, value should be set to an integer type value:

Option

Optional value

Remarks

CURLOPT_BUFFERSIZE

The size of the buffer to use for each read. There is no guarantee that this request will be honored.

 Added in cURL 7.10.

  CURLOPT_CLOSEPOLICY

Either CURLCLOSEPOLICY_LEAST_RECENTLY_USED or CURLCLOSEPOLICY_OLDEST. There are three other CURLCLOSEPOLICY_ constants, but cURL does not support them yet.

CURLOPT_CONNECTTIMEOUT

The number of seconds to wait while trying to connect. If set to 0, it waits indefinitely.

  CURLOPT_CONNECTTIMEOUT_MS

  The time to wait for a connection attempt, in milliseconds. If set to 0, wait infinitely.

  Added in cURL 7.16.2. Available starting with PHP 5.2.3.

  CURLOPT_DNS_CACHE_TIMEOUT

  Set the time to save DNS information in memory, the default is 120 seconds.

  CURLOPT_FTPSSLAUTH

  FTP authentication method: CURLFTPAUTH_SSL (try SSL first), CURLFTPAUTH_TLS (try TLS first) or CURLFTPAUTH_DEFAULT (let cURL decide automatically).

  Added in cURL 7.12.2.

  CURLOPT_HTTP_VERSION

  CURL_HTTP_VERSION_NONE (default value, let cURL decide which version to use), CURL_HTTP_VERSION_1_0 (force to use HTTP/1.0) or CURL_HTTP_VERSION_1_1 (force to use HTTP/1.1).

  CURLOPT_HTTPAUTH

The HTTP authentication method(s) to use. The options are: CURLAUTH_BASIC, CURLAUTH_DIGEST, CURLAUTH_GSSNEGOTIATE, CURLAUTH_NTLM, CURLAUTH_ANY and CURLAUTH_ANYSAFE.

Multiple methods can be combined with the bitwise | (or) operator; cURL then asks the server which methods it supports and picks the best one.

CURLAUTH_ANY is equivalent to CURLAUTH_BASIC | CURLAUTH_DIGEST | CURLAUTH_GSSNEGOTIATE | CURLAUTH_NTLM.

CURLAUTH_ANYSAFE is equivalent to CURLAUTH_DIGEST | CURLAUTH_GSSNEGOTIATE | CURLAUTH_NTLM.

CURLOPT_INFILESIZE

The expected size, in bytes, of the file being uploaded.

CURLOPT_LOW_SPEED_LIMIT

The transfer speed, in bytes per second, below which the transfer must stay for CURLOPT_LOW_SPEED_TIME seconds before PHP considers it too slow and aborts it.

CURLOPT_LOW_SPEED_TIME

The number of seconds the transfer speed must stay below CURLOPT_LOW_SPEED_LIMIT before PHP considers the transfer too slow and aborts it.

CURLOPT_MAXCONNECTS

The maximum number of persistent connections allowed. When the limit is reached, CURLOPT_CLOSEPOLICY determines which connection to close.

 CURLOPT_MAXREDIRS

Specify the maximum number of HTTP redirects. This option is used together with CURLOPT_FOLLOWLOCATION.

  CURLOPT_PORT

  Used to specify the connection port. (Optional)

 CURLOPT_PROTOCOLS

A bitmask of CURLPROTO_* values. If used, this bitmask limits which protocols libcurl may use in the transfer. This allows libcurl to be built with support for many protocols while restricting specific transfers to a subset of them. By default libcurl accepts all protocols it supports. See also CURLOPT_REDIR_PROTOCOLS.

The available protocol options are: CURLPROTO_HTTP, CURLPROTO_HTTPS, CURLPROTO_FTP, CURLPROTO_FTPS, CURLPROTO_SCP, CURLPROTO_SFTP, CURLPROTO_TELNET, CURLPROTO_LDAP, CURLPROTO_LDAPS, CURLPROTO_DICT, CURLPROTO_FILE, CURLPROTO_TFTP, CURLPROTO_ALL

Added in cURL 7.19.4.

 CURLOPT_PROXYAUTH

 Verification method for HTTP proxy connection. Use the bitfield flags in CURLOPT_HTTPAUTH to set the corresponding options. For proxy authentication only CURLAUTH_BASIC and CURLAUTH_NTLM are currently supported.

  Added in cURL 7.10.7.

  CURLOPT_PROXYPORT

  The port of the proxy server. The port can also be set in CURLOPT_PROXY.

  CURLOPT_PROXYTYPE

  Either CURLPROXY_HTTP (default) or CURLPROXY_SOCKS5.

 Added in cURL 7.10.

  CURLOPT_REDIR_PROTOCOLS

A bitmask of CURLPROTO_* values. If used, this bitmask limits which protocols libcurl may use when it follows a redirect with CURLOPT_FOLLOWLOCATION enabled. This lets you restrict redirected transfers to a subset of the allowed protocols. By default libcurl allows all protocols except FILE and SCP. This differs from versions before 7.19.4, which unconditionally followed redirects to all supported protocols. For the protocol constants, see CURLOPT_PROTOCOLS.

  Added in cURL 7.19.4.

  CURLOPT_RESUME_FROM

The offset, in bytes, from which to resume a transfer.

CURLOPT_SSL_VERIFYHOST

1 checks that a common name exists in the server's SSL certificate. 2 checks that the common name exists and also matches the host name provided. (Translator's note: the common name is generally the domain or subdomain for which the SSL certificate was issued.)

 CURLOPT_SSLVERSION

 The SSL version to use (2 or 3). By default PHP will detect this value by itself, although in some cases it may need to be set manually.

  CURLOPT_TIMECONDITION

How CURLOPT_TIMEVALUE is treated. Use CURL_TIMECOND_IFMODSINCE to return the page only if it has been modified since the time specified in CURLOPT_TIMEVALUE. If it has not been modified, a "304 Not Modified" header is returned, assuming CURLOPT_HEADER is TRUE. Use CURL_TIMECOND_IFUNMODSINCE for the reverse effect. The default is CURL_TIMECOND_IFMODSINCE.

 CURLOPT_TIMEOUT

  Set the maximum number of seconds cURL is allowed to execute.

 CURLOPT_TIMEOUT_MS

  Set the maximum number of milliseconds that cURL is allowed to execute.

  Added in cURL 7.16.2. Available from PHP 5.2.3 onwards.
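The timeout and redirect options in this category are usually set together. The sketch below shows one way to do that with curl_setopt_array(); the helper name apply_limits() and the specific values are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: apply connection, transfer, and redirect limits to a handle.
function apply_limits($curl): bool
{
    return curl_setopt_array($curl, [
        CURLOPT_CONNECTTIMEOUT => 5,    // seconds allowed for establishing the connection
        CURLOPT_TIMEOUT        => 15,   // seconds allowed for the whole transfer
        CURLOPT_FOLLOWLOCATION => true, // follow "Location:" redirects...
        CURLOPT_MAXREDIRS      => 3,    // ...but at most three of them
    ]);
}

$curl = curl_init();
var_dump(apply_limits($curl)); // bool(true) when every option was accepted
curl_close($curl);
```

curl_setopt_array() stops at the first option that cannot be set and returns false, so the return value is worth checking.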

 CURLOPT_TIMEVALUE

 Set a timestamp used by CURLOPT_TIMECONDITION. By default, CURL_TIMECOND_IFMODSINCE is used.

Third category:

For the optional parameters of the following options, value should be set to a string type value:

Option

Optional value

Remarks

CURLOPT_CAINFO

The name of a file holding one or more certificates to verify the peer with. This option is only meaningful when used together with CURLOPT_SSL_VERIFYPEER.

  CURLOPT_CAPATH

  A directory that holds multiple CA certificates. This option is used with CURLOPT_SSL_VERIFYPEER.

  CURLOPT_COOKIE

  Set the "Cookie:" part of the HTTP request. Multiple cookies are separated by a semicolon followed by a space (for example, "fruit=apple; color=red").
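A cookie string in this format can be assembled from an array. This is a small sketch; the helper name build_cookie_header() is hypothetical, and values are joined as-is without any encoding.

```php
<?php
// Hypothetical helper: turn name => value pairs into a value for CURLOPT_COOKIE.
function build_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    // Cookies are separated by a semicolon followed by a space.
    return implode('; ', $pairs);
}

echo build_cookie_header(['fruit' => 'apple', 'color' => 'red']), "\n";
// prints: fruit=apple; color=red
```

The result would then be passed as curl_setopt($curl, CURLOPT_COOKIE, $cookieString).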

  CURLOPT_COOKIEFILE

  The file name containing cookie data. The format of the cookie file can be Netscape format, or just pure HTTP header information can be stored in the file.

 CURLOPT_COOKIEJAR

The name of a file in which cookie information is saved when the handle is closed.

CURLOPT_CUSTOMREQUEST

A custom request method to use instead of "GET" or "HEAD" for the HTTP request. This is useful for "DELETE" or other, more obscure HTTP requests. Valid values are method names such as "GET", "POST", and "CONNECT"; do not enter a whole HTTP request line here. For example, entering "GET /index.html HTTP/1.0\r\n\r\n" is incorrect.

 Note:

 Do not use this custom request method until you are sure that the server supports it.

  CURLOPT_EGDSOCKET

  Similar to CURLOPT_RANDOM_FILE, except for an Entropy Gathering Daemon socket.

 CURLOPT_ENCODING

 The value of "Accept-Encoding:" in the HTTP request header. Supported encodings are "identity", "deflate" and "gzip". If it is the empty string "", the request header will send all supported encoding types.

 Added in cURL 7.10.

  CURLOPT_FTPPORT

This value is used to obtain the IP address for the FTP "PORT" command. The "PORT" command tells the remote server to connect to the IP address we specify. The string may be a plain IP address, a host name, a network interface name (under UNIX), or just a plain "-" to use the system's default IP address.

  CURLOPT_INTERFACE

  Network sending interface name, which can be an interface name, IP address or a host name.

 CURLOPT_KRB4LEVEL

KRB4 (Kerberos 4) security level. Any of the following values are valid (in order from lowest to highest): "clear", "safe", "confidential", "private". If the string matches none of these, "private" is used. Setting this option to NULL disables KRB4 security. Currently KRB4 security only works with FTP transfers.

  CURLOPT_POSTFIELDS

The full data to send in an HTTP "POST" operation. It can be passed either as a urlencoded string such as 'para1=val1&para2=val2&...' or as an array with the field name as the key and the field data as the value. If value is an array, the Content-Type header is set to multipart/form-data. To send a file, older PHP versions accepted a field value consisting of @ followed by the full path; that prefix was deprecated in PHP 5.5 and no longer works in PHP 7, where a CURLFile object is used instead.
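The difference between the string form and the array form can be sketched as follows. The field values and the helper name post_options() are assumptions for illustration.

```php
<?php
// Sample form fields (made up); the note contains characters that need encoding.
$fields = ['username' => 'coder', 'password' => '12345', 'note' => 'a&b c'];

// String form: the request is sent as application/x-www-form-urlencoded.
$encoded = http_build_query($fields, '', '&');
echo $encoded, "\n"; // username=coder&password=12345&note=a%26b+c

// Hypothetical helper: pick the body encoding for a POST request.
function post_options(array $fields, bool $multipart = false): array
{
    return [
        CURLOPT_POST       => true,
        // An array makes cURL send multipart/form-data; a string is sent urlencoded.
        CURLOPT_POSTFIELDS => $multipart ? $fields : http_build_query($fields, '', '&'),
    ];
}
```

The returned array would be applied with curl_setopt_array($curl, post_options($fields)). Many servers expect the urlencoded form for ordinary form submissions, so the string form is the default here.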

  CURLOPT_PROXY

  HTTP proxy channel.

  CURLOPT_PROXYUSERPWD

  A string in the format of "[username]:[password]" used to connect to the proxy.

  CURLOPT_RANDOM_FILE

  A file name used to generate SSL random number seeds.

  CURLOPT_RANGE

The range(s) of data to retrieve, in the form "X-Y", where X or Y may be omitted; the values are byte offsets. HTTP transfers also support several ranges separated by commas, such as "X-Y,N-M".

 CURLOPT_REFERER

 The content of "Referer:" in the HTTP request header.

 CURLOPT_SSL_CIPHER_LIST

 A list of SSL encryption algorithms. For example RC4-SHA and TLSv1 are both available encryption lists.

  CURLOPT_SSLCERT

  A file name containing a certificate in PEM format.

 CURLOPT_SSLCERTPASSWD

 The password required to use the CURLOPT_SSLCERT certificate.

 CURLOPT_SSLCERTTYPE

  Type of certificate. Supported formats are "PEM" (default), "DER" and "ENG".

  Added in cURL 7.9.3.

 CURLOPT_SSLENGINE

The identifier of the crypto engine for the SSL private key specified in CURLOPT_SSLKEY.

CURLOPT_SSLENGINE_DEFAULT

The identifier of the crypto engine used for asymmetric crypto operations.

  CURLOPT_SSLKEY

  The file name containing the SSL private key.

 CURLOPT_SSLKEYPASSWD

 The password of the SSL private key specified in CURLOPT_SSLKEY.

 Note:

Since this option contains sensitive password information, remember to keep this PHP script safe.

  CURLOPT_SSLKEYTYPE

  The encryption type of the private key specified in CURLOPT_SSLKEY. The supported key types are "PEM" (default value), "DER" and "ENG".

  CURLOPT_URL

The URL to fetch. It can also be set when initializing the session with curl_init().

CURLOPT_USERAGENT

The contents of the "User-Agent:" header to send in the HTTP request.

  CURLOPT_USERPWD

  Pass the username and password required for a connection, in the format: "[username]:[password]".

Fourth category:

For the following options, value should be set to an array:

Option

Optional value

Remarks

CURLOPT_HTTP200ALIASES

An array of HTTP 200 responses that are treated as valid responses rather than as errors.

  Added in cURL 7.10.3.

  CURLOPT_HTTPHEADER

An array of HTTP header fields to set, in the form array('Content-type: text/plain', 'Content-length: 100')
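As an illustration of the header array, here is a sketch that prepares the options for sending a JSON body. The helper name json_post_options() and the chosen headers are assumptions, not something the original article prescribes.

```php
<?php
// Hypothetical helper: build the options for a POST request with a JSON body.
function json_post_options(array $payload): array
{
    return [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => json_encode($payload), // a string body, sent as-is
        CURLOPT_HTTPHEADER     => [
            // Each entry is one complete "Name: value" header line.
            'Content-Type: application/json',
            'Accept: application/json',
        ],
        CURLOPT_RETURNTRANSFER => true,
    ];
}

print_r(json_post_options(['a' => 1])[CURLOPT_HTTPHEADER]);
```

The options would be applied with curl_setopt_array($curl, json_post_options($payload)) before calling curl_exec().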

  CURLOPT_POSTQUOTE

A set of FTP commands to execute on the server after the FTP request has been performed.

 CURLOPT_QUOTE

 A set of FTP commands executed on the server before the FTP request.

 For the optional parameters of the following options, value should be set to a stream resource (for example, using fopen()):

 Option

 Optional value

 CURLOPT_FILE

The file that the transfer should be written to; the value is a stream resource. The default is STDOUT (the browser window).

CURLOPT_INFILE

The file that the transfer should be read from when uploading; the value is a stream resource.

CURLOPT_STDERR

An alternative location to write errors to instead of the default STDERR; the value is a stream resource.

CURLOPT_WRITEHEADER

The file that the header part of the transfer is written to; the value is a stream resource.

 For the optional parameters of the following options, value should be set to a callback function name:

 Option

 Optional value

  CURLOPT_HEADERFUNCTION

A callback function that takes two parameters. The first is the cURL resource handle, and the second is a string with the header data to be written. The header data must be written by this callback, which returns the number of bytes written.
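A header callback of this kind might look like the sketch below. The variable names are hypothetical, and the callback is invoked by hand with sample header lines so that the example runs without a network connection.

```php
<?php
$responseHeaders = [];

// cURL calls this once per header line; it must return the number of bytes handled.
$collectHeader = function ($curl, string $line) use (&$responseHeaders): int {
    $parts = explode(':', $line, 2);
    if (count($parts) === 2) {
        // Store "Name: value" lines under a lower-cased name; the status line is skipped.
        $responseHeaders[strtolower(trim($parts[0]))] = trim($parts[1]);
    }
    return strlen($line);
};

// Registering the callback on a handle would look like this:
//   curl_setopt($curl, CURLOPT_HEADERFUNCTION, $collectHeader);

// Simulate what cURL would pass for two header lines (the handle is unused here).
$collectHeader(null, "HTTP/1.1 200 OK\r\n");
$collectHeader(null, "Content-Type: text/html; charset=utf-8\r\n");

print_r($responseHeaders);
```

Returning anything other than the length of the line makes cURL treat the transfer as failed, which is why the callback returns strlen($line) even for lines it ignores.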

  CURLOPT_PASSWDFUNCTION

  Set a callback function with three parameters. The first is the cURL resource handle, the second is a password prompt, and the third parameter is the maximum allowed password length. Returns the value of the password.

  CURLOPT_PROGRESSFUNCTION

A callback function that takes five parameters (in PHP 5.5 and later). The first is the cURL resource handle, the second is the total number of bytes expected to be downloaded, the third is the number of bytes downloaded so far, the fourth is the total number of bytes expected to be uploaded, and the fifth is the number of bytes uploaded so far. Returning a non-zero value aborts the transfer.

CURLOPT_READFUNCTION

A callback function that takes three parameters. The first is the cURL resource handle, the second is the stream resource provided to cURL through CURLOPT_INFILE, and the third is the maximum amount of data to read. The callback must return a string no longer than the requested amount, typically read from the passed stream resource. Returning an empty string signals EOF.

CURLOPT_WRITEFUNCTION

A callback function that takes two parameters. The first is the cURL resource handle, and the second is a string with the data to be written. The data must be saved by this callback, which must return the exact number of bytes written; otherwise the transfer is aborted with an error.


source:php.cn