This article gives a detailed explanation of HTTP request headers and request bodies. Studying HTTP mainly involves four parts: HTTP basics, HTTP request headers and request bodies, HTTP response headers and status codes, and HTTP caching. To go further, you should also understand and practice HTTPS, HTTP/2 basics, and WebSocket basics. The knowledge points in this part are also summarized in the author's review "My Road to Campus Recruitment Preparation: From Web Front-End to Server-Side Application Architecture".

HTTP Request

An HTTP request message is divided into three parts: the request line, the request headers, and the request body. The format is as follows:

A typical request line followed by its header fields looks like this:

GET http://download.microtool.de:80/somedata.exe HTTP/1.1
Host: download.microtool.de
Accept: */*
Pragma: no-cache
Cache-Control: no-cache
Referer: http://download.microtool.de/
User-Agent: Mozilla/4.04[en](Win95;I;Nav)
Range: bytes=554554-


Request Line

The request line consists of three parts: the request method, the request target (URL), and the protocol version. It ends with CRLF (\r\n).
HTTP/1.1 defines eight request methods: GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, and TRACE. PATCH was added later by RFC 5789. The two most common methods are GET and POST; RESTful interfaces typically use GET, POST, PUT, and DELETE.
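The three-part structure of the request line can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `parse_request_line` and the `KNOWN_METHODS` set are hypothetical, and a real server would apply more validation than is shown here.

```python
# The eight HTTP/1.1 methods plus PATCH, which is defined separately in RFC 5789.
KNOWN_METHODS = {
    "GET", "HEAD", "POST", "PUT", "DELETE",
    "CONNECT", "OPTIONS", "TRACE", "PATCH",
}


def parse_request_line(line: bytes):
    """Split a raw request line into (method, target, version)."""
    if not line.endswith(b"\r\n"):
        raise ValueError("request line must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) != 3:
        raise ValueError("expected 'METHOD target VERSION'")
    method, target, version = parts
    if method not in KNOWN_METHODS:
        raise ValueError(f"unknown method: {method}")
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected protocol version: {version}")
    return method, target, version


method, target, version = parse_request_line(b"GET /somedata.exe HTTP/1.1\r\n")
print(method, target, version)
```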

Request Methods

Note that, in practice, only the three verbs POST, PUT, and PATCH carry a request body; the verbs GET, HEAD, DELETE, CONNECT, TRACE, and OPTIONS normally do not.

Request Headers

Header	Explanation	Example
Accept	Content types the client can accept	Accept: text/plain, text/html, application/json
Accept-Charset	Character sets the browser can accept	Accept-Charset: iso-8859-5
Accept-Encoding	Content compression encodings the browser supports for the server's response	Accept-Encoding: compress, gzip
Accept-Language	Languages the browser can accept	Accept-Language: en,zh
Accept-Ranges	Indicates that one or more sub-ranges of the entity can be requested (strictly speaking, this field is sent by the server in responses)	Accept-Ranges: bytes
Authorization	HTTP authorization credentials	Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
Cache-Control	Specifies the caching mechanism that requests and responses follow	Cache-Control: no-cache
Connection	Indicates whether a persistent connection is required (HTTP/1.1 uses persistent connections by default)	Connection: close
Cookie	When an HTTP request is sent, all cookie values stored for the requested domain are sent to the web server	Cookie: $Version=1; Skin=new;
Content-Length	Length of the request body in bytes	Content-Length: 348
Content-Type	MIME type of the request body	Content-Type: application/x-www-form-urlencoded
Date	Date and time the request was sent	Date: Mon, 15 Nov 2010 08:12:31 GMT
Expect	Specific server behavior the client requires	Expect: 100-continue
From	Email address of the user making the request	From: user@email.com
Host	Domain name and port number of the requested server	Host: www.zcmhi.com
If-Match	The request is performed only if the entity matches the given ETag	If-Match: "737060cd8c284d8af7ad3082f209582d"
If-Modified-Since	The request succeeds if the requested resource was modified after the specified time; otherwise a 304 code is returned	If-Modified-Since: Fri, 29 Oct 2010 19:43:31 GMT
If-None-Match	Returns a 304 code if the content has not changed; the parameter is the ETag previously sent by the server, which is compared with the server's current ETag to determine whether the content changed	If-None-Match: "737060cd8c284d8af7ad3082f209582d"
If-Range	If the entity has not changed, the server sends only the part the client is missing; otherwise the entire entity is sent. The parameter is also an ETag	If-Range: "737060cd8c284d8af7ad3082f209582d"
If-Unmodified-Since	The request succeeds only if the entity has not been modified after the specified time	If-Unmodified-Since: Fri, 29 Oct 2010 19:43:31 GMT
Max-Forwards	Limits the number of times the message may be forwarded by proxies and gateways	Max-Forwards: 10
Pragma	Carries implementation-specific directives	Pragma: no-cache
Proxy-Authorization	Authorization credentials for connecting to a proxy	Proxy-Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
Range	Requests only part of the entity, specifying the range	Range: bytes=500-999
Referer	Address of the previous page, from which the current request originated	Referer: http://www.zcmhi.com/archives...
TE	Transfer encodings the client is willing to accept; also tells the server that trailer fields are accepted	TE: trailers,deflate;q=0.5
Upgrade	Asks the server to switch to another protocol (if supported)	Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11
User-Agent	Information about the client software making the request	User-Agent: Mozilla/5.0 (Linux; X11)
Via	Lists the intermediate gateways or proxy servers and their communication protocols	Via: 1.0 fred, 1.1 nowhere.com (Apache/1.1)
Warning	Warning information about the message entity	Warning: 199 Miscellaneous warning
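To make the table more concrete, here is a minimal Python sketch that parses a block of header lines and decodes the Basic credentials used in the Authorization example. The helper names `parse_headers` and `decode_basic_credentials` are hypothetical, and the parser is deliberately simplified: it keeps only the last value when a header name repeats.

```python
import base64


def parse_headers(block: str) -> dict:
    """Parse 'Name: value' lines into a dict keyed by lower-cased header name."""
    headers = {}
    for line in block.split("\r\n"):
        if not line:
            continue
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lower case.
        headers[name.strip().lower()] = value.strip()
    return headers


def decode_basic_credentials(authorization: str):
    """Return (user, password) from the value of an Authorization: Basic header."""
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


raw = (
    "Host: www.zcmhi.com\r\n"
    "Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==\r\n"
    "Range: bytes=500-999\r\n"
)
headers = parse_headers(raw)
credentials = decode_basic_credentials(headers["authorization"])
print(credentials)  # ('Aladdin', 'open sesame')
```

The decoded value shows that Basic credentials are only Base64-encoded, not encrypted.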

Request Body

Types

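Depending on the application scenario, the request body of an HTTP request takes one of three forms.

Arbitrary type

This form is common among mobile developers: the request body can be of any type. The server does not parse it automatically, so you have to parse the request body yourself. POSTing JSON falls into this category.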

application/json

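Everyone is surely familiar with application/json as a Content-Type response header. In fact, more and more people now use it as a request header as well, to tell the server that the message body is a serialized JSON string. Thanks to the popularity of the JSON specification, all major browsers except older versions of IE natively support JSON.stringify, and server-side languages all have functions for handling JSON, so using JSON causes no trouble.

JSON also supports structured data far more complex than key-value pairs, which is very useful. I remember that on a project a few years ago the data to be submitted was very deeply nested, so I serialized it to JSON before submitting it. At that time, however, I put the JSON string in as a val inside a key-value pair and still submitted it as x-www-form-urlencoded.

The Ajax feature in Google's AngularJS submits a JSON string by default. Take the following code, for example: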

var data = {'title': 'test', 'sub': [1, 2, 3]};
$http.post(url, data).success(function(result) {
    ...
});


The request that is finally sent is:

POST http://www.example.com HTTP/1.1
Content-Type: application/json;charset=utf-8

{"title":"test","sub":[1,2,3]}


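This approach makes it easy to submit complex structured data and is particularly well suited to RESTful interfaces. The major packet-capture tools, such as Chrome's built-in developer tools, Firebug, and Fiddler, all display JSON data as a tree, which is very friendly. However, some server-side languages do not yet support this approach; PHP, for example, cannot obtain the content of the above request through the $_POST object. In that case you have to handle it yourself: when the Content-Type request header is application/json, read the raw input stream from php://input and then json_decode it into an object. Some PHP frameworks have already started doing this.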
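The same "check the Content-Type, then decode the raw body yourself" idea can be sketched in Python. This is an illustrative sketch only: `encode_json_request` and `decode_request_body` are hypothetical helper names, and a real framework would handle more media types and error cases.

```python
import json


def encode_json_request(data):
    """Serialize data to a compact JSON body and build matching headers."""
    body = json.dumps(data, separators=(",", ":")).encode("utf-8")
    headers = {
        "Content-Type": "application/json;charset=utf-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


def decode_request_body(content_type: str, raw: bytes):
    """Decode the raw body only when the media type says it is JSON."""
    # Drop parameters such as ';charset=utf-8' before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(raw.decode("utf-8"))
    raise ValueError(f"unsupported media type: {media_type}")


headers, body = encode_json_request({"title": "test", "sub": [1, 2, 3]})
decoded = decode_request_body(headers["Content-Type"], body)
print(body)
print(decoded)
```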

Of course, AngularJS can also be configured to submit data as x-www-form-urlencoded. If you need that, refer to an article on the subject.

text/xml

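My blog has previously mentioned XML-RPC (XML Remote Procedure Call). It is a remote-call specification that uses HTTP as the transport protocol and XML as the encoding. A typical XML-RPC request looks like this: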

POST http://www.example.com HTTP/1.1
Content-Type: text/xml

<?xml version="1.0"?>
<methodCall>
    <methodName>examples.getStateName</methodName>
    <params>
        <param>
            <value><i4>41</i4></value>
        </param>
    </params>
</methodCall>


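The XML-RPC protocol is simple, sufficiently capable, and has implementations in all kinds of languages. It is also widely used, for example in WordPress's XML-RPC API and in search engines' ping services. In JavaScript there are also ready-made libraries that support exchanging data this way and work well with existing XML-RPC services. Personally, though, I find XML too bloated; in ordinary scenarios JSON is more flexible and convenient.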
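As an illustration of how an existing implementation handles such a request, the sketch below uses Python's standard `xmlrpc.client` module to parse the example call and to generate an equivalent one. Note that `dumps` writes the integer as `<int>` rather than `<i4>`; both tags denote the same XML-RPC integer type.

```python
import xmlrpc.client

# The example request body from the text, written without extra whitespace.
request_xml = (
    '<?xml version="1.0"?>'
    "<methodCall>"
    "<methodName>examples.getStateName</methodName>"
    "<params><param><value><i4>41</i4></value></param></params>"
    "</methodCall>"
)

# loads() returns the parameter tuple and the method name.
params, method_name = xmlrpc.client.loads(request_xml)
print(method_name, params)

# dumps() builds a request body from a parameter tuple and a method name.
generated = xmlrpc.client.dumps((41,), methodname="examples.getStateName")
round_trip = xmlrpc.client.loads(generated)
```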

Query String: application/x-www-form-urlencoded

This is the most common way to submit data via POST. If the enctype attribute of the browser's native form element is not set, the form submits its data as application/x-www-form-urlencoded. The request looks similar to the following (irrelevant request headers are omitted in this article):

POST http://www.example.com HTTP/1.1
Content-Type: application/x-www-form-urlencoded;charset=utf-8

title=test&sub%5B%5D=1&sub%5B%5D=2&sub%5B%5D=3

First, Content-Type is specified as application/x-www-form-urlencoded. The body follows the same format as a query string in a URL: key-value pairs are joined with &, each key is joined to its value with =, and only ASCII characters may be used; non-ASCII characters must be URL-encoded. Most server-side languages support this format well. In PHP, for example, $_POST['title'] returns the value of title and $_POST['sub'] returns the sub array.
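The body shown above can be reproduced with Python's standard `urllib.parse` module. This is a small sketch; the field list mirrors the example, and the `sub[]` key naming is the PHP array convention used in the text rather than anything Python treats specially.

```python
from urllib.parse import parse_qs, urlencode

# Repeating the "sub[]" key is how PHP-style array values are submitted.
fields = [("title", "test"), ("sub[]", 1), ("sub[]", 2), ("sub[]", 3)]

body = urlencode(fields)
print(body)  # title=test&sub%5B%5D=1&sub%5B%5D=2&sub%5B%5D=3

# Decoding groups repeated keys into lists; values come back as strings.
parsed = parse_qs(body)
print(parsed)
```

Unlike PHP, `parse_qs` keeps the literal key `sub[]` instead of turning it into an array named `sub`.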

Multipart: multipart/form-data

The third form divides the request body into multiple parts and is used when uploading files. This format was first used in email transmission. Each field or file is separated into its own part by the boundary (specified in Content-Type). Each part starts with -- followed by the boundary, then the part's description headers, then an empty line, and then the content. The end of the request is marked by the boundary followed by --. The structure is shown in the figure below:

The key to deciding whether a part is treated as a file is whether its Content-Disposition contains filename. Because files come in different types, Content-Type is also used to indicate the file's type; if the type is unknown, the value can be application/octet-stream, indicating a binary file. For a part that is not a file, Content-Type can be omitted.
When uploading files through a form, the form's enctype must be set to multipart/form-data. Let's look directly at a request example:

POST http://www.example.com HTTP/1.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryrGKCBY7qhFd3TrwA

------WebKitFormBoundaryrGKCBY7qhFd3TrwA
Content-Disposition: form-data; name="text"

title
------WebKitFormBoundaryrGKCBY7qhFd3TrwA
Content-Disposition: form-data; name="file"; filename="chrome.png"
Content-Type: image/png

PNG ... content of chrome.png ...
------WebKitFormBoundaryrGKCBY7qhFd3TrwA--


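This example is slightly more complicated. First a boundary is generated to separate the different fields; to avoid clashing with the body content, the boundary is long and complex. Then the Content-Type states that the data is encoded as multipart/form-data and what the boundary for this request is. The message body is divided into several similarly structured parts, one per field. Each part starts with --boundary, followed by the content description information, then a blank line, and finally the field's content (text or binary). If a file is being transferred, the file name and file type information are also included. The message body ends with --boundary--. For the detailed definition of multipart/form-data, see rfc1867.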
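The layout described above can be assembled by hand, as in the following Python sketch. It is a simplified illustration: `build_multipart` is a hypothetical helper, the boundary prefix is made up, and the code does not escape quotes or non-ASCII characters in field names or file names.

```python
import secrets


def build_multipart(fields, files, boundary):
    """Build a multipart/form-data body.

    fields: {name: text}
    files:  {name: (filename, content_type, data_bytes)}
    Returns (content_type_header_value, body_bytes).
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",  # empty line separates the part headers from the content
            value.encode("utf-8"),
        ]
    for name, (filename, content_type, data) in files.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'.encode("utf-8"),
            f"Content-Type: {content_type}".encode("ascii"),
            b"",
            data,
        ]
    lines.append(delimiter + b"--")  # closing delimiter
    body = b"\r\n".join(lines) + b"\r\n"
    return f"multipart/form-data; boundary={boundary}", body


# A long random boundary makes a clash with the content unlikely.
boundary = "----SketchBoundary" + secrets.token_hex(8)
content_type, body = build_multipart(
    {"text": "title"},
    {"file": ("chrome.png", "image/png", b"\x89PNG placeholder bytes")},
    boundary,
)
print(content_type)
```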

This method is generally used to upload files, and the major server-side languages support it well.
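On the receiving side, a multipart body can be split back into its parts. The sketch below leans on Python's standard `email` package, which understands the same MIME structure; this is one possible approach chosen for illustration, not how any particular web framework does it. `parse_multipart` is a hypothetical helper and the file content is an ASCII placeholder.

```python
from email.parser import BytesParser
from email.policy import default


def parse_multipart(content_type: str, body: bytes):
    """Return a list of (name, filename, payload_bytes), one per part."""
    # The email parser expects headers followed by an empty line, so the
    # request's Content-Type header is put back in front of the body.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser(policy=default).parsebytes(raw)
    parts = []
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        # get_filename() returns None for ordinary (non-file) fields.
        parts.append((name, part.get_filename(), part.get_payload(decode=True)))
    return parts


boundary = "----WebKitFormBoundaryrGKCBY7qhFd3TrwA"
body = (
    f"--{boundary}\r\n"
    'Content-Disposition: form-data; name="text"\r\n'
    "\r\n"
    "title\r\n"
    f"--{boundary}\r\n"
    'Content-Disposition: form-data; name="file"; filename="chrome.png"\r\n'
    "Content-Type: image/png\r\n"
    "\r\n"
    "PNG-PLACEHOLDER\r\n"
    f"--{boundary}--\r\n"
).encode("ascii")

parts = parse_multipart(f"multipart/form-data; boundary={boundary}", body)
for name, filename, payload in parts:
    print(name, filename, payload)
```

The presence of a file name is what distinguishes the uploaded file from the plain text field.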

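The two ways of POSTing data mentioned above are both natively supported by browsers, and at this stage the standard only lets native forms use these two (specified through the form element's enctype attribute, which defaults to application/x-www-form-urlencoded; enctype also supports text/plain, but that is very rarely used).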

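As more and more web sites, especially web apps, use Ajax for all of their data exchange, we are entirely free to define new ways of submitting data that make development more convenient.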

Encoding

When a form in a web page is submitted with the POST method, the content type of the data is application/x-www-form-urlencoded. This type:

　1. Leaves the characters "a"-"z", "A"-"Z", "0"-"9", ".", "-", "*", and "_" unencoded;

　2. Converts spaces into plus signs (+);

　3. Converts every other character into bytes and writes each byte in the form "%xy", where xy is its two-digit hexadecimal value;

　4. Places an & symbol between each name=value pair.
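The four rules above can be written out directly, as in this minimal Python sketch. The helper names `encode_form_component` and `encode_form` are hypothetical, and UTF-8 is assumed as the byte encoding for characters outside the safe set.

```python
import string

# Rule 1: these characters are copied through unchanged.
UNRESERVED = set(string.ascii_letters + string.digits + ".-*_")


def encode_form_component(text: str, encoding: str = "utf-8") -> str:
    """Encode one name or value according to rules 1 to 3."""
    out = []
    for byte in text.encode(encoding):
        char = chr(byte)
        if char in UNRESERVED:
            out.append(char)            # rule 1: keep as-is
        elif char == " ":
            out.append("+")             # rule 2: space becomes '+'
        else:
            out.append(f"%{byte:02X}")  # rule 3: '%' plus two hex digits
    return "".join(out)


def encode_form(pairs) -> str:
    """Rule 4: join the encoded name=value pairs with '&'."""
    return "&".join(
        f"{encode_form_component(name)}={encode_form_component(value)}"
        for name, value in pairs
    )


encoded = encode_form([("title", "test"), ("q", "a b&c"), ("name", "é")])
print(encoded)  # title=test&q=a+b%26c&name=%C3%A9
```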

One of the many problems faced by web designers is how to deal with the differences between operating systems. These differences cause problems with URLs: for example, some operating systems allow spaces in file names, while others do not. Most operating systems attach no special meaning to the symbol "#" in a file name, but in a URL the symbol "#" indicates that the file name has ended and that a fragment identifier follows. Other special characters and non-alphanumeric characters that have special meanings in a URL or on another operating system pose similar problems. To solve these problems, the characters used in a URL must come from a fixed subset of the ASCII character set, as follows:

　1. Uppercase letters A-Z

　2. Lowercase letters a-z

　3. Numbers 0-9

　4. Punctuation characters - _ . ! ~ * ' (and ,)

Characters such as / & ? @ # $ + = and % can also be used, but each has its own special purpose. If a file name contains any of these characters ( / & ? @ # $ + = % ), they should be encoded, as should all other characters outside the set above.

The encoding process is very simple. Any character that is not an ASCII digit, letter, or one of the punctuation characters listed above is converted to bytes, and each byte is written as a "%" followed by two hexadecimal digits. Spaces are a special case because they are so common: besides "%20", a space can also be encoded as "+". The plus sign (+) itself is encoded as %2B. The characters / # = & and ? should be encoded when they are used as part of a name rather than as separators between URL parts.
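These cases can be tried out with Python's standard `urllib.parse` functions. The sketch below only illustrates the behavior described above; the sample strings are made up.

```python
from urllib.parse import quote, quote_plus, unquote_plus

# A space becomes %20 with quote() and '+' with quote_plus().
space_percent = quote("my file.txt")
space_plus = quote_plus("my file.txt")

# The plus sign and '=' are themselves percent-encoded.
plus_encoded = quote_plus("1+1=2")

# With safe="" the separators / ? # are encoded too, which is what you
# want when they are part of a name rather than URL structure.
name_encoded = quote("a/b?c#d", safe="")

# By default quote() leaves '/' alone so that paths keep their structure.
path_encoded = quote("/docs/a b")

print(space_percent, space_plus, plus_encoded, name_encoded, path_encoded)
print(unquote_plus(space_plus))
```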

WARNING: This strategy is not ideal in heterogeneous environments with many character sets. For example, on U.S. Windows systems é is encoded as %E9, while on a U.S. Mac it is encoded as %8E. This uncertainty is an obvious shortcoming of existing URIs; future URI specifications should improve on it through Internationalized Resource Identifiers (IRIs).
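The ambiguity is easy to reproduce. The following Python sketch percent-encodes the same character under three byte encodings; the choice of `cp1252` to stand in for U.S. Windows and `mac_roman` for a U.S. Mac is an assumption made for illustration.

```python
from urllib.parse import quote

# The same character yields different escapes depending on the charset
# used to turn it into bytes before percent-encoding.
windows_form = quote("é", encoding="cp1252")
mac_form = quote("é", encoding="mac_roman")
utf8_form = quote("é", encoding="utf-8")

print(windows_form, mac_form, utf8_form)  # %E9 %8E %C3%A9
```

A receiver that sees only `%E9` or `%8E` cannot tell which character was meant unless it knows the sender's charset.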

Java's URL class does not automatically perform encoding or decoding. You can create a URL object that contains illegal ASCII characters, non-ASCII characters, and/or %xx escapes. Such characters and escapes are not automatically encoded or decoded when output by methods such as getPath() and toExternalForm(). You are responsible for making sure that all characters in the strings used to create a URL object are encoded appropriately.
