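Caching means that, to reduce how often the server is accessed and cut the number of network exchanges, the front end saves the data it has fetched and reuses the saved copy when the data is needed again.
Caching has a large effect on both user experience and communication cost, so caching mechanisms should be used as flexibly as possible.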
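An HTTP cache is a time-based cache.
The browser caches the response to the first request, and later requests can take that response from the cache. This reduces latency and lowers bandwidth consumption; because a request may never be sent at all, the traffic on the network drops as well.
The browser sends the first request and the server returns a response. If the response carries information telling the browser that it may be cached, the browser stores it in the browser cache.
On a later request, the browser first checks whether the cached copy has expired. If it has not, the browser sends no request to the server at all and reads the result straight from the cache.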
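For example, visit the Juejin site:
In the Size column, disk cache means that the response was read from the cache on disk.
If the cached copy has expired, the server does not necessarily return a full response as it did for the first request.
Once the cached copy has expired, the browser sends the request to the server together with the tag it saved with that copy. If the server decides the copy is still usable, it returns a 304 response code and the browser keeps using the copy.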
For example, select one of the cached files in the picture above, copy its request URL, and replay it with curl.
First add -I to fetch the headers of the original response and look at the etag or last-modified header.
These headers matter because, once the browser's cached copy expires, the browser sends their values back to the server so that the server can decide whether the copy can still be used.
For the etag header, add an if-none-match header carrying the etag value to query the server. You can also add an if-modified-since header carrying the value of the last-modified header.
The server returns 304. The advantage of a 304 response is that it carries no message body, which saves a lot of bandwidth.
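The curl steps above amount to copying the validators from the first response into the headers of the next request. A minimal Python sketch of that step, assuming the saved response headers are available as a dict; the function name conditional_headers and the sample values are made up for illustration:

```python
def conditional_headers(stored_headers):
    """Build the validators to send once a cached copy has expired."""
    # Header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in stored_headers.items()}
    headers = {}
    if "etag" in lowered:
        headers["If-None-Match"] = lowered["etag"]
    if "last-modified" in lowered:
        headers["If-Modified-Since"] = lowered["last-modified"]
    return headers


# Hypothetical headers saved from an earlier response.
saved = {"ETag": '"abc123"', "Last-Modified": "Fri, 01 Oct 2021 00:00:00 GMT"}
request_headers = conditional_headers(saved)
```

If the server answers such a request with 304, the client keeps serving the copy it already holds.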
The browser cache is a private cache and is only available to one user.
A shared cache sits on a server and can be used by multiple users. For example, hot resources such as a popular video are placed in the cache of a proxy server to reduce the pressure on the origin server and improve network efficiency.
How can you tell whether a resource was served from the proxy server's cache or sent by the origin server?
Take the Juejin example again.
In the picture, the Response Headers of this request contain an age header, whose unit is seconds. It indicates that the response was returned by a shared cache, and its value is how long the response has existed in that cache. The figure shows 327784, meaning the response has been in the shared cache for 327784 seconds.
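To make a value such as 327784 easier to read, the Age header can be broken down into days, hours, minutes and seconds. This is only a small sketch; the function name describe_age is made up, and the headers are assumed to arrive as a plain dict:

```python
def describe_age(headers):
    """Return the Age header as (days, hours, minutes, seconds), or None if absent."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if "age" not in lowered:
        return None  # no Age header: nothing indicates a shared cache answered
    total = int(lowered["age"])
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return days, hours, minutes, seconds


age_parts = describe_age({"Age": "327784"})
```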
The shared cache also expires. Let’s take a look at how the shared cache works.
As shown in the figure:
1. client1 initiates a request. Cache, the proxy server (shared cache), forwards the request to the origin server. The origin server returns the response and states in the Cache-Control header that it can be cached for 100 seconds. Cache then starts an Age timer and returns the response to client1 with the Age: 0 header.
2. After 10 seconds, client2 sends the same request. The copy in Cache has not expired, so Cache returns the cached response to client2 with the Age: 10 header.
3. After 100 seconds, client3 sends the same request. By now the copy in Cache has expired, so Cache sends a conditional request to the origin server, using the If-None-Match request header mentioned earlier with the cached fingerprint. If the origin server considers the cached copy still usable, it returns a 304 status code to Cache. Cache restarts the timer, takes the response from its cache, and returns it to client3 with the Age: 0 header.
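The three steps can be replayed with a toy model. The sketch below is not how a real proxy is built: the SharedCache class, the origin function, the "v1" tag and the use of plain integer seconds are all assumptions made to keep the example short.

```python
class SharedCache:
    """Toy proxy cache holding one response; times are plain seconds."""

    def __init__(self, origin):
        self.origin = origin  # callable(etag or None) -> (status, body, etag, max_age)
        self.entry = None     # (body, etag, max_age, stored_at)

    def get(self, now):
        """Return (body, age) as the proxy would answer a client at time `now`."""
        if self.entry is None:
            # Nothing cached yet: forward the request to the origin server.
            _, body, etag, max_age = self.origin(None)
            self.entry = (body, etag, max_age, now)
            return body, 0
        body, etag, max_age, stored_at = self.entry
        age = now - stored_at
        if age < max_age:
            return body, age  # still fresh: answer from the cache
        # Expired: revalidate with the stored fingerprint (If-None-Match).
        status, new_body, new_etag, new_max_age = self.origin(etag)
        if status == 304:
            # Origin confirmed the copy: restart the timer and reuse the body.
            self.entry = (body, etag, new_max_age, now)
            return body, 0
        self.entry = (new_body, new_etag, new_max_age, now)
        return new_body, 0


def origin(if_none_match):
    """Stand-in origin server whose resource never changes."""
    if if_none_match == '"v1"':
        return 304, None, '"v1"', 100
    return 200, "hello", '"v1"', 100


cache = SharedCache(origin)
# client1 at 0 s, client2 at 10 s, client3 at 100 s
timeline = [cache.get(now) for now in (0, 10, 100)]
```

Each tuple in timeline pairs the body with the Age value the client would see.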
The HTTP protocol has its own caching mechanisms, and an API can use them directly to manage caching. HTTP caching is defined in detail in RFC 7234 and falls into two categories: the Expiration Model and the Validation Model.
In HTTP, a cached response that can still be used is said to be fresh, and one that can no longer be used is said to be stale.
The expiration model is implemented by including expiry information in the server's response. HTTP 1.1 defines two ways to do this: one uses the Cache-Control response header and the other uses the Expires response header.
// 1
Expires: Thu, 01 Oct 2020 00:00:00 GMT
// 2
Cache-Control: max-age=3600
The Expires header has existed since HTTP 1.0. It expresses the expiry as an absolute time, written in the time format defined in RFC 1123. Cache-Control is defined in HTTP 1.1 and expresses the expiry as the number of seconds that may pass, counted from the current time.
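To see the two forms side by side, the following sketch expresses one lifetime both as an absolute Expires time and as a relative max-age. The helper name expiry_headers and the sample timestamp are assumptions; format_datetime comes from Python's standard email.utils module.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(now, lifetime_seconds):
    """Express the same lifetime both ways: absolute (Expires) and relative (max-age)."""
    expires_at = now + timedelta(seconds=lifetime_seconds)
    return {
        # usegmt=True yields the "GMT" suffix used by HTTP dates; it needs a UTC datetime.
        "Expires": format_datetime(expires_at, usegmt=True),
        "Cache-Control": f"max-age={lifetime_seconds}",
    }


# Hypothetical moment at which the response is generated.
now = datetime(2020, 9, 30, 23, 0, 0, tzinfo=timezone.utc)
headers = expiry_headers(now, 3600)
```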
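Which of the two headers to use depends on the nature of the data being returned. For data known in advance to be updated at a specific date, such as a weather forecast that is updated at the same time every day, the Expires header can specify the time of the update. For data that will not be updated in the future, or for static data, you can specify a date far in the future so that the cached data is kept indefinitely. However, HTTP 1.1 does not allow a time more than one year ahead, so that far-future date can be at most one year from now.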
Expires: Fri, 01 Oct 2021 00:00:00 GMT
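For data that is not updated on a schedule but whose update frequency is fairly stable, or data that is updated fairly often but for which frequent requests to the server are undesirable, use the Cache-Control header.
When the Expires and Cache-Control headers are used together, the Cache-Control header takes precedence.
The Cache-Control example above uses the max-age keyword. The max-age calculation relies on a header named Date, which gives the time at which the server generated the response. Counting from that time, the cached copy is considered expired once the elapsed time exceeds the max-age value.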
Date: Wed, 30 Sep 2020 00:00:00 GMT
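The Date header gives the time at which the server generated the response. According to the HTTP protocol, every HTTP message must carry a Date header, apart from a few special cases.
The time in the Date header must be written in the format called HTTP-date. This time is used when computing cache lifetime: the Date header serves to synchronize time, so the calculation still holds even if the client arbitrarily changes settings such as its date.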
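The expiry rules described so far (max-age counted from Date, with Cache-Control taking precedence over Expires) can be sketched as follows. The function names expires_at and is_fresh are made up, the headers are assumed to be a dict with canonical names, and real caches handle many more cases than this.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def expires_at(headers):
    """Return when a response stops being fresh, or None if it states no expiry."""
    match = re.search(r"(?:^|,)\s*max-age=(\d+)", headers.get("Cache-Control", ""))
    if match and "Date" in headers:
        # max-age takes precedence over Expires and is counted from the Date header.
        generated = parsedate_to_datetime(headers["Date"])
        return generated + timedelta(seconds=int(match.group(1)))
    if "Expires" in headers:
        return parsedate_to_datetime(headers["Expires"])
    return None


def is_fresh(headers, now):
    limit = expires_at(headers)
    return limit is not None and now < limit


# Hypothetical response carrying both headers; max-age=3600 decides.
sample = {
    "Date": "Wed, 30 Sep 2020 00:00:00 GMT",
    "Cache-Control": "max-age=3600",
    "Expires": "Thu, 01 Oct 2020 00:00:00 GMT",
}
fresh = is_fresh(sample, datetime(2020, 9, 30, 0, 30, tzinfo=timezone.utc))
```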
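In contrast to the expiration model, which decides how long to keep a cached copy purely from the response it received, the validation model asks the server whether the copy currently held is still valid.
The validation model accesses the network from time to time while checking the cache. To use it, the application server must support conditional requests. A conditional request is a message from the front end to the server saying "if the information I hold has been updated, give me the updated version". The front end sends information about the data it obtained at some earlier point in time; the server returns updated data only if its data has changed, and otherwise returns just a 304 (Not Modified) status code to tell the front end that there is nothing newer.
To make a conditional request, the front end must tell the server the state of the information it currently holds, using either the last-modified date or the entity tag (Entity Tag) as the indicator. As the name suggests, the last-modified date is the date on which the data was last updated. The entity tag is an identifier for a specific version of a resource: a string that acts as a fingerprint (Finger Print), such as the MD5 hash of the response data, and that changes whenever the message content changes. Both are generated on the server and sent to the front end in response headers; the front end stores them together with the cached copy and uses them in conditional requests.
The last-modified date and the entity tag are returned to the front end in the Last-Modified and ETag response headers respectively: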
Last-Modified: Fri, 01 Oct 2021 00:00:00 GMT
ETag: "ff568sdf4545687fadf4dsa545e4f5s4f5se45"
When the front end makes a conditional request with the last-modified date, it uses the If-Modified-Since header. With an entity tag, it uses the If-None-Match header:
GET /v1/user/1
If-Modified-Since: Fri, 01 Oct 2021 00:00:00 GMT

GET /v1/user/1
If-None-Match: "ff568sdf4545687fadf4dsa545e4f5s4f5se45"
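The server compares the information sent by the front end with the current state. If nothing has changed, it returns a 304 status code. If there is an update, it answers as it would an ordinary request: it returns a 200 status code together with the updated content, along with the new last-modified date and entity tag. A 304 response has an empty body, which reduces the amount of data transferred.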
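On the server side, the check could look roughly like the sketch below. It is a simplification under stated assumptions: the function names make_etag and respond are made up, the entity tag is the MD5 hash of the body, and If-None-Match is assumed to hold a single tag rather than a list.

```python
import hashlib
from email.utils import parsedate_to_datetime


def make_etag(body):
    """Fingerprint of the body; MD5 serves only as a change detector here."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def respond(request_headers, body, last_modified):
    """Return (status, headers, body) for a GET that may carry validators."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Last-Modified": last_modified}
    if "If-None-Match" in request_headers:
        unchanged = request_headers["If-None-Match"] == etag
    elif "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        unchanged = parsedate_to_datetime(last_modified) <= since
    else:
        unchanged = False
    if unchanged:
        return 304, headers, b""  # nothing new: send validators, no body
    return 200, headers, body


# Hypothetical resource and its last update time.
body = b'{"id": 1}'
last_modified = "Fri, 01 Oct 2021 00:00:00 GMT"
first = respond({}, body, last_modified)
second = respond({"If-None-Match": first[1]["ETag"]}, body, last_modified)
```

Here first is an ordinary 200 response, and second is the answer to a follow-up request that echoes the entity tag back.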
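In the HTTP protocol, an ETag can perform either strong validation or weak validation.
An ETag that performs strong validation:
ETag: "ffsd5f46s12wef13we2f13dsd21fsd32f1"
An ETag that performs weak validation:
ETag: W/"ffsd5f46s12wef13we2f13dsd21fsd32f1"
Strong validation means that the data on the server and on the client must not differ by even one byte; they must be exactly identical. Weak validation means that, even if the data is not exactly identical, it can be treated as the same as long as nothing has changed in terms of what the resource means. Take advertising content: the ads may differ on every visit, yet they remain the same resource, so weak validation can be used.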
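The difference shows up when two entity tags are compared. The sketch below follows the usual comparison rules: a strong comparison requires both tags to be strong and identical, while a weak comparison ignores the W/ prefix. The function names are made up for illustration.

```python
def split_etag(etag):
    """Return (is_weak, opaque_tag) for an ETag such as W/"abc" or "abc"."""
    if etag.startswith("W/"):
        return True, etag[2:]
    return False, etag


def strong_match(a, b):
    """Both tags must be strong and character-for-character identical."""
    weak_a, tag_a = split_etag(a)
    weak_b, tag_b = split_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a, b):
    """The W/ prefix is ignored; only the opaque tags are compared."""
    return split_etag(a)[1] == split_etag(b)[1]
```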
HTTP 1.1 notes that when the server gives no explicit expiration time, the client may decide for itself how long to keep the cached data. In that case the client has to work out the expiration time from information such as how often the server's data is updated. This method is called heuristic expiration.
For example, if the front end sees from Last-Modified that the last update was a year ago, keeping the cached data for a while should cause no problem. If its visits so far show that the data is updated only once a day, keeping the cache for half a day may be feasible. In this way the front end can reduce the number of requests through its own judgment.
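One possible heuristic is to keep a copy for a fixed fraction of the time that has passed since Last-Modified. The one-tenth fraction used below is an assumption chosen for this sketch, not a value given in the article, and the function name heuristic_lifetime is made up.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def heuristic_lifetime(last_modified, now, divisor=10):
    """Guess a cache lifetime as 1/divisor of the time since the last update."""
    unchanged_for = now - parsedate_to_datetime(last_modified)
    if unchanged_for <= timedelta(0):
        return timedelta(0)  # modified "in the future": do not guess
    return unchanged_for / divisor


# Hypothetical resource that was last updated ten days ago.
now = datetime(2021, 10, 11, tzinfo=timezone.utc)
lifetime = heuristic_lifetime("Fri, 01 Oct 2021 00:00:00 GMT", now)
```

The longer a resource has gone unchanged, the longer this heuristic keeps the copy.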
Whether heuristic expiration is acceptable depends on the characteristics of the API. Because the server understands cache updates and cache control best, the ideal approach for both sides is for the server to tell the front end exactly how long to keep cached data through Cache-Control, Expires and similar headers. If it does not, the server needs to inform the front end through header information such as Last-Modified.
Using Vary to specify the cache unit
When implementing caching, you may also need to specify the Vary header. Vary names the request headers that are used, in addition to the URI, to identify a unique piece of data. It is needed because, even when the URI is the same, the data returned sometimes differs depending on the content of the request headers. A cached response can be used only when the request headers named by Vary match those of the request that produced it.
In the definition of Vary, field-name means that the named header must match the header in the request for the cached response to be used.
As shown in the figure:
1. Client1 sends a GET request carrying the Accept-Encoding: * header to server. server returns a gzip-encoded response together with the Vary: Accept-Encoding header, indicating that the cached copy can be used only when the requested encoding is the same.
2. Client2 sends a GET request carrying the Accept-Encoding: br header to server, so the encoding requested this time is br. Cache cannot use its cached copy because the request does not match the value recorded for Vary, and can only forward the request to the origin server, server.
3. Client3 sends a GET request carrying the Accept-Encoding: br header to server. By now Cache holds a br-encoded copy that matches the value recorded for the Vary header, so the response can be returned from the cache.
Generally speaking, the Vary header is used where HTTP traffic passes through a proxy server, especially one with a caching function. However, the server sometimes cannot know whether the front end is reaching it through a proxy server, so whenever server-driven content negotiation is used, the Vary header becomes required.
The range of values for the Cache-Control header is complex. Cache-Control is defined as a list of directives, each consisting of a token optionally followed by a value.
Cache-Control can be used in a request or in a response, and the same value has different meanings in each.
A Cache-Control value takes one of three forms:
1. A token on its own.
2. A token followed by '=' and a decimal number.
3. A token followed by '=' and the corresponding header names, or the token on its own.
The values of Cache-Control in a request and their meaning:
max-age: the client will not accept a cached response whose Age exceeds max-age seconds.
max-stale: the client will accept a cached response that expired no more than max-stale seconds ago. If no value follows max-stale, the client will accept the response no matter how long ago it expired.
min-fresh: the client wants a cached response that will remain fresh for at least min-fresh more seconds.
The values of Cache-Control in a response and their meaning:
max-age: the cached copy expires once its Age exceeds max-age seconds.
s-maxage: similar to max-age, but applies only to shared caches and takes priority over max-age and Expires.
proxy-revalidate: similar to must-revalidate, but applies only to the shared cache of a proxy server.
no-cache: if headers are specified after no-cache, the cached copy can be used directly when the client's subsequent requests and responses do not contain those headers.
private: if headers are specified after private, the proxy server must not cache the specified headers but may cache the others.