HTTP Cache Headers - A Complete Guide
As a website owner, you want your website to be fast, efficient, and accessible to as many users as possible. One of the best ways to achieve this is by using HTTP caching headers. These headers tell web browsers and other HTTP clients how to cache and serve content from your website.
This article highlights important information on HTTP caching headers and associated CDN behavior. In case you are looking for in-depth information on the role of HTTP cache headers in the modern web, here's everything you need to know.
How caching works
When a browser requests a file from a server, the server responds with the file and some cache headers. The browser then caches the file based on these headers. The next time the browser requests the same file, it checks its cache to see if it already has a copy. If it does, and the file hasn't expired, the browser serves the cached version of the file. If the file has expired or if the browser has been told not to cache it, the browser requests a fresh copy of the file from the server.
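The lookup flow described above can be sketched as a tiny in-memory cache. This is a simplified illustration, not how any real browser implements its cache: the `TinyCache` class, the `fetch_from_origin` callback, and the explicit `now` argument are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Entry:
    body: bytes
    stored_at: float   # time the response was stored, in seconds
    max_age: float     # how long the copy stays fresh, in seconds


class TinyCache:
    """Hypothetical in-memory cache keyed by URL."""

    def __init__(self, fetch_from_origin: Callable[[str], Tuple[bytes, Optional[float]]]):
        # fetch_from_origin returns (body, max_age); max_age=None means "do not cache".
        self._fetch = fetch_from_origin
        self._entries: Dict[str, Entry] = {}

    def get(self, url: str, now: float) -> Tuple[bytes, str]:
        entry = self._entries.get(url)
        if entry is not None and now - entry.stored_at < entry.max_age:
            return entry.body, "cache"        # fresh copy: no network request
        body, max_age = self._fetch(url)      # missing or expired: ask the server
        if max_age is None:
            self._entries.pop(url, None)      # the server said not to cache it
        else:
            self._entries[url] = Entry(body, now, max_age)
        return body, "origin"
```

With a 90-second lifetime, a second call 30 seconds after the first is answered from the cache, while a call after 91 seconds goes back to the origin.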
Caching works differently depending on the type of cache being used. There are two main types of caches: browser caches and CDN caches.
Browser caches
Browser caches are local caches that web browsers use to store copies of files on the user's device. They follow the lookup flow described above: a fresh cached copy is served directly, while an expired or uncacheable file is requested again from the server.
CDN caches
CDN caches are distributed caches that are used by Content Delivery Networks (CDNs) to store copies of files. When a browser requests a file from a website that is using a CDN, the request is sent to the CDN instead of the origin server. If the CDN has a cached copy of the file, it serves it directly to the browser. This can greatly reduce the amount of time and resources needed to load the file, as the request doesn't need to travel all the way to the origin server.
CDN caches can be configured in a number of different ways, depending on the needs of the website. Some CDNs use a "pull" model, where the CDN only caches files when they are requested by a browser. Other CDNs use a "push" model, where the origin server sends files to the CDN proactively before they are requested by a browser.
What are HTTP cache headers?
HTTP cache headers are instructions that web servers send to web browsers and other caches, telling them how to cache and serve content. Servers send them in HTTP responses, and clients can include related headers in requests (for example, to revalidate a cached copy). They can be used to control whether a file may be cached, how long the cache should keep it, and what should be done once it has expired.
HTTP cache headers are important because they help reduce the amount of time and resources needed to load a web page. By caching content, a browser can serve it more quickly without having to request it from the server every time a user visits the page. This can improve website performance, reduce server load, and improve the overall user experience.
Types of HTTP cache headers
Caches work with content mainly through freshness and validation. A fresh representation is available instantly from a cache, while a validated representation avoids sending the entire representation again if it hasn't changed. A response with no validator (e.g. an ETag or Last-Modified header) and no explicit freshness information will usually (but not always) be considered uncacheable. Let's shift our focus to the kind of headers you should be concerned about.
1. Cache-Control
Every resource can define its own caching policy via the Cache-Control HTTP header. Cache-Control directives control who caches the response, under what conditions and for how long.
The best requests are those that don't need to communicate with the server at all: a local copy of the response eliminates network latency as well as the data charges incurred by the transfer. The HTTP specification lets the server send several different Cache-Control directives that control how, and for how long, individual responses are cached by browsers and by intermediate caches such as a CDN.
Cache-Control: private, max-age=0, no-cache
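To make the structure of such a header value concrete, here is a minimal sketch that splits it into individual directives. The function name `parse_cache_control` is made up for this example, and the parser is deliberately naive: it splits on commas and would mishandle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; arguments may be quoted.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("private, max-age=0, no-cache"))
# {'private': None, 'max-age': '0', 'no-cache': None}
```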
These settings are referred to as response directives. They are as follows:
public vs private
A response that is marked public can be cached even in cases where it is associated with HTTP authentication or where the HTTP response status code is not normally cacheable. In most cases, marking a response public isn't necessary, since explicit caching information (e.g. max-age) shows that a response is cacheable anyway.
In contrast, a response marked private can be cached by the browser, but such responses are typically intended for a single user and therefore aren't cacheable by intermediate caches (e.g. HTML pages with private user info can be cached by a user's browser but not by a CDN).
no-cache and no-store
no-cache indicates that a returned response can't be used for subsequent requests to the same URL without first checking with the server whether the response has changed. As a result, no-cache incurs a roundtrip to validate the cached response, but if a proper ETag (validation token) is present, the cache can skip the download when the resource hasn't changed. In other words, web browsers may cache the assets, but they have to check on every request whether the assets have changed (the server answers with a 304 response if nothing has changed).
In contrast, no-store is simpler: it disallows browsers and all intermediate caches from storing any version of the returned response. It is suited to responses containing private/personal information or banking data. Every time a user requests the asset, the request is sent to the server and the asset is downloaded in full.
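The storage rules for private, no-cache, and no-store can be summarized in a short sketch. The function names are hypothetical, and the logic is intentionally reduced to the directives discussed here; a real cache also looks at the status code, the request method, authentication, and more.

```python
from typing import Set


def may_store(directives: Set[str], shared_cache: bool) -> bool:
    """Return True if a cache is allowed to keep a copy of the response."""
    if "no-store" in directives:
        return False      # neither browsers nor intermediaries may store it
    if "private" in directives and shared_cache:
        return False      # only the end user's browser may store it
    return True


def needs_check_before_reuse(directives: Set[str]) -> bool:
    """no-cache permits storing but requires a server check on every reuse."""
    return "no-cache" in directives


# A CDN (shared cache) must not keep a private response; a browser may.
print(may_store({"private", "max-age=0"}, shared_cache=True))    # False
print(may_store({"private", "max-age=0"}, shared_cache=False))   # True
```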
max-age
The max-age directive states the maximum amount of time, in seconds, that a fetched response may be reused, counted from the time the response was generated. For instance, max-age=90 indicates that an asset can be reused (remains fresh in the browser cache) for the next 90 seconds.
s-maxage
The "s-" stands for shared, as in shared cache. This directive is aimed explicitly at CDNs and other intermediary caches. For those shared caches, it overrides the max-age directive and the Expires header field when present. KeyCDN also obeys this directive.
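The precedence between s-maxage, max-age, and Expires can be written down as a small function. This is a sketch under simplifying assumptions: the names `freshness_lifetime` and `is_fresh` are invented here, directives arrive as an already parsed dictionary, and `expires_in` stands for the number of seconds between the response's Date and Expires headers.

```python
from typing import Dict, Optional


def freshness_lifetime(directives: Dict[str, Optional[str]],
                       shared_cache: bool,
                       expires_in: Optional[int] = None) -> Optional[int]:
    """Pick the lifetime in seconds, or None if no explicit information exists."""
    s_maxage = directives.get("s-maxage")
    if shared_cache and s_maxage is not None:
        return int(s_maxage)       # shared caches (e.g. a CDN) prefer s-maxage
    max_age = directives.get("max-age")
    if max_age is not None:
        return int(max_age)        # max-age takes precedence over Expires
    return expires_in


def is_fresh(age_seconds: float, lifetime: Optional[int]) -> bool:
    """A response is fresh while its age is below its lifetime."""
    return lifetime is not None and age_seconds < lifetime


headers = {"max-age": "90", "s-maxage": "600"}
print(freshness_lifetime(headers, shared_cache=True))    # 600
print(freshness_lifetime(headers, shared_cache=False))   # 90
```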
must-revalidate
The must-revalidate directive tells a cache that it must first revalidate an asset with the origin once it becomes stale. The asset must not be delivered to the client without an end-to-end revalidation. In short, stale assets must first be verified, and expired assets must not be used.
proxy-revalidate
The proxy-revalidate directive is the same as the must-revalidate directive, except that it only applies to shared caches such as proxies. It is useful when a proxy serves many users who need to be authenticated one by one. A response to an authenticated request can be stored in the user's cache without needing to be revalidated each time, as the user is known and has already been authenticated. However, proxy-revalidate still allows proxies to revalidate for new users that have not been authenticated yet.
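The difference between must-revalidate and proxy-revalidate comes down to which kind of cache is bound by the rule. A minimal sketch, using a hypothetical helper `may_serve_stale` and ignoring every other factor a real cache would weigh:

```python
from typing import Set


def may_serve_stale(directives: Set[str], shared_cache: bool) -> bool:
    """Return True if a stale copy may be used without revalidating first."""
    if "must-revalidate" in directives:
        return False      # applies to every cache, private or shared
    if "proxy-revalidate" in directives and shared_cache:
        return False      # applies to shared caches such as proxies only
    return True


print(may_serve_stale({"proxy-revalidate"}, shared_cache=True))    # False
print(may_serve_stale({"proxy-revalidate"}, shared_cache=False))   # True
```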
no-transform
The no-transform directive tells any intermediary, such as a proxy or cache server, not to make any modifications whatsoever to the original asset. The Content-Encoding, Content-Range, and Content-Type headers must remain unchanged. Without this directive, a non-transparent proxy may decide to modify assets in order to save space, which can cause serious problems when an asset must pass through the proxy yet remain identical to the original entity-body.
According to Google, the Cache-Control header is all that's needed to specify caching policies. Other methods are available, and we'll go over them in this article, but they are not required for optimal performance.
The Cache-Control header is defined as part of the HTTP/1.1 specification and supersedes previous headers (e.g. Expires) used to specify response caching policies. Cache-Control is supported by all modern browsers, so it is all we need.
2. Pragma
The old Pragma header was designed for several purposes, most of which are now covered by newer mechanisms. We are, however, most concerned with the Pragma: no-cache directive, which newer implementations interpret as Cache-Control: no-cache. You don't need to be concerned about this directive because it's a request header that will be ignored by KeyCDN's edge servers. It is nevertheless worth knowing about for overall understanding. Going forward, no new HTTP directives will be defined for Pragma.
3. Expires
A couple of years back, this was the main way of specifying when assets expire. Expires is simply a basic date-time stamp. It's fairly useful for old user agents which still roam uncharted territories. It is, however, important to note that the Cache-Control directives max-age and s-maxage still take precedence on most modern systems. It's nevertheless good practice to set matching values here for the sake of compatibility. It's also important to format the date properly, or the response might be treated as already expired.
Expires: Sun, 03 May 2015 23:02:37 GMT
To avoid breaking the specification, avoid setting the date value to more than a year in the future.
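Getting the date format right is easiest when a library produces it. The sketch below builds an Expires value with Python's standard `email.utils.format_datetime` and caps the lifetime at one year. The helper name `expires_header` and the choice of a 365-day cap are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

ONE_YEAR = 365 * 24 * 60 * 60   # seconds


def expires_header(now: datetime, lifetime_seconds: int) -> str:
    """Build an Expires value in the HTTP date format (always GMT).

    `now` must be a timezone-aware datetime in UTC.
    """
    lifetime_seconds = min(lifetime_seconds, ONE_YEAR)   # stay within one year
    return format_datetime(now + timedelta(seconds=lifetime_seconds), usegmt=True)


now = datetime(2015, 4, 27, 18, 54, 37, tzinfo=timezone.utc)
print("Expires:", expires_header(now, 90))
# Expires: Mon, 27 Apr 2015 18:56:07 GMT
```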
4. Validators
ETag
This type of validation token (the standard in HTTP/1.1):
- Is communicated via the ETag HTTP header (by the server).
- Enables efficient resource updates where no data is transferred if the resource doesn't change.
The following example illustrates this. 90 seconds after the initial fetch of an asset, the browser initiates a new request for the exact same asset. The browser looks up its local cache and finds the previously cached response but cannot use it because it has expired. At this point the browser would request the full content from the server. The problem is that if the resource hasn't changed, there is absolutely no reason to download the same asset that is already in the cache.
Validation tokens solve this problem. The edge server creates and returns arbitrary tokens, stored in the ETag header field, which are typically a hash or other fingerprint of the file's content. Clients don't need to know how the tokens are generated, but they need to send them to the server on subsequent requests. If the tokens are the same, the resource hasn't changed and the download can be skipped.
The web browser provides the ETag token automatically within the If-None-Match HTTP request header. The server then checks the token against the current asset. A 304 Not Modified response tells the browser that the asset in its cache hasn't changed, which allows the cached copy to be renewed for another 90 seconds. It's important to note that the asset doesn't need to be downloaded again, which saves bandwidth and time.
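The exchange described above can be sketched from the server's point of view. This is an illustrative example only: `make_etag` and `respond` are hypothetical helpers, the SHA-256 based token is just one possible fingerprint, and weak validators (the `W/` prefix) are not handled.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from a content hash (one common approach)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=90"}
    if if_none_match is not None:
        candidates = [token.strip() for token in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""     # unchanged: no body is sent
    return 200, headers, body


# First request: full download. Second request: token matches, so 304.
status, headers, _ = respond(b"body { color: red }", None)
status_again, _, _ = respond(b"body { color: red }", headers["ETag"])
print(status, status_again)   # 200 304
```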
How do web developers benefit from efficient revalidation?
Browsers do most, if not all, of the work for web developers. For instance, they automatically detect whether validation tokens have previously been specified, append them to outgoing requests, and update cache timestamps as required based on responses from servers. Web developers are therefore left with only one job: ensuring that servers provide the required ETag tokens. KeyCDN's edge servers fully support the ETag header.
Last-Modified
The Last-Modified header indicates the time a document last changed and is the most common validator. It can be seen as a legacy validator from the time of HTTP/1.0. When a cache stores an asset that includes a Last-Modified header, it can use it to ask the server whether the representation has changed since it was last seen. This is done using an If-Modified-Since request header field.
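A server-side check for If-Modified-Since might look like the following sketch. The function name `not_modified_since` is invented for this example, and it assumes the server knows the asset's modification time as a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def not_modified_since(last_modified: datetime, if_modified_since: Optional[str]) -> bool:
    """Return True if the server may answer with 304 Not Modified."""
    if if_modified_since is None:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False     # unparseable date: send the full response
    # HTTP dates have one-second resolution, so drop microseconds first.
    return last_modified.replace(microsecond=0) <= since


changed_at = datetime(2014, 12, 8, 19, 23, 51, tzinfo=timezone.utc)
print(not_modified_since(changed_at, "Mon, 08 Dec 2014 19:23:51 GMT"))   # True
print(not_modified_since(changed_at, "Sun, 07 Dec 2014 19:23:51 GMT"))   # False
```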
An HTTP/1.1 origin server should send both the ETag and the Last-Modified value. More details can be found in section 13.3.4 of RFC 2616.
KeyCDN example response header:
HTTP/1.1 200 OK
Server: keycdn-engine
Date: Mon, 27 Apr 2015 18:54:37 GMT
Content-Type: text/css
Content-Length: 44660
Connection: keep-alive
Vary: Accept-Encoding
Last-Modified: Mon, 08 Dec 2014 19:23:51 GMT
ETag: "5485fac7-ae74"
Cache-Control: max-age=533280
Expires: Sun, 03 May 2015 23:02:37 GMT
X-Cache: HIT
X-Edge-Location: defr
Access-Control-Allow-Origin: *
Accept-Ranges: bytes
You can check your HTTP Cache Headers using KeyCDN's HTTP Header Checker tool.
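In the example response, the Expires date and the max-age value describe the same moment, which is the "matching values" practice recommended in the Expires section. A quick way to confirm this is to compute the gap between the Date and Expires headers. The helper `seconds_between` is a hypothetical name used only in this sketch.

```python
from email.utils import parsedate_to_datetime


def seconds_between(date_header: str, expires_header: str) -> int:
    """Return the lifetime implied by the Date and Expires headers."""
    delta = parsedate_to_datetime(expires_header) - parsedate_to_datetime(date_header)
    return int(delta.total_seconds())


lifetime = seconds_between("Mon, 27 Apr 2015 18:54:37 GMT",
                           "Sun, 03 May 2015 23:02:37 GMT")
print(lifetime)   # 533280, the same value as max-age in the response above
```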
5. Extension Cache-Control directives
Apart from the well-known Cache-Control directives outlined in the first section of this article, there are also other directives that can be used as extensions to Cache-Control, resulting in a better user experience for your visitors.
immutable
No conditional revalidation will be triggered even if the user explicitly refreshes a page. The immutable directive tells the client that the response body will not change over time, so there is no need to check for updates as long as it is unexpired.
stale-while-revalidate
The stale-while-revalidate directive allows for a stale asset to be served while it is revalidated in the background.
A stale-while-revalidate value tells the cache how long, in seconds, it may keep delivering the stale asset while validating it in the background. An example of this would look like the following:
Cache-Control: max-age=2592000, stale-while-revalidate=86400
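With these values, an asset is fresh for 30 days and may then be served stale for one more day while the cache revalidates it. The sketch below classifies a cached asset by its age. The function `cache_state` and its return labels are made up for illustration.

```python
def cache_state(age: int, max_age: int, stale_while_revalidate: int = 0) -> str:
    """Classify a cached asset by its age; all values are in seconds."""
    if age < max_age:
        return "fresh"
    if age < max_age + stale_while_revalidate:
        return "stale, serve and revalidate in background"
    return "stale, must fetch before serving"


MAX_AGE = 2592000   # 30 days
SWR = 86400         # 1 day

print(cache_state(3600, MAX_AGE, SWR))             # fresh
print(cache_state(MAX_AGE + 60, MAX_AGE, SWR))     # stale, serve and revalidate in background
print(cache_state(MAX_AGE + SWR, MAX_AGE, SWR))    # stale, must fetch before serving
```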
Learn more about the stale-while-revalidate directive in our stale-while-revalidate and stale-if-error guide.
stale-if-error
The stale-if-error directive is very similar to the stale-while-revalidate directive in that it serves stale content once the max-age expires. However, stale-if-error only returns stale content if the origin server returns an error code (e.g. 500, 502, 503, or 504) when the cache attempts to revalidate the asset.
Therefore, instead of showing visitors an error page, stale content is delivered to them for a predefined period of time. The goal is that the error is resolved during this time so that the asset can be revalidated.
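The decision a cache makes under stale-if-error can be sketched as follows. The function `choose_response`, its return labels, and the set of status codes treated as errors are assumptions for this example; the codes are the ones listed above.

```python
ERROR_STATUSES = {500, 502, 503, 504}


def choose_response(age: int, max_age: int, stale_if_error: int, origin_status: int) -> str:
    """Decide what a cache serves; all times are in seconds."""
    if age < max_age:
        return "cached (fresh)"
    # The copy is stale, so the cache has contacted the origin (origin_status).
    if origin_status in ERROR_STATUSES and age < max_age + stale_if_error:
        return "cached (stale, origin error)"
    return "origin response"


print(choose_response(700, 600, 86400, 503))   # cached (stale, origin error)
print(choose_response(700, 600, 86400, 200))   # origin response
```

Once the stale-if-error window has passed, the origin's error is passed through to the visitor.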
Learn more about the stale-if-error directive in our stale-while-revalidate and stale-if-error guide.
KeyCDN and HTTP cache headers
At KeyCDN, we understand the importance of HTTP cache headers and their role in optimizing website performance. KeyCDN allows you to define your HTTP cache headers as you see fit. The ability to set the Expire and Max Expire values directly within the dashboard makes it very easy to configure things on the CDN side.
Furthermore, if you would rather have even more control over your HTTP cache headers, you can disable the Ignore Cache Control feature in your Zone settings and have KeyCDN honor all of your cache headers from the origin. This is very useful in the event that you need to exclude a certain asset or group of assets from the CDN.
TL;DR
The Cache-Control header (in particular), along with the ETag header field, are modern mechanisms for controlling the freshness and validity of your assets. The other values are only used for backward compatibility.
Conclusion
HTTP cache headers are an important tool for improving website performance and reducing server load. By properly configuring cache headers, you can ensure that your files are cached and served efficiently, without sacrificing freshness or reliability. Remember to set appropriate cache control and expiration headers, consider using ETag headers, and test your headers to ensure that they are working correctly. By following these best practices, you can create a fast, reliable, and efficient website that delivers a great user experience.
Do you have any thoughts on using HTTP cache headers? If so, we would love to hear them below in the comments.