How Does a CDN Work?

We've previously discussed what a CDN is in another article. In this CDN tutorial, we'll dive a little deeper into how a CDN works.
Why speed matters
It's important to provide a fast web browsing experience to site visitors.
40% of people abandon a website that takes more than 3 seconds to load.
Even when users don't leave a site, they are less likely to make a purchase or engage with the site if the user experience is poor.
For those running an ecommerce website, using a CDN (a WooCommerce CDN combination, for example) can help improve website performance and thus increase revenue.
There are lots of reasons why a site might perform badly. Some of the responsibility lies in the hands of the web developer. If they use code that is too verbose or not optimized, the result is large files that take longer to download. This increases the website load time for the user. Complex JavaScript can make the browser unresponsive and bloated media files can be slow to download.
Network related speed issues
But not all of the blame belongs to the web developer. Often, the Internet itself is to blame.
A website visitor may be a great distance from the web server, in which case the connection is subject to what is called fibre optic latency. The Internet is an international network, so a visitor could be in a different country, thousands of miles away from a web server. This distance has a big impact on how quickly the user receives a web page.
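To put a rough number on the effect of distance, here is a minimal Python sketch. It assumes light travels through optical fibre at about 200,000 km/s (roughly two thirds of its speed in a vacuum). The function name and the sample distances are illustrative only. Real round trips are slower because of routing, queuing, and processing at each hop.

```python
# Rough lower bound on round-trip time imposed by distance alone.
# Assumption: light travels through optical fibre at roughly 200,000 km/s.
SPEED_IN_FIBRE_KM_PER_S = 200_000


def min_round_trip_ms(distance_km: float) -> float:
    """Return the theoretical minimum round-trip time in milliseconds."""
    one_way_seconds = distance_km / SPEED_IN_FIBRE_KM_PER_S
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    # Illustrative distances between a visitor and a web server.
    for km in (100, 1_000, 10_000):
        print(f"{km:>6} km -> at least {min_round_trip_ms(km):.1f} ms round trip")
```

Under this assumption, no amount of server tuning can bring the round trip below this bound, which is why moving content closer to the visitor matters.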
When a user sends a request to a web server, data is sent both ways. The web browser sends a request for content in a packet which is sent to the web server. The web server responds with the data in one or more packets. In each case, the data must travel across the Internet to reach its destination.
The Internet is a network of networks. First, the request packet is sent to the ISP's network. The ISP connects to a larger network, which connects to other similarly sized networks. As the packets of data from the user's computer and the web server navigate the Internet to reach their destination, they must hop from network to network, passing through bridges, routers, and gateways.
If the server and user are close together, there are fewer hops, and the connection is faster. However, if the server and user are not close to each other, the distance each content packet has to travel is greater, resulting in slower page load speeds. At this point you may be asking yourself whether something exists that reduces the number of hops and thus decreases a web page's load time. The answer is yes: this is the job of a CDN.
What is a CDN?
If you're unfamiliar with the term CDN, it stands for content delivery network. A CDN is essentially a group of servers that are strategically placed across the globe with the purpose of accelerating the delivery of your static web content. Wikipedia defines a CDN as follows:
A content delivery network or content distribution network (CDN) is a globally distributed network of proxy servers deployed in multiple data centers.
CDNs are very useful for a multitude of reasons. For website owners who have visitors in multiple geographic locations, content will be delivered faster to these users as there is less distance to travel. CDN users also benefit from the ability to scale up and down much more easily in response to traffic spikes. On average, 80% of a website consists of static resources, so when using a CDN there is much less load on the origin server. KeyCDN users can also take advantage of faster speeds due to optimized HTTP/2 supported edge servers using a customized TCP stack.
Now that you've got a quick primer on what a CDN is, the sections below will discuss the different components involved in using a CDN and will answer the question "how does a CDN work" in terms of caching static assets to speed up content delivery.
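As a back-of-the-envelope illustration of the reduced origin load, consider the Python sketch below. It assumes the 80% figure applies to the share of requests, which the figure itself does not specify. The `hit_ratio` parameter is hypothetical and depends on traffic patterns and cache settings.

```python
def origin_requests(total_requests: int, static_share: float, hit_ratio: float) -> int:
    """Estimate how many requests still reach the origin server.

    static_share: fraction of requests that are for static assets
                  (assumed to be 0.8 here).
    hit_ratio:    fraction of static requests answered from the edge cache
                  (hypothetical value, not a measured one).
    """
    served_by_cdn = total_requests * static_share * hit_ratio
    return round(total_requests - served_by_cdn)


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 0.9, 1.0):
        remaining = origin_requests(1000, static_share=0.8, hit_ratio=ratio)
        print(f"hit ratio {ratio:.0%}: {remaining} of 1000 requests reach the origin")
```

Even with a perfect hit ratio, the dynamic share of requests still reaches the origin in this model.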
What is an origin server?
The first component involved in many cases when using a CDN Pull Zone is the origin server. The origin server is, in this case, the primary source of your website's data and where your website files are hosted. For example, if you are using DigitalOcean to host your site's files and have chosen a data centre in San Francisco, then your origin server would be based in San Francisco.
This means that without a CDN, all of your website visitors would need to request information from your server in San Francisco and would therefore receive all responses from your server in San Francisco. As we can see, being limited to one server to deliver files from can be quite inefficient as distance is a contributing factor to latency.
Being able to deliver parts of a website from various locations helps decrease the distance between the visitor and the web server, thus reducing latency. This is exactly what CDN edge servers achieve.
What are edge servers?
The question "how does a CDN work" cannot be answered without addressing what CDN edge servers are. Edge servers are the CDN servers used to cache content retrieved from your origin server or storage cluster. Another term closely related to edge server is point of presence (POP). A POP refers to the physical location where the edge servers are located. For example, KeyCDN's network currently has a wide range of POPs, one of which is located in Chicago, USA.
That POP can have multiple edge servers caching content at that location. Therefore, depending upon the number of edge servers a CDN has at one location, it may be necessary to prime the cache of more than one server. This is something CDN users should be aware of when testing their website's speed after integrating a CDN with their site. CDN caching behaviour is explained in further detail below.
CDN reverse proxy
A reverse proxy is a server that takes a client request and forwards it to the backend server. It is an intermediary server between the client and the origin server itself. A CDN reverse proxy takes this concept a step further by caching responses from the origin server that are on their way back to the client. Therefore, the CDN's servers are able to more quickly deliver assets to nearby visitors. This method is also desirable for reasons such as:
- Load balancing and scalability
- Increased website security
A CDN reverse proxy is used in the case of a Pull Zone. A complete explanation of Pull Zones as well as how CDN caching actually works is described in the sections below.
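The idea of a caching reverse proxy can be sketched in a few lines of Python. This is a toy in-memory model rather than how any particular CDN is implemented. The class name, `ORIGIN_FILES`, and `fetch_from_origin` are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple


class CachingReverseProxy:
    """Toy reverse proxy that caches origin responses in memory."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}

    def handle(self, path: str) -> Tuple[str, bytes]:
        """Return (cache status, body) for a requested path."""
        if path in self._cache:
            return "HIT", self._cache[path]
        body = self._fetch_from_origin(path)  # forward the request to the origin
        self._cache[path] = body              # keep a copy on the way back
        return "MISS", body


# Hypothetical origin: a dict standing in for the real web server.
ORIGIN_FILES = {"/style.css": b"body { margin: 0; }"}
origin_calls: List[str] = []


def fetch_from_origin(path: str) -> bytes:
    origin_calls.append(path)  # record every request that reaches the origin
    return ORIGIN_FILES[path]


proxy = CachingReverseProxy(fetch_from_origin)
first = proxy.handle("/style.css")
second = proxy.handle("/style.css")
print(first[0], second[0], "origin requests:", len(origin_calls))
```

Only the first request reaches the origin in this sketch. The second is answered from the proxy's own copy.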
How does CDN caching work?
Caching accounts for a major part of a CDN's functionality. Read our Cache Definition post to get a better understanding of what caching actually is and how it is beneficial. In the case of a CDN, the edge servers are where the data is cached and stored. Once you have integrated a CDN to work with your website, caching takes place as follows:
- A visitor in a particular location (e.g. Chicago) makes the first request for a static asset on your site (e.g. style.css).
- The asset is retrieved from your origin server and, upon being delivered, is cached on the KeyCDN Chicago edge server (i.e. the nearest KeyCDN POP based on that visitor's location).
- If the same visitor makes a request for the same asset again, the request goes to the CDN POP edge server(s) to check if the asset is already cached. If the request hits an edge server that already has the asset cached, the visitor receives a response from that edge server. On the other hand, if the request hits a different edge server which doesn't have the asset cached yet, the previous step is repeated for that edge server.
Once your static assets are cached on all the edge servers for a particular location, all subsequent visitor requests for static assets will be delivered from the edge servers instead of the origin, thus reducing origin load and improving scalability. To visualize this concept, see the image below which shows the difference between not using a CDN and using a CDN.
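The steps above, including the need to prime each edge server separately, can be simulated with a short Python sketch. It assumes requests are spread round-robin across the edge servers of a POP, which is a simplification. The `Pop` class and the choice of three edge servers are hypothetical.

```python
import itertools
from typing import List, Set


class Pop:
    """Toy POP whose edge servers each keep their own cache."""

    def __init__(self, name: str, edge_count: int) -> None:
        self.name = name
        self.edge_caches: List[Set[str]] = [set() for _ in range(edge_count)]
        # Assumption: requests are spread round-robin across edge servers.
        self._next_edge = itertools.cycle(range(edge_count))
        self.origin_fetches = 0

    def request(self, asset: str) -> str:
        """Serve an asset and report whether it came from cache."""
        cache = self.edge_caches[next(self._next_edge)]
        if asset in cache:
            return "HIT"
        self.origin_fetches += 1  # retrieve the asset from the origin server
        cache.add(asset)          # and cache it on this edge server
        return "MISS"


chicago = Pop("Chicago", edge_count=3)
statuses = [chicago.request("style.css") for _ in range(6)]
print(statuses)
print("origin fetches:", chicago.origin_fetches)
```

In this model the origin is contacted once per edge server. After that, every request for the asset is served from cache.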
Upon initially setting up your CDN, you can verify if your assets are being delivered from an edge server by checking an asset's response header. To do this, open up Chrome Developer tools and navigate to the Network tab. From here, select a static asset and check the X-Cache response header.
Alternatively, you can check this from the command line using curl:
curl -I https://www.keycdn.com/img/example.jpg
HTTP/2 200
server: keycdn-engine
date: Wed, 15 Jun 2022 06:43:47 GMT
content-type: image/jpeg
content-length: 195025
last-modified: Thu, 16 Jul 2020 07:06:27 GMT
vary: Accept-Encoding
etag: "5f0ffc73-2f9d1"
expires: Wed, 22 Jun 2022 06:43:47 GMT
cache-control: max-age=604800
strict-transport-security: max-age=31536000; includeSubdomains; preload
content-security-policy: default-src 'self' 'unsafe-inline' 'unsafe-eval' https: data:
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
referrer-policy: no-referrer-when-downgrade
x-cache: HIT
x-edge-location: chzh
access-control-allow-origin: *
accept-ranges: bytes
If you see that the X-Cache value is HIT, that means that the asset was delivered from the CDN edge server's cache. Otherwise, if the asset was not delivered from cache, you will see X-Cache: MISS, meaning that the asset was retrieved from your origin server.
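The same check can be scripted. The Python sketch below uses only the standard library. `cache_status` and `fetch_cache_status` are hypothetical helper names, and the sample dictionary simply reuses two of the headers from the curl output above so that the example runs without network access.

```python
import urllib.request
from typing import Mapping


def cache_status(headers: Mapping[str, str]) -> str:
    """Return the X-Cache value from a header mapping, or 'UNKNOWN' if absent."""
    for name, value in headers.items():
        if name.lower() == "x-cache":  # header names are case-insensitive
            return value.strip().upper()
    return "UNKNOWN"


def fetch_cache_status(url: str) -> str:
    """Send a HEAD request (like `curl -I`) and report the X-Cache header."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return cache_status(dict(response.headers.items()))


# Offline check using two of the headers from the curl output above.
sample_headers = {"server": "keycdn-engine", "x-cache": "HIT"}
print(cache_status(sample_headers))
```

Calling `fetch_cache_status` twice on a newly deployed asset is one way to watch a MISS turn into a HIT. Keep in mind that a POP with several edge servers may return more than one MISS.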
To learn more about CDN caching read our Exploring the Ins and Outs of CDN Cache article.
Traceroute example
Now that you know more about how a CDN handles caching, we'll take a look at the difference between the number of hops required when using a CDN and when not. Traceroute is a tool that tells you how many hops between networks were needed in order to deliver the piece of content you requested. It also shows you the amount of time it takes for each hop. The more hops needed to complete a request, the longer it may take to deliver the content to the user's browser.
As mentioned, when a website is using a CDN it minimizes the number of hops required to deliver the data to a user's browser due to the POPs that are located near the user. For this example, let's assume someone from Singapore would like to access the KeyCDN website. Using the traceroute tool it's easy to visualize the number of hops required when a website is not delivered from the CDN:
Singapore, SG - Without a CDN (14 hops)
1.|-- 128.199.191.254            0.0%     4    1.1   0.7   0.6   1.1   0.0
2.|-- 103.253.144.233            0.0%     4    0.7   0.6   0.5   0.7   0.0
3.|-- 103.253.144.249            0.0%     4    0.5   0.6   0.5   0.6   0.0
4.|-- 116.51.27.189              0.0%     4    1.5   1.7   1.5   2.2   0.0
5.|-- 129.250.4.73               0.0%     4    1.3  10.3   1.3  32.0  14.7
6.|-- 129.250.7.64               0.0%     4  192.8 186.6 182.7 192.8   4.8
7.|-- 129.250.3.138              0.0%     4  198.7 196.7 195.2 198.7   1.5
8.|-- 129.250.6.217              0.0%     4  198.6 198.9 197.8 200.9   1.2
9.|-- 213.198.72.230             0.0%     4  197.9 199.6 197.9 204.1   2.9
10.|-- 213.239.245.2              0.0%     4  194.3 195.1 193.8 197.7   1.5
11.|-- 213.239.245.17             0.0%     4  208.4 205.5 204.1 208.4   1.9
12.|-- 213.239.203.182            0.0%     4  200.4 199.6 198.6 200.4   0.6
13.|-- 213.239.203.186           25.0%     4  201.7 200.1 199.2 201.7   1.2
14.|-- 136.243.1.25              25.0%     4  204.3 203.8 202.6 204.6   1.0
Here is an example of the same website, this time delivered by the CDN:
Singapore, SG - With a CDN (6 hops)
1.|-- 128.199.191.253            0.0%     4    1.2   0.7   0.5   1.2   0.0
2.|-- 103.253.144.237            0.0%     4    0.5   0.6   0.5   0.7   0.0
3.|-- 202.79.197.69              0.0%     4    1.3   1.2   1.0   1.3   0.0
4.|-- 50.97.18.199               0.0%     4    2.3   2.4   2.3   2.5   0.0
5.|-- 174.133.118.133            0.0%     4    4.5   3.0   2.4   4.5   1.0
6.|-- 119.81.66.229              0.0%     4    2.3   2.2   2.2   2.3   0.0
The number of hops drops from 14 to 6 when using a CDN, which helps shorten page load time for users who are far from the origin server. There is also a dramatic decrease in latency when using a CDN compared to not using a CDN. In the traceroute without a CDN, the latency jumps to 192.8 ms at hop 6, reaches 198.7 ms at hop 7, and stays at around 200 ms for the remaining hops. With a CDN, the maximum latency in this example occurs at hop 5 and is only 4.5 ms.
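A comparison like this can be automated. The Python sketch below parses the two reports shown above. It assumes the columns follow the usual mtr report layout (loss, packets sent, then last, average, best, worst, and standard deviation in ms), which the output itself does not label. The function names are illustrative.

```python
from typing import List, Tuple

WITHOUT_CDN = """
1.|-- 128.199.191.254 0.0% 4 1.1 0.7 0.6 1.1 0.0
2.|-- 103.253.144.233 0.0% 4 0.7 0.6 0.5 0.7 0.0
3.|-- 103.253.144.249 0.0% 4 0.5 0.6 0.5 0.6 0.0
4.|-- 116.51.27.189 0.0% 4 1.5 1.7 1.5 2.2 0.0
5.|-- 129.250.4.73 0.0% 4 1.3 10.3 1.3 32.0 14.7
6.|-- 129.250.7.64 0.0% 4 192.8 186.6 182.7 192.8 4.8
7.|-- 129.250.3.138 0.0% 4 198.7 196.7 195.2 198.7 1.5
8.|-- 129.250.6.217 0.0% 4 198.6 198.9 197.8 200.9 1.2
9.|-- 213.198.72.230 0.0% 4 197.9 199.6 197.9 204.1 2.9
10.|-- 213.239.245.2 0.0% 4 194.3 195.1 193.8 197.7 1.5
11.|-- 213.239.245.17 0.0% 4 208.4 205.5 204.1 208.4 1.9
12.|-- 213.239.203.182 0.0% 4 200.4 199.6 198.6 200.4 0.6
13.|-- 213.239.203.186 25.0% 4 201.7 200.1 199.2 201.7 1.2
14.|-- 136.243.1.25 25.0% 4 204.3 203.8 202.6 204.6 1.0
"""

WITH_CDN = """
1.|-- 128.199.191.253 0.0% 4 1.2 0.7 0.5 1.2 0.0
2.|-- 103.253.144.237 0.0% 4 0.5 0.6 0.5 0.7 0.0
3.|-- 202.79.197.69 0.0% 4 1.3 1.2 1.0 1.3 0.0
4.|-- 50.97.18.199 0.0% 4 2.3 2.4 2.3 2.5 0.0
5.|-- 174.133.118.133 0.0% 4 4.5 3.0 2.4 4.5 1.0
6.|-- 119.81.66.229 0.0% 4 2.3 2.2 2.2 2.3 0.0
"""


def parse_hops(report: str) -> List[Tuple[int, str, float]]:
    """Return (hop number, host, last latency in ms) for each report line."""
    hops = []
    for line in report.strip().splitlines():
        fields = line.split()
        hop_number = int(fields[0].split(".")[0])  # "10.|--" -> 10
        last_ms = float(fields[4])                 # assumed "Last" column
        hops.append((hop_number, fields[1], last_ms))
    return hops


def summarize(report: str) -> Tuple[int, int, float]:
    """Return (hop count, hop with the highest latency, that latency in ms)."""
    hops = parse_hops(report)
    worst = max(hops, key=lambda hop: hop[2])
    return len(hops), worst[0], worst[2]


for label, report in (("without CDN", WITHOUT_CDN), ("with CDN", WITH_CDN)):
    count, worst_hop, worst_ms = summarize(report)
    print(f"{label}: {count} hops, highest latency {worst_ms} ms at hop {worst_hop}")
```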
Differences between Pull and Push Zones
With KeyCDN, you can choose between a Pull Zone and a Push Zone. Which type of Zone you need depends on your requirements; however, in most cases a Pull Zone is the desired choice.
A Pull Zone will pull files from an existing website without having to upload data manually.
A Push Zone requires data to be uploaded to the CDN storage cloud. It is typically recommended for distributing larger files (for example, files larger than 10 MB) and is required for files larger than 100 MB.
A Pull Zone is the most common method to use as many website owners already have an existing web host that is storing all of their files. As is shown by the Pull Zone image, the content gets pulled automatically from the origin server to the POPs and gets delivered to the visitor via the edge servers. In order to use a Pull Zone with KeyCDN, you can simply enter your FQDN (fully qualified domain name) into the Origin URL setting of your Zone and the CDN will automatically pull the content.
A Push Zone is a bit different as there is no existing source where the content is stored. With this method, you must upload your data manually to the KeyCDN storage cloud. From here, content is distributed to the CDN POPs in a similar way as a Pull Zone but optimized for larger files.
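The size guidance above can be expressed as a small decision helper. This Python sketch only encodes the 10 MB and 100 MB thresholds mentioned in this section. The function name and return strings are hypothetical, and other requirements may also influence the choice.

```python
RECOMMENDED_PUSH_MB = 10   # Push Zone is typically recommended above this size
REQUIRED_PUSH_MB = 100     # Push Zone is required above this size


def suggest_zone(largest_file_mb: float) -> str:
    """Suggest a Zone type based only on the largest file to be delivered."""
    if largest_file_mb > REQUIRED_PUSH_MB:
        return "push (required)"
    if largest_file_mb > RECOMMENDED_PUSH_MB:
        return "push (recommended)"
    return "pull"


if __name__ == "__main__":
    # Illustrative file sizes in MB.
    for size in (0.2, 50, 500):
        print(f"{size} MB -> {suggest_zone(size)}")
```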
For a full guide on how to set up a CDN, read our Complete CDN migration article.
Not all CDNs are created equal
The questions of what a CDN is and how it works should now be a little clearer. However, if you are starting to investigate the CDN market, it should be noted that not all CDNs are created equal. CDN architecture, such as edge server capabilities and locations, will likely vary based on the provider. Before selecting a content delivery network provider, the website owner should first determine where their site visitors are coming from. Based on that information, one CDN may have better performance than another.
CDNs may also differ in the performance of their edge servers. Certain CDNs aim to deploy large numbers of low-performing POPs in particular locations, while others aim to use a smaller number of strategically placed, high-performing POPs. HTTP/2 support, SSL support, and customizability are a few things that should also be taken into consideration when selecting a CDN.
Summary
Hopefully, this CDN tutorial helps clear up the questions of "what is a CDN" and "how does a CDN work?". Having a solid understanding of caching and how a CDN can help boost the delivery speed and scalability of your website will assist with both visitor retention and return. Depending upon where your visitors are located relative to your origin server, a CDN can have a major effect on speed.
We've performed a few tests to compare the latency with and without a CDN and saw (on average) a decrease in latency of 83%. Besides improving the speed of a website, a CDN is also useful for other reasons, such as SEO and security. For more information, read our 7 Reasons You Should Use a Content Distribution Network article.