A few years back, I came across a new e-commerce website while browsing the internet. I noticed that the images on the website didn’t load quickly. I refreshed the page and retried a couple of times. Finally, after ten seconds, I was able to see the rendered webpage with images.
Initially, I thought my internet connection was poor, but the download speed was good enough; I was even able to watch YouTube videos in HD. This made me curious about why the e-commerce website couldn’t load images quickly.
The curious engineer inside me decided to research the reason for the website’s slow performance. I quickly opened Google Chrome’s developer tools and navigated to the Network tab for analysis. After going through a couple of Stack Overflow posts, I concluded that the site didn’t use a Content Delivery Network (CDN).
In this article, we will understand what a CDN is, why it is needed, and how it works. We will also see how websites can accelerate their content delivery by harnessing a CDN.
The poor performance of the website didn’t create a good first impression on me. And due to its slow performance, I decided that I wouldn’t shop on the website again.
In today’s world, the performance of a website is of prime importance. It’s difficult to retain users if your website suffers downtime or loads slowly. This applies to all businesses. Some websites have even had to shut down because their competitors delivered webpages faster.
Before getting into the details of what a Content Delivery Network (CDN) is, we will refresh our fundamentals. Let’s understand how webpages are displayed on our devices.
The above diagram gives an overview of how a Client requests a page and finally shows it to the users. The details are as follows:
- The Client (Mobile App/Browser) sends an HTTP request to the Web Server.
- The Server processes the request, performs checks and validations, and fetches data from a database, hard disk or blob storage. It then constructs a response.
- The Server sends the response, generally an HTML page, back to the Client, which reads it.
- Finally, the HTML page is displayed to the users.
The HTML page can also contain images, GIFs, videos, etc. So, along with the document, the Client has the additional responsibility of fetching and displaying this data.
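The request and response steps above can be sketched with Python’s standard library. This is only an illustration, not how any particular website is built: the page body `PAGE`, the handler `PageHandler` and the helper `fetch_page` are hypothetical names, and the server runs on the local machine instead of a remote host.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page body; a real server would build it from a database or disk.
PAGE = b"<html><body><h1>Welcome</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Server side: process the request and construct a response.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # Silence per-request logging for this demo.


def fetch_page():
    # Port 0 asks the operating system for any free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: send an HTTP GET request and read the response.
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        result = (response.status, response.getheader("Content-Type"), response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_page()` returns the status code, the content type and the HTML bytes. A browser would go one step further and issue additional requests for every image or video referenced in that HTML.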
In case you select
We can classify the content on any web page into two types: static and dynamic.
Static content, such as images, videos, CSS and JavaScript files, rarely changes. Even if these files change, we won’t show incorrect data to the users. Only the user experience or the look and feel of the website would change.
In some cases, such files are persisted on the server’s file system. In other words, the Web server fetches these files from the hard disk and sends them back to the clients. Often, these files are persisted in blob storage such as Amazon S3 or Azure Blob Storage.
The size of the static data can be in KBs, MBs or GBs. Movie files are large in size and consume significant bandwidth.
Data that changes frequently is dynamic. Examples include the number of viewers watching a video on YouTube, or the comments, likes and shares on social media websites.
Generally, the server stores dynamic data in a database. Depending on the use case, it could be SQL or NoSQL. For every request, the server queries this data and then passes it back in the response. In most cases, JSON is used for data serialization.
The size of dynamic data is small in comparison to static data like movies, videos or images. It’s on the order of a few KBs. The server can also store this data in an external cache such as Redis or Memcached for efficiency.
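As a rough sketch of how a server might return dynamic data, the snippet below reads the current values on every request and serializes them as JSON. The dictionary `VIDEO_STATS` is a hypothetical in-memory stand-in for a real database, and the function names are made up for this example.

```python
import json

# Hypothetical in-memory stand-in for a database table of video statistics.
VIDEO_STATS = {"v42": {"viewers": 1280, "likes": 310, "comments": 45}}


def get_video_stats_json(video_id):
    # Look the values up on every request, because they change often.
    stats = VIDEO_STATS.get(video_id)
    if stats is None:
        return json.dumps({"error": "not found"})
    return json.dumps({"video_id": video_id, **stats})


def record_view(video_id):
    # A new viewer changes the data, so the next response will differ.
    VIDEO_STATS[video_id]["viewers"] += 1
```

The resulting payload is well under a kilobyte, which is tiny next to an image or a video file.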
Latency is the time taken for a website to completely render all the data. As latency increases, users have to wait longer. The longer users have to wait, the lower the conversion rate.
Websites that render pages with lower latency are more performant. These websites show the page to the user within a few milliseconds.
Latency depends on multiple factors, including the following:
- Distance between user and the server
- Server’s processing time
- Time taken to retrieve data from the database
If the website’s server crashes, the client won’t be able to view the web page. The server should also be able to handle increasing load.
If the website isn’t scalable, the server’s processing time grows under load, which increases the latency. In today’s world, downtime isn’t tolerated.
Since websites are global, they are expected to run 24/7, 365 days a year. Users want to watch videos, shop online, message their friends, etc., seamlessly.
Let’s assume that we are starting a new short video app like TikTok. We build the first version of the website and deploy it in Los Angeles, United States. Our website is accessible all over the globe and slowly starts gaining traction.
We notice that we are receiving traffic from Europe, the Middle East and India. As the traffic grows, we scale horizontally and add more servers. However, users from India complain that they experience longer load times. Users from the USA don’t face the same issue. Why does this happen?
As seen above, we have our server deployed in LA, USA. For users in India, a network packet has to travel a much larger distance than for users in the USA. As a simplification, assume the time taken grows in proportion to the distance. If the time taken to fetch data in the USA is 5 milliseconds and the distance is 7x, it would take about 35 milliseconds for Indian users.
The same applies to users in Europe. The website load time in the EU would be more than the load time in the USA: with a distance of roughly 3.5x, it would be close to 17.5 milliseconds.
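To see why distance matters, here is a back-of-the-envelope sketch of propagation delay. It assumes that a signal in optical fiber covers roughly 200 km per millisecond (about two thirds of the speed of light in vacuum) and it ignores routing detours, queuing and server time. The distances are hypothetical and chosen only to reproduce the 7x ratio used above.

```python
# Assumption: light in optical fiber covers roughly 200,000 km per second,
# i.e. about 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km):
    """Lower bound on the round-trip time from propagation delay alone."""
    return 2 * distance_km / FIBER_KM_PER_MS


# Hypothetical cable distances, chosen so the far user is 7x farther away.
near_km = 2_000
far_km = 7 * near_km

print(round_trip_ms(near_km))  # 20.0
print(round_trip_ms(far_km))   # 140.0
```

Loading a page usually needs several round trips (connection setup, the HTML document, then each image or video), so this per-trip gap is paid many times over.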
Moreover, the video files on our servers won’t change often. If a video goes viral, the same video file would be accessed by users all over the globe. The problem gets worse when the video file is large, as network bandwidth becomes a bottleneck.
The above problem would be solved if we reduced the distance between the client and the server. And since the video file doesn’t change often, we can use a caching mechanism.
With caching, we request the file from our server once, and subsequent requests are served by the cache. This reduces the overall load on the server.
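The effect of caching on server load can be sketched in a few lines. Here `fetch_from_origin` is a hypothetical stand-in for an expensive download from our server, and Python’s `functools.lru_cache` plays the role of the cache; a real CDN cache is a separate machine, not an in-process decorator.

```python
from functools import lru_cache

origin_fetches = 0  # Counts how often the (simulated) origin is actually hit.


def fetch_from_origin(path):
    # Hypothetical stand-in for an expensive download from the origin server.
    global origin_fetches
    origin_fetches += 1
    return f"bytes of {path}".encode()


@lru_cache(maxsize=128)
def fetch_cached(path):
    # The first call per path goes to the origin; repeats are served from memory.
    return fetch_from_origin(path)


# A viral video requested 1,000 times reaches the origin only once.
for _ in range(1000):
    fetch_cached("/videos/viral.mp4")
```

After the loop, `fetch_cached.cache_info()` reports one miss and 999 hits, which is exactly the load reduction described above.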
Content Delivery Networks (CDNs) apply this solution and accelerate content delivery for websites. Our main web server is called the Origin server. A CDN consists of a group of geographically distributed servers, also called Point of Presence (POP) servers. The locations where POP servers are present are called Edge locations.
The POP servers serve content to the users who reside in the same geographical region. For example, a POP server in Europe will serve data to European users. Indian users will be served by a POP server located in India.
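One simple way to picture "serve users from the closest POP" is to pick the POP with the smallest great-circle distance to the user. The sketch below does that with the haversine formula. The POP names and coordinates are hypothetical, and real CDNs typically rely on DNS-based or network-level routing that also considers load and network conditions, not geography alone.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical POP locations as (latitude, longitude) in degrees.
POPS = {
    "los-angeles": (34.05, -118.24),
    "frankfurt": (50.11, 8.68),
    "mumbai": (19.08, 72.88),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_pop(user_location):
    # Pick the POP whose location is closest to the user.
    return min(POPS, key=lambda name: haversine_km(user_location, POPS[name]))


print(nearest_pop((48.86, 2.35)))   # user near Paris -> frankfurt
print(nearest_pop((28.61, 77.21)))  # user near Delhi -> mumbai
```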
Let’s take the example of withdrawing cash from a bank. Imagine what would happen if there were no ATMs. There would be long queues outside the banks, and it would take us a long time to get money. Further, on bank holidays, people wouldn’t have the flexibility to withdraw cash.
ATMs at different places such as metro stations, restaurants and airports ensure that long queues don’t form at the banks. People can withdraw money at their convenience. Also, people can go to their nearest ATM instead of travelling to their bank, which could be much farther away.
You can think of the Banks as Origin servers and ATMs as CDNs. CDNs reside close to the user’s location. They are geographically distributed and reduce the load on the Origin server. Also, they improve the website’s availability by handling traffic for static content.
The below diagram shows, at a high level, what happens when you load a website in your browser:
- The browser sends a request to the DNS for an IP address lookup.
- The DNS server responds with the address of the closest CDN server.
- The browser sends a request to the CDN server (POP) to fetch the data.
- The CDN server checks whether the data (image, JS, CSS, video, etc.) exists in its cache.
- If the data doesn’t exist in the cache, the POP server sends a request to the Origin server to get the content.
- The POP server then stores the content and sends the data back to the user.
- CDN users can also set a TTL (Time to Live) on the content, for example a TTL of 15 minutes on an image file. The CDN server will send the same data to the users until it expires.
- Once a CDN server detects that the content is stale, it refetches it from the Origin server.
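The cache check, origin fetch and TTL expiry from the steps above can be sketched as a toy POP server. This is a simplified model under stated assumptions, not how any specific CDN is implemented: `PopServer`, `origin` and the fake clock are hypothetical names introduced for the example, and real CDNs also honor HTTP caching headers, eviction policies and revalidation.

```python
import time


class PopServer:
    """Toy POP: serves from its cache and refetches from the origin after the TTL."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # path -> (content, expires_at)

    def get(self, path):
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and now < entry[1]:
            return entry[0], "HIT"
        # Missing or stale: go to the origin, then cache with a fresh expiry.
        content = self._fetch(path)
        self._cache[path] = (content, now + self._ttl)
        return content, "MISS"


# Usage with a fake clock so the expiry is easy to follow.
fake_now = [0.0]
origin_calls = []


def origin(path):
    origin_calls.append(path)
    return f"content of {path}"


pop = PopServer(origin, ttl_seconds=15 * 60, clock=lambda: fake_now[0])
print(pop.get("/logo.png")[1])  # MISS (first request goes to the origin)
print(pop.get("/logo.png")[1])  # HIT  (served from the POP cache)
fake_now[0] += 15 * 60 + 1      # jump past the 15-minute TTL
print(pop.get("/logo.png")[1])  # MISS (stale, refetched from the origin)
```

Injecting the clock keeps the example deterministic; in production code the default `time.monotonic` would be used.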
The clients fetch the static data (images, videos, etc) from the CDN instead of the Origin server. CDN servers are geographically located close to the users. As the distance reduces, so does the time to fetch the data. This results in significant improvement in website load time.
Websites with faster load times improve the user experience. Users are more inclined toward products that are efficient and performant. For example, compare Google Chrome to Internet Explorer.
CDN servers take up much of the website’s load. As CDN servers act like caches, the Origin server is protected from traffic spikes. Since the Origin server deals with less load, it is less likely to go down.
The overall availability of the website improves. If there are regional traffic spikes, the CDN servers in the respective regions scale and handle the increased load.
A major expense for a website is the cost of bandwidth consumption. As CDNs handle most of the static traffic, there is a large reduction in the data that the Origin servers have to serve. This helps reduce the bandwidth cost for website owners.
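A quick calculation shows how the cache hit rate drives the amount of data the Origin server still has to send. The traffic volume and hit rate below are hypothetical. Note that CDN providers typically bill for the traffic they serve, so the net saving depends on the pricing of both sides.

```python
def origin_traffic_gb(total_traffic_gb, cache_hit_percent):
    """Traffic the origin still serves when the CDN answers the cache hits."""
    if not 0 <= cache_hit_percent <= 100:
        raise ValueError("cache_hit_percent must be between 0 and 100")
    return total_traffic_gb * (100 - cache_hit_percent) / 100


# Hypothetical numbers: 10,000 GB of monthly traffic, 90% served from the CDN cache.
print(origin_traffic_gb(10_000, 90))  # 1000.0
```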
- A Content Delivery Network consists of a set of servers located geographically close to the users to accelerate the delivery of website content.
- Browsers or Mobile Apps request static data from the CDN instead of the Origin server.
- The CDN servers fetch the data from the Origin servers and cache it. Subsequent requests are served from the CDN server.
- The primary benefits of a CDN are improved website performance, reduced bandwidth costs and improved availability of the website.