The 4 basic flows of HTTP Caching


2013-11-03


Every web developer has probably used a cache at some point in their web apps or APIs, whether to avoid redundant data traffic and network bottlenecks, to protect servers from load spikes, or simply to hide long network latencies. The concept of caching is usually well understood and easy to apply in practice thanks to open source tools. However, building a good cache strategy, i.e. a strategy that defines what can be cached, how long it can be cached and what policy to follow once a resource becomes stale, is a hard and incremental process that needs:

knowledge of how your resources are consumed by users;
understanding of how the HTTP caching protocol works (HTTP headers everywhere);
patience to solve problems when tools don’t honor the protocol (believe me, this is very common).

The first one is up to you. The third one is an inherent issue of every computer system, and you should be used to it by now. Time and experience will help you deal with this burden, and there are always mailing lists and web search to save the day.
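To make the three strategy questions a bit more concrete (what can be cached, for how long, and what happens once a resource is stale), here is a minimal sketch that turns such decisions into a `Cache-Control` header value. The function name `cache_control` and its parameters are hypothetical, made up for this illustration; only the directives themselves (`no-store`, `public`, `private`, `max-age`, `must-revalidate`) come from the HTTP specification.

```python
def cache_control(cacheable, max_age=0, shared=False, must_revalidate=False):
    """Build a Cache-Control value from a few strategy decisions.

    Hypothetical helper for illustration only.
    """
    if not cacheable:
        # What can be cached: nothing, so forbid storing the response at all.
        return "no-store"
    # public: shared caches (proxies, CDNs) may store it; private: only the browser.
    parts = ["public" if shared else "private"]
    # How long: number of seconds the response is considered fresh.
    parts.append(f"max-age={max_age}")
    if must_revalidate:
        # Policy once stale: never serve it again without asking the origin.
        parts.append("must-revalidate")
    return ", ".join(parts)


static_asset = cache_control(True, max_age=3600, shared=True)
user_page = cache_control(True, max_age=60, must_revalidate=True)
secret = cache_control(False)
```

A real strategy usually ends up with a handful of such profiles, one per kind of resource, refined as you learn how users consume them.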

This post is an illustrative contribution to item 2 and presents the four basic flows of HTTP caching that can be implicitly extracted from the RFC. These flows show how clients, servers and caching software behave depending on the status of the requested resource and its location in the topology. By “basic” I mean that they can vary depending on the cache strategy, so use them as a starting point. Let’s go!

Cache Miss

Cache Hit

Curious fact: this animation is half the length of the “Cache miss” one. 😉

Cache Revalidation (Condition False)

Cache Revalidation (Condition True)
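The four flows above can also be sketched as a tiny in-memory simulation. This is only an illustration under stated assumptions, not how any real caching software is implemented: the `Origin`, `Cache` and `Response` classes are hypothetical, freshness is reduced to a single `max_age` value, and the “condition” is assumed to be an `If-None-Match` precondition on an ETag. Under that assumption, “condition false” means the ETag still matches (the origin answers `304 Not Modified`), and “condition true” means the resource changed (the origin answers `200` with a new body).

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    body: str
    etag: str
    max_age: int  # seconds the response stays fresh (Cache-Control: max-age)


class Origin:
    """Hypothetical origin server holding a single resource."""

    def __init__(self, body):
        self.body = body
        self.version = 1

    def update(self, body):
        self.body = body
        self.version += 1

    def get(self, if_none_match=None):
        etag = f'"v{self.version}"'
        if if_none_match == etag:
            # Condition false: ETag matches, so no body is sent back.
            return Response(304, "", etag, max_age=60)
        # Unconditional request, or condition true: send the full resource.
        return Response(200, self.body, etag, max_age=60)


class Cache:
    """Hypothetical cache sitting between the client and the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.entry = None  # (stored response, time it was stored or revalidated)

    def get(self, now):
        if self.entry is None:
            # Cache miss: fetch from the origin and store the response.
            resp = self.origin.get()
            self.entry = (resp, now)
            return "miss", resp.body
        resp, stored_at = self.entry
        if now - stored_at < resp.max_age:
            # Cache hit: still fresh, the origin is never contacted.
            return "hit", resp.body
        # Stale: revalidate with a conditional request.
        fresh = self.origin.get(if_none_match=resp.etag)
        if fresh.status == 304:
            # Keep the stored body and restart its freshness lifetime.
            self.entry = (Response(200, resp.body, fresh.etag, fresh.max_age), now)
            return "revalidated (304)", resp.body
        # The resource changed: replace the stored copy.
        self.entry = (fresh, now)
        return "revalidated (200)", fresh.body


origin = Origin("hello")
cache = Cache(origin)
flows = [cache.get(now=0), cache.get(now=10), cache.get(now=70)]
origin.update("hello again")
flows.append(cache.get(now=140))
```

After running this, `flows` holds one entry per flow, in the same order as the sections above: a miss, a hit, a revalidation answered with 304, and a revalidation answered with a fresh 200.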

If you want to dig deeper into the subject, the content above was part of a talk I presented at Rubyconf Brazil 2013 about HTTP caching (intro and good practices). The slides are embedded below:

You might also like to explore the HTTP headers RFC.