Caching in Web Applications


Modern web applications consume large amounts of data, and it's not uncommon for a single page to generate as many as 50 SQL queries. WordPress is particularly known for this. Multiply that load by a growing number of visitors and you often end up with an overtaxed database, sluggish site performance, and a poor experience for users.

The rise of distributed memory caching systems has allowed web applications to substantially improve performance by storing frequently requested data in memory, which in turn reduces the number of database requests required. Read-heavy applications benefit the most from adding a cache layer.

Cache is faster than a database

In terms of performance, we're always waiting on one of four things: CPU, memory, network, or disk.

Caching systems store data in main memory, where access times are orders of magnitude faster than those of solid-state or rotational disk drives.

  • Main memory: 60-120ns (nanoseconds)
  • Solid-state disk: 10-100μs (microseconds)
  • Rotational disk: 5-10ms (milliseconds)

The cache acts as a temporary data-store and serves as the first lookup location by the application. Here's how it works:

Scenario 1: Cache Hit

Data is in the cache and isn't expired

  1. Application requests data from the cache.
  2. Cache returns the data to the application.

Over time, the cache hit ratio can climb north of 90%.

Scenario 2: Cache Miss

Data is not in the cache or is expired

  1. Application requests data from the cache.
  2. Cache doesn't have the requested data, so it returns null.
  3. Application requests and receives the data from the database.
  4. Application populates the cache with the new data.
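The two scenarios above can be sketched in a few lines of Python. This is a minimal illustration rather than production code: the cache is a plain dict, and query_database() is a hypothetical stand-in for a real SQL query.

```python
import time

cache = {}
hits = 0
misses = 0

def query_database(key):
    # Simulate a (comparatively slow) database lookup.
    time.sleep(0.005)
    return f"row-for-{key}"

def get(key):
    global hits, misses
    value = cache.get(key)
    if value is not None:
        # Scenario 1: cache hit -- return the data immediately.
        hits += 1
        return value
    # Scenario 2: cache miss -- fetch from the database,
    # then populate the cache for subsequent requests.
    misses += 1
    value = query_database(key)
    cache[key] = value
    return value

first = get("user:42")   # miss: goes to the database
second = get("user:42")  # hit: served from memory
```

The first call pays the database cost; every later call for the same key is answered from memory, which is how the hit ratio climbs as the cache warms up.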

Local Memory Caching

Of course, that's fine when we're running a standalone application on a single server. But what happens when the application is spread across multiple servers? A process on one server cannot access another server's local memory. The distributed nature of caching systems solves this: the cached data lives in a shared location that any application server can reach.

Distributed In-Memory Caches

Redis and Memcached are the most widely used distributed caching systems today. Both are open source and can be installed on any supported server. They serve as in-memory, key-value NoSQL data stores, although Redis is more accurately described as a data structure store.

What code changes are required?

  • Every write operation (SQL INSERT or UPDATE) is performed in both the database and the cache layer.
  • Every read operation is performed against the cache layer first, falling back to the database when the data is not found in the cache.
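A rough sketch of those two changes, again using plain dicts as hypothetical stand-ins for the database and the cache layer (save_post and load_post are illustrative names, not a real API):

```python
database = {}  # stand-in for the SQL database
cache = {}     # stand-in for the distributed cache

def save_post(post_id, body):
    # Write path: perform the write in both layers.
    database[post_id] = body  # the SQL INSERT/UPDATE
    cache[post_id] = body     # keep the cache layer consistent

def load_post(post_id):
    # Read path: check the cache first...
    if post_id in cache:
        return cache[post_id]
    # ...and fall back to the database on a miss,
    # repopulating the cache along the way.
    body = database[post_id]
    cache[post_id] = body
    return body

save_post("post:1", "Hello, world")
result = load_post("post:1")  # served from the cache
```

In a real application the dicts would be replaced by a SQL client and a Redis or Memcached client, but the shape of the read and write paths stays the same.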

Cache Expiration

If the data we are caching changes over time, we can set an expiration interval for the cached item. We could also manually clear the cache when a change occurs.
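Both approaches can be sketched with a dict cache that stores an expiry timestamp alongside each value. The TTL here is deliberately tiny so the example runs quickly, and cache_set, cache_get, and cache_delete are hypothetical names:

```python
import time

cache = {}  # key -> (value, expiry timestamp)

def cache_set(key, value, ttl=0.05):
    # Store the value with an expiration interval (ttl, in seconds).
    cache[key] = (value, time.monotonic() + ttl)

def cache_get(key):
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.monotonic() >= expires_at:
        # Expired: evict the stale entry and treat it as a miss.
        del cache[key]
        return None
    return value

def cache_delete(key):
    # Manual invalidation, for when the underlying data changes.
    cache.pop(key, None)

cache_set("greeting", "hello")
fresh = cache_get("greeting")   # still within the TTL
time.sleep(0.06)
stale = cache_get("greeting")   # past the TTL: treated as a miss
```

Redis and Memcached provide the same behavior natively: both accept an expiration time when a key is set, and both support explicit deletion for manual invalidation.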

Sensitive Data

Sensitive data such as passwords, social security numbers, and credit card numbers is best left out of the cache. Caching it creates an additional attack vector that could lead to the data being compromised.