Systems

LRU Cache

Discard the least-recently-used — O(1) get and put with hash + linked list

An LRU (Least Recently Used) cache holds a fixed number of items, evicting the one accessed longest ago when adding a new item past capacity. Implemented with a hash map plus a doubly linked list, both get and put run in O(1). Used by every web browser, every CPU's cache hierarchy, every database's buffer pool, and Redis as a memory-management policy.

  • Get / putO(1)
  • Eviction policyLeast Recently Used
  • ImplementationHash map + doubly linked list
  • Memory overhead~3 pointers + key + value per entry
  • Hit ratioWorkload-dependent — typically 70-95% on web
  • VariantsLFU, ARC, 2Q, CLOCK, W-TinyLFU

Interactive visualization

Press play, or step through manually. The visualization is yours to drive — try it before reading on.

Open visualization fullscreen ↗

Watch the 60-second explainer

A condensed visual walkthrough — narrated, captioned, under a minute.

How an LRU cache works

Two data structures cooperating:

  1. Hash map: key → linked list node. O(1) lookup.
  2. Doubly linked list: ordered most-recently-used (head) to least-recently-used (tail). O(1) move-to-head and remove-tail.

Operations:

  • get(key): hash-lookup. On hit, move the node to the head, return its value. On miss, return null.
  • put(key, value): if key exists, update value, move node to head. Otherwise, if at capacity, remove tail (evict LRU). Insert new node at head and store in hash map.

Both operations are O(1) — no scans, no comparisons against other entries. The only catch is keeping the linked list and hash map synchronized; bugs in synchronization cause subtle leaks and incorrect evictions.

JavaScript implementation (using Map's insertion-order property)

class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map(); // Map preserves insertion order — front to back
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    // Move to "most recent" by deleting + re-setting
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  put(key, value) {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.capacity) {
      // Evict the least recently used (front of insertion order)
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey);
    }
    this.map.set(key, value);
  }
}

const cache = new LRUCache(3);
cache.put('a', 1);  // [a]
cache.put('b', 2);  // [a, b]
cache.put('c', 3);  // [a, b, c]
cache.get('a');     // [b, c, a]
cache.put('d', 4);  // [c, a, d] — b evicted
cache.get('b');     // undefined

Map's key order is insertion order — re-setting moves a key to the back. This makes the implementation 20 lines instead of 60. For very high-throughput cases (millions of ops/sec), an explicit linked list is slightly faster due to lower allocation overhead.

Explicit hash + doubly linked list

class Node {
  constructor(key, value) {
    this.key = key; this.value = value;
    this.prev = null; this.next = null;
  }
}

class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
    // Sentinel nodes simplify edge cases (empty list, head/tail mutations)
    this.head = new Node(null, null);
    this.tail = new Node(null, null);
    this.head.next = this.tail;
    this.tail.prev = this.head;
  }

  _addToFront(node) {
    node.prev = this.head;
    node.next = this.head.next;
    this.head.next.prev = node;
    this.head.next = node;
  }

  _remove(node) {
    node.prev.next = node.next;
    node.next.prev = node.prev;
  }

  get(key) {
    const node = this.map.get(key);
    if (!node) return undefined;
    this._remove(node);
    this._addToFront(node);
    return node.value;
  }

  put(key, value) {
    let node = this.map.get(key);
    if (node) {
      node.value = value;
      this._remove(node);
      this._addToFront(node);
      return;
    }
    if (this.map.size === this.capacity) {
      const lru = this.tail.prev;
      this._remove(lru);
      this.map.delete(lru.key);
    }
    node = new Node(key, value);
    this._addToFront(node);
    this.map.set(key, node);
  }
}

Python implementation

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key):
        if key not in self.cache: return None
        self.cache.move_to_end(key)
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        elif len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)  # evict oldest
        self.cache[key] = value

# Or use functools.lru_cache for memoization with LRU eviction:
from functools import lru_cache

@lru_cache(maxsize=128)
def expensive(x):
    # Compute something expensive
    return x * x

LRU variants and alternatives

PolicyEviction ruleBest for
LRULeast recently usedWorkloads with temporal locality (default)
LFULeast frequently usedFrequency-driven access patterns
FIFOOldest insertion (regardless of access)Hardware-cheap, lower hit rate
RandomRandom victimAdversarial workloads, no locality
2QTwo LRU lists — recent vs frequentScan-resistant; database buffer pools
ARC (Adaptive Replacement Cache)Self-tuning between recency and frequencyMixed workloads; ZFS, IBM DB2
CLOCKApproximate LRU using a single bit per entryOS page caches; cheaper than exact LRU
W-TinyLFUWindow-based frequency + admission controlModern caches; Caffeine (Java), Ristretto (Go)

For new caches, W-TinyLFU is the modern best-in-class — better hit rate than LRU on most real workloads. ARC is patent-encumbered for some uses; CLOCK is the standard in OS kernels.

When LRU is the right choice

  • Web browsers and CDN caches. Recent URLs are likely to be re-fetched. LRU works well.
  • Database buffer pools. Recently-read pages are likely to be re-read. Many DBs use LRU variants (2Q, CLOCK, ARC).
  • Function-result memoization. Python's @lru_cache, JS WeakMap-based memo wrappers — recent inputs are likely to recur.
  • Object pools and connection pools. Reuse recently-released resources before allocating new ones.
  • Image / asset caches in mobile apps. Recently-viewed images stay; old ones evict to make room.

Skip LRU when the workload has no temporal locality (random access patterns), when the eviction cost is high (each evicted page must be flushed to disk — write-back caches need careful policy), or when you can fit everything in memory (no eviction needed).

Common LRU bugs

  • Hash map and linked list out of sync. Adding to one without the other corrupts the cache silently. Encapsulate both inside the cache class; never expose the internal data structures.
  • Not moving on get. If get doesn't update the access order, the policy degenerates to FIFO. Always move-to-head on hit.
  • Not handling capacity 0. A 0-capacity cache should never store anything; some implementations throw on put or silently never evict. Decide and document.
  • Stale references after eviction. If callers hold references to evicted values, weak references or explicit invalidation are needed. In garbage-collected languages, holding strong references prevents memory release.
  • Concurrent access without locking. Hash map + linked list mutations aren't atomic. Concurrent get + put can corrupt the structure. Use a lock per operation, a concurrent-safe library (Java ConcurrentLinkedHashMap, Caffeine), or a different policy that's easier to make concurrent.
  • Pretending it's O(1) when it's not. Some implementations use a singly linked list, making move-to-head O(n) (must scan from head to find predecessor). Always use a doubly linked list, or pay the asymptotic cost.

Frequently asked questions

Why is LRU implemented with a hash map AND a linked list?

The hash map gives O(1) lookup by key. The doubly linked list maintains access order — the most recently used at the head, least recently used at the tail. On a hit, we move the entry to the head in O(1) (only with a doubly linked list — singly linked needs O(n) to find the predecessor). On eviction, we remove the tail in O(1). Either structure alone can't do both operations in O(1); together they can.

Why is LRU usually a good policy?

Real workloads exhibit temporal locality — recently accessed items are likely to be accessed again soon. LRU exploits this by keeping recent items at the cost of evicting old ones. For workloads without locality (one-shot scans of large data), LRU is no better than random — sometimes worse. CLOCK and W-TinyLFU are designed for those cases.

When does LRU perform badly?

On scan-heavy workloads — sequentially reading a large dataset that doesn't fit in cache. Each access misses; the scan evicts everything useful. After the scan completes, the cache is empty of the originally-cached working set. Solution — segregate scans (don't pollute the LRU) or use a scan-resistant policy like 2Q or ARC.

Can LRU be made approximate for hardware caches?

Yes — exact LRU is too expensive in hardware (per-access bookkeeping). CPUs use pseudo-LRU (PLRU) or NRU (Not Recently Used) with single-bit per cache line. Approximate but cheap. Real CPU L1/L2 caches use tree-PLRU or similar; only software caches typically maintain exact LRU.

What's LFU and how does it differ?

LFU (Least Frequently Used) evicts the item with the lowest access count, not the oldest. Better for workloads with frequency-driven access (one item accessed many times stays even if not recently). Worse on freshness — old popular items never get evicted. Hybrid policies (W-TinyLFU, CLOCK-Pro) combine the two — frequency for popular items, recency for the rest.

How do databases use LRU caches?

Buffer pool management. PostgreSQL's shared buffers, MySQL InnoDB's buffer pool, MongoDB's WiredTiger cache — all hold recently-accessed disk pages in memory using LRU or a variant. A page hit avoids a disk read; the buffer pool's hit ratio directly determines query performance. Tuning buffer pool size to the working set is the most-impactful database knob.

How do you size an LRU cache?

Big enough to hold the working set (set of items accessed in a typical window). Use observability — measure hit ratio at various sizes; the curve is usually concave (diminishing returns). Sweet spot is where hit ratio plateaus. Web caches often plateau around 90-95% hit ratio at sizes 10-30% of the unique URL count.