An LRU cache is one of those data structures that looks simple until you try to implement it cleanly.

When I am building a distributed system that I know is going to scale, I care a lot about making informed decisions early, especially when most of the system's scope is already clear.

Not every decision needs to be perfect on day one. That would be unrealistic. But some decisions set the shape of the system for a long time: how data moves, where state lives, what gets recomputed, what gets cached, and how much pressure each service can absorb before it starts pushing pain somewhere else.

In distributed systems, caching is not something you can avoid forever.

At small scale, you can sometimes get away with repeated database reads, repeated API calls, or recomputing the same values on demand. At larger scale, those choices start showing up as latency, cost, noisy dependencies, and unnecessary load on services that should not be doing the same work repeatedly.

What I am about to walk through is not a perfect solution I discovered the first time I built a cache.

It is a data-structure adaptation shaped by many rounds of trial and error: the kind where the first version works, the second version fails under a real access pattern, and the third version finally teaches you what the cache was supposed to protect in the first place.

The idea is easy:

Keep the most recently used items. When the cache is full, evict the least recently used item.

The usual recipe is also familiar:

  • a hash map for fast lookup
  • a doubly linked list for recency order

The hash map tells you where the item is. The linked list tells you which item is oldest. Every get or put moves an item to the front. When capacity is exceeded, remove from the back.

In many languages, you can implement this with pointers and move on.

Rust makes you slow down.

That is not a bad thing. It forces you to be explicit about ownership, mutation, and the shape of the references in your data structure. A doubly linked list is exactly the kind of structure Rust does not let you hand-wave.

So in this post, we will build a small LRU cache in Rust using a HashMap and an index-based doubly linked list.

It will not be the shortest implementation possible. That is not the goal.

The goal is to build something understandable, safe, and honest about the tradeoffs.

The shape we want

The cache should support:

let mut cache = LruCache::new(2);

cache.put("a", 1);
cache.put("b", 2);

assert_eq!(cache.get(&"a"), Some(&1));

cache.put("c", 3);

assert_eq!(cache.get(&"b"), None);

After get("a"), a becomes the most recently used item. When we insert c, the least recently used item is b, so b gets evicted.

The operations we care about are:

  • get: O(1)
  • put: O(1)
  • eviction: O(1)

To get there, we need two structures:

HashMap<K, usize>
Vec<Node<K, V>>

The HashMap maps keys to node indexes. The Vec stores nodes. Each node stores indexes for the previous and next nodes in the recency list.

Using indexes instead of pointers is the trick that keeps the implementation safe and relatively simple.

The node

Each node needs to know its key, value, and neighbors:

use std::collections::HashMap;
use std::hash::Hash;

#[derive(Debug)]
struct Node<K, V> {
    key: K,
    value: V,
    prev: Option<usize>,
    next: Option<usize>,
}

The cache itself tracks the list head and tail:

pub struct LruCache<K, V> {
    capacity: usize,
    map: HashMap<K, usize>,
    nodes: Vec<Option<Node<K, V>>>,
    head: Option<usize>,
    tail: Option<usize>,
    free: Vec<usize>,
}

There are a few details here.

First, nodes is a Vec<Option<Node<K, V>>>, not Vec<Node<K, V>>.

Why?

Because when we evict a node, we want to leave an empty slot behind and possibly reuse it later. None means "this slot is currently free."

Second, free stores reusable indexes.

That avoids growing the vector forever as cache entries churn.

Third, the list is represented by indexes:

head -> most recently used
tail -> least recently used

When an item is read or written, we move its node to the head.

When the cache is full, we evict the tail.

Creating the cache

The constructor is straightforward:

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            map: HashMap::new(),
            nodes: Vec::new(),
            head: None,
            tail: None,
            free: Vec::new(),
        }
    }
}

We require K: Clone because the key lives in two places:

  • inside the HashMap
  • inside the node, so eviction can remove the key from the map

That is a real tradeoff.

For many cache keys, cloning is acceptable. If keys are large or expensive to clone, you would design this differently. You might use Arc<K>, interning, or another ownership strategy. For this implementation, Clone keeps the code clear.
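
To make that alternative concrete, here is a sketch of what a key-sharing node could look like. The names are invented for illustration, and this is not part of the implementation below; it only shows where the clone would go away.

use std::sync::Arc;

// Sketch only: the node shares the key with the map instead of cloning it.
// The map side would become HashMap<Arc<K>, usize>, and lookups by &K still
// work because Arc<K> borrows as K. Cloning an Arc is a reference-count bump,
// not a deep copy of the key.
struct SharedKeyNode<K, V> {
    key: Arc<K>,
    value: V,
    prev: Option<usize>,
    next: Option<usize>,
}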

Linking and unlinking nodes

Most of the cache is list bookkeeping.

We need a helper to detach a node from wherever it currently lives:

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    fn detach(&mut self, index: usize) {
        let (prev, next) = {
            let node = self.nodes[index].as_ref().expect("node should exist");
            (node.prev, node.next)
        };

        match prev {
            Some(prev_index) => {
                self.nodes[prev_index].as_mut().unwrap().next = next;
            }
            None => {
                self.head = next;
            }
        }

        match next {
            Some(next_index) => {
                self.nodes[next_index].as_mut().unwrap().prev = prev;
            }
            None => {
                self.tail = prev;
            }
        }

        let node = self.nodes[index].as_mut().unwrap();
        node.prev = None;
        node.next = None;
    }
}

And another helper to attach a node at the front:

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    fn attach_front(&mut self, index: usize) {
        {
            let node = self.nodes[index].as_mut().expect("node should exist");
            node.prev = None;
            node.next = self.head;
        }

        if let Some(old_head) = self.head {
            self.nodes[old_head].as_mut().unwrap().prev = Some(index);
        } else {
            self.tail = Some(index);
        }

        self.head = Some(index);
    }

    fn move_to_front(&mut self, index: usize) {
        if self.head == Some(index) {
            return;
        }

        self.detach(index);
        self.attach_front(index);
    }
}

This is the part of the implementation where Rust is doing useful work.

The compiler will not let us casually hold multiple mutable references into the vector at the same time. So the code pulls out the neighbor indexes first, releases that borrow, then mutates the neighboring nodes.

It is a little more verbose than pointer-heavy code.

It is also harder to accidentally create a dangling pointer.
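
For the curious, this is roughly the version the borrow checker rejects. It is kept in a comment because it does not build; detach_naive is an invented name for the direct pointer-style translation.

// fn detach_naive(&mut self, index: usize) {
//     let node = self.nodes[index].as_mut().unwrap();
//
//     if let Some(prev_index) = node.prev {
//         // rejected: a second mutable borrow of `self.nodes` while `node`
//         // still borrows it (and `node` is used again below)
//         self.nodes[prev_index].as_mut().unwrap().next = node.next;
//     }
//
//     if let Some(next_index) = node.next {
//         self.nodes[next_index].as_mut().unwrap().prev = node.prev;
//     }
// }
//
// Copying (prev, next) into locals first ends the borrow of `node` before the
// neighbors are touched, which is what detach does above.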

Reading from the cache

On get, we need to:

  1. find the node index
  2. move it to the front
  3. return a reference to the value

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    pub fn get(&mut self, key: &K) -> Option<&V> {
        let index = *self.map.get(key)?;

        self.move_to_front(index);

        self.nodes[index]
            .as_ref()
            .map(|node| &node.value)
    }
}

Notice that get takes &mut self.

That surprises some people at first. We are "only reading," right?

Not really.

An LRU cache mutates on read because reading updates recency. A cache lookup is also a write to the cache's internal bookkeeping.

That is one of those small API details that tells you whether someone has actually implemented the structure before.
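
If you want a read that leaves recency alone, it has to be a separate method. Here is a sketch of a hypothetical peek, not part of the implementation above, which can take &self precisely because it does no bookkeeping:

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    // Hypothetical addition: look up a value without promoting it, so the
    // borrow can stay shared.
    pub fn peek(&self, key: &K) -> Option<&V> {
        let index = *self.map.get(key)?;
        self.nodes[index].as_ref().map(|node| &node.value)
    }
}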

Writing into the cache

On put, there are two cases.

If the key already exists, update the value and move the node to the front.

If it does not exist, insert a new node. If the cache grows beyond capacity, evict the tail.

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    pub fn put(&mut self, key: K, value: V) {
        if self.capacity == 0 {
            return;
        }

        if let Some(&index) = self.map.get(&key) {
            self.nodes[index].as_mut().unwrap().value = value;
            self.move_to_front(index);
            return;
        }

        let index = self.allocate_node(Node {
            key: key.clone(),
            value,
            prev: None,
            next: None,
        });

        self.map.insert(key, index);
        self.attach_front(index);

        if self.map.len() > self.capacity {
            self.evict_lru();
        }
    }
}

The allocation helper reuses a free slot when possible:

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    fn allocate_node(&mut self, node: Node<K, V>) -> usize {
        if let Some(index) = self.free.pop() {
            self.nodes[index] = Some(node);
            index
        } else {
            self.nodes.push(Some(node));
            self.nodes.len() - 1
        }
    }
}

And eviction removes the tail:

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    fn evict_lru(&mut self) {
        let Some(index) = self.tail else {
            return;
        };

        self.detach(index);

        let node = self.nodes[index]
            .take()
            .expect("node should exist");

        self.map.remove(&node.key);
        self.free.push(index);
    }
}

That is the whole cache.

Not a toy in the sense of being fake, but still small enough to reason about.

A couple of quick tests

Here is the behavior we expect:

#[test]
fn evicts_the_least_recently_used_item() {
    let mut cache = LruCache::new(2);

    cache.put("a", 1);
    cache.put("b", 2);

    assert_eq!(cache.get(&"a"), Some(&1));

    cache.put("c", 3);

    assert_eq!(cache.get(&"b"), None);
    assert_eq!(cache.get(&"a"), Some(&1));
    assert_eq!(cache.get(&"c"), Some(&3));
}

#[test]
fn updating_existing_key_keeps_capacity_stable() {
    let mut cache = LruCache::new(2);

    cache.put("a", 1);
    cache.put("a", 10);
    cache.put("b", 2);
    cache.put("c", 3);

    assert_eq!(cache.get(&"a"), None);
    assert_eq!(cache.get(&"b"), Some(&2));
    assert_eq!(cache.get(&"c"), Some(&3));
}

That second test is worth reading carefully.

Updating a moves it to the front. Then inserting b makes b newer than a. Inserting c evicts a.

LRU behavior is easy to get slightly wrong when updates and reads both affect recency.
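
One way to make those mistakes visible is a helper that exposes the exact recency order. This is a sketch for tests, not part of the cache API above:

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    // Test helper sketch: keys from most recently used to least recently used.
    fn recency_order(&self) -> Vec<&K> {
        let mut order = Vec::new();
        let mut current = self.head;

        while let Some(index) = current {
            let node = self.nodes[index].as_ref().expect("linked slot should be occupied");
            order.push(&node.key);
            current = node.next;
        }

        order
    }
}

With that in place, the second test could assert that the order is c then b right before its final reads, instead of inferring it from which get calls return None.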

Why not use Rc<RefCell<Node>>?

You can build an LRU cache with Rc<RefCell<Node>>.

It may even look closer to the classic linked-list explanation.

But I do not love it as the first Rust version because it teaches a habit I would rather avoid: using runtime borrow checks to escape a design problem too early.

RefCell is useful. Rc is useful. But for this structure, indexes give us enough flexibility without moving borrow safety from compile time to runtime.

The index-based approach also makes it easier to inspect and test the structure. There are no reference cycles to think about. There is no weak pointer dance. Eviction is just removing a node from a vector slot and deleting the map entry.
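
For contrast, the pointer-based node would look something like this. It is a sketch of the shape this post avoids, not code we will use:

use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Every traversal goes through borrow() / borrow_mut(), which are checked at
// runtime, and prev must be Weak so the forward and backward links do not
// form a cycle that keeps the list alive forever.
struct RcNode<K, V> {
    key: K,
    value: V,
    prev: Option<Weak<RefCell<RcNode<K, V>>>>,
    next: Option<Rc<RefCell<RcNode<K, V>>>>,
}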

The tradeoff is that we now own index validity.

That is why nodes uses Option<Node<K, V>>, why evicted slots are tracked in free, and why helper methods keep all list mutation in one place.
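
Owning that validity is easier if you can assert it. Here is a sketch of a test-only consistency check, not part of the cache above: walking the recency list must visit exactly map.len() nodes, and each node's key must map back to its own index.

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    // Test-only sketch: panic if the map and the recency list disagree.
    fn check_invariants(&self) {
        let mut visited = 0;
        let mut current = self.head;

        while let Some(index) = current {
            let node = self.nodes[index].as_ref().expect("listed slot must be occupied");
            assert_eq!(self.map.get(&node.key), Some(&index));
            visited += 1;
            current = node.next;
        }

        assert_eq!(visited, self.map.len());
    }
}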

What I would do differently in production

For production, I would first ask whether I should write this at all.

Rust already has mature LRU cache crates. If the cache is not the product, using a well-tested crate is usually the responsible choice.

If I did need a custom implementation, I would think about:

  • metrics for hit rate, miss rate, and evictions
  • memory limits, not just item count
  • entry weight, because not all values cost the same
  • TTL or time-aware invalidation
  • concurrency, probably by wrapping the cache rather than baking locks into it
  • fallible value creation, so callers can use something like get_or_insert_with (sketched after this list)
  • whether cloning keys is acceptable
  • whether indexes need generation counters to protect against stale references in more complex APIs
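
To make that get_or_insert_with item concrete, here is a sketch of a fallible version. The signature is an assumption, not an established API, but it shows how the existing get and put compose: compute the value only on a miss, and either way the entry ends up most recently used.

impl<K, V> LruCache<K, V>
where
    K: Eq + Hash + Clone,
{
    // Sketch: insert the computed value on a miss, then read it back.
    // Returns Ok(None) only when capacity is zero and nothing can be stored.
    pub fn try_get_or_insert_with<F, E>(&mut self, key: K, make: F) -> Result<Option<&V>, E>
    where
        F: FnOnce() -> Result<V, E>,
    {
        if !self.map.contains_key(&key) {
            // The closure runs only on a miss; an error leaves the cache untouched.
            self.put(key.clone(), make()?);
        }

        Ok(self.get(&key))
    }
}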

This implementation is intentionally narrower.

It is a good way to understand the mechanics. It is not a full cache subsystem.

That distinction is important.

The real lesson

An LRU cache is a nice example because the algorithm is simple but the implementation exposes your language choices.

In Rust, the hard part is not knowing that you need a hash map and a linked list.

The hard part is deciding how to represent the linked list without fighting the ownership model or reaching for unsafe code too quickly.

Indexes are not glamorous. They are not the cleverest solution. But they give you a practical middle ground:

  • O(1) lookup
  • O(1) recency updates
  • O(1) eviction
  • no unsafe code
  • no reference-counted cycles
  • no runtime borrow panics

That is the kind of tradeoff I like.

Not because it is perfect, but because it is explicit.

Closing thought

Data structures are not only about Big O notation.

They are also about the constraints of the language you are implementing them in.

Rust makes some structures harder to write casually. But that friction is often useful. It pushes you to choose ownership deliberately instead of hiding it behind references that may or may not still be valid.

An LRU cache is small enough to build in an afternoon.

It is also rich enough to remind you that "simple" and "obvious" are not the same thing.