Implement an LRU Cache in Go-Golang-php.cn

Home

Backend Development

Golang

Implement an LRU Cache in Go

Aug 05, 2024 pm 04:04 PM

Implement an LRU Cache in Go

So you need a small cache and can't justify a Redis or memcached instance. Let's see what it takes to implement one in Go. For fun, we'll make it using generics so it is reusable in our project.

An LRU cache generally has a fixed capacity and the simplest ejection policy: eject the element that has the longest time since it was accessed. A simple lru cache will implement the following interface:

type LRUCache[T any] interface {
    Get(key string) (value T, found bool)
    Put(key string, value T)
    Keys() []string
    Remove(key string) bool
    Clear()
    Capacity() int
    Len() int
}

Copy after login

We know the cache will store a data item as an entry that is keyed by some value. That sounds like a map. What about implementing the ejection policy? One way to do this is to keep a timeAccessed property along with each item. Something like:

type cacheEntry[T any] struct {
  Data T
  LastAccessed time.time
}

Copy after login

However, let's think about performance, we want to be able to search for the cache key as well as insert and evict the oldest, if necessary, as fast as possible.

Using a map, which is a hashtable, will give us pretty fast performance for lookups. What about finding the oldest entry? If your cache struct looks like:

type LRUCache[T any] {
  capacity int
  keyMap map[string]cacheEntry[T]
}

Copy after login

We will necessarily need to iterate over the map to find the oldest when it comes time to evict an entry.

We need a way to store entries in a way that will efficiently allow us to keep a list of cache entries sorted. It is preferable that we not need to use a sort routine.

A double linked list is a good way to do this and we don't need to store access time in the entry unless we actually want it. So let's suppose we have a linked list that implements the following along with its node struct:

type DoubleLinkedList[T any] interface {
    Head() *DoubleNode[T]
    Tail() *DoubleNode[T]
    // Append inserts new item at tail
    Append(data T) *DoubleNode[T]
    // Push appends new item at head
    Push(data T) *DoubleNode[T]
    Remove(node *DoubleNode[T]) *DoubleNode[T]
    RemoveTail() *DoubleNode[T]
    MoveToHead(node *DoubleNode[T])
}
type DoubleNode[T any] struct {
    Data T
    Prev *DoubleNode[T]
    Next *DoubleNode[T]
}

Copy after login

The cache struct can now look something like:

type lruCache[T any] struct {
    capacity int
    keyMap   map[string]*DoubleNode[lruCacheEntry[T]]
    list     DoubleLinkedList[lruCacheEntry[T]]
}

Copy after login

The cache entry struct will be:

type lruCacheEntry[T any] struct {
    key   string
    value T
}

Copy after login

Realistically, you would probably use an interface for the cache key. I am using a string to keep the code simple.

In the implementation here, the most recently accessed entry in the cache will be at the head and the least recently used will be at the tail. So, when it comes time to evict, we simply remove the tail element of the linked list.

Implementing the Get() function is simple:

func (l *lruCache[T]) Get(key string) (value T, found bool) {
    if node, ok := l.keyMap[key]; ok {
        l.list.MoveToHead(node)
        return node.Data.value, ok
    }
    var zero T
    return zero, false
}

Copy after login

Get just needs to retrieve the map entry for the key, then move the node to the head of the list since it it now the 'most recently used'.

The Put() function is where we will handle the eviction if it's necessary:

func (l *lruCache[T]) Put(key string, value T) {
    if node, ok := l.keyMap[key]; ok {
        node.Data = lruCacheEntry[T]{
            key:   key,
            value: value,
        }
        // move the element to the most recent position
        l.list.MoveToHead(node)
    } else {
        // insert the new element at the head
        newNode := l.list.Push(lruCacheEntry[T]{
            key:   key,
            value: value,
        })
        l.keyMap[key] = newNode
    }
    // is eviction necessary
    if len(l.keyMap) &gt; l.capacity {
        nodeRemoved := l.list.RemoveTail()
        delete(l.keyMap, nodeRemoved.Data.key)
    }
}

Copy after login

For Put(), we first check to see if there is already a value for the given key. If so, then update the value and move the node to the head of the list. Otherwise, we create a new cache entry, add it to the list as the head, and add it to our map.

Finally, don't forget to check for capacity. If the new entry puts us over the capacity, we evict the oldest entry which is the tail of the list and delete the entry from our map.

Note that storing the key as part of the cache entry allows us to rapidly delete the key from the map. If we had only stored the data in the cache entry, then we would need to iterate over the map to find it.

This cache is missing something critical for a multi-threaded app. There is no synchronization. Realistically, a cache would be accessed by multiple threads. Synchronization is a complex topic. For our implementation, we can add a mutex to the cache struct:

type lruCache[T any] struct {
    capacity int
    keyMap   map[string]DoubleNode[lruCacheEntry[T]]
    list     DoubleLinkedList[lruCacheEntry[T]]
    mutex    sync.RWMutex
}

Copy after login

then add the following at the start of each function.

    l.mutex.Lock()
    defer l.mutex.Unlock()

Copy after login

Notice that we are using a read/write lock. Some of the functions don't change the structure of the cache, so we can use the read lock method provided, for example the Len() function:

func (l *lruCache[T]) Len() int {
    l.mutex.RLock()
    defer l.mutex.RUnlock()
    return len(l.keyMap)
}

Copy after login

Note that the synchronization strategy chosen here may break down if there are a large number of threads trying to access the cache. It's a complex topic that could be a series of posts in itself.

See the full implementation in the repository given in the link below.

What would you do different to implement a cache? How would you address synchronization? I am interested in hearing your thoughts on this one. There's no single solution to this so put your comments below.

Thanks!

The code for this post and all posts in this series can be found here

The above is the detailed content of Implement an LRU Cache in Go. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn