146 LRU Cache

Design and implement a data structure for a Least Recently Used (LRU) cache. It should support the following operations: get and set.

get(key) - Get the value (will always be positive) of the key if the key exists in the cache; otherwise return -1. set(key, value) - Set or insert the value if the key is not already present. When the cache reaches its capacity, it should invalidate the least recently used item before inserting the new item.

The Idea:

  • Maintain a hash table that maps each key to its value and a list iterator. The iterator essentially acts as an index into the list that stores the LRU history. More on this below.

Set:

  • Find the key. If it is not found, we have to insert it, and there are two cases. In the first, the cache is full, so we evict the LRU element (which will always be the last element in the list) by erasing it from the cache, and also erasing it from the LRU list. Next, we add the new key to the front of the list and insert into the cache the (key, value) pair along with the iterator, which at that moment points to the front of the list. As the list grows, this element drifts back toward the end of the list, and the iterator stored in the hash table moves along with it, so we can locate the element right away.

  • In the second case, the key has not been found and the cache has not yet reached capacity. Here all we have to do is insert the new element into the cache and push it to the front of the list.

  • The final case is that the key already exists in the hash table. Here we are responsible for making sure that the LRU list is properly updated (update the LRU history) and that the value gets updated (though we can be lazy and overwrite the entire cache entry).

Get:

  • Find the key. If it doesn't exist, then return -1.

  • Otherwise, the key exists. We must update its location in the LRU history (update LRU history), and then return the value stored for that key.

Update LRU History:

  • Erase the node that the stored iterator points to, in O(1), and push the key to the front of the list. Finally, point the cache entry's iterator at the beginning of the list, so it follows this element that has just been most recently accessed.
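Before the full C++ implementation, the whole get/set/update flow above can be sketched compactly in Python with collections.OrderedDict, whose move_to_end and popitem(last=False) give the same O(1) "move to front" and "evict oldest" operations (the LRUSketch name is mine, not part of the solution below):

```python
from collections import OrderedDict

class LRUSketch:
    def __init__(self, capacity):
        self.cap = capacity
        self.data = OrderedDict()  # keys ordered oldest -> newest

    def get(self, key):
        if key not in self.data:
            return -1
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)  # update LRU history
        self.data[key] = value
        if len(self.data) > self.cap:
            self.data.popitem(last=False)  # evict least recently used
```

This mirrors the three set cases: existing key (move + overwrite), full cache (insert then evict the oldest), and the simple insert.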

#include <list>
#include <unordered_map>
#include <utility>
using namespace std;

class LRUCache {
public:
    LRUCache(const size_t capacity) : cap(capacity) {
        cache.reserve(capacity);
    }

    int get(int key)
    {
        auto found = cache.find(key);
        if (found == cache.end())
            return -1;

        update_LRU_history(found);
        return found->second.first;
    }

    void set(int key, int value) 
    {
        auto found = cache.find(key);

        if (found != cache.end()) {
            update_LRU_history(found);
            found->second.first = value;  // iterator already refreshed above
            return;
        }

        else if (LRU_history.size() >= cap) {
            cache.erase(LRU_history.back());
            LRU_history.pop_back();
            LRU_history.push_front(key);
            cache.insert({key, make_pair(value, LRU_history.begin())});
        }

        else {
            LRU_history.push_front(key);
            cache.insert({ key, make_pair(value, LRU_history.begin()) });
        }
    }

private:
    using value_pointer = pair<int, list<int>::iterator>;
    unordered_map<int, value_pointer> cache;
    list<int> LRU_history;

    const size_t cap;

    void update_LRU_history(unordered_map<int, value_pointer>::iterator it)
    {
        LRU_history.erase(it->second.second);
        LRU_history.push_front(it->first);
        it->second.second = LRU_history.begin();
    }
};

Python

The Idea: Use a doubly-linked list, along with a hash table, to micromanage LRU updates. The doubly-linked list combined with the hash table allows O(1) random access with O(1) removals. The doubly-linked list is essentially used as a container that holds the history of cache updates through time. On every put and get operation, the DLL updates: new elements go to the back of the list, and older elements sit toward the front. Any node can be identified in O(1) time through the key -> Node pairs in the hash table. Once we have the address of the node, we can update its position (in our case, moving it to the end) in O(1) time.

When the capacity of the cache is full, then we retrieve the oldest element of the cache by popping the front element of the DLL, and then deleting this corresponding key from the hash table.

Complexity: O(1) time per operation and O(n) space (the hash table and the list each store n entries)
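To make the O(1) relink concrete before the full code, here is a minimal standalone sketch of unlinking a node found via the hash table and re-appending it at the back (the _Node class and the index dict are illustrative names, not part of the solution below):

```python
class _Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None

# build the list a <-> b <-> c with a hash-table index over it
index = {k: _Node(k) for k in 'abc'}
index['a'].right, index['b'].left = index['b'], index['a']
index['b'].right, index['c'].left = index['c'], index['b']

# move 'b' to the back in O(1): look it up, unlink, re-append
b = index['b']
b.left.right, b.right.left = b.right, b.left   # unlink: a <-> c
tail = index['c']
tail.right, b.left, b.right = b, tail, None    # append: a <-> c <-> b
```

No traversal happens at any point: the hash table hands us the node, and the relink touches only its two neighbors.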

class Node:
    def __init__(self, key, val):
        self.key = key
        self.val = val
        self.left = None
        self.right = None


class DLL:
    def __init__(self):
        self.front = None
        self.tail = None

    def add_back(self, key, val):
        """
        :param key: int
        :param val: int - construct a new Node at the back of the list
        :return: address of the new node at the end of the list
        """
        newNode = Node(key, val)

        # empty list
        if not self.front:
            self.front = self.tail = newNode
        else:
            tmp = self.tail
            self.tail.right = newNode
            self.tail = newNode
            newNode.left = tmp
        return self.tail

    def move_back(self, node):
        """
        :param node: Node - address of the node to move to the back of the list
        :return: new address of the node
        """

        # append a copy of the node at the end of the list; the old
        # node is unlinked below and reclaimed by garbage collection
        # once no references to it remain
        new_addr = self.add_back(node.key, node.val)
        if node.left:
            node.left.right = node.right
            node.right.left = node.left
        # if there is no left, then node is the front
        else:
            self.front = self.front.right
            self.front.left = None
        return new_addr

    def pop_front(self):
        if not self.front:
            raise IndexError('List is empty')
        key, val = self.front.key, self.front.val
        self.front = self.front.right
        if self.front:
            self.front.left = None
        else:
            self.tail = None
        return key, val

    def printDLL(self):
        cur = self.front
        print('')
        while cur:
            print('(%i, %i) -> ' % (cur.key, cur.val), end='')
            cur = cur.right
        print('')

class LRUCache(object):

    def __init__(self, capacity):
        """
        :type capacity: int
        """
        self.history = DLL()
        self.db = {}
        self.cap = capacity

    def get(self, key):
        """
        :type key: int
        :rtype: int
        """

        if key in self.db:
            self.db[key] = self.history.move_back(self.db[key])
            return self.db[key].val
        return -1

    def put(self, key, value):
        """
        :type key: int
        :type value: int
        :rtype: void
        """
        # if the element doesn't exist, put it into the cache
        if key not in self.db:
            self.db[key] = self.history.add_back(key, value)
        # in the db already, just update position and value
        else:
            self.db[key].val = value
            self.db[key] = self.history.move_back(self.db[key])

        # then take care of the issue if capacity is full
        if len(self.db) > self.cap:
            LRU_key, _ = self.history.pop_front()
            del self.db[LRU_key]

Limited Testing

cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))
cache.put(3, 3)
print(cache.get(2))
cache.put(4, 4)
print(cache.get(1))
print(cache.get(3))
print(cache.get(4))



cache4 = LRUCache(3)
cache4.put(1, 1)
cache4.put(2, 2)
cache4.put(3, 3)
cache4.put(4, 4)
print(cache4.get(4))
print(cache4.get(3))
print(cache4.get(2))
print(cache4.get(1))
cache4.put(5, 5)
print(cache4.get(1))
print(cache4.get(2))
print(cache4.get(3))
print(cache4.get(4))
print(cache4.get(5))


cache3 = LRUCache(2)
cache3.put(2, 1)
cache3.put(2, 2)
print(cache3.get(2))
cache3.put(1, 1)
cache3.put(4, 1)
print(cache3.get(2))


cache1 = LRUCache(1)
cache1.put(2, 1)
print(cache1.get(2))
cache1.put(3, 2)
print(cache1.get(2))
print(cache1.get(3))
