Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations:getandset.
get(key)- Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
set(key, value)- Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.
The Idea:
Maintain a hash table that contains key value pairs and a list iterator. This essentially acts like an index that we can hash to it our list LRU array. More on this later.
Set:
Find the key. If it is not found, then we'd have to set it. We can run into two different cases: one, the cache is full, in which case we evict the LRU element (which will always be the last element in the list), by erasing it from the cache. We also, erase it from the LRU array. Next, we add the new _key _to the front of the list, as well as insert into the cache the key, value pair, and the iterator, which will always be in the front of the list. As the list grows, the former new element will move back toward the end of the list, and the iterator in the hash table will move along with it so we can identify it right away.
In another case, the element has not been found and the capacity of the cache has not been exceeded yet. In this case, all we have to do is insert the new element into the cache, as well as to the front of the list.
The final case is the that the key element already exists in the hash table. In this case, we would be responsible in doing is making sure that the LRU list is property updated (update LRU history), and that the value gets updated (but we can get lazy and update the entire cache line).
Get:
Find the key. If it doesn't exist, then return -1.
Otherwise, the key exists. We must update its location in the LRU history (Update LRU history), and return the value of that key that has been found.
Update LRU History:
Erase the old iterator pointer in the list O(1), and add to the front of the list the key. Finally, have the cache point to the beginning of the list, to have it follow this new element that has been most recently accessed.
class LRUCache {
public:
LRUCache(const size_t capacity) : cap(capacity) {
cache.reserve(capacity);
}
int get(int key)
{
auto found = cache.find(key);
if (found == cache.end())
return -1;
else {
update_LRU_history(found);
return found->second.first;
}
}
void set(int key, int value)
{
auto found = cache.find(key);
if (found != cache.end()) {
update_LRU_history(found);
cache[key] = { { value, LRU_history.begin() } };
return;
}
else if (LRU_history.size() >= cap) {
cache.erase(LRU_history.back());
LRU_history.pop_back();
LRU_history.push_front(key);
cache.insert({key, make_pair(value, LRU_history.begin())});
}
else {
LRU_history.push_front(key);
cache.insert({ key, make_pair(value, LRU_history.begin()) });
}
}
private:
using value_pointer = pair<int, list<int>::iterator>;
unordered_map<int, value_pointer> cache;
list<int> LRU_history;
const size_t cap;
void update_LRU_history(unordered_map<int, value_pointer>::iterator it)
{
LRU_history.erase(it->second.second);
LRU_history.push_front(it->first);
it->second.second = LRU_history.begin();
}
};
Python
The Idea: Use a doubly-linked list, along with a hash table to micromanage LRU updates. The doubly linked list combined with the hash table allows O(1) random access, with O(1) removals. The doubly link list is essentially used a container that holds the history of cache updates through time. For every put and get operation, the DLL will update. New elements get put to the back of the list, and older elements are put in front. Any address can be identified in O(1) time by mapping through thekey -> Nodepairs in the hash table. Once we have the address of the node, we can update its position (in our case, moving it to the end) in O(1) time.
When the capacity of the cache is full, then we retrieve the oldest element of the cache by popping the front element of the DLL, and then deleting this corresponding key from the hash table.
Complexity: O(1) time and O(2n) space
classNode:def__init__(self,key,val): self.key = key self.val = val self.left =None self.right =NoneclassDLL:def__init__(self): self.front =None self.tail =Nonedefadd_back(self,key,val):""" :param val: int - construct a newNode to the back of list :return: address of end of list """ newNode =Node(key, val)# empty listifnot self.front: self.front = self.tail = newNodeelse: tmp = self.tail self.tail.right = newNode self.tail = newNode newNode.left = tmpreturn self.taildefmove_back(self,node):""" :param node: Node - address of node to move back of list :return: new address of node """# add the new element to the end of the address# and remove the older address (via garbage collection) new_addr = self.add_back(node.key, node.val)if node.left: node.left.right = node.right node.right.left = node.left node =None# if no left, then node is the frontelse: self.front = self.front.right self.front.left =Nonereturn new_addrdefpop_front(self):if self.front: key, val = self.front.key, self.front.val self.front = self.front.right self.front.left =Nonereturn key, valraiseMemoryError('List is empty')defprintDLL(self):iter= self.frontprint('')whileiter:print('(%i, %i) -> '% (iter.key, iter.val), end ='')iter=iter.rightprint('')classLRUCache(object):def__init__(self,capacity):""" :type capacity: int """ self.history =DLL() self.db ={} self.cap = capacitydefget(self,key):""" :type key: int :rtype: int """if key in self.db: self.db[key]= self.history.move_back(self.db[key])return self.db[key].valreturn-1defput(self,key,value):""" :type key: int :type value: int :rtype: void """# if the element doesnt exist, put into the cacheif key notin self.db: self.db[key]= self.history.add_back(key, value)# in the db already, just update position and valueelse: self.db[key].val = value self.db[key]= self.history.move_back(self.db[key])# then take care of the issue if capacity is fulliflen(self.db)> self.cap: LRU_key, _ = self.history.pop_front()del self.db[LRU_key]