Distributed Cache Implementation in Python
Building a distributed cache is a common requirement in modern applications to improve performance and reduce database load. This challenge asks you to design and implement a simplified distributed cache system in Python, focusing on core functionalities like setting, getting, and deleting keys across multiple cache nodes. The goal is to understand the fundamental concepts of distributed caching and how to handle data consistency.
Problem Description
You are tasked with creating a basic distributed cache system. The system should consist of multiple cache nodes, each storing a subset of the cached data. The cache should support the following operations:
set(key, value, node_id): Stores akey-valuepair in the specifiednode_id.node_idis an integer representing the node where the data should be stored.get(key, node_id): Retrieves the value associated with the givenkeyfrom the specifiednode_id. If the key is not found in that node, it should returnNone.delete(key, node_id): Removes thekey-valuepair associated with the givenkeyfrom the specifiednode_id.
The system should be designed to handle multiple concurrent requests. For simplicity, assume that node IDs are always valid (i.e., within the range of available nodes). You don't need to implement node discovery or failover mechanisms for this challenge. Focus on the core caching logic.
Key Requirements:
- Implement a
DistributedCacheclass. - The
DistributedCacheclass should internally manage a dictionary representing the cache nodes. Each node is identified by itsnode_idand stores its data as a dictionary. - The
set,get, anddeletemethods should be implemented as described above. - Handle the case where a key is not found gracefully (return
Noneforget).
Expected Behavior:
The cache should behave as a simple key-value store distributed across multiple nodes. Data stored on a specific node should only be accessible through the get and delete operations targeting that node.
Edge Cases to Consider:
- Empty cache: Ensure that
getreturnsNonewhen the cache is empty or the key doesn't exist. - Invalid node ID: While the problem states node IDs are always valid, consider how your code would behave if an invalid node ID were passed (e.g., raise an exception or ignore the operation).
- Concurrent access: While not strictly required to implement full concurrency control, consider how your code might behave under concurrent access (e.g., using locks if necessary for more complex scenarios).
Examples
Example 1:
Input:
cache = DistributedCache(num_nodes=3)
cache.set("key1", "value1", 0)
cache.set("key2", "value2", 1)
print(cache.get("key1", 0))
print(cache.get("key2", 1))
print(cache.get("key3", 2))
Output:
value1
value2
None
Explanation: "key1" is stored on node 0, "key2" on node 1, and "key3" is not present in any node.
Example 2:
Input:
cache = DistributedCache(num_nodes=2)
cache.set("key1", "value1", 0)
cache.delete("key1", 0)
print(cache.get("key1", 0))
Output:
None
Explanation: "key1" is stored on node 0 and then deleted from node 0. A subsequent get on node 0 returns None.
Example 3: (Edge Case)
Input:
cache = DistributedCache(num_nodes=1)
print(cache.get("key1", 0))
Output:
None
Explanation: The cache has only one node, and "key1" is not stored on that node.
Constraints
num_nodeswill be a positive integer between 1 and 100 (inclusive).keywill be a string.valuecan be any Python object.node_idwill be an integer between 0 andnum_nodes - 1(inclusive).- The solution should be reasonably efficient for a small number of nodes and keys (e.g., up to 1000 keys across all nodes). Optimizing for extreme scale is not required.
Notes
- You can use Python dictionaries to represent the cache nodes.
- Consider the structure of your
DistributedCacheclass and how it manages the nodes. - This is a simplified model of a distributed cache. Real-world distributed caches have much more complex features like data replication, consistency protocols, and fault tolerance.
- Focus on clarity and correctness of the core caching operations. Error handling beyond the specified behavior (e.g., invalid input types) is not required.