Implementing a Distributed In-Memory Cache in Python
This challenge asks you to build a basic distributed in-memory cache system in Python. A distributed cache is crucial for improving the performance of applications by storing frequently accessed data across multiple nodes, reducing the need for repeated database or network calls. This exercise will help you understand the fundamental concepts of distributed systems, data partitioning, and inter-process communication.
Problem Description
You need to create a Python-based distributed cache system. This system should allow multiple client processes to interact with a shared cache, which can be conceptually spread across different nodes. For this challenge, we will simulate the distributed nature using a single machine and multiple Python processes.
Your cache implementation should support the following operations:
- `set(key, value, ttl_seconds=None)`: Stores a `value` associated with a `key`. If `ttl_seconds` is provided, the cache entry will expire after that duration.
- `get(key)`: Retrieves the `value` associated with a `key`. If the key does not exist or has expired, it should return `None`.
- `delete(key)`: Removes the `key` and its associated value from the cache.
Key Requirements:
- Distribution Simulation: The cache will be simulated across multiple processes on a single machine. You'll need a mechanism for these processes to communicate.
- Data Partitioning: Implement a simple hashing strategy to decide which cache "node" (represented by a process) is responsible for a given key.
- Inter-Process Communication (IPC): Use a suitable IPC mechanism (e.g., sockets, message queues, or a simple RPC library) for clients to communicate with the cache nodes.
- Time-To-Live (TTL): Support expiring cache entries.
- Concurrency Handling: The cache nodes should be able to handle multiple client requests concurrently.
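To make the partitioning requirement concrete, here is a minimal sketch of modulo-based routing (the `node_for_key` helper and `NUM_NODES` constant are illustrative names, not part of the challenge). A deterministic digest is used instead of the built-in `hash()`, which is randomized per process for strings and would route inconsistently across separate client processes:

```python
import hashlib

NUM_NODES = 3  # matches the three simulated cache-node processes

def node_for_key(key: str) -> int:
    """Map a key to the index of the node responsible for it.

    hashlib gives the same digest in every process, unlike the built-in
    hash(), which is salted per process for str keys (PYTHONHASHSEED).
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NUM_NODES
```

Because every client computes the same mapping, all processes agree on which node owns a given key without any coordination.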
Expected Behavior:
- When a client calls `set(key, value)`, the key-value pair should be stored in the appropriate cache node based on the hashing strategy.
- When a client calls `get(key)`, the request should be routed to the correct cache node, and the value should be returned if found and not expired.
- When a client calls `delete(key)`, the key should be removed from the responsible cache node.
- Expired keys should not be returned by `get`.
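As one possible shape for a cache node, here is a sketch using the standard library's `socketserver` (all names, such as `CacheHandler` and `start_node`, are illustrative, and the wire format — one pickled `(op, args)` tuple per connection — is an assumption, not a requirement of the challenge):

```python
import pickle
import socketserver
import threading
import time

class CacheHandler(socketserver.StreamRequestHandler):
    """Handles one client connection: reads a pickled (op, args) request,
    applies it to this node's store, and pickles the reply back.
    For "set", args is (key, value, ttl); for "get"/"delete", args is the key."""

    def handle(self):
        op, args = pickle.load(self.rfile)
        store, lock = self.server.store, self.server.lock
        with lock:  # ThreadingTCPServer runs handlers concurrently
            if op == "set":
                key, value, ttl = args
                expires = time.monotonic() + ttl if ttl is not None else None
                store[key] = (value, expires)
                reply = True
            elif op == "get":
                entry = store.get(args)
                if entry is not None and (entry[1] is None or time.monotonic() < entry[1]):
                    reply = entry[0]
                else:
                    store.pop(args, None)  # lazily drop missing/expired entries
                    reply = None
            else:  # "delete"
                store.pop(args, None)
                reply = True
        pickle.dump(reply, self.wfile)

def start_node(port=0):
    """Start one cache node in a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", port), CacheHandler)
    server.store, server.lock = {}, threading.Lock()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In the full solution each node would run as its own process (e.g. via `multiprocessing`) rather than a thread, and the client would route each request to the node chosen by the partitioning function.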
Edge Cases to Consider:
- Key Collisions: How does your partitioning handle keys that hash to the same node? (This is handled by the node itself).
- Node Failures (Simulated): While not strictly required for a basic implementation, consider how you might handle a node becoming unresponsive. (For this challenge, assume nodes are always available).
- TTL expiring while being accessed: If a `get` request arrives just as an entry expires, it should be treated as expired.
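That last edge case comes down to the expiry comparison: checking the deadline with `>=` means a request arriving exactly at expiry already sees the entry as gone. A small sketch (the `is_expired` helper is an illustrative name):

```python
import time

def is_expired(expires_at, now=None):
    """An entry is expired the instant its deadline is reached, so a get()
    arriving exactly at the expiry time returns None, never the stale value.
    expires_at is None for entries stored without a TTL."""
    now = time.monotonic() if now is None else now
    return expires_at is not None and now >= expires_at
```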
Examples
Since this is an infrastructure implementation, direct input/output examples for a single call are less illustrative. Instead, consider a scenario with multiple clients and nodes.
Scenario:
Imagine we have 3 cache nodes (processes) and a client process.
Cache Node Setup:
Each node starts with an empty cache.
Client Operations:
1. `client.set("user:123", {"name": "Alice", "age": 30})`
2. `client.set("product:abc", {"name": "Laptop", "price": 1200}, ttl_seconds=60)`
3. `client.get("user:123")`
4. `client.get("product:abc")`
5. `client.set("user:456", {"name": "Bob"})`
6. `client.delete("user:123")`
7. `client.get("user:123")`
8. Wait for 70 seconds.
9. `client.get("product:abc")`
Expected Outcome (Conceptual):
- `set("user:123", ...)`: Hashing `"user:123"` might direct it to Node 1. Node 1 stores it.
- `set("product:abc", ..., ttl_seconds=60)`: Hashing `"product:abc"` might direct it to Node 2. Node 2 stores it with an expiry of 60 seconds.
- `get("user:123")`: Request goes to Node 1. Node 1 returns `{"name": "Alice", "age": 30}`.
- `get("product:abc")`: Request goes to Node 2. Node 2 returns `{"name": "Laptop", "price": 1200}`.
- `set("user:456", ...)`: Hashing `"user:456"` might direct it to Node 1. Node 1 stores it.
- `delete("user:123")`: Request goes to Node 1. Node 1 removes the entry.
- `get("user:123")`: Request goes to Node 1. Node 1 returns `None` (key deleted).
- After 70 seconds, `get("product:abc")`: Request goes to Node 2. Node 2 finds the entry has expired and returns `None`.
Constraints
- You will need to run at least 3 cache node processes and 1 client process for testing.
- The keys and values can be any picklable Python objects.
- The partitioning strategy can be a simple consistent hashing algorithm or a modulo-based approach.
- The IPC mechanism should be efficient enough for reasonable performance, but raw throughput is not the primary goal.
- The system should be able to handle at least 1000 unique keys.
Notes
- Consider using Python's `multiprocessing` module for creating separate cache node processes.
- For IPC, explore libraries like `Pyro4` for simple RPC, or `redis-py` if you want to present a Redis-like interface to clients (though you would still implement the server-side logic yourself). For a more fundamental approach, look into `socket` programming or `ZeroMQ`.
- Implementing TTL requires a mechanism to track expiration times and clean up expired entries; this could be a background thread in each cache node.
- A simple modulo-based partitioning (`hash(key) % num_nodes`) is a good starting point, but be aware of its limitations when the number of nodes changes, and note that Python's built-in `hash()` is randomized per process for strings, so use a deterministic hash (e.g. from `hashlib`) when routing across processes. Consistent hashing is a more robust solution.
- Your solution should consist of at least two main components: the cache node server and the cache client interface.
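For the consistent-hashing alternative mentioned above, here is a minimal sketch (the `HashRing` class and its replica count are illustrative choices, not prescribed by the challenge):

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring. Each node is placed at several virtual
    points so keys spread evenly, and removing a node only reassigns the
    keys that node owned -- every other key keeps its owner."""

    def __init__(self, nodes, replicas=64):
        self._ring = []  # sorted list of (point, node) pairs
        for node in nodes:
            for i in range(replicas):
                bisect.insort(self._ring, (self._hash(f"{node}:{i}"), node))

    @staticmethod
    def _hash(s):
        # Deterministic across processes, unlike the built-in hash() on str.
        return int.from_bytes(hashlib.md5(s.encode("utf-8")).digest(), "big")

    def node_for(self, key):
        point = self._hash(key)
        # First ring point at or clockwise from the key's position.
        idx = bisect.bisect(self._ring, (point, ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]
```

Compared with `hash(key) % num_nodes`, shrinking the node list here moves only the keys owned by the removed node, which is the property that makes consistent hashing attractive when nodes join or leave.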