
Implementing a Distributed In-Memory Cache in Python

This challenge asks you to build a basic distributed in-memory cache system in Python. A distributed cache is crucial for improving the performance of applications by storing frequently accessed data across multiple nodes, reducing the need for repeated database or network calls. This exercise will help you understand the fundamental concepts of distributed systems, data partitioning, and inter-process communication.

Problem Description

You need to create a Python-based distributed cache system. This system should allow multiple client processes to interact with a shared cache, which can be conceptually spread across different nodes. For this challenge, we will simulate the distributed nature using a single machine and multiple Python processes.

Your cache implementation should support the following operations:

  1. set(key, value, ttl_seconds=None): Stores a value associated with a key. If ttl_seconds is provided, the cache entry will expire after that duration.
  2. get(key): Retrieves the value associated with a key. If the key does not exist or has expired, it should return None.
  3. delete(key): Removes the key and its associated value from the cache.
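
For concreteness, here is how a finished client might be driven. The CacheClient name, its constructor, and the ports are illustrative placeholders, not a required API:

    # Hypothetical client session; CacheClient and its arguments are
    # placeholders for whatever interface you design.
    client = CacheClient(nodes=[("localhost", 5000),
                                ("localhost", 5001),
                                ("localhost", 5002)])

    client.set("user:123", {"name": "Alice", "age": 30})
    client.set("session:xyz", "token", ttl_seconds=30)

    client.get("user:123")     # -> {"name": "Alice", "age": 30}
    client.delete("user:123")
    client.get("user:123")     # -> None
    # ...30 seconds later...
    client.get("session:xyz")  # -> None (expired)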

Key Requirements:

  • Distribution Simulation: The cache will be simulated across multiple processes on a single machine. You'll need a mechanism for these processes to communicate.
  • Data Partitioning: Implement a simple hashing strategy to decide which cache "node" (represented by a process) is responsible for a given key (a minimal routing sketch follows this list).
  • Inter-Process Communication (IPC): Use a suitable IPC mechanism (e.g., sockets, message queues, or a simple RPC library) for clients to communicate with the cache nodes.
  • Time-To-Live (TTL): Support expiring cache entries.
  • Concurrency Handling: The cache nodes should be able to handle multiple client requests concurrently.
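
As a minimal sketch of the partitioning requirement, a modulo-based router could look like the following. The function name is an assumption, and md5 is used only because Python's built-in hash() is salted per process, so two processes would otherwise disagree on key placement:

    import hashlib

    def node_for_key(key: str, num_nodes: int) -> int:
        # Stable hash across processes (unlike the salted built-in hash()).
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % num_nodes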

Expected Behavior:

  • When a client calls set(key, value), the key-value pair should be stored in the appropriate cache node based on the hashing strategy.
  • When a client calls get(key), the request should be routed to the correct cache node, and the value should be returned if found and not expired.
  • When a client calls delete(key), the key should be removed from the responsible cache node.
  • Expired keys should not be returned by get.
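
Expressed as client code, each of these behaviors reduces to the same two steps: map the key to a node, then forward the command. In this sketch, _connections and _request stand in for whatever IPC you choose, and node_for_key is the helper sketched above:

    # Hypothetical methods on the client class; routing logic only.
    def set(self, key, value, ttl_seconds=None):
        node = self._connections[node_for_key(key, len(self._connections))]
        return self._request(node, ("set", key, value, ttl_seconds))

    def get(self, key):
        node = self._connections[node_for_key(key, len(self._connections))]
        return self._request(node, ("get", key))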

Edge Cases to Consider:

  • Key Collisions: Multiple keys will inevitably hash to the same node; this is expected, and the node's local store simply holds all of them.
  • Node Failures (Simulated): While not strictly required for a basic implementation, consider how you might handle a node becoming unresponsive. (For this challenge, assume nodes are always available).
  • TTL expiring while being accessed: If a get request arrives just as an entry expires, it should be treated as expired (see the lazy-expiry sketch after this list).
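
One common way to satisfy the last point is a lazy check at read time: compare the stored expiry against the clock inside get itself. A sketch, assuming each node keeps entries as (value, expires_at) tuples in a plain dict (that internal layout is an assumption of this sketch):

    import time

    def get(self, key):
        entry = self._store.get(key)   # assumed layout: key -> (value, expires_at)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            # Expired at or before this read: evict lazily and report a miss.
            del self._store[key]
            return None
        return value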

Examples

Since this is an infrastructure implementation, direct input/output examples for a single call are less illustrative. Instead, consider a scenario with multiple clients and nodes.

Scenario:

Imagine we have 3 cache nodes (processes) and a client process.

Cache Node Setup:

Each node starts with an empty cache.

Client Operations:

  1. client.set("user:123", {"name": "Alice", "age": 30})
  2. client.set("product:abc", {"name": "Laptop", "price": 1200}, ttl_seconds=60)
  3. client.get("user:123")
  4. client.get("product:abc")
  5. client.set("user:456", {"name": "Bob"})
  6. client.delete("user:123")
  7. client.get("user:123")
  8. Wait for 70 seconds.
  9. client.get("product:abc")

Expected Outcome (Conceptual):

  • set("user:123", ...): Hashing "user:123" might direct it to Node 1. Node 1 stores it.
  • set("product:abc", ..., ttl_seconds=60): Hashing "product:abc" might direct it to Node 2. Node 2 stores it with an expiry of 60 seconds.
  • get("user:123"): Request goes to Node 1. Node 1 returns {"name": "Alice", "age": 30}.
  • get("product:abc"): Request goes to Node 2. Node 2 returns {"name": "Laptop", "price": 1200}.
  • set("user:456", ...): Hashing "user:456" might direct it to Node 1. Node 1 stores it.
  • delete("user:123"): Request goes to Node 1. Node 1 removes the entry.
  • get("user:123"): Request goes to Node 1. Node 1 returns None (key deleted).
  • After 70 seconds, get("product:abc"): Request goes to Node 2. Node 2 finds the entry has expired and returns None.

Constraints

  • You will need to run at least 3 cache node processes and 1 client process for testing.
  • The keys and values can be any picklable Python objects (see the serialization sketch after these constraints).
  • The partitioning strategy can be a simple consistent hashing algorithm or a modulo-based approach.
  • The IPC mechanism should be efficient enough for reasonable performance, but raw throughput is not the primary goal.
  • The system should be able to handle at least 1000 unique keys.
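
Because keys and values cross process boundaries, they must be serialized; with raw sockets, pickle plus a length prefix is one simple framing. The 4-byte big-endian header and the helper names below are assumptions of this sketch:

    import pickle
    import struct

    def send_msg(sock, obj):
        payload = pickle.dumps(obj)
        # Length-prefix framing: 4-byte big-endian size, then the payload.
        sock.sendall(struct.pack(">I", len(payload)) + payload)

    def recv_msg(sock):
        (size,) = struct.unpack(">I", _recv_exact(sock, 4))
        return pickle.loads(_recv_exact(sock, size))

    def _recv_exact(sock, n):
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("socket closed mid-message")
            buf += chunk
        return buf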

Notes

  • Consider using Python's multiprocessing module for creating separate cache node processes.
  • For IPC, explore libraries like Pyro4 for simple RPC; for a more fundamental approach, look into raw socket programming or ZeroMQ. You could even expose a Redis-like interface so that an off-the-shelf client such as redis-py talks to your nodes, though you would still implement all of the server-side logic yourself.
  • Implementing TTL requires a mechanism to track expiration times and clean up expired entries; a background thread in each cache node works well (see the node sketch after these notes).
  • A simple modulo-based partitioning (hash(key) % num_nodes) is a good starting point, but be aware that nearly every key remaps to a different node when the node count changes. Consistent hashing is a more robust solution (a toy ring also follows these notes).
  • Your solution should consist of at least two main components: the cache node server and the cache client interface.
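
Tying the multiprocessing and TTL notes together: each node can be a multiprocessing.Process running a request loop plus a daemon thread that sweeps expired entries. A minimal sketch, reusing the (value, expires_at) layout from earlier; the ports, sweep interval, and class name are illustrative, and the serve loop is elided:

    import threading
    import time
    from multiprocessing import Process

    class CacheNode:
        def __init__(self):
            self._store = {}               # key -> (value, expires_at)
            self._lock = threading.Lock()

        def start_cleanup(self, interval=1.0):
            # Daemon sweeper: reclaims memory for expired keys that are never
            # read again (a lazy check in get() only evicts on access).
            def sweep():
                while True:
                    now = time.monotonic()
                    with self._lock:
                        for k in [k for k, (_, exp) in self._store.items()
                                  if exp is not None and now >= exp]:
                            del self._store[k]
                    time.sleep(interval)
            threading.Thread(target=sweep, daemon=True).start()

    def run_node(port):
        node = CacheNode()
        node.start_cleanup()
        # ... bind a socket on `port` and serve get/set/delete requests ...

    if __name__ == "__main__":
        for port in (5000, 5001, 5002):    # one process per simulated node
            Process(target=run_node, args=(port,)).start()

And for the partitioning note, a toy consistent-hash ring with virtual nodes, to contrast with the modulo approach; the replica count and node names are assumptions:

    import bisect
    import hashlib

    class HashRing:
        def __init__(self, node_ids, replicas=100):
            # Each node occupies `replicas` virtual points on the ring, so keys
            # spread evenly and only ~1/N of them move when a node is added.
            self._ring = sorted(
                (self._hash(f"{node}:{i}"), node)
                for node in node_ids
                for i in range(replicas)
            )
            self._hashes = [h for h, _ in self._ring]

        @staticmethod
        def _hash(s):
            return int(hashlib.md5(s.encode("utf-8")).hexdigest(), 16)

        def node_for(self, key):
            # First ring point clockwise of the key's hash, wrapping around.
            idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
            return self._ring[idx][1]

    ring = HashRing(["node-0", "node-1", "node-2"])
    ring.node_for("user:123")   # maps to one of the three node ids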