
Implement a Python Caching Layer

Web applications and data-intensive services often involve computationally expensive operations or frequent requests to external resources. A caching layer can significantly improve performance by storing the results of these operations and returning them directly for subsequent identical requests, avoiding redundant work. This challenge asks you to implement a basic in-memory caching mechanism in Python.

Problem Description

You need to create a Python class, let's call it Cache, that acts as an in-memory cache. This cache should store key-value pairs. The primary operations your Cache class will need to support are:

  1. set(key, value, ttl=None): Stores a value associated with a key. Optionally, a ttl (Time To Live) in seconds can be provided. If ttl is set, the cache entry should expire after that duration.
  2. get(key): Retrieves the value associated with a key. If the key is not found in the cache, or if it has expired, it should return None.
  3. delete(key): Removes a key and its associated value from the cache.

Key Requirements:

  • The cache should be thread-safe to handle concurrent access from multiple threads.
  • Expiration of cached items based on ttl should be handled automatically. This means get operations should respect the expiration.
  • If ttl is not provided, the item should be considered to have an indefinite lifespan (until explicitly deleted or the cache is cleared).

Expected Behavior:

  • When set is called with a ttl, the item should be retrievable until ttl seconds have passed since it was set.
  • After expiration, get for that key should return None.
  • delete should permanently remove an item.
  • Concurrent set, get, and delete operations should not lead to race conditions or corrupted cache state.
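One common way to meet these expectations is to record an absolute expiry timestamp when an item is stored and compare it against the clock on each read. The helper names below (`make_entry`, `is_expired`) are illustrative, not part of the required API; this is just a sketch of the bookkeeping.

```python
import time

def make_entry(value, ttl=None):
    # Store the value alongside an absolute expiry time.
    # A ttl of None means "never expires", recorded here as expires_at=None.
    expires_at = time.monotonic() + ttl if ttl is not None else None
    return (value, expires_at)

def is_expired(entry):
    # An entry is expired once the monotonic clock passes its expiry time.
    _, expires_at = entry
    return expires_at is not None and time.monotonic() >= expires_at
```

Using `time.monotonic()` rather than `time.time()` avoids surprises if the system clock is adjusted while entries are live.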

Edge Cases to Consider:

  • Setting an item with a ttl of 0 or a negative value. A real library might raise an error here, but for this challenge treat such an entry as immediately expired: it is either never stored or never retrievable via get.
  • Accessing a key that was recently deleted.
  • Accessing a key that has expired.
  • The scenario where multiple threads try to access or modify the same key simultaneously.

Examples

Example 1:

from cache import Cache
import time

cache = Cache()

# Set a value without TTL
cache.set("user:1", {"name": "Alice", "email": "alice@example.com"})
print(cache.get("user:1"))  # Output: {'name': 'Alice', 'email': 'alice@example.com'}

# Set a value with TTL
cache.set("config:settings", {"theme": "dark", "language": "en"}, ttl=5)
print(cache.get("config:settings")) # Output: {'theme': 'dark', 'language': 'en'}

# Wait for expiration
time.sleep(6)
print(cache.get("config:settings")) # Output: None

Explanation: The first set stores "user:1" indefinitely. The second set stores "config:settings" with a 5-second TTL. After 6 seconds, "config:settings" has expired and get returns None.

Example 2:

from cache import Cache
import time

cache = Cache()

# Set a value with TTL
cache.set("short_lived_data", [1, 2, 3], ttl=2)
print(cache.get("short_lived_data")) # Output: [1, 2, 3]

# Delete the item before expiration
cache.delete("short_lived_data")
print(cache.get("short_lived_data")) # Output: None

# Set another item
cache.set("permanent_data", "This should stay")
print(cache.get("permanent_data")) # Output: This should stay

Explanation: "short_lived_data" is set with a TTL of 2 seconds. Before it expires, delete is called, and subsequent get calls return None. "permanent_data" remains in the cache.

Example 3: Thread Safety

from cache import Cache
import threading
import time

cache = Cache()
num_threads = 10
num_iterations = 100

def worker(thread_id):
    for i in range(num_iterations):
        key = f"thread_{thread_id}_item_{i}"
        value = f"data_from_thread_{thread_id}_{i}"
        cache.set(key, value, ttl=1)
        retrieved_value = cache.get(key)
        if retrieved_value != value:
            print(f"Thread {thread_id} ERROR: Mismatch for key {key}. Expected {value}, got {retrieved_value}")
        time.sleep(0.001) # Simulate some work

threads = []
for i in range(num_threads):
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print("All threads finished.")
# Optional: inspect how many entries remain (some may have expired)
print(f"Cache size after threads: {len(cache.cache)}") # Assumes the internal store is a dict attribute named `cache`

Explanation: This example demonstrates concurrent access. Multiple threads simultaneously set items with short TTLs. A correctly implemented thread-safe cache should handle these concurrent writes and reads without data corruption or errors. The key is ensuring that operations like adding to a dictionary or checking expiration times are atomic.

Constraints

  • The cache implementation should use only Python's standard library. External caching libraries (such as redis or memcached clients) and ready-made caching utilities (such as functools.lru_cache) are not allowed for the core caching logic.
  • The cache should reside entirely in memory.
  • The ttl should be handled by checking the current time against the timestamp when the item was set.
  • Performance of get and set operations should ideally be O(1) on average, assuming no significant expiration cleanup overhead.

Notes

  • Consider how to store the expiration timestamp along with the value for each cached item.
  • For thread safety, threading.Lock or threading.RLock is your friend. Think about which operations need to be protected by a lock.
  • The mechanism for handling expired items can be passive (checked only on get) or active (a background thread cleaning up). For this challenge, a passive approach checked during get is sufficient and simpler.
  • You might need a way to store the creation time for each item to implement TTL.
  • Consider the data structure you will use for the cache itself (e.g., a dictionary).
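Pulling these notes together, one possible minimal sketch looks like the following. It is not the only valid design: the entry layout, the choice to drop entries with ttl <= 0, and the passive eviction in get are all assumptions consistent with the spec above, not requirements. The internal dict is named `cache` only so that the `len(cache.cache)` check in Example 3 works.

```python
import threading
import time

class Cache:
    """Minimal in-memory cache sketch: lock-guarded dict, passive TTL expiry."""

    def __init__(self):
        self.cache = {}               # key -> (value, expires_at or None)
        self._lock = threading.Lock() # guards every read/write of self.cache

    def set(self, key, value, ttl=None):
        # Assumption: ttl <= 0 means "immediately expired", so the entry
        # is simply not stored (and any existing entry is removed).
        if ttl is not None and ttl <= 0:
            with self._lock:
                self.cache.pop(key, None)
            return
        expires_at = time.monotonic() + ttl if ttl is not None else None
        with self._lock:
            self.cache[key] = (value, expires_at)

    def get(self, key):
        with self._lock:
            entry = self.cache.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if expires_at is not None and time.monotonic() >= expires_at:
                # Passive cleanup: evict the expired entry on access.
                del self.cache[key]
                return None
            return value

    def delete(self, key):
        # pop with a default makes deleting a missing key a no-op.
        with self._lock:
            self.cache.pop(key, None)
```

With this sketch, all three examples above run as described; a background cleanup thread could be added later without changing the public API.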