Implement a Token Bucket Rate Limiter in Python
Rate limiting is a crucial technique for managing the flow of requests to a service, preventing overload, and ensuring fair usage. The token bucket algorithm is a popular and effective method for implementing rate limiting. Your challenge is to implement this algorithm in Python.
Problem Description
You need to create a Python class, TokenBucket, that simulates the token bucket rate limiting algorithm. This class should manage a bucket of tokens that represents the allowed rate of operations. Tokens are added to the bucket at a fixed rate, and operations consume tokens. If an operation attempts to consume a token when the bucket is empty, it is considered to have exceeded the rate limit.
Key Requirements:
- Initialization: The
TokenBucketshould be initialized with acapacity(maximum number of tokens the bucket can hold) and afill_rate(number of tokens added to the bucket per second). - Adding Tokens: Tokens should be added to the bucket over time. The
fill_ratedetermines how many tokens are added per second. The bucket should not exceed itscapacity. - Consuming Tokens: A method should exist to attempt to consume a specified number of tokens. This method should return
Trueif successful (enough tokens were available) andFalseotherwise. - Time Management: The implementation must correctly account for the passage of time to refill the bucket.
Expected Behavior:
- When
consume(n)is called, if there are at leastntokens in the bucket,ntokens are removed, and the method returnsTrue. - If there are fewer than
ntokens, no tokens are removed, and the method returnsFalse. - The
TokenBucketshould maintain the current number of tokens and the timestamp of the last refill.
Edge Cases:
- What happens if
consume(n)is called withngreater than the bucket'scapacity? (The algorithm generally handles this by checking available tokens). - How to handle the initial state of the bucket (full or empty)?
Examples
Example 1:
# Initialize a bucket with capacity 5 and fill rate of 2 tokens/second
bucket = TokenBucket(capacity=5, fill_rate=2)
# Simulate some operations
print(bucket.consume(1)) # Output: True (Bucket has 5 tokens, consumes 1, remaining 4)
print(bucket.consume(3)) # Output: True (Bucket has 4 tokens, consumes 3, remaining 1)
print(bucket.consume(2)) # Output: False (Bucket has 1 token, cannot consume 2)
# Let time pass for 1.5 seconds
# After 1.5 seconds, 1.5 * 2 = 3 tokens should be added (capped at capacity 5)
# Bucket should now have min(1 + 3, 5) = 4 tokens
# This simulation part requires external time passing.
# If we assume time has passed:
# Assume bucket.last_refill_time is now 1.5 seconds in the past.
# bucket.refill() # (Internal call, or simulated)
# Now bucket.tokens is effectively 4
print(bucket.consume(3)) # Output: True (Bucket has 4 tokens, consumes 3, remaining 1)
Explanation:
Initially, the bucket is full (5 tokens). The first two consume calls succeed. The third consume call fails because only 1 token is left. After some time, more tokens are added, allowing the next consume call to succeed.
Example 2:
# Initialize a bucket with capacity 10 and fill rate of 5 tokens/second
bucket = TokenBucket(capacity=10, fill_rate=5)
# Consume a large amount immediately
print(bucket.consume(8)) # Output: True (Bucket has 10 tokens, consumes 8, remaining 2)
# Try to consume more than available
print(bucket.consume(3)) # Output: False (Bucket has 2 tokens, cannot consume 3)
# Let time pass for 2 seconds.
# 2 * 5 = 10 tokens added.
# Bucket should now have min(2 + 10, 10) = 10 tokens
# Assume bucket.refill() has been implicitly called due to time passing.
print(bucket.consume(10)) # Output: True (Bucket has 10 tokens, consumes 10, remaining 0)
print(bucket.consume(1)) # Output: False (Bucket has 0 tokens, cannot consume 1)
Explanation: The bucket starts full. Consuming 8 tokens leaves 2. The next attempt to consume 3 fails. After 2 seconds, the bucket refills to its capacity (10 tokens), allowing a subsequent consumption of 10 tokens.
Constraints
capacity: An integer,1 <= capacity <= 1000.fill_rate: An integer,1 <= fill_rate <= 100. Represents tokens per second.- The
consumemethod should not raise exceptions for insufficient tokens; it should returnFalse. - The implementation should be efficient and handle time passing reasonably. Avoid busy-waiting for refills.
Notes
- You will need to use Python's
timemodule, specificallytime.time(), to track the passage of time. - The
refilllogic should be triggered implicitly whenconsumeis called, based on the time elapsed since the last refill. - Consider how to handle floating-point arithmetic for time and token calculations carefully. The number of tokens added might not always be a whole number, but tokens themselves are discrete. You might need to decide how to handle fractional tokens (e.g., truncate, round). A common approach is to maintain the number of tokens as a float internally and only consider them as consumed in whole units.