Python Metrics Collector
You are tasked with building a flexible and efficient metrics collection system in Python. This system will be used to track various types of data points, such as counters, gauges, and timings, which are crucial for monitoring application performance and behavior. A well-designed metrics collector allows developers to gain insights into their applications' health and identify potential issues proactively.
Problem Description
Your goal is to create a Python class, MetricsCollector, that can:
- Record different types of metrics:
  - Counter: Increments a value by a fixed amount (typically 1). Useful for tracking events like requests processed or errors encountered.
  - Gauge: Represents a value that can go up or down. Useful for tracking current resource usage like memory or active connections.
  - Timer: Records the duration of an operation. Useful for measuring how long specific tasks take.
- Store and retrieve metrics: The collector should maintain an internal state of all recorded metrics. There should be a way to retrieve the current value of any metric.
- Handle multiple metrics of the same type: The system should be able to store and manage multiple distinct metrics, each with its own name.
- Provide a clear API: The class should offer intuitive methods for adding and updating metrics.
Key Requirements:
- A MetricsCollector class.
- Methods to:
  - increment_counter(name, value=1): Increments a counter by value. If the counter doesn't exist, it should be initialized to value.
  - set_gauge(name, value): Sets a gauge to a specific value. If the gauge doesn't exist, it should be initialized to value.
  - record_timer(name, duration): Records a duration for a timer. Timers should store a collection of recorded durations.
  - get_metric_value(name): Returns the current value of a specified metric. For timers, this method should return a dictionary with aggregate statistics (e.g., count, sum, average, min, max duration).
- All metric names should be case-sensitive strings.
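The requirements above can be sketched as a small class. This is one possible shape, not a reference solution: the private attribute names (_counters, _gauges, _timers) and the lookup order in get_metric_value are assumptions, and the statement does not say what should happen if one name is used for two metric types.

```python
class MetricsCollector:
    """Minimal sketch: one dictionary per metric type, keyed by metric name."""

    def __init__(self):
        self._counters = {}  # name -> cumulative numeric value
        self._gauges = {}    # name -> latest numeric value
        self._timers = {}    # name -> list of recorded durations (seconds)

    def increment_counter(self, name, value=1):
        # A missing counter starts at 0, so the first call leaves it at `value`.
        self._counters[name] = self._counters.get(name, 0) + value

    def set_gauge(self, name, value):
        # Gauges keep only the most recent value.
        self._gauges[name] = value

    def record_timer(self, name, duration):
        # setdefault creates the list on the first recording for this name.
        self._timers.setdefault(name, []).append(duration)

    def get_metric_value(self, name):
        # Assumed lookup order: counters, then gauges, then timers.
        if name in self._counters:
            return self._counters[name]
        if name in self._gauges:
            return self._gauges[name]
        if name in self._timers:
            durations = self._timers[name]
            total = sum(durations)
            return {
                "count": len(durations),
                "sum": total,
                "average": total / len(durations),
                "min": min(durations),
                "max": max(durations),
            }
        return None  # the metric was never recorded
```

Dictionary keys are compared exactly, so names are case-sensitive without extra work. Computing timer statistics on read is linear in the number of recordings; keeping running aggregates instead would make retrieval constant-time.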
Expected Behavior:
- Calling increment_counter multiple times on the same metric name should result in the counter's value increasing cumulatively.
- Calling set_gauge multiple times on the same metric name should update the gauge to the latest value.
- Calling record_timer multiple times on the same metric name should add to the collection of durations for that timer.
- get_metric_value should return appropriate representations for each metric type.
Edge Cases to Consider:
- Retrieving a metric value that has not been recorded yet.
- Recording a timer duration when no durations have been recorded previously.
- Handling non-numeric values for counters and gauges (though for this challenge, assume valid numeric inputs).
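The challenge allows assuming valid numeric input, so validation is optional. If you do want to guard against non-numeric values, one way is sketched below. The helper require_number and the module-level counters dictionary are hypothetical names used only for illustration; rejecting bool is a design choice, not something the statement asks for.

```python
from numbers import Real


def require_number(value, what):
    """Hypothetical helper: reject values that are not real numbers."""
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, Real):
        raise TypeError(f"{what} must be a number, got {type(value).__name__}")
    return value


counters = {}  # stand-in for the collector's counter storage


def increment_counter(name, value=1):
    # Validation happens before the assignment, so a bad value changes nothing.
    counters[name] = counters.get(name, 0) + require_number(value, "counter increment")


increment_counter("requests_total")
print(counters.get("requests_total"))  # 1
print(counters.get("never_recorded"))  # None: dict.get returns None for a missing key

try:
    increment_counter("requests_total", "3")
except TypeError as exc:
    print(exc)  # counter increment must be a number, got str
```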
Examples
Example 1: Basic Counter and Gauge
collector = MetricsCollector()
collector.increment_counter("requests_total")
collector.increment_counter("requests_total")
collector.increment_counter("errors_total", 5)
collector.set_gauge("active_connections", 10)
collector.set_gauge("active_connections", 12)
print(collector.get_metric_value("requests_total"))
print(collector.get_metric_value("errors_total"))
print(collector.get_metric_value("active_connections"))
2
5
12
Explanation:
- requests_total was incremented twice, so its value is 2.
- errors_total was incremented by 5, so its value is 5.
- active_connections was initially set to 10 and then updated to 12.
Example 2: Timer Metrics
import time
collector = MetricsCollector()
start_time = time.perf_counter()
time.sleep(0.05)
end_time = time.perf_counter()
collector.record_timer("api_latency", end_time - start_time)
start_time = time.perf_counter()
time.sleep(0.02)
end_time = time.perf_counter()
collector.record_timer("api_latency", end_time - start_time)
start_time = time.perf_counter()
time.sleep(0.1)
end_time = time.perf_counter()
collector.record_timer("database_query_time", end_time - start_time)
print(collector.get_metric_value("api_latency"))
print(collector.get_metric_value("database_query_time"))
{'count': 2, 'sum': 0.07xxxx, 'average': 0.03xxxx, 'min': 0.02xxxx, 'max': 0.05xxxx}
{'count': 1, 'sum': 0.1xxxx, 'average': 0.1xxxx, 'min': 0.1xxxx, 'max': 0.1xxxx}
(Note: xxxx represents variable floating-point values due to time.sleep inaccuracies)
Explanation:
- api_latency recorded two durations. The output shows the count, total sum of durations, average, minimum, and maximum duration.
- database_query_time recorded one duration, so count is 1, and sum, average, min, and max are all that single duration.
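Because measured sleeps vary from run to run, it can help to check the aggregation with fixed durations first. The sketch below uses a hypothetical standalone function, summarize, and feeds it 0.05 and 0.02 in place of the measured values; it assumes the list is non-empty.

```python
def summarize(durations):
    """Aggregate a non-empty list of durations (seconds) into summary statistics."""
    total = sum(durations)
    return {
        "count": len(durations),
        "sum": total,
        "average": total / len(durations),
        "min": min(durations),
        "max": max(durations),
    }


# Fixed durations stand in for the measured sleeps of 0.05 s and 0.02 s.
stats = summarize([0.05, 0.02])
print(stats["count"], stats["min"], stats["max"])  # 2 0.02 0.05
# Rounded for display, since floating-point addition is not exact.
print(round(stats["sum"], 3), round(stats["average"], 3))  # 0.07 0.035
```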
Example 3: Retrieving Non-existent Metric
collector = MetricsCollector()
print(collector.get_metric_value("non_existent_metric"))
None
Explanation:
- When get_metric_value is called for a metric that has not been recorded, it returns None.
Constraints
- Metric names will be strings containing only alphanumeric characters and underscores.
- Counter and Gauge values will be numeric (integers or floats).
- Timer durations will be positive floating-point numbers representing time in seconds.
- The number of distinct metrics will not exceed 1000.
- The number of timer recordings for a single metric will not exceed 10,000.
- Your solution should be reasonably efficient, with metric recording and retrieval operations typically taking O(1) or O(log N) time complexity where N is the number of timer recordings for a specific metric.
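One way to keep both recording and retrieval at O(1) for timers is to update running aggregates on every recording instead of scanning stored durations on read. The class below is a sketch under that assumption; TimerStats and its attribute names are hypothetical. Note that it does not keep the individual durations, so a solution that must also store the collection would hold a list alongside these fields.

```python
class TimerStats:
    """Hypothetical helper: running aggregates for one timer, O(1) per operation."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.minimum = float("inf")
        self.maximum = float("-inf")

    def record(self, duration):
        # Each field is updated in constant time; nothing is rescanned.
        self.count += 1
        self.total += duration
        self.minimum = min(self.minimum, duration)
        self.maximum = max(self.maximum, duration)

    def snapshot(self):
        # With no recordings there is nothing meaningful to report.
        if self.count == 0:
            return None
        return {
            "count": self.count,
            "sum": self.total,
            "average": self.total / self.count,
            "min": self.minimum,
            "max": self.maximum,
        }


timer = TimerStats()
for duration in (0.5, 0.25, 1.0):
    timer.record(duration)
print(timer.snapshot()["count"], timer.snapshot()["sum"])  # 3 1.75
```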
Notes
- Consider how you will structure your internal storage to differentiate between metric types. A dictionary or a similar mapping structure would be a good starting point.
- For timers, you'll need to aggregate statistics. Think about what statistics are most useful for monitoring performance.
- Ensure your get_metric_value method handles the different metric types gracefully and returns informative data.
- The time module in Python can be helpful for simulating or measuring durations.
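As an illustration of measuring durations with the time module, the start/stop pattern can be wrapped in a context manager. This goes beyond what the challenge requires: timed is a hypothetical helper, and the record argument is assumed to be any callable taking (name, duration), such as a collector's bound record_timer method.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(record, name):
    """Hypothetical helper: time the enclosed block and pass the duration to `record`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed operations are still timed.
        record(name, time.perf_counter() - start)


recorded = []  # stand-in for a collector


def record_timer(name, duration):
    recorded.append((name, duration))


with timed(record_timer, "api_latency"):
    time.sleep(0.01)

name, duration = recorded[0]
print(name, f"{duration:.4f} s")
```

time.perf_counter is used rather than time.time because it is monotonic and intended for measuring short intervals.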