Visualize Data Trends with Matplotlib
This challenge will test your ability to create common data visualizations using the matplotlib library in Python. You will be given datasets and asked to generate specific plot types to represent the trends and relationships within that data. This is a fundamental skill for data analysis and communication.
Problem Description
You will be provided with three distinct datasets. For each dataset, you need to generate a specific matplotlib plot that accurately represents the data. The plots should be clear, well-labeled, and easy to interpret.
Requirements:
- Dataset 1: Sales Over Time
  - Task: Create a line plot showing monthly sales figures.
  - Details:
    - X-axis: Month (e.g., 'Jan', 'Feb', ..., 'Dec')
    - Y-axis: Sales ($)
    - Title: "Monthly Sales Performance"
    - X-axis label: "Month"
    - Y-axis label: "Sales ($)"
    - Ensure the plot displays markers for each data point.
- Dataset 2: Student Performance
  - Task: Create a bar chart comparing the average scores of different subjects.
  - Details:
    - X-axis: Subject names (e.g., 'Math', 'Science', 'History')
    - Y-axis: Average Score (%)
    - Title: "Average Subject Scores"
    - X-axis label: "Subject"
    - Y-axis label: "Average Score (%)"
    - The bars should be colored distinctly.
- Dataset 3: Population Distribution
  - Task: Create a histogram showing the distribution of ages in a population sample.
  - Details:
    - X-axis: Age Bins
    - Y-axis: Frequency (Number of people)
    - Title: "Age Distribution of Population Sample"
    - X-axis label: "Age"
    - Y-axis label: "Frequency"
    - Specify at least 5 bins for the histogram.
Expected Behavior:
When your Python script is run, it should generate three separate matplotlib figures, one per dataset, each fulfilling the plotting requirements listed above. The figures may be displayed to the user or saved to files; for this challenge, displaying them is sufficient.
Edge Cases:
- Empty Datasets: If a dataset is empty, the corresponding figure should still be generated with its title and axis labels. It may show empty axes or a note indicating that no data is available.
- Single Data Point: If a dataset has only one data point, the line plot should display a single marker. The bar chart and histogram should also render without errors (for example, as a single bar).
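One way to handle both edge cases for the line plot is sketched below. The helper names `annotate_if_empty` and `plot_sales_with_edge_cases`, and the "No data available" message, are assumptions made for illustration; the challenge does not prescribe them.

```python
import matplotlib.pyplot as plt


def annotate_if_empty(values, message="No data available"):
    """Hypothetical helper: write a note on the current axes when values is empty.

    Returns True if the note was added, False otherwise.
    """
    if len(values) > 0:
        return False
    ax = plt.gca()
    # transform=ax.transAxes places the text at the centre of the axes,
    # independent of the data limits
    ax.text(0.5, 0.5, message, ha="center", va="center", transform=ax.transAxes)
    return True


def plot_sales_with_edge_cases(months, sales):
    """Draw the sales line plot on a new figure, tolerating empty input."""
    fig = plt.figure()
    if not annotate_if_empty(sales):
        # With a single point there is no line segment to draw,
        # so the marker is what makes the point visible
        plt.plot(months, sales, marker='o')
    plt.title("Monthly Sales Performance")
    plt.xlabel("Month")
    plt.ylabel("Sales ($)")
    return fig
```

Calling `plot_sales_with_edge_cases([], [])` yields a labeled figure with the note, while `plot_sales_with_edge_cases(['Jan'], [15000])` yields a single marker. Call `plt.show()` afterwards to display the figures.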
Examples
Example 1: Sales Data
Input Data:
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [15000, 18000, 22000, 20000, 25000, 28000]
Output: A line plot titled "Monthly Sales Performance" with months on the x-axis and sales on the y-axis. Markers should be visible.
Explanation: The plot will connect the given sales figures for each month, visualizing the trend in sales over the first half of the year.
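A minimal sketch of this example follows. The function name `plot_monthly_sales` and the choice of `'o'` as the marker style are assumptions; any marker style would satisfy the requirement.

```python
import matplotlib.pyplot as plt

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
sales = [15000, 18000, 22000, 20000, 25000, 28000]


def plot_monthly_sales(months, sales):
    """Draw the sales line plot on a new figure and return that figure."""
    fig = plt.figure()
    # marker='o' draws a visible marker at each data point
    plt.plot(months, sales, marker='o')
    plt.title("Monthly Sales Performance")
    plt.xlabel("Month")
    plt.ylabel("Sales ($)")
    return fig


if __name__ == "__main__":
    plot_monthly_sales(months, sales)
    plt.show()
```

Returning the figure is a design choice that makes the plot easy to inspect or save later; the challenge itself only requires that it be displayed.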
Example 2: Student Performance Data
Input Data:
subjects = ['Math', 'Science', 'English', 'History']
scores = [85, 90, 78, 88]
Output: A bar chart titled "Average Subject Scores" with subjects on the x-axis and average scores on the y-axis. Each bar should have a different color.
Explanation: The bar chart visually compares the average performance across different subjects.
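The bar chart could be sketched as below. The function name `plot_subject_scores` and the palette in `bar_colors` are assumptions; any list with one distinct color per bar should work.

```python
import matplotlib.pyplot as plt

subjects = ['Math', 'Science', 'English', 'History']
scores = [85, 90, 78, 88]
# Assumed palette: one distinct colour per subject
bar_colors = ['tab:blue', 'tab:orange', 'tab:green', 'tab:red']


def plot_subject_scores(subjects, scores, colors):
    """Draw the subject bar chart on a new figure and return that figure."""
    fig = plt.figure()
    # Passing a list to color assigns one colour to each bar in order
    plt.bar(subjects, scores, color=colors)
    plt.title("Average Subject Scores")
    plt.xlabel("Subject")
    plt.ylabel("Average Score (%)")
    return fig


if __name__ == "__main__":
    plot_subject_scores(subjects, scores, bar_colors)
    plt.show()
```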
Example 3: Population Age Distribution
Input Data:
ages = [22, 35, 48, 19, 28, 62, 33, 45, 55, 29, 38, 42, 50, 25, 31, 39, 47, 58, 23, 36, 40, 52, 27, 30, 43]
Output: A histogram titled "Age Distribution of Population Sample" with age bins on the x-axis and frequency on the y-axis. The histogram should have at least 5 bins.
Explanation: The histogram shows how frequently different age groups appear in the provided sample.
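A possible sketch of the histogram is shown here. The function name `plot_age_distribution`, the default of exactly 5 bins, and the black bar edges are assumptions; the challenge only asks for at least 5 bins.

```python
import matplotlib.pyplot as plt

ages = [22, 35, 48, 19, 28, 62, 33, 45, 55, 29, 38, 42, 50,
        25, 31, 39, 47, 58, 23, 36, 40, 52, 27, 30, 43]


def plot_age_distribution(ages, bins=5):
    """Draw the age histogram on a new figure and return that figure."""
    fig = plt.figure()
    # An integer bins value splits the data range into that many equal-width bins;
    # edgecolor makes the boundaries between adjacent bars visible
    plt.hist(ages, bins=bins, edgecolor='black')
    plt.title("Age Distribution of Population Sample")
    plt.xlabel("Age")
    plt.ylabel("Frequency")
    return fig


if __name__ == "__main__":
    plot_age_distribution(ages)
    plt.show()
```

Passing a larger `bins` value gives a finer-grained view of the same sample.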
Constraints
- You must use the matplotlib.pyplot module for all plotting.
- The input data will be provided as Python lists.
- All generated plots should have clear titles and axis labels as specified.
- The solution should be a single Python script.
- No external data files will be used; all data is provided within the script or as input parameters to your functions.
Notes
- Familiarize yourself with matplotlib.pyplot functions such as plot(), bar(), hist(), title(), xlabel(), ylabel(), and show().
- For the line plot, you might want to use the marker argument in plt.plot().
- For the bar chart, consider using the color argument or iterating through a list of colors.
- For the histogram, the bins argument in plt.hist() controls how many bins are drawn.
- Call plt.figure() before creating each new plot so that each one is drawn on its own figure. You can then call plt.show() after each plot, or once at the very end of the script after all figures are fully configured.
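The figure-separation note can be illustrated with a short sketch. The function name `build_all_figures` is an assumption, and axis labels are omitted here for brevity; a full solution would still need them.

```python
import matplotlib.pyplot as plt


def build_all_figures(months, sales, subjects, scores, ages):
    """Create the three plots, each on its own figure, and return the figures."""
    figures = []

    # plt.figure() creates a new figure and makes it the current one,
    # so the pyplot calls that follow draw on it
    figures.append(plt.figure())
    plt.plot(months, sales, marker='o')
    plt.title("Monthly Sales Performance")

    figures.append(plt.figure())
    plt.bar(subjects, scores)
    plt.title("Average Subject Scores")

    figures.append(plt.figure())
    plt.hist(ages, bins=5)
    plt.title("Age Distribution of Population Sample")

    return figures


if __name__ == "__main__":
    build_all_figures(
        ['Jan', 'Feb', 'Mar'], [15000, 18000, 22000],
        ['Math', 'Science'], [85, 90],
        [22, 35, 48, 19, 28, 62],
    )
    # A single call at the end displays every open figure
    plt.show()
```

Without the `plt.figure()` calls, all three plots would be drawn on the same axes, which is the problem the note warns about.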