SQL Pivot Table with Conditional Aggregation
This challenge asks you to transform data from a row-oriented format into a column-oriented format using SQL. Specifically, you'll implement a "pivot" operation, which is incredibly useful for summarizing and analyzing data by grouping it based on different categories and presenting aggregated values in new columns.
Problem Description
You are provided with a table containing sales transaction data. Each record represents a sale of a specific product in a particular region during a given month. Your task is to create a query that generates a pivot table summarizing the total sales amount for each product, with the months represented as distinct columns. You will use the CASE statement within aggregate functions to achieve this conditional aggregation.
What needs to be achieved: Transform the input table into a summarized view where each row represents a product, and columns represent the months, displaying the total sales for that product in that month.
Key requirements:
- The output should have one row per unique product.
- Columns should be named
Jan,Feb,Mar, etc., representing the months. - The values in the month columns should be the sum of sales for that product in that specific month.
- If a product has no sales in a particular month, the corresponding cell in the pivot table should show 0.
Expected behavior: The query should dynamically create columns for each month present in the data and aggregate sales accordingly.
Edge cases to consider:
- Months with no sales for any product.
- Products with sales only in a subset of months.
- An empty input table.
Examples
Example 1:
Input Table: SalesData
| Product | Region | Month | SalesAmount |
|---|---|---|---|
| Laptop | North | Jan | 1200 |
| Mouse | South | Jan | 25 |
| Keyboard | East | Jan | 75 |
| Laptop | West | Feb | 1500 |
| Mouse | North | Feb | 30 |
| Laptop | South | Mar | 1100 |
| Keyboard | West | Mar | 80 |
Output Table:
| Product | Jan | Feb | Mar |
|---|---|---|---|
| Keyboard | 75 | 0 | 80 |
| Laptop | 1200 | 1500 | 1100 |
| Mouse | 25 | 30 | 0 |
Explanation:
- For 'Keyboard', the Jan sales are 75, Feb sales are 0 (not present in input), and Mar sales are 80.
- For 'Laptop', the Jan sales are 1200, Feb sales are 1500, and Mar sales are 1100.
- For 'Mouse', the Jan sales are 25, Feb sales are 30, and Mar sales are 0.
Example 2:
Input Table: SalesData
| Product | Region | Month | SalesAmount |
|---|---|---|---|
| Monitor | North | Apr | 300 |
| Monitor | South | May | 320 |
Output Table:
| Product | Apr | May |
|---|---|---|
| Monitor | 300 | 320 |
Explanation:
- Only April and May sales are present. The pivot table reflects these months.
Example 3: (Edge Case - Empty Input)
Input Table: SalesData (empty)
| Product | Region | Month | SalesAmount |
|---|
Output Table:
| Product |
|---|
Explanation:
- When the input table is empty, the output should also be empty, or at least contain no rows with product data. The month columns might still be conceptually present, but without data to populate them.
Constraints
- The
SalesDatatable will contain at least the columns:Product(VARCHAR),Region(VARCHAR),Month(VARCHAR, e.g., 'Jan', 'Feb'), andSalesAmount(DECIMAL or INTEGER). - The number of distinct products will be at most 1000.
- The number of distinct months will be at most 12.
- The
SalesAmountwill be a non-negative number. - Assume the SQL dialect supports standard aggregate functions (
SUM) and theCASEstatement.
Notes
- The core of this challenge lies in using the
CASEstatement within theSUM()aggregate function to conditionally sumSalesAmountbased on theMonth. - You will likely need to use a
GROUP BYclause on theProductcolumn to get one row per product. - Consider how you will handle months that are present in the data but have zero sales for a particular product. The
CASEstatement is crucial here. - The exact set of month names might vary, but for the examples, assume standard English abbreviations like 'Jan', 'Feb', 'Mar', etc. In a real-world scenario, you might need a more robust way to handle all possible months or a predefined list of months you want to pivot on.