Hone logo
Hone
Problems

SQL Pivot Table with Conditional Aggregation

This challenge asks you to transform data from a row-oriented format into a column-oriented format using SQL. Specifically, you'll implement a "pivot" operation, which is incredibly useful for summarizing and analyzing data by grouping it based on different categories and presenting aggregated values in new columns.

Problem Description

You are provided with a table containing sales transaction data. Each record represents a sale of a specific product in a particular region during a given month. Your task is to create a query that generates a pivot table summarizing the total sales amount for each product, with the months represented as distinct columns. You will use the CASE statement within aggregate functions to achieve this conditional aggregation.

What needs to be achieved: Transform the input table into a summarized view where each row represents a product, and columns represent the months, displaying the total sales for that product in that month.

Key requirements:

  • The output should have one row per unique product.
  • Columns should be named Jan, Feb, Mar, etc., representing the months.
  • The values in the month columns should be the sum of sales for that product in that specific month.
  • If a product has no sales in a particular month, the corresponding cell in the pivot table should show 0.

Expected behavior: The query should dynamically create columns for each month present in the data and aggregate sales accordingly.

Edge cases to consider:

  • Months with no sales for any product.
  • Products with sales only in a subset of months.
  • An empty input table.

Examples

Example 1:

Input Table: SalesData

ProductRegionMonthSalesAmount
LaptopNorthJan1200
MouseSouthJan25
KeyboardEastJan75
LaptopWestFeb1500
MouseNorthFeb30
LaptopSouthMar1100
KeyboardWestMar80

Output Table:

ProductJanFebMar
Keyboard75080
Laptop120015001100
Mouse25300

Explanation:

  • For 'Keyboard', the Jan sales are 75, Feb sales are 0 (not present in input), and Mar sales are 80.
  • For 'Laptop', the Jan sales are 1200, Feb sales are 1500, and Mar sales are 1100.
  • For 'Mouse', the Jan sales are 25, Feb sales are 30, and Mar sales are 0.

Example 2:

Input Table: SalesData

ProductRegionMonthSalesAmount
MonitorNorthApr300
MonitorSouthMay320

Output Table:

ProductAprMay
Monitor300320

Explanation:

  • Only April and May sales are present. The pivot table reflects these months.

Example 3: (Edge Case - Empty Input)

Input Table: SalesData (empty)

ProductRegionMonthSalesAmount

Output Table:

Product

Explanation:

  • When the input table is empty, the output should also be empty, or at least contain no rows with product data. The month columns might still be conceptually present, but without data to populate them.

Constraints

  • The SalesData table will contain at least the columns: Product (VARCHAR), Region (VARCHAR), Month (VARCHAR, e.g., 'Jan', 'Feb'), and SalesAmount (DECIMAL or INTEGER).
  • The number of distinct products will be at most 1000.
  • The number of distinct months will be at most 12.
  • The SalesAmount will be a non-negative number.
  • Assume the SQL dialect supports standard aggregate functions (SUM) and the CASE statement.

Notes

  • The core of this challenge lies in using the CASE statement within the SUM() aggregate function to conditionally sum SalesAmount based on the Month.
  • You will likely need to use a GROUP BY clause on the Product column to get one row per product.
  • Consider how you will handle months that are present in the data but have zero sales for a particular product. The CASE statement is crucial here.
  • The exact set of month names might vary, but for the examples, assume standard English abbreviations like 'Jan', 'Feb', 'Mar', etc. In a real-world scenario, you might need a more robust way to handle all possible months or a predefined list of months you want to pivot on.
Loading editor...
plaintext