Hone logo
Hone
Problems

Sales Performance Analysis: Multidimensional Reporting

This challenge focuses on designing efficient Online Analytical Processing (OLAP) queries to analyze sales data from multiple perspectives. OLAP is crucial for business intelligence, enabling users to slice, dice, and drill down into data to uncover trends and insights. You will be tasked with creating queries that aggregate sales performance across different dimensions like time, product, and region.

Problem Description

Your task is to design SQL queries that answer specific business questions by aggregating sales data. You are given a simplified database schema representing sales transactions. The goal is to retrieve summarized information efficiently, demonstrating an understanding of OLAP concepts like aggregation, grouping, and potentially window functions.

Key Requirements:

  1. Total Sales per Month per Product Category: Calculate the total revenue generated for each product category, broken down by month.
  2. Top Selling Product per Region per Quarter: Identify the product with the highest total sales revenue in each region for each quarter.
  3. Year-over-Year Sales Growth: For each month, calculate the percentage change in sales revenue compared to the same month in the previous year.

Expected Behavior:

  • Queries should return clearly structured results with appropriate labels.
  • The calculations should be accurate based on the provided data.
  • Queries should be optimized for performance, considering the potential size of the datasets.

Edge Cases:

  • Months with no sales for a specific category.
  • New product categories introduced mid-year.
  • Missing sales data for certain periods.

Examples

Example 1: Total Sales per Month per Product Category

Input Data (Simplified Representation):

  • Sales Table:
    • sale_id (INT)
    • sale_date (DATE)
    • product_id (INT)
    • quantity (INT)
    • price_per_unit (DECIMAL)
  • Products Table:
    • product_id (INT)
    • category (VARCHAR)

Sample Input Data:

Sales Table:

sale_idsale_dateproduct_idquantityprice_per_unit
12023-01-15101210.00
22023-01-20102125.00
32023-02-10101310.00
42023-02-18103150.00
52024-01-10101112.00

Products Table:

product_idcategory
101Electronics
102Books
103Electronics

Output:

yearmonthcategorytotal_revenue
20231Electronics20.00
20231Books25.00
20232Electronics30.00
20232Electronics50.00
20241Electronics12.00

Explanation: The query calculates quantity * price_per_unit for each sale, extracts the year and month from sale_date, joins with the Products table to get the category, and then groups by year, month, and category to sum the revenue.


Example 2: Top Selling Product per Region per Quarter

Input Data (Simplified Representation):

  • Sales Table:
    • sale_id (INT)
    • sale_date (DATE)
    • product_id (INT)
    • region_id (INT)
    • quantity (INT)
    • price_per_unit (DECIMAL)
  • Products Table:
    • product_id (INT)
    • product_name (VARCHAR)
  • Regions Table:
    • region_id (INT)
    • region_name (VARCHAR)

Sample Input Data:

Sales Table:

sale_idsale_dateproduct_idregion_idquantityprice_per_unit
12023-01-151011210.00
22023-01-201021125.00
32023-04-101012310.00
42023-04-181031150.00
52023-07-011011110.00

Products Table:

product_idproduct_name
101Laptop
102Keyboard
103Monitor

Regions Table:

region_idregion_name
1North
2South

Output:

yearquarterregion_nameproduct_nametotal_revenue
20231NorthLaptop20.00
20232SouthLaptop30.00
20232NorthMonitor50.00
20233NorthLaptop10.00

Explanation: This query first calculates the revenue for each sale. It then determines the quarter from the sale_date, joins with Products and Regions tables. A window function (e.g., ROW_NUMBER or RANK) is used over partitions of year, quarter, and region, ordered by total revenue in descending order, to assign a rank to each product within its respective group. Finally, it filters to keep only the top-ranked product (rank = 1) for each partition.


Example 3: Year-over-Year Sales Growth (Edge Case Consideration)

Input Data: (Same tables as Example 1, but with data spanning across two years)

Sales Table:

sale_idsale_dateproduct_idquantityprice_per_unit
12023-01-15101210.00
22024-01-10101112.00
32023-02-10101310.00
42024-02-05101212.00
52023-03-20102125.00
62023-04-01101110.00
72024-04-05101212.00

Products Table:

product_idcategory
101Electronics
102Books

Output:

yearmonthcurrent_year_salesprevious_year_salesyoy_growth_percentage
2023120.00NULLNULL
2023230.00NULLNULL
2023325.00NULLNULL
2023410.00NULLNULL
2024112.0020.00-40.00
2024224.0030.00-20.00
2024424.0010.00140.00

Explanation: This query first aggregates total sales by month and year. Then, using a window function, it retrieves the sales from the previous year for the same month. Finally, it calculates the year-over-year growth percentage using the formula: (current_year_sales - previous_year_sales) / previous_year_sales * 100. For months in the first year of data, the previous_year_sales and yoy_growth_percentage will be NULL.

Constraints

  • The Sales table can contain up to 10 million records.
  • The Products and Regions tables will contain at most 1,000 records each.
  • Dates in the sale_date column will be within the last 3 years.
  • price_per_unit and quantity will always be non-negative.
  • Queries should aim to execute within a reasonable time frame (e.g., under 30 seconds) on typical hardware.

Notes

  • You'll need to extract year, month, and quarter information from the sale_date.
  • Consider using Common Table Expressions (CTEs) to break down complex queries into more manageable steps.
  • For the year-over-year growth calculation, pay attention to how you handle the first year of data where there is no previous year to compare against.
  • The exact SQL dialect is not specified; use standard SQL syntax that is widely supported. If you need to use a specific function (like for date extraction), mention it clearly.
  • Think about how to efficiently join tables and perform aggregations. Indexes on relevant columns (sale_date, product_id, region_id) would typically be beneficial in a real-world scenario.
Loading editor...
plaintext