Hone logo
Hone
Problems

Sales Performance Analysis by Region and Product Category

This challenge focuses on analyzing sales data to understand performance across different regions and product categories. You will use CASE statements in conjunction with aggregate functions to categorize and summarize sales data, providing valuable insights for business decision-making. This is a common task in business intelligence and data analysis.

Problem Description

You are provided with a table containing individual sales transactions. Each transaction includes information about the sale_id, region where the sale occurred, the product_category, and the sale_amount.

Your task is to write a SQL query that calculates the total sale_amount for each region and product_category. Furthermore, you need to categorize each sale_amount into one of three tiers: 'Low', 'Medium', or 'High'. A sale is considered 'Low' if its amount is less than 100, 'Medium' if it's between 100 (inclusive) and 500 (exclusive), and 'High' if it's 500 or greater.

The final output should display:

  1. The region.
  2. The product_category.
  3. The total_sales_amount for that region and category.
  4. The count of sales falling into the 'Low' tier for that region and category.
  5. The count of sales falling into the 'Medium' tier for that region and category.
  6. The count of sales falling into the 'High' tier for that region and category.

You should group your results by region and product_category.

Key Requirements:

  • Aggregate sale_amount to get the total sales per group.
  • Use a CASE statement to classify individual sales into 'Low', 'Medium', or 'High' tiers.
  • Aggregate the counts of sales within each tier.

Expected Behavior: The query should return one row for each unique combination of region and product_category. If a particular region-category combination has no sales, it should not appear in the output.

Edge Cases to Consider:

  • Sales amounts exactly equal to the tier boundaries (e.g., 100, 500) should be correctly assigned.

Examples

Example 1:

Input Table: sales

sale_idregionproduct_categorysale_amount
1NorthElectronics150
2NorthElectronics75
3SouthClothing400
4NorthElectronics600
5SouthClothing90
6NorthElectronics250
7WestHome Goods700
8WestHome Goods30

Output:

regionproduct_categorytotal_sales_amountlow_tier_countmedium_tier_counthigh_tier_count
NorthElectronics1075121
SouthClothing490110
WestHome Goods730101

Explanation:

  • North, Electronics: Total sales = 150 + 75 + 600 + 250 = 1075.
    • Sale 2 (75) is 'Low' (count: 1).
    • Sales 1 (150) and 6 (250) are 'Medium' (count: 2).
    • Sale 4 (600) is 'High' (count: 1).
  • South, Clothing: Total sales = 400 + 90 = 490.
    • Sale 5 (90) is 'Low' (count: 1).
    • Sale 3 (400) is 'Medium' (count: 1).
    • No 'High' sales.
  • West, Home Goods: Total sales = 700 + 30 = 730.
    • Sale 8 (30) is 'Low' (count: 1).
    • Sale 7 (700) is 'High' (count: 1).
    • No 'Medium' sales.

Example 2:

Input Table: sales

sale_idregionproduct_categorysale_amount
10EastBooks50
11EastBooks100
12EastBooks500
13EastBooks499
14EastBooks200

Output:

regionproduct_categorytotal_sales_amountlow_tier_countmedium_tier_counthigh_tier_count
EastBooks1349121

Explanation:

  • East, Books: Total sales = 50 + 100 + 500 + 499 + 200 = 1349.
    • Sale 10 (50) is 'Low' (count: 1).
    • Sales 11 (100) and 14 (200) are 'Medium' (count: 2).
    • Sale 12 (500) is 'High' (count: 1).
    • Sale 13 (499) is 'Medium' (count: 2 total).

Constraints

  • The sales table can contain up to 1,000,000 rows.
  • sale_id is a unique integer.
  • region, product_category are strings (e.g., 'North', 'Electronics').
  • sale_amount is a non-negative number (integer or decimal).
  • The query should execute efficiently, aiming for a runtime that completes within a reasonable time on the given dataset size.

Notes

  • Remember to correctly handle the inclusive and exclusive boundaries for the 'Medium' tier.
  • Think about how to aggregate the counts for each tier within the GROUP BY clause.
  • Consider using conditional aggregation if your SQL dialect supports it, or multiple CASE statements within aggregate functions.
Loading editor...
plaintext