Sales Performance Analysis with Pivoted Categories and Conditional Aggregation
This challenge requires you to implement a pivot table-like functionality in SQL, incorporating conditional aggregation using the CASE statement. The goal is to analyze sales data, grouping it by product category and calculating total sales for different customer segments based on their purchase amounts. This is a common requirement in business intelligence for summarizing and comparing data across multiple dimensions.
Problem Description
You are given a table named sales with the following columns:
product_category(VARCHAR): The category of the product sold (e.g., 'Electronics', 'Clothing', 'Books').customer_id(INT): A unique identifier for each customer.purchase_amount(DECIMAL): The amount spent by the customer on the purchase.
Your task is to write a SQL query that generates a report showing the total sales for each product category, broken down by customer segments. The customer segments are defined as follows:
- High Value: Customers with
purchase_amountgreater than or equal to 100. - Medium Value: Customers with
purchase_amountbetween 50 (inclusive) and 99 (inclusive). - Low Value: Customers with
purchase_amountless than 50.
The output should be a table with the following columns:
product_category(VARCHAR): The product category.high_value_sales(DECIMAL): Total sales for high-value customers in that category.medium_value_sales(DECIMAL): Total sales for medium-value customers in that category.low_value_sales(DECIMAL): Total sales for low-value customers in that category.
The query should effectively pivot the customer segments (High, Medium, Low) into separate columns, aggregating the purchase_amount based on the product_category and the defined customer segments.
Examples
Example 1:
Input:
sales table:
| product_category | customer_id | purchase_amount |
|------------------|-------------|-----------------|
| Electronics | 1 | 120 |
| Clothing | 2 | 60 |
| Electronics | 3 | 40 |
| Books | 4 | 80 |
| Clothing | 5 | 150 |
| Books | 6 | 30 |
Output:
| product_category | high_value_sales | medium_value_sales | low_value_sales |
|------------------|------------------|--------------------|-----------------|
| Electronics | 120 | 0 | 40 |
| Clothing | 150 | 60 | 0 |
| Books | 80 | 0 | 30 |
Explanation:
- Electronics: High Value = 120, Medium Value = 0, Low Value = 40
- Clothing: High Value = 150, Medium Value = 60, Low Value = 0
- Books: High Value = 80, Medium Value = 0, Low Value = 30
Example 2:
Input:
sales table:
| product_category | customer_id | purchase_amount |
|------------------|-------------|-----------------|
| Electronics | 1 | 50 |
| Clothing | 2 | 50 |
| Electronics | 3 | 100 |
| Books | 4 | 100 |
| Clothing | 5 | 100 |
| Books | 6 | 50 |
Output:
| product_category | high_value_sales | medium_value_sales | low_value_sales |
|------------------|------------------|--------------------|-----------------|
| Electronics | 100 | 0 | 50 |
| Clothing | 100 | 50 | 0 |
| Books | 100 | 0 | 50 |
Explanation:
- Electronics: High Value = 100, Medium Value = 0, Low Value = 50
- Clothing: High Value = 100, Medium Value = 50, Low Value = 0
- Books: High Value = 100, Medium Value = 0, Low Value = 50
Constraints
- The
salestable will always contain at least one row. purchase_amountwill always be a non-negative decimal value.product_categorywill always be a non-empty string.- The query should be efficient and perform well on reasonably sized datasets (up to 1 million rows).
- The SQL dialect should be compatible with standard SQL (e.g., PostgreSQL, MySQL, SQL Server).
Notes
- You can use the
CASEstatement to define the customer segments based on thepurchase_amount. - Consider using conditional aggregation (e.g.,
SUM(CASE WHEN ... THEN ... ELSE ... END)) to calculate the total sales for each segment within each product category. - The order of the columns in the output table is important. Ensure the columns are in the specified order:
product_category,high_value_sales,medium_value_sales,low_value_sales. - Think about how to group the data by
product_categorywhile simultaneously applying the conditional aggregation for the customer segments. A single query is expected. - No external libraries or functions are allowed. Only standard SQL functions are permitted.
- The solution should be readable and well-formatted.
- Consider edge cases where a product category might have no sales in a particular segment (e.g., no high-value customers for a specific category). In such cases, the corresponding sales value should be 0.