Hone logo
Hone
Problems

Ranking Products by Popularity and Category

This challenge focuses on using SQL window functions, specifically ROW_NUMBER() and RANK(), to analyze and rank product data. You'll be tasked with assigning ranks to products based on their sales volume, first overall and then within their respective categories. This is a common requirement in business intelligence and data analysis for identifying top performers and understanding relative performance.

Problem Description

You are provided with a table named Products containing information about individual products. The table has the following columns:

  • product_id (INTEGER): Unique identifier for each product.
  • product_name (VARCHAR): The name of the product.
  • category (VARCHAR): The category to which the product belongs.
  • sales_volume (INTEGER): The total number of units sold for the product.

Your goal is to produce a report that includes:

  1. Overall Rank: Assign a rank to each product based on its sales_volume in descending order (highest sales volume gets rank 1). Use ROW_NUMBER() for this, as it assigns a unique sequential integer to each row within its partition, ensuring no ties in rank.
  2. Rank within Category: Assign a rank to each product based on its sales_volume within its category, also in descending order. Use RANK() for this, as it allows for ties in rank if products within the same category have the same sales_volume.

The output should contain the product_id, product_name, category, sales_volume, the calculated overall_rank, and the calculated category_rank.

Key Requirements:

  • Use the ROW_NUMBER() window function for the overall ranking.
  • Use the RANK() window function for the category-specific ranking.
  • Order the rankings by sales_volume in descending order.
  • The ROW_NUMBER() function should not have a partition.
  • The RANK() function should be partitioned by category.

Expected Behavior:

  • Products with higher sales_volume should receive a lower (better) rank.
  • In the overall_rank, each product will have a distinct rank.
  • In the category_rank, products within the same category that have the same sales_volume will share the same rank, and the next rank will be skipped accordingly (e.g., 1, 2, 2, 4).

Edge Cases:

  • Products with zero sales volume: These should be ranked accordingly, likely at the bottom of the overall and category rankings.
  • Multiple products with identical sales volume: ROW_NUMBER() will assign distinct sequential numbers. RANK() will assign them the same rank.

Examples

Example 1:

Input Products table:

product_idproduct_namecategorysales_volume
101LaptopElectronics500
102KeyboardElectronics300
103MouseElectronics300
104T-ShirtApparel700
105JeansApparel400
106HatApparel700

Output:

product_idproduct_namecategorysales_volumeoverall_rankcategory_rank
104T-ShirtApparel70011
106HatApparel70021
101LaptopElectronics50032
105JeansApparel40043
102KeyboardElectronics30054
103MouseElectronics30064

Explanation:

  • overall_rank: Products are ranked solely by sales_volume descending. T-Shirt and Hat are tied in sales but get consecutive ranks due to ROW_NUMBER().
  • category_rank:
    • In 'Apparel', T-Shirt and Hat both have 700 sales, so they both get rank 1. Jeans has 400 sales, so it gets rank 3 (since ranks 1 and 2 are accounted for by the top two items).
    • In 'Electronics', Laptop has 500 sales (rank 2 within category), Keyboard and Mouse both have 300 sales (rank 4 within category, as rank 3 is skipped because of the tie).

Example 2:

Input Products table:

product_idproduct_namecategorysales_volume
201Book ABooks150
202Book BBooks250
203Book CBooks150
301Coffee MugHome Goods400
302Picture FrameHome Goods400
303CandleHome Goods350

Output:

product_idproduct_namecategorysales_volumeoverall_rankcategory_rank
301Coffee MugHome Goods40011
302Picture FrameHome Goods40021
303CandleHome Goods35033
202Book BBooks25041
201Book ABooks15052
203Book CBooks15062

Explanation:

  • overall_rank: Coffee Mug and Picture Frame are tied for highest sales (400) but get distinct overall ranks.
  • category_rank:
    • In 'Home Goods', Coffee Mug and Picture Frame share rank 1. Candle, with 350 sales, gets rank 3.
    • In 'Books', Book B has 250 sales (rank 1). Book A and Book C, with 150 sales, share rank 2.

Constraints

  • product_id will be unique for each product.
  • sales_volume will be a non-negative integer.
  • The number of products can range from 1 to 10,000.
  • The number of unique categories can range from 1 to 100.
  • The SQL query should execute efficiently, aiming for a performance that can handle the given constraints.

Notes

  • Consider the order of operations for window functions. The ORDER BY clause within the OVER() clause is crucial for determining the ranking.
  • Pay close attention to the difference between ROW_NUMBER() and RANK(), particularly how they handle ties.
  • You will need to select all the original columns plus the two new rank columns.
  • The final output does not need to be sorted in any specific order, but it's often helpful to display it in a logical order for verification.
Loading editor...
plaintext