Hone logo
Hone
Problems

Analyzing Sales Performance with Window Functions

Many businesses need to analyze sales data to identify top performers, track rankings over time, and understand relative performance within different categories. This challenge focuses on using SQL window functions (specifically ROW_NUMBER and RANK) to derive insights from a sales dataset, allowing you to determine sales representative rankings and identify top performers within specific regions.

Problem Description

You are given a table named Sales containing sales data for a company. The table has the following columns:

  • SalespersonID (INT): Unique identifier for each salesperson.
  • SalespersonName (VARCHAR): Name of the salesperson.
  • Region (VARCHAR): The region the salesperson operates in.
  • SalesAmount (DECIMAL): The total sales amount for the salesperson.

Your task is to write SQL queries that utilize window functions to:

  1. Rank Salespeople Globally: Calculate the overall rank of each salesperson based on their SalesAmount across all regions. Use RANK to handle ties (salespeople with the same sales amount should have the same rank).
  2. Rank Salespeople by Region: Calculate the rank of each salesperson within their respective Region based on their SalesAmount. Use ROW_NUMBER to assign a unique rank even in the case of ties (breaking ties arbitrarily).
  3. Identify Top 3 Salespeople per Region: Determine the top 3 salespeople in each Region based on SalesAmount. Use ROW_NUMBER to assign a rank within each region and then filter the results.

Expected Behavior:

The queries should return tables with the following columns:

  • For Global Ranking: SalespersonID, SalespersonName, Region, SalesAmount, GlobalRank
  • For Regional Ranking: SalespersonID, SalespersonName, Region, SalesAmount, RegionalRank
  • For Top 3 per Region: SalespersonID, SalespersonName, Region, SalesAmount, RegionalRank

Examples

Example 1:

Input:
Sales Table:
SalespersonID | SalespersonName | Region | SalesAmount
-------------|-----------------|--------|-------------
1            | Alice           | North  | 100000
2            | Bob             | South  | 120000
3            | Charlie         | North  | 90000
4            | David           | East   | 110000
5            | Eve             | West   | 80000
6            | Frank           | South  | 120000

Output (Global Ranking):
SalespersonID | SalespersonName | Region | SalesAmount | GlobalRank
-------------|-----------------|--------|-------------|------------
2            | Bob             | South  | 120000      | 1
6            | Frank           | South  | 120000      | 1
4            | David           | East   | 110000      | 3
1            | Alice           | North  | 100000      | 4
3            | Charlie         | North  | 90000       | 5
5            | Eve             | West   | 80000       | 6

Explanation: Bob and Frank tie for the highest sales amount, so they both receive rank 1.  The remaining salespeople are ranked sequentially.

Example 2:

Input:
Sales Table: (Same as Example 1)

Output (Regional Ranking):
SalespersonID | SalespersonName | Region | SalesAmount | RegionalRank
-------------|-----------------|--------|-------------|--------------
1            | Alice           | North  | 100000      | 1
3            | Charlie         | North  | 90000       | 2
2            | Bob             | South  | 120000      | 1
6            | Frank           | South  | 120000      | 1
4            | David           | East   | 110000      | 1
5            | Eve             | West   | 80000       | 1

Explanation: Within each region, salespeople are ranked by sales amount.  Bob and Frank tie in the South, so they both receive rank 1.

Example 3:

Input:
Sales Table: (Same as Example 1)

Output (Top 3 per Region):
SalespersonID | SalespersonName | Region | SalesAmount | RegionalRank
-------------|-----------------|--------|-------------|--------------
1            | Alice           | North  | 100000      | 1
3            | Charlie         | North  | 90000       | 2
2            | Bob             | South  | 120000      | 1
6            | Frank           | South  | 120000      | 1
4            | David           | East   | 110000      | 1

Explanation:  For each region, only the top 3 salespeople based on sales amount are returned.

Constraints

  • The Sales table will contain at least 10 rows.
  • SalesAmount will be a non-negative decimal number.
  • Region will be a string with a maximum length of 50 characters.
  • The queries should be efficient and perform well on a table with up to 1000 rows.

Notes

  • Remember to use the OVER() clause with the appropriate partitioning and ordering for each window function.
  • Consider the difference between RANK() and ROW_NUMBER() when handling ties. RANK() assigns the same rank to tied values, while ROW_NUMBER() assigns a unique rank to each row.
  • For the "Top 3 per Region" query, you'll need to partition the data by Region within the window function and then filter the results based on the assigned rank.
  • The specific database system (e.g., PostgreSQL, MySQL, SQL Server) is not specified, so the SQL should be standard and portable.
  • Focus on clarity and correctness of the SQL queries. Performance optimization is secondary. Pseudocode for the queries:

Global Ranking:

SELECT
    SalespersonID,
    SalespersonName,
    Region,
    SalesAmount,
    RANK() OVER (ORDER BY SalesAmount DESC) AS GlobalRank
FROM
    Sales;

Regional Ranking:

SELECT
    SalespersonID,
    SalespersonName,
    Region,
    SalesAmount,
    ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SalesAmount DESC) AS RegionalRank
FROM
    Sales;

Top 3 per Region:

SELECT
    SalespersonID,
    SalespersonName,
    Region,
    SalesAmount,
    ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SalesAmount DESC) AS RegionalRank
FROM
    Sales
WHERE
    RegionalRank <= 3;
Loading editor...
plaintext