Hone logo
Hone
Problems

First and Last Stand: Tracking Product Lifecycles

Imagine you're working with a dataset of product sales over time. To understand product performance, you need to know the very first and last recorded sale for each product. This information can be crucial for determining a product's initial adoption and its eventual market presence or discontinuation. This challenge will test your ability to use SQL window functions to extract this valuable lifecycle data.

Problem Description

Your task is to write an SQL query that, for each unique product, returns:

  1. The product_id.
  2. The sale_date of the first sale recorded for that product.
  3. The sale_date of the last sale recorded for that product.

You should leverage the FIRST_VALUE and LAST_VALUE window functions to achieve this. The results should be ordered by product_id for consistent output.

Key Requirements:

  • Identify the earliest and latest sale date per product_id.
  • Use FIRST_VALUE and LAST_VALUE window functions.
  • The query should return a result set with three columns: product_id, first_sale_date, and last_sale_date.

Expected Behavior:

For each distinct product_id in the input table, the query should produce a single row containing that product_id along with its absolute first and last sale dates.

Edge Cases:

  • Products with only a single sale should have the same date for both first_sale_date and last_sale_date.
  • The input data may not be pre-sorted by date.

Examples

Example 1:

Input Table: Sales

sale_idproduct_idsale_date
11012023-01-15
21022023-02-20
31012023-01-20
41032023-03-01
51012023-01-10
61022023-02-25

Output:

product_idfirst_sale_datelast_sale_date
1012023-01-102023-01-20
1022023-02-202023-02-25
1032023-03-012023-03-01

Explanation: For product_id 101, the earliest sale date is 2023-01-10 and the latest is 2023-01-20. For product_id 102, the earliest sale date is 2023-02-20 and the latest is 2023-02-25. For product_id 103, there's only one sale on 2023-03-01, so both first and last sale dates are the same.

Example 2:

Input Table: Sales

sale_idproduct_idsale_date
12012022-11-01
22022023-01-05
32012022-10-15

Output:

product_idfirst_sale_datelast_sale_date
2012022-10-152022-11-01
2022023-01-052023-01-05

Explanation: For product_id 201, the earliest sale date is 2022-10-15 and the latest is 2022-11-01. For product_id 202, there's only one sale on 2023-01-05.

Constraints

  • The Sales table will contain at least one row.
  • product_id will be an integer.
  • sale_date will be a date type.
  • The Sales table will have at least one product_id with multiple entries.
  • The query should execute efficiently and not time out on reasonably sized datasets (e.g., up to 1 million rows).

Notes

  • Consider the PARTITION BY clause within your window function definitions to correctly group sales by product.
  • The ORDER BY clause within the window function's OVER() clause is critical for determining what constitutes the "first" and "last" value.
  • You will need to perform a GROUP BY on product_id in your final selection, or use a subquery/CTE to aggregate the results after applying the window functions.
Loading editor...
plaintext