Sales Analysis III
This challenge asks you to analyze sales data to identify products that have achieved consistent sales across multiple years. Identifying such products is crucial for inventory management, marketing strategies, and understanding long-term product performance.
Problem Description
You are given two tables: Products and Sales.
The Products table contains information about each product, including its product_id and product_name.
The Sales table records individual sales transactions, including sale_id, product_id, and sale_date.
Your task is to find the product_id and product_name of all products that have been sold in every year present in the Sales table.
Key Requirements:
- You need to consider all unique years present in the
Salestable. - A product is considered sold in a year if there is at least one sale record for that product in that year.
- The output should list the
product_idandproduct_name.
Edge Cases:
- What if a product has no sales at all? (It should not be included in the output).
- What if there are products without sales in certain years, but sales in others? (These should be excluded if they weren't sold in every year).
- What if the
Salestable is empty? (The output should be empty).
Examples
Example 1:
Input:
Products table:
| product_id | product_name |
|---|---|
| 0 | LC |
| 1 | Apple |
| 2 | Orange |
| 3 | Banana |
Sales table:
| sale_id | product_id | sale_date |
|---|---|---|
| 1 | 0 | 2018-01-01 |
| 2 | 1 | 2019-01-01 |
| 3 | 0 | 2019-01-01 |
| 4 | 0 | 2020-01-01 |
| 5 | 1 | 2020-01-01 |
| 6 | 1 | 2021-01-01 |
| 7 | 2 | 2019-01-01 |
| 8 | 2 | 2020-01-01 |
| 9 | 2 | 2021-01-01 |
Output:
| product_id | product_name |
|---|---|
| 0 | LC |
| 2 | Orange |
Explanation:
The unique years present in the Sales table are 2018, 2019, 2020, and 2021.
- Product 'LC' (id 0) was sold in 2018, 2019, and 2020. It was NOT sold in 2021. Therefore, it's excluded.
- Product 'Apple' (id 1) was sold in 2019, 2020, and 2021. It was NOT sold in 2018. Therefore, it's excluded.
- Product 'Orange' (id 2) was sold in 2019, 2020, and 2021. It was NOT sold in 2018. Therefore, it's excluded.
- Product 'Banana' (id 3) has no sales. Therefore, it's excluded.
Wait, let's re-evaluate based on the requirement of being sold in every year present. The years are 2018, 2019, 2020, 2021.
- Product 'LC' (id 0): Sold in 2018, 2019, 2020. Missing 2021. Excluded.
- Product 'Apple' (id 1): Sold in 2019, 2020, 2021. Missing 2018. Excluded.
- Product 'Orange' (id 2): Sold in 2019, 2020, 2021. Missing 2018. Excluded.
- Product 'Banana' (id 3): No sales. Excluded.
Let's adjust the example to better illustrate the desired outcome.
Example 1 (Revised):
Input:
Products table:
| product_id | product_name |
|---|---|
| 0 | LC |
| 1 | Apple |
| 2 | Orange |
| 3 | Banana |
Sales table:
| sale_id | product_id | sale_date |
|---|---|---|
| 1 | 0 | 2018-01-01 |
| 2 | 1 | 2019-01-01 |
| 3 | 0 | 2019-01-01 |
| 4 | 0 | 2020-01-01 |
| 5 | 1 | 2020-01-01 |
| 6 | 1 | 2021-01-01 |
| 7 | 2 | 2019-01-01 |
| 8 | 2 | 2020-01-01 |
| 9 | 2 | 2021-01-01 |
| 10 | 0 | 2021-01-01 |
Output:
| product_id | product_name |
|---|---|
| 0 | LC |
Explanation:
The unique years present in the Sales table are 2018, 2019, 2020, and 2021.
- Product 'LC' (id 0) was sold in 2018, 2019, 2020, and 2021. This product has been sold in every year.
- Product 'Apple' (id 1) was sold in 2019, 2020, and 2021. It was NOT sold in 2018. Therefore, it's excluded.
- Product 'Orange' (id 2) was sold in 2019, 2020, and 2021. It was NOT sold in 2018. Therefore, it's excluded.
- Product 'Banana' (id 3) has no sales. Therefore, it's excluded.
Example 2:
Input:
Products table:
| product_id | product_name |
|---|---|
| 101 | Keyboard |
| 102 | Mouse |
Sales table:
| sale_id | product_id | sale_date |
|---|---|---|
| 1001 | 101 | 2022-03-15 |
| 1002 | 102 | 2022-04-20 |
| 1003 | 101 | 2023-01-10 |
| 1004 | 102 | 2023-02-05 |
| 1005 | 101 | 2024-11-30 |
| 1006 | 102 | 2024-12-01 |
Output:
| product_id | product_name |
|---|---|
| 101 | Keyboard |
| 102 | Mouse |
Explanation: The unique years present are 2022, 2023, and 2024. Both 'Keyboard' (id 101) and 'Mouse' (id 102) have sales records in all three years.
Example 3: (Edge Case - No sales in any year)
Input:
Products table:
| product_id | product_name |
|---|---|
| 500 | Monitor |
| 501 | Webcam |
Sales table: (empty)
Output: (empty table)
Explanation:
Since the Sales table is empty, there are no years to consider, and therefore no products can be said to have been sold in "every" year.
Constraints
product_idis a unique integer for each product.product_nameis a string.sale_idis a unique integer for each sale.sale_dateis a date in the format 'YYYY-MM-DD'.- The number of products can be up to 1000.
- The number of sales can be up to 1,000,000.
- The execution time for your solution should be within 5 seconds.
Notes
- You will need to extract the year from the
sale_date. - Consider how to efficiently count or check for sales across all unique years.
- Think about grouping and aggregation techniques.
- The final output should be ordered by
product_id.