Hone logo
Hone
Problems

Optimizing E-commerce Product Search

In e-commerce, efficient product searching is crucial for user experience and business performance. Slow search queries can lead to lost sales and frustrated customers. This challenge focuses on optimizing a common e-commerce search scenario by applying SQL query optimization techniques.

Problem Description

You are tasked with retrieving a list of the top 10 most popular products within a specific brand and category, based on the total quantity sold. You are given two tables: Products and OrderItems. The Products table contains information about each product, including its product_id, product_name, brand, and category. The OrderItems table records individual items sold in orders, containing order_item_id, product_id, and quantity.

Your goal is to write a SQL query that efficiently returns the product_id, product_name, and the total_quantity_sold for the top 10 most popular products matching a given brand_name and category_name.

Key Requirements:

  • Filter products by a specific brand_name.
  • Filter products by a specific category_name.
  • Calculate the total quantity sold for each product.
  • Order the results by total_quantity_sold in descending order.
  • Limit the output to the top 10 products.
  • The query should be performant, even with large datasets.

Expected Behavior: The query should return a table with three columns: product_id, product_name, and total_quantity_sold.

Edge Cases:

  • What if a brand or category has fewer than 10 products sold? The query should return all available products in descending order of sales.
  • What if no products match the given brand or category? The query should return an empty result set.

Examples

Example 1:

Input Tables:

Products:
| product_id | product_name | brand    | category  |
|------------|--------------|----------|-----------|
| 101        | T-Shirt      | Acme     | Apparel   |
| 102        | Jeans        | Acme     | Apparel   |
| 103        | Hoodie       | Acme     | Apparel   |
| 104        | Sneakers     | Nike     | Footwear  |
| 105        | Sandals      | Adidas   | Footwear  |
| 106        | Running Shoes| Nike     | Footwear  |

OrderItems:
| order_item_id | product_id | quantity |
|---------------|------------|----------|
| 1             | 101        | 5        |
| 2             | 101        | 3        |
| 3             | 102        | 2        |
| 4             | 103        | 8        |
| 5             | 104        | 10       |
| 6             | 106        | 6        |
| 7             | 106        | 4        |
| 8             | 101        | 2        |

Target Brand: "Acme"
Target Category: "Apparel"

Output:
| product_id | product_name | total_quantity_sold |
|------------|--------------|---------------------|
| 103        | Hoodie       | 8                   |
| 101        | T-Shirt      | 10                  |
| 102        | Jeans        | 2                   |

Explanation:
Products from "Acme" in "Apparel" are T-Shirt (101), Jeans (102), and Hoodie (103).
- T-Shirt (101): Total quantity sold = 5 + 3 + 2 = 10
- Jeans (102): Total quantity sold = 2
- Hoodie (103): Total quantity sold = 8
Sorted by total quantity sold descending: Hoodie (8), T-Shirt (10), Jeans (2).
The output shows the top 3 as there are only 3 products matching the criteria.

Example 2:

Input Tables:

Products:
| product_id | product_name | brand    | category  |
|------------|--------------|----------|-----------|
| 201        | Laptop       | TechCo   | Electronics|
| 202        | Mouse        | TechCo   | Electronics|
| 203        | Keyboard     | TechCo   | Electronics|
| 204        | Monitor      | ViewSys  | Electronics|
| 205        | Desk Lamp    | HomeArt  | Home      |

OrderItems:
| order_item_id | product_id | quantity |
|---------------|------------|----------|
| 11            | 201        | 7        |
| 12            | 201        | 5        |
| 13            | 202        | 15       |
| 14            | 203        | 10       |
| 15            | 204        | 3        |
| 16            | 201        | 2        |
| 17            | 202        | 8        |

Target Brand: "TechCo"
Target Category: "Electronics"

Output:
| product_id | product_name | total_quantity_sold |
|------------|--------------|---------------------|
| 202        | Mouse        | 23                  |
| 201        | Laptop       | 14                  |
| 203        | Keyboard     | 10                  |

Explanation:
Products from "TechCo" in "Electronics" are Laptop (201), Mouse (202), and Keyboard (203).
- Laptop (201): Total quantity sold = 7 + 5 + 2 = 14
- Mouse (202): Total quantity sold = 15 + 8 = 23
- Keyboard (203): Total quantity sold = 10
Sorted by total quantity sold descending: Mouse (23), Laptop (14), Keyboard (10).
The output shows the top 3 as there are only 3 products matching the criteria.

Example 3: (Edge Case - No Matching Products)

Input Tables:

Products:
| product_id | product_name | brand    | category  |
|------------|--------------|----------|-----------|
| 301        | Book         | PubCo    | Books     |

OrderItems:
| order_item_id | product_id | quantity |
|---------------|------------|----------|
| 21            | 301        | 100      |

Target Brand: "NonExistentBrand"
Target Category: "SomeCategory"

Output:
(Empty result set)

Explanation:
No products in the `Products` table match the specified brand and category. Therefore, the query returns an empty result set.

Constraints

  • The Products table can contain up to 1 million rows.
  • The OrderItems table can contain up to 10 million rows.
  • product_id is an integer and is unique in the Products table.
  • quantity is an integer.
  • The query should execute within 5 seconds on a standard database system with appropriate indexing.
  • Input brand_name and category_name will be non-empty strings.

Notes

  • Consider how you will join the tables.
  • Think about how to efficiently aggregate the quantities.
  • What indexes would be beneficial for this query?
  • A subquery or a Common Table Expression (CTE) might be useful for structuring your solution.
  • Focus on writing a query that is not only correct but also efficient.
Loading editor...
plaintext