Finding Customers Who Placed Orders Above Average
This challenge focuses on using subqueries within the WHERE clause of a SQL statement. You'll be tasked with identifying customers who have placed orders with a total value greater than the average order value across all customers. This is a common scenario for identifying high-value customers or analyzing purchasing trends.
Problem Description
Your goal is to write a SQL query that returns the names of all customers who have placed at least one order whose total amount is strictly greater than the average total amount of all orders placed in the system.
Key Requirements:
- You need to calculate the average of all order totals first.
- Then, you need to identify individual orders that exceed this calculated average.
- Finally, you need to retrieve the names of the customers associated with these high-value orders.
- A customer should appear only once in the result, even if they have multiple orders exceeding the average.
Expected Behavior: The query should return a list of unique customer names.
Edge Cases:
- What if there are no orders in the system? (Assume for this challenge there will always be at least one order).
- What if all orders have the exact same total value? (In this case, no orders will be strictly greater than the average, so the result should be empty).
Examples
Example 1:
Input Data:
Customers Table:
| customer_id | customer_name |
|---|---|
| 1 | Alice |
| 2 | Bob |
| 3 | Charlie |
Orders Table:
| order_id | customer_id | order_total |
|---|---|---|
| 101 | 1 | 50.00 |
| 102 | 2 | 150.00 |
| 103 | 1 | 75.00 |
| 104 | 3 | 120.00 |
| 105 | 2 | 200.00 |
Output:
[
"Bob",
"Charlie"
]
Explanation:
- Calculate the average order total: (50 + 150 + 75 + 120 + 200) / 5 = 600 / 5 = 120.00.
- Identify orders greater than 120.00: Order 105 (200.00) and Order 104 (120.00 is not strictly greater). So only Order 105. Wait, let's re-evaluate. The average is 120. Order 102 (150) and Order 105 (200) are greater than 120.
- Customers associated with these orders:
- Order 102 is by customer_id 2 (Bob).
- Order 105 is by customer_id 2 (Bob).
- Order 104 is by customer_id 3 (Charlie). (Correction: Order 104 total is 120, not strictly greater than 120 average. So not included).
Let's re-calculate the average: (50 + 150 + 75 + 120 + 200) / 5 = 600 / 5 = 120.00. Orders strictly greater than 120.00 are:
- Order 102 (150.00) - customer_id 2 (Bob)
- Order 105 (200.00) - customer_id 2 (Bob)
Customers associated: Bob (customer_id 2).
Ah, I made an error in the initial explanation. Let's correct the example to be more precise.
Revised Example 1:
Input Data:
Customers Table:
| customer_id | customer_name |
|---|---|
| 1 | Alice |
| 2 | Bob |
| 3 | Charlie |
Orders Table:
| order_id | customer_id | order_total |
|---|---|---|
| 101 | 1 | 50.00 |
| 102 | 2 | 150.00 |
| 103 | 1 | 75.00 |
| 104 | 3 | 120.00 |
| 105 | 2 | 200.00 |
Output:
[
"Bob"
]
Explanation:
- Calculate the average order total: (50 + 150 + 75 + 120 + 200) / 5 = 600 / 5 = 120.00.
- Identify orders with
order_totalstrictly greater than 120.00:- Order 102 (150.00)
- Order 105 (200.00)
- Find the
customer_ids for these orders:- Order 102: customer_id 2
- Order 105: customer_id 2
- Retrieve the
customer_names for thesecustomer_ids from the Customers table.- customer_id 2 corresponds to "Bob".
- The unique customer name is "Bob".
Example 2:
Input Data:
Customers Table:
| customer_id | customer_name |
|---|---|
| 10 | David |
| 20 | Eve |
Orders Table:
| order_id | customer_id | order_total |
|---|---|---|
| 201 | 10 | 100.00 |
| 202 | 10 | 100.00 |
| 203 | 20 | 100.00 |
Output:
[]
Explanation:
- Calculate the average order total: (100 + 100 + 100) / 3 = 300 / 3 = 100.00.
- There are no orders with an
order_totalstrictly greater than 100.00. Therefore, the result is an empty list.
Constraints
- The
Customerstable has at least one row. - The
Orderstable has at least one row. customer_idinOrdersis a foreign key referencingcustomer_idinCustomers.order_totalis a numeric type (e.g., DECIMAL, FLOAT).- The number of customers is between 1 and 1000.
- The number of orders is between 1 and 10000.
Notes
- Your solution should involve a subquery within the
WHEREclause. - Consider how to efficiently calculate the average without repeating the calculation for every row.
- Ensure that duplicate customer names are not included in the final output.
- Pay attention to the strict inequality (
>) when comparing order totals to the average.