Hone logo
Hone
Problems

Customer Order Segmentation

This challenge focuses on segmenting customers based on their purchasing behavior. You'll need to identify groups of customers who meet specific criteria related to the number of orders they have placed. This is a common task in business analytics, allowing for targeted marketing campaigns or service level adjustments.

Problem Description

Your task is to write a SQL query that retrieves a list of customer IDs and their corresponding total number of orders, but only for those customers who have placed more than a specified minimum number of orders.

Key Requirements:

  1. Group by Customer: You need to aggregate order data for each individual customer.
  2. Count Orders: For each customer, you must count the total number of orders they have placed.
  3. Filter Groups: You must filter these aggregated results to include only those customers whose order count exceeds a given threshold.

Expected Behavior:

The output should be a table with two columns: customer_id and order_count. Each row should represent a customer who meets the filtering criteria.

Edge Cases:

  • No Customers Meet Criteria: If no customers have placed more orders than the specified threshold, the output table should be empty.
  • Customers with Zero Orders: While not explicitly in the input schema, if a customer could exist without any orders, they should not appear in the result as their order_count would be 0, and thus less than or equal to any positive threshold.

Examples

Example 1:

Input: orders table:

order_idcustomer_idorder_date
101C10012023-01-15
102C10022023-01-16
103C10012023-02-20
104C10032023-03-10
105C10022023-03-11
106C10012023-04-05
107C10042023-04-10

minimum_orders_threshold = 1

Output:

customer_idorder_count
C10013
C10022
C10031
C10041

Explanation: Customers are grouped by customer_id.

  • C1001 has 3 orders.
  • C1002 has 2 orders.
  • C1003 has 1 order.
  • C1004 has 1 order. The minimum_orders_threshold is 1. Customers with an order_count strictly greater than 1 are filtered. Therefore, C1001 (3 orders) and C1002 (2 orders) are included. C1003 and C1004 have exactly 1 order, which is not greater than the threshold.

Example 2:

Input: orders table:

order_idcustomer_idorder_date
101C10012023-01-15
102C10022023-01-16
103C10012023-02-20
104C10032023-03-10
105C10022023-03-11
106C10012023-04-05
107C10042023-04-10

minimum_orders_threshold = 2

Output:

customer_idorder_count
C10013

Explanation: Similar to Example 1, we group and count orders per customer.

  • C1001 has 3 orders.
  • C1002 has 2 orders.
  • C1003 has 1 order.
  • C1004 has 1 order. The minimum_orders_threshold is 2. Only customers with an order_count strictly greater than 2 are included. Therefore, only C1001 (3 orders) is selected.

Example 3: (Edge Case: No customers meet criteria)

Input: orders table:

order_idcustomer_idorder_date
101C10012023-01-15
102C10022023-01-16

minimum_orders_threshold = 2

Output:

customer_idorder_count

Explanation:

  • C1001 has 1 order.
  • C1002 has 1 order. The minimum_orders_threshold is 2. No customer has more than 2 orders. Thus, the result is an empty set.

Constraints

  • The orders table will contain at least one row.
  • customer_id will be a string.
  • order_id will be an integer.
  • The minimum_orders_threshold will be a non-negative integer.
  • The query should be efficient and execute within typical database time limits for moderate-sized datasets.

Notes

This challenge is designed to test your understanding of aggregate functions (like COUNT) and how to filter results after aggregation. Think about which SQL clause is used to filter groups based on aggregated values. Pseudocode for the orders table schema might look like this:

TABLE orders (
    order_id INT,
    customer_id VARCHAR,
    order_date DATE
);
Loading editor...
plaintext