Hone logo
Hone
Problems

Identifying Unique Customers in Sales Transactions

You are tasked with analyzing sales data to understand your customer base. A common challenge is to identify how many unique customers have made purchases, rather than just counting the total number of transactions. This helps in understanding customer reach and avoids overcounting repeat customers.

Problem Description

You are given a table of sales transactions. Each row in this table represents a single sale and includes information such as the transaction_id, the customer_id who made the purchase, and the amount of the sale. Your goal is to write a query that returns a list of all the unique customer_ids present in the sales table. This means that if a customer has made multiple purchases, their customer_id should only appear once in the result.

Key Requirements:

  • Retrieve a list of unique customer identifiers.
  • Ensure that each customer_id appears only once in the output.

Expected Behavior: The output should be a single column containing customer_ids. If the input table contains duplicate customer_ids, only one instance of each customer_id should be returned.

Edge Cases:

  • An empty sales table should result in an empty output.

Examples

Example 1:

Input sales table:

transaction_idcustomer_idamount
101C00150.00
102C00275.00
103C00125.00
104C003100.00
105C00230.00

Output:

customer_id
C001
C002
C003

Explanation: Customer C001 and C002 appear multiple times in the input table. The output correctly lists each unique customer only once.

Example 2:

Input sales table:

transaction_idcustomer_idamount
201A12310.00
202B45620.00
203A12315.00

Output:

customer_id
A123
B456

Explanation: Customer A123 made two transactions, but is listed only once in the output.

Example 3:

Input sales table:

transaction_idcustomer_idamount

Output:

customer_id

Explanation: An empty input table results in an empty output.

Constraints

  • The sales table will contain at least one column named customer_id.
  • The customer_id column will store string or integer values representing customer identifiers.
  • The number of rows in the sales table can range from 0 to 1,000,000.
  • The query should execute efficiently, aiming for a time complexity that scales well with the number of rows.

Notes

Think about SQL keywords that are specifically designed to eliminate duplicate rows from a result set. This keyword is typically used immediately after the SELECT statement.

Loading editor...
plaintext