Hone logo
Hone
Problems

Customer Purchase Funnel Analysis

This challenge requires you to analyze customer behavior through a purchase funnel using advanced SQL features. You will identify how many customers progress from one stage of the funnel to the next, enabling businesses to pinpoint drop-off points and optimize their marketing and sales strategies.

Problem Description

You are tasked with analyzing customer interactions with an e-commerce platform to understand their journey through a defined purchase funnel. The funnel consists of the following stages:

  1. view_product: Customer viewed at least one product page.
  2. add_to_cart: Customer added at least one item to their shopping cart.
  3. initiate_checkout: Customer started the checkout process (e.g., clicked "Proceed to Checkout").
  4. purchase_complete: Customer successfully completed a purchase.

You need to calculate the number of unique customers who reached each stage of the funnel and the conversion rate between consecutive stages.

Key Requirements:

  • Identify unique customers at each stage.
  • Calculate the count of customers for each stage.
  • Calculate the percentage of customers who progressed from stage N to stage N+1.
  • The output should be ordered by the funnel stage.

Expected Behavior:

The output should be a table with columns: funnel_stage and customer_count. Additionally, for stages beyond the first, a conversion_rate column representing the percentage of customers from the previous stage who reached the current stage should be included.

Edge Cases:

  • Customers might perform actions out of order (e.g., add to cart before viewing a product). The funnel should reflect the first time a customer reaches a defined stage.
  • A customer might perform an action multiple times. We only care about whether they reached the stage at least once.
  • Some customers might only reach certain stages and not proceed further.

Examples

Example 1:

Input Data (represented as event logs):

customer_events table:

customer_idevent_typeevent_timestamp
101view_product2023-10-26 10:00:00
102view_product2023-10-26 10:05:00
101add_to_cart2023-10-26 10:15:00
103view_product2023-10-26 10:20:00
102add_to_cart2023-10-26 10:25:00
101initiate_checkout2023-10-26 10:30:00
102initiate_checkout2023-10-26 10:35:00
103add_to_cart2023-10-26 10:40:00
101purchase_complete2023-10-26 10:45:00
102purchase_complete2023-10-26 10:50:00

Output:

funnel_stagecustomer_countconversion_rate
view_product3NULL
add_to_cart3100.00
initiate_checkout266.67
purchase_complete2100.00

Explanation:

  • view_product: Customers 101, 102, and 103 viewed products. (Count = 3)
  • add_to_cart: Customers 101, 102, and 103 added to cart. (Count = 3). Conversion rate: (3/3) * 100 = 100.00%
  • initiate_checkout: Customers 101 and 102 initiated checkout. (Count = 2). Conversion rate: (2/3) * 100 = 66.67%
  • purchase_complete: Customers 101 and 102 completed purchase. (Count = 2). Conversion rate: (2/2) * 100 = 100.00%

Example 2:

Input Data:

customer_events table:

customer_idevent_typeevent_timestamp
201view_product2023-11-01 09:00:00
202view_product2023-11-01 09:05:00
201add_to_cart2023-11-01 09:10:00
203view_product2023-11-01 09:15:00
201purchase_complete2023-11-01 09:20:00
202initiate_checkout2023-11-01 09:25:00
203add_to_cart2023-11-01 09:30:00
203initiate_checkout2023-11-01 09:35:00

Output:

funnel_stagecustomer_countconversion_rate
view_product3NULL
add_to_cart266.67
initiate_checkout2100.00
purchase_complete150.00

Explanation:

  • view_product: Customers 201, 202, and 203. (Count = 3)
  • add_to_cart: Customers 201 and 203. (Count = 2). Conversion rate: (2/3) * 100 = 66.67%
  • initiate_checkout: Customers 202 and 203. (Count = 2). Conversion rate: (2/2) * 100 = 100.00%
  • purchase_complete: Customer 201. (Count = 1). Conversion rate: (1/2) * 100 = 50.00%

Example 3: Handling Out-of-Order and Repeated Actions

Input Data:

customer_events table:

customer_idevent_typeevent_timestamp
301add_to_cart2023-11-05 11:00:00
301view_product2023-11-05 11:05:00
301view_product2023-11-05 11:10:00
302initiate_checkout2023-11-05 11:15:00
302view_product2023-11-05 11:20:00
302add_to_cart2023-11-05 11:25:00
301initiate_checkout2023-11-05 11:30:00
301add_to_cart2023-11-05 11:35:00
302purchase_complete2023-11-05 11:40:00

Output:

funnel_stagecustomer_countconversion_rate
view_product2NULL
add_to_cart2100.00
initiate_checkout2100.00
purchase_complete150.00

Explanation:

  • view_product: Customers 301 and 302 viewed products at least once. (Count = 2)
  • add_to_cart: Customers 301 and 302 added to cart at least once. (Count = 2). Conversion rate: (2/2) * 100 = 100.00%
  • initiate_checkout: Customers 301 and 302 initiated checkout at least once. (Count = 2). Conversion rate: (2/2) * 100 = 100.00%
  • purchase_complete: Customer 302 completed purchase. (Count = 1). Conversion rate: (1/2) * 100 = 50.00%

Constraints

  • customer_events table contains at least 1 record and at most 1,000,000,000 records.
  • customer_id is an integer.
  • event_type is a string, one of 'view_product', 'add_to_cart', 'initiate_checkout', 'purchase_complete'.
  • event_timestamp is a datetime/timestamp data type.
  • The solution should be efficient and complete within a reasonable execution time, assuming optimal indexing.

Notes

  • The order of event_timestamp for a given customer is crucial to determine when they first reached a stage.
  • Consider using window functions (e.g., ROW_NUMBER, RANK) to identify the first occurrence of each event type per customer.
  • You will need to pivot or conditionally aggregate your results to get the desired funnel stage counts.
  • Calculating conversion rates requires careful handling of division and potential division-by-zero scenarios (though not explicitly present in the examples, consider it for robustness).
  • The NULL value for the first stage's conversion rate is intentional.
Loading editor...
plaintext