Upserting Customer Data with MERGE
Your task is to implement an "upsert" operation for customer data in a database. Upserting means inserting a new record if it doesn't exist, or updating an existing record if it does. This is a common operation when synchronizing data from different sources or when processing periodic updates. You will use the MERGE statement, a powerful SQL construct that handles this logic concisely and efficiently.
Problem Description
You are given two tables: Customers (the target table) and NewCustomerData (the source table containing incoming customer information). You need to update the Customers table based on the data in NewCustomerData.
Requirements:
- Insert New Customers: If a customer from
NewCustomerDatadoes not exist in theCustomerstable (identified byCustomerID), insert the customer's details into theCustomerstable. - Update Existing Customers: If a customer from
NewCustomerDataalready exists in theCustomerstable (identified byCustomerID), update theirFirstName,LastName, andEmailwith the values fromNewCustomerData. - Handle No-Op: If a customer exists in
Customersand their details inNewCustomerDataare identical, no update should occur. TheMERGEstatement should handle this implicitly.
Expected Behavior:
After executing the MERGE operation, the Customers table should reflect the combined state of the initial Customers table and the NewCustomerData, with new customers added and existing ones updated as per the rules.
Edge Cases:
- The
NewCustomerDatatable might be empty. NewCustomerDatamight contain duplicates of the sameCustomerID. TheMERGEstatement should typically handle the first match found or have defined behavior based on the specific SQL dialect. For this challenge, assume standard behavior where it processes based on the first match.CustomerIDis the primary key inCustomersand should be unique.
Examples
Example 1:
Initial Customers table:
CustomerID | FirstName | LastName | Email
-----------|-----------|----------|--------------------
1 | Alice | Smith | alice.smith@email.com
2 | Bob | Johnson | bob.johnson@email.com
NewCustomerData table:
CustomerID | FirstName | LastName | Email
-----------|-----------|----------|--------------------
2 | Robert | Johnson | r.johnson@email.com
3 | Charlie | Brown | charlie.b@email.com
Output Customers table after MERGE:
CustomerID | FirstName | LastName | Email
-----------|-----------|----------|--------------------
1 | Alice | Smith | alice.smith@email.com
2 | Robert | Johnson | r.johnson@email.com
3 | Charlie | Brown | charlie.b@email.com
Explanation:
- CustomerID
1remains unchanged as it's not inNewCustomerData. - CustomerID
2exists in both tables.FirstName,LastName, andEmailare updated fromNewCustomerData. - CustomerID
3exists inNewCustomerDatabut not inCustomers, so it is inserted.
Example 2:
Initial Customers table:
CustomerID | FirstName | LastName | Email
-----------|-----------|----------|--------------------
10 | David | Lee | david.lee@email.com
NewCustomerData table:
CustomerID | FirstName | LastName | Email
-----------|-----------|----------|--------------------
10 | David | Lee | david.lee@email.com
Output Customers table after MERGE:
CustomerID | FirstName | LastName | Email
-----------|-----------|----------|--------------------
10 | David | Lee | david.lee@email.com
Explanation:
- CustomerID
10exists in both tables. The data inNewCustomerDatais identical to the existing record inCustomers. No update or insert occurs.
Example 3: (Empty NewCustomerData)
Initial Customers table:
CustomerID | FirstName | LastName | Email
-----------|-----------|----------|--------------------
5 | Eve | Adams | eve.adams@email.com
NewCustomerData table:
(empty)
Output Customers table after MERGE:
CustomerID | FirstName | LastName | Email
-----------|-----------|----------|--------------------
5 | Eve | Adams | eve.adams@email.com
Explanation:
- Since
NewCustomerDatais empty, no operations are performed. TheCustomerstable remains unchanged.
Constraints
CustomerIDis an integer and is the unique identifier for customers.FirstName,LastName, andEmailare strings.- The maximum number of rows in
Customersis 1,000,000. - The maximum number of rows in
NewCustomerDatais 100,000. - The
MERGEoperation should complete within 5 seconds.
Notes
The MERGE statement typically has the following structure:
MERGE INTO target_table AS T
USING source_table AS S
ON T.join_column = S.join_column
WHEN MATCHED THEN
UPDATE SET T.column1 = S.column1, T.column2 = S.column2, ...
WHEN NOT MATCHED THEN
INSERT (column1, column2, ...)
VALUES (S.column1, S.column2, ...);
You will need to adapt this structure to fit the Customers and NewCustomerData tables, using CustomerID as the join column. Pay attention to the WHEN MATCHED and WHEN NOT MATCHED clauses.