Delete Duplicate Emails

You are tasked with cleaning up a dataset of email addresses. In many real-world scenarios, data can become duplicated, leading to inefficiencies and incorrect analysis. This challenge requires you to process a list of email addresses and remove any duplicates, ensuring each unique email appears only once.

Problem Description

Given a list of email addresses, you need to return a new list containing only the unique email addresses from the original list. The order of the unique emails in the output list does not matter.

Key Requirements:

  • The function should accept a list of strings, where each string represents an email address.
  • The function should return a list of strings, containing only the unique email addresses from the input.
  • Case sensitivity for emails should be ignored (e.g., "Test@example.com" and "test@example.com" are considered duplicates).
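One way to read the case-insensitivity requirement is to compare a normalized form of each address. The sketch below assumes that lowercasing the whole string is sufficient normalization, which matches the lowercase output in Example 2; the helper names normalize_email and same_email are hypothetical and not part of the problem statement.

```python
def normalize_email(email: str) -> str:
    """Return the form used for comparison: the whole address lowercased (assumption)."""
    return email.lower()


def same_email(a: str, b: str) -> bool:
    """Two addresses count as duplicates when their normalized forms match."""
    return normalize_email(a) == normalize_email(b)


print(same_email("Test@example.com", "test@example.com"))   # True
print(same_email("test@example.com", "other@example.com"))  # False
```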

Expected Behavior:

  • If the input list is empty, an empty list should be returned.
  • If the input list contains no duplicates, the output should contain the same emails as the input, although the order may differ and, as in Example 2, each email may be returned in its lowercase form.
  • If an email appears multiple times, only one instance of that email should be present in the output.

Edge Cases to Consider:

  • Empty input list.
  • List with all identical emails.
  • List with mixed case duplicates.
  • List with emails containing special characters. This problem assumes standard email formats and focuses only on duplicate detection.
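The edge cases above can be written down as quick checks. This is a minimal sketch, not a reference solution: unique_emails is a hypothetical name, and it assumes that lowercased addresses are an acceptable output form, as Example 2 suggests. It relies on dict.fromkeys, which keeps one key per distinct value in first-seen order.

```python
def unique_emails(emails: list[str]) -> list[str]:
    # One dictionary key per distinct lowercased address, in first-seen order.
    return list(dict.fromkeys(email.lower() for email in emails))


# Empty input list.
assert unique_emails([]) == []

# List with all identical emails.
assert unique_emails(["a@x.com", "a@x.com", "a@x.com"]) == ["a@x.com"]

# List with mixed-case duplicates.
assert unique_emails(["Test@domain.com", "test@domain.com"]) == ["test@domain.com"]

# Special characters are compared like any other character.
assert unique_emails(
    ["first.last+tag@example.com", "FIRST.LAST+TAG@example.com"]
) == ["first.last+tag@example.com"]
```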

Examples

Example 1:

Input: ["john@example.com", "jane@example.com", "john@example.com", "peter@example.com", "jane@example.com"]
Output: ["john@example.com", "jane@example.com", "peter@example.com"]
Explanation: "john@example.com" and "jane@example.com" each appear twice. After removing duplicates, three unique email addresses remain. The order of the output can vary.

Example 2:

Input: ["test@domain.com", "Test@domain.com", "ANOTHER@domain.com"]
Output: ["test@domain.com", "another@domain.com"]
Explanation: "test@domain.com" and "Test@domain.com" are considered duplicates due to case insensitivity. "ANOTHER@domain.com" is unique, but its lowercase version is returned.

Example 3:

Input: []
Output: []
Explanation: An empty input list results in an empty output list.

Constraints

  • The input list will contain between 0 and 1000 email addresses.
  • Each email address will be a string with a length between 5 and 255 characters.
  • Email addresses will follow a generally valid format; the problem is about duplicate removal, not strict email validation.
  • The solution should be efficient: aim for a running time close to linear in the number of emails rather than comparing every pair of addresses.

Notes

  • Consider how you can efficiently keep track of emails you have already encountered.
  • Remember to handle case insensitivity correctly. You might want to normalize the emails before comparison or storage.
  • The order of emails in the output list is not specified, so you have flexibility in how you present the unique emails.
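The hints above can be sketched as a single pass that tracks already-seen addresses in a set. This is one possible approach under stated assumptions, not the official solution: the function name remove_duplicate_emails is hypothetical, normalization is assumed to be plain lowercasing, and the output keeps first-seen order even though the problem does not require any particular order.

```python
def remove_duplicate_emails(emails: list[str]) -> list[str]:
    seen: set[str] = set()   # normalized addresses encountered so far
    unique: list[str] = []   # result, in first-seen order

    for email in emails:
        key = email.lower()  # assumed normalization: lowercase the whole address
        if key not in seen:  # average O(1) membership test on a set
            seen.add(key)
            unique.append(key)

    return unique


print(remove_duplicate_emails(
    ["test@domain.com", "Test@domain.com", "ANOTHER@domain.com"]
))
# ['test@domain.com', 'another@domain.com']
```

Because each address is lowercased once and checked against a set, the work grows roughly linearly with the number of emails, which is comfortable for the stated limit of 1000 addresses.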