Delete Duplicate Emails
You are tasked with cleaning up a dataset of email addresses. In many real-world scenarios, data can become duplicated, leading to inefficiencies and incorrect analysis. This challenge requires you to process a list of email addresses and remove any duplicates, ensuring each unique email appears only once.
Problem Description
Given a list of email addresses, you need to return a new list containing only the unique email addresses from the original list. The order of the unique emails in the output list does not matter.
Key Requirements:
- The function should accept a list of strings, where each string represents an email address.
- The function should return a list of strings containing each unique email address from the input exactly once, in lowercase (see Example 2).
- Emails are compared case-insensitively (e.g., "Test@example.com" and "test@example.com" are considered duplicates).
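The requirements above can be sketched in a few lines of Python. This is only one possible shape: the function name `remove_duplicate_emails` is an assumption made for illustration, and returning lowercased addresses follows Example 2 rather than an explicit rule.

```python
def remove_duplicate_emails(emails: list[str]) -> list[str]:
    """Return the unique emails, compared case-insensitively.

    The name is hypothetical. Results are lowercased and come back
    in no particular order, which the problem allows.
    """
    # A set keeps exactly one copy of each lowercased address.
    return list({email.lower() for email in emails})
```

Because the result is built from a set, the output order can differ from the input order.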
Expected Behavior:
- If the input list is empty, an empty list should be returned.
- If the input list contains only unique emails, the output list should contain the same emails, lowercased, though possibly in a different order.
- If an email appears multiple times, only one instance of that email should be present in the output.
Edge Cases to Consider:
- Empty input list.
- List with all identical emails.
- List with mixed case duplicates.
- List with emails containing special characters. Assume standard email formats; the task is duplicate detection, not validation.
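The edge cases above translate directly into checks. The sketch below assumes a hypothetical `remove_duplicate_emails` helper that lowercases each address and keeps one copy; the sample addresses are made up for illustration.

```python
def remove_duplicate_emails(emails):
    # Hypothetical helper: lowercase each address and keep one copy of each.
    return list({email.lower() for email in emails})

# Empty input list.
assert remove_duplicate_emails([]) == []

# List with all identical emails collapses to a single entry.
assert remove_duplicate_emails(["a@x.com"] * 3) == ["a@x.com"]

# Mixed case duplicates count as the same address.
assert remove_duplicate_emails(["Bob@x.org", "BOB@x.org", "bob@x.org"]) == ["bob@x.org"]

# Special characters are compared as-is, so these two stay distinct.
# sorted() is used because the output order is unspecified.
result = remove_duplicate_emails(["a.b+tag@x.com", "a.b@x.com"])
assert sorted(result) == ["a.b+tag@x.com", "a.b@x.com"]
```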
Examples
Example 1:
Input: ["john@example.com", "jane@example.com", "john@example.com", "peter@example.com", "jane@example.com"]
Output: ["john@example.com", "jane@example.com", "peter@example.com"]
Explanation: Both "john@example.com" and "jane@example.com" appear twice. After removing duplicates, we are left with three unique email addresses. The order of the output can vary.
Example 2:
Input: ["test@domain.com", "Test@domain.com", "ANOTHER@domain.com"]
Output: ["test@domain.com", "another@domain.com"]
Explanation: "test@domain.com" and "Test@domain.com" are considered duplicates due to case insensitivity. "ANOTHER@domain.com" is unique, but its lowercase version is returned.
Example 3:
Input: []
Output: []
Explanation: An empty input list results in an empty output list.
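Since the output order can vary, comparing a result to an expected list with plain equality may fail even when the answer is correct. One way to check a result, sketched here with a hypothetical helper named `same_emails`, is to compare sorted copies. Sorting, rather than converting to sets, still catches a duplicate left in the output.

```python
def same_emails(actual, expected):
    # Order-insensitive comparison; a leftover duplicate changes the
    # sorted list, so it is still detected.
    return sorted(actual) == sorted(expected)

# Expected output from Example 1, and one valid reordering of it.
expected = ["john@example.com", "jane@example.com", "peter@example.com"]
reordered = ["peter@example.com", "john@example.com", "jane@example.com"]

assert same_emails(reordered, expected)
# An output that still contains a duplicate is rejected.
assert not same_emails(reordered + ["john@example.com"], expected)
```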
Constraints
- The input list will contain between 0 and 1000 email addresses.
- Each email address will be a string with a length between 5 and 255 characters.
- Email addresses will follow a generally valid format; the focus is on duplicate removal, not strict email validation.
- The solution should be efficient: its running time should scale well with the number of emails, ideally close to linear.
Notes
- Consider how you can efficiently keep track of emails you have already encountered.
- Remember to handle case insensitivity correctly. You might want to normalize the emails before comparison or storage.
- The order of emails in the output list is not specified, so you have flexibility in how you present the unique emails.
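The notes above suggest tracking the emails already encountered and normalizing before comparison. A minimal sketch of that idea follows, assuming lowercasing as the normalization and a hypothetical function name. It also keeps first-seen order, which the problem does not require but which makes results easier to read.

```python
def remove_duplicate_emails(emails):
    seen = set()   # normalized emails encountered so far
    unique = []    # result, in first-seen order
    for email in emails:
        normalized = email.lower()  # normalize before comparing or storing
        if normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique

# Input from Example 2.
print(remove_duplicate_emails(["test@domain.com", "Test@domain.com", "ANOTHER@domain.com"]))
# prints ['test@domain.com', 'another@domain.com']
```

Set membership checks take constant time on average, so this makes a single pass over the input and avoids rescanning the result list for every email.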