Unicorn IDs: Deduplicate in Order
Difficulty: Beginner
Tags: lists, sets, order, deduplication, hash-set
Series: CS 101
Problem
The Unicorn Registry needs to remove duplicate IDs while preserving their original order. Given a list, remove the duplicate elements, keeping only the first occurrence of each element and leaving the relative order of the kept elements unchanged.
Deduplication Rules:
- Keep the first occurrence of each element
- Preserve the original order
- Remove all subsequent duplicates
- Handle any hashable type
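The rules above can be sketched with a set that records which elements have already been emitted. This is a minimal illustration rather than the reference solution; the name `dedupe_in_order` is hypothetical and is not part of the challenge.

```python
from typing import Hashable, Iterable, List


def dedupe_in_order(items: Iterable[Hashable]) -> List[Hashable]:
    """Return the first occurrence of each element, in original order.

    Hypothetical helper name, used only for illustration.
    """
    seen = set()   # elements already emitted; membership test is O(1) on average
    result = []    # first occurrences, in the order they appeared
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(dedupe_in_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

The set answers "have I seen this before?" and the list carries the order, so each element is handled once and the whole pass is O(n) on average.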
Real-World Application
Order-preserving deduplication is used in:
- Data cleaning: removing duplicate records while maintaining chronology
- Event processing: deduplicating event streams
- Log analysis: unique entries in order
- Web scraping: removing duplicate URLs
- Database queries: SELECT DISTINCT with ORDER BY
- User activity tracking: unique actions timeline
- Recommendation systems: deduplicating suggestions
- Network packet analysis: identifying unique packets
Input
data: List[Any] # List of hashable elements
Output
List[Any] # List with duplicates removed, order preserved
Constraints
- 0 ≤ list length ≤ 100,000
- Elements must be hashable (usable as set members or dict keys)
- Original order must be preserved
- Empty list returns empty list
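To make the hashability constraint concrete, the sketch below probes a few values with the built-in `hash()`. The helper `is_hashable` is a hypothetical name introduced here for illustration only.

```python
def is_hashable(value) -> bool:
    """Return True if value can be stored in a set or used as a dict key."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


print(is_hashable(42))        # True
print(is_hashable("id-7"))    # True
print(is_hashable((1, 2)))    # True: a tuple of hashable items
print(is_hashable([1, 2]))    # False: lists are mutable
print(is_hashable((1, [2])))  # False: the tuple contains a list
```

Inputs such as lists of lists fall outside the stated constraints; one common workaround is to convert the inner lists to tuples before deduplicating.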
Examples
Example 1: Basic Deduplication
Input: [3, 1, 3, 2, 1]
Output: [3, 1, 2]
Explanation:
- First 3 at index 0: keep
- First 1 at index 1: keep
- Second 3 at index 2: remove (duplicate)
- First 2 at index 3: keep
- Second 1 at index 4: remove (duplicate)
Example 2: All Unique
Input: [1, 2, 3]
Output: [1, 2, 3]
Explanation: There are no duplicates, so the output contains the same elements as the input.
Example 3: Empty List
Input: []
Output: []
Explanation: Empty list has no duplicates.
Example 4: All Same
Input: [5, 5, 5, 5]
Output: [5]
Explanation: Keep only first occurrence.
Example 5: Strings
Input: ['a', 'b', 'a', 'c', 'b']
Output: ['a', 'b', 'c']
Explanation: Works with any hashable type.
Example 6: Mixed Types
Input: [1, '1', 1, '1', 2]
Output: [1, '1', 2]
Explanation: Integer 1 and string '1' are different.
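One related edge case is worth knowing, although the examples above do not exercise it: a set decides "duplicate" by equality and hash, not by type. The short sketch below uses only built-in behavior; the variable names are illustrative.

```python
# int 1 and str '1' never compare equal, so both survive deduplication.
print(1 == '1')          # False

# 1, 1.0 and True compare equal and hash equal, so a set treats them as one value.
print(1 == 1.0 == True)  # True

mixed = [1, '1', 1.0, True, 2]
distinct = set(mixed)
print(len(distinct))     # 3
```

A set-based solution given `[1, 1.0, True]` would therefore return `[1]`, keeping whichever equal value appeared first.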
What You'll Learn
- Using sets for O(1) membership checking
- Preserving order while deduplicating
- Time complexity: list vs set operations
- When to use set vs dict for tracking
- Hashable types in Python
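The list-versus-set and set-versus-dict points can be compared side by side. The sketch below is an assumption-laden illustration: both function names are hypothetical, the input size is arbitrary, and the timings depend on the machine, so none are shown. It relies on `dict` preserving insertion order, which Python guarantees from version 3.7.

```python
from timeit import timeit


def dedupe_with_list(items):
    """Track seen elements in the result list itself.

    'in' scans the list, so the worst case is O(n^2).
    """
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


def dedupe_with_dict(items):
    """Use dict keys for tracking: unique, insertion-ordered, O(n) on average."""
    return list(dict.fromkeys(items))


data = list(range(2_000)) * 2  # 4,000 elements, 2,000 distinct

# Both approaches produce the same result; only the cost differs.
assert dedupe_with_list(data) == dedupe_with_dict(data)

print("list tracking:", timeit(lambda: dedupe_with_list(data), number=3))
print("dict tracking:", timeit(lambda: dedupe_with_dict(data), number=3))
```

A dict does the tracking and the ordering in one structure, while a set needs a companion list for the order. The list-only version also works for unhashable elements, at the price of quadratic time.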
Why This Matters
Order-preserving deduplication is a fundamental data processing pattern:
- Tests understanding of sets and order
- Shows performance optimization techniques
- Common in data pipelines and ETL
- Appears in many interview contexts
Starter Code
def challenge_function(data):
    """
    Remove duplicates from list while preserving order.
    Keep only the first occurrence of each element.

    Args:
        data (List[Any]): list of hashable elements

    Returns:
        List[Any]: list with duplicates removed, order preserved
    """
    # Your implementation here
    pass