To remove duplicates from a list in Python while optionally preserving order, use the following methods:
1. Using a Set (Order Not Preserved)
Convert the list to a set to remove duplicates, then back to a list.
Note: This method does not preserve the original order.
original_list = [2, 3, 2, 1, 5, 4, 4]
unique_list = list(set(original_list))
print(unique_list) # Output (order varies): [1, 2, 3, 4, 5]
2. Using a Loop and a seen Set (Order Preserved)
Iterate through the list and append each element to a new list only if it hasn’t been seen before. This preserves the order of first occurrence.
original_list = [2, 3, 2, 1, 5, 4, 4]
seen = set()
unique_list = []
for item in original_list:
    if item not in seen:
        seen.add(item)
        unique_list.append(item)
print(unique_list) # Output: [2, 3, 1, 5, 4]
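The same loop can be wrapped in a reusable helper. The sketch below is one possible way to do that with a generator. The name unique_in_order is hypothetical (it is not a standard-library function), and the sketch assumes every element is hashable.

```python
def unique_in_order(iterable):
    """Yield each element the first time it appears (hypothetical helper)."""
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


original_list = [2, 3, 2, 1, 5, 4, 4]
unique_list = list(unique_in_order(original_list))
print(unique_list)  # [2, 3, 1, 5, 4]
```

Because it is a generator, it also works on iterables other than lists and only produces items as they are requested.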
3. Using dict.fromkeys() (Order Preserved in Python 3.7+)
Convert the list to a dictionary (keys are unique), then back to a list.
Dictionaries are guaranteed to keep insertion order from Python 3.7 onward, so the order of first occurrence is preserved:
original_list = [2, 3, 2, 1, 5, 4, 4]
unique_list = list(dict.fromkeys(original_list))
print(unique_list) # Output: [2, 3, 1, 5, 4]
4. Using OrderedDict (Order Preserved in Older Python Versions)
For Python <3.7, use collections.OrderedDict:
from collections import OrderedDict
original_list = [2, 3, 2, 1, 5, 4, 4]
unique_list = list(OrderedDict.fromkeys(original_list))
print(unique_list) # Output: [2, 3, 1, 5, 4]
5. Using List Comprehension (Order Preserved)
Combine a seen set with a list comprehension for brevity:
original_list = [2, 3, 2, 1, 5, 4, 4]
seen = set()
unique_list = [x for x in original_list if not (x in seen or seen.add(x))]
print(unique_list) # Output: [2, 3, 1, 5, 4]
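The condition in this comprehension works because set.add() mutates the set and returns None, and because `or` short-circuits. A small sketch of what each part evaluates to (the variable names are illustrative only):

```python
seen = set()

# set.add() changes the set in place and returns None
add_result = seen.add(1)

# 1 is already in seen, so `or` short-circuits and add() is not called
already_seen = 1 in seen or seen.add(1)

# 2 is not in seen, so add() runs, records 2, and the expression is None
first_time = 2 in seen or seen.add(2)

print(add_result, already_seen, first_time)  # None True None
print(sorted(seen))  # [1, 2]
```

Since `not None` is True, the comprehension keeps an item exactly when it is seen for the first time. The side effect inside the condition makes the code compact but less obvious than the explicit loop.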
6. Using itertools.groupby (Sorted List Only)
groupby only merges adjacent equal elements, so sort the list first to bring duplicates together:
from itertools import groupby
original_list = [2, 3, 2, 1, 5, 4, 4]
sorted_list = sorted(original_list) # Sort first
unique_list = [k for k, _ in groupby(sorted_list)]
print(unique_list) # Output: [1, 2, 3, 4, 5] (order changed)
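To see why the sort is needed, here is a short sketch of groupby applied to an unsorted list (the sample data is made up for illustration). Only runs of adjacent equal values are collapsed, so a value that reappears later is kept again.

```python
from itertools import groupby

unsorted_list = [2, 2, 3, 2, 1, 1]

# Without sorting, only consecutive duplicates are merged
collapsed = [k for k, _ in groupby(unsorted_list)]
print(collapsed)  # [2, 3, 2, 1]

# After sorting, equal values are adjacent, so each appears once
deduplicated = [k for k, _ in groupby(sorted(unsorted_list))]
print(deduplicated)  # [1, 2, 3]
```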
7. Handling Unhashable Elements (e.g., Lists of Lists)
For lists containing unhashable elements (like nested lists), convert elements to hashable types first:
original_list = [[1, 2], [3], [1, 2], [4, 5]]
seen = set()
unique_list = []
for sublist in original_list:
    # Convert list to tuple (hashable)
    tuple_sublist = tuple(sublist)
    if tuple_sublist not in seen:
        seen.add(tuple_sublist)
        unique_list.append(sublist)
print(unique_list) # Output: [[1, 2], [3], [4, 5]]
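The same idea can be generalized by passing in a function that turns each element into a hashable key. The following is a minimal sketch under that assumption: unique_by_key is a hypothetical helper name, and the dictionary example assumes all dictionary values are themselves hashable and the keys are mutually sortable.

```python
def unique_by_key(items, key):
    """Keep the first item for each distinct key(item) (hypothetical helper)."""
    seen = set()
    result = []
    for item in items:
        marker = key(item)  # must return something hashable
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result


# Nested lists: use a tuple of the contents as the key
nested = [[1, 2], [3], [1, 2], [4, 5]]
print(unique_by_key(nested, key=tuple))  # [[1, 2], [3], [4, 5]]

# Dictionaries: use a sorted tuple of (key, value) pairs as the key
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"name": "a", "id": 1}]
unique_records = unique_by_key(records, key=lambda d: tuple(sorted(d.items())))
print(unique_records)  # [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}]
```

The original elements are returned unchanged; the converted keys are only used for the membership check.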
Summary Table
| Method | Order Preserved? | Time Complexity | Use Case |
|---|---|---|---|
| set() | ❌ No | O(n) | Quick deduplication, order irrelevant |
| Loop + seen set | ✅ Yes | O(n) | Order matters, general use |
| dict.fromkeys() | ✅ Yes (Python 3.7+) | O(n) | Concise, modern Python |
| OrderedDict | ✅ Yes | O(n) | Python <3.7 compatibility |
| List Comprehension + seen | ✅ Yes | O(n) | Compact code |
| itertools.groupby | ❌ No (sorted) | O(n log n) | Sorted lists, consecutive duplicates |
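As a quick sanity check on the table, the sketch below runs the four order-preserving methods on the same input and prints their results side by side. The function names are illustrative, and the dict.fromkeys() line assumes Python 3.7 or later.

```python
from collections import OrderedDict

data = [2, 3, 2, 1, 5, 4, 4]


def loop_with_seen(items):
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def comprehension_with_seen(items):
    seen = set()
    return [x for x in items if not (x in seen or seen.add(x))]


results = {
    "loop + seen": loop_with_seen(data),
    "dict.fromkeys": list(dict.fromkeys(data)),  # ordered on Python 3.7+
    "OrderedDict": list(OrderedDict.fromkeys(data)),
    "comprehension + seen": comprehension_with_seen(data),
}

for name, result in results.items():
    print(name, result)  # each result is [2, 3, 1, 5, 4]
```

The set() and groupby approaches return the same elements but, as the table notes, not necessarily in this order.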
Key Takeaways
- Preserve Order: Use dict.fromkeys() (Python 3.7+), a loop with a seen set, or OrderedDict.
- Speed: The set() method is fastest but doesn’t preserve order.
- Unhashable Elements: Convert elements to hashable types (e.g., tuples) first.
By choosing the right method, you can efficiently remove duplicates while meeting your specific needs!