How to Remove Duplicates from a List in Python

To remove duplicates from a list in Python, optionally preserving order, use one of the following methods:

1. Using a Set (Order Not Preserved)

Convert the list to a set to remove duplicates, then back to a list.
Note: This method does not preserve the original order.

original_list = [2, 3, 2, 1, 5, 4, 4]
unique_list = list(set(original_list))
print(unique_list)  # Output (order varies): [1, 2, 3, 4, 5]
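If a sorted result is acceptable (rather than the original order), wrapping the set in sorted() gives a deterministic output; note that sorted() already returns a list:

```python
original_list = [2, 3, 2, 1, 5, 4, 4]
# sorted() accepts any iterable and returns a new list,
# so no extra list() conversion is needed
unique_sorted = sorted(set(original_list))
print(unique_sorted)  # Output: [1, 2, 3, 4, 5]
```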

2. Using a Loop and a seen Set (Order Preserved)

Iterate through the list and append elements to a new list only if they haven’t been seen before.
Preserves the order of first occurrence.

original_list = [2, 3, 2, 1, 5, 4, 4]
seen = set()
unique_list = []
for item in original_list:
    if item not in seen:
        seen.add(item)
        unique_list.append(item)
print(unique_list)  # Output: [2, 3, 1, 5, 4]

3. Using dict.fromkeys() (Order Preserved in Python 3.7+)

Convert the list to a dictionary (keys are unique), then back to a list.
Preserves order in Python 3.7+:

original_list = [2, 3, 2, 1, 5, 4, 4]
unique_list = list(dict.fromkeys(original_list))
print(unique_list)  # Output: [2, 3, 1, 5, 4]

4. Using OrderedDict (Order Preserved in Older Python Versions)

For Python <3.7, use collections.OrderedDict:

from collections import OrderedDict
original_list = [2, 3, 2, 1, 5, 4, 4]
unique_list = list(OrderedDict.fromkeys(original_list))
print(unique_list)  # Output: [2, 3, 1, 5, 4]

5. Using List Comprehension (Order Preserved)

Combine a seen set with a list comprehension for brevity. This relies on seen.add() returning None, so the or expression both records the element and evaluates falsy:

original_list = [2, 3, 2, 1, 5, 4, 4]
seen = set()
unique_list = [x for x in original_list if not (x in seen or seen.add(x))]
print(unique_list)  # Output: [2, 3, 1, 5, 4]

6. Using itertools.groupby (Sorted List Only)

Remove consecutive duplicates after sorting the list:

from itertools import groupby
original_list = [2, 3, 2, 1, 5, 4, 4]
sorted_list = sorted(original_list)  # Sort first
unique_list = [k for k, _ in groupby(sorted_list)]
print(unique_list)  # Output: [1, 2, 3, 4, 5] (order changed)
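As a side note, groupby can also be used without sorting; in that case it only merges runs of equal adjacent elements, which collapses consecutive duplicates while leaving the rest of the original order intact:

```python
from itertools import groupby

original_list = [2, 3, 2, 1, 5, 4, 4]
# Without sorting, groupby only merges adjacent equal elements,
# so the non-adjacent duplicate 2 survives
collapsed = [k for k, _ in groupby(original_list)]
print(collapsed)  # Output: [2, 3, 2, 1, 5, 4]
```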

7. Handling Unhashable Elements (e.g., Lists of Lists)

For lists containing unhashable elements (like nested lists), convert elements to hashable types first:

original_list = [[1, 2], [3], [1, 2], [4, 5]]
seen = set()
unique_list = []
for sublist in original_list:
    # Convert list to tuple (hashable)
    tuple_sublist = tuple(sublist)
    if tuple_sublist not in seen:
        seen.add(tuple_sublist)
        unique_list.append(sublist)
print(unique_list)  # Output: [[1, 2], [3], [4, 5]]

Summary Table

Method                    | Order Preserved?   | Time Complexity | Use Case
--------------------------|--------------------|-----------------|--------------------------------------
set()                     | ❌ No              | O(n)            | Quick deduplication, order irrelevant
Loop + seen set           | ✅ Yes             | O(n)            | Order matters, general use
dict.fromkeys()           | ✅ Yes (3.7+)      | O(n)            | Concise, modern Python
OrderedDict               | ✅ Yes             | O(n)            | Python <3.7 compatibility
List comprehension + seen | ✅ Yes             | O(n)            | Compact code
itertools.groupby         | ❌ No (sorted)     | O(n log n)      | Sorted lists, consecutive duplicates

Key Takeaways

  • Preserve Order: Use dict.fromkeys() (Python 3.7+), loop with seen, or OrderedDict.
  • Speed: The set() method is fastest but doesn’t preserve order.
  • Unhashable Elements: Convert elements to hashable types (e.g., tuples) first.
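To check the speed claim on your own data, a rough timeit sketch (absolute numbers will vary by machine, Python version, and list contents):

```python
import timeit

# Sample data: 10,000 items with many duplicates
data = list(range(1000)) * 10

def dedupe_set(lst):
    # Fastest, but order is not preserved
    return list(set(lst))

def dedupe_loop(lst):
    # Preserves order of first occurrence
    seen, out = set(), []
    for x in lst:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

for fn in (dedupe_set, dedupe_loop):
    t = timeit.timeit(lambda: fn(data), number=200)
    print(f"{fn.__name__}: {t:.4f}s")
```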

By choosing the right method, you can efficiently remove duplicates while meeting your specific needs!
