How to remove duplicates in lists?

Removing duplicates from a list is a common task in programming. Here are several methods to achieve this in Python:

Using a Set

A set is an unordered collection that cannot contain duplicate elements. Converting a list to a set and back to a list is the simplest way to remove duplicates, but the original order may be lost and every element must be hashable.

Example:

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = list(set(my_list))
print(unique_list)  # Output: [1, 2, 3, 4, 5] (order is not guaranteed)
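
If a predictable result matters more than the original order, one option is to sort the contents of the set. The following is a minimal sketch that assumes the elements can be compared with each other, as the strings here can; the names words and unique_sorted are illustrative only.

```python
# Sets do not promise any particular iteration order, so sort the
# result when a deterministic output is needed (assumes the elements
# are mutually comparable).
words = ["pear", "apple", "pear", "fig", "apple"]
unique_sorted = sorted(set(words))
print(unique_sorted)  # ['apple', 'fig', 'pear']
```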

Using a List Comprehension and Set

If you want to maintain the order of elements while removing duplicates, you can use a set to track seen elements and a list comprehension to build the result.

Example:

my_list = [1, 2, 2, 3, 4, 4, 5]
seen = set()
unique_list = [x for x in my_list if not (x in seen or seen.add(x))]
print(unique_list)  # Output: [1, 2, 3, 4, 5]
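
The comprehension above relies on set.add() returning None, which is falsy, so the condition only rejects an element that was already in the set. The sketch below wraps the same expression in a hypothetical helper named dedupe_keep_order (not part of the original example) to make that behavior visible.

```python
def dedupe_keep_order(items):
    """Return items with duplicates removed, keeping first occurrences."""
    seen = set()
    # seen.add(x) returns None (falsy). For a new x the `or` expression
    # evaluates to None, so `not None` is True and x is kept. For a
    # repeated x, `x in seen` is True and x is skipped.
    return [x for x in items if not (x in seen or seen.add(x))]

print(set().add(1))                      # None
print(dedupe_keep_order("mississippi"))  # ['m', 'i', 's', 'p']
```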

Using a For Loop and Set

This method works like the list comprehension above, but it uses a for loop to add each unseen element to a new list explicitly, so it does not rely on a side effect inside the comprehension.

Example:

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = []
seen = set()
for item in my_list:
    if item not in seen:
        unique_list.append(item)
        seen.add(item)
print(unique_list)  # Output: [1, 2, 3, 4, 5]
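
The explicit loop is also easy to extend. As a sketch, the hypothetical generator unique_by below (an illustrative name, loosely modeled on the unique_everseen recipe in the itertools documentation) accepts an optional key function, so that strings can, for instance, be de-duplicated case-insensitively.

```python
def unique_by(iterable, key=None):
    """Yield items in order, skipping any whose key was already seen."""
    seen = set()
    for item in iterable:
        # Compare on the item itself unless a key function is supplied.
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield item

names = ["Ada", "ada", "Bob", "ADA", "bob"]
print(list(unique_by(names, key=str.lower)))  # ['Ada', 'Bob']
```

Because it is a generator, it also works on inputs that are consumed lazily, at the cost of keeping every distinct key in memory.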

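Using collections.OrderedDict

The OrderedDict class from the collections module remembers insertion order, so OrderedDict.fromkeys() keeps the first occurrence of each element. It is available in every Python 3 version; from Python 3.7 onward, a plain dict guarantees insertion order as well.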

Example:

from collections import OrderedDict

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = list(OrderedDict.fromkeys(my_list))
print(unique_list)  # Output: [1, 2, 3, 4, 5]
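
A minimal sketch of the plain-dict variant, assuming Python 3.7 or later, where insertion order is a language guarantee and no import is needed. The variable names are illustrative.

```python
# dict.fromkeys builds a dict whose keys are the list elements; repeated
# keys collapse into one, and the first-seen order is kept (Python 3.7+).
readings = [3, 1, 3, 2, 1]
unique_readings = list(dict.fromkeys(readings))
print(unique_readings)  # [3, 1, 2]
```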

Using Pandas (for large datasets)

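If your data already lives in pandas, or the dataset is large enough to justify the conversion overhead, Series.drop_duplicates() removes duplicates while keeping the first occurrence of each value. Note that pandas is a third-party library and must be installed separately.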

Example:

import pandas as pd

my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = pd.Series(my_list).drop_duplicates().tolist()
print(unique_list)  # Output: [1, 2, 3, 4, 5]

Summary

  • Using a Set: Quick and easy, but does not maintain order.
  • Using List Comprehension and Set: Maintains order, concise.
  • Using For Loop and Set: Maintains order, explicit.
  • Using collections.OrderedDict: Maintains order, uses a dict.
  • Using Pandas: Maintains order; convenient for large datasets or data already in pandas.

Each method has its advantages, and the best choice depends on your specific requirements, such as maintaining order or handling large datasets efficiently.

CONTRIBUTOR
Design Gurus Team

Copyright © 2024 Designgurus, Inc. All rights reserved.