Sets vs. Lists | Python

Prajeet Singh Kalchuri
3 min readJul 21, 2021

What is faster …..Sets or Lists ??

As Wikipedia says “ Membership testing with sets and dictionaries is much faster, O(1), than searching sequences, O(n). When testing “a in b”, b should be a set or dictionary instead of a list or tuple.

You must have been using sets in place of lists whenever speed is important in your code, but have you ever wondered why sets are so much faster than lists. So let’s see what exactly is going on behind the scenes in python to make sets faster?

Sets are implemented using hash tables , So whenever you add an object to a set, the position within the memory of the set object is determined using the hash of the object to be added.

And When testing for membership, all that needs to be done is basically to look if the object is at the position determined by its hash, so the speed of this operation does not depend on the size of the set.

For lists, in contrast, the whole list needs to be searched, which will become slower as the list grows.

Let’s Understand this through an example :

list: Imagine you are looking for your pen, but you don't know in which drawer your pen is, so you have to search drawer by drawer until you find it (or maybe you never do). That's what we call O(n), because in the worst scenario, you will look in all your drawers (where n is the number of drawers).

set: Now, imagine you're still looking for your pen, but now you know in which drawer your pen is, say in the 8th drawer. So, you will just search in the 8th drawer, instead of searching in all drawers. That's what we call O(1), because in the worst scenario you will look in just one drawer.

Python lists are implemented as dynamic arrays and sets are implemented as a hash tables.

You have to keep one most important thing in your mind : that sets aren’t faster than lists in general — membership test is faster for sets, and so is removing an element , and As long as you don’t need these operations, lists are often faster.

And when you go deeper in this you will know that sets vs. lists depends largely on what operation we are performing like ,

  • If we are adding an element — then a set doesn’t need to move any data, and all it needs to do is calculate a hash value and add it to a table but for a list insertion then potentially there is data to be moved.
  • If we are deleting an element — all a set needs to do is remove the hash entry from the hash table, for a list it potentially needs to move data around.
  • If we are searching (i.e. an in operator) — a set just needs to calculate the hash value of the data item, find that hash value in the hash table, and if it is there — then bingo. For a list, the search has to look up each item in turn . Even for many 1000s of items a set will be far quicker to search.

Note: Sets aren’t faster than lists in general — membership test is faster for sets, and so is removing an element. As long as you don’t need these operations, lists are often faster.

Hope this article may be helpful to you.

Keep Learning!

--

--