Key Takeaways: A Hash Set is exactly like a Hash Table, but it only stores Keys without any Values. Its main superpower is instantly preventing and removing duplicate items from a dataset.
In the previous tutorial, we learned about Hash Tables (Dictionaries). Hash Tables use a Key to look up a Value.
But what if you don't care about a "Value"? What if you only want to keep track of a list of items and make sure that list has no duplicates? That is where the Hash Set comes in!
A Hash Set is an unordered collection of unique items.
Under the hood, a Hash Set uses the exact same Hash Function magic as a Hash Table. The difference is that it only stores the Keys.
Think of a VIP Guest List at an exclusive club.
If John tries to add his name to the guest list a second time, the bouncer says, "You are already on the list!" and ignores him.
Because a Hash Set uses a Hash Function, it instantly knows if an item already exists.
"Apple" to the Set.0. Since Index 0 is empty, it adds "Apple"."Apple" again.0. It sees that "Apple" is already there!"Apple".Hash Set: Duplicates are mathematically mapped to the same index and rejected.
Hash Sets are the ultimate tool for specific algorithmic problems. You should use them when:
O(n) time, but searching a Set takes O(1) time).Because Hash Sets use Hash Functions, their speeds are identical to Hash Tables!
O(1) Constant Time.O(1) Constant Time.O(1) Constant Time.Warning: Because Hash Sets are based on mathematical hashing, they are unordered. If you print a Set, the items will not be in the order you added them!
Python has a built-in set data structure that is incredibly easy to use.
Let's write some code to see how it automatically destroys duplicate values!
# 1. Create an empty Set vip_list = set() # 2. ADD items (O(1) time) vip_list.add("Alice") vip_list.add("Bob") print("Current List:", vip_list) # 3. Try to add a duplicate! vip_list.add("Alice") # Alice is still only in there once! print("After duplicate:", vip_list) # 4. CHECK IF EXISTS (O(1) time) if "Bob" in vip_list: print("Bob is allowed inside!") else: print("Bob is not on the list.") # 5. Magic Trick: Remove duplicates from a List instantly messy_list = [1, 2, 2, 3, 3, 3, 4] clean_list = list(set(messy_list)) print("Cleaned Array:", clean_list)
What happens if you try to add a value to a Hash Set that already exists inside it?