Python sets, explained
A set is an unordered collection of unique, hashable elements, backed by a hash table. That backing is what makes sets brilliant for membership tests and de-duplication — and why a frozenset exists. This guide walks through operations, performance, and the mutating methods.
Creating sets
s = {1, 2, 3} # set literal
s = set([1, 2, 2, 3]) # {1, 2, 3} — duplicates dropped
empty = set() # {} is an empty DICT, not a set!
That last line is a classic trap: {} is an empty dictionary; you must use set() for an
empty set.
Set operations
Sets support mathematical operations as both operators and methods:
a = {1, 2, 3}
b = {2, 3, 4}
a | b # {1, 2, 3, 4} — union (a.union(b))
a & b # {2, 3} — intersection (a.intersection(b))
a - b # {1} — difference (a.difference(b))
a ^ b # {1, 4} — symmetric diff (a.symmetric_difference(b))
a <= b # subset check
a >= b # superset check
The method forms (a.union(b)) accept any iterable, not just another set, which is handy:
a.union([4, 5]).
Why membership testing is O(1)
x in some_set is O(1) average, because a set hashes x and jumps straight to its bucket.
x in some_list is O(n) because it scans element by element. For repeated lookups against a
large collection, converting to a set is a big win:
allowed = {"GET", "POST", "PUT"} # set — O(1) lookups
"GET" in allowed # fast regardless of size
Removing duplicates
The most common everyday use of a set is de-duplication:
list(set([3, 1, 2, 3, 1])) # [1, 2, 3] — order NOT guaranteed
If you need to drop duplicates while preserving order, use dict.fromkeys, which keeps
insertion order:
list(dict.fromkeys([3, 1, 2, 3, 1])) # [3, 1, 2]
add, discard, and remove
Three methods change a set; the difference is how they handle a missing element:
s = {1, 2, 3}
s.add(4) # {1, 2, 3, 4}
s.remove(2) # {1, 3, 4}; KeyError if 2 weren't present
s.discard(99) # no error even though 99 isn't there
s.pop() # removes and returns an arbitrary element
Use discard when "remove if present" is the intent; use remove when absence is a bug
worth surfacing.
Elements must be hashable
Because sets use hashing, every element must be hashable (immutable). You can put numbers, strings, and tuples in a set, but not lists or other sets:
{1, "a", (2, 3)} # fine
{[1, 2]} # TypeError: unhashable type: 'list'
frozenset — the immutable set
A frozenset is an immutable, hashable set. Because it's hashable, it can be a dict key
or an element of another set — things a normal set cannot do:
fs = frozenset({1, 2, 3})
{fs: "value"} # frozenset as a dict key — works
{frozenset({1, 2}), frozenset({3})} # a set of sets — works
It supports all the read-only operations (|, &, in) but none that mutate (add,
remove).
Recap
A set is an unordered collection of unique, hashable elements with O(1) membership
tests — far faster than a list for "is this in here?". Use |, &, -, ^ for union,
intersection, difference, and symmetric difference. set(seq) de-duplicates (order lost);
dict.fromkeys(seq) de-duplicates while keeping order. Use discard for safe removal and
remove when absence should raise. Elements must be hashable, and when you need an
immutable set you can use as a dict key or nest inside another set, reach for frozenset.