Python garbage collection, explained
CPython frees memory automatically, but how it does so matters in interviews and when debugging leaks. Two mechanisms work together: reference counting does almost all the work immediately, and a cyclic garbage collector handles the one case reference counting cannot, namely reference cycles. Understanding both explains a lot of Python's behaviour.
Reference counting is the primary mechanism
Every object carries a count of how many references point to it. When the count hits zero, the object is freed immediately — deterministically, the moment its last reference goes away.
import sys
x = []
sys.getrefcount(x) # 2 — x, plus the temporary arg to getrefcount itself
y = x
sys.getrefcount(x) # 3 — now y references it too
del y
sys.getrefcount(x) # 2 again
This is why CPython releases resources promptly: when a local goes out of scope, its referents' counts drop, and unreferenced objects die right away — no waiting for a collector.
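The prompt release described above can be observed directly. The following is a minimal sketch, assuming CPython; the Resource class, the use_resource function, and the events list are hypothetical names introduced only for illustration. It uses weakref.finalize to record the moment the object is destroyed.

```python
import weakref

class Resource:
    """Hypothetical stand-in for an object that owns some resource."""

events = []

def use_resource():
    r = Resource()
    # weakref.finalize registers a callback that runs when r is destroyed,
    # without keeping r alive itself.
    weakref.finalize(r, events.append, "freed")
    events.append("in function")
    # On return, the local name r disappears and the refcount drops to zero.

use_resource()
events.append("after call")
print(events)
```

On CPython the "freed" entry lands between the other two, because the object is destroyed as the function returns rather than at some later collection. Other Python implementations may order this differently.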
Why reference counting isn't enough
Reference counting can't free cycles — objects that reference each other. Their counts never reach zero even when nothing outside the cycle points at them.
a = {}
b = {}
a["b"] = b # a references b
b["a"] = a # b references a — a cycle
del a, b # outside refs gone, but each dict still has refcount 1, so refcounting alone never frees them
Without a second mechanism, this memory would be lost forever. That's the cyclic collector's job.
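To make this visible, here is a small sketch, assuming CPython. It uses a hypothetical Node class instead of plain dicts, because dicts cannot be weakly referenced and a weak reference is a convenient way to check whether an object still exists. The automatic collector is switched off for the duration so that only reference counting is at work.

```python
import gc
import weakref

class Node:
    """Hypothetical container used to build a two-object cycle."""
    def __init__(self):
        self.other = None

gc.disable()  # keep the automatic cyclic collector out of the way for the demo
try:
    a, b = Node(), Node()
    a.other, b.other = b, a      # a <-> b cycle
    probe = weakref.ref(a)       # observes a without keeping it alive
    del a, b                     # no outside references remain

    alive_after_del = probe() is not None      # refcounting alone did not free it
    gc.collect()                               # run the cycle detector explicitly
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_after_del, alive_after_collect)
```

The first flag shows the cycle surviving the del; the second shows it gone once the cycle detector has run.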
The cyclic garbage collector
The gc module exposes a separate cycle detector. It runs when allocation counters
cross a threshold, finds groups of objects that are reachable only from each other,
and collects them. It tracks only container objects (lists, dicts, class instances,
etc.), the things that can form cycles; atomic values like ints and strings are
never tracked.
import gc
gc.collect() # force a full collection now; returns the number of unreachable objects found
gc.isenabled() # True by default
You rarely call it manually, but forcing a collection is useful after deleting a large web of objects, or in tests checking for leaks.
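As a rough illustration of what the collector does and does not track, the sketch below uses gc.is_tracked and then forces a collection over a deliberately created cycle. The make_cycle helper is a hypothetical name used only for this example.

```python
import gc

# Atomic values cannot participate in cycles, so the collector ignores them.
int_tracked = gc.is_tracked(42)
str_tracked = gc.is_tracked("text")

# Containers that can hold references to other containers are tracked.
list_tracked = gc.is_tracked([])
dict_tracked = gc.is_tracked({"k": []})

gc.collect()  # clear any garbage left over from earlier code

def make_cycle():
    lst = []
    lst.append(lst)  # the list refers to itself, so its refcount never hits zero

make_cycle()
found = gc.collect()  # number of unreachable objects found in this pass

print(int_tracked, str_tracked, list_tracked, dict_tracked, found)
```

In a leak test, a nonzero return value from gc.collect() after the code under test has finished is a hint that the code created cycles.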
Generational collection
For efficiency the collector is generational, based on the observation that most objects die young. New objects start in generation 0, which is scanned often; survivors are promoted to older generations scanned progressively less.
import gc
gc.get_count() # (count0, count1, count2): current per-generation counters, compared against the thresholds
gc.get_threshold() # e.g. (700, 10, 10), the long-standing default; exact values can differ between Python versions
This keeps the collector cheap: it spends most effort on short-lived objects and rarely re-scans long-lived ones.
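The thresholds can be inspected and adjusted at runtime. The sketch below is illustrative only: the new threshold values are arbitrary assumptions, not recommendations, and newer CPython versions may ignore or report the third value differently. It also reads gc.get_stats(), which reports per-generation statistics.

```python
import gc

original = gc.get_threshold()

# Hypothetical values: make the youngest generation collect less often.
gc.set_threshold(1000, 15, 15)
raised = gc.get_threshold()

# Restore the interpreter's previous settings.
gc.set_threshold(*original)
restored = gc.get_threshold()

# One dict per generation, with how many collections ran and what they found.
stats = gc.get_stats()
for generation, entry in enumerate(stats):
    print(generation, entry["collections"], entry["collected"], entry["uncollectable"])
```

Comparing the "collections" figures across generations is one way to confirm that the youngest generation is scanned far more often than the oldest.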
__del__ and weak references
__del__ runs when an object is about to be destroyed, but don't rely on it for critical
cleanup: its timing depends on refcounting and the gc, and before Python 3.4 (PEP 442)
cycles containing objects with __del__ could not be collected at all. For cleanup, prefer
context managers. To reference an object without keeping it alive (and so avoid creating
cycles), use weakref.
import weakref
class Node: pass
n = Node()
ref = weakref.ref(n) # doesn't increase refcount
ref() # the Node, or None once n is gone
del n
ref() # None
Weak references are the standard tool for caches and parent/child links that shouldn't prevent collection.
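For the parent/child case, one possible shape is sketched below, assuming CPython. The TreeNode class and its attribute names are hypothetical. Children are held strongly by the parent, while each child points back at its parent only weakly, so no cycle of strong references is formed.

```python
import weakref

class TreeNode:
    """Hypothetical tree node: strong links downward, weak link upward."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        # Store a weak reference so the child does not keep the parent alive.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Dereference the weak link; yields None if the parent has been freed.
        return self._parent() if self._parent is not None else None

root = TreeNode("root")
child = TreeNode("child", parent=root)
parent_name = child.parent.name

del root                          # last strong reference to the root
orphaned = child.parent is None   # the weak link now resolves to None

print(parent_name, orphaned)
```

Because there is no strong cycle, reference counting frees the root as soon as it is deleted and the cyclic collector never needs to get involved.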
Practical implications
Reference counting means CPython's memory use is predictable and resources free promptly —
but it adds per-operation overhead and is part of why the GIL exists (refcount updates
must be thread-safe). Cycles are handled, just not instantly, so a program that builds many
cyclic structures may want to tune gc thresholds or call gc.collect() strategically.
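One way to call gc.collect() strategically is to pause automatic collection while building a large cyclic structure and then collect once at a known point. The sketch below is an assumption about how that might look, not a recommended default; gc_paused is a hypothetical helper, and whether this helps at all depends on the workload, so measure before adopting it.

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_paused():
    """Disable automatic cyclic collection inside the block, then restore it."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()

with gc_paused():
    enabled_inside = gc.isenabled()
    # Build many small cycles without the collector interrupting the loop.
    graph = [{"id": i} for i in range(10_000)]
    for node in graph:
        node["self"] = node   # each dict refers to itself

del graph, node               # drop every outside reference to the cycles
reclaimed = gc.collect()      # one explicit pass at a point we choose

print(enabled_inside, reclaimed)
```

Reference counting keeps working while the collector is paused, so only cyclic garbage accumulates during the block.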
Recap
CPython frees memory with two cooperating systems: reference counting frees objects the
instant their count hits zero (deterministic, prompt), and a generational cyclic
collector in the gc module reclaims reference cycles that counting alone can't. The
collector tracks only container objects and focuses effort on young generations since most
objects die young. Use context managers rather than __del__ for cleanup, and
weakref to reference objects without keeping them alive.