Python generators, explained
Generators are Python's answer to "produce a sequence of values without building the whole
thing in memory." They power lazy pipelines, let you model infinite streams, and turn up
constantly in interviews because they sit at the intersection of functions, the
iterator protocol, and memory efficiency. This guide explains what yield really
does and when a generator beats a list.
What a generator is
Any function containing a yield is a generator function. Calling it doesn't run the
body — it returns a generator object (an iterator) that runs the body lazily, pausing
at each yield and resuming where it left off on the next next().
def count_up(n):
    for i in range(n):
        yield i          # pause here, hand back i, remember the position
gen = count_up(3)        # nothing has run yet
next(gen)                # 0: runs until the first yield
next(gen)                # 1: resumes after the yield
next(gen)                # 2
next(gen)                # raises StopIteration: the function body has finished
The key idea: yield suspends the function with all its local state intact. The
generator remembers exactly where it paused, so it picks up mid-loop on the next call.
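A small trace can make the pause-and-resume behaviour visible. The sketch below is illustrative only: traced_count and the log list are hypothetical names introduced here, not part of the example above.

```python
log = []  # records the order in which things happen

def traced_count(n):
    """Hypothetical generator that logs each step it takes."""
    log.append("body started")
    for i in range(n):
        log.append(f"about to yield {i}")
        yield i
        log.append(f"resumed after {i}")
    log.append("body finished")

gen = traced_count(2)
assert log == []          # calling the function ran none of the body

first = next(gen)         # runs from the top up to the first yield
second = next(gen)        # resumes after the first yield, pauses at the second
rest = list(gen)          # drains what is left; the body runs to its end

assert iter(gen) is gen   # a generator object is its own iterator
```

Reading log afterwards shows that "resumed after 0" was only recorded once the second next() was called, which is the suspension described above.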
yield vs return
return ends a function and hands back one value. yield hands back a value but keeps
the function alive and paused, ready to produce more. A generator can yield many times;
a plain function returns once.
def squares(nums):
    for n in nums:
        yield n * n      # produces a value each iteration, doesn't end the function
for s in squares([1, 2, 3]):
    print(s)             # 1, 4, 9
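A return inside a generator ends iteration: the generator raises StopIteration. A bare
return carries no value; a value given to return is attached to that exception (as
StopIteration.value) rather than yielded.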
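To see where a returned value ends up, here is a minimal sketch. The generator first_n_then_report is a hypothetical name used only for illustration.

```python
def first_n_then_report(n):
    """Hypothetical generator: yields 0..n-1, then returns a summary string."""
    for i in range(n):
        yield i
    return f"yielded {n} values"   # not yielded; carried by StopIteration

gen = first_n_then_report(2)
seen = [next(gen), next(gen)]      # the two yielded values

try:
    next(gen)                      # the body hits return here
except StopIteration as stop:
    summary = stop.value           # the value that was passed to return

# A for loop or list() swallows StopIteration, so the return value is dropped.
looped = list(first_n_then_report(2))
```

In ordinary iteration the returned value is therefore invisible; it only matters when you catch StopIteration yourself or delegate with yield from.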
The memory win: lazy evaluation
A list comprehension builds every element up front and holds them all in memory. A generator produces values one at a time, on demand — so it uses roughly constant memory regardless of how many values flow through.
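import sys
nums = [x * x for x in range(1_000_000)]   # list: ~8 MB for the list object alone, built all at once
gen = (x * x for x in range(1_000_000))    # generator: a few hundred bytes
sys.getsizeof(nums)   # ~8000000+ bytes (counts the list's references, not the ints they point to)
sys.getsizeof(gen)    # ~200 bytes (exact figure varies by Python version)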
This is the headline benefit: you can process a 100 GB file or an endless stream because you only ever hold one item at a time. The trade-off is that a generator is single-use — once exhausted, you must recreate it to iterate again.
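The single-use trade-off is easy to demonstrate. This is a small sketch with a hypothetical generator function, squares_up_to, showing that a second pass over the same generator object yields nothing.

```python
def squares_up_to(n):
    """Hypothetical generator: squares of 0..n-1."""
    for x in range(n):
        yield x * x

gen = squares_up_to(4)
first_pass = list(gen)     # consumes every value
second_pass = list(gen)    # same object, already exhausted: nothing left

# To iterate again, call the generator function again for a fresh object.
fresh_pass = list(squares_up_to(4))
```

Note that the exhausted generator does not raise an error on the second pass; it simply produces an empty result, which can hide bugs when a generator is accidentally consumed twice.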
Generator expressions vs comprehensions
A generator expression uses the same syntax as a list comprehension but with
parentheses, and it's lazy. Prefer it when you're feeding the result straight into a
consumer like sum, any, max, or a for loop.
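data = [5, 250, 7]                           # example input, assumed for illustration
total = sum(x * x for x in range(1000))      # no intermediate list built
big = any(n > 100 for n in data)             # stops early at the first match
# as the sole argument, the call's parentheses double as the generator's own;
# alongside other arguments, the generator expression needs its own pair:
total = sum((x * x for x in range(1000)), 0)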
Use a list comprehension when you need the materialised list (to index it, reuse it, or iterate more than once); use a generator expression when you just stream through once.
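The early-exit behaviour is the practical difference when feeding a consumer like any. The sketch below records which items actually get tested; is_big, checked, and the sample data are hypothetical names and values chosen for illustration.

```python
checked = []  # hypothetical log of which items were actually examined

def is_big(n):
    checked.append(n)
    return n > 100

data = [5, 250, 7, 900]

# Generator expression: any() pulls items one at a time and stops at 250.
lazy_result = any(is_big(n) for n in data)
lazy_checked = list(checked)

checked.clear()

# List comprehension: every item is tested before any() sees the list.
eager_result = any([is_big(n) for n in data])
eager_checked = list(checked)
```

Both forms give the same answer; the generator expression just does less work when a match appears early.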
yield from: delegating to a sub-generator
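yield from yields every value from another iterable, flattening nested generators
without a manual loop. When that iterable is itself a generator, it also forwards
send(), throw(), and close() to it, and the yield from expression evaluates to the
sub-generator's return value.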
def chain(*iterables):
    for it in iterables:
        yield from it    # for plain iteration, equivalent to: for x in it: yield x
list(chain([1, 2], [3, 4]))   # [1, 2, 3, 4]
It's the clean way to compose generators — a generator that defers part of its output to another one.
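Two further sketches of delegation, both using hypothetical names (flatten, inner, outer): a recursive flattener, where yield from hands off to a nested call of the same generator, and a delegating generator that captures the sub-generator's return value.

```python
def flatten(items):
    """Hypothetical helper: yield the leaves of arbitrarily nested lists."""
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)   # delegate to the recursive sub-generator
        else:
            yield item

flat = list(flatten([1, [2, [3, 4]], 5]))

def inner():
    yield "a"
    yield "b"
    return 2                           # becomes the value of `yield from inner()`

def outer():
    count = yield from inner()         # re-yields "a" and "b", then receives 2
    yield f"inner yielded {count} items"

combined = list(outer())
```

The recursive case is where yield from pays off most: without it, each level of nesting would need its own explicit for loop re-yielding the values.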
Infinite sequences
Because generators are lazy, they can represent endless streams that you slice with
something like itertools.islice or break out of manually.
import itertools
def naturals():
    n = 0
    while True:
        yield n          # never ends, which is fine because it's lazy
        n += 1
list(itertools.islice(naturals(), 5))   # [0, 1, 2, 3, 4]
A list version of this would loop forever and exhaust memory; the generator only computes as far as you consume.
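Infinite generators also compose into lazy pipelines, since each stage only pulls what the next stage asks for. The following is a sketch under that idea; the stage names (evens, even_squares) are illustrative, and naturals is repeated so the block runs on its own.

```python
import itertools

def naturals():
    n = 0
    while True:
        yield n
        n += 1

# Each stage is itself lazy, so stacking them on an infinite source is safe.
evens = (n for n in naturals() if n % 2 == 0)
even_squares = (n * n for n in evens)

# Only the consumer decides how much work happens.
first_four = list(itertools.islice(even_squares, 4))

# takewhile stops at the first value that fails the test.
squares_below_50 = list(
    itertools.takewhile(lambda v: v < 50, (n * n for n in naturals()))
)
```

A filter with no stopping condition, such as list(n for n in naturals() if n < 5), would still run forever, because the generator cannot know that no later value will match; that is why islice and takewhile are the usual tools for cutting an infinite stream.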
A note on generator state
Generators are also coroutine-like: you can push values in with gen.send(value) (the
value becomes the result of the paused yield expression) and stop them early with
gen.close(). This is advanced, but it's why generators underpin older-style coroutines.
def accumulator():
    total = 0
    while True:
        x = yield total  # hands back total, then receives whatever send() passes in
        total += x
In day-to-day code you'll mostly just iterate, but knowing send/close exist explains
how generators generalise beyond simple iteration.
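Driving the accumulator looks roughly like this. The one non-obvious step is priming: a freshly created generator has not reached a yield yet, so it must be advanced once with next() before send() can deliver a value. The variable names below are illustrative, and the generator is repeated so the block is self-contained.

```python
def accumulator():
    total = 0
    while True:
        x = yield total
        total += x

acc = accumulator()
initial = next(acc)       # prime: run to the first yield, which hands back 0
after_10 = acc.send(10)   # x = 10, total becomes 10, next yield hands it back
after_5 = acc.send(5)     # x = 5, total becomes 15

acc.close()               # raises GeneratorExit inside the generator at the paused yield

try:
    next(acc)             # a closed generator is finished
    closed = False
except StopIteration:
    closed = True
```

Calling send() with a non-None value on an unprimed generator raises TypeError, which is why the initial next() is needed.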
Recap
A function with yield is a generator — calling it returns a lazy iterator that pauses
at each yield and resumes on the next next(), keeping its local state. Generators
trade the random-access convenience of a list for constant memory and the ability to
model infinite streams, at the cost of being single-use. Reach for a generator
expression when streaming into a consumer, a list comprehension when you need the
list itself, and yield from to delegate to a sub-generator. Once you see yield as
"pause and hand back," the rest follows.