Three tools for the same job — but not interchangeable
Most Python developers use list comprehensions routinely, reach for generators when memory is tight, and write custom iterators when building reusable data sources. The differences go deeper than syntax: the three differ in memory behaviour, laziness, and reusability, and those properties determine which one suits a given situation.
Quick-reference comparison
| | List comprehension | Generator expression | Custom iterator (class) |
|---|---|---|---|
| Syntax | [x for x in it] | (x for x in it) | class with __iter__ + __next__ |
| Evaluation | Eager — builds all at once | Lazy — one item at a time | Lazy — one item at a time |
| Memory | O(n) — holds all items | O(1) — one item at a time | O(1) typically |
| Iterable again? | Yes, it is a list; reuse freely | No, exhausted after one pass | Only if __iter__ returns a fresh iterator; not if it returns self |
| Indexable? | Yes (result[0]) | No | No |
| len()? | Yes | No | No (unless you implement it) |
| Best for | Short sequences you'll reuse | Large / infinite / pipeline data | Stateful, reusable data sources |
List comprehensions — eager, indexed, reusable
A list comprehension builds the entire result in memory at once and returns a plain
list. Because it's a real list, you can index it, call len(), iterate it multiple
times, sort it, or pass it to any function expecting a sequence.
squares = [n * n for n in range(10)] # built immediately — all 10 values in RAM
squares[0] # 0 — indexable
len(squares) # 10 — knows its size
list(squares) # [0, 1, 4, ...] — can iterate again
Use a list comprehension when:
- The result is small enough to hold in memory.
- You need to index, sort, or inspect the result multiple times.
- You're transforming one sequence into another in a single step.
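The trade-off: for large data sets, storing every element wastes memory. On 64-bit CPython, a list of one million integers needs about 8 MB for the list object alone (its array of pointers), and the integer objects themselves add to that. The equivalent generator object stays on the order of 100 to 200 bytes, depending on the Python version.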
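The split between the list object and its elements can be checked directly. The following is a minimal sketch, assuming CPython (where `sys.getsizeof` is supported for these types); the size `N` is an arbitrary illustrative value, and the element total is an upper bound because CPython shares some small integers.

```python
import sys

N = 100_000  # illustrative size (assumption, not a figure from the text)

squares_list = [n * n for n in range(N)]
squares_gen = (n * n for n in range(N))

# getsizeof on a list counts the list object and its pointer array only,
# not the int objects those pointers refer to.
list_container = sys.getsizeof(squares_list)
list_elements = sum(sys.getsizeof(x) for x in squares_list)
gen_size = sys.getsizeof(squares_gen)

print(f"list container: {list_container:,} bytes")
print(f"list elements:  {list_elements:,} bytes (upper bound)")
print(f"generator:      {gen_size:,} bytes")
```

The exact figures vary by Python version and platform, so treat the output as an order-of-magnitude comparison.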
Generator expressions — lazy, one-pass, memory efficient
Change the brackets to parentheses and the comprehension becomes a generator expression: lazy, one-element-at-a-time, with a tiny fixed memory footprint. It doesn't compute anything until you iterate it.
import sys
nums_list = [n * n for n in range(1_000_000)] # ~8 MB — built now
nums_gen = (n * n for n in range(1_000_000)) # ~100 bytes — nothing built yet
sys.getsizeof(nums_list) # large
sys.getsizeof(nums_gen) # tiny, regardless of range size
# Use just like a list in a for loop or sum/max/any/all:
total = sum(n * n for n in range(1_000_000)) # streams, never builds the list
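Laziness can be made visible by recording when the element expression actually runs. This is a small sketch; `traced_square` and the `calls` list are hypothetical helpers introduced only for illustration.

```python
calls = []

def traced_square(n):
    """Square n and record that the call happened (illustrative helper)."""
    calls.append(n)
    return n * n

lazy = (traced_square(n) for n in range(5))
calls_before = list(calls)     # snapshot: no element has been computed yet

first = next(lazy)             # computes exactly one item
calls_after_one = list(calls)  # snapshot: only the first call is recorded

rest = list(lazy)              # drains the remaining items
```

Creating the generator expression calls `traced_square` zero times; each `next()` triggers exactly one call.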
The critical limitation: a generator is exhausted after one pass. Iterating it a second time yields nothing:
gen = (n for n in range(3))
list(gen) # [0, 1, 2]
list(gen) # [] — exhausted; create a new generator if you need to re-iterate
Generator functions (using yield) give you the same laziness with more control:
def squares(n):
    for i in range(n):
        yield i * i  # pauses here between calls to next()

for s in squares(5):
    print(s)  # 0, 1, 4, 9, 16 (computed one at a time)
Use a generator when:
- The data set is large or potentially infinite.
- You're building a pipeline (transforming → filtering → consuming without staging all at once).
- You only need to iterate once.
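The pipeline and infinite-data cases from the list above can be sketched together. The stage names `evens` and `squared` are illustrative; `itertools.count` and `itertools.islice` are standard-library functions.

```python
from itertools import count, islice

def evens(numbers):
    """Filtering stage: pass through even numbers only."""
    for n in numbers:
        if n % 2 == 0:
            yield n

def squared(numbers):
    """Transforming stage: square each incoming number."""
    for n in numbers:
        yield n * n

# count(1) never ends; each stage pulls one item at a time from the
# previous stage, so nothing is staged in a list along the way.
pipeline = squared(evens(count(1)))

first_five = list(islice(pipeline, 5))
print(first_five)  # [4, 16, 36, 64, 100]
```

Because every stage is lazy, `islice` decides how much of the infinite source is ever consumed.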
Custom iterators — stateful, reusable, flexible
A custom iterator is a class that implements the iterator protocol: __iter__ returns
self, and __next__ returns the next value or raises StopIteration. This approach is
more verbose than a generator but gives you full control over state, restartability, and
additional methods.
class Counter:
    def __init__(self, start, stop):
        self.current = start
        self.stop = stop

    def __iter__(self):
        return self  # the iterator is its own iterable

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

c = Counter(1, 4)
list(c)  # [1, 2, 3]
# Note: c is now exhausted, because Counter.__iter__ returns self, not a reset copy
# Reset by creating a new Counter(1, 4)
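It may help to see how a `for` loop drives this protocol. The function below is a rough equivalent written by hand, not the interpreter's actual implementation; `manual_for` is a hypothetical name.

```python
def manual_for(iterable):
    """Collect items roughly the way a for loop would."""
    collected = []
    iterator = iter(iterable)      # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break                  # StopIteration ends the loop quietly
        collected.append(item)
    return collected

result = manual_for("abc")
print(result)  # ['a', 'b', 'c']
```

Any object that supports `iter()` works here, including instances of a class that defines `__iter__` and `__next__`.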
For a reusable iterable (re-iterate from the start each time), separate the iterable from the iterator:
class NumberRange:
    def __init__(self, start, stop):
        self.start, self.stop = start, stop

    def __iter__(self):
        return Counter(self.start, self.stop)  # fresh iterator on each call

r = NumberRange(1, 4)
list(r)  # [1, 2, 3]
list(r)  # [1, 2, 3] again, because each pass gets a fresh Counter
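A common shortcut, shown here as an alternative sketch rather than a replacement for the approach above, is to write `__iter__` as a generator function. Each call then returns a new generator object, so no separate iterator class is needed. `NumberRangeGen` is a hypothetical name.

```python
class NumberRangeGen:
    """Re-iterable range whose __iter__ is a generator function."""

    def __init__(self, start, stop):
        self.start, self.stop = start, stop

    def __iter__(self):
        # Calling a generator function creates a new generator object,
        # so every pass starts again from self.start.
        current = self.start
        while current < self.stop:
            yield current
            current += 1

r = NumberRangeGen(1, 4)
first_pass = list(r)
second_pass = list(r)
print(first_pass, second_pass)  # [1, 2, 3] [1, 2, 3]
```

The trade-off is that the generator version has no place for extra methods on the iterator itself, which is where a dedicated iterator class still earns its verbosity.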
Use a custom iterator class when:
- The data source has complex state that benefits from methods (reset(), peek()).
- You want the same iterable to be re-iterable (separate iterable from iterator).
- You're integrating with a resource (file, socket, DB cursor) that requires explicit cleanup, ideally through a close() method or context-manager support, since __del__ is not guaranteed to run promptly.
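As an illustration of the first point, here is one possible sketch of an iterator with a peek() method. The class name `Peekable`, the sentinel, and the `default` parameter are all assumptions made for this example.

```python
class Peekable:
    """Wrap any iterable and add peek() (illustrative helper)."""

    _MISSING = object()  # sentinel meaning "nothing buffered"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = self._MISSING

    def __iter__(self):
        return self

    def __next__(self):
        if self._buffer is not self._MISSING:
            value, self._buffer = self._buffer, self._MISSING
            return value
        return next(self._it)  # propagates StopIteration when exhausted

    def peek(self, default=None):
        """Return the next item without consuming it, or default if none."""
        if self._buffer is self._MISSING:
            try:
                self._buffer = next(self._it)
            except StopIteration:
                return default
        return self._buffer

p = Peekable([10, 20])
peeked = p.peek()   # looks at 10 without consuming it
items = list(p)     # still yields both items
after = p.peek()    # nothing left, so the default is returned
```

A plain generator cannot offer this kind of extra method, which is the sort of state handling where a class is worth the additional code.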
The decision flowchart
Is the result small and will you use it multiple times (index, sort, len)?
└── Yes → list comprehension
Is the data large, infinite, or a processing pipeline you'll consume once?
└── Yes → generator expression or generator function (yield)
Do you need a reusable data source with state or multiple methods?
└── Yes → custom iterator class
When generators beat comprehensions even for small data
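Even for small data, a generator expression is usually the better argument to sum(), max(), any(), and all(), because
those functions consume the iterable in a single pass and never need the list; any() and all() can also stop early
without computing the remaining items. The gain is memory and skipped work rather than raw speed (for very small inputs
the list version can be marginally faster in CPython). Passing a generator expression avoids the intermediate allocation: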
# unnecessary list — builds it, then sums it, then discards it
total = sum([n * n for n in range(1000)])
# better — streams directly into sum(), no list ever built
total = sum(n * n for n in range(1000))
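When the generator expression is the only argument, the parentheses of the call itself (here sum(...)) serve as its
delimiters, so no extra pair is needed. If the call takes a second argument, the generator expression needs its own parentheses.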
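Both the parenthesis rule and the early exit of any() can be checked in a few lines. This is a small sketch; `is_big` and the `checked` list are hypothetical helpers used to record which items were examined.

```python
# Sole argument: the call's own parentheses are enough.
total = sum(n * n for n in range(5))  # 0 + 1 + 4 + 9 + 16

# With a second argument (the start value), the generator expression
# needs its own parentheses; omitting them is a SyntaxError.
total_from_ten = sum((n * n for n in range(5)), 10)

# any() stops at the first truthy item, so the generator never
# produces the remaining values.
checked = []

def is_big(n):
    checked.append(n)
    return n > 2

found = any(is_big(n) for n in range(1_000))
print(total, total_from_ten, found, checked)
```

With a list comprehension in place of the generator expression, `is_big` would run for all 1,000 values before any() saw the first one.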
Recap
A list comprehension is eager, O(n) memory, indexable, and re-iterable — use it when
the result fits in memory and you need a real list. A generator expression (or
yield-based generator function) is lazy, O(1) memory, and one-pass — use it for large
data, infinite sequences, or pipelines. A custom iterator class is lazy, stateful, and
flexible — use it when you need reusability, complex state, or resource management that
goes beyond what a generator can express cleanly.