Python itertools, explained
itertools is a standard-library module of fast, lazy iterator building blocks. Its
functions return iterators that produce values on demand, so you can compose pipelines over
huge or infinite streams without first building a list in memory. Here are the tools worth
knowing by heart.
Infinite iterators
count, cycle, and repeat generate endless streams (repeat only when called without a
times argument). They're useful precisely because they're lazy, so pair them with
something that stops the iteration, such as islice or zip.
from itertools import count, cycle, islice
ids = count(start=100, step=10) # 100, 110, 120, ...
print(next(ids), next(ids)) # 100 110
colors = cycle(["red", "green", "blue"]) # loops forever
first_seven = list(islice(colors, 7)) # ['red','green','blue','red','green','blue','red']
islice is the standard way to take a finite slice of an infinite (or lazy) iterator — you
can't use [:7] on an iterator.
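The snippet above imports count and cycle but never shows repeat, so here is a minimal sketch of it. The variable names are illustrative only.

```python
from itertools import islice, repeat

# repeat(x, n) is finite: it yields x exactly n times.
padding = list(repeat("-", 3))

# Without a count, repeat(x) is endless. map() stops as soon as its
# shortest input (here range(5)) is exhausted, so this terminates.
squares = list(map(pow, range(5), repeat(2)))

# islice bounds an endless repeat explicitly.
zeros = list(islice(repeat(0), 4))

print(padding)  # ['-', '-', '-']
print(squares)  # [0, 1, 4, 9, 16]
print(zeros)    # [0, 0, 0, 0]
```

Supplying a constant argument to map or zip is probably the most common use of repeat, since it avoids building a list of identical values.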
Combining and slicing streams
chain concatenates several iterables into one lazy iterator; islice takes a slice of any
iterable without materialising it.
from itertools import chain, islice
merged = chain([1, 2], [3, 4], [5]) # 1 2 3 4 5 lazily
list(merged) # [1, 2, 3, 4, 5]
big = range(1_000_000)
window = list(islice(big, 10, 20)) # items 10..19, no giant list built
chain.from_iterable(list_of_lists) is the idiomatic one-level flatten when the iterables
themselves come from an iterable.
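A short sketch of chain.from_iterable in use, with made-up sample data. It also shows that only one level of nesting is removed.

```python
from itertools import chain

list_of_lists = [[1, 2], [3], [4, 5]]  # illustrative data

# Concatenate the inner lists lazily, then materialise for display.
flat = list(chain.from_iterable(list_of_lists))

# Only the outer level is flattened: deeper nesting is left intact.
nested = [[1, [2, 3]], [4]]
one_level = list(chain.from_iterable(nested))

print(flat)       # [1, 2, 3, 4, 5]
print(one_level)  # [1, [2, 3], 4]
```

Unlike chain(*list_of_lists), from_iterable does not unpack the outer iterable up front, so it also works when the outer iterable is itself a lazy or very long stream.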
groupby — consecutive runs
groupby groups consecutive items that share the same key. The catch that trips everyone
up: it only groups adjacent items, so unless you actually want runs, you usually sort by
the same key first.
from itertools import groupby
data = [("a", 1), ("a", 2), ("b", 3), ("a", 4)]
data.sort(key=lambda x: x[0]) # sort first — groupby is adjacency-based
for key, group in groupby(data, key=lambda x: x[0]):
    print(key, [v for _, v in group]) # prints: a [1, 2, 4] then b [3]
Each group is itself a one-shot iterator — consume it before advancing to the next group.
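To make the adjacency rule concrete, here is a small sketch comparing unsorted and sorted input. The readings list is invented for illustration. Each group is turned into a list inside the comprehension, before groupby advances to the next key.

```python
from itertools import groupby

readings = ["up", "up", "down", "up", "up", "up"]  # illustrative data

# Unsorted: every run of equal neighbours becomes its own group,
# so "up" appears twice.
runs = [(key, len(list(group))) for key, group in groupby(readings)]

# Sorted first: equal items are adjacent, so there is one group per key.
totals = [(key, len(list(group))) for key, group in groupby(sorted(readings))]

print(runs)    # [('up', 2), ('down', 1), ('up', 3)]
print(totals)  # [('down', 1), ('up', 5)]
```

The unsorted behaviour is useful in its own right, for example for run-length encoding.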
accumulate — running totals
accumulate yields a running reduction. By default it computes a cumulative sum, but you
can pass any binary function as the second argument.
from itertools import accumulate
import operator
list(accumulate([1, 2, 3, 4])) # [1, 3, 6, 10]
list(accumulate([1, 2, 3, 4], operator.mul)) # [1, 2, 6, 24]
list(accumulate([3, 1, 4, 1, 5], max)) # [3, 3, 4, 4, 5] running max
It's the lazy, streaming cousin of functools.reduce that keeps every intermediate result.
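The relationship to functools.reduce can be sketched as follows: the last value accumulate yields matches what reduce returns for the same function and input. The deposits data is illustrative, and the initial keyword assumes Python 3.8 or later.

```python
from functools import reduce
from itertools import accumulate
import operator

deposits = [10, 20, 30]  # illustrative data

# accumulate keeps every intermediate total; reduce keeps only the last.
running = list(accumulate(deposits))
final = reduce(operator.add, deposits)

# initial= (Python 3.8+) seeds the stream, so the output has one
# extra leading item.
with_opening = list(accumulate(deposits, initial=100))

print(running)       # [10, 30, 60]
print(final)         # 60
print(with_opening)  # [100, 110, 130, 160]
```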
Combinatorics — product, permutations, combinations
These generate combinatoric sequences lazily, replacing hand-rolled nested loops.
from itertools import product, permutations, combinations
list(product([0, 1], repeat=2)) # [(0,0),(0,1),(1,0),(1,1)] — nested loops
list(permutations("ABC", 2)) # ordered pairs: AB AC BA BC CA CB
list(combinations("ABC", 2)) # unordered pairs: AB AC BC
product replaces nested for loops; permutations treats AB and BA as different results;
combinations treats them as the same. Use combinations_with_replacement when an element
may be paired with itself.
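A brief sketch of two of the claims above: product yields the same tuples, in the same order, as the equivalent nested loop, and combinations_with_replacement adds the self-pairs that combinations omits. The sizes and colors lists are illustrative.

```python
from itertools import combinations_with_replacement, product

sizes = ["S", "M"]          # illustrative data
colors = ["red", "blue"]    # illustrative data

# Hand-rolled nested loop versus product: same tuples, same order.
nested = [(s, c) for s in sizes for c in colors]
via_product = list(product(sizes, colors))

# Repeats allowed: AA and BB appear alongside AB.
pairs = list(combinations_with_replacement("AB", 2))

print(nested == via_product)  # True
print(pairs)                  # [('A', 'A'), ('A', 'B'), ('B', 'B')]
```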
Filtering and pairing helpers
A few more everyday tools: takewhile/dropwhile for prefix-based filtering, pairwise
(3.10+) for sliding adjacent pairs, and starmap to unpack tuples into a function.
from itertools import takewhile, pairwise, starmap
list(takewhile(lambda x: x < 3, [1, 2, 3, 1])) # [1, 2] — stops at first failure
list(pairwise([1, 2, 3, 4])) # [(1,2),(2,3),(3,4)]
list(starmap(pow, [(2, 3), (3, 2)])) # [8, 9] — pow(2,3), pow(3,2)
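dropwhile is mentioned above but not shown, so here is a minimal sketch contrasting it with takewhile and the built-in filter on the same input. The helper is_small is a hypothetical name used only for this example.

```python
from itertools import dropwhile, takewhile

values = [1, 2, 3, 1]


def is_small(x):
    """Predicate used for all three calls below."""
    return x < 3


# takewhile: yield items until the predicate first fails, then stop.
head = list(takewhile(is_small, values))

# dropwhile: skip items until the predicate first fails, then yield the rest.
tail = list(dropwhile(is_small, values))

# filter: test every item independently, regardless of position.
kept = list(filter(is_small, values))

print(head)  # [1, 2]
print(tail)  # [3, 1]
print(kept)  # [1, 2, 1]
```

The trailing 1 makes the difference visible: takewhile and dropwhile only look at the prefix, while filter examines the whole input.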
Recap
itertools gives you lazy iterator building blocks that compose into memory-efficient
pipelines: count/cycle/repeat for infinite streams (bound them with islice),
chain to concatenate, groupby for consecutive runs (sort first!), accumulate for
running reductions, and product/permutations/combinations for combinatorics without
nested loops. Because everything is an on-demand iterator, these tools scale to huge or
infinite data that would never fit in a list.
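As one possible illustration of how these pieces compose, the sketch below chains count, accumulate, and takewhile over an endless stream. It relies on the fact that the running sums of the odd numbers are the perfect squares. Nothing is computed until list() pulls values through the pipeline.

```python
from itertools import accumulate, count, islice, takewhile

# Endless stream of odd numbers: 1, 3, 5, ...
odds = count(start=1, step=2)

# Running sums of the odd numbers: 1, 4, 9, 16, ...
squares = accumulate(odds)

# Stop the endless stream at the first value that is not below 50.
below_50 = list(takewhile(lambda s: s < 50, squares))

# The same pipeline bounded by position instead of by value.
first_three = list(islice(accumulate(count(start=1, step=2)), 3))

print(below_50)     # [1, 4, 9, 16, 25, 36, 49]
print(first_three)  # [1, 4, 9]
```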