The CPython execution model, explained
"Python is interpreted" is only half the story. CPython actually compiles your source to
bytecode and then runs that bytecode on a virtual machine. Understanding this pipeline —
source → bytecode → evaluation loop — demystifies performance, .pyc files, and the GIL.
Source to bytecode
When you import or run a module, CPython parses the source into an AST and compiles it to bytecode: a sequence of low-level instructions for Python's virtual machine. This is not machine code — it's an intermediate representation the interpreter understands.
import dis
def add(a, b):
    return a + b
dis.dis(add)
# Abbreviated output; exact opcodes and layout vary by CPython version:
# LOAD_FAST a
# LOAD_FAST b
# BINARY_OP + (add)
# RETURN_VALUE
dis disassembles a function so you can see the exact instructions. Each line is one
bytecode operation the VM will execute.
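To make the pipeline concrete, here is a minimal sketch that performs the steps by hand using the built-in ast module, compile() and exec(). The source string and the "<example>" filename are illustrative choices, not anything CPython requires, and the opcode names printed at the end will differ between CPython versions.

```python
import ast
import dis

source = "def add(a, b):\n    return a + b\n"

# Step 1: parse the source text into an abstract syntax tree.
tree = ast.parse(source, filename="<example>")

# Step 2: compile the AST into a module-level code object.
module_code = compile(tree, filename="<example>", mode="exec")

# Step 3: execute the module code; running the `def` statement binds `add`.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# Collect opcode names instead of relying on the printed layout of dis.dis.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```

Collecting names with dis.get_instructions is handy when you want to inspect bytecode from code rather than read it off the screen.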
Code objects hold the bytecode
The compiled bytecode lives in a code object (func.__code__), alongside metadata:
constants, variable names, argument count, and flags. A function is essentially a code
object plus its defaults and closure.
add.__code__.co_code # the raw bytecode bytes
add.__code__.co_consts # constants used
add.__code__.co_varnames # ('a', 'b')
add.__code__.co_argcount # 2
Code objects are reused — defining a function compiles once, and every call re-runs the same code object with a fresh frame.
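One way to see this sharing is with a function factory. In the sketch below, make_adder and adder are hypothetical names chosen for illustration: every call to make_adder builds a new function object, but all of them point at the same compiled code object and differ only in their closure.

```python
def make_adder(n):
    def adder(x):
        return x + n
    return adder

add_one = make_adder(1)
add_ten = make_adder(10)

# Two distinct function objects...
print(add_one is add_ten)                                # False
# ...sharing a single compiled code object.
print(add_one.__code__ is add_ten.__code__)              # True
# The inner code object is stored as a constant of the outer one.
print(add_one.__code__ in make_adder.__code__.co_consts)  # True
# What differs per function is the closure it carries.
print(add_one.__closure__[0].cell_contents)              # 1
print(add_ten.__closure__[0].cell_contents)              # 10
```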
The evaluation loop and the stack
CPython runs bytecode in a giant loop (historically ceval.c) — a stack-based virtual
machine. Instructions push and pop operands on an evaluation stack.
# return a + b becomes:
LOAD_FAST a # push a onto the stack
LOAD_FAST b # push b
BINARY_OP + # pop both, push their sum
RETURN_VALUE # pop and return
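The push and pop behaviour can be imitated with a toy interpreter. This is only a sketch of the idea, not CPython's real loop: the run function, the (opname, argument) tuple format and the BINARY_OPERATORS table are all invented for this example.

```python
import operator

# Hypothetical operator table for the toy machine; CPython encodes this differently.
BINARY_OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def run(instructions, local_vars):
    """Execute a list of (opname, argument) pairs on a simple value stack."""
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])      # push a local variable
        elif opname == "BINARY_OP":
            right = stack.pop()                # pop both operands...
            left = stack.pop()
            stack.append(BINARY_OPERATORS[arg](left, right))  # ...push the result
        elif opname == "RETURN_VALUE":
            return stack.pop()                 # pop and return
        else:
            raise ValueError(f"unsupported instruction: {opname}")
    raise RuntimeError("program ended without RETURN_VALUE")

program = [
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_OP", "+"),
    ("RETURN_VALUE", None),
]
print(run(program, {"a": 2, "b": 3}))  # 5
```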
Each function call creates a frame object holding its local variables and its own stack. This is what you see in a traceback — a stack of frames.
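You can walk that stack of frames yourself. The sketch below uses inspect.currentframe(), which works on CPython but may return None on other implementations; inner and outer are hypothetical function names.

```python
import inspect

def inner():
    # Start at the current frame and follow f_back to each caller in turn.
    frame = inspect.currentframe()
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names

def outer():
    return inner()

call_chain = outer()
print(call_chain[:2])  # ['inner', 'outer']
```

Each frame exposes the code object it is running (f_code) and its caller (f_back), which is the same chain a traceback prints in reverse.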
.pyc files cache the bytecode
Compilation isn't free, so CPython caches the bytecode of imported modules in __pycache__
as .pyc files. On the next import, if the source is unchanged (by default, judged by the
modification time and size recorded in the .pyc header), it loads the .pyc and skips
recompiling.
__pycache__/
mymodule.cpython-312.pyc
The filename encodes the interpreter version, since bytecode is not stable across Python
versions: each .pyc starts with a version-specific magic number, so 3.12 will not load a
.pyc compiled by 3.11. Note that the top-level script you run directly isn't cached, only
imported modules.
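To check where a module's cache file ends up, the standard library can tell you. The sketch below writes a throwaway mymodule.py into a temporary directory (an assumption made so the example is self-contained) and compiles it explicitly with py_compile rather than importing it. It assumes an interpreter with bytecode caching available, which is the normal case for CPython.

```python
import importlib.util
import os
import py_compile
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "mymodule.py")
    with open(source_path, "w") as f:
        f.write("VALUE = 42\n")

    # Ask the import machinery where the cached bytecode would live.
    expected = importlib.util.cache_from_source(source_path)

    # Compile explicitly; the return value is the path of the file written.
    written = py_compile.compile(source_path, doraise=True)

    pyc_name = os.path.basename(written)
    pyc_exists = os.path.exists(written)

print(pyc_name)                      # e.g. mymodule.cpython-312.pyc
print(sys.implementation.cache_tag)  # the version tag embedded in that name
```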
Where the GIL fits
The Global Interpreter Lock protects the interpreter's internal state (including reference counts) by ensuring only one thread executes bytecode at a time. It lives at the evaluation-loop level: a thread holds the GIL while running bytecode and gives it up when it blocks on I/O or, if other threads are waiting, after a switch interval (5 ms by default).
# Two threads doing CPU work don't run bytecode simultaneously — the GIL serialises them.
# Threads DO overlap when one is blocked on I/O, because the GIL is released during the wait.
This is why threads help I/O-bound code but not CPU-bound code, and why true CPU parallelism needs multiple processes. (CPython 3.13 introduced an experimental free-threaded build that can disable the GIL.)
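The I/O half of that claim is easy to observe. In this rough sketch, time.sleep stands in for a blocking I/O call because it releases the GIL while waiting; wait_like_io and timed_threads are hypothetical helper names, and the exact timing will vary by machine.

```python
import sys
import threading
import time

def wait_like_io(seconds):
    # time.sleep releases the GIL, much like a blocking read or network call.
    time.sleep(seconds)

def timed_threads(target, args, count):
    """Run `count` threads of `target` and return the total wall-clock time."""
    threads = [threading.Thread(target=target, args=args) for _ in range(count)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start

elapsed = timed_threads(wait_like_io, (0.2,), 4)
print(f"switch interval: {sys.getswitchinterval()} s")
print(f"4 threads sleeping 0.2 s each took {elapsed:.2f} s in total")
```

If the waits were serialised the total would be about 0.8 s; because the GIL is released during each sleep, it should come out close to 0.2 s.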
CPython vs other implementations
CPython is the reference implementation, but its execution model isn't the only one. PyPy uses a JIT compiler to turn frequently executed code into machine code, often with large speedups on long-running programs; Jython and IronPython target the JVM and .NET, respectively. "Python the language" is a spec; CPython is one (dominant) way to run it.
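If a script depends on CPython-specific details such as bytecode layout or .pyc files, it can check which implementation it is running on. A small sketch using only the standard library:

```python
import platform
import sys

implementation = platform.python_implementation()  # e.g. 'CPython' or 'PyPy'
print(implementation)
print(sys.implementation.name)                      # lowercase identifier, e.g. 'cpython'

if implementation != "CPython":
    print("Bytecode details described here may not apply.")
```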
Recap
CPython compiles source to bytecode, stored in code objects, then executes it on
a stack-based virtual machine — the evaluation loop — where each call gets its own
frame. Imported modules' bytecode is cached as version-stamped .pyc files in
__pycache__ to skip recompilation. The GIL lives at this level, letting only one
thread run bytecode at a time (released during I/O), which is why CPU parallelism needs
processes. Use dis to see the bytecode, and remember CPython is just one implementation of
the language.