The CPython Execution Model Explained — Bytecode, the Interpreter Loop, and the GIL

"Python is interpreted" is only half the story. CPython actually compiles your source to bytecode and then runs that bytecode on a virtual machine. Understanding this pipeline — source → bytecode → evaluation loop — demystifies performance, .pyc files, and the GIL.

Source to bytecode

When you import or run a module, CPython parses the source into an AST and compiles it to bytecode: a sequence of low-level instructions for Python's virtual machine. This is not machine code — it's an intermediate representation the interpreter understands.

import dis

def add(a, b):
    return a + b

dis.dis(add)
# Simplified output; exact opcodes vary by CPython version:
#   LOAD_FAST   a
#   LOAD_FAST   b
#   BINARY_OP   0 (+)
#   RETURN_VALUE

dis disassembles a function so you can see the exact instructions. Each line is one bytecode operation the VM will execute.
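
The parse-then-compile pipeline described above can also be driven by hand with the built-in ast.parse, compile, and exec. This is a minimal sketch: the source string, the "<example>" filename, and the namespace dict are illustrative assumptions, not something CPython requires.

```python
import ast
import dis

source = "result = 2 + 3 * x"

# Step 1: parse the source text into an abstract syntax tree.
tree = ast.parse(source, filename="<example>", mode="exec")

# Step 2: compile the AST into a code object that holds the bytecode.
code = compile(tree, filename="<example>", mode="exec")

# Step 3: execute the code object in a namespace we control.
namespace = {"x": 4}
exec(code, namespace)
print(namespace["result"])  # 14

# Show the instructions the compiler produced (output varies by version).
dis.dis(code)
```

Importing a module performs roughly the same steps, with the module's own namespace taking the place of the dict used here.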

Code objects hold the bytecode

The compiled bytecode lives in a code object (func.__code__), alongside metadata: constants, variable names, argument count, and flags. A function is essentially a code object plus a reference to its globals, its default argument values, and its closure.

add.__code__.co_code         # the raw bytecode bytes
add.__code__.co_consts       # constants used
add.__code__.co_varnames     # ('a', 'b')
add.__code__.co_argcount     # 2

Code objects are reused: a function body is compiled once, and every call runs the same code object in a fresh frame.
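
To make the reuse concrete, the sketch below checks that calls leave the code object untouched, then builds a second function around the same code object with types.FunctionType. The name add_default_100 and the default values are hypothetical, chosen only for illustration.

```python
import types

def add(a, b=10):
    return a + b

code = add.__code__

# Calling the function does not create or replace its code object.
add(1, 2)
add(3)
assert add.__code__ is code

# A second function object can wrap the same code object while carrying
# different defaults. Arguments: code, globals, name, argdefs.
add_default_100 = types.FunctionType(code, globals(), "add_default_100", (100,))

print(add(1))              # 11
print(add_default_100(1))  # 101
```

Both functions run identical bytecode; only the data attached to the function object differs.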

The evaluation loop and the stack

CPython runs bytecode in one large loop (historically in ceval.c) that implements a stack-based virtual machine. Instructions push operands onto an evaluation stack and pop them off again.

# return a + b becomes:
LOAD_FAST a      # push a onto the stack
LOAD_FAST b      # push b
BINARY_OP +      # pop both, push their sum
RETURN_VALUE     # pop and return
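
The four instructions above can be simulated with a toy stack machine. This is a deliberately tiny sketch, not how ceval.c is written: the run function, the tuple-based instruction format, and the single supported operator are all assumptions made for illustration.

```python
def run(instructions, local_vars):
    """Execute a tiny, made-up instruction set on an operand stack."""
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_FAST":
            # Push the value of a local variable.
            stack.append(local_vars[arg])
        elif opname == "BINARY_OP":
            # Pop the two operands (right one first), push the result.
            right = stack.pop()
            left = stack.pop()
            if arg != "+":
                raise ValueError(f"unsupported operator: {arg}")
            stack.append(left + right)
        elif opname == "RETURN_VALUE":
            # Pop the top of the stack and hand it back to the caller.
            return stack.pop()
        else:
            raise ValueError(f"unknown instruction: {opname}")
    raise RuntimeError("program ended without RETURN_VALUE")

program = [
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_OP", "+"),
    ("RETURN_VALUE", None),
]

print(run(program, {"a": 2, "b": 3}))  # 5
```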

Each function call creates a frame object holding its local variables and its own stack. This is what you see in a traceback — a stack of frames.
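
Frames can be inspected from Python itself. The sketch below uses sys._getframe, a CPython-specific helper, to walk from a callee's frame to its caller's; the function names inner and outer and the variable local_value are hypothetical.

```python
import sys

def inner():
    # The frame of the function that is currently executing.
    frame = sys._getframe()
    # f_back links to the caller's frame, forming the call stack.
    caller = frame.f_back
    return (
        frame.f_code.co_name,             # name of this function
        caller.f_code.co_name,            # name of the calling function
        caller.f_locals["local_value"],   # a local variable of the caller
    )

def outer():
    local_value = 42
    return inner()

print(outer())  # ('inner', 'outer', 42)
```

Note that each frame points at the code object it is executing (f_code), which ties the frame back to the bytecode.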

.pyc files cache the bytecode

Compilation isn't free, so CPython caches the bytecode of imported modules in __pycache__ as .pyc files. On the next import, if the source file's modification time and size still match those recorded in the .pyc (the default check), it loads the .pyc and skips recompiling.

__pycache__/
    mymodule.cpython-312.pyc

The filename encodes the interpreter version because bytecode is not stable across Python versions: a .pyc written by 3.11 is not loaded by 3.12, which recompiles from source instead. Note that the top-level script you run directly isn't cached; only imported modules are.
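
The naming scheme can be queried with importlib rather than guessed. A small sketch, assuming a hypothetical source file called mymodule.py (it does not need to exist for these calls):

```python
import importlib.util
import sys

# Where this interpreter would cache bytecode for a (hypothetical) mymodule.py.
cache_path = importlib.util.cache_from_source("mymodule.py")
print(cache_path)

# The tag embedded in the filename, identifying implementation and version.
print(sys.implementation.cache_tag)

# Every .pyc starts with this magic number; it changes whenever the
# bytecode format changes, so stale caches can be detected.
print(importlib.util.MAGIC_NUMBER)
```

The printed path and tag depend on the interpreter running the code, which is the point: each version reads and writes its own cache files.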

Where the GIL fits

The Global Interpreter Lock (GIL) protects the interpreter's internal state (including reference counts) by ensuring only one thread executes bytecode at a time. It lives at the evaluation-loop level: a thread holds the GIL while running bytecode and releases it when it blocks on I/O or when another thread has been waiting longer than the switch interval (5 ms by default).

# Two threads doing CPU work don't run bytecode simultaneously — the GIL serialises them.
# Threads DO overlap when one is blocked on I/O, because the GIL is released during the wait.

This is why threads help I/O-bound code but not CPU-bound code, and why CPU parallelism in pure Python code normally needs multiple processes. (CPython 3.13 introduced an experimental free-threaded build that can disable the GIL.)
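
The I/O case is easy to observe. In this rough sketch, time.sleep stands in for a blocking I/O call (it releases the GIL while waiting); the thread count and the 0.2 second duration are arbitrary choices, and the helper name wait_for_io is hypothetical.

```python
import sys
import threading
import time

def wait_for_io(seconds):
    # time.sleep releases the GIL, much like a blocking read would.
    time.sleep(seconds)

threads = [threading.Thread(target=wait_for_io, args=(0.2,)) for _ in range(4)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The waits overlap, so the total is close to 0.2 s rather than 0.8 s.
print(f"4 sleeps of 0.2 s took {elapsed:.2f} s")

# How long a thread may keep the GIL before another waiting thread
# asks it to let go.
print(sys.getswitchinterval())
```

Replacing the sleep with a CPU-heavy loop would typically show no such overlap on a standard (GIL-enabled) build, though exact timings depend on the machine.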

CPython vs other implementations

CPython is the reference implementation, but its execution model isn't the only one. PyPy uses a JIT compiler to turn hot code paths into machine code, which can give large speedups on long-running programs; Jython and IronPython target the JVM and .NET. "Python the language" is a spec; CPython is one (dominant) way to run it.
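
When code depends on CPython-specific behaviour such as the bytecode format or .pyc caching, it can check which implementation it is running on. A minimal sketch using only the standard library:

```python
import platform
import sys

# Human-readable implementation name, e.g. 'CPython' or 'PyPy'.
implementation = platform.python_implementation()
print(implementation)

# Lower-case identifier for the same information, e.g. 'cpython'.
print(sys.implementation.name)

if implementation != "CPython":
    print("Bytecode details described here may not apply.")
```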

Recap

CPython compiles source to bytecode, stored in code objects, then executes it on a stack-based virtual machine — the evaluation loop — where each call gets its own frame. Imported modules' bytecode is cached as version-stamped .pyc files in __pycache__ to skip recompilation. The GIL lives at this level, letting only one thread run bytecode at a time (released during I/O), which is why CPU parallelism needs processes. Use dis to see the bytecode, and remember CPython is just one implementation of the language.
