The CPython Execution Model Interview Questions & Answers
6 questions Updated 2026-06-18
Python interview questions on the CPython execution model: source to bytecode to .pyc, the interpreter loop, the dis module, frames and the call stack, and whether Python is compiled or interpreted.
CPython first compiles your .py source into bytecode — a compact,
platform-independent instruction set for the Python virtual machine. That
bytecode is then executed by the interpreter loop (a big evaluation loop,
historically a giant switch), which runs one bytecode instruction at a time.
Compiled bytecode is cached as .pyc files in __pycache__.
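# mymod.py -> compiled to __pycache__/mymod.cpython-3xx.pyc
# The .pyc is reused if the source is unchanged (checked by the source's
# modification time and size by default, or by a source hash for hash-based
# .pyc files), so imports skip recompiling. It is NOT machine code; it is
# still bytecode, and its format is specific to the CPython version.
def add(a, b):
    return a + b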
The key points: bytecode is an intermediate representation (not native
machine code), .pyc is just a cache to skip recompilation on import, and the
VM interprets it at runtime. Rule of thumb: source -> bytecode -> interpreter
loop, with .pyc caching the middle step.
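The cache location described above can be checked without importing or compiling anything. This is a minimal sketch, assuming a hypothetical module file named mymod.py and a default cache configuration; importlib.util.cache_from_source only computes the path and does not create the file.

```python
import importlib.util
import sys

# Path where the bytecode for a (hypothetical) mymod.py would be cached.
# Nothing is compiled or written here; this only computes the path.
pyc_path = importlib.util.cache_from_source("mymod.py")
print(pyc_path)

# The tag embedded in the filename names the implementation and version,
# which is why a .pyc written by one Python version is not reused by another.
print(sys.implementation.cache_tag)
```

On CPython the printed path normally sits under __pycache__ and carries a tag such as cpython-3xx, matching the comment in the example above.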
CPython is the reference implementation written in C — it's what you get from python.org and what most people mean by "Python." The language is a spec; an implementation runs it. Alternatives include PyPy (with a JIT that often runs much faster), Jython (runs on the JVM), and IronPython (on .NET).
import platform
print(platform.python_implementation()) # 'CPython', 'PyPy', 'Jython', ...
They differ in performance and integration: PyPy speeds up long-running pure-Python code via its JIT, Jython and IronPython interoperate with Java and .NET libraries, and CPython has the widest C-extension ecosystem (NumPy, etc.) but also the GIL. Rule of thumb: "Python" is the language; CPython is the dominant implementation, and the others trade ecosystem reach for speed or platform integration.
The dis ("disassemble") module shows the bytecode instructions a
function compiles to — useful for understanding what Python actually does under
the hood and for comparing the cost of two approaches.
import dis
def add(a, b):
    return a + b
dis.dis(add)
# Example output (abbreviated; exact opcodes vary by CPython version,
# e.g. BINARY_ADD before 3.11 and BINARY_OP from 3.11 on):
#   LOAD_FAST     a
#   LOAD_FAST     b
#   BINARY_OP     +    # add the two
#   RETURN_VALUE
Each line is one VM instruction the interpreter loop executes. dis is great for
answering "is this comprehension really faster?" or seeing how the compiler
desugars a construct. Rule of thumb: when you want to know what the interpreter
literally runs, disassemble it with dis.
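To compare two approaches programmatically rather than by reading dis.dis output, the instructions can be collected with dis.get_instructions. The sketch below is illustrative only: loop_squares, comp_squares, and opnames are hypothetical names, and instruction counts are a rough signal rather than a timing measurement. The counts also differ between CPython versions (for example, older versions compile the comprehension body into a separate nested code object).

```python
import dis

def loop_squares(n):
    result = []
    for i in range(n):
        result.append(i * i)
    return result

def comp_squares(n):
    return [i * i for i in range(n)]

def opnames(func):
    """Return the opcode names of func's top-level bytecode, in order."""
    return [instr.opname for instr in dis.get_instructions(func)]

# Print how many top-level instructions each version compiles to.
for func in (loop_squares, comp_squares):
    print(func.__name__, len(opnames(func)), "instructions")
```

For an actual speed comparison, pair this with a timing tool such as timeit; the disassembly explains a difference, it does not measure it.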
Every function call in CPython gets a frame: a record holding that call's local variables, the current instruction pointer, and a reference to the caller. (Since 3.11, CPython keeps a lighter internal structure and builds the full frame object only when something asks for it, but the model is the same.) Frames are pushed onto the call stack as functions call each other and popped as they return. This is the structure a traceback reports when printing an error.
import inspect
def inner():
    frame = inspect.currentframe()
    print(frame.f_code.co_name)         # 'inner'
    print(frame.f_back.f_code.co_name)  # 'outer' (the caller's frame)
def outer():
    inner()
outer()
Frames are why each call has isolated locals and why exceptions can report the
full chain of calls. Deep or infinite recursion piles up frames until CPython
raises RecursionError at the recursion limit (sys.getrecursionlimit(), 1000 by
default). Rule of thumb: one frame per active call, all chained together as the
call stack.
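The chain of frames can be walked by hand through f_back, and the recursion limit can be observed directly. This is a small sketch with hypothetical helper names (call_chain, inner, outer, recurse); it assumes CPython, since inspect.currentframe() may return None on other implementations.

```python
import inspect
import sys

def call_chain():
    """Return the function names on the call stack, innermost first."""
    frame = inspect.currentframe()  # may be None on non-CPython implementations
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back  # step to the caller's frame
    return names

def inner():
    return call_chain()

def outer():
    return inner()

print(outer()[:3])  # ['call_chain', 'inner', 'outer']

def recurse(depth=0):
    # Unbounded recursion: every call adds another frame to the stack.
    return recurse(depth + 1)

try:
    recurse()
except RecursionError:
    print("RecursionError; recursion limit is", sys.getrecursionlimit())
```

The names after the first three depend on how the script is run (for example, '<module>' when executed as a plain script).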
Python is both compiled and interpreted. Source is first compiled to bytecode (a real compilation step), and that bytecode is then interpreted by the CPython virtual machine at runtime. So the common claim that "Python is interpreted" is only half the story: there is a compile phase, just to bytecode rather than to native machine code.
import py_compile
py_compile.compile("mymod.py") # explicitly produces the .pyc bytecode
# At runtime, the VM's interpreter loop executes that bytecode.
The distinction that matters: Python compiles to portable bytecode, not to CPU-specific machine code (the way C does ahead-of-time). PyPy adds a JIT that does compile hot bytecode to machine code at runtime. Rule of thumb: Python is compiled to bytecode and then interpreted.
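The two phases can be separated in a few lines with the built-in compile() and exec(). This is a minimal sketch; the source string and the "<example>" filename are placeholders chosen for illustration.

```python
source = "result = 6 * 7"

# Compile phase: source text -> code object holding bytecode.
code_obj = compile(source, "<example>", "exec")
print(type(code_obj).__name__)              # code
print(isinstance(code_obj.co_code, bytes))  # True

# Interpret phase: the VM executes that bytecode in the given namespace.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])  # 42
```

A normal import does the same two steps, with the .pyc cache sitting between them.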
The GIL (Global Interpreter Lock) is a single mutex that lets only one thread execute Python bytecode at a time in a standard CPython process. The interpreter loop holds it while running instructions and releases it periodically and around blocking I/O, so threads take turns rather than running Python code in parallel.
# Two threads doing CPU-bound Python work do NOT run in parallel:
# the GIL serializes their bytecode execution on one core.
# threads -> good for I/O-bound (GIL released during I/O)
# processes -> needed for CPU-bound parallelism (each has its own GIL)
It exists largely to make CPython's memory management (reference counting) simpler and C extensions safer. The consequence is that threads don't give CPU parallelism — that's what multiprocessing is for (see the Concurrency & Parallelism topic). Rule of thumb: the GIL means one-bytecode-at-a-time per process, so use processes for CPU-bound parallelism.
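The periodic release mentioned above is governed by the switch interval, which sys.getswitchinterval() reports. The sketch below prints it and then runs a CPU-bound loop in two threads; count_down and its loop size are hypothetical choices for illustration. Both threads finish with correct results, but on a build with the GIL they take turns executing bytecode rather than running side by side.

```python
import sys
import threading

# How long (in seconds) a thread may run before CPython asks it
# to give up the GIL so another thread can take a turn.
print(sys.getswitchinterval())

def count_down(n, results, slot):
    """CPU-bound pure-Python loop (hypothetical workload)."""
    while n > 0:
        n -= 1
    results[slot] = "done"

results = [None, None]
threads = [
    threading.Thread(target=count_down, args=(200_000, results, i))
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # ['done', 'done']
```

Timing this against a single-threaded run of the same total work typically shows little or no speedup, which is the practical reason to reach for processes when the work is CPU-bound.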