Threading & the GIL Interview Questions & Answers

5 questions Updated 2026-06-18

Python interview questions on threading and the GIL — what the GIL is, threads vs multiprocessing, CPU-bound vs I/O-bound work, race conditions and locks, and ThreadPoolExecutor vs ProcessPoolExecutor.

Read the in-depth guide: Python Threading and the GIL Explained - Threads vs Multiprocessing

The Global Interpreter Lock is a mutex in CPython that allows only one thread to execute Python bytecode at a time. Even on a multi-core machine, a multithreaded pure-Python program runs its bytecode on one core at a time — threads take turns holding the lock.

It exists to keep CPython's memory management (especially reference counting) simple and fast: without it, every refcount update would need atomic operations or finer-grained locking. The interpreter asks the running thread to release the GIL at a regular switch interval (5 ms by default) and also releases it around blocking I/O, so other threads can run.

import threading

# Both threads exist, but the GIL serializes their bytecode execution.
def work():
    total = 0
    for _ in range(10_000_000):   # CPU-bound: the thread wants the GIL the whole time
        total += 1

t1 = threading.Thread(target=work)
t2 = threading.Thread(target=work)
t1.start()
t2.start()
t1.join()
t2.join()   # roughly no speedup vs calling work() twice in one thread

Why it matters: the GIL is a CPython implementation detail, not part of the language spec. Jython and IronPython have no GIL, and CPython 3.13+ offers an optional free-threaded build that runs without it. It shapes when threads do and don't help.
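
To see these details on a particular interpreter, you can query the sys module. This is a minimal sketch that assumes CPython: gil_enabled is a hypothetical helper name, and because sys._is_gil_enabled() only exists on 3.13 and later, the helper assumes the GIL is on when that function is missing.

```python
import sys

def gil_enabled():
    """Report whether this CPython interpreter is running with the GIL."""
    check = getattr(sys, "_is_gil_enabled", None)   # added in CPython 3.13
    if check is None:
        return True   # assumption: older CPython versions always have the GIL
    return check()

# How often (in seconds) the interpreter asks the running thread to give up the GIL.
switch_interval = sys.getswitchinterval()

print(f"GIL enabled: {gil_enabled()}")
print(f"switch interval: {switch_interval} s")
```

On a standard (non-free-threaded) build the first line should report True.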

Threads share one process and one memory space, so they're cheap to create and share data directly — but in CPython they're serialized by the GIL. Processes each have their own interpreter and memory, so they run on separate cores in true parallel, bypassing the GIL — at the cost of higher startup overhead and needing to serialize (pickle) data to communicate.

from threading import Thread
from multiprocessing import Process

Thread(target=fn)    # shared memory, GIL-bound, light
Process(target=fn)   # separate memory, real parallelism, heavier

Threads communicate through shared objects (guarded by locks); processes communicate through Queue, Pipe, or shared-memory primitives because they don't share state.

Rule of thumb: threads for I/O-bound concurrency, processes for CPU-bound parallelism.

For CPU-bound work, threads are constantly executing bytecode, so they're always contending for the GIL — only one runs at a time, and you get no parallel speedup (often a small slowdown from lock contention and context switches).
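
One way to check this claim on your own machine is to time the same CPU-bound loop run back to back and then in two threads. The sketch below uses hypothetical names (count, run_sequential, run_threaded) and an arbitrary loop size; on a standard GIL build the two timings should come out close, but the exact numbers depend on the machine and interpreter.

```python
import threading
import time

N = 2_000_000   # iterations per task; an arbitrary size for illustration

def count(n, out, slot):
    """Pure-Python CPU-bound loop; stores its result in out[slot]."""
    total = 0
    for _ in range(n):
        total += 1
    out[slot] = total

def run_sequential():
    out = [0, 0]
    start = time.perf_counter()
    count(N, out, 0)
    count(N, out, 1)
    return out, time.perf_counter() - start

def run_threaded():
    out = [0, 0]
    threads = [threading.Thread(target=count, args=(N, out, i)) for i in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out, time.perf_counter() - start

seq_out, seq_time = run_sequential()
thr_out, thr_time = run_threaded()
print(f"sequential: {seq_time:.2f}s  threaded: {thr_time:.2f}s")
```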

For I/O-bound work, a thread that's waiting on the network, disk, or a database is blocked outside the interpreter — and CPython releases the GIL during blocking I/O. So other threads run while one waits, giving real concurrency.

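import threading
import requests   # third-party: pip install requests

urls = ["https://example.com"] * 10   # placeholder URLs for illustration

def fetch(url):
    requests.get(url, timeout=10)     # blocks on the network -> GIL released here

# 10 threads overlap their waiting time -> much faster than sequential
threads = [threading.Thread(target=fetch, args=(u,)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()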
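
The same overlap can be observed without a network by standing in for the blocking call with time.sleep, which also releases the GIL. This is a sketch under that assumption; fake_fetch, DELAY, and TASKS are hypothetical names and values chosen for illustration.

```python
import threading
import time

DELAY = 0.2   # seconds each simulated request spends waiting (arbitrary)
TASKS = 5     # number of simulated requests (arbitrary)

def fake_fetch():
    time.sleep(DELAY)   # blocking wait; the GIL is released while sleeping

def run_threaded():
    threads = [threading.Thread(target=fake_fetch) for _ in range(TASKS)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

elapsed = run_threaded()
print(f"threaded: {elapsed:.2f}s vs about {DELAY * TASKS:.2f}s if run one after another")
```

Because the waits overlap, the threaded run should take roughly one DELAY rather than TASKS times DELAY.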

Rule of thumb: if your bottleneck is waiting, use threads (or asyncio); if it's computing, use processes to get past the GIL.

A race condition occurs when two or more threads access shared mutable state, at least one of them writes, and the result depends on timing. Even x += 1 is not atomic: it is a read, an add, and a write, and a thread can be switched out mid-sequence, so updates get lost.
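
The separate steps are visible in the bytecode. The sketch below disassembles a hypothetical bump function with the standard dis module; exact opcode names vary between CPython versions, so it only relies on the load of x appearing before the store.

```python
import dis

x = 0

def bump():
    global x
    x += 1   # compiles to several instructions, not one

# Collect the opcode names the interpreter executes for bump().
opnames = [ins.opname for ins in dis.get_instructions(bump)]
print(opnames)

load_at = opnames.index("LOAD_GLOBAL")     # read x
store_at = opnames.index("STORE_GLOBAL")   # write x back
print(f"load at step {load_at}, store at step {store_at}")
```

A thread switch between the load and the store is what lets another thread's update be overwritten.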

A lock (threading.Lock) creates a critical section: only the thread holding it can proceed, the rest wait, so the read-modify-write happens atomically. Using with lock: guarantees the lock is always released, even on exceptions.

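import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(100_000):
        with lock:           # only one thread in here at a time
            counter += 1     # now safe from lost updates

threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 400000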

Beware of holding several locks at once: if two threads acquire the same locks in different orders, they can deadlock. Rule of thumb: guard every access to shared mutable state, keep critical sections small, and prefer the thread-safe queue.Queue for handoff.
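
As an illustration of the queue.Queue handoff, here is a minimal producer/consumer sketch. The names producer, consumer, and SENTINEL are hypothetical; the queue does its own locking, so neither thread needs an explicit Lock.

```python
import queue
import threading

SENTINEL = None   # hypothetical marker telling the consumer to stop

def producer(q, items):
    for item in items:
        q.put(item)        # thread-safe: the queue locks internally
    q.put(SENTINEL)

def consumer(q, results):
    while True:
        item = q.get()     # blocks until an item is available
        if item is SENTINEL:
            break
        results.append(item * item)   # only this thread touches results

q = queue.Queue()
results = []
threads = [
    threading.Thread(target=producer, args=(q, range(5))),
    threading.Thread(target=consumer, args=(q, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)   # [0, 1, 4, 9, 16]
```

With one producer and one consumer the FIFO queue preserves order, and the only shared object is the queue itself.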

Both come from concurrent.futures and share the same Executor API: submit returns a Future, and map returns an iterator over the results, so you can swap one executor for the other by changing a single line. The difference is the worker type: ThreadPoolExecutor runs tasks in threads (shared memory, GIL-bound) and ProcessPoolExecutor runs them in separate processes (true parallelism, pickled args).
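
A short sketch of that shared API, using ThreadPoolExecutor and a hypothetical square function: submit hands back a Future whose result() you call, map yields results directly in input order, and as_completed yields futures as they finish.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(n):
    return n * n

with ThreadPoolExecutor(max_workers=4) as ex:
    # submit returns a Future; result() blocks until the task is done.
    future = ex.submit(square, 7)
    single = future.result()

    # map returns an iterator over results, in the order of the inputs.
    mapped = list(ex.map(square, [1, 2, 3]))

    # as_completed yields futures in completion order, so sort for a stable view.
    futures = [ex.submit(square, n) for n in [1, 2, 3]]
    completed = sorted(f.result() for f in as_completed(futures))

print(single, mapped, completed)   # 49 [1, 4, 9] [1, 4, 9]
```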

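from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

# download, urls, crunch_numbers, and datasets are placeholders for your own code.

# I/O-bound: many network/file waits -> threads
with ThreadPoolExecutor(max_workers=20) as ex:
    results = list(ex.map(download, urls))

# CPU-bound: heavy computation -> processes (defaults to one worker per CPU).
# Where workers are spawned (e.g. Windows, macOS), run this under
# if __name__ == "__main__": and define crunch_numbers at module level.
with ProcessPoolExecutor() as ex:
    results = list(ex.map(crunch_numbers, datasets))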

Use ThreadPoolExecutor for I/O-bound tasks (downloads, DB calls) where waiting dominates, and ProcessPoolExecutor for CPU-bound tasks to use all cores. Remember that with ProcessPoolExecutor the task function, its arguments, and its return values must all be picklable.
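
A quick way to check ahead of time whether a value can cross a process boundary is to try pickling it. The helper name is_picklable below is hypothetical; the sketch only uses the standard pickle module.

```python
import pickle
import threading

def is_picklable(obj):
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False

print(is_picklable([1, 2, 3]))          # True: plain data is fine
print(is_picklable(lambda x: x))        # False: lambdas can't be pickled
print(is_picklable(threading.Lock()))   # False: locks can't be pickled
```

This is why tasks passed to ProcessPoolExecutor are usually plain module-level functions taking plain data.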

Rule of thumb: pick the executor by the bottleneck (waiting vs computing), and let concurrent.futures handle the pool lifecycle and result collection.
