[{"data":1,"prerenderedAt":69},["ShallowReactive",2],{"qa-\u002Fpython\u002Fconcurrency\u002Fmultiprocessing":3},{"page":4,"siblings":55,"blog":47},{"id":5,"title":6,"body":7,"description":11,"difficulty":14,"extension":15,"framework":16,"frameworkSlug":17,"meta":18,"navigation":19,"order":12,"path":20,"questions":21,"related":47,"seo":48,"seoDescription":49,"stem":50,"subtopic":6,"topic":51,"topicSlug":52,"updated":53,"__hash__":54},"qa\u002Fpython\u002Fconcurrency\u002Fmultiprocessing.md","Multiprocessing",{"type":8,"value":9,"toc":10},"minimark",[],{"title":11,"searchDepth":12,"depth":12,"links":13},"",2,[],"hard","md","Python","python",{},true,"\u002Fpython\u002Fconcurrency\u002Fmultiprocessing",[22,26,31,35,39,43],{"id":23,"difficulty":14,"q":24,"a":25},"multiprocessing-vs-threading","How does multiprocessing sidestep the GIL, and how is it different from threading?","**Threading** runs multiple threads inside **one process** that share one\ninterpreter — and therefore one **GIL** (Global Interpreter Lock), which lets\nonly one thread execute Python bytecode at a time. **Multiprocessing** spawns\n**separate OS processes**, each with its **own** interpreter and **own GIL**, so\nthey can run Python code in **true parallel** on multiple CPU cores.\n\n```python\nfrom multiprocessing import Process\nimport os\n\ndef work():\n    print(f\"running in pid {os.getpid()}\")  # a distinct process each time\n\nif __name__ == \"__main__\":            # required guard on Windows\u002Fspawn\n    ps = [Process(target=work) for _ in range(4)]\n    for p in ps: p.start()\n    for p in ps: p.join()\n```\n\nThe tradeoff: processes don't share memory, so passing data costs\n**serialization (pickling) and IPC overhead**, and each process has higher\nstartup cost than a thread. 
Rule of thumb: reach for multiprocessing when you\nneed real CPU parallelism, not just concurrency.\n",{"id":27,"difficulty":28,"q":29,"a":30},"process-vs-pool","medium","What is the difference between Process and Pool?","A **`Process`** represents a **single** child process you start and join\nmanually — good when you have a fixed, small number of distinct tasks. A\n**`Pool`** manages a **reusable group of worker processes** and hands out work\nto them, which is far more convenient for **many homogeneous tasks** over a\ndataset.\n\n```python\nfrom multiprocessing import Pool\n\ndef square(n):\n    return n * n\n\nif __name__ == \"__main__\":\n    with Pool(processes=4) as pool:\n        results = pool.map(square, range(10))   # distributed across 4 workers\n    print(results)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]\n```\n\n`Pool` reuses workers (amortizing startup cost) and offers `map`, `imap`,\n`apply_async`, etc. Use `Process` for a few long-lived distinct jobs; use `Pool`\nwhen you're fanning the same function over many inputs.\n",{"id":32,"difficulty":14,"q":33,"a":34},"ipc-queue-pipe","How do processes communicate (Queue vs Pipe)?","Because processes don't share memory, they communicate through **IPC**\nprimitives. A **`Queue`** is a **multi-producer\u002Fmulti-consumer**, thread- and\nprocess-safe FIFO — the general-purpose choice. A **`Pipe`** is a faster but\nlower-level **two-endpoint** connection, best for communication between exactly\n**two** processes.\n\n```python\nfrom multiprocessing import Process, Queue\n\ndef producer(q):\n    q.put(\"result\")          # values are pickled across the boundary\n\nif __name__ == \"__main__\":\n    q = Queue()\n    p = Process(target=producer, args=(q,))\n    p.start()\n    print(q.get())           # \"result\"\n    p.join()\n```\n\nBoth serialize objects under the hood, so only **picklable** data flows through\nthem. 
Use `Queue` for fan-in\u002Ffan-out among many workers; use `Pipe` for a tight\none-to-one channel where you want the lower overhead.\n",{"id":36,"difficulty":14,"q":37,"a":38},"pickling-overhead","What is the pickling overhead, and what can't be pickled?","Every argument and return value crossing a process boundary must be\n**serialized with `pickle`**, sent, then **deserialized** on the other side.\nFor large objects this **copying cost can dwarf the parallelism gains**, and\nsome objects simply **can't be pickled**.\n\n```python\nimport pickle\n\npickle.dumps(lambda x: x)   # PicklingError — lambdas aren't picklable\n# Also unpicklable: open file handles, sockets, locks, db connections,\n# local\u002Fnested functions, and generators.\n```\n\nPicklable things include module-level functions and classes, and basic\ncontainers of picklable values. The practical implications: pass **small,\npicklable** payloads, define worker functions at **module top level**, and\navoid shipping huge data structures between processes. Minimizing what crosses\nthe boundary is the key to multiprocessing performance.\n",{"id":40,"difficulty":14,"q":41,"a":42},"shared-state","How do you share state between processes?","Since each process has its own memory, you need explicit shared-state tools.\n**`Value`** and **`Array`** put simple data in **shared memory** (fast, but\nlimited types and you must guard with a lock). A **`Manager`** hosts richer\nshared objects (`dict`, `list`, etc.) 
via a **server process** — more flexible\nbut slower because access is proxied.\n\n```python\nfrom multiprocessing import Process, Value, Lock\n\ndef inc(counter, lock):\n    for _ in range(1000):\n        with lock:               # protect the shared value\n            counter.value += 1\n\nif __name__ == \"__main__\":\n    counter = Value(\"i\", 0)      # shared int in shared memory\n    lock = Lock()                # counter.get_lock() would also work\n    ps = [Process(target=inc, args=(counter, lock)) for _ in range(4)]\n    for p in ps: p.start()\n    for p in ps: p.join()\n    print(counter.value)         # 4000\n```\n\nPrefer **message passing** (Queue\u002FPipe) over shared state when you can — it's\neasier to reason about. Reach for `Value`\u002F`Array` for hot, simple counters and a\n`Manager` only when you genuinely need shared complex objects.\n",{"id":44,"difficulty":28,"q":45,"a":46},"when-multiprocessing","When should you use multiprocessing instead of threads or asyncio?","Use multiprocessing for **CPU-bound** work — number crunching, image\nprocessing, data transforms — where you need to **saturate multiple cores** with\nPython code. On standard (GIL-enabled) CPython, threads and asyncio can't do\nthat because the **GIL** serializes bytecode execution; only separate processes\nget separate GILs.\n\n```python\nfrom multiprocessing import Pool\n\ndef heavy(n):                      # pure-Python CPU work\n    return sum(i * i for i in range(n))\n\nif __name__ == \"__main__\":\n    with Pool() as pool:           # defaults to one worker per available CPU\n        print(pool.map(heavy, [10_000_000] * 8))   # runs in parallel\n```\n\nFor **I\u002FO-bound** work (network, disk, DB), threads or asyncio are usually\nbetter — they're cheaper and the GIL is released during I\u002FO anyway. 
Rule of\nthumb: CPU-bound -> multiprocessing; I\u002FO-bound -> threads\u002Fasyncio.\n",null,{"description":11},"Python interview questions on the multiprocessing module: how it sidesteps the GIL, Process vs Pool, IPC with Queue\u002FPipe, pickling limits, and shared state.","python\u002Fconcurrency\u002Fmultiprocessing","Concurrency & Parallelism","concurrency","2026-06-18","q6g-NDE__tlhDOdIPbX1WLFlzxD4custMmavX-rPQ3s",[56,60,61,65],{"subtopic":57,"path":58,"order":59},"Threading & the GIL","\u002Fpython\u002Fconcurrency\u002Fgil",1,{"subtopic":6,"path":20,"order":12},{"subtopic":62,"path":63,"order":64},"asyncio & async\u002Fawait","\u002Fpython\u002Fconcurrency\u002Fasyncio",3,{"subtopic":66,"path":67,"order":68},"concurrent.futures","\u002Fpython\u002Fconcurrency\u002Fconcurrent-futures",4,1781808681434]