Files, pathlib & os Interview Questions & Answers

5 questions Updated 2026-06-18

Python interview questions on open() and the with statement, lazy file iteration, text vs binary mode and encoding, pathlib.Path vs os.path, and globbing.

with open(...) uses the file as a context manager, which guarantees the file is closed when the block exits, even if an exception is raised. Without it you must remember to call .close() yourself, and an exception raised before that call leaves the handle open until the garbage collector reclaims it.

with open("data.txt") as f:      # f.close() is automatic
    contents = f.read()
# file is closed here, even on error

f = open("data.txt")             # manual style: verbose, easy to get wrong
try:
    contents = f.read()
finally:
    f.close()

Leaked handles can exhaust OS file descriptors and leave buffered writes unflushed. Always prefer with for any resource that needs cleanup (files, locks, sockets).
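The cleanup guarantee can be checked directly. The sketch below is illustrative only: the helper name read_then_fail and the temporary sample file are assumptions made for the demo, not part of the original answer. It raises inside the with block and then inspects the file object's closed attribute.

```python
import tempfile
from pathlib import Path


def read_then_fail(path):
    """Open a file, raise inside the with block, and return the file object."""
    handle = None
    try:
        with open(path, encoding="utf-8") as f:
            handle = f
            f.read()
            raise ValueError("simulated failure mid-block")
    except ValueError:
        pass  # the with statement has already closed the file at this point
    return handle


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "data.txt"          # hypothetical sample file
    sample.write_text("hello\n", encoding="utf-8")
    f = read_then_fail(sample)
    closed_after_error = f.closed            # True: closed despite the exception

print(closed_after_error)
```

The same pattern applies to any object that implements the context manager protocol, such as threading.Lock or a socket.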

A file object is its own iterator: it yields one line at a time, so only the current line (plus a small read buffer) is held in memory. read() loads the entire file into a single string, and readlines() loads all lines into a list; both can exhaust memory on large files.

with open("huge.log") as f:
    for line in f:               # lazy — one line in memory at a time
        process(line)

with open("huge.log") as f:
    data = f.read()              # whole file as one string
    f.seek(0)                    # rewind; read() left the cursor at end of file
    lines = f.readlines()        # whole file as a list of strings

Iterating is the idiomatic, memory-safe way to process big files line by line. Reserve read()/readlines() for small files where you genuinely need the whole content at once.
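As a small sketch of lazy iteration in practice, the function below counts matching lines by feeding the file iterator into a generator expression, so no list of lines is ever built. The function name count_matching_lines and the sample log contents are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def count_matching_lines(path, needle):
    """Count lines containing `needle` without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        # The generator pulls one line at a time from the file iterator.
        return sum(1 for line in f if needle in line)


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "huge.log"             # hypothetical (tiny) log file
    log.write_text(
        "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n",
        encoding="utf-8",
    )
    error_count = count_matching_lines(log, "ERROR")

print(error_count)
```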

Text mode ("r", the default) decodes bytes into str using an encoding and normalizes newlines. Binary mode ("rb"/"wb") reads and writes raw bytes with no decoding — required for images, archives, or any non-text data. In text mode you should pass encoding= explicitly, because the default is platform-dependent.

with open("notes.txt", "r", encoding="utf-8") as f:
    text: str = f.read()         # decoded to str

with open("photo.jpg", "rb") as f:
    raw: bytes = f.read()        # raw bytes, no decoding

Relying on the default encoding is a classic cross-platform bug: the default is typically UTF-8 on Linux and macOS but often a legacy code page on Windows. Always specify encoding="utf-8" for text, and use binary mode for everything that isn't text.
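To make the encoding pitfall concrete, here is a minimal sketch that writes a non-ASCII string as UTF-8, then reads it back three ways: as raw bytes, decoded correctly, and decoded with a mismatched code page. The file name and the choice of cp1252 as the "wrong" encoding are assumptions for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    notes = Path(tmp) / "notes.txt"          # hypothetical sample file

    with open(notes, "w", encoding="utf-8") as f:
        f.write("café")

    with open(notes, "rb") as f:
        raw = f.read()                       # bytes: "é" takes two bytes in UTF-8

    with open(notes, "r", encoding="utf-8") as f:
        text = f.read()                      # str, decoded with the right encoding

    with open(notes, "r", encoding="cp1252") as f:
        garbled = f.read()                   # str, decoded with the wrong encoding

print(raw, text, garbled)
```

The mismatched read does not raise here; it silently produces mojibake, which is why this class of bug often goes unnoticed until the data reaches another machine.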

pathlib.Path is the modern, object-oriented way to handle filesystem paths. A Path is an object with methods and operators, whereas the older os.path module is a collection of string-based functions. Path's / operator joins segments cleanly and works across operating systems.

from pathlib import Path

p = Path("data") / "logs" / "app.log"   # join with /
p.exists()
p.suffix                                 # ".log"
p.stem                                   # "app"
p.read_text(encoding="utf-8")            # one-liner read

import os.path
old = os.path.join("data", "logs", "app.log")
os.path.exists(old)

pathlib is generally preferred for new code: it's more readable and bundles common operations (read_text, mkdir, glob) as methods. Use os.path mainly when working with existing string-based APIs.
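For comparison, the sketch below pairs a few Path attributes with their os.path counterparts on the same relative path. Nothing touches the filesystem; the variable names are illustrative only.

```python
import os.path
from pathlib import Path

p = Path("data") / "logs" / "app.log"
joined = os.path.join("data", "logs", "app.log")

# Each pathlib attribute has a string-based os.path equivalent.
same_path = str(p) == joined
same_name = p.name == os.path.basename(joined)
same_parent = str(p.parent) == os.path.dirname(joined)
same_suffix = p.suffix == os.path.splitext(joined)[1]

# with_suffix returns a new Path; Path objects are immutable.
renamed = p.with_suffix(".txt").name

print(same_path, same_name, same_parent, same_suffix, renamed)
```

When an older API needs a string, str(p) or os.fspath(p) converts a Path back.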

Use Path.glob(pattern) for matches in one directory and Path.rglob(pattern) (or glob("**/...")) to recurse into subdirectories. Both return a lazy generator of Path objects. * matches any run of characters within a single path segment (it does not cross directory separators), and ** matches any number of directory levels, recursively.

from pathlib import Path

root = Path("project")
for py in root.glob("*.py"):         # top level only
    print(py.name)

for py in root.rglob("*.py"):        # all subdirectories too
    print(py)

root.mkdir(parents=True, exist_ok=True)  # create dirs safely
[p.name for p in root.iterdir()]         # list directory contents

Other handy Path methods: iterdir() (list a directory), is_file()/is_dir(), mkdir(), unlink() (delete), and with_suffix(). Globbing returns generators, so wrap in list(...) if you need a concrete collection.
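The methods above can be exercised end to end in a throwaway directory. This is a sketch under assumed names: the project layout (main.py, README.md, pkg/util.py) is invented for the demo, and results are sorted because glob order is not guaranteed.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "project"             # hypothetical project layout
    (root / "pkg").mkdir(parents=True, exist_ok=True)
    for rel in ("main.py", "README.md", "pkg/util.py"):
        (root / rel).write_text("", encoding="utf-8")

    # glob: top level only; rglob: every subdirectory as well.
    top_level = sorted(p.name for p in root.glob("*.py"))
    recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))

    # iterdir + is_dir: list only the subdirectories.
    dirs = sorted(p.name for p in root.iterdir() if p.is_dir())

    # with_suffix builds a new path; it does not rename anything on disk.
    renamed = (root / "main.py").with_suffix(".txt").name

    # unlink deletes the file.
    (root / "README.md").unlink()
    readme_exists = (root / "README.md").exists()

print(top_level, recursive, dirs, renamed, readme_exists)
```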
