Files, pathlib & os Interview Questions & Answers
5 questions Updated 2026-06-18
Python interview questions on open() and the with statement, lazy file iteration, text vs binary mode and encoding, pathlib.Path vs os.path, and globbing.
with open(...) uses the file as a context manager, which guarantees the file
is closed when the block exits — even if an exception is raised. Without it you
must remember to call .close() manually, and a crash mid-block leaks the handle.
with open("data.txt") as f:    # f.close() is automatic
    contents = f.read()
# file is closed here, even on error

f = open("data.txt")           # manual style: fragile
try:
    contents = f.read()
finally:
    f.close()
Leaked handles can exhaust OS file descriptors and leave buffered writes unflushed.
Always prefer with for any resource that needs cleanup (files, locks, sockets).
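The guarantee can be checked directly. The sketch below is illustrative only: it uses a scratch file in a temporary directory (a hypothetical stand-in for data.txt), raises an exception inside the block, and then inspects the file object's closed attribute.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file so the sketch runs anywhere.
tmp_dir = tempfile.TemporaryDirectory()
path = Path(tmp_dir.name) / "data.txt"
path.write_text("hello\n", encoding="utf-8")

handle = None
try:
    with open(path, encoding="utf-8") as f:
        handle = f  # keep a reference so we can inspect it afterwards
        raise ValueError("simulated failure mid-block")
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception.
closed_after_error = handle.closed
print(closed_after_error)

tmp_dir.cleanup()
```

The same pattern applies to any object that implements the context manager protocol, which is why locks and sockets are usually handled with with as well.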
A file object is its own iterator, yielding one line at a time and holding
only that line in memory. read() loads the entire file into a single string,
and readlines() loads all lines into a list — both can blow up memory on large
files.
with open("huge.log") as f:
    for line in f:             # lazy: one line in memory at a time
        process(line)

with open("huge.log") as f:
    data = f.read()            # whole file as one string
    f.seek(0)                  # rewind, otherwise readlines() returns []
    lines = f.readlines()      # whole file as a list of strings
Iterating is the idiomatic, memory-safe way to process big files line by line.
Reserve read()/readlines() for small files where you genuinely need the whole
content at once.
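As a minimal sketch of the lazy style, the helper below counts matching lines without ever holding the whole file in memory. The function name count_matching and the sample log contents are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def count_matching(path, needle):
    """Count lines containing needle, reading one line at a time."""
    with open(path, encoding="utf-8") as f:
        # The generator expression pulls lines from the file lazily.
        return sum(1 for line in f if needle in line)


# Hypothetical small log file standing in for a large one.
tmp_dir = tempfile.TemporaryDirectory()
log_path = Path(tmp_dir.name) / "huge.log"
log_path.write_text(
    "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n",
    encoding="utf-8",
)

error_count = count_matching(log_path, "ERROR")
print(error_count)

tmp_dir.cleanup()
```

Because sum() consumes the generator one item at a time, peak memory stays close to the size of the longest line rather than the size of the file.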
Text mode ("r", the default) decodes bytes into str using an encoding and
normalizes newlines. Binary mode ("rb"/"wb") reads and writes raw bytes
with no decoding — required for images, archives, or any non-text data. In text
mode you should pass encoding= explicitly, because the default is
platform-dependent.
with open("notes.txt", "r", encoding="utf-8") as f:
    text: str = f.read()       # decoded to str

with open("photo.jpg", "rb") as f:
    raw: bytes = f.read()      # raw bytes, no decoding
Relying on the default encoding is a classic cross-platform bug (UTF-8 on Linux/Mac,
often a legacy codepage on Windows). Always specify encoding="utf-8" for text, and
use binary mode for everything that isn't text.
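To make the difference concrete, the sketch below writes one non-ASCII word, then reads it back as raw bytes, as correctly decoded text, and as text decoded with the wrong codec. The file name and sample word are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path

tmp_dir = tempfile.TemporaryDirectory()
path = Path(tmp_dir.name) / "notes.txt"

# Write the word "café" (4 characters) as UTF-8.
with open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9")

with open(path, "rb") as f:
    raw = f.read()      # 5 bytes: the accented letter takes two bytes in UTF-8

with open(path, "r", encoding="utf-8") as f:
    text = f.read()     # 4 characters, decoded correctly

with open(path, "r", encoding="latin-1") as f:
    wrong = f.read()    # same bytes, wrong codec: silent mojibake, no exception

print(raw, text, wrong)

tmp_dir.cleanup()
```

The last read is the dangerous case: decoding with the wrong encoding often succeeds silently and only shows up later as garbled text.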
pathlib.Path is the modern, object-oriented way to handle filesystem paths. A
Path is an object with methods and operators, whereas the older os.path
module is a collection of string-based functions. Path's / operator joins
segments cleanly and works across operating systems.
from pathlib import Path
p = Path("data") / "logs" / "app.log" # join with /
p.exists()
p.suffix # ".log"
p.stem # "app"
p.read_text(encoding="utf-8") # one-liner read
import os.path
old = os.path.join("data", "logs", "app.log")
os.path.exists(old)
pathlib is generally preferred for new code: it's more readable and bundles
common operations (read_text, mkdir, glob) as methods. Use os.path mainly
when working with existing string-based APIs.
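When a string-based API is involved, the two styles interoperate. The sketch below, which reuses the path from the example above purely for illustration, shows that a Path converts to the same string os.path.join produces and that the common accessors agree.

```python
import os
import os.path
from pathlib import Path

p = Path("data") / "logs" / "app.log"
old = os.path.join("data", "logs", "app.log")

# os.fspath() turns a Path into the plain string that older APIs expect.
same_path = os.fspath(p) == old

# os.path needs two calls to get what Path exposes as attributes.
stem, ext = os.path.splitext(os.path.basename(old))
same_parts = (p.stem, p.suffix) == (stem, ext)

# Derived paths are new Path objects; the original is unchanged.
renamed = p.with_suffix(".txt").name

print(same_path, same_parts, renamed)
```

Most standard-library functions that take a path, including open(), accept a Path directly, so the explicit os.fspath() call is mainly needed for third-party code that insists on str.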
Use Path.glob(pattern) for matches in one directory and Path.rglob(pattern)
(or glob("**/...")) to recurse into subdirectories. Both return a lazy generator
of Path objects, where * matches any characters within a single path component
and ** matches any number of directory levels, including none.
from pathlib import Path

root = Path("project")
for py in root.glob("*.py"):       # top level only
    print(py.name)
for py in root.rglob("*.py"):      # all subdirectories too
    print(py)

root.mkdir(parents=True, exist_ok=True)    # create dirs safely
[p.name for p in root.iterdir()]           # list directory contents
Other handy Path methods: iterdir() (list a directory), is_file()/is_dir(),
mkdir(), unlink() (delete), and with_suffix(). Globbing returns generators, so
wrap in list(...) if you need a concrete collection.
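The difference between the two calls is easiest to see on a small tree. The sketch below builds a throwaway project directory (the file names are assumptions for illustration) and collects the results into sorted lists, since glob order is not guaranteed.

```python
import tempfile
from pathlib import Path

# Build a tiny hypothetical tree: project/main.py, project/README.md, project/pkg/util.py
tmp_dir = tempfile.TemporaryDirectory()
root = Path(tmp_dir.name) / "project"
(root / "pkg").mkdir(parents=True)
(root / "main.py").write_text("", encoding="utf-8")
(root / "README.md").write_text("", encoding="utf-8")
(root / "pkg" / "util.py").write_text("", encoding="utf-8")

# glob("*.py") only looks directly inside root.
top_level = sorted(p.name for p in root.glob("*.py"))

# rglob("*.py") also descends into subdirectories.
recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))

# glob("**/*.py") is the equivalent spelling of the recursive search.
same_as_rglob = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.py"))

print(top_level, recursive)

tmp_dir.cleanup()
```

Sorting forces the generators to be consumed, so the lists remain usable after the temporary directory has been removed.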