Question 1

Why open files with the `with` statement?

Accepted Answer

`with open(...)` uses the file as a **context manager**, which **guarantees the file
is closed** when the block exits — even if an exception is raised. Without it you
must remember to call `.close()` manually, and a crash mid-block leaks the handle.

```python
with open("data.txt") as f:      # f.close() is automatic
    contents = f.read()
# file is closed here, even on error

f = open("data.txt")             # manual equivalent: verbose, easy to forget
try:
    contents = f.read()
finally:
    f.close()
```

Leaked handles can exhaust OS file descriptors and leave buffered writes unflushed.
Always prefer `with` for any resource that needs cleanup (files, locks, sockets).
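
To see the guarantee in action, here is a minimal sketch that raises inside the block and then checks the handle. The file name `demo.txt` and the helper `read_then_fail` are hypothetical, chosen only for illustration.

```python
import tempfile
from pathlib import Path


def read_then_fail(path):
    """Open a file, raise inside the with-block, and return the file object."""
    handle = None
    try:
        with open(path, encoding="utf-8") as f:
            handle = f
            f.read()
            raise ValueError("simulated crash mid-block")
    except ValueError:
        pass  # the exception left the with-block, which triggered cleanup
    return handle


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.txt"      # hypothetical file name
    demo.write_text("hello\n", encoding="utf-8")
    f = read_then_fail(demo)
    print(f.closed)                    # True: closed despite the exception
```

The same check with a bare `open()` and no `try`/`finally` would leave `f.closed` as `False` until the object is garbage collected.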

Question 2

What is the difference between iterating a file and read()/readlines()?

Accepted Answer

A file object is its own **iterator**, yielding **one line at a time**, so only the
current line (plus a small read buffer) is held in memory. `read()` loads the
**entire file** into a single string, and `readlines()` loads **all lines into a
list**. Both can exhaust memory on large files.

```python
with open("huge.log") as f:
    for line in f:               # lazy — one line in memory at a time
        process(line)

with open("huge.log") as f:
    data = f.read()              # whole file as one string

with open("huge.log") as f:      # reopen: read() left the position at EOF
    lines = f.readlines()        # whole file as a list of strings
```

Iterating is the idiomatic, memory-safe way to process big files line by line.
Reserve `read()`/`readlines()` for small files where you genuinely need the whole
content at once.
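
As a rough illustration, the sketch below counts matching lines lazily and also shows a common pitfall: calling `readlines()` after `read()` on the same handle returns an empty list, because the position is already at end of file. The helper `count_matching` and the file `sample.log` are hypothetical.

```python
import tempfile
from pathlib import Path


def count_matching(path, needle):
    """Count lines containing `needle`, reading one line at a time."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if needle in line)


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "sample.log"     # hypothetical stand-in for a large log
    log.write_text(
        "INFO start\nERROR disk full\nINFO done\nERROR timeout\n",
        encoding="utf-8",
    )
    error_count = count_matching(log, "ERROR")

    with open(log, encoding="utf-8") as f:
        first = f.read()               # consumes the whole file
        rest = f.readlines()           # position is at EOF, so nothing is left

print(error_count)   # 2
print(rest)          # []
```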

Question 3

What is the difference between text and binary mode, and why specify encoding?

Accepted Answer

**Text mode** (`"r"`, the default) decodes bytes into `str` using an **encoding** and
normalizes newlines. **Binary mode** (`"rb"`/`"wb"`) reads and writes raw `bytes`
with no decoding, which is required for images, archives, or any non-text data. In
text mode you should pass **`encoding=`** explicitly, because the default depends on
the platform and locale.

```python
with open("notes.txt", "r", encoding="utf-8") as f:
    text: str = f.read()         # decoded to str

with open("photo.jpg", "rb") as f:
    raw: bytes = f.read()        # raw bytes, no decoding
```

Relying on the default encoding is a classic cross-platform bug (typically UTF-8 on
Linux/macOS, often a legacy code page such as cp1252 on Windows). Always specify
`encoding="utf-8"` for text, and use binary mode for everything that isn't text.
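
The sketch below shows why this matters: the same bytes on disk decode to different strings depending on the codec. The file name `note.txt` is hypothetical, and `cp1252` stands in for whatever legacy default a machine might have.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "note.txt"              # hypothetical file name
    with open(note, "w", encoding="utf-8") as f:
        f.write("caf\u00e9")                   # "café": 4 characters

    with open(note, "rb") as f:
        raw = f.read()                         # the bytes actually on disk

    with open(note, "r", encoding="utf-8") as f:
        good = f.read()                        # decoded with the right codec

    with open(note, "r", encoding="cp1252") as f:
        bad = f.read()                         # same bytes, wrong codec

print(raw)                   # b'caf\xc3\xa9'
print(len(good), len(bad))   # 4 5
```

The wrong codec raises no error here. It silently turns the two-byte `é` into two separate characters, which is why this kind of bug often goes unnoticed until the data reaches another machine.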

Question 4

What is pathlib.Path and how does it compare to os.path?

Accepted Answer

**`pathlib.Path`** is the modern, object-oriented way to handle filesystem paths. A
`Path` is an object with **methods and operators**, whereas the older **`os.path`**
module is a collection of **string-based functions**. Path's `/` operator joins
segments cleanly and works across operating systems.

```python
from pathlib import Path

p = Path("data") / "logs" / "app.log"   # join with /
p.exists()
p.suffix                                 # ".log"
p.stem                                   # "app"
p.read_text(encoding="utf-8")            # one-liner read

import os.path
old = os.path.join("data", "logs", "app.log")
os.path.exists(old)
```

`pathlib` is generally preferred for new code: it's more readable and bundles
common operations (`read_text`, `mkdir`, `glob`) as methods. Use `os.path` mainly
when working with existing string-based APIs.
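
For a side-by-side feel, this sketch builds the same path both ways and compares component access. It assumes nothing about the filesystem, since none of these calls touch the disk. `os.fspath()` is one way to hand a `Path` to an API that expects a plain string.

```python
import os
import os.path
from pathlib import Path

joined_str = os.path.join("data", "logs", "app.log")   # plain str
joined_path = Path("data") / "logs" / "app.log"        # Path object

same = Path(joined_str) == joined_path   # both spell the same path
as_str = os.fspath(joined_path)          # back to str for string-based APIs

# Component access: function calls on strings vs attributes on the object
old_name = os.path.basename(joined_str)
old_ext = os.path.splitext(joined_str)[1]
new_name = joined_path.name
new_ext = joined_path.suffix
renamed = joined_path.with_suffix(".txt").name

print(same, as_str == joined_str)                          # True True
print(old_name == new_name, old_ext == new_ext, renamed)   # True True app.txt
```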

Question 5

How do you find files with globbing using pathlib?

Accepted Answer

Use **`Path.glob(pattern)`** for matches in one directory and **`Path.rglob(pattern)`**
(or `glob("**/...")`) to recurse into subdirectories. Both return a **lazy generator**
of `Path` objects, where `*` matches any characters within a single path component
and `**` matches directories recursively.

```python
from pathlib import Path

root = Path("project")
for py in root.glob("*.py"):         # top level only
    print(py.name)

for py in root.rglob("*.py"):        # all subdirectories too
    print(py)

root.mkdir(parents=True, exist_ok=True)  # create dirs safely
[p.name for p in root.iterdir()]         # list directory contents
```

Other handy `Path` methods: `iterdir()` (list a directory), `is_file()`/`is_dir()`,
`mkdir()`, `unlink()` (delete), and `with_suffix()`. Globbing returns generators, so
wrap in `list(...)` if you need a concrete collection.
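
A self-contained sketch of the difference, run against a throwaway directory tree. The layout (`main.py`, `pkg/util.py`, and so on) is hypothetical. Results are sorted because glob order is not guaranteed.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "project"               # hypothetical layout
    (root / "pkg").mkdir(parents=True, exist_ok=True)
    for rel in ("main.py", "README.md", "pkg/util.py", "pkg/data.json"):
        (root / rel).write_text("", encoding="utf-8")

    # glob: only the directory itself; rglob: the whole tree
    top_level = sorted(p.name for p in root.glob("*.py"))
    recursive = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.py")
    )
    subdirs = sorted(p.name for p in root.iterdir() if p.is_dir())

print(top_level)   # ['main.py']
print(recursive)   # ['main.py', 'pkg/util.py']
print(subdirs)     # ['pkg']
```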
