Question 1

What is the difference between re.match, re.search, and re.fullmatch?

Accepted Answer

They differ in **where** the pattern must match. `re.match` anchors at the
**start** of the string (but not the end). `re.search` scans for the pattern
**anywhere** in the string. `re.fullmatch` requires the pattern to match the
**entire** string. All return a `Match` object on success or `None` on failure.

```python
import re
re.match("ab", "abcd")      # match — starts with 'ab'
re.match("cd", "abcd")      # None  — not at the start
re.search("cd", "abcd")     # match — found anywhere
re.fullmatch("ab", "abcd")  # None  — must match the whole string
re.fullmatch("abcd", "abcd")  # match
```

A common bug is using `match` expecting whole-string validation — it only
anchors the start. Rule of thumb: use `search` to find, `fullmatch` to
validate, and `match` only when you specifically mean "begins with".
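
To make the validation pitfall concrete, here is a minimal sketch. The helper names `is_pin_buggy` and `is_pin` and the four-digit rule are hypothetical, chosen only for illustration.

```python
import re

def is_pin_buggy(text):
    # re.match only anchors the start, so trailing junk slips through.
    return re.match(r"\d{4}", text) is not None

def is_pin(text):
    # re.fullmatch requires the whole string to be exactly four digits.
    return re.fullmatch(r"\d{4}", text) is not None

print(is_pin_buggy("1234abc"))  # True: accepted by mistake
print(is_pin("1234abc"))        # False
print(is_pin("1234"))           # True
```

Both helpers use the same pattern; only the function choice decides whether trailing characters are rejected.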

Question 2

How do capture groups and named groups work?

Accepted Answer

Parentheses `( )` create a **capture group** you retrieve by **number** (1-based; group 0 is the whole match). `(?P<name>...)` creates a **named group** you retrieve by name, which is far more readable. `(?:...)` groups **without capturing** when you only need it for grouping or alternation.

```python
import re
m = re.search(r"(\d{4})-(\d{2})", "2026-06")
m.group(0)        # '2026-06': whole match
m.group(1)        # '2026': first group
m.groups()        # ('2026', '06')

m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "2026-06")
m.group("year")   # '2026'
m.groupdict()     # {'year': '2026', 'month': '06'}
```

Named groups make patterns self-documenting and resilient to reordering. Rule of thumb: use `(?P<name>...)` for anything you'll extract, and `(?:...)` when grouping is structural only.
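
The non-capturing form is easiest to appreciate in a small sketch. The `URL` pattern and the sample strings below are made up for illustration and are not a complete URL parser.

```python
import re

# (?:...) groups the scheme alternation without creating a numbered group,
# so the named host group is the only captured group.
URL = re.compile(r"(?:https?|ftp)://(?P<host>[^/\s]+)")

m = URL.search("see ftp://files.example.org/pub for details")
host = m.group("host")
captured = m.groups()
print(host)      # files.example.org
print(captured)  # ('files.example.org',)

# With findall, a capturing group changes what is returned:
with_capture = re.findall(r"(ab)+", "ababxab")       # group contents only
without_capture = re.findall(r"(?:ab)+", "ababxab")  # whole matches
print(with_capture)     # ['ab', 'ab']
print(without_capture)  # ['abab', 'ab']
```

The `findall` difference is a common surprise: once a pattern contains a capturing group, `findall` returns the group contents rather than the full match.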

Question 3

What does re.compile do, and why would you use it?

Accepted Answer

`re.compile(pattern)` builds a **reusable pattern object** once; you then call methods (`.search`, `.match`, `.findall`, `.sub`) on it. The module-level functions also compile internally and **cache** recent patterns, so the main win is **clarity and reuse**, plus a small speedup when a pattern is used **many times in a loop**.

```python
import re
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")  # compile once

lines = ["released 2026-06", "no date here"]  # sample input for illustration
for line in lines:
    m = DATE.search(line)      # reuse the compiled object
    if m:
        print(m.group("year"))
```

It also lets you attach **flags** (e.g. `re.IGNORECASE`, `re.VERBOSE`) in one place. Rule of thumb: compile patterns used repeatedly or shared across a module; for one-off use the module functions are fine.
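
As a sketch of attaching flags at compile time, the example below combines `re.VERBOSE` (whitespace and `#` comments inside the pattern are ignored) with `re.IGNORECASE`. The `TIME` pattern and the sample sentence are hypothetical.

```python
import re

# Hypothetical pattern for times such as "9:05 am" or "12:30 PM".
TIME = re.compile(
    r"""
    (?P<hour>\d{1,2})    # hour, one or two digits
    :
    (?P<minute>\d{2})    # minute, two digits
    \s*
    (?P<ampm>am|pm)      # meridiem, matched in any case via IGNORECASE
    """,
    re.VERBOSE | re.IGNORECASE,
)

m = TIME.search("Meeting at 9:05 PM sharp")
parts = m.groupdict()
print(parts)  # {'hour': '9', 'minute': '05', 'ampm': 'PM'}
```

Because the flags live on the compiled object, every call site that uses `TIME` gets the same behavior without repeating them.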

Question 4

What is the difference between greedy and non-greedy matching?

Accepted Answer

By default quantifiers (`*`, `+`, `?`, `{m,n}`) are **greedy**: they match as **much** as possible, then backtrack. Adding a trailing `?` makes them **non-greedy** (lazy), so they match as **little** as possible. This matters hugely when a delimiter can appear multiple times.

```python
import re
text = "<b>bold</b>"  # sample input for illustration
re.search(r"<.*>", text).group()   # '<b>bold</b>': greedy, grabs everything
re.search(r"<.*?>", text).group()  # '<b>': lazy, stops at first '>'
```

Greedy patterns over-matching is a classic "regex ate too much" bug. Rule of thumb: when matching content **between delimiters**, reach for the lazy `*?` / `+?` (or a negated character class like `[^>]*`).
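
The three options for matching between delimiters can be compared side by side. This is a small sketch with a made-up sample string; the variable names are illustrative only.

```python
import re

text = 'say "hello" and "goodbye"'

greedy = re.findall(r'"(.*)"', text)      # runs to the last quote
lazy = re.findall(r'"(.*?)"', text)       # stops at the nearest closing quote
negated = re.findall(r'"([^"]*)"', text)  # cannot cross a quote at all

print(greedy)   # ['hello" and "goodbye']
print(lazy)     # ['hello', 'goodbye']
print(negated)  # ['hello', 'goodbye']
```

The lazy and negated-class versions agree here. The negated class states the intent more directly, since the group can never contain the delimiter.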

Question 5

How does re.sub work, and why use raw strings for patterns?

Accepted Answer

`re.sub(pattern, repl, string)` returns a **new** string with all matches
replaced. The replacement can reference captured groups with `\1` or
`\g<name>`, or be a **function** that receives each `Match` for dynamic
replacement. You should write patterns as **raw strings** (`r"..."`) so that
backslash escapes like `\d` and `\b` reach the regex engine instead of being
interpreted by Python first.

```python
import re
re.sub(r"\s+", " ", "a   b	c")          # 'a b c'  — collapse whitespace
re.sub(r"(\d{4})-(\d{2})", r"\2/\1", "2026-06")  # '06/2026' — reorder groups
re.sub(r"\d+", lambda m: f"[{m.group()}]", "x9")  # 'x[9]' — function repl

"\d"     # in a normal string this is an invalid escape (warns)
r"\d"    # raw string — passes \d straight to the engine
```

Without `r""`, `"\b"` becomes a backspace character, not a word boundary —
a subtle, hard-to-spot bug. Rule of thumb: **always** prefix regex patterns
with `r`.
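
Two of the points above can be checked directly: the `\g<name>` replacement syntax and the `"\b"` pitfall. A minimal sketch, with sample strings chosen only for illustration:

```python
import re

# Named groups can be referenced in the replacement with \g<name>.
swapped = re.sub(
    r"(?P<year>\d{4})-(?P<month>\d{2})",
    r"\g<month>/\g<year>",
    "2026-06",
)
print(swapped)  # 06/2026

# "\b" in a normal string is a single backspace character;
# r"\b" is a backslash followed by 'b', which the regex engine
# reads as a word boundary.
print(len("\b"), len(r"\b"))  # 1 2

raw_hits = re.findall(r"\bcat\b", "cat concat cat")
plain_hits = re.findall("\bcat\b", "cat concat cat")
print(raw_hits)    # ['cat', 'cat']
print(plain_hits)  # []
```

The non-raw pattern fails silently: it is a valid regex that looks for literal backspace characters, so no error points at the mistake.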

Regular Expressions Interview Questions & Answers
