Middleware is code that runs around every request — before the handler (on the way in) and after the handler (on the way out). It can modify the request, modify the response, short-circuit with an early response, or just observe (logging, timing).
import time

from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

app = FastAPI()

class TimingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        start = time.perf_counter()
        response = await call_next(request)  # call the rest of the stack
        elapsed = time.perf_counter() - start
        # Report the handling time in seconds as a response header
        response.headers["X-Process-Time"] = f"{elapsed:.4f}"
        return response

app.add_middleware(TimingMiddleware)
Rule of thumb: middleware is for cross-cutting concerns (timing, CORS, auth headers, compression); use dependencies for per-route business logic.
Middleware added with add_middleware() wraps the app in reverse addition
order — last added = outermost wrapper = runs first for requests.
app.add_middleware(A) # added first → innermost
app.add_middleware(B) # added second → middle
app.add_middleware(C) # added last → outermost
# Request: C → B → A → handler
# Response: handler → A → B → C
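Built-in middleware (CORS, GZip, TrustedHost) should typically be outermost (added last) so they wrap everything including custom middleware.
Rule of thumb: add CORS middleware last so it wraps everything. A logging or timing middleware only measures the layers it wraps, so add it late (just before CORS) if its timing should include the other middleware; adding it first makes it innermost, where it times little more than the handler.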
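The wrapping order can be reproduced without any framework. The following is a minimal sketch, not Starlette's implementation: `make_layer`, `build_stack`, and the trace list are hypothetical names used only to show that each later addition wraps the stack built so far.

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


def make_layer(name: str) -> Callable[[Handler], Handler]:
    """Return a wrapper that records when it is entered and exited."""
    def wrap(inner: Handler) -> Handler:
        def layer(trace: List[str]) -> None:
            trace.append(f"{name} in")    # request path
            inner(trace)
            trace.append(f"{name} out")   # response path
        return layer
    return wrap


def handler(trace: List[str]) -> None:
    trace.append("handler")


def build_stack(added_in_order: List[str]) -> Handler:
    # Each later layer wraps the stack built so far,
    # so the last one added ends up outermost.
    app: Handler = handler
    for name in added_in_order:
        app = make_layer(name)(app)
    return app


trace: List[str] = []
build_stack(["A", "B", "C"])(trace)
print(trace)
# ['C in', 'B in', 'A in', 'handler', 'A out', 'B out', 'C out']
```

Because `C` was added last, it sees the request first and the response last, which is why a layer meant to observe everything belongs at the end of the list.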
from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://app.example.com", "https://admin.example.com"],
    allow_credentials=True,  # required for cookies/auth headers cross-origin
    allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE"],
    allow_headers=["Content-Type", "Authorization", "X-Request-ID"],
    expose_headers=["X-Total-Count"],  # headers the browser can read
    max_age=3600,  # preflight cache duration (seconds)
)
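Critical interaction: allow_origins=["*"] combined with allow_credentials=True is
invalid. The CORS specification forbids a wildcard Access-Control-Allow-Origin on
credentialed requests, so browsers reject such responses. You must list exact
origins when using credentials.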
Rule of thumb: list exact origins, never "*", for APIs that send or receive
auth cookies or Authorization headers.
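To make the origin rule concrete, here is a simplified sketch of the decision a server makes for a non-preflight request. It is an illustration of the CORS rule, not CORSMiddleware's actual code; `cors_headers` is a hypothetical helper, and refusing the wildcard-plus-credentials combination outright is a design choice of this sketch.

```python
from typing import Dict, Optional, Sequence


def cors_headers(
    request_origin: Optional[str],
    allow_origins: Sequence[str],
    allow_credentials: bool,
) -> Dict[str, str]:
    """Illustrative only: CORS headers a server could send for a simple request."""
    if request_origin is None:
        return {}  # no Origin header: not a cross-origin browser request
    if "*" in allow_origins and not allow_credentials:
        return {"Access-Control-Allow-Origin": "*"}
    if request_origin not in allow_origins:
        # Unknown origin (or wildcard with credentials): send nothing,
        # so the browser blocks the page from reading the response.
        return {}
    headers = {"Access-Control-Allow-Origin": request_origin, "Vary": "Origin"}
    if allow_credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers


allowed = ["https://app.example.com", "https://admin.example.com"]
print(cors_headers("https://app.example.com", allowed, True))
print(cors_headers("https://evil.example", allowed, True))
```

With credentials enabled, the allowed origin is echoed back exactly; a wildcard is only ever returned when credentials are off.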
from starlette.middleware.gzip import GZipMiddleware
app.add_middleware(GZipMiddleware, minimum_size=1000)
minimum_size (bytes) — responses smaller than this threshold are not
compressed (compression overhead isn't worth it for tiny responses).
GZipMiddleware respects Accept-Encoding: gzip from the client — clients
that don't support gzip get uncompressed responses.
Rule of thumb: enable GZip for APIs that return large JSON payloads (lists of items, reports); don't bother for APIs that only return small objects.
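The two conditions described above (client support and a size threshold) can be sketched with the standard library. This is a simplified stand-in for what a gzip middleware decides, not GZipMiddleware's code; `maybe_gzip` is a hypothetical helper, and its substring check on Accept-Encoding ignores quality values such as `gzip;q=0`.

```python
import gzip
from typing import Dict, Tuple


def maybe_gzip(
    body: bytes, accept_encoding: str, minimum_size: int = 1000
) -> Tuple[bytes, Dict[str, str]]:
    """Compress only if the client accepts gzip and the body is large enough."""
    accepts_gzip = "gzip" in accept_encoding.lower()
    if not accepts_gzip or len(body) < minimum_size:
        return body, {}  # send the body unchanged, no extra headers
    compressed = gzip.compress(body)
    return compressed, {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}


large_json = b'{"item": "value"}' * 200   # 3400 bytes of repetitive JSON
payload, extra_headers = maybe_gzip(large_json, "gzip, deflate")
print(len(large_json), len(payload), extra_headers)
```

Repetitive JSON lists compress well, which is why the payoff is largest for list and report endpoints.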
TrustedHostMiddleware validates the Host header against an allowlist, returning
400 if the host doesn't match. This prevents Host header injection attacks:
from starlette.middleware.trustedhost import TrustedHostMiddleware
app.add_middleware(
    TrustedHostMiddleware,
    allowed_hosts=["api.example.com", "*.example.com"],
)
Without it, an attacker who routes a request to your server with a forged Host
header can exploit password reset links, CORS checks, or server-generated URLs
that rely on request.base_url.
Rule of thumb: add TrustedHostMiddleware before deploying publicly; list
all valid hostnames including your CDN/load balancer's hostname.
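The allowlist check itself is simple enough to sketch. The function below is an approximation written for illustration, not the middleware's source: `is_allowed_host` is a hypothetical name, and the port stripping does not handle bracketed IPv6 literals.

```python
from typing import Sequence


def is_allowed_host(host_header: str, allowed_hosts: Sequence[str]) -> bool:
    """Return True if the Host header matches an exact name or a wildcard pattern."""
    host = host_header.split(":")[0].lower()  # drop an optional port
    for pattern in allowed_hosts:
        if pattern == "*" or host == pattern:
            return True
        # "*.example.com" matches any subdomain, but not "example.com" itself
        if pattern.startswith("*.") and host.endswith(pattern[1:]):
            return True
    return False


allowed_hosts = ["api.example.com", "*.example.com"]
for candidate in ["api.example.com", "cdn.example.com:443", "example.com", "evil.com"]:
    print(candidate, is_allowed_host(candidate, allowed_hosts))
```

Note that the wildcard pattern does not cover the bare apex domain, so it has to be listed separately if the API is served from it.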
from starlette.middleware.httpsredirect import HTTPSRedirectMiddleware
app.add_middleware(HTTPSRedirectMiddleware)
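All HTTP requests receive a 307 Temporary Redirect to the HTTPS equivalent. Behind a reverse proxy that terminates TLS, the proxy should handle the redirect instead; the FastAPI middleware is a fallback.
Note: this middleware looks only at the request scheme in the ASGI scope; it does not read X-Forwarded-Proto itself.
Configure Nginx/ALB to set X-Forwarded-Proto and run Uvicorn with proxy headers trusted from that proxy (--proxy-headers and --forwarded-allow-ips) so the scheme is reported as https; otherwise the middleware may redirect
requests that were already HTTPS (they arrive at Uvicorn as plain HTTP), which can cause a redirect loop.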
Rule of thumb: use HTTPSRedirectMiddleware as defence-in-depth; configure the
primary HTTP→HTTPS redirect at the reverse proxy level.
BaseHTTPMiddleware provides a high-level async dispatch(request, call_next)
interface. It's easy to write but has a performance cost:
- It relays the response through an internal in-memory stream instead of passing ASGI messages straight through.
- Streaming responses are handled less efficiently than in pure ASGI middleware, and the response body is not directly readable inside dispatch.
- It adds per-request overhead compared to pure ASGI middleware.
from uuid import uuid4
from starlette.middleware.base import BaseHTTPMiddleware
class RequestIDMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Attach a fresh ID to the request and echo it on the response
        request.state.request_id = str(uuid4())
        response = await call_next(request)
        response.headers["X-Request-ID"] = request.state.request_id
        return response
For high-performance production middleware, implement the raw ASGI interface:
class RawMiddleware:
    def __init__(self, app):
        self.app = app
    async def __call__(self, scope, receive, send):
        ...
        await self.app(scope, receive, send)
Rule of thumb: use BaseHTTPMiddleware for simplicity unless profiling shows
it as a bottleneck; use raw ASGI middleware for streaming responses.
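To show what might replace the `...` in the raw ASGI skeleton, here is a minimal sketch of a pure ASGI middleware that adds a timing header by wrapping `send`. It uses only the standard library; `demo_app` and `run_demo` are hypothetical stand-ins for a real application and server, and the header measures time until the response starts rather than until the last body chunk.

```python
import asyncio
import time


class ProcessTimeMiddleware:
    """Pure ASGI middleware: adds X-Process-Time without touching the body."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and websocket traffic straight through.
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()

        async def send_with_timing(message):
            if message["type"] == "http.response.start":
                elapsed = time.perf_counter() - start
                headers = list(message.get("headers", []))
                headers.append((b"x-process-time", f"{elapsed:.4f}".encode()))
                message = {**message, "headers": headers}
            await send(message)  # body chunks are forwarded as they arrive

        await self.app(scope, receive, send_with_timing)


async def demo_app(scope, receive, send):
    """Hypothetical stand-in for the real application: streams two chunks."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"chunk-1", "more_body": True})
    await send({"type": "http.response.body", "body": b"chunk-2", "more_body": False})


async def run_demo():
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await ProcessTimeMiddleware(demo_app)({"type": "http"}, receive, send)
    return sent


messages = asyncio.run(run_demo())
print([m["type"] for m in messages])
```

In a FastAPI app such a class can be registered the same way as the other examples, with `app.add_middleware(ProcessTimeMiddleware)`.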
Exceptions raised in a handler propagate up through the middleware stack, so
middleware can catch them with try/except around call_next():
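import logging
from fastapi.responses import JSONResponse
from starlette.middleware.base import BaseHTTPMiddleware
logger = logging.getLogger(__name__)
class ErrorLoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        try:
            response = await call_next(request)
        except Exception as e:
            # Log with traceback, then return a generic 500 to the client
            logger.error("Unhandled error: %s", e, exc_info=True)
            return JSONResponse({"detail": "Internal server error"}, status_code=500)
        return response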
However, HTTPException is caught by FastAPI's exception handler before
propagating through middleware — middleware doesn't see handled HTTPException
responses.
Rule of thumb: use middleware to catch unexpected exceptions (for logging);
use @app.exception_handler to convert expected domain exceptions to HTTP responses.
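One detail a catch-all layer has to respect is that a 500 can only be sent if the response has not started yet. The sketch below shows that check in a pure ASGI middleware using only the standard library; `CatchAllMiddleware`, `failing_app`, and `run_demo` are hypothetical names for illustration.

```python
import asyncio
import json
import logging

logger = logging.getLogger("app.errors")


class CatchAllMiddleware:
    """Log unexpected exceptions and answer with a generic 500 when still possible."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        response_started = False

        async def tracking_send(message):
            nonlocal response_started
            if message["type"] == "http.response.start":
                response_started = True
            await send(message)

        try:
            await self.app(scope, receive, tracking_send)
        except Exception:
            logger.exception("Unhandled error")
            if response_started:
                raise  # headers already sent; the response cannot be replaced
            body = json.dumps({"detail": "Internal server error"}).encode()
            await send({
                "type": "http.response.start",
                "status": 500,
                "headers": [(b"content-type", b"application/json")],
            })
            await send({"type": "http.response.body", "body": body})


async def failing_app(scope, receive, send):
    """Hypothetical application that fails before sending anything."""
    raise RuntimeError("boom")


async def run_demo():
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await CatchAllMiddleware(failing_app)({"type": "http"}, receive, send)
    return sent


messages = asyncio.run(run_demo())
print(messages[0]["status"], messages[1]["body"])
```

If the failure happens after the first chunk has gone out, the sketch re-raises instead, leaving it to the server to close the connection.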
import uuid
from starlette.middleware.base import BaseHTTPMiddleware
class RequestIDMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Use X-Request-ID if provided (from upstream), else generate one
        req_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))
        request.state.request_id = req_id
        response = await call_next(request)
        response.headers["X-Request-ID"] = req_id
        return response
app.add_middleware(RequestIDMiddleware)
In handlers and dependencies, access via request.state.request_id and include
it in log records for correlation.
Rule of thumb: always propagate the request ID from upstream if present — it allows tracing a request across multiple services in a distributed system.
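One possible way to get the ID into every log record is a context variable plus a logging filter. This is a sketch under assumptions: `request_id_var` and `RequestIDFilter` are hypothetical names, and in a real app the variable would be set per request (for example in a dependency that reads request.state.request_id) rather than by hand as in the demo lines.

```python
import contextvars
import io
import logging

# Holds the current request's ID; "-" means "not inside a request".
request_id_var: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="-"
)


class RequestIDFilter(logging.Filter):
    """Copy the current request ID onto every record so formatters can use it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


stream = io.StringIO()  # stands in for stderr or a log file in this demo
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(request_id)s %(levelname)s %(message)s"))
handler.addFilter(RequestIDFilter())

logger = logging.getLogger("app.requests")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Demo: set the ID for one "request", log, then restore the previous value.
token = request_id_var.set("req-123")
logger.info("order created")
request_id_var.reset(token)
logger.info("outside any request")

print(stream.getvalue())
```

Because the filter reads the variable at log time, application code does not have to pass the ID to every logging call.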
Use starlette.middleware.sessions.SessionMiddleware (cookie-signed sessions):
from starlette.middleware.sessions import SessionMiddleware
app.add_middleware(
    SessionMiddleware,
    secret_key=settings.secret_key,
    session_cookie="session",
    max_age=3600,  # 1 hour
    https_only=True,  # cookie sent over HTTPS only
    same_site="lax",
)
Access in handlers:
from fastapi import HTTPException, Request
@app.get("/me")
async def me(request: Request):
    user_id = request.session.get("user_id")
    if not user_id:
        raise HTTPException(status_code=401)
    return {"user_id": user_id}
@app.post("/login")
async def login(request: Request, ...):  # credential parameters elided
    # `user` is the result of the elided credential check
    request.session["user_id"] = user.id
    return {"status": "logged in"}
Rule of thumb: Starlette sessions store data in a signed cookie, so all session data is sent to the browser. The signature prevents tampering but does not hide the contents; use server-side (for example Redis-backed) sessions for sensitive data.
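The difference between signed and encrypted can be demonstrated with a simplified stand-in built on `hmac`. This is not Starlette's cookie format; `sign_session`, `load_session`, and `SECRET_KEY` are hypothetical and exist only to show that a signed payload is readable by anyone but cannot be altered without the secret.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"change-me"  # hypothetical secret for the demo


def sign_session(data: dict) -> str:
    """Encode the session as base64 JSON and append an HMAC signature."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def load_session(cookie: str) -> dict:
    """Verify the signature, then decode the payload."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = sign_session({"user_id": 42})

# Anyone holding the cookie can read the payload without the secret...
readable = json.loads(base64.urlsafe_b64decode(cookie.split(".")[0]))
print(readable)

# ...but swapping in a different payload invalidates the signature.
forged_payload = base64.urlsafe_b64encode(json.dumps({"user_id": 1}).encode()).decode()
forged = forged_payload + "." + cookie.rpartition(".")[2]
try:
    load_session(forged)
except ValueError as exc:
    print("rejected:", exc)
```

Anything that must stay hidden from the user therefore belongs on the server, with only an opaque session ID in the cookie.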
import time
from collections import defaultdict
from fastapi import Request
from fastapi.responses import JSONResponse
from starlette.middleware.base import BaseHTTPMiddleware
class RateLimitMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, max_calls: int = 100, period: int = 60):
        super().__init__(app)
        self.max_calls = max_calls
        self.period = period
        self.calls: dict[str, list[float]] = defaultdict(list)
    async def dispatch(self, request: Request, call_next):
        # request.client can be None (e.g. some test clients or proxies)
        key = request.client.host if request.client else "unknown"
        now = time.time()
        window_start = now - self.period
        # Keep only calls within the window
        self.calls[key] = [t for t in self.calls[key] if t > window_start]
        if len(self.calls[key]) >= self.max_calls:
            return JSONResponse({"detail": "Rate limit exceeded"}, status_code=429)
        self.calls[key].append(now)
        return await call_next(request)
app.add_middleware(RateLimitMiddleware, max_calls=100, period=60)
For production, keep the counters in Redis rather than an in-memory dict.
Rule of thumb: the in-memory dict is per-worker, so each worker enforces its own separate limit. Use Redis for a global rate limit that works across all workers.
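Separating the window logic from the middleware makes it easier to test and to swap the storage later. The class below is a minimal in-process sketch, still per-worker and not a Redis implementation; `SlidingWindowLimiter` and its injectable `clock` parameter are hypothetical names chosen for the example.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_calls per key within the last `period` seconds."""

    def __init__(
        self,
        max_calls: int,
        period: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so tests can control time
        self.calls: Dict[str, Deque[float]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        window = self.calls.setdefault(key, deque())
        # Drop timestamps that have fallen out of the window.
        while window and window[0] <= now - self.period:
            window.popleft()
        if len(window) >= self.max_calls:
            return False
        window.append(now)
        return True


fake_now = [0.0]
limiter = SlidingWindowLimiter(max_calls=2, period=60, clock=lambda: fake_now[0])
results = [limiter.allow("10.0.0.1") for _ in range(3)]
print(results)  # [True, True, False]
```

A middleware's dispatch would then reduce to calling `limiter.allow(key)` and returning 429 when it is false.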
| Approach | Pros | Cons |
|---|---|---|
| Middleware | Runs before routing; can block early | Can't access route params; not testable via dependency_overrides |
| Dependency | Testable; has route context; composable | Only runs after routing |
Best practice — hybrid:
- Middleware: lightweight token presence check (is the Authorization header present?).
- Dependency: full JWT validation + DB user lookup.
# middleware: fast early reject
class BearerPresenceMiddleware(BaseHTTPMiddleware):
    # /openapi.json is included because the /docs page fetches it
    UNPROTECTED = {"/token", "/docs", "/openapi.json", "/health"}
    async def dispatch(self, request, call_next):
        if request.url.path not in self.UNPROTECTED:
            if not request.headers.get("Authorization"):
                return JSONResponse({"detail": "Missing token"}, status_code=401)
        return await call_next(request)
# dependency: full validation
async def get_current_user(token = Depends(oauth2_scheme)):
    ...
Rule of thumb: use middleware for coarse early rejection; use dependencies for fine-grained auth with route context.
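If the coarse check should also reject headers that are present but malformed, a small parser is enough. This is a sketch with a hypothetical helper name, `extract_bearer_token`; it only inspects the header's shape and performs no validation of the token itself, which remains the dependency's job.

```python
from typing import Optional


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' value, else None."""
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None  # wrong scheme or empty token
    return token.strip()


for value in ["Bearer abc.def.ghi", "Basic dXNlcjpwYXNz", "Bearer ", None]:
    print(repr(value), "->", extract_bearer_token(value))
```

The middleware can return 401 when the result is None, keeping the expensive JWT and database work out of the early path.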