Python Dataclasses and __slots__ Explained — @dataclass, frozen, and field()

Python dataclasses, explained

Writing __init__, __repr__, and __eq__ by hand for a simple data-holding class is tedious boilerplate. @dataclass generates them from your type annotations. Pair it with __slots__ for memory savings and you have Python's modern answer to "I just need a record".

What @dataclass generates

Decorate a class with @dataclass and annotate its fields — Python writes the dunder methods for you:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

p = Point(1, 2)
p                # Point(x=1, y=2)        — generated __repr__
p == Point(1, 2) # True                   — generated __eq__

By default, @dataclass generates __init__, __repr__, and __eq__. Passing order=True adds the ordering methods (__lt__, __le__, __gt__, __ge__), which compare instances field by field in declaration order, and frozen=True makes instances immutable.
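As a minimal sketch of what order=True adds, the example below sorts instances of a hypothetical Version class (the class and its field names are illustrative, not from the article). Instances are compared as if they were tuples of their fields, in declaration order.

```python
from dataclasses import dataclass

@dataclass(order=True)
class Version:
    # Hypothetical example class; major is compared first, then minor.
    major: int
    minor: int

versions = [Version(1, 10), Version(1, 2), Version(0, 9)]
ordered = sorted(versions)   # relies on the generated __lt__
print(ordered)
# [Version(major=0, minor=9), Version(major=1, minor=2), Version(major=1, minor=10)]
print(Version(1, 2) < Version(1, 10))   # True
```

Without order=True, the same sorted() call would raise TypeError, because only __eq__ is generated by default.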

frozen dataclasses

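frozen=True makes instances immutable: assigning to a field raises FrozenInstanceError. Because the default eq=True is still in effect, a frozen dataclass also gets a generated __hash__, so instances can be dict keys or set members as long as every field value is itself hashable. (A non-frozen dataclass with the default eq=True is unhashable.)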

@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
p.x = 5          # FrozenInstanceError
{p: "origin-ish"}   # hashable — works as a dict key
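The following sketch runs the frozen example end to end and contrasts it with a regular dataclass. The names FrozenPoint and MutablePoint are hypothetical, chosen only to keep the two variants apart.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int

@dataclass
class MutablePoint:
    x: int
    y: int

p = FrozenPoint(1, 2)
frozen_error = None
try:
    p.x = 5
except FrozenInstanceError as exc:
    frozen_error = exc
    print("cannot assign:", exc)

# Equal frozen instances hash alike, so either one finds the entry.
labels = {p: "origin-ish"}
print(labels[FrozenPoint(1, 2)])   # origin-ish

# A regular dataclass defines __eq__ but has __hash__ set to None.
try:
    hash(MutablePoint(1, 2))
except TypeError as exc:
    print("unhashable:", exc)
```

FrozenInstanceError is a subclass of AttributeError, so code that already catches AttributeError on assignment keeps working.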

The mutable default trap — field(default_factory)

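Just like a mutable default argument, a mutable default on a dataclass field would be shared across instances. Dataclasses catch the common cases: a list, dict, or set default (since Python 3.11, any unhashable default) raises ValueError when the class is defined. Use field(default_factory=...) instead: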

from dataclasses import dataclass, field

@dataclass
class Cart:
    items: list = []                          # ValueError at class definition!

@dataclass
class Cart:
    items: list = field(default_factory=list) # correct — fresh list per instance

The factory is any zero-argument callable (list, dict, a lambda, or another class). It is called each time an instance is created without an explicit value for that field.
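A short sketch can confirm that each instance gets its own object. The tags field and its lambda factory are additions for illustration only; they are not part of the original Cart.

```python
from dataclasses import dataclass, field

@dataclass
class Cart:
    items: list = field(default_factory=list)
    # Illustrative extra field: a lambda supplies a non-empty starting value.
    tags: list = field(default_factory=lambda: ["new"])

a = Cart()
b = Cart()
a.items.append("apple")

print(a.items)            # ['apple']
print(b.items)            # []
print(a.tags is b.tags)   # False

# Passing a value explicitly bypasses the factory for that field.
c = Cart(items=["pear"])
print(c.items)            # ['pear']
```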

__post_init__ for derived fields and validation

__init__ is generated, so to run extra logic after the fields are set, define __post_init__:

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)    # not a constructor argument

    def __post_init__(self):
        if self.width <= 0:
            raise ValueError("width must be positive")
        self.area = self.width * self.height

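field(init=False) keeps area out of the generated __init__ signature, so callers cannot pass it and __post_init__ is free to compute it. The value is computed once at construction; it is not updated if width or height change later.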
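To see the Rectangle class above in use, the sketch below repeats its definition so the block runs on its own, then exercises the three cases: a valid rectangle, an attempt to pass area, and a failed validation.

```python
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)    # computed, not passed in

    def __post_init__(self):
        if self.width <= 0:
            raise ValueError("width must be positive")
        self.area = self.width * self.height

r = Rectangle(3.0, 4.0)
print(r)   # Rectangle(width=3.0, height=4.0, area=12.0)

# area is not a constructor parameter, so a third argument is rejected.
try:
    Rectangle(3.0, 4.0, 12.0)
except TypeError as exc:
    print("TypeError:", exc)

# __post_init__ runs after the fields are assigned and can veto the instance.
try:
    Rectangle(0, 4.0)
except ValueError as exc:
    print("ValueError:", exc)   # ValueError: width must be positive
```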

__slots__: smaller, faster instances

By default each instance stores its attributes in a per-instance __dict__, which is flexible but memory-hungry. __slots__ replaces it with a fixed layout, cutting memory and speeding attribute access — at the cost of being unable to add new attributes:

class Point:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x, self.y = x, y

p = Point(1, 2)
p.z = 3          # AttributeError — not in __slots__

Since Python 3.10 you can combine it with dataclasses via @dataclass(slots=True), which generates __slots__ from the annotated fields. Use __slots__ when you create huge numbers of small objects.

dataclass vs namedtuple vs NamedTuple

All three model records; pick by mutability and behaviour:

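@dataclass: mutable by default (or frozen on request), with per-field options via field(), a __post_init__ hook, and inheritance. The general-purpose choice.
namedtuple / typing.NamedTuple: immutable and tuple-based, so instances can be indexed, unpacked, and compared with plain tuples. typing.NamedTuple is the annotated class syntax for collections.namedtuple. Best when you want tuple behaviour and immutability.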

# Want tuple unpacking and immutability with little code -> NamedTuple
# Want mutability, methods, or rich behaviour            -> dataclass
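The difference is easiest to see side by side. In this sketch, PointNT and PointDC are hypothetical names for the same two-field record written both ways.

```python
from dataclasses import dataclass
from typing import NamedTuple

class PointNT(NamedTuple):
    x: int
    y: int

@dataclass
class PointDC:
    x: int
    y: int

nt = PointNT(1, 2)
x, y = nt              # tuple unpacking
print(nt[0], x, y)     # 1 1 2
print(nt == (1, 2))    # True: a NamedTuple instance is a tuple

try:
    nt.x = 10
except AttributeError:
    print("NamedTuple fields are read-only")

dc = PointDC(1, 2)
dc.x = 10              # dataclass fields are mutable by default
print(dc)              # PointDC(x=10, y=2)
print(dc == (10, 2))   # False: dataclasses only equal instances of the same class
```

The tuple equality cuts both ways: two unrelated NamedTuple types with the same values compare equal, whereas two different dataclasses never do.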

Recap

@dataclass generates __init__, __repr__, and __eq__ from your annotations; frozen=True makes instances immutable and hashable. List, dict, and set defaults are rejected with ValueError, so use field(default_factory=list) for a fresh object per instance, and __post_init__ for validation or derived fields (field(init=False)). __slots__ drops the per-instance __dict__ to save memory and speed access for large object counts, and @dataclass(slots=True) generates it for you on Python 3.10+. Reach for a dataclass for general records, and a named tuple when you want immutable, tuple-like behaviour.
