Lesson 1 of 17

The Value Class

The Autograd Engine

Every neural network needs to compute gradients — the direction to nudge each weight to reduce loss. In frameworks like PyTorch, this happens automatically. In MicroGPT, we build it ourselves.

The foundation is a single class: Value. It wraps a scalar number and records two things:

  1. Which other Values created it (the computation graph)
  2. The local gradient with respect to each parent (how this output changes when that parent changes)

class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data        # the scalar value
        self.grad = 0           # accumulated gradient (starts at 0)
        self._children = children       # parent nodes in the graph
        self._local_grads = local_grads # d(output)/d(each parent)

Addition

When c = a + b:

  • Output: c.data = a.data + b.data
  • Local gradients: dc/da = 1 and dc/db = 1

So c stores children=(a, b) and local_grads=(1, 1).

def __add__(self, other):
    other = other if isinstance(other, Value) else Value(other)
    return Value(self.data + other.data, (self, other), (1, 1))
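A quick sanity check of this behavior (a minimal sketch that repeats the class so the snippet runs on its own):

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        # Wrap plain numbers so `a + 1` also works
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))

a = Value(2.0)
b = Value(3.0)
c = a + b
print(c.data)          # 5.0
print(c._children)     # the two parent nodes (a, b) — the graph edge
print(c._local_grads)  # (1, 1): dc/da and dc/db
```

Note that `c` never recomputes anything; it simply stores the result plus enough bookkeeping to run the chain rule later.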

Multiplication

When c = a * b:

  • Output: c.data = a.data * b.data
  • Local gradients: dc/da = b.data and dc/db = a.data

def __mul__(self, other):
    other = other if isinstance(other, Value) else Value(other)
    return Value(self.data * other.data, (self, other), (other.data, self.data))
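With both operators in place, a mixed expression records the right local gradients at every node. A self-contained sketch (the class restates the snippets above so it runs on its own):

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

a = Value(2.0)
b = Value(3.0)
p = a * b              # p.data = 6.0
s = p + a              # s.data = 8.0
print(p._local_grads)  # (3.0, 2.0): dp/da = b.data, dp/db = a.data
print(s._local_grads)  # (1, 1): addition always contributes 1
```

Each node only knows its immediate parents and local gradients; the global gradient comes later, when backpropagation multiplies these local values along paths through the graph.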

Your Task

Implement the Value class with __init__, __add__, and __mul__.
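One simple way to check a finished implementation is to mix plain Python numbers into an expression; the `isinstance` guard in both operators should wrap them automatically. This sketch just bundles the lesson's snippets into one class:

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

a = Value(-4.0)
c = a * 2 + 1   # the plain 2 and 1 are wrapped into Values on the fly
print(c.data)   # -7.0
print(c.grad)   # 0: gradients only accumulate later, during backprop
```

(One caveat: with only `__add__` and `__mul__` defined, the `Value` must appear on the left, so `a * 2` works but `2 * a` does not until a reflected operator like `__rmul__` is added.)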
