Lesson 1 of 17

The Value Class

The Autograd Engine

Every neural network needs to compute gradients — the direction to nudge each weight to reduce loss. In frameworks like PyTorch, this happens automatically. In MicroGPT, we build it ourselves.

The foundation is a single class: Value. It wraps a scalar number and records two things:

  1. Which other Values created it (the computation graph)
  2. The local gradient with respect to each parent (how this output changes when that parent changes)

class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data        # the scalar value
        self.grad = 0           # accumulated gradient (starts at 0)
        self._children = children       # parent nodes in the graph
        self._local_grads = local_grads # d(output)/d(each parent)

Addition

When c = a + b:

  • Output: c.data = a.data + b.data
  • Local gradients: dc/da = 1 and dc/db = 1

So c stores children=(a, b) and local_grads=(1, 1).

def __add__(self, other):
    other = other if isinstance(other, Value) else Value(other)
    return Value(self.data + other.data, (self, other), (1, 1))
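A quick sanity check of this behavior (a minimal sketch that repeats the class so the snippet runs on its own):

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        # Wrap plain numbers so `a + 1` also works
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))

a = Value(2.0)
b = Value(3.0)
c = a + b
print(c.data)          # 5.0
print(c._children)     # the two parent nodes (a, b) — the graph edge
print(c._local_grads)  # (1, 1): dc/da and dc/db
```

Note that `c` never recomputes anything; it simply stores the result plus enough bookkeeping to run the chain rule later.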

Multiplication

When c = a * b:

  • Output: c.data = a.data * b.data
  • Local gradients: dc/da = b.data and dc/db = a.data

def __mul__(self, other):
    other = other if isinstance(other, Value) else Value(other)
    return Value(self.data * other.data, (self, other), (other.data, self.data))
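With both operators in place, a mixed expression records the right local gradients at every node. A self-contained sketch (the class restates the snippets above so it runs on its own):

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

a = Value(2.0)
b = Value(3.0)
p = a * b              # p.data = 6.0
s = p + a              # s.data = 8.0
print(p._local_grads)  # (3.0, 2.0): dp/da = b.data, dp/db = a.data
print(s._local_grads)  # (1, 1): addition always contributes 1
```

Each node only knows its immediate parents and local gradients; the global gradient comes later, when backpropagation multiplies these local values along paths through the graph.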

Your Task

Implement the Value class with __init__, __add__, and __mul__.
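One simple way to check a finished implementation is to mix plain Python numbers into an expression; the `isinstance` guard in both operators should wrap them automatically. This sketch just bundles the lesson's snippets into one class:

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

a = Value(-4.0)
c = a * 2 + 1   # the plain 2 and 1 are wrapped into Values on the fly
print(c.data)   # -7.0
print(c.grad)   # 0: gradients only accumulate later, during backprop
```

(One caveat: with only `__add__` and `__mul__` defined, the `Value` must appear on the left, so `a * 2` works but `2 * a` does not until a reflected operator like `__rmul__` is added.)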
