Lesson 1 of 17
The Value Class
The Autograd Engine
Every neural network needs to compute gradients — the direction to nudge each weight to reduce loss. In frameworks like PyTorch, this happens automatically. In MicroGPT, we build it ourselves.
The foundation is a single class: `Value`. It wraps a scalar number and records two things:
- Which other `Value`s created it (the computation graph)
- The local gradient with respect to each parent (how this output changes when that parent changes)
```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data                 # the scalar value
        self.grad = 0                    # accumulated gradient (starts at 0)
        self._children = children        # parent nodes in the graph
        self._local_grads = local_grads  # d(output)/d(each parent)
```
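A leaf node (a `Value` created directly rather than by an operation) has no parents and no local gradients yet. A quick check, restating the class above so the snippet runs on its own:

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data                 # the scalar value
        self.grad = 0                    # accumulated gradient (starts at 0)
        self._children = children        # parent nodes in the graph
        self._local_grads = local_grads  # d(output)/d(each parent)

# A leaf has empty children and local_grads, and its gradient starts at 0.
x = Value(3.0)
print(x.data)          # 3.0
print(x.grad)          # 0
print(x._children)     # ()
print(x._local_grads)  # ()
```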
Addition
When c = a + b:
- Output: `c.data = a.data + b.data`
- Local gradients: `dc/da = 1` and `dc/db = 1`
So c stores children=(a, b) and local_grads=(1, 1).
```python
    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))
```
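A quick check that addition records the graph as described, restating `Value` with `__add__` so the snippet runs standalone:

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))

a, b = Value(2.0), Value(5.0)
c = a + b
print(c.data)                  # 7.0
print(c._children == (a, b))   # True: both parents recorded
print(c._local_grads)          # (1, 1)
```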
Multiplication
When c = a * b:
- Output: `c.data = a.data * b.data`
- Local gradients: `dc/da = b.data` and `dc/db = a.data`
```python
    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))
```
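The stored local gradients are all a backward pass will need. As a sketch of where this is heading, here is the chain rule applied by hand to `d = a*b + a`, restating the class with both operators so the snippet runs standalone (the automatic backward pass comes later in the course):

```python
class Value:
    __slots__ = ('data', 'grad', '_children', '_local_grads')

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0
        self._children = children
        self._local_grads = local_grads

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1, 1))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

a, b = Value(3.0), Value(4.0)
e = a * b   # e._local_grads == (4.0, 3.0), i.e. (b.data, a.data)
d = e + a   # d._local_grads == (1, 1)

# Chain rule by hand: dd/da = dd/de * de/da + dd/da(direct path)
#                           = 1 * 4.0    + 1 = 5.0  (matches d = a*b + a, dd/da = b + 1)
dd_da = d._local_grads[0] * e._local_grads[0] + d._local_grads[1]
print(dd_da)  # 5.0
```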
Your Task
Implement the `Value` class with `__init__`, `__add__`, and `__mul__`.