Lesson 2 of 17


More Operations on Value

The neural network needs six scalar operations beyond add and mul: power, log, exp, and relu, plus negation and division, which come for free once power and mul exist (-a is a * -1, and a / b is a * b**-1).

Each operation records its local gradient — the derivative of the output with respect to the input.
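The snippets below construct Value nodes with a three-argument constructor. For reference, here is a minimal sketch of the assumed container; the attribute names children and local_grads are assumptions for illustration, not part of the lesson's code:

```python
class Value:
    """Minimal sketch of the assumed autograd node (attribute names are assumptions)."""
    def __init__(self, data, children=(), local_grads=()):
        self.data = data                # the scalar this node holds
        self.children = children        # parent Values this node was computed from
        self.local_grads = local_grads  # d(output)/d(child) for each child
```

Each operation then returns a new Value carrying its inputs and their local gradients.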

Power

a ** n → local gradient is n * a^(n-1) (power rule):

def __pow__(self, other):
    # other is a plain int or float exponent, not a Value
    return Value(self.data**other, (self,), (other * self.data**(other-1),))
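A quick sanity check of the power rule, comparing the analytic gradient against a central finite difference (the values here are arbitrary):

```python
# Check d/da a**n against a numeric estimate (illustrative values).
a, n, h = 3.0, 4, 1e-6
numeric = ((a + h)**n - (a - h)**n) / (2 * h)
analytic = n * a**(n - 1)   # power rule: 4 * 3**3 = 108
assert abs(numeric - analytic) < 1e-4
```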

Log and Exp

d/da log(a) = 1/a and d/da exp(a) = exp(a):

def log(self):
    return Value(math.log(self.data), (self,), (1/self.data,))

def exp(self):
    return Value(math.exp(self.data), (self,), (math.exp(self.data),))
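The same finite-difference trick confirms both derivatives (values arbitrary):

```python
import math

a, h = 2.0, 1e-6
num_log = (math.log(a + h) - math.log(a - h)) / (2 * h)
assert abs(num_log - 1 / a) < 1e-6          # d/da log(a) = 1/a
num_exp = (math.exp(a + h) - math.exp(a - h)) / (2 * h)
assert abs(num_exp - math.exp(a)) < 1e-6    # d/da exp(a) = exp(a)
```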

ReLU

ReLU is max(0, x). Its derivative is 1 when x > 0 and 0 when x < 0; at x = 0 the derivative is undefined, and the usual convention is to use 0:

def relu(self):
    return Value(max(0, self.data), (self,), (float(self.data > 0),))
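Away from the kink at x = 0, that stored gradient matches a numeric estimate; a small standalone check (test points arbitrary):

```python
def relu(x):
    return max(0.0, x)

h = 1e-6
for x in (2.0, -3.0):   # one point on each side of the kink
    numeric = (relu(x + h) - relu(x - h)) / (2 * h)
    assert abs(numeric - float(x > 0)) < 1e-9
```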

ReLU is the nonlinearity in the MLP block. Without nonlinearities, stacking linear layers collapses to a single linear transformation.
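The collapse is easy to see with scalars: two linear layers compose into a single linear map with a merged weight and bias (the coefficients below are arbitrary):

```python
# y = w2 * (w1 * x + b1) + b2 == (w2 * w1) * x + (w2 * b1 + b2)
w1, b1, w2, b2 = 3.0, 1.0, -2.0, 0.5
x = 1.7
two_layers = w2 * (w1 * x + b1) + b2
one_layer = (w2 * w1) * x + (w2 * b1 + b2)
assert abs(two_layers - one_layer) < 1e-12
```

Inserting a ReLU between the two layers breaks this equivalence, which is exactly why the nonlinearity matters.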

Your Task

Add __pow__, log, exp, and relu to the Value class.
