Lesson 2 of 17
More Operations on Value
The neural network uses six scalar operations beyond add and mul: power, log, exp, relu, negation, and division. Negation is just multiplication by -1, and dividing by x is multiplying by x ** -1, so only the first four need methods of their own.
Each operation records its local gradient — the derivative of the output with respect to the input.
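Every snippet in this lesson constructs a Value from three arguments: the result, the input Values, and the local gradients. A minimal sketch of a constructor matching that shape (the attribute names children and local_grads are assumptions; the real class also needs a backward pass) might look like:

```python
class Value:
    """Minimal sketch: a scalar that remembers how it was made."""

    def __init__(self, data, children=(), local_grads=()):
        self.data = data                # the scalar result
        self.children = children        # input Values this one was computed from
        self.local_grads = local_grads  # d(output)/d(input), one per child

    def __repr__(self):
        return f"Value({self.data})"
```

Storing one local gradient per child at construction time is what lets a later backward pass multiply them along the graph via the chain rule.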
Power
a ** n → local gradient is n * a^(n-1) (power rule):
def __pow__(self, other):
    return Value(self.data**other, (self,), (other * self.data**(other - 1),))
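You can sanity-check the power-rule gradient without any autograd machinery by comparing it against a central finite difference (the values here are arbitrary test points, not from the lesson):

```python
# Check d/da a**n == n * a**(n-1) numerically at a = 2, n = 3.
a, n, h = 2.0, 3.0, 1e-6
numeric = ((a + h)**n - (a - h)**n) / (2 * h)  # central difference
analytic = n * a**(n - 1)                      # power rule: 3 * 2**2
```

The two values agree to many decimal places; this same trick is a useful debugging tool for every local gradient you write.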
Log and Exp
d/da log(a) = 1/a and d/da exp(a) = exp(a):
def log(self):
    return Value(math.log(self.data), (self,), (1/self.data,))

def exp(self):
    return Value(math.exp(self.data), (self,), (math.exp(self.data),))
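The same finite-difference check confirms both of these local gradients (the test point a = 2 is arbitrary):

```python
import math

# Check d/da log(a) == 1/a and d/da exp(a) == exp(a) at a = 2.
a, h = 2.0, 1e-6
log_numeric = (math.log(a + h) - math.log(a - h)) / (2 * h)
exp_numeric = (math.exp(a + h) - math.exp(a - h)) / (2 * h)
```

log_numeric lands on 1/a and exp_numeric on exp(a), matching the tuples passed to Value above.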
ReLU
ReLU is max(0, x). Its derivative is 1 when x > 0, and 0 otherwise:
def relu(self):
    return Value(max(0, self.data), (self,), (float(self.data > 0),))
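A quick numerical check away from the kink at 0 shows why float(self.data > 0) is the right local gradient (the test points 2 and -2 are arbitrary):

```python
def relu(x):
    return max(0.0, x)

# Central differences on either side of zero.
h = 1e-6
grad_pos = (relu(2.0 + h) - relu(2.0 - h)) / (2 * h)    # slope 1 for x > 0
grad_neg = (relu(-2.0 + h) - relu(-2.0 - h)) / (2 * h)  # slope 0 for x < 0
```

At x = 0 itself the derivative is undefined; float(0 > 0) picks 0 there, a common convention.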
ReLU is the nonlinearity in the MLP block. Without nonlinearities, stacking linear layers collapses to a single linear transformation.
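The collapse of stacked linear layers is easy to see in one dimension: two linear maps compose into a single linear map with merged weight and bias (the coefficients below are arbitrary):

```python
# w2*(w1*x + b1) + b2  ==  (w2*w1)*x + (w2*b1 + b2)  for every x,
# so two linear "layers" without a nonlinearity are one linear layer.
w1, b1 = 0.5, 1.0
w2, b2 = -2.0, 3.0

def stacked(x):
    return w2 * (w1 * x + b1) + b2

def collapsed(x):
    return (w2 * w1) * x + (w2 * b1 + b2)
```

Inserting relu between the two layers breaks this identity, which is exactly why the MLP needs it.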
Your Task
Add __pow__, log, exp, and relu to the Value class.