
Decrease p1 and increase p2, in order to reduce the value of df/dp1 and df/dp2


It's also worth knowing about gradient descent with momentum, which uses past gradients to smooth out each update. It computes an exponentially weighted average of the gradients and then uses that average, rather than the raw gradient, to update the weights. In practice it often converges faster than standard gradient descent.
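To make the momentum idea concrete, here is a minimal sketch in plain Python. The function name `gd_momentum`, the test function f(p1, p2) = (p1 - 1)^2 + 2(p2 + 2)^2, and the values chosen for `lr`, `beta`, and `steps` are all assumptions picked for illustration; they do not come from the lecture.

```python
def gd_momentum(grad, p, lr=0.1, beta=0.9, steps=500):
    """Gradient descent with momentum (exponentially weighted average form).

    grad: function returning the gradient as a list, given parameters p
    lr, beta, steps: illustrative defaults, not values from the lecture
    """
    p = list(p)
    v = [0.0] * len(p)  # running average of past gradients
    for _ in range(steps):
        g = grad(p)
        # Blend the new gradient into the exponentially weighted average.
        v = [beta * vi + (1 - beta) * gi for vi, gi in zip(v, g)]
        # Step along the smoothed gradient instead of the raw one.
        p = [pi - lr * vi for pi, vi in zip(p, v)]
    return p


def grad_f(p):
    # Gradient of the hypothetical f(p1, p2) = (p1 - 1)^2 + 2 * (p2 + 2)^2
    return [2 * (p[0] - 1), 4 * (p[1] + 2)]


result = gd_momentum(grad_f, [5.0, 3.0])
```

With these settings `result` should end up close to the minimizer (1, -2). Setting `beta=0` removes the averaging and recovers plain gradient descent, which makes it easy to compare the two.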
