suninhouse

Note that softmax is called "soft"-max for a reason: it is a smoothed version of the hard arg max, and it approaches the hard arg max asymptotically as the inputs grow. If we encode the hard arg max as a one-hot vector, argmax(z_1, ..., z_n) = (0, ..., 1, ..., 0), then it makes sense that each softmax output lies between 0 and 1 (and that the outputs sum to 1).
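
Here is a minimal numpy sketch (my own, not from the lecture) illustrating both points: the outputs lie in (0, 1) and sum to 1, and scaling the inputs up pushes softmax toward the hard one-hot arg max.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; each output is in (0, 1)
    # and the outputs sum to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.5])
print(softmax(z))       # ~[0.63, 0.23, 0.14]: a "soft" arg max
print(softmax(10 * z))  # ~[1.00, 0.00, 0.00]: scaling sharpens toward one-hot
```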

blipblop

Something Kayvon mentioned in class that I hadn't realized about these error functions: they are designed to penalize the classifier heavily when it is confidently wrong. So if the true class is not the first one, outputting [0.25, 0.25, 0.25, 0.25] is not as bad as outputting [1, 0, 0, 0].
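
A quick sketch using cross-entropy loss (which, if I understand correctly, is what the error function amounts to for one-hot labels). Suppose the true class is index 1:

```python
import numpy as np

def cross_entropy(pred, true_class, eps=1e-12):
    # Loss is -log of the probability assigned to the true class;
    # eps avoids taking log(0).
    return -np.log(pred[true_class] + eps)

uniform = np.array([0.25, 0.25, 0.25, 0.25])
confident_wrong = np.array([1.0, 0.0, 0.0, 0.0])

print(cross_entropy(uniform, 1))          # ~1.39: mild penalty for hedging
print(cross_entropy(confident_wrong, 1))  # ~27.6: huge penalty for confident error
```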

assignment7

I think another way to understand the softmax classifier is that its loss is the cross-entropy between two distributions: one is the ground-truth (one-hot) distribution, and the other is the (soft) confidence vector, i.e., the softmax output.
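
Concretely, the cross-entropy H(p, q) = -sum_i p_i log q_i collapses to -log q_true when p is one-hot. A small sketch (mine, not the course code):

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    # H(p, q) = -sum_i p_i * log(q_i); eps avoids log(0).
    return -np.sum(p * np.log(q + eps))

p = np.array([0.0, 1.0, 0.0, 0.0])  # ground-truth one-hot distribution
q = np.array([0.1, 0.7, 0.1, 0.1])  # softmax output (confidence vector)

print(cross_entropy(p, q))  # ~0.357
print(-np.log(q[1]))        # same value: -log q_true
```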
