In the backpropagation part, the first line of code reads:
dtanh = softmaxOutput.diff(forward[len(forward)-1][2], y)
So the last layer's output is passed through tanh and then sent to softmax?
I would guess that for the last layer there is no need to apply tanh before the softmax, so the code should be:
dtanh = softmaxOutput.diff(forward[len(forward)-1][1], y)
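For context, here is a minimal sketch of the distinction the question is drawing. It assumes (hypothetically, since the repo's actual layout isn't shown here) that forward[-1] stores a tuple like (x, z, tanh(z)), where z is the last layer's pre-activation, and that softmaxOutput.diff(scores, y) returns the usual cross-entropy gradient softmax(scores) - one_hot(y):

import numpy as np

def softmax(z):
    # Numerically stable softmax
    e = np.exp(z - np.max(z))
    return e / e.sum()

def softmax_diff(scores, y):
    # Gradient of cross-entropy loss w.r.t. the softmax input:
    # softmax(scores) - one_hot(y)
    p = softmax(scores)
    p[y] -= 1.0
    return p

# Hypothetical forward cache mirroring the question's indexing:
# forward[-1] = (x, z, a), with a = tanh(z)
x = np.array([0.5, -1.0])
z = np.array([2.0, -0.5, 1.0])   # pre-activation of the last layer
a = np.tanh(z)                    # tanh-activated output
forward = [(x, z, a)]
y = 0                             # true class index

# What the repo's line does: softmax over the tanh-activated output
dtanh_via_tanh = softmax_diff(forward[-1][2], y)

# What the question proposes: softmax directly over the pre-activation
dtanh_via_logits = softmax_diff(forward[-1][1], y)

One practical motivation for the proposed change: tanh bounds its output to (-1, 1), so feeding tanh(z) into the softmax compresses the logit range and limits how confident the resulting distribution can be, whereas the raw pre-activation z is unbounded.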