Extra tanh? #3

@dhzyingz

Description

In the backpropagation part, the first line of code is:

dtanh = softmaxOutput.diff(forward[len(forward)-1][2], y)

So the output is passed through tanh and then sent to softmax?
I guess for the last layer there is no need to apply tanh before the softmax, so the code would be:

dtanh = softmaxOutput.diff(forward[len(forward)-1][1], y)
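
For illustration, here is a minimal, self-contained sketch of the gradient being discussed. It assumes softmaxOutput.diff computes the cross-entropy gradient softmax(z) - one_hot(y), and that forward[t] stores (x_t, score_t, tanh(score_t)); both the function's behavior and the tuple layout are assumptions about this repo, used only to show the difference between taking the gradient at index 1 versus index 2.

    import numpy as np

    def softmax(z):
        e = np.exp(z - np.max(z))   # shift for numerical stability
        return e / e.sum()

    def softmax_diff(z, y):
        """Gradient of cross-entropy loss w.r.t. the softmax input z."""
        grad = softmax(z)
        grad[y] -= 1.0              # softmax(z) - one_hot(y)
        return grad

    # Hypothetical per-step cache mirroring the indexing in the issue:
    # forward[t] = (x_t, score_t, tanh(score_t))
    score = np.array([0.5, -1.0, 2.0])
    forward = [(None, score, np.tanh(score))]
    y = 2

    # Gradient taken at the tanh output (index 2), as in the current code ...
    d_at_tanh = softmax_diff(forward[-1][2], y)
    # ... versus at the raw score (index 1), as suggested above:
    d_at_score = softmax_diff(forward[-1][1], y)

If the softmax is meant to consume the raw score directly, only the index-1 version matches the usual softmax-with-cross-entropy derivation; applying tanh first would squash the scores and change the gradient.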
