I have tried to implement your functions, and I really like your work. However, I find that when using the han.show_word_attention() function, the attention weights do not add up to 1. As I understand it, they should, because they are supposed to be the softmax probabilities of the attention for each word. Do you know how I might fix this?
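
To make clear what I expected, here is a minimal sketch in plain NumPy. It is not the library's code: the `softmax` helper and the `scores` values are made up for illustration, and I am assuming the word attention is an ordinary softmax over the words of one sentence.

```python
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    shifted = scores - np.max(scores)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical raw attention scores for a five-word sentence.
scores = np.array([0.5, 1.2, -0.3, 2.0, 0.1])
weights = softmax(scores)
total = weights.sum()

print(weights)
print(total)
```

Here `total` equals 1 up to floating-point error, which is what I expected to see for the weights returned for each sentence.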
Best regards,
Malte