@FengDSP commented Oct 24, 2023

This PR fixes an inconsistency in the shape of the masked_tokens array within the decoder's masked multi-head attention kernel. The expected shape of masked_tokens is [batch_size, session_length]; the current implementation, however, indexes it as [batch_size, memory_length]. The mismatch causes incorrect behavior whenever memory_length is configured differently from session_length.
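To illustrate why the row stride matters, here is a minimal sketch, not the repo's actual kernel code; all identifiers except masked_tokens, batch_size, session_length, and memory_length are hypothetical stand-ins for whatever the real kernel uses:

```cpp
#include <cstddef>

// Hypothetical helper: masked_tokens is laid out row-major as
// [batch_size, session_length], so the row stride must be session_length.
inline bool is_token_masked(const bool* masked_tokens,
                            std::size_t batch_idx,
                            std::size_t time_idx,
                            std::size_t session_length)
{
    // Correct indexing: stride rows by session_length, matching the
    // shape the caller actually allocates.
    return masked_tokens[batch_idx * session_length + time_idx];

    // The buggy variant strides rows by memory_length instead, i.e.
    //   masked_tokens[batch_idx * memory_length + time_idx];
    // When memory_length != session_length, every batch after the
    // first reads from the wrong row of the mask.
}
```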
