More straightforward custom_collate_function (chapter 7) #683
Replies: 1 comment
-
Thanks for the feedback, and sorry for the late response. This must have gotten lost in my GitHub notifications back then. If I see it correctly, the difference is in how the inputs and targets are built via

```python
# your version
for item in batch:
    padded_item = (
        item.copy() +
        [pad_token_id] * (batch_max_length - len(item))
    )
    inputs = torch.tensor(padded_item)
    # add an extra pad_token_id
    targets = torch.tensor(padded_item[1:] + [pad_token_id])
```

and

```python
# my version
for item in batch:
    new_item = item.copy()
    new_item += [pad_token_id]
    padded = (
        new_item + [pad_token_id] *
        (batch_max_length - len(new_item))
    )
    inputs = torch.tensor(padded[:-1])
    targets = torch.tensor(padded[1:])
```

Looking at it, I think your version is indeed pretty intuitive: it constructs the targets from the inputs and then just appends the padding token. Mine basically adds the padding token first and then removes it from the inputs. Why did I do it this way? I don't recall, but I assume it was the first thing that came to mind. Perhaps it was a bit easier to show that the inputs and targets have the same length but are shifted by one position. Anyway, thanks for the suggestion; in hindsight, I like your version.
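For anyone reading along, here is a self-contained sketch comparing the two variants on a toy batch. The function names (`collate_draft_v1`/`collate_draft_v2`), the `pad_token_id` value (50256, GPT-2's `<|endoftext|>`), and the `torch.stack` batching are assumptions for illustration only, and the book's `-100` masking step is omitted since it would apply identically to both versions:

```python
import torch

pad_token_id = 50256  # assumed: GPT-2's <|endoftext|> token id

def collate_draft_v1(batch):
    # suggested version: pad to the longest sequence, then build targets by
    # shifting the padded inputs left and appending one extra pad token
    batch_max_length = max(len(item) for item in batch)
    inputs_lst, targets_lst = [], []
    for item in batch:
        padded_item = (
            item.copy() +
            [pad_token_id] * (batch_max_length - len(item))
        )
        inputs_lst.append(torch.tensor(padded_item))
        targets_lst.append(torch.tensor(padded_item[1:] + [pad_token_id]))
    return torch.stack(inputs_lst), torch.stack(targets_lst)

def collate_draft_v2(batch):
    # book-style version: append one pad token first, pad, then drop the
    # last element for the inputs and the first element for the targets
    batch_max_length = max(len(item) + 1 for item in batch)
    inputs_lst, targets_lst = [], []
    for item in batch:
        new_item = item.copy()
        new_item += [pad_token_id]
        padded = (
            new_item + [pad_token_id] *
            (batch_max_length - len(new_item))
        )
        inputs_lst.append(torch.tensor(padded[:-1]))
        targets_lst.append(torch.tensor(padded[1:]))
    return torch.stack(inputs_lst), torch.stack(targets_lst)

batch = [[1, 2, 3], [4, 5], [7]]
in1, tg1 = collate_draft_v1(batch)
in2, tg2 = collate_draft_v2(batch)
# both variants yield identical inputs and targets for this batch
assert torch.equal(in1, in2) and torch.equal(tg1, tg2)
```

Note that both variants produce inputs and targets of the same shape, shifted by one position relative to each other, which is the invariant the collate function needs to preserve.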
-
Hello!
I personally found the `custom_collate_function` introduced in the book a bit confusing. To be more specific, the way we add and then ignore the `pad_token_id` was not really straightforward for me (the idea behind it was clear, but I had to read the code multiple times to understand it), so I modified it slightly and am sharing it here. I validated it by comparing its output against the output of the original collate function introduced in the book for several inputs and got the same results. Was there any specific reason the padding was done that way in the book?
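For context, the "ignoring" step that tripped me up can be sketched in isolation. As far as I can tell, the book's function replaces every padding token in the targets except the first one with `-100`, which PyTorch's `cross_entropy` skips by default (`ignore_index=-100`). A minimal sketch, assuming `pad_token_id = 50256` (GPT-2's `<|endoftext|>`):

```python
import torch

ignore_index = -100
pad_token_id = 50256  # assumed: GPT-2's <|endoftext|> token id

targets = torch.tensor([5, 50256, 50256, 50256])

# find all padding positions and mask out all but the first one
mask = targets == pad_token_id
indices = torch.nonzero(mask).squeeze()
if indices.numel() > 1:
    targets[indices[1:]] = ignore_index

# targets is now tensor([5, 50256, -100, -100]): the first pad token still
# acts as an end-of-text label, the rest are ignored by the loss
```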