**Describe the bug**

As I mentioned in this issue, the default values of `top_p` and `temperature` are not guaranteed to be 1. Therefore, the code below receives modified logits, i.e., a distribution that has already been processed according to the `generation_config` on the Hugging Face side.
LMFlow/src/lmflow/models/hf_decoder_model.py, lines 382 to 405 at 1b223f7:
```python
if self.use_accelerator:
    outputs = self.backend_model.generate(
        input_ids=inputs,
        pad_token_id=self.tokenizer.pad_token_id,
        *args,
        **kwargs
    )
else:
    if self.device == "gpu":
        outputs = self.ds_engine.module.generate(
            input_ids=inputs,
            synced_gpus=True,
            pad_token_id=self.tokenizer.pad_token_id,
            *args,
            **kwargs
        )
    elif self.device == "cpu":
        outputs = self.backend_model.generate(
            input_ids=inputs,
            synced_gpus=True,
            pad_token_id=self.tokenizer.pad_token_id,
            *args,
            **kwargs
        )
```
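To illustrate, here is a minimal standalone sketch of the problem, assuming a checkpoint that ships non-neutral sampling defaults (for example, meta-llama/Llama-2-7b-chat-hf sets `temperature=0.6` and `top_p=0.9` in its `generation_config.json`); the model name is only an example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint (assumption): its shipped generation_config.json
# sets temperature=0.6 and top_p=0.9 rather than the neutral 1.0 / 1.0.
name = "meta-llama/Llama-2-7b-chat-hf"
model = AutoModelForCausalLM.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

print(model.generation_config.temperature, model.generation_config.top_p)

inputs = tokenizer("Hello", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=1,
    return_dict_in_generate=True,
    output_scores=True,
)
# out.scores[0] holds the *processed* logits: temperature scaling and
# top_p filtering from generation_config have already been applied, so
# tokens outside the nucleus are set to -inf.
print(out.scores[0])
```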
Worse, `top_p` and `temperature` are applied again in `score_to_prob`, resulting in an unexpected distribution:
LMFlow/src/lmflow/pipeline/inferencer.py, lines 435 to 440 at 1b223f7:
```python
for _ in range(num_new_tokens):
    pred = self.predict_next_token(model=model, input_ids=sequence, num_new_tokens=1)  # predict next one token
    prob = self.score_to_prob(pred.scores[0], temperature=temperature)
    sampled = self.sample(prob=prob, num_samples=1)
    new_tokens.append(sampled)
    sequence = torch.cat([sequence, sampled['sampled_token']], dim=1)
```
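To see how the double application skews the result, consider temperature alone: if `generate` already divides the logits by `T` and `score_to_prob` divides by `T` again, the sampled distribution corresponds to an effective temperature of `T**2`, not `T`. A minimal sketch with made-up logits:

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.0])
T = 0.7

scores = logits / T                       # what generate() already returns in pred.scores[0]
prob = torch.softmax(scores / T, dim=-1)  # what score_to_prob then computes on top

# Equivalent to sampling at temperature T**2 rather than T:
assert torch.allclose(prob, torch.softmax(logits / (T * T), dim=-1))
```

One possible fix would be to pass neutral overrides (e.g. `temperature=1.0, top_p=1.0`) down to `generate` so that the returned scores are raw logits and `score_to_prob` is the only stage that reshapes the distribution.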