
Ignore decoder_inputs_embeds when decoder_input_ids are present (fixes #39542) #39566


Conversation

@JoseAlvarezMedina JoseAlvarezMedina commented Jul 21, 2025

Fixes #39542

What does this PR do?

When using a custom seq2seq model together with PEFT/LoRA, both decoder_input_ids and decoder_inputs_embeds can end up being passed to the underlying decoder. This trips the model's internal validation check:

ValueError: You cannot specify both decoder_input_ids and decoder_inputs_embeds at the same time

This PR adds a small defensive guard in Seq2SeqTrainer.compute_loss that drops decoder_inputs_embeds when decoder_input_ids is also present. This preserves backward compatibility and matches the user expectation that decoder_input_ids should take precedence.

Implementation details

```python
if "decoder_input_ids" in inputs and "decoder_inputs_embeds" in inputs:
    inputs.pop("decoder_inputs_embeds")
```

A new test in tests/trainer_seq2seq_test.py builds a small Marian model, simulates the conflicting inputs, and verifies that training proceeds without raising the exception.
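For illustration, here is a minimal standalone sketch of the guard's behavior, assuming `inputs` is a plain dict (the helper name `drop_conflicting_embeds` is hypothetical; in the PR the check lives inline in Seq2SeqTrainer.compute_loss):

```python
def drop_conflicting_embeds(inputs):
    # If both keys are present, decoder_input_ids takes precedence and
    # decoder_inputs_embeds is discarded before the model is called.
    if "decoder_input_ids" in inputs and "decoder_inputs_embeds" in inputs:
        inputs.pop("decoder_inputs_embeds")
    return inputs

# Conflicting inputs: the embeds entry is dropped in place.
inputs = {"decoder_input_ids": [[0, 5, 7]], "decoder_inputs_embeds": [[0.1, 0.2]]}
drop_conflicting_embeds(inputs)
# Only one key present: the dict is left untouched.
embeds_only = drop_conflicting_embeds({"decoder_inputs_embeds": [[0.1, 0.2]]})
```

The guard mutates the dict in place, so the same object can be forwarded to the model afterwards without reconstructing the batch.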

Motivation
This pattern appears in real-world usage when composing an encoder from one model and a Marian (or similar) decoder while applying PEFT/LoRA. Making the Trainer resilient avoids forcing each user to patch their own subclass.

Tests
pytest tests/trainer_seq2seq_test.py -q passes locally.

Additional notes
Happy to adjust the location of the guard (e.g. move it to the base Trainer) if reviewers prefer.

Who can review?
Tagging trainer maintainers for visibility: @zach-huggingface @SunMarc

@Rocketknight1
Member

It's unclear that this fixes the issue! Please don't run code agents on user issues when the cause hasn't been fully identified, they have a strong tendency to fix the wrong thing.
