@phi-jkim phi-jkim commented Aug 14, 2025

What does this PR do?

Trainable Runner

  • GradComponent Integration: Runner now inherits from GradComponent
  • New forward() method for optimization (runner.py:792-950)
  • Chain Predecessors: Each step's output becomes a trainable predecessor for the next step, enabling gradient flow
  • Parameter Wrapping: Final results are wrapped in OutputParameter with proper gradient functions configured
  • Backward Context: Automatic setup of BackwardContext for gradient computation with prompt templates and backward engines
  • Commented out CombineStepHistoryAndRunnerResult, which follows the original ReAct agent's approach to optimization
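
The predecessor chaining described above can be illustrated with a toy sketch. It is not the code in runner.py and does not use the AdalFlow API: `ToyParameter`, `run_steps`, and the feedback strings are hypothetical names invented for illustration. It only shows the general idea that each step's output records the previous output as its predecessor, so feedback given to the final result can travel back to the first input.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ToyParameter:
    """Hypothetical stand-in for an output parameter (not an AdalFlow class)."""
    name: str
    data: str
    predecessors: List["ToyParameter"] = field(default_factory=list)
    feedback: List[str] = field(default_factory=list)
    grad_fn: Optional[Callable[[str], None]] = None

    def backward(self, message: str) -> None:
        # Record the feedback locally, then pass it on through grad_fn
        # so that it reaches this parameter's predecessor.
        self.feedback.append(message)
        if self.grad_fn is not None:
            self.grad_fn(message)


def run_steps(task: str, steps: List[Callable[[str], str]]) -> ToyParameter:
    """Toy forward pass: each step's output becomes the next step's predecessor."""
    current = ToyParameter(name="input", data=task)
    for i, step in enumerate(steps):
        prev = current
        out = ToyParameter(
            name=f"step_{i}",
            data=step(prev.data),
            predecessors=[prev],
        )
        # Bind prev as a default argument so each closure keeps its own predecessor.
        out.grad_fn = lambda msg, p=prev: p.backward(msg)
        current = out
    return current


# Two hypothetical steps standing in for agent steps.
final = run_steps("2+2", [lambda s: s + " -> think", lambda s: s + " -> answer"])
final.backward("answer was too terse")
```

After the call to `final.backward(...)`, every parameter in the chain, including the original input, holds the feedback message. In the real Runner the feedback is presumably produced by a backward engine rather than passed through unchanged; that part is omitted here.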

Runner Trainer

  • New RunnerTrainer class provides a generic interface for training Runner models (runner_trainer.py:36-100)
  • Comparison between the new Runner workflow and the original ReActAgent training

Before submitting
  • Was this discussed/agreed via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
