Add RFC 000 #44
LGTM. Minor comments
rfcs/000-project-phases.md
Outdated
This project aims at standardizing environments for both training and evaluation. In the training space, this also means standardizing reward pipelines; in the eval space, it means helping with reproducibility, so that a model can be shipped with a complete set of agentic evals that others can easily run.

### The problem with abstraction boundaries
Ideally, we would draw a boundary between environments and everything else (orchestration, resource allocation, RPCs, etc.). We will try to do this as much as possible, but we will have to create additional interfaces so that if folks want to cross this boundary, they can. This will likely be necessary for things like reward pipelines that call reward models (which will very likely need to RPC to GPU machines), as well as for agentic evals like Tau, where the eval itself involves two agents interacting with one another (and sending many RPCs).
Let's add: interfaces for container providers, which we will also need to support.
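To make the suggestion concrete, here is a minimal sketch of what a container provider interface could look like. Nothing here comes from the RFC: `ContainerProvider`, `ContainerHandle`, `InMemoryProvider`, `run_in_container`, and the `start`/`stop` method names are all hypothetical, and the in-memory provider is only a stand-in for a real backend.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class ContainerHandle:
    """Opaque reference to a running container (hypothetical type)."""

    container_id: str
    base_url: str


class ContainerProvider(Protocol):
    """Hypothetical provider interface; the method names are assumptions."""

    def start(self, image: str) -> ContainerHandle: ...

    def stop(self, handle: ContainerHandle) -> None: ...


class InMemoryProvider:
    """Fake provider for tests; a real one would wrap an actual container backend."""

    def __init__(self) -> None:
        self.running: Dict[str, str] = {}
        self._next_id = 0

    def start(self, image: str) -> ContainerHandle:
        self._next_id += 1
        container_id = f"fake-{self._next_id}"
        self.running[container_id] = image
        return ContainerHandle(container_id, f"http://localhost/{container_id}")

    def stop(self, handle: ContainerHandle) -> None:
        self.running.pop(handle.container_id, None)


def run_in_container(provider: ContainerProvider, image: str) -> str:
    """Caller code depends only on the interface, not on a concrete provider."""
    handle = provider.start(image)
    try:
        return handle.base_url
    finally:
        # Always release the container, even if the body above raises.
        provider.stop(handle)
```

The point of the sketch is only that environment code would talk to the interface, so swapping providers would not touch the environment itself.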
2. Nailing our tools support
3. Landing the basics of _sandboxing_, _versioning_, _binary distribution_, _dependency management_.

We will conclude this phase with version 0.3.
Is it a convention to bump by 0.3 for every phase? Just curious.
Eh, I just thought: 3 phases before 1.0, so let's do 0.3 per phase :D After the third phase we are at 0.9, which leaves 0.9 --> 1.0 for the final changes.
rfcs/000-project-phases.md
Outdated
In the **first phase** of this project, we will focus **exclusively** on the narrowest definition of environments, without even worrying about rewards or evals. Instead, the focus in this phase (and in the RFCs you find in this directory) is going to be on:
1. Establishing a convention on what is an environment and where we draw the "environment" box.
2. Nailing our tools support
Suggestion: we can be precise here, with "tools" meaning MCP tools as well as local tools.
rfcs/000-project-phases.md
Outdated
We will group development from now until version 1.0 into three phases.

In the **first phase** of this project, we will focus **exclusively** on the narrowest definition of environments, without even worrying about rewards or evals. Instead, the focus in this phase (and in the RFCs you find in this directory) is going to be on:
1. Establishing a convention on what is an environment and where we draw the "environment" box.
One more thing for us to cover in phase 1 is the RPC method. In the current iteration we only have HTTP, but it's possible that we will need a long-running session instead of request/response. This pattern applies to any interpreted language: bash, python, ruby, etc. We have taken an opinionated approach with pythonExec, but I don't think we can skip bash or other languages.
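As a rough illustration of the difference between request/response and a long-running session, here is a minimal sketch that keeps one shell process alive across calls. It is not how the project works today: `ShellSession` and its methods are hypothetical names, a POSIX `sh` stands in for bash, and the sentinel-based framing is just one simple way to detect the end of a command's output.

```python
import subprocess
import uuid


class ShellSession:
    """Hypothetical long-running session: one shell process serves many commands."""

    def __init__(self, argv=("sh",)):
        # stderr is merged into stdout so a single stream carries all output.
        self._proc = subprocess.Popen(
            list(argv),
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            bufsize=1,
        )

    def run(self, command: str) -> str:
        # A unique sentinel marks the end of this command's output.
        sentinel = f"__done_{uuid.uuid4().hex}__"
        self._proc.stdin.write(f"{command}\necho {sentinel}\n")
        self._proc.stdin.flush()
        chunks = []
        while True:
            line = self._proc.stdout.readline()
            if not line:  # EOF: the shell exited
                break
            stripped = line.rstrip("\n")
            if stripped.endswith(sentinel):
                # Keep any output that did not end with a newline.
                chunks.append(stripped[: -len(sentinel)])
                break
            chunks.append(line)
        return "".join(chunks)

    def close(self) -> None:
        self._proc.stdin.close()
        self._proc.wait()


if __name__ == "__main__":
    session = ShellSession()
    session.run("GREETING=hello")        # state set in one call...
    print(session.run('echo "$GREETING"'), end="")  # ...is visible in the next
    session.close()
```

Because every `run` call reuses the same process, shell variables and the working directory survive between calls. A stateless design that spawns a fresh `sh -c` per request would lose `GREETING` after the first call.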