@Darktex Darktex commented Oct 17, 2025

No description provided.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 17, 2025

@pankit-eng pankit-eng left a comment

LGTM. Minor comments

This project aims at standardizing environments for both training and evaluation. In the training space, this means also standardizing reward pipelines, while in the eval space this means helping with reproducibility where a model can be shipped with a complete set of agentic evals that can be easily run by others.

### The problem with abstraction boundaries
Ideally, we would draw a boundary between environments and everything else (orchestration, resource allocation, RPCs, etc.). We will try to do this as much as possible, but we will have to create additional interfaces so that folks who need to cross this boundary can. This will likely be necessary for reward pipelines that call reward models (which will very likely need to make RPCs to GPU machines), as well as for agentic evals like Tau, where the eval itself involves two agents interacting with one another (and sending many RPCs).

Let's add: interfaces for container providers, which we will also need to support.
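
To make the suggestion concrete, a container-provider interface could look roughly like the sketch below. Everything in it is hypothetical and not taken from the PR: the names `ContainerProvider`, `ContainerHandle`, `InMemoryProvider`, and `run_in_container`, the `start`/`stop` methods, and the URL format are illustrative assumptions only.

```python
import itertools
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ContainerHandle:
    """Hypothetical handle returned by a provider once a container is up."""
    container_id: str
    base_url: str


class ContainerProvider(Protocol):
    """Hypothetical interface; a Docker- or cluster-backed provider would implement it."""

    def start(self, image: str) -> ContainerHandle: ...

    def stop(self, handle: ContainerHandle) -> None: ...


class InMemoryProvider:
    """Fake provider that only records state, useful for tests (assumption)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.running: dict[str, str] = {}

    def start(self, image: str) -> ContainerHandle:
        container_id = f"fake-{next(self._ids)}"
        self.running[container_id] = image
        return ContainerHandle(container_id, f"http://localhost/{container_id}")

    def stop(self, handle: ContainerHandle) -> None:
        self.running.pop(handle.container_id, None)


def run_in_container(provider: ContainerProvider, image: str) -> str:
    """Start a container, return its URL, and always stop it afterwards."""
    handle = provider.start(image)
    try:
        return handle.base_url
    finally:
        provider.stop(handle)
```

With an interface of this shape, environment code would depend only on `ContainerProvider`, so swapping the backing runtime would not touch the environment itself.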

2. Nailing our tools support
3. Landing the basics of _sandboxing_, _versioning_, _binary distribution_, _dependency management_.

We will conclude this phase with version 0.3.

Is it a convention to bump by 0.3 for every phase? Just curious.

Contributor Author

Eh, I just thought: 3 phases before 1.0, so let's do 0.3 per phase :D At that point, we have 0.9 --> 1.0 for the final changes.


In the **first phase** of this project, we will focus **exclusively** on the narrowest definition of environments, without even worrying about rewards or evals. Instead, the focus in this phase (and in the RFCs you find in this directory) is going to be on:
1. Establishing a convention on what an environment is and where we draw the "environment" box.
2. Nailing our tools support

Suggestion: we can be precise here and say that "tools" means MCP tools as well as local tools.

We will group development from now till version 1.0 into three phases.

In the **first phase** of this project, we will focus **exclusively** on the narrowest definition of environments, without even worrying about rewards or evals. Instead, the focus in this phase (and in the RFCs you find in this directory) is going to be on:
1. Establishing a convention on what an environment is and where we draw the "environment" box.

One more thing for us to cover in phase 1 is the RPC method. In the current iteration we only have HTTP, but it's possible that we might need a long-running session instead of request/response. This pattern applies to any interpreted language: bash, Python, Ruby, etc. We have taken an opinionated approach with pythonExec, but I don't think we can skip bash or other languages.
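
A minimal sketch of the difference, assuming an in-process interpreter stands in for the remote one: the stateless executor mimics request/response (nothing survives between calls), while the session executor keeps one interpreter alive. The class names `StatelessExecutor` and `SessionExecutor` are hypothetical and not part of the PR; the sketch covers only the state semantics, not the transport.

```python
import contextlib
import io
from code import InteractiveInterpreter


def _execute(interp: InteractiveInterpreter, source: str) -> str:
    """Run source in the given interpreter and return captured stdout/stderr."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf), contextlib.redirect_stderr(buf):
        interp.runsource(source, "<session>", "exec")
    return buf.getvalue()


class StatelessExecutor:
    """Request/response style: every call gets a fresh interpreter."""

    def run(self, source: str) -> str:
        return _execute(InteractiveInterpreter(), source)


class SessionExecutor:
    """Long-running style: one interpreter, so variables persist across calls."""

    def __init__(self) -> None:
        self._interp = InteractiveInterpreter()

    def run(self, source: str) -> str:
        return _execute(self._interp, source)


if __name__ == "__main__":
    session = SessionExecutor()
    session.run("x = 41")
    print(session.run("print(x + 1)"), end="")

    stateless = StatelessExecutor()
    stateless.run("x = 41")
    # x was defined in a different interpreter, so this reports a NameError.
    print(stateless.run("print(x + 1)"), end="")
```

The same shape would carry over to bash or Ruby by keeping a shell or interpreter process open for the lifetime of the session, which is where a request/response-only transport starts to get awkward.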

@Darktex Darktex merged commit a3be2a7 into main Oct 21, 2025
1 check passed