MagellaX

  • add shared AgentAction schema + dispatcher with pydantic v2 validation
  • move s1/s2/s2.5 prompts and workers to schema-checked JSON output and structured response mode
  • update engine adapters and utilities for schema-aware parsing and telemetry, and bump requirements
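
For readers who have not opened the diff, here is a minimal sketch of what a shared action schema with a dispatcher could look like under pydantic v2. The model names (`Click`, `Hotkey`, `Wait`), their fields, and the `aci.*` strings are hypothetical stand-ins chosen for illustration; they are not taken from this PR's code.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


# Hypothetical action models; the PR's real schema may differ.
class Click(BaseModel):
    type: Literal["click"]
    x: int = Field(ge=0)
    y: int = Field(ge=0)


class Hotkey(BaseModel):
    type: Literal["hotkey"]
    keys: list[str] = Field(min_length=1)


class Wait(BaseModel):
    type: Literal["wait"]
    seconds: float = Field(gt=0, le=60)


# Discriminated union: the "type" field selects which model validates the payload.
AgentAction = Annotated[Union[Click, Hotkey, Wait], Field(discriminator="type")]
_ADAPTER = TypeAdapter(AgentAction)


def parse_action(raw: str):
    """Validate one JSON object; raises ValidationError on bad JSON or a schema mismatch."""
    return _ADAPTER.validate_json(raw)


def dispatch(action) -> str:
    """Map a validated action to an ACI call (returned as a string in this sketch)."""
    if isinstance(action, Click):
        return f"aci.click({action.x}, {action.y})"
    if isinstance(action, Hotkey):
        return f"aci.hotkey({', '.join(repr(k) for k in action.keys)})"
    return f"aci.wait({action.seconds})"
```

The discriminated union keeps validation errors specific to the action the model claimed to emit, and the dispatcher only ever sees objects that already passed validation.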

@MagellaX
Author

@alckasoc any thoughts? You can merge this.

@alckasoc
Collaborator

alckasoc commented Oct 4, 2025

Hi @MagellaX, sorry for the late response! Thank you for the contribution!

Some questions:

  • From what I understand, this PR is to enforce JSON outputs from the LLM calls, right?
  • If possible, can you implement this for S3? The other modules (s1, s2, s2_5) can be left alone.
  • Have you tested the JSON output format? How does it handle wrongly formatted outputs?
  • The JSON structure may be nice to have but does seem to introduce ~1k lines of code. Is there a way for this to be implemented more concisely?

@davidlunceford10

davidlunceford10 commented Oct 5, 2025

It's interesting how you moved from a structured to a freeform architecture. I'm newer to programming, so I'm enjoying learning more about how different frameworks and architectures are designed.

@MagellaX
Author

MagellaX commented Oct 5, 2025

Totally aligned: the goal is one shared schema and exactly one JSON object per step, validated with pydantic and dispatched to the ACI. That means no more eval'd Python and fewer parse bugs. It stays model-agnostic: Structured Outputs when available, strict post-validation plus a fallback elsewhere.

If you prefer S3-only, that's easy: a thin S3 adapter injects the schema (response_format on OpenAI/Azure, or a short JSON-only prompt), validates the result, and maps it to S3's ACI via a tiny dispatcher. We can leave s1/s2/s2_5 untouched or hide the change behind a feature flag (AGENT_JSON_CALLS=1, default off) so you can merge this for now.

I've exercised the parser on click/dblclick/hotkey/wait and on malformed outputs: valid payloads execute cleanly; bad JSON or schema mismatches raise a typed error, which we log, then we issue a small WAIT and reprompt once (idempotent via meta.idempotency_key).
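
To make the malformed-output path concrete, here is a rough, dependency-free sketch of one way to order it: parse, log a typed error, reprompt once, and fall back to a short WAIT if the retry also fails. Everything except the `AGENT_JSON_CALLS` flag name is an assumption made for illustration (`ActionParseError`, `next_action`, the `ask_model` callable, the 1-second WAIT). It uses the standard library rather than pydantic and leaves out the idempotency key.

```python
import json
import os
from typing import Callable


class ActionParseError(ValueError):
    """Typed error for bad JSON or a payload that is not a known action."""


ALLOWED_TYPES = {"click", "dblclick", "hotkey", "wait"}
WAIT_FALLBACK = {"type": "wait", "seconds": 1.0}  # assumed duration


def json_calls_enabled() -> bool:
    """Feature flag: off unless AGENT_JSON_CALLS=1 is set in the environment."""
    return os.environ.get("AGENT_JSON_CALLS", "0") == "1"


def parse_action(raw: str) -> dict:
    """Return the decoded action, or raise ActionParseError."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ActionParseError(f"invalid JSON: {exc}") from exc
    if not isinstance(payload, dict) or payload.get("type") not in ALLOWED_TYPES:
        raise ActionParseError("payload is not a known action object")
    return payload


def next_action(ask_model: Callable[[str], str], prompt: str, log: list) -> dict:
    """Ask the model; on a parse failure, log and reprompt once, then fall back to WAIT."""
    for attempt in range(2):  # first try plus exactly one reprompt
        raw = ask_model(prompt)
        try:
            return parse_action(raw)
        except ActionParseError as exc:
            log.append(f"attempt {attempt + 1}: {exc}")
            prompt = prompt + "\nReturn exactly one JSON object matching the schema."
    return dict(WAIT_FALLBACK)
```

Capping the loop at two attempts keeps a persistently malformed model from stalling the agent; the WAIT fallback is a no-op action, so nothing unvalidated ever reaches the dispatcher.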
