junaway commented Dec 20, 2025

No description provided.

Copilot AI review requested due to automatic review settings December 20, 2025 00:10
vercel bot commented Dec 20, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| agenta-documentation | Ready | Preview, Comment | Jan 7, 2026 9:34am |

Copilot AI left a comment

Pull request overview

This PR implements and tests Daytona-based code evaluation functionality, transitioning from the legacy local sandbox to a new SDK-based approach. It includes improvements to code editor indentation handling for Python/code blocks and adds example evaluators for testing various dependencies and API endpoints.

Key Changes

  • Replaced legacy custom_code_run with new sdk_custom_code_run that uses the SDK's workflow-based evaluator system
  • Enhanced code editor to preserve exact indentation for Python/code (no transformations) while maintaining space-to-tab conversion for JSON/YAML
  • Added example evaluators for testing OpenAI, NumPy, and Agenta API endpoints in Daytona environments
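
The indentation rule in the second bullet can be sketched as follows. The PR implements this in the TypeScript editor plugins; the Python below is only a minimal illustration of the logic, and the function name, the language set, and the default of 2 spaces per tab are assumptions rather than names taken from the PR.

```python
import re

# Hypothetical language set; the PR only distinguishes Python/code from JSON/YAML.
TAB_LANGUAGES = {"json", "yaml"}


def normalize_pasted_indentation(text: str, language: str, spaces_per_tab: int = 2) -> str:
    """Return pasted text with its indentation handled per language."""
    if language not in TAB_LANGUAGES:
        # Python/code: leading whitespace is significant, so leave it untouched.
        return text

    def to_tabs(match: re.Match) -> str:
        leading = match.group(0)
        tabs, remainder = divmod(len(leading), spaces_per_tab)
        # Keep any leftover spaces so odd indentation is not silently lost.
        return "\t" * tabs + " " * remainder

    # Only the run of spaces at the start of each line is converted.
    return re.sub(r"^ +", to_tabs, text, flags=re.MULTILINE)
```

Passing a Python snippet returns it unchanged, while a JSON snippet indented with 2 spaces per level comes back with one tab per level.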

Reviewed changes

Copilot reviewed 20 out of 25 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| `api/oss/src/services/evaluators_service.py` | Implements new SDK-based custom code runner function that delegates to workflow system |
| `api/oss/src/resources/evaluators/evaluators.py` | Updates default code template with deprecation note for app_params |
| `sdk/agenta/sdk/workflows/runners/daytona.py` | Adds environment variables (OPENAI_API_KEY, AGENTA_HOST, AGENTA_CREDENTIALS) to sandbox |
| `sdk/agenta/sdk/workflows/runners/local.py` | Exposes built-in Python types (dict, list, str, etc.) to restricted environment |
| `sdk/agenta/sdk/decorators/running.py` | Adds fallback to request.credentials in credential resolution chain |
| `web/oss/src/components/Editor/plugins/code/utils/pasteUtils.ts` | Preserves exact indentation for Python/code, converts spaces to tabs for JSON/YAML |
| `web/oss/src/components/Editor/plugins/code/plugins/IndentationPlugin.tsx` | Uses 4 spaces for Python/code tab insertion, 2 spaces for JSON/YAML |
| `web/oss/src/components/Editor/plugins/code/plugins/AutoFormatAndValidateOnPastePlugin.tsx` | Skips indentation transformation for Python/code, maintains it for JSON/YAML |
| `examples/python/evaluators/openai/*.py` | Adds OpenAI SDK evaluators for testing API availability and exact match comparisons |
| `examples/python/evaluators/numpy/*.py` | Adds NumPy evaluators for testing library availability and character counting |
| `examples/python/evaluators/basic/*.py` | Adds basic evaluators using Python stdlib for string matching, length checks, JSON validation |
| `examples/python/evaluators/ag/*.py` | Adds Agenta API endpoint evaluators for health, secrets, and config endpoints |
| `examples/python/evaluators/*.md` | Provides comprehensive documentation (README, QUICKSTART, SUMMARY) for evaluators |
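
For the `local.py` row, one way a runner can expose a fixed set of built-in types to evaluator code is to pass an explicit `__builtins__` mapping to `exec`. The sketch below is an assumption about the general technique, not the SDK's actual implementation: the allow-list contents, `run_evaluator`, and the three-argument `evaluate` signature are all hypothetical. Restricting `__builtins__` this way limits which names are available but is not a security boundary on its own.

```python
# Hypothetical allow-list; the PR only says dict, list, str, etc. are exposed.
ALLOWED_BUILTINS = {
    "dict": dict, "list": list, "str": str, "int": int,
    "float": float, "bool": bool, "len": len, "isinstance": isinstance,
}

# A basic string-matching evaluator, in the spirit of examples/python/evaluators/basic.
EVALUATOR_SOURCE = '''
def evaluate(inputs, output, correct_answer):
    if not isinstance(output, str):
        return 0.0
    return float(output.strip() == correct_answer.strip())
'''


def run_evaluator(source, inputs, output, correct_answer):
    """Execute evaluator source with only the allow-listed builtins visible."""
    namespace = {"__builtins__": dict(ALLOWED_BUILTINS)}
    exec(source, namespace)  # defines evaluate() inside the restricted namespace
    return namespace["evaluate"](inputs, output, correct_answer)


score = run_evaluator(EVALUATOR_SOURCE, {}, " Paris ", "Paris")
```

Names outside the allow-list, such as `open`, raise `NameError` when the evaluator tries to use them.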

Copilot AI review requested due to automatic review settings December 23, 2025 11:39
Copilot AI left a comment

Pull request overview

Copilot reviewed 32 out of 37 changed files in this pull request and generated 8 comments.


Add standard provider keys from vault as env vars
Add templates
Fix credentials (and thus secrets and traces) in evaluator playground
Copilot AI left a comment

Pull request overview

Copilot reviewed 55 out of 63 changed files in this pull request and generated no new comments.


Copilot AI left a comment

Pull request overview

Copilot reviewed 56 out of 64 changed files in this pull request and generated 5 comments.


Copilot AI review requested due to automatic review settings January 7, 2026 09:05
junaway changed the base branch from main to release/v0.75.0 January 7, 2026 09:05
Copilot AI left a comment

Pull request overview

Copilot reviewed 53 out of 61 changed files in this pull request and generated no new comments.


Copilot AI left a comment

Pull request overview

Copilot reviewed 55 out of 63 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (6)

sdk/agenta/sdk/types.py:1

  • The import re statement appears after class definitions (line 498 shows a class ending). Move this import to the top of the file with the other imports to follow Python conventions and improve code organization.

web/oss/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx:1

  • The use of the any type defeats TypeScript's type safety. Consider defining a more specific type, or use unknown if the structure is truly dynamic and narrow it with type guards where needed.

sdk/agenta/sdk/workflows/runners/local.py:1

  • Using dict() instead of {} to create an empty dictionary is less idiomatic and slightly less efficient. Use {} for consistency with Python conventions.

web/oss/src/components/Editor/plugins/code/plugins/IndentationPlugin.tsx:1

  • The hardcoded space strings for indentation could be defined as named constants (e.g., JSON_YAML_INDENT = "  ", CODE_INDENT = "    ") to improve maintainability and make the indentation standards more explicit.

sdk/agenta/sdk/middlewares/running/vault.py:1

  • The comment # pylint: disable=bare-except is misleading because the code catches Exception rather than using a bare except clause. Remove the comment, as it is no longer accurate.

sdk/agenta/sdk/middleware/vault.py:1

  • The comment # pylint: disable=bare-except is misleading because the code catches Exception rather than using a bare except clause. Remove the comment, as it is no longer accurate.
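
The two vault.py comments rest on the difference between a bare `except:` and `except Exception:`. A minimal sketch of that difference, using hypothetical helper names that do not appear in the PR:

```python
def caught_by_bare_except(exc: BaseException) -> bool:
    """A bare except catches every BaseException, including KeyboardInterrupt."""
    try:
        raise exc
    except:  # noqa: E722 (this is the pattern pylint's bare-except flags)
        return True


def caught_by_except_exception(exc: BaseException) -> bool:
    """except Exception lets KeyboardInterrupt and SystemExit propagate."""
    try:
        raise exc
    except Exception:
        return True
    except BaseException:
        # Reached only for non-Exception errors; caught here so the demo can report it.
        return False


results = {
    "bare/ValueError": caught_by_bare_except(ValueError("bad value")),
    "bare/KeyboardInterrupt": caught_by_bare_except(KeyboardInterrupt()),
    "exception/ValueError": caught_by_except_exception(ValueError("bad value")),
    "exception/KeyboardInterrupt": caught_by_except_exception(KeyboardInterrupt()),
}
```

Only the last entry is `False`, which is why a `disable=bare-except` pragma has nothing to suppress once the handler names `Exception`.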

junaway merged commit 6497bfe into release/v0.75.0 Jan 7, 2026
5 checks passed

Labels

Evaluation example feature SDK size:XXL This PR changes 1000+ lines, ignoring generated files.

5 participants