Exclude '*mini' models from prompt_cache_retention #1345
Conversation
…djust tests
- Update model_features.get_features to skip mini variants
- Update tests to piggyback on existing coverage and validate that mini variants are excluded

Co-authored-by: openhands <openhands@all-hands.dev>

…atterns + mini exclusions
- Patterns: ['gpt-5', 'gpt-4.1'] with an inline doc reference to the actual listed models
- Exclude all '*mini' in the feature gate (covers gpt-5-mini, gpt-5.1-mini, codex-mini)
- Extend tests to include an explicit gpt-5.1-mini exclusion

Co-authored-by: openhands <openhands@all-hands.dev>

… docs; keep other minis excluded
- Update the feature gate to carve out 'gpt-5.1-codex-mini'
- Update tests to expect retention for 5.1-codex-mini

Co-authored-by: openhands <openhands@all-hands.dev>

…ix E501
- Provide find_models_by_id for tests expecting resolve_model_configs
- Wrap the long error message to satisfy Ruff E501

Co-authored-by: openhands <openhands@all-hands.dev>

- Test failure was local-only; CI doesn’t run tests/github_workflows in tests.yml
- The run-eval workflow uses resolve_model_config.py (singular) directly

Co-authored-by: openhands <openhands@all-hands.dev>
PASS (200) for all documented positives: Negative controls:
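The positive/negative verification described above could be sketched as a small check. The helper name `supports_prompt_cache_retention` and the exact model lists below are assumptions reconstructed from this thread, not the actual test code:

```python
# Models the thread reports as supporting extended prompt cache retention
POSITIVE = ["gpt-5", "gpt-4.1", "gpt-5.1-codex-mini"]
# Mini variants the thread reports as NOT supporting it (negative controls)
NEGATIVE = ["gpt-5-mini", "gpt-5.1-mini", "codex-mini"]


def supports_prompt_cache_retention(model: str) -> bool:
    # Stand-in gate mirroring the rules discussed in this PR:
    # carve out gpt-5.1-codex-mini, exclude other '*mini' variants,
    # then allow the gpt-5 / gpt-4.1 families.
    if "gpt-5.1-codex-mini" in model:
        return True
    if "mini" in model:
        return False
    return "gpt-5" in model or "gpt-4.1" in model


def run_checks() -> bool:
    # True iff every documented positive passes and every negative fails
    return all(supports_prompt_cache_retention(m) for m in POSITIVE) and not any(
        supports_prompt_cache_retention(m) for m in NEGATIVE
    )
```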
- Define llm_51_codex_mini before use

Co-authored-by: openhands <openhands@all-hands.dev>

…t_cache_retention

Co-authored-by: openhands <openhands@all-hands.dev>
@xingyaoww Re: the failure in the agent behavior PR: the issue is that some mini models don't support extended cache, while one does (gpt-5.1-codex-mini). I verified a list of models: all those above should support it, and I tried a few that don't; I excluded the unsupported one in the integration tests too.
…che_retention; rename *_PATTERNS -> *_MODELS
- Introduce apply_ordered_model_rules to handle lists like ["gpt-5", "!gpt-5-mini", "gpt-5.1-codex-mini"] (last wins)
- Rewrite PROMPT_CACHE_RETENTION to PROMPT_CACHE_RETENTION_MODELS using ordered rules
- Rename all *_PATTERNS constants to *_MODELS across utils and exceptions for clarity
- Keep substring semantics and backward compatibility for other feature lists

Co-authored-by: openhands <openhands@all-hands.dev>
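The ordered-rule semantics described in this commit (substring match, '!' negation, last matching rule wins) might look roughly like the sketch below. This is a reconstruction from the commit message, not the PR's actual implementation:

```python
def apply_ordered_model_rules(model: str, rules: list[str]) -> bool:
    """Return True if `model` is allowed by the ordered rule list.

    Each rule is a substring pattern; a leading '!' negates it.
    Rules are applied in order and the last matching rule wins.
    (Sketch based on the commit message, not the real PR code.)
    """
    allowed = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if pattern in model:
            allowed = not negated
    return allowed


# Example rule list quoted in the commit message above
PROMPT_CACHE_RETENTION_MODELS = ["gpt-5", "!gpt-5-mini", "gpt-5.1-codex-mini"]
```

With these rules, `gpt-5-mini` is first allowed by `"gpt-5"`, then knocked out by `"!gpt-5-mini"`, while `gpt-5.1-codex-mini` survives because the final explicit rule wins.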
…RETENTION_MODELS for simplicity
- Keep the explicit allow for gpt-5.1-codex-mini

Co-authored-by: openhands <openhands@all-hands.dev>

… models)
Use proper newlines in commit messages

Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands Add the new GPT-5.2 to this PR. Verify that it is on the OpenAI list of models with extended prompt cache support, and add it to the comment and to the relevant tests.
I'm on it! enyst can track my progress at all-hands.dev |
- Verified against the OpenAI prompt caching docs: gpt-5.2 is listed under models with extended (24h) retention
- No code changes needed beyond the docs comment, since our ordered rules already match gpt-5.* (excluding mini)
- Extend tests to assert retention for gpt-5.2, gpt-5.2-chat-latest, and gpt-5.2-pro; keep mini exclusions intact

Co-authored-by: openhands <openhands@all-hands.dev>
Summary of changes for adding GPT-5.2
What I did
Validation
Commit
Checklist
Note about the “comment”
Summary
Context
Evaluation surfaced failures related to passing prompt_cache_retention to mini variants (e.g. gpt-5-mini / gpt-5.1-codex-mini), causing litellm BadRequest errors. The intended behavior is to avoid sending prompt_cache_retention for these mini models.
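A minimal sketch of the intended behavior, assuming a hypothetical `build_completion_kwargs` helper (the real logic lives in the PR's feature gate, and the "24h" retention value is an assumption based on the extended-retention discussion in this thread):

```python
def mini_excluded(model: str) -> bool:
    # Stand-in feature gate: exclude '*mini' variants, except
    # gpt-5.1-codex-mini, which the thread verified does support
    # extended prompt cache retention.
    return "mini" in model and "gpt-5.1-codex-mini" not in model


def build_completion_kwargs(model: str) -> dict:
    """Only attach prompt_cache_retention when the model supports it,
    so mini variants never trigger the litellm BadRequest above."""
    kwargs: dict = {"model": model}
    if model.startswith("gpt-5") and not mini_excluded(model):
        kwargs["prompt_cache_retention"] = "24h"  # assumed value
    return kwargs
```

The point of routing the parameter through a gate like this is that callers never need to special-case individual model names; the exclusion list lives in one place.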
Changes
Validation
Notes
Co-authored-by: openhands <openhands@all-hands.dev>
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
• java: eclipse-temurin:17-jdk
• python: nikolaik/python-nodejs:python3.12-nodejs22
• golang: golang:1.21-bookworm

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:bfaf5c5-python

Run
All tags pushed for this build
About Multi-Architecture Support
• Each versioned tag (e.g. bfaf5c5-python) is a multi-arch manifest supporting both amd64 and arm64
• Architecture-specific tags (e.g. bfaf5c5-python-amd64) are also available if needed