
Conversation

@enyst enyst commented Dec 6, 2025

Summary

  • Exclude mini variants from prompt_cache_retention support in model features
  • Piggyback existing tests to validate mini variants are not covered

Context
Evaluation surfaced failures related to passing prompt_cache_retention to mini variants (e.g. gpt-5-mini / gpt-5.1-codex-mini) causing litellm BadRequest errors. The intended behavior is to avoid sending prompt_cache_retention for these mini models.

Changes

  • openhands-sdk/openhands/sdk/llm/utils/model_features.py
    • supports_prompt_cache_retention now requires model to match GPT-5/GPT-4.1 patterns AND not contain "mini".
  • tests/sdk/llm/test_responses_parsing_and_kwargs.py
    • Updated test_chat_and_responses_options_prompt_cache_retention_gpt_5_plus_and_non_gpt to assert no prompt_cache_retention for mini variants.
  • tests/sdk/llm/test_model_features.py
    • Updated test_prompt_cache_retention_support expectations for mini variants to False.
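The gating change described above can be sketched as a simple predicate. This is an illustrative stand-in, not the SDK's actual API; the function name and the bare substring checks are assumptions based on the description in this PR:

```python
def supports_prompt_cache_retention(model: str) -> bool:
    """Illustrative feature gate: GPT-5 / GPT-4.1 families, excluding mini variants."""
    # Drop a provider prefix such as "openai/" before matching.
    name = model.split("/")[-1].lower()
    matches_family = name.startswith("gpt-5") or name.startswith("gpt-4.1")
    return matches_family and "mini" not in name

print(supports_prompt_cache_retention("openai/gpt-5"))       # True
print(supports_prompt_cache_retention("openai/gpt-5-mini"))  # False
```

Note that this initial version excludes every model whose name contains "mini"; a later commit in this PR carves out gpt-5.1-codex-mini as an explicit exception.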

Validation

  • Ran pre-commit on changed files: all hooks passed.
  • Ran targeted tests for the modified areas: passing.
    • tests/sdk/llm/test_responses_parsing_and_kwargs.py::test_chat_and_responses_options_prompt_cache_retention_gpt_5_plus_and_non_gpt
    • tests/sdk/llm/test_model_features.py::test_prompt_cache_retention_support

Notes

  • Full test suite has an unrelated import error in tests/github_workflows/test_resolve_model_config.py due to missing helper module; unrelated to this change.

Co-authored-by: openhands <openhands@all-hands.dev>



Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant  Architectures  Base Image                                  Docs / Tags
java     amd64, arm64   eclipse-temurin:17-jdk                      Link
python   amd64, arm64   nikolaik/python-nodejs:python3.12-nodejs22  Link
golang   amd64, arm64   golang:1.21-bookworm                        Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:bfaf5c5-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-bfaf5c5-python \
  ghcr.io/openhands/agent-server:bfaf5c5-python

All tags pushed for this build

ghcr.io/openhands/agent-server:bfaf5c5-golang-amd64
ghcr.io/openhands/agent-server:bfaf5c5-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:bfaf5c5-golang-arm64
ghcr.io/openhands/agent-server:bfaf5c5-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:bfaf5c5-java-amd64
ghcr.io/openhands/agent-server:bfaf5c5-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:bfaf5c5-java-arm64
ghcr.io/openhands/agent-server:bfaf5c5-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:bfaf5c5-python-amd64
ghcr.io/openhands/agent-server:bfaf5c5-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:bfaf5c5-python-arm64
ghcr.io/openhands/agent-server:bfaf5c5-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:bfaf5c5-golang
ghcr.io/openhands/agent-server:bfaf5c5-java
ghcr.io/openhands/agent-server:bfaf5c5-python

About Multi-Architecture Support

  • Each variant tag (e.g., bfaf5c5-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., bfaf5c5-python-amd64) are also available if needed

…djust tests

- Update model_features.get_features to skip mini variants
- Update tests to piggyback existing coverage and validate mini excluded

Co-authored-by: openhands <openhands@all-hands.dev>
github-actions bot commented Dec 6, 2025

Coverage

Coverage Report

File                                                     Stmts   Miss  Cover  Missing
openhands-sdk/openhands/sdk/llm/utils/model_features.py     46      2    95%  32, 184
TOTAL                                                    12610   5623    55%

enyst and others added 2 commits December 6, 2025 15:04
…atterns + mini exclusions

- Patterns: ['gpt-5', 'gpt-4.1'] with inline doc reference of actual listed models
- Exclude all '*mini' in feature gate (covers gpt-5-mini, gpt-5.1-mini, codex-mini)
- Extend tests to include explicit gpt-5.1-mini exclusion

Co-authored-by: openhands <openhands@all-hands.dev>
… docs; keep other minis excluded

- Update feature gate to carve out 'gpt-5.1-codex-mini'
- Update tests to expect retention for 5.1-codex-mini

Co-authored-by: openhands <openhands@all-hands.dev>
enyst and others added 3 commits December 6, 2025 16:13
…ix E501

- Provide find_models_by_id for tests expecting resolve_model_configs
- Wrap long error message to satisfy Ruff E501

Co-authored-by: openhands <openhands@all-hands.dev>
- Test failure was local-only; CI doesn’t run tests/github_workflows in tests.yml
- run-eval workflow uses resolve_model_config.py (singular) directly

Co-authored-by: openhands <openhands@all-hands.dev>
enyst commented Dec 6, 2025

PASS (200) for all documented positives:
openai/gpt-5.1
openai/gpt-5.1-codex
openai/gpt-5.1-codex-mini
openai/gpt-5.1-chat-latest
openai/gpt-5
openai/gpt-5-codex
openai/gpt-4.1

Negative controls:
openai/gpt-5-mini: UNEXPECTED PASS (200). The request did not error on the parameter, but the response shows prompt_cache_retention: null and status=incomplete with incomplete_details.reason = max_output_tokens; the model produced no output and retention was not applied.
openai/gpt-5.1-mini: 400 model_not_found (as before)

- Define llm_51_codex_mini before use

Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands OpenHands deleted a comment from openhands-ai bot Dec 6, 2025
@enyst enyst marked this pull request as ready for review December 6, 2025 16:30
@enyst enyst requested a review from xingyaoww December 6, 2025 16:30
…t_cache_retention

Co-authored-by: openhands <openhands@all-hands.dev>
enyst commented Dec 6, 2025

@xingyaoww Re: the failure in the agent behavior PR, looks like the issue is that some mini models don't support extended cache, while one does (gpt-5.1-codex-mini).

I verified all the models listed above that should support it, and tried a few that don't; the unsupported mini model used in the integration tests is now excluded as well.

Behavior tests

@enyst enyst changed the title Exclude '*mini' models from prompt_cache_retention support and adjust tests Exclude '*mini' models from prompt_cache_retention Dec 6, 2025
@enyst enyst marked this pull request as draft December 9, 2025 06:05
…che_retention; rename *_PATTERNS -> *_MODELS

- Introduce apply_ordered_model_rules to handle lists like ["gpt-5", "!gpt-5-mini", "gpt-5.1-codex-mini"] (last wins)
- Rewrite PROMPT_CACHE_RETENTION to PROMPT_CACHE_RETENTION_MODELS using ordered rules
- Rename all *_PATTERNS constants to *_MODELS across utils and exceptions for clarity
- Keep substring semantics and backward compatibility for other feature lists
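A minimal sketch of the last-wins ordered-rule matcher this commit describes. Plain entries allow a model, "!"-prefixed entries deny it, matching uses substring semantics, and the last matching rule wins. The exact function signature and rule list in the SDK may differ; the RULES list here is illustrative:

```python
def apply_ordered_model_rules(model: str, rules: list[str]) -> bool:
    """Last-wins rule matcher: plain entries allow, '!'-prefixed entries deny."""
    name = model.split("/")[-1].lower()
    allowed = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if pattern in name:  # substring semantics, matching the feature lists
            allowed = not negated
    return allowed

# Illustrative rule list: allow GPT-5/GPT-4.1, deny minis, re-allow one mini.
RULES = ["gpt-5", "gpt-4.1", "!mini", "gpt-5.1-codex-mini"]
print(apply_ordered_model_rules("openai/gpt-5.1-codex-mini", RULES))  # True
print(apply_ordered_model_rules("openai/gpt-5-mini", RULES))          # False
```

Because the last matching rule decides, "gpt-5.1-codex-mini" appearing after "!mini" restores support for that one model while every other mini stays excluded.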

Co-authored-by: openhands <openhands@all-hands.dev>
…RETENTION_MODELS for simplicity

- Keep explicit allow for gpt-5.1-codex-mini

Co-authored-by: openhands <openhands@all-hands.dev>
… models)

Use proper newlines in commit messages

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst enyst marked this pull request as ready for review December 10, 2025 01:24
@enyst enyst requested a review from xingyaoww December 10, 2025 01:25
enyst commented Dec 17, 2025

@OpenHands Add the new GPT-5.2 to this PR. Verify that it is on OpenAI list with extended prompt cache support, and add it to the comment and to the relevant tests.

openhands-ai bot commented Dec 17, 2025

I'm on it! enyst can track my progress at all-hands.dev

- Verified against OpenAI prompt caching docs: gpt-5.2 is listed under models with extended (24h) retention
- No code changes needed beyond docs comment since our ordered rules already match gpt-5.* (excluding mini)
- Extend tests to assert retention for gpt-5.2, gpt-5.2-chat-latest, and gpt-5.2-pro; keep mini exclusions intact

Co-authored-by: openhands <openhands@all-hands.dev>
openhands-ai bot commented Dec 17, 2025

Summary of changes for adding GPT-5.2

What I did

  • Verified support on OpenAI docs

    • Confirmed that GPT-5.2 is listed under models with extended prompt cache retention (24h) in the OpenAI Prompt Caching guide.
  • Code/tests updated

    • Model features:
      • No logic changes required. Our ordered rules already allow gpt-5.* (excluding mini by default, with explicit allow for gpt-5.1-codex-mini). I updated the inline comment to explicitly include gpt-5.2 in the documented list.
    • Tests:
      • tests/sdk/llm/test_model_features.py
        • Added asserts for extended retention support:
          • "gpt-5.2" -> True
          • "openai/gpt-5.2-chat-latest" -> True
          • "openai/gpt-5.2-pro" -> True
        • Left mini exclusions intact (still False), preserving the special allow for "gpt-5.1-codex-mini".
      • tests/sdk/llm/test_responses_parsing_and_kwargs.py
        • Extended test_chat_and_responses_options_prompt_cache_retention_gpt_5_plus_and_non_gpt:
          • Asserts prompt_cache_retention="24h" is included for both Chat and Responses paths for:
            • "openai/gpt-5.2"
            • "openai/gpt-5.2-chat-latest"
          • Keeps mini variants excluded.
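The expected truth table for these test updates can be checked with a small table-driven sketch. The supports_retention helper and its rule list here are stand-ins for the SDK's real feature gate, not its actual API:

```python
# Stand-in for the SDK's feature gate (the real logic lives in
# openhands/sdk/llm/utils/model_features.py; this rule list is illustrative).
RULES = ["gpt-5", "gpt-4.1", "!mini", "gpt-5.1-codex-mini"]

def supports_retention(model: str) -> bool:
    name = model.split("/")[-1].lower()
    allowed = False
    for rule in RULES:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if pattern in name:
            allowed = not negated
    return allowed

CASES = [
    ("openai/gpt-5.2", True),
    ("openai/gpt-5.2-chat-latest", True),
    ("openai/gpt-5.2-pro", True),
    ("openai/gpt-5-mini", False),
    ("openai/gpt-5.1-codex-mini", True),
]
for model, expected in CASES:
    assert supports_retention(model) is expected
print("all retention expectations hold")
```

The gpt-5.2 variants pass without any rule change because "gpt-5" already matches them and none contain "mini", which is why the PR only needed test and comment updates.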

Validation

  • Ran pre-commit on changed files: passed
  • Ran targeted tests: passed
    • tests/sdk/llm/test_model_features.py::test_prompt_cache_retention_support
    • tests/sdk/llm/test_responses_parsing_and_kwargs.py::test_chat_and_responses_options_prompt_cache_retention_gpt_5_plus_and_non_gpt

Commit

  • Branch: fix/exclude-mini-from-prompt-cache-retention

  • Message:
    LLM: add GPT-5.2 to extended prompt_cache_retention support and tests

    • Verified against OpenAI prompt caching docs: gpt-5.2 is listed under models with extended (24h) retention
    • No code changes needed beyond docs comment since our ordered rules already match gpt-5.* (excluding mini)
    • Extend tests to assert retention for gpt-5.2, gpt-5.2-chat-latest, and gpt-5.2-pro; keep mini exclusions intact

    Co-authored-by: openhands <openhands@all-hands.dev>

Checklist

  • Verified GPT-5.2 extended retention in OpenAI docs
  • Updated tests to cover GPT-5.2 and variants
  • Ensured “mini” exclusions remain, except for gpt-5.1-codex-mini
  • Ran linters/type checks and tests
  • Committed and pushed to the same PR branch

Note about the “comment”

  • Per your request to “add it to the comment,” I didn’t post to the PR directly (following the project’s guidance to avoid PR comments in this flow). If you want, I can provide a short updated “PASS” block you can paste into the PR discussion to include GPT-5.2, GPT-5.2-chat-latest, and GPT-5.2-pro as positives.


@enyst enyst merged commit 7db5ce2 into main Dec 17, 2025
21 checks passed
@enyst enyst deleted the fix/exclude-mini-from-prompt-cache-retention branch December 17, 2025 18:05