-Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides three ways to use prompt caching:
+Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides four ways to use prompt caching:
 
 1. **Cache User Messages with [`CachePoint`][pydantic_ai.messages.CachePoint]**: Insert a `CachePoint` marker in your user messages to cache everything before it
 2. **Cache System Instructions**: Set [`AnthropicModelSettings.anthropic_cache_instructions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_instructions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
 3. **Cache Tool Definitions**: Set [`AnthropicModelSettings.anthropic_cache_tool_definitions`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_tool_definitions] to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly
+4. **Cache All Messages**: Set [`AnthropicModelSettings.anthropic_cache_messages`][pydantic_ai.models.anthropic.AnthropicModelSettings.anthropic_cache_messages] to `True` to automatically cache all messages
 
-You can combine all three strategies for maximum savings:
+### Example 1: Automatic Message Caching
+
+Use `anthropic_cache_messages` to automatically cache all messages up to and including the newest user message:
 
 ```python {test="skip"}
-from pydantic_ai import Agent, CachePoint, RunContext
+from pydantic_ai import Agent
+from pydantic_ai.models.anthropic import AnthropicModelSettings
+
+agent = Agent(
+    'anthropic:claude-sonnet-4-5',
+    system_prompt='You are a helpful assistant.',
+    model_settings=AnthropicModelSettings(
+        anthropic_cache_messages=True,  # Automatically caches the last message
+    ),
+)
+
+# The last message is automatically cached - no need for manual CachePoint
+result1 = agent.run_sync('What is the capital of France?')
+
+# Subsequent calls with similar conversation benefit from cache
+result2 = agent.run_sync('What is the capital of Germany?')
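
As a reading aid for the hunk above, here is a minimal sketch of what "cache the last message" could look like at the level of a raw request payload. It is not Pydantic AI's implementation: `resolve_ttl` and `mark_last_block` are hypothetical helpers that work on plain dicts. The sketch assumes that `True` maps to the 5-minute TTL (as the list in the hunk states for the instruction and tool settings) and that a cache point is expressed as a `cache_control` entry on a content block, following Anthropic's prompt caching documentation. Whether `anthropic_cache_messages` accepts a TTL string is not shown in this hunk, so the TTL handling here is an assumption.

```python
from typing import Literal, Optional, Union

CacheSetting = Union[bool, Literal['5m', '1h']]


def resolve_ttl(setting: CacheSetting) -> Optional[str]:
    """Map a cache setting to a TTL string, or None when caching is off."""
    if setting is True:
        return '5m'  # default TTL when the setting is simply True
    if setting in ('5m', '1h'):
        return setting
    return None  # False (or anything unrecognised) disables caching


def mark_last_block(messages: list[dict], ttl: str) -> list[dict]:
    """Return a copy of `messages` with a cache point on the final content block."""
    if not messages or not messages[-1]['content']:
        return messages
    # Copy messages and their content blocks so the caller's data is untouched.
    copied = [dict(m, content=[dict(b) for b in m['content']]) for m in messages]
    copied[-1]['content'][-1]['cache_control'] = {'type': 'ephemeral', 'ttl': ttl}
    return copied


messages = [
    {'role': 'user', 'content': [{'type': 'text', 'text': 'What is the capital of France?'}]},
]
ttl = resolve_ttl(True)
cached = mark_last_block(messages, ttl) if ttl else messages
```

Returning a copy rather than mutating in place keeps the original message history reusable across runs, which matters when the same history is sent with different settings.
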
Anthropic enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors.
+
+#### How Cache Points Are Allocated
+
+Cache points can be placed in three locations:
+
+1. **System Prompt**: Via `anthropic_cache_instructions` setting (adds cache point to last system prompt block)
+2. **Tool Definitions**: Via `anthropic_cache_tool_definitions` setting (adds cache point to last tool definition)
+3. **Messages**: Via `CachePoint` markers or `anthropic_cache_messages` setting (adds cache points to message content)
+
+Each setting uses **at most 1 cache point**, but you can combine them.
+
+#### Example: Using All 3 Cache Point Sources
+
+Define an agent with all cache settings enabled:
+
+```python {test="skip"}
+from pydantic_ai import Agent, CachePoint
+from pydantic_ai.models.anthropic import AnthropicModelSettings
+
+agent = Agent(
+    'anthropic:claude-sonnet-4-5',
+    system_prompt='Detailed instructions...',
+    model_settings=AnthropicModelSettings(
+        anthropic_cache_instructions=True,  # 1 cache point
+        anthropic_cache_tool_definitions=True,  # 1 cache point
+        anthropic_cache_messages=True,  # 1 cache point
+    ),
+)
+
+@agent.tool_plain
+def my_tool() -> str:
+    return 'result'
+
+
+# This uses 3 cache points (instructions + tools + last message)
+# You can add 1 more CachePoint marker before hitting the limit
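
The arithmetic in the comments above (three settings in use, one marker left) can be spelled out in a few lines. This is only an illustrative sketch: `remaining_cache_points` and `MAX_CACHE_POINTS` are hypothetical names, not part of Pydantic AI, and the function assumes each enabled setting costs exactly one cache point, which is the upper bound the hunk states.

```python
MAX_CACHE_POINTS = 4  # Anthropic's per-request limit, as stated in this section


def remaining_cache_points(
    cache_instructions: object = False,
    cache_tool_definitions: object = False,
    cache_messages: object = False,
) -> int:
    """Number of manual CachePoint markers that still fit under the limit.

    Each argument may be a bool or a TTL string such as '5m' or '1h';
    any truthy value counts as one cache point.
    """
    used = sum(bool(s) for s in (cache_instructions, cache_tool_definitions, cache_messages))
    return MAX_CACHE_POINTS - used


all_three = remaining_cache_points(True, True, True)
two_settings = remaining_cache_points(True, True)
```

With all three settings enabled, `all_three` is 1, matching the "1 more `CachePoint` marker" comment; with only instructions and tool definitions enabled, `two_settings` is 2.
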
When cache points from all sources (settings + `CachePoint` markers) exceed 4, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones).
+
+Define an agent with 2 cache points from settings:
+
+```python {test="skip"}
+from pydantic_ai import Agent, CachePoint
+from pydantic_ai.models.anthropic import AnthropicModelSettings
+
+agent = Agent(
+    'anthropic:claude-sonnet-4-5',
+    system_prompt='Instructions...',
+    model_settings=AnthropicModelSettings(
+        anthropic_cache_instructions=True,  # 1 cache point
+        anthropic_cache_tool_definitions=True,  # 1 cache point
+    ),
+)
+
+@agent.tool_plain
+def search() -> str:
+    return 'data'
+
+# Already using 2 cache points (instructions + tools)
+# Can add 2 more CachePoint markers (4 total limit)
+result = agent.run_sync([
+    'Context 1', CachePoint(),  # Oldest - will be removed
+    'Context 2', CachePoint(),  # Will be kept (3rd point)
+    'Context 3', CachePoint(),  # Will be kept (4th point)
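
The trimming rule this hunk describes (drop cache points from older message content, keep the most recent ones) can be sketched without the library. The code below is an assumption about the observable behaviour, not the PR's actual code: `Marker` is a hypothetical stand-in for `CachePoint`, and `trim_cache_points` is a hypothetical helper that receives the number of marker slots left after the settings have taken theirs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    """Hypothetical stand-in for a CachePoint marker in message content."""


def trim_cache_points(content: list, budget: int) -> list:
    """Keep only the last `budget` markers; older markers are dropped.

    Non-marker items (the actual message parts) are always preserved.
    """
    marker_positions = [i for i, item in enumerate(content) if isinstance(item, Marker)]
    # Guard budget <= 0 explicitly: a slice like [-0:] would keep every marker.
    keep = set(marker_positions[-budget:]) if budget > 0 else set()
    return [
        item for i, item in enumerate(content)
        if not isinstance(item, Marker) or i in keep
    ]


content = ['Context 1', Marker(), 'Context 2', Marker(), 'Context 3', Marker()]
# Two slots remain after instructions + tool definitions take theirs.
trimmed = trim_cache_points(content, budget=2)
```

Here `trimmed` keeps the markers after `'Context 2'` and `'Context 3'` and drops the one after `'Context 1'`, mirroring the "Oldest - will be removed" comment in the hunk.
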