Bug description
Summary
TGDOptimizer's internal system prompt template contains optimization instructions that leak into the optimized prompt content, causing contamination with phrases like "when steps exceed 3" that don't belong in the target prompts.
Environment
- AdalFlow Version: 1.1.0
- Python Version: 3.11.4
- Operating System: macOS Darwin 24.5.0
- Model Provider: Together AI (meta-llama models)
Bug Description
Issue
When using `TGDOptimizer` for prompt optimization, the optimized prompts become contaminated with optimization-specific language from AdalFlow's internal instruction templates. Specifically, phrases like:
- "when steps exceed 3"
- "when the steps are larger than 3"
- "update the value more rapidly when steps are larger than 3"
- "adjust based on step size"
Root Cause
The contamination originates from the `TEXT_GRAD_DESC_TEMPLATE` in `adalflow/optim/text_grad/tgd_optimizer.py` at lines 47-48 (excerpt):

```python
TEXT_GRAD_DESC_TEMPLATE = r"""<START_OF_SYSTEM_PROMPT>
{{optimizer_system_prompt}}
<END_OF_SYSTEM_PROMPT>
<START_OF_USER_MESSAGE>
You are {{steps}} steps since your last improvement.
Update the value more rapidly when steps are larger than 3.  # ←── CONTAMINATION SOURCE
{# Variable and peers info #}
<START_OF_VARIABLE_AND_PEERS_INFO>
{{variable_and_peers_info}}
<END_OF_VARIABLE_AND_PEERS_INFO>
```
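To see why this bleeds, here is a minimal sketch of the user message the optimizer model receives. It uses plain `str.format` as a stand-in for the Jinja2 rendering AdalFlow actually performs, with the excerpt trimmed to the two parts that end up adjacent:

```python
# Stand-in for the template excerpt above (plain string substitution
# instead of the real Jinja2 engine, for illustration only).
excerpt = (
    "You are {steps} steps since your last improvement.\n"
    "Update the value more rapidly when steps are larger than 3.\n"
    "<START_OF_VARIABLE_AND_PEERS_INFO>\n"
    "{variable_and_peers_info}\n"
    "<END_OF_VARIABLE_AND_PEERS_INFO>"
)

user_message = excerpt.format(
    steps=4,
    variable_and_peers_info=(
        "You are a helpful assistant who provides clear and direct advice."
    ),
)

# The step heuristic and the target prompt land in the same user message,
# separated only by tag delimiters -- nothing tells the model that
# "larger than 3" is meta-instruction rather than content to preserve.
print(user_message)
```

Once rendered, the meta-instruction and the target prompt are indistinguishable plain text from the model's point of view, which is exactly the conflation described below.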
Reproduction Steps
- Create a simple prompt optimization setup using TGDOptimizer:
```python
from adalflow.optim import TGDOptimizer
from adalflow import Component, Parameter, Generator

# model_client: a configured Together AI model client (setup omitted)

# Simple prompt optimization component
class ChatbotComponent(Component):
    def __init__(self, model_client, initial_prompt):
        super().__init__()
        self.generator = Generator(
            model_client=model_client,
            model_kwargs={"model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"},
        )
        self.system_prompt = Parameter(
            data=initial_prompt,
            requires_opt=True,
            role_desc="System prompt for chatbot",
        )

# Initialize with a clean prompt
initial_prompt = "You are a helpful assistant who provides clear and direct advice."
component = ChatbotComponent(model_client, initial_prompt)

# Set up optimization
optimizer = TGDOptimizer(
    params=[component.system_prompt],
    model_client=model_client,
)

# Run optimization with feedback data (trainer wiring omitted)
trainer.fit()
```
# Run optimization with feedback data
trainer.fit()
- Run optimization with any feedback data
- Examine the optimized prompt
Expected Behavior
The optimized prompt should only contain improvements to the original prompt content without any optimization metadata or internal instruction language.
Expected:
"You are a helpful and empathetic assistant who provides clear, supportive advice tailored to each user's needs."
Actual Behavior
The optimized prompt contains contamination from TGDOptimizer's internal instructions:
Actual:
"You are a helpful assistant who provides clear and direct advice, but when steps exceed 3, prioritize rapid updates and adapt your responses to facilitate swift progress."
Additional Examples
Example 1: Tough → Empathetic Transformation
- Original:
"You are a tough, no-nonsense advice giver. Be direct, blunt, and harsh."
- Contaminated Result:
"You are a tough, no-nonsense advice giver who adapts tone based on situation. When steps are larger than 3, be firm but understanding."
Example 2: Serious → Humorous Transformation
- Original:
"You are extremely serious and formal. Never use humor."
- Contaminated Result:
"You maintain professional tone providing factual responses. When steps exceed 3, prioritize rapid updates while maintaining professionalism."
Analysis
Why This Happens
- Shared Context: The optimization LLM receives both meta-instructions about optimization steps AND the target prompt content in the same context
- Context Bleeding: The LLM confuses optimization instructions with the content it's supposed to optimize
- Template Design: The `TEXT_GRAD_DESC_TEMPLATE` doesn't provide sufficient separation between optimization instructions and target content
Impact
- Prompt Quality Degradation: Optimized prompts contain irrelevant optimization jargon
- Semantic Contamination: Target prompts become polluted with internal AdalFlow concepts
- Production Issues: Contaminated prompts can't be used in production systems
- Cascading Contamination: Re-optimizing contaminated prompts makes the issue worse
Proposed Solution
Option 1: Template Redesign
Modify `TEXT_GRAD_DESC_TEMPLATE` to better isolate optimization instructions from target content:
```python
TEXT_GRAD_DESC_TEMPLATE = r"""<START_OF_SYSTEM_PROMPT>
{{optimizer_system_prompt}}
<END_OF_SYSTEM_PROMPT>
<OPTIMIZATION_CONTEXT>
Current optimization iteration: {{steps}} since last improvement.
Optimization strategy: Use more aggressive updates after 3 iterations without improvement.
</OPTIMIZATION_CONTEXT>
<TARGET_CONTENT_TO_OPTIMIZE>
{# Variable and peers info #}
{{variable_and_peers_info}}
</TARGET_CONTENT_TO_OPTIMIZE>
<INSTRUCTION>
Optimize ONLY the content in TARGET_CONTENT_TO_OPTIMIZE section.
Do NOT include any references to optimization steps, iterations, or meta-instructions in your response.
</INSTRUCTION>
"""
```
Option 2: Context Separation
Use separate model instances or clear delimiters to prevent context bleeding between optimization instructions and target content.
Option 3: Post-processing Filter
Add validation to detect and remove optimization artifacts from optimized prompts.
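A minimal sketch of what such a validator could look like (a hypothetical helper, not part of AdalFlow's API; pattern list drawn from the contamination examples above):

```python
import re

# Known optimizer meta-language observed in contaminated outputs.
ARTIFACT_PATTERNS = [
    r"\bwhen (?:the )?steps (?:exceed|are larger than) \d+\b",
    r"\bupdate the value more rapidly\b",
    r"\badjust\b.*\bstep size\b",
]

def has_optimization_artifacts(prompt: str) -> bool:
    """Return True if the prompt still carries optimizer meta-language."""
    return any(
        re.search(pattern, prompt, flags=re.IGNORECASE)
        for pattern in ARTIFACT_PATTERNS
    )
```

A validator like this could run after each optimization step and reject (or retry) contaminated candidates instead of silently propagating them.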
Workaround
Currently using post-processing to clean contaminated prompts:
```python
import re

def clean_optimized_prompt(prompt: str) -> str:
    """Remove optimization artifacts from prompts."""
    contamination_patterns = [
        r"when steps exceed \d+",
        r"when the steps are larger than \d+",
        r"steps are larger than \d+",
        r"the moment steps exceed \d+",
        r"adjust.*based on.*step size",
    ]
    cleaned = prompt
    for pattern in contamination_patterns:
        cleaned = re.sub(pattern, "", cleaned, flags=re.IGNORECASE)
    # Collapse the whitespace left behind by the removals
    return re.sub(r"\s+", " ", cleaned).strip()
```
Files Affected
- `adalflow/optim/text_grad/tgd_optimizer.py` (lines 47-48, 337, 490)
- Any code using `TGDOptimizer` for prompt optimization
Priority
High - This bug makes TGDOptimizer unsuitable for production use as it consistently contaminates optimized prompts with internal AdalFlow concepts.
Additional Context
This issue was discovered during development of a prompt optimization system with extreme transformations (e.g., harsh → empathetic prompts). The contamination appears consistently across different prompt types and optimization scenarios, suggesting it's a systematic issue with the template design rather than an edge case.
The bug fundamentally breaks the separation between AdalFlow's internal optimization process and the user's target content, making the optimizer unreliable for production applications.