Skip to content

Conversation

@Vattikondadheeraj
Copy link
Collaborator

@Vattikondadheeraj Vattikondadheeraj commented Oct 27, 2025

Description by Korbit AI

What change is being made?

Refactor to clean up UI text in dynamic prompts, add a default Qwen3 config, and improve LLM response handling to extract and wrap reasoning content from replies.

Why are these changes being made?

To remove noisy UI patterns from the AXTree representation, provide a Qwen3-specific default configuration, and ensure reasoning content from model responses is preserved and clearly identified for downstream processing. This improves prompt clarity, config parity, and traceability of model thinking.

Is this description stale? Ask me to generate a new description by commenting /korbit-generate-pr-description

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Status
Functionality Missing AGENT_QWEN3 configuration ▹ view
Design Configuration Duplication Instead of Inheritance ▹ view
Documentation Missing response parameter structure documentation ▹ view
Performance Inefficient nested iteration in content processing ▹ view
Design Complex Function with Multiple Responsibilities ▹ view
Documentation Missing Purpose in UI Pattern Removal Docstring ▹ view
Functionality In-place mutation of observation dictionary ▹ view
Readability Hardcoded UI Pattern List ▹ view
Files scanned
File Path Reviewed
src/agentlab/agents/generic_agent/agent_configs.py
src/agentlab/llm/chat_api.py
src/agentlab/agents/dynamic_prompting.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

Comment on lines +153 to +192
FLAGS_QWEN3 = GenericPromptFlags(
obs=dp.ObsFlags(
use_html=False,
use_ax_tree=True,
use_focused_element=True,
use_error_logs=False,
use_history=True,
use_past_error_logs=False,
use_action_history=True,
use_think_history=False, # Qwen3 doesn't include thinking history
use_diff=False,
html_type="pruned_html",
use_screenshot=False,
use_som=False,
extract_visible_tag=True,
extract_clickable_tag=False,
extract_coords="False",
filter_visible_elements_only=False,
),
action=dp.ActionFlags(
action_set=HighLevelActionSetArgs(
subsets=["bid"],
multiaction=False,
),
long_description=False,
individual_examples=True,
),
use_plan=False,
use_criticise=False,
use_thinking=True,
use_memory=False,
use_concrete_example=True,
use_abstract_example=True,
use_hints=True,
enable_chat=False,
max_prompt_tokens=40_000,
be_cautious=True,
extra_instructions=None,
add_missparsed_messages=True,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing AGENT_QWEN3 configuration category Functionality

Tell me more
What is the issue?

The FLAGS_QWEN3 configuration is defined but no corresponding AGENT_QWEN3 is created to use it.

Why this matters

Without a corresponding agent configuration, the FLAGS_QWEN3 cannot be used by the system, making the configuration effectively dead code that serves no functional purpose.

Suggested change ∙ Feature Preview

Add a corresponding agent configuration after the FLAGS_QWEN3 definition:

AGENT_QWEN3 = GenericAgentArgs(
    chat_model_args=CHAT_MODEL_ARGS_DICT["qwen3_model_key"],  # Replace with actual Qwen3 model key
    flags=FLAGS_QWEN3,
)
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +152 to +192
# qwen3 default config - same as llama3-70b but with use_think_history=False
FLAGS_QWEN3 = GenericPromptFlags(
obs=dp.ObsFlags(
use_html=False,
use_ax_tree=True,
use_focused_element=True,
use_error_logs=False,
use_history=True,
use_past_error_logs=False,
use_action_history=True,
use_think_history=False, # Qwen3 doesn't include thinking history
use_diff=False,
html_type="pruned_html",
use_screenshot=False,
use_som=False,
extract_visible_tag=True,
extract_clickable_tag=False,
extract_coords="False",
filter_visible_elements_only=False,
),
action=dp.ActionFlags(
action_set=HighLevelActionSetArgs(
subsets=["bid"],
multiaction=False,
),
long_description=False,
individual_examples=True,
),
use_plan=False,
use_criticise=False,
use_thinking=True,
use_memory=False,
use_concrete_example=True,
use_abstract_example=True,
use_hints=True,
enable_chat=False,
max_prompt_tokens=40_000,
be_cautious=True,
extra_instructions=None,
add_missparsed_messages=True,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configuration Duplication Instead of Inheritance category Design

Tell me more
What is the issue?

The code duplicates the entire configuration structure from FLAGS_LLAMA3_70B with only one parameter difference (use_think_history=False).

Why this matters

Code duplication increases maintenance burden and risk of inconsistencies when updates are needed. If the base configuration changes, both copies need to be updated.

Suggested change ∙ Feature Preview
# Create Qwen3 flags by inheriting from LLAMA3_70B flags
FLAGS_QWEN3 = FLAGS_LLAMA3_70B.copy()
FLAGS_QWEN3.obs.use_think_history = False  # Override only the different parameter
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +602 to +606
def _extract_thinking_content_from_response(response, wrap_tag: str = "think"):
"""
Extracts the content from the message, including reasoning if available.
It wraps the reasoning around <think>...</think> for easy identification of reasoning content,
when LLM produces 'text' and 'reasoning' in the same message.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing response parameter structure documentation category Documentation

Tell me more
What is the issue?

The docstring lacks information about the expected format/structure of the 'response' parameter.

Why this matters

Without clear documentation of the expected response structure, developers may pass incorrectly formatted responses leading to runtime errors.

Suggested change ∙ Feature Preview

def _extract_thinking_content_from_response(response, wrap_tag: str = "think"):
"""
Extracts the content from the message, including reasoning if available.
It wraps the reasoning around ... for easy identification of reasoning content,
when LLM produces 'text' and 'reasoning' in the same message.

Args:
    response: OpenAI-style completion response object with choices[0].message containing
             either a dict or object with 'content', 'text', 'reasoning', or 'reasoning_content' fields.
    wrap_tag: The tag name to wrap reasoning content (default: "think").

Returns:
    str: The extracted content with reasoning wrapped in specified tags.
"""
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +630 to +653
if isinstance(raw_content, list):
# Concatenate text-like parts if provider returns content blocks
parts = []
for part in raw_content:
if isinstance(part, str):
parts.append(part)
elif isinstance(part, dict):
# Common shapes: {"type":"text","text":"..."} or {"type":"text","text":{"value":"..."}}
if part.get("type") == "text":
txt = part.get("text")
if isinstance(txt, dict) and isinstance(txt.get("value"), str):
parts.append(txt["value"])
elif isinstance(txt, str):
parts.append(txt)
else:
# Fallback: try a few likely keys
for k in ("content", "text", "value"):
v = part.get(k)
if isinstance(v, str):
parts.append(v)
break
else:
parts.append(str(part))
msg_content = "\n".join(p for p in parts if p)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inefficient nested iteration in content processing category Performance

Tell me more
What is the issue?

Inefficient nested loops and repeated dictionary lookups when processing list-style content.

Why this matters

The nested loops with multiple dictionary get() calls and string type checks create O(n*m) complexity where n is the number of parts and m is the number of fallback keys. This will cause performance degradation when processing large content lists.

Suggested change ∙ Feature Preview

Extract the text extraction logic into a separate function and use early returns to avoid nested loops:

def _extract_text_from_part(part):
    if isinstance(part, str):
        return part
    if isinstance(part, dict):
        if part.get("type") == "text":
            txt = part.get("text")
            if isinstance(txt, dict):
                return txt.get("value", "")
            return txt or ""
        # Single loop for fallback keys
        for key in ("content", "text", "value"):
            value = part.get(key)
            if isinstance(value, str):
                return value
    return str(part)

# Then use: parts = [_extract_text_from_part(part) for part in raw_content]
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.


import logging

def _extract_thinking_content_from_response(response, wrap_tag: str = "think"):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Complex Function with Multiple Responsibilities category Design

Tell me more
What is the issue?

The function is too complex and handles multiple responsibilities: message normalization, reasoning extraction, content parsing with multiple formats, and content wrapping.

Why this matters

The high complexity makes the code harder to maintain, test, and modify. Each responsibility could change for different reasons, violating the Single Responsibility Principle.

Suggested change ∙ Feature Preview

Split the function into smaller, focused functions:

def _normalize_message(response):
    message = response.choices[0].message
    return message.to_dict() if not isinstance(message, dict) else message

def _extract_reasoning(message):
    return message.get('reasoning') or message.get('reasoning_content')

def _parse_content(raw_content):
    # Handle string vs list content parsing logic

def _format_response(reasoning_text, msg_text, msg_content, wrap_tag):
    # Handle wrapping and combining the parts
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +884 to +893
def remove_ui_patterns(text):
"""
Remove lines containing specific UI patterns for ServiceNow accessibility tree text.
Args:
text (str): The input string containing the accessibility tree
Returns:
str: The cleaned string with lines containing UI patterns removed
"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Purpose in UI Pattern Removal Docstring category Documentation

Tell me more
What is the issue?

The docstring lacks explanation of why these UI patterns need to be removed and what impact it has.

Why this matters

Without understanding the purpose, maintainers may inadvertently modify or remove this filtering which could impact the accessibility tree's usability.

Suggested change ∙ Feature Preview

def remove_ui_patterns(text):
"""
Remove ServiceNow UI management patterns from accessibility tree text to focus on core page content.

Filters out widget management controls that are not relevant for task completion
and would add noise to the accessibility tree representation.

Args:
    text (str): The input string containing the accessibility tree

Returns:
    str: The cleaned string with UI management patterns removed
"""
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

visible=lambda: flags.use_html,
prefix="## ",
)
obs["axtree_txt"] = remove_ui_patterns(obs["axtree_txt"])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In-place mutation of observation dictionary category Functionality

Tell me more
What is the issue?

The code modifies the observation dictionary in-place during object initialization, which could cause side effects if the same observation dictionary is used elsewhere or reused.

Why this matters

This mutation could lead to unexpected behavior in other parts of the system that expect the original unmodified observation data, potentially causing debugging difficulties and inconsistent state.

Suggested change ∙ Feature Preview

Create a copy of the observation or modify the axtree_txt after extraction:

obs = copy(obs)  # Add this line before modification
obs["axtree_txt"] = remove_ui_patterns(obs["axtree_txt"])

Or modify it in the AXTree constructor:

self.ax_tree = AXTree(
    remove_ui_patterns(obs["axtree_txt"]),
    visible_elements_only=flags.filter_visible_elements_only,
    visible=lambda: flags.use_ax_tree,
    coord_type=flags.extract_coords,
    visible_tag=flags.extract_visible_tag,
    prefix="## ",
)
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

"""

# Words to look for
words_to_remove = ["Edit Widget", "Edit Widget Preferences", "Close Widget", "Add content"]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded UI Pattern List category Readability

Tell me more
What is the issue?

UI patterns to remove are hardcoded as a list within the function, making it difficult to modify or reuse these patterns elsewhere.

Why this matters

If UI patterns need to be updated or reused in other parts of the code, developers would need to modify them in multiple places, increasing maintenance burden and risk of inconsistency.

Suggested change ∙ Feature Preview

Move the UI patterns to a module-level constant or configuration:

UI_PATTERNS_TO_REMOVE = [
    "Edit Widget",
    "Edit Widget Preferences", 
    "Close Widget",
    "Add content"
]

def remove_ui_patterns(text, patterns=UI_PATTERNS_TO_REMOVE):
    ...
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants