26 changes: 19 additions & 7 deletions run.yaml
@@ -18,7 +18,7 @@ conversations_store:
type: sqlite
datasets: []
image_name: starter
-# external_providers_dir: /opt/app-root/src/.llama/providers.d
+external_providers_dir: ${env.EXTERNAL_PROVIDERS_DIR}
inference_store:
db_path: ~/.llama/storage/inference-store.db
type: sqlite
@@ -98,15 +98,27 @@ providers:
provider_id: rag-runtime
provider_type: inline::rag-runtime
vector_io:
  - config:
      persistence:
        namespace: faiss_store
        backend: kv_default
    provider_id: faiss
    provider_type: inline::faiss
  - provider_id: solr-vector
    provider_type: remote::solr_vector_io
    config:
      solr_url: "http://localhost:8983/solr"
      collection_name: "portal-rag"
      vector_field: "chunk_vector"
      content_field: "chunk"
      embedding_dimension: 384
      inference_provider_id: sentence-transformers
      persistence:
        type: sqlite
        db_path: .llama/distributions/ollama/portal_rag_kvstore.db
        namespace: portal-rag
Comment on lines +101 to +113
🛠️ Refactor suggestion | 🟠 Major

Hardcoded Solr URL should use environment variable.

The solr_url: "http://localhost:8983/solr" is hardcoded and won't work in containerized or distributed deployments. Consider using an environment variable reference similar to line 19's pattern.

       config:
-        solr_url: "http://localhost:8983/solr"
+        solr_url: ${env.SOLR_URL:-http://localhost:8983/solr}
         collection_name: "portal-rag"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  - provider_id: solr-vector
    provider_type: remote::solr_vector_io
    config:
-     solr_url: "http://localhost:8983/solr"
+     solr_url: ${env.SOLR_URL:-http://localhost:8983/solr}
      collection_name: "portal-rag"
      vector_field: "chunk_vector"
      content_field: "chunk"
      embedding_dimension: 384
      inference_provider_id: sentence-transformers
      persistence:
        type: sqlite
        db_path: .llama/distributions/ollama/portal_rag_kvstore.db
        namespace: portal-rag
🤖 Prompt for AI Agents
In run.yaml around lines 118 to 130, the solr_url is hardcoded to
"http://localhost:8983/solr"; replace that literal with an environment-variable
reference (e.g. follow the pattern used on line 19) so the URL is configurable
at runtime. Update the solr_url field to read from a SOLR_URL (or similarly
named) env var, preserve quoting/format, and optionally provide a sensible
default/fallback if your template system supports it; ensure documentation and
deployment manifests (docker-compose / k8s) set that env var.
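
For illustration, a minimal docker-compose sketch that supplies the variable; the service names and image tags here are assumptions, not part of this PR:

# docker-compose.yaml (illustrative) — supplies SOLR_URL so the
# ${env.SOLR_URL:-http://localhost:8983/solr} fallback is only used locally
services:
  lightspeed-stack:
    image: lightspeed-stack:latest   # hypothetical image name
    environment:
      SOLR_URL: "http://solr:8983/solr"   # reach Solr by service name, not localhost
    ports:
      - "8321:8321"
  solr:
    image: solr:9
    ports:
      - "8983:8983"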

scoring_fns: []
server:
port: 8321
shields: []
tool_groups:
- provider_id: rag-runtime
toolgroup_id: builtin::rag
vector_dbs: []
storage:
backends:
kv_default:
165 changes: 138 additions & 27 deletions src/app/endpoints/query.py
@@ -4,8 +4,10 @@
import json
import logging
import re
import traceback
from datetime import UTC, datetime
from typing import Annotated, Any, Optional, cast
from urllib.parse import urljoin

from fastapi import APIRouter, Depends, HTTPException, Request
from litellm.exceptions import RateLimitError
@@ -15,8 +17,9 @@
AsyncLlamaStackClient, # type: ignore
)
from llama_stack_client.types import Shield, UserMessage # type: ignore
-from llama_stack_client.types.alpha.agents.turn import Turn
-from llama_stack_client.types.alpha.agents.turn_create_params import (
+from llama_stack_client.types.agents.turn import Turn
+from llama_stack_client.types.agents.turn_create_params import (
+    Document,
     Toolgroup,
     ToolgroupAgentToolGroupWithArgs,
 )
Comment on lines +20 to 25
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed (repository: lightspeed-core/lightspeed-stack):

cat -n src/app/endpoints/query.py | head -50
rg "Document" src/app/endpoints/query.py -n
fd "llama_stack" --type f | head -20
rg "from llama_stack_client" src/ -n | head -20
cat -n src/app/endpoints/query.py | head -80
# Check the correct import paths more thoroughly
rg "Turn|Document" src/app/endpoints/query.py -B2 -A2 -n | head -60
# Verify the actual import path for Turn and Document in llama_stack_client
rg "from llama_stack_client.types" src/ --type py | grep -E "(turn|agents)" | sort -u
# Check if Document is actually used anywhere in query.py
rg -n "Document" src/app/endpoints/query.py
# Also check Toolgroup and ToolgroupAgentToolGroupWithArgs usage
rg -n "Toolgroup|ToolgroupAgentToolGroupWithArgs" src/app/endpoints/query.py
# Verify the full file length to see if Document is used elsewhere
wc -l src/app/endpoints/query.py
# Double-check that Document isn't used anywhere else besides the import and comments
rg -n "Document[^a-zA-Z]" src/app/endpoints/query.py
# Look for local class definitions of Turn or Document
rg -n "class Turn|class Document" src/ --type py -A 1

Fix import paths for llama_stack_client types.

The import paths at lines 20-25 use incorrect module paths. They should use llama_stack_client.types.alpha.agents instead of llama_stack_client.types.agents. This is consistent with other files in the codebase (src/models/requests.py, src/utils/token_counter.py, src/metrics/utils.py, etc.).

Additionally, remove the unused Document import on line 22—it is not used anywhere in the file except in a commented line.

Corrected imports should be:

  • from llama_stack_client.types.alpha.agents.turn import Turn
  • from llama_stack_client.types.alpha.agents.turn_create_params import Toolgroup, ToolgroupAgentToolGroupWithArgs
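
A minimal sketch of the corrected import block, using exactly the paths named above:

from llama_stack_client.types.alpha.agents.turn import Turn
from llama_stack_client.types.alpha.agents.turn_create_params import (
    Toolgroup,
    ToolgroupAgentToolGroupWithArgs,
)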
🧰 Tools
🪛 GitHub Actions: Integration tests

[error] 20-20: ImportError: No module named 'llama_stack_client.types.agents'.

🪛 GitHub Actions: Ruff

[error] 22-22: F401: Unused import Document from llama_stack_client.types.agents.turn_create_params. Remove unused import.

🪛 GitHub Actions: Unit tests

[error] 20-20: ModuleNotFoundError: No module named 'llama_stack_client.types.agents' during pytest collection. Command: 'uv run pytest tests/unit --cov=src --cov=runner --cov-report term-missing'.

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 20 to 25, the imports use the wrong
module path and include an unused Document import; update the import paths to
use llama_stack_client.types.alpha.agents and remove Document. Specifically,
replace the two imports so they read `from
llama_stack_client.types.alpha.agents.turn import Turn` and `from
llama_stack_client.types.alpha.agents.turn_create_params import Toolgroup,
ToolgroupAgentToolGroupWithArgs`, and remove the unused Document name from the
import list.

@@ -73,6 +76,10 @@
logger = logging.getLogger("app.endpoints.handlers")
router = APIRouter(tags=["query"])

# When OFFLINE is False, use reference_url for chunk source
# When OFFLINE is True, use parent_id for chunk source
# TODO: move this setting to a higher level configuration
OFFLINE = True

query_response: dict[int | str, dict[str, Any]] = {
200: QueryResponse.openapi_response(),
@@ -312,15 +319,18 @@ async def query_endpoint_handler_base(  # pylint: disable=R0914
user_conversation=user_conversation, query_request=query_request
),
)
-        summary, conversation_id, referenced_documents, token_usage = (
-            await retrieve_response_func(
-                client,
-                llama_stack_model_id,
-                query_request,
-                token,
-                mcp_headers=mcp_headers,
-                provider_id=provider_id,
-            )
+        (
+            summary,
+            conversation_id,
+            referenced_documents,
+            token_usage,
+        ) = await retrieve_response_func(
+            client,
+            llama_stack_model_id,
+            query_request,
+            token,
+            mcp_headers=mcp_headers,
+            provider_id=provider_id,
         )

# Get the initial topic summary for the conversation
@@ -618,7 +628,7 @@ def parse_metadata_from_text_item(
url = data.get("docs_url")
title = data.get("title")
if url and title:
-                return ReferencedDocument(doc_url=url, doc_title=title)
+                return ReferencedDocument(doc_url=url, doc_title=title, doc_id=None)
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed (repository: lightspeed-core/lightspeed-stack):

# Find the ReferencedDocument class definition
rg "class ReferencedDocument" -A 15 --type py
# Check line 631 in context
head -650 src/app/endpoints/query.py | tail -30
# Search for the typo "attatchment" mentioned in the scratchpad
rg "attatchment" src/app/endpoints/query.py -n
# Check if ReferencedDocument has any ConfigDict or extra-field configuration
rg "class ReferencedDocument" -B 5 -A 20 src/models/responses.py
rg "ConfigDict|config_dict|extra.*allow" src/models/responses.py
# Check context around lines 900-960 for doc_ids_from_chunks usage
sed -n '900,960p' src/app/endpoints/query.py
# Check if doc_id is defined anywhere in ReferencedDocument
rg "doc_id" src/models/responses.py

Remove doc_id=None parameter from ReferencedDocument instantiation at line 631.

The ReferencedDocument model in src/models/responses.py defines only doc_url and doc_title fields. Passing doc_id=None at line 631 will cause a Pydantic ValidationError at runtime.

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around line 631, the instantiation of
ReferencedDocument passes an extraneous doc_id=None which does not exist on the
model and will cause a Pydantic ValidationError; remove the doc_id=None argument
so you only pass doc_url=url and doc_title=title when constructing
ReferencedDocument.

logger.debug("Invalid metadata block (missing url or title): %s", block)
except (ValueError, SyntaxError) as e:
logger.debug("Failed to parse metadata block: %s | Error: %s", block, e)
@@ -751,19 +761,19 @@ async def retrieve_response(  # pylint: disable=too-many-locals,too-many-branche
),
}

-        # Use specified vector stores or fetch all available ones
-        if query_request.vector_store_ids:
-            vector_db_ids = query_request.vector_store_ids
-        else:
-            vector_db_ids = [
-                vector_store.id
-                for vector_store in (await client.vector_stores.list()).data
-            ]
-        toolgroups = (get_rag_toolgroups(vector_db_ids) or []) + [
-            mcp_server.name for mcp_server in configuration.mcp_servers
-        ]
+        # Include RAG toolgroups when vector DBs are available
+        vector_dbs = await client.vector_dbs.list()
+        vector_db_ids = [vdb.identifier for vdb in vector_dbs]
+        mcp_toolgroups = [mcp_server.name for mcp_server in configuration.mcp_servers]
+
+        toolgroups = None
+        if vector_db_ids:
+            toolgroups = get_rag_toolgroups(vector_db_ids) + mcp_toolgroups
+        elif mcp_toolgroups:
+            toolgroups = mcp_toolgroups
+
+        # Convert empty list to None for consistency with existing behavior
Comment on lines +764 to 775
⚠️ Potential issue | 🟠 Major

Guard get_rag_toolgroups result when building toolgroups.

Here:

vector_dbs = await client.vector_dbs.list()
vector_db_ids = [vdb.identifier for vdb in vector_dbs]
mcp_toolgroups = [mcp_server.name for mcp_server in configuration.mcp_servers]

toolgroups = None
if vector_db_ids:
    toolgroups = get_rag_toolgroups(vector_db_ids) + mcp_toolgroups
elif mcp_toolgroups:
    toolgroups = mcp_toolgroups

get_rag_toolgroups can return None, so None + mcp_toolgroups will raise a TypeError and mypy flags this.

Use an intermediate variable and coerce None to []:

-        toolgroups = None
-        if vector_db_ids:
-            toolgroups = get_rag_toolgroups(vector_db_ids) + mcp_toolgroups
-        elif mcp_toolgroups:
-            toolgroups = mcp_toolgroups
+        toolgroups = None
+        if vector_db_ids:
+            rag_toolgroups = get_rag_toolgroups(vector_db_ids) or []
+            toolgroups = rag_toolgroups + mcp_toolgroups
+        elif mcp_toolgroups:
+            toolgroups = mcp_toolgroups

This will satisfy mypy and avoid runtime failures when no RAG toolgroups are available.

🧰 Tools
🪛 GitHub Actions: Type checks

[error] 748-748: mypy: Unsupported left operand type for + ('None'). (Command: 'uv run mypy --explicit-package-bases --disallow-untyped-calls --disallow-untyped-defs --disallow-incomplete-defs --ignore-missing-imports --disable-error-code attr-defined src/')

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 741 to 752, the code adds the return
value of get_rag_toolgroups (which can be None) to mcp_toolgroups causing a
TypeError and mypy failure; call get_rag_toolgroups into an intermediate
variable, coerce None to an empty list (e.g., rag_toolgroups =
get_rag_toolgroups(vector_db_ids) or []), then set toolgroups = rag_toolgroups +
mcp_toolgroups (or toolgroups = mcp_toolgroups if no vector_db_ids), and finally
convert an empty toolgroups list to None to preserve existing behavior.

-        if not toolgroups:
+        if toolgroups == []:
Comment on lines +769 to +776
⚠️ Potential issue | 🔴 Critical

Fix type error: get_rag_toolgroups returns None which cannot be concatenated.

The pipeline reports Operator "+" not supported for "None" because get_rag_toolgroups() can return None. When vector_db_ids is truthy but get_rag_toolgroups returns None, the + operation fails.

         toolgroups = None
         if vector_db_ids:
-            toolgroups = get_rag_toolgroups(vector_db_ids) + mcp_toolgroups
+            rag_toolgroups = get_rag_toolgroups(vector_db_ids)
+            toolgroups = (rag_toolgroups or []) + mcp_toolgroups
         elif mcp_toolgroups:
             toolgroups = mcp_toolgroups
📝 Committable suggestion

Suggested change
         toolgroups = None
         if vector_db_ids:
-            toolgroups = get_rag_toolgroups(vector_db_ids) + mcp_toolgroups
+            rag_toolgroups = get_rag_toolgroups(vector_db_ids)
+            toolgroups = (rag_toolgroups or []) + mcp_toolgroups
         elif mcp_toolgroups:
             toolgroups = mcp_toolgroups
         # Convert empty list to None for consistency with existing behavior
         if toolgroups == []:
🧰 Tools
🪛 GitHub Actions: Pyright

[error] 753-753: Python type checker (Pyright): Operator "+" not supported for "None" (reportOptionalOperand)

🪛 GitHub Actions: Type checks

[error] 753-753: mypy: Unsupported left operand type for + ("None")

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 751-758,
get_rag_toolgroups(vector_db_ids) can return None which causes a TypeError when
using "+"; ensure you normalize its result to an empty list before concatenation
(e.g., assign rag_toolgroups = get_rag_toolgroups(vector_db_ids) or []), then
build toolgroups by concatenating rag_toolgroups with mcp_toolgroups only when
appropriate, and preserve the existing behavior of converting an empty list to
None at the end.

toolgroups = None

# TODO: LCORE-881 - Remove if Llama Stack starts to support these mime types
@@ -776,8 +786,107 @@ async def retrieve_response(  # pylint: disable=too-many-locals,too-many-branche
# for doc in query_request.get_documents()
# ]

# Extract RAG chunks from vector DB query response BEFORE calling agent
rag_chunks = []
doc_ids_from_chunks = []
retrieved_chunks = []
retrieved_scores = []

try:
if vector_db_ids:
vector_db_id = vector_db_ids[0] # Use first available vector DB

params = {"k": 5, "score_threshold": 0.0}
logger.info(f"Initial params: {params}")
logger.info(f"query_request.solr: {query_request.solr}")
if query_request.solr:
# Pass the entire solr dict under the 'solr' key
params["solr"] = query_request.solr
logger.info(f"Final params with solr filters: {params}")
else:
logger.info("No solr filters provided")
logger.info(f"Final params being sent to vector_io.query: {params}")

query_response = await client.vector_io.query(
vector_db_id=vector_db_id, query=query_request.query, params=params
Comment on lines +795 to +811
⚠️ Potential issue | 🔴 Critical

Fix unbound variable vector_db_ids and type annotation for params.

Pipeline failures indicate:

  1. vector_db_ids is possibly unbound at line 778 (it's only defined inside the else branch at line 747-748)
  2. params dict type is incompatible with the expected signature at line 793
+    # Initialize vector_db_ids before conditional to avoid unbound variable
+    vector_db_ids: list[str] = []
+    
     # bypass tools and MCP servers if no_tools is True
     if query_request.no_tools:
         mcp_headers = {}
         agent.extra_headers = {}
         toolgroups = None
     else:
         # ... existing code ...
         vector_dbs = await client.vector_dbs.list()
         vector_db_ids = [vdb.identifier for vdb in vector_dbs]
         # ...

For the params type issue, explicitly type it:

-            params = {"k": 5, "score_threshold": 0.0}
+            params: dict[str, Any] = {"k": 5, "score_threshold": 0.0}

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 GitHub Actions: Pyright

[error] 778-778: Pyright: "vector_db_ids" is possibly unbound (reportPossiblyUnboundVariable)

🪛 GitHub Actions: Type checks

[error] 793-793: mypy: Argument "params" to "query" of "AsyncVectorIoResource" has incompatible type "dict[str, float]"; expected "dict[str, bool | float | str | Iterable[object] | object | None] | NotGiven"
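
Taken together, a sketch of the repaired section; the variable names follow the PR's code, and the exact placement relative to the no_tools branch is an assumption:

    # Bind vector_db_ids before any branch so it is never unbound
    vector_db_ids: list[str] = []
    if not query_request.no_tools:
        vector_dbs = await client.vector_dbs.list()
        vector_db_ids = [vdb.identifier for vdb in vector_dbs]

    if vector_db_ids:
        vector_db_id = vector_db_ids[0]  # use first available vector DB
        # Widen the value type so the nested "solr" dict is accepted by vector_io.query
        params: dict[str, Any] = {"k": 5, "score_threshold": 0.0}
        if query_request.solr:
            params["solr"] = query_request.solr
        query_response = await client.vector_io.query(
            vector_db_id=vector_db_id, query=query_request.query, params=params
        )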

)

logger.info(f"The query response total payload: {query_response}")
⚠️ Potential issue | 🟡 Minor

Avoid logging entire query response payloads at INFO level.

Logging the full query_response payload may expose sensitive document content. Log metadata only or move to DEBUG level.

-            logger.info(f"The query response total payload: {query_response}")
+            logger.debug("Query response chunk count: %d", len(query_response.chunks) if query_response.chunks else 0)
📝 Committable suggestion

Suggested change
-            logger.info(f"The query response total payload: {query_response}")
+            logger.debug("Query response chunk count: %d", len(query_response.chunks) if query_response.chunks else 0)
🤖 Prompt for AI Agents
In src/app/endpoints/query.py around line 796, the code currently logs the
entire query_response at INFO level which may expose sensitive document content;
change this to log only metadata (e.g., response size, number of documents,
document IDs, elapsed time) and move the message to DEBUG level instead of INFO;
replace the full-payload logger.info call with a logger.debug that prints
non-sensitive metadata and ensure any remaining detailed payload logging is
gated behind DEBUG and sanitized before logging.
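
A sketch of metadata-only DEBUG logging along those lines, using only fields already present in the PR (the doc_id extraction mirrors the loop later in the function):

            chunk_count = len(query_response.chunks) if query_response.chunks else 0
            logger.debug(
                "vector_io.query returned %d chunks (vector_db_id=%s, k=%s)",
                chunk_count,
                vector_db_id,
                params.get("k"),
            )
            if logger.isEnabledFor(logging.DEBUG):
                # Log document IDs only, never chunk contents
                doc_ids = [
                    c.metadata.get("doc_id")
                    for c in (query_response.chunks or [])
                    if getattr(c, "metadata", None)
                ]
                logger.debug("Retrieved doc_ids: %s", doc_ids)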


if query_response.chunks:
from models.responses import RAGChunk, ReferencedDocument

Comment on lines +817 to +818
🛠️ Refactor suggestion | 🟠 Major

Move imports to module level.

Importing RAGChunk and ReferencedDocument inside the function is inefficient and unconventional. These are already imported at lines 50-51.

             if query_response.chunks:
-                from models.responses import RAGChunk, ReferencedDocument
-
                 retrieved_chunks = query_response.chunks
📝 Committable suggestion

Suggested change
             if query_response.chunks:
-                from models.responses import RAGChunk, ReferencedDocument
-
                 retrieved_chunks = query_response.chunks
🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 799-800, there's a local import "from
models.responses import RAGChunk, ReferencedDocument" inside the function;
remove this function-level import and rely on module-level imports
instead—ensure RAGChunk and ReferencedDocument are imported at the top of the
file (around lines 50-51), adding them there if missing, then delete the local
import to avoid runtime overhead and follow conventional import placement.

retrieved_chunks = query_response.chunks
retrieved_scores = (
query_response.scores if hasattr(query_response, "scores") else []
)

# Extract doc_ids from chunks for referenced_documents
metadata_doc_ids = set()
for chunk in query_response.chunks:
metadata = getattr(chunk, "metadata", None)
if metadata and "doc_id" in metadata:
reference_doc = metadata["doc_id"]
logger.info(reference_doc)
if reference_doc and reference_doc not in metadata_doc_ids:
metadata_doc_ids.add(reference_doc)
doc_ids_from_chunks.append(
ReferencedDocument(
doc_title=metadata.get("title", None),
doc_url="https://mimir.corp.redhat.com"
+ reference_doc,
)
)

logger.info(
f"Extracted {len(doc_ids_from_chunks)} unique document IDs from chunks"
)

except Exception as e:
logger.warning(f"Failed to query vector database for chunks: {e}")
logger.debug(f"Vector DB query error details: {traceback.format_exc()}")
# Continue without RAG chunks

# Convert retrieved chunks to RAGChunk format
for i, chunk in enumerate(retrieved_chunks):
# Extract source from chunk metadata based on OFFLINE flag
source = None
if chunk.metadata:
if OFFLINE:
parent_id = chunk.metadata.get("parent_id")
if parent_id:
source = urljoin("https://mimir.corp.redhat.com", parent_id)
else:
source = chunk.metadata.get("reference_url")

# Get score from retrieved_scores list if available
score = retrieved_scores[i] if i < len(retrieved_scores) else None

rag_chunks.append(
RAGChunk(
content=chunk.content,
source=source,
score=score,
)
)

logger.info(f"Retrieved {len(rag_chunks)} chunks from vector DB")

Comment on lines +851 to +874
⚠️ Potential issue | 🟠 Major

Normalize urljoin parent_id type and RAGChunk field types.

Within the loop that builds rag_chunks:

if OFFLINE:
    parent_id = chunk.metadata.get("parent_id")
    if parent_id:
        source = urljoin("https://mimir.corp.redhat.com", parent_id)
...
rag_chunks.append(
    RAGChunk(
        content=chunk.content,
        source=source,
        score=score,
    )
)

Problems:

  • parent_id is typed as object; passing it directly to urljoin triggers mypy/pyright errors and might not be a string at runtime.
  • RAGChunk.content is presumably str, but chunk.content is an interleaved content object.

Fix:

-            if OFFLINE:
-                parent_id = chunk.metadata.get("parent_id")
-                if parent_id:
-                    source = urljoin("https://mimir.corp.redhat.com", parent_id)
+            if OFFLINE:
+                parent_id = chunk.metadata.get("parent_id")
+                if parent_id and isinstance(parent_id, str):
+                    source = urljoin("https://mimir.corp.redhat.com", parent_id)
@@
-        rag_chunks.append(
-            RAGChunk(
-                content=chunk.content,
-                source=source,
-                score=score,
-            )
-        )
+        rag_chunks.append(
+            RAGChunk(
+                content=str(chunk.content) if chunk.content else "",
+                source=source if isinstance(source, str) else None,
+                score=score,
+            )
+        )

This resolves the urljoin type error and ensures RAGChunk fields have the expected primitive types.

Also applies to: 835-835

🧰 Tools
🪛 GitHub Actions: Type checks

[error] 835-835: mypy: Value of type variable 'AnyStr' of 'urljoin' cannot be 'object'. (Command: 'uv run mypy --explicit-package-bases --disallow-untyped-calls --disallow-untyped-defs --disallow-incomplete-defs --ignore-missing-imports --disable-error-code attr-defined src/')

🤖 Prompt for AI Agents
In src/app/endpoints/query.py around lines 828-851, normalize types before
constructing RAGChunk: ensure parent_id is coerced to a string before calling
urljoin (e.g. parent_id = chunk.metadata.get("parent_id"); if parent_id is not
None: parent_id = str(parent_id); source =
urljoin("https://mimir.corp.redhat.com", parent_id)), ensure content is a
primitive string (extract a .text attribute if present or fallback to
str(chunk.content) so RAGChunk.content is str), and coerce score to a float or
None (e.g. score = float(retrieved_scores[i]) if i < len(retrieved_scores) and
retrieved_scores[i] is not None else None) before creating the RAGChunk.
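
As a sketch, the prompt's coercion approach applied to the loop body — an alternative to the isinstance guards in the diff above, under the same assumptions about chunk metadata:

        source = None
        if chunk.metadata:
            if OFFLINE:
                parent_id = chunk.metadata.get("parent_id")
                if parent_id is not None:
                    # Coerce to str so urljoin's AnyStr constraint is satisfied
                    source = urljoin("https://mimir.corp.redhat.com", str(parent_id))
            else:
                source = chunk.metadata.get("reference_url")

        # Coerce score to float or None before constructing the model
        score = (
            float(retrieved_scores[i])
            if i < len(retrieved_scores) and retrieved_scores[i] is not None
            else None
        )
        rag_chunks.append(
            RAGChunk(
                content=str(chunk.content) if chunk.content else "",
                source=source,
                score=score,
            )
        )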

# Format RAG context for injection into user message
rag_context = ""
if rag_chunks:
context_chunks = []
for chunk in rag_chunks[:5]: # Limit to top 5 chunks
chunk_text = f"Source: {chunk.source or 'Unknown'}\n{chunk.content}"
context_chunks.append(chunk_text)
rag_context = "\n\nRelevant documentation:\n" + "\n\n".join(context_chunks)
logger.info(f"Injecting {len(context_chunks)} RAG chunks into user message")

# Inject RAG context into user message
user_content = query_request.query + rag_context

         response = await agent.create_turn(
-            messages=[UserMessage(role="user", content=query_request.query).model_dump()],
+            messages=[UserMessage(role="user", content=user_content)],
session_id=session_id,
# documents=documents,
stream=False,
@@ -795,12 +904,14 @@ async def retrieve_response(  # pylint: disable=too-many-locals,too-many-branche
else ""
),
tool_calls=[],
tool_results=[],
-            rag_chunks=[],
+            rag_chunks=rag_chunks,
)

referenced_documents = parse_referenced_documents(response)

# Add documents from Solr chunks to referenced_documents
referenced_documents.extend(doc_ids_from_chunks)

# Update token count metrics and extract token usage in one call
model_label = model_id.split("/", 1)[1] if "/" in model_id else model_id
token_usage = extract_and_update_token_metrics(