Conversation

@pamelafox (Collaborator)

Purpose

This is an example PR, not intended for merging, demonstrating how to use Chat Completion API function calling to perform a more specific search query. In this case, I defined a "search_by_filename" function, and the model knows to call it when I enter a question like "Summarize the document named PerksPlus.pdf". I then pass the filename along to the search function and turn it into a filter.

This sort of approach could also work for searching by additional fields in an index. For example, if a field stores the timestamp of a document, you could search for documents created in a certain time period by adding an appropriate function and filter condition.
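For reference, a tool definition along these lines might look like the sketch below. This is a minimal illustration, not the exact code in this PR; the parameter schema and the sourcefile filter field are assumptions based on the default index setup.

    # Minimal sketch of a filename-search tool for the Chat Completions API.
    # Names and schema here are illustrative, not the exact code from this PR.
    search_by_filename_tool = {
        "type": "function",
        "function": {
            "name": "search_by_filename",
            "description": "Search for a document by its exact filename.",
            "parameters": {
                "type": "object",
                "properties": {
                    "filename": {
                        "type": "string",
                        "description": "The filename, e.g. 'PerksPlus.pdf'",
                    }
                },
                "required": ["filename"],
            },
        },
    }

    # If the model calls the tool, the argument can become an OData filter,
    # assuming the index stores filenames in a 'sourcefile' field:
    # search_filter = f"sourcefile eq '{filename}'"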

Screenshots:

[Screenshot 2025-09-17 at 7:07 AM]

@pamelafox pamelafox marked this pull request as draft September 17, 2025 14:07
@pamelafox pamelafox changed the title Demonstration of using tools to do more specialized queries [Demo PR] Demonstration of using tools to do more specialized queries Sep 17, 2025
@pamelafox pamelafox changed the title [Demo PR] Demonstration of using tools to do more specialized queries Demonstration of using tools to do more specialized queries Sep 17, 2025
@pamelafox pamelafox closed this Sep 17, 2025
@cforce (Contributor) commented Sep 17, 2025

Why is it not worth merging?

@pamelafox (Collaborator, Author)

I don't actually know that this is a common enough use case to warrant a special tool; I just wanted to make sure folks had an up-to-date example of how to add another tool (it came up in one of the issues).

@gaborvar

> Why is it not worth merging?

@cforce
It may be worth merging, but it depends on the scenario. Adding tool calls may entail more changes in the code. For example, once you start relying on tool calls, more details of the conversation history become important than the application preserves.

As long as LLM calls are limited to UI input/output, you don't need to keep a log of all the LLM calls that the backend makes. The backend does not pass data about its tool calls back to the client, so this detail is not recorded and not preserved in the message history. The client regenerates the list of historic messages every time a new prompt is entered, using only the information available on the client. Hence the backend will not know what it did with tool calls in previous conversation turns.
This is a fair assumption in the sample app. However, if the LLM has the option to make or skip potential API calls, then you need to preserve this fact in the application state. Otherwise the app will appear forgetful of its own earlier actions.

(Note: I have not followed the evolution of the code base since April 2025, so if full state persistence, including LLM and tool call parameters, has since been added to the app, my comment may be outdated.)

I implemented state persistence for the backend-frontend roundtrip when I added optional tool calls. The session_state parameter was helpful, but some workarounds were needed.

@pamelafox (Collaborator, Author)

@gaborvar That's a good assessment. The app does not yet use the standard approach of a tool-calling agent, which appends the tool call result to the conversation; instead, we extract the results and send a brand-new conversation with the formatted results to an LLM with a different system prompt. This made sense at the time, as many LLMs did not support tool calling properly, but now that many (non-local) LLMs do support tool calling, we can consider moving the repo to an actual tool-calling flow. There are drawbacks to that as well, however:

  • The more tools an LLM receives, the harder it is for it to select the right one, and the non-determinism increases. Evaluation becomes even more crucial.
  • It would be harder for someone to switch the query-rewriting phase to a different model if a single agent handled everything. Right now it's theoretically possible to change that model with slight code changes.

@gaborvar

@pamelafox You already removed the biggest hurdle to full message history preservation (including tool parameters and results) when you removed openai_messages_token_helper in March. It had blocked the pathway for tool-related messages.

You don't have to touch your existing message handling code. Instead, you can send the tool-related log out of band: there is a separate session_state pathway in the code base, encompassing the client and the server, which is perfect for this purpose.

Hurdle No. 2 is coexistence with a change to app.py in November, when the session_state variable was reused for a session ID, creating a conflict. This can be fixed easily by moving that ID into a member of a dict, so that the ID can coexist with the tool-related message history, as in the sketch below.
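A minimal sketch of that fix, with illustrative key names (not the repo's actual identifiers):

    # Hypothetical sketch: instead of storing the bare session ID in
    # session_state, store it as one member of a dict, so other state
    # (such as the tool-call log) can coexist with it.
    session_id = "abc123"  # previously stored directly in session_state
    session_state = {
        "session_id": session_id,
        "toolcall_related_messages_lists": [],  # per-turn tool-call messages
    }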

Hurdle No. 3: there are a few calls in the call chain that do not pass session_state on to the next call. This breaks the roundtrip of session_state, but it is easy to fix: add session_state as a parameter to, e.g., run_until_final_call(), and do the same for the new functions created to support the agentic call flow: run_agentic_retrieval(), run_search_approach(), etc.
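A hedged sketch of the idea; the real signatures in the repo differ, and only the session_state threading matters here:

    # Hypothetical signatures showing session_state passed along the call
    # chain so each step can read and update the tool-call log.
    async def run_search_approach(messages, overrides, session_state=None):
        # ...perform the search; append tool-call messages to
        # session_state["toolcall_related_messages_lists"] here...
        return messages

    async def run_until_final_call(messages, overrides, session_state=None):
        # Pass session_state on instead of dropping it
        return await run_search_approach(messages, overrides, session_state=session_state)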

To store the tool-related messages, I used a list with a node corresponding to each message in the message list from the frontend.
Since there is more than one message per tool call (typically two, or zero: one from the assistant signaling its wish to call the function, and a second message from the tool signaling the result to the LLM), each node is itself a list of messages. More calls per conversation turn are inherently supported, since a node can include more than two messages. Using a different LLM for each of these calls is also inherently supported. A typical example:

    {'toolcall_related_messages_lists': [[], [], [],
        [{'content': None, 'refusal': None, 'role': 'assistant', 'function_call': None,
          'tool_calls': [{'id': 'call_QxB3DNjFlYEdkQmhKFhXNxkS',
                          'function': {'arguments': '{"search_query":"válóperek"}', 'name': 'search_sources'},
                          'type': 'function'}]},
         {'role': 'tool',
          'content': '{"search_sources": true, "result": ["redacted"]}',
          'tool_call_id': 'call_QxB3DNjFlYEdkQmhKFhXNxkS'}],
        [], ...

On the backend, you have to merge the message history with the tool-related calls from session_state before making your calls to the LLM. Code that merges the tool-related messages into the message list:

        merged_messages = []
        for i, toolcall_message_list in enumerate(session_state["toolcall_related_messages_lists"]):
            # If there is a tool-call message list for this turn, append all messages in it
            if toolcall_message_list:
                merged_messages.extend(toolcall_message_list)
            # Append the corresponding frontend message
            merged_messages.append(messages[i])

This concept works in my fork.
(Note: I have not attempted to merge your commits since March 25, but it should be doable.)

@hugokoopmans

Hi Pamela, thanks for this. Just for my understanding: this code needs me to upgrade the repo to the new release from 19 Sept, I think?

@hugokoopmans

I now have issues with my codebase after merging:

    ERROR: error executing step command 'package --all': failed building service 'backend': failed invoking event handlers for 'prebuild', 'prebuild' hook failed with exit code: '2', Path: '/tmp/azd-prebuild-3352719687.sh'. : exit code: 2, stdout:

    up to date, audited 401 packages in 1s

    5 moderate severity vulnerabilities

    To address issues that do not require attention, run:
      npm audit fix

    To address all issues (including breaking changes), run:
      npm audit fix --force

    Run `npm audit` for details.

    > frontend@0.0.0 build
    > tsc && vite build

    src/components/GPT4VSettings/GPT4VSettings.tsx(7,10): error TS2305: Module '"../../api"' has no exported member 'GPT4VInput'.

Any quick suggestions for what I'm missing?

thank you

@pamelafox (Collaborator, Author)

GPT4VSettings does not exist in the current main; you can delete that file/folder.
