I'm excited to implement Neuron in one of my projects. However, I'm curious how the stream differs between chat and RAG, and why a separate function is needed. In a current (non-Neuron) project, I simply have a tool which creates an embedding from the user prompt, searches the vector store, and then returns the results to the LLM. In that case the same LLM is used, and retrieval only happens when needed. Clarity on this would be appreciated. Thanks
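For context, here's a rough sketch of the tool-based flow I'm describing. None of these names (`embed`, `vector_store.search`, `llm.chat`) are Neuron APIs; they're hypothetical stand-ins just to show where the extra LLM round trip happens:

```python
# Hypothetical sketch of retrieval-as-a-tool (not Neuron's API).

def search_docs(query: str, vector_store, embed) -> str:
    """Tool the LLM can call: embed the query, search the store, return snippets."""
    query_vector = embed(query)                      # turn the user prompt into an embedding
    hits = vector_store.search(query_vector, k=3)    # nearest-neighbour lookup
    return "\n".join(hit.text for hit in hits)       # plain text the LLM can read

def answer(user_prompt: str, llm, vector_store, embed) -> str:
    # First call: the LLM decides whether retrieval is needed at all.
    decision = llm.chat(user_prompt, tools=[search_docs])
    if decision.tool_call:                           # retrieval only happens on demand
        context = search_docs(decision.tool_call.arguments["query"], vector_store, embed)
        # Second call: the LLM answers with the retrieved context included.
        return llm.chat(user_prompt, context=context).text
    return decision.text
```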
-
Having played around with it now: the RAG stream is used when you want to search the vector DB and include the results with every interaction with the LLM. For example, if you have an agent which simply returns results from your company wiki, there is no point setting it up as a tool, where an extra LLM call is needed. Otherwise, setting RAG up as a tool works fine.
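To make the contrast concrete, here is the "retrieve on every interaction" version, using the same hypothetical `embed` / `vector_store` / `llm` stand-ins as the sketch in the question (again, not Neuron's actual API). Retrieval is unconditional and there is only one LLM call per turn:

```python
# Hypothetical sketch of always-on RAG (not Neuron's API).

def rag_answer(user_prompt: str, llm, vector_store, embed) -> str:
    query_vector = embed(user_prompt)                 # retrieval happens on every turn
    hits = vector_store.search(query_vector, k=3)
    context = "\n".join(hit.text for hit in hits)
    # Single call: the retrieved context is injected into every interaction.
    return llm.chat(user_prompt, context=context).text
```

So the trade-off is: the tool approach lets the model skip retrieval when it isn't needed, at the cost of an extra LLM call when it is; the RAG stream skips that decision entirely because every answer should be grounded in the vector DB anyway.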