### Bug description
The application does not work with `stream` set to `True`. The class `adalflow.components.model_client.ollama_client` has the following methods to handle streamed input:
```python
def parse_stream_response(completion: GeneratorType) -> Any:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    for chunk in completion:
        log.debug(f"Raw chunk: {chunk}")
        raw_response = chunk["response"] if "response" in chunk else None
        yield GeneratorOutput(data=None, raw_response=raw_response)


def parse_chat_completion(
    self, completion: Union[GenerateResponse, GeneratorType]
) -> GeneratorOutput:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    log.debug(f"completion: {completion}, {isinstance(completion, GeneratorType)}")
    if isinstance(completion, GeneratorType):  # streaming
        return parse_stream_response(completion)
    else:
        return parse_generate_response(completion)
```
Because `parse_stream_response` uses `yield`, the caller gets back a generator and would need a loop to collect all the tokens. Is there a reason to use `yield` instead of `return`?
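For illustration, here is a standalone sketch of why handing the generator back to the caller breaks code that expects a `GeneratorOutput` (the `fake_parse_stream_response` helper and the dummy chunks are hypothetical, not part of the library):

```python
# Hypothetical stand-in that mimics the shape of parse_stream_response.
def fake_parse_stream_response(completion):
    for chunk in completion:
        yield chunk.get("response")  # yields one token per chunk

# Dummy chunks in the same shape the Ollama client emits.
chunks = iter([{"response": "Paris"}, {"response": " is the capital."}])

parsed = fake_parse_stream_response(chunks)   # a generator object, not a GeneratorOutput
print(type(parsed))                           # <class 'generator'>
print(hasattr(parsed, "raw_response"))        # False -> downstream access raises AttributeError
```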
There are two ways to go about solving this:
SOLUTION 1
Change `yield` to `return` and accumulate the streamed chunks inside `parse_stream_response`:
```python
def parse_stream_response(completion: GeneratorType) -> Any:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    gen_output = GeneratorOutput(data=None, raw_response="")
    for chunk in completion:
        log.debug(f"Raw chunk: {chunk}")
        raw_response = chunk["response"] if "response" in chunk else None
        if raw_response is not None:
            gen_output.raw_response += raw_response
    return gen_output
```
```python
def parse_chat_completion(
    self, completion: Union[GenerateResponse, GeneratorType]
) -> GeneratorOutput:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    log.debug(f"completion: {completion}, {isinstance(completion, GeneratorType)}")
    if isinstance(completion, GeneratorType):  # streaming
        return parse_stream_response(completion)
    else:
        return parse_generate_response(completion)
```
SOLUTION 2
Keep `parse_stream_response` as a generator, and change `parse_chat_completion` to consume all the tokens and then return a single `GeneratorOutput`:
```python
def parse_stream_response(completion: GeneratorType) -> Any:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    for chunk in completion:
        log.debug(f"Raw chunk: {chunk}")
        raw_response = chunk["response"] if "response" in chunk else None
        yield raw_response


def parse_chat_completion(
    self, completion: Union[GenerateResponse, GeneratorType]
) -> GeneratorOutput:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    log.debug(f"completion: {completion}, {isinstance(completion, GeneratorType)}")
    if isinstance(completion, GeneratorType):  # streaming
        gen_output = GeneratorOutput(data=None, raw_response="")
        tokens = parse_stream_response(completion)
        for token in tokens:
            if token is not None:
                gen_output.raw_response += token
        return gen_output
    else:
        return parse_generate_response(completion)
```
One thing to remember: for the async path we would have to create an `async_parse_chat_completion`, since `parse_chat_completion` would not work for asynchronous calls.
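A minimal sketch of what that could look like, assuming the async client hands back an async generator of the same chunk dicts; apart from the name `async_parse_chat_completion` suggested above, the signature and types here are assumptions, not the library's actual API:

```python
from types import AsyncGeneratorType  # other names come from the surrounding ollama_client module


async def async_parse_chat_completion(
    self, completion: Union[GenerateResponse, AsyncGeneratorType]
) -> GeneratorOutput:
    """Sketch only: collect chunks from an async stream into one GeneratorOutput."""
    if isinstance(completion, AsyncGeneratorType):  # async streaming (assumed)
        gen_output = GeneratorOutput(data=None, raw_response="")
        async for chunk in completion:
            raw_response = chunk["response"] if "response" in chunk else None
            if raw_response is not None:
                gen_output.raw_response += raw_response
        return gen_output
    return parse_generate_response(completion)
```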
@liyin2015 Once this is reviewed and verified as an issue, I will go ahead and raise a PR for the implementation.
Regards,
### What version are you seeing the problem on?
Installed via pip. To get the version:

```
$ pip show adalflow
Name: adalflow
Version: 0.2.6
Summary: The Library to Build and Auto-optimize LLM Applications
Home-page: https://github.com/SylphAI-Inc/AdalFlow
Author: Li Yin
Author-email: li@sylphai.com
License: MIT
Location: <>
Requires: backoff, boto3, botocore, colorama, diskcache, jinja2, jsonlines, nest-asyncio, numpy, python-dotenv, pyyaml, tiktoken, tqdm
Required-by:
```
### How to reproduce the bug
```python
from adalflow.components.model_client.ollama_client import OllamaClient
from adalflow.core.generator import Generator

host = "127.0.0.1:11434"

ollama_ai = {
    "model_client": OllamaClient(host=host),
    "model_kwargs": {
        "model": "phi3:latest",
        "stream": True,
    },
}

generator = Generator(**ollama_ai)
output = generator({"input_str": "What is the capital of France?"})
print(output)
```
### Error messages and logs
```
Error processing the output: 'generator' object has no attribute 'raw_response'
GeneratorOutput(id=None, data=None, error="'generator' object has no attribute 'raw_response'", usage=None, raw_response='<generator object Client._request.<locals>.inner at 0x12aa131b0>', metadata=None)
```

Note that `raw_response` ends up holding the repr of the unconsumed generator, which confirms that `parse_chat_completion` returned the generator itself.
### Environment
- OS: macOS 15.1 (M1 Pro)
### More info
_No response_