
Ollama client (from adalflow.components.model_client.ollama_client) does not work with stream=True #299

@debasisdwivedy

Description

Bug description

The application does not work with stream set to True. The OllamaClient in adalflow.components.model_client.ollama_client handles streaming input with the methods below:

def parse_stream_response(completion: GeneratorType) -> Any:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    for chunk in completion:
        log.debug(f"Raw chunk: {chunk}")
        raw_response = chunk["response"] if "response" in chunk else None
        yield GeneratorOutput(data=None, raw_response=raw_response)


def parse_chat_completion(
        self, completion: Union[GenerateResponse, GeneratorType]
    ) -> GeneratorOutput:
        """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
        log.debug(f"completion: {completion}, {isinstance(completion, GeneratorType)}")
        if isinstance(completion, GeneratorType):  # streaming
            return parse_stream_response(completion)
        else:
            return parse_generate_response(completion)

Because parse_stream_response uses yield, the caller receives a generator and would need a loop to collect all the tokens. Is there a reason to use yield instead of return?
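
A minimal stand-alone sketch (not AdalFlow code) of why yield breaks the caller: any function body containing yield makes the call return a generator object immediately, and downstream code that expects a GeneratorOutput with a raw_response attribute then fails, which matches the error below.

def parse_stream_response_like(chunks):
    # The presence of `yield` means calling this returns a generator object.
    for chunk in chunks:
        yield chunk.get("response")

result = parse_stream_response_like([{"response": "Paris"}])
print(type(result))  # <class 'generator'>
# result.raw_response  -> AttributeError: 'generator' object has no attribute 'raw_response'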

There are two ways to go about solving this:

SOLUTION 1

Change yield to return and accumulate the tokens inside parse_stream_response.

def parse_stream_response(completion: GeneratorType) -> Any:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    gen_output = GeneratorOutput(data=None, raw_response='')
    for chunk in completion:
        log.debug(f"Raw chunk: {chunk}")
        raw_response = chunk["response"] if "response" in chunk else None
        if raw_response:
            gen_output.raw_response += raw_response
    return gen_output

def parse_chat_completion(
        self, completion: Union[GenerateResponse, GeneratorType]
    ) -> GeneratorOutput:
        """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
        log.debug(f"completion: {completion}, {isinstance(completion, GeneratorType)}")
        if isinstance(completion, GeneratorType):  # streaming
            return parse_stream_response(completion)
        else:
            return parse_generate_response(completion)
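
With Solution 1 the caller gets a single GeneratorOutput back, so .raw_response is a plain string. A hedged usage sketch with a fake chunk stream, assuming GeneratorOutput is importable from adalflow.core.types; fold_stream_response is just a trimmed stand-in (logging omitted) for the function above:

from adalflow.core.types import GeneratorOutput

def fold_stream_response(completion):
    # Fold the streamed chunks into one GeneratorOutput.
    gen_output = GeneratorOutput(data=None, raw_response="")
    for chunk in completion:
        raw_response = chunk["response"] if "response" in chunk else None
        if raw_response:
            gen_output.raw_response += raw_response
    return gen_output

fake_stream = iter([{"response": "The capital of France "}, {"response": "is Paris."}])
print(fold_stream_response(fake_stream).raw_response)
# The capital of France is Paris.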

SOLUTION 2

Change parse_chat_completion to collect all the tokens and then return the GeneratorOutput.

def parse_stream_response(completion: GeneratorType) -> Any:
    """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
    for chunk in completion:
        log.debug(f"Raw chunk: {chunk}")
        raw_response = chunk["response"] if "response" in chunk else None
        yield raw_response

def parse_chat_completion(
        self, completion: Union[GenerateResponse, GeneratorType]
    ) -> GeneratorOutput:
        """Parse the completion to a str. We use the generate with prompt instead of chat with messages."""
        log.debug(f"completion: {completion}, {isinstance(completion, GeneratorType)}")
        if isinstance(completion, GeneratorType):  # streaming
            gen_output = GeneratorOutput(data=None, raw_response='')
            tokens = parse_stream_response(completion)

            for token in tokens:
                if token:
                    gen_output.raw_response += token
            return gen_output
        else:
            return parse_generate_response(completion)

One thing to remember: for the async implementation we would have to create an async_parse_chat_completion, since parse_chat_completion would not work for asynchronous calls.
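
A hedged sketch of what the async counterpart could look like, assuming the async Ollama client yields the same chunk dicts; async_parse_stream_response is a proposed helper name, not an existing AdalFlow method:

import asyncio
from typing import Any, AsyncGenerator

async def async_parse_stream_response(completion: AsyncGenerator[dict, Any]) -> str:
    # Accumulate streamed tokens from an async generator into one string.
    raw = ""
    async for chunk in completion:
        token = chunk.get("response")
        if token:
            raw += token
    return raw

async def _fake_stream():
    for part in ("Hello ", "world"):
        yield {"response": part}

print(asyncio.run(async_parse_stream_response(_fake_stream())))  # Hello world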

@liyin2015 Once this is reviewed and verified as an issue, I will go ahead and raise a PR for the implementation.

Regards,

What version are you seeing the problem on?

pip installed.

To get the version:

pip show adalflow

Name: adalflow
Version: 0.2.6
Summary: The Library to Build and Auto-optimize LLM Applications
Home-page: https://github.com/SylphAI-Inc/AdalFlow
Author: Li Yin
Author-email: li@sylphai.com
License: MIT
Location: <>
Requires: backoff, boto3, botocore, colorama, diskcache, jinja2, jsonlines, nest-asyncio, numpy, python-dotenv, pyyaml, tiktoken, tqdm
Required-by:

How to reproduce the bug

from adalflow.components.model_client.ollama_client import OllamaClient
from adalflow.core.generator import Generator

host = "127.0.0.1:11434"


ollama_ai = {
    "model_client": OllamaClient(host=host),
    "model_kwargs": {
        "model": "phi3:latest",
        "stream": True,
    },
}

generator = Generator(**ollama_ai)
output = generator({"input_str": "What is the capital of France?"})
print(output)
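
For reference, the shape of the streamed chunks can be inspected by calling the Ollama Python client directly; a hedged sketch, assuming the ollama package is installed and phi3:latest has been pulled locally:

import ollama

client = ollama.Client(host="127.0.0.1:11434")
# With stream=True, generate() returns an iterator of chunks; each chunk carries
# a partial "response" token, which is what parse_stream_response consumes.
for chunk in client.generate(model="phi3:latest", prompt="What is the capital of France?", stream=True):
    print(chunk["response"], end="", flush=True)
print()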


Error messages and logs

Error processing the output: 'generator' object has no attribute 'raw_response'
GeneratorOutput(id=None, data=None, error="'generator' object has no attribute 'raw_response'", usage=None, raw_response='<generator object Client._request.<locals>.inner at 0x12aa131b0>', metadata=None)



Environment

- OS: macOS 15.1 (Apple M1 Pro)

More info

_No response_
