Introduce a custom system role #130
Open
This PR focuses on addressing two open issues by introducing a custom system role.

Initially, there are no apparent issues with either `litellm` or Groq. However, this flexible approach will address both concerns.

`snippet.py`

The code was initially intended to function with a hardcoded system role. However, some LLMs require customization. This PR introduces a new `custom_role` parameter to address the above issues; setting `custom_role="user"` should then solve them.

Below are some notes and snippets to help refresh my memory whenever I revisit this code. They also give newcomers an overview of the underlying processes.
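The idea behind the new parameter can be illustrated with a small sketch. Everything below is illustrative: `build_messages`, `DEFAULT_SYSTEM_PROMPT`, and the payload shape are hypothetical stand-ins, not the actual py-zerox internals, and the assumption that some providers reject a `system` message alongside image content is my reading of the issue, not a confirmed fact from this PR.

```python
# Hypothetical sketch: thread a configurable role into the chat payload.
# `build_messages` and `DEFAULT_SYSTEM_PROMPT` are illustrative names only.
DEFAULT_SYSTEM_PROMPT = "Convert the following page to markdown."

def build_messages(image_url: str, custom_role: str = "system") -> list:
    """Build the chat payload, letting callers override the prompt's role.

    Assumption: some providers reject a `system` message next to image
    content, so `custom_role="user"` sends the prompt as a user turn instead.
    """
    return [
        {"role": custom_role, "content": DEFAULT_SYSTEM_PROMPT},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        },
    ]
```

The default keeps the original hardcoded behavior, so existing callers are unaffected.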
### additional notes and snippets
The `py-zerox` library is designed to interact with LLMs via API, using the `litellm` library to ensure everything runs smoothly. Before jumping into action, `py-zerox` performs a couple of important checks using `litellm` methods:

#### model validation

The `validate_model(self)` method uses `litellm.supports_vision(model=self.model)` to confirm that the model is indeed a vision model. Essentially, `litellm` checks a comprehensive JSON map with detailed information on various LLM options to ensure compatibility.

##### snippets
```sh
({
  # https://docs.python.org/3/library/logging.html#logging-levels
  # https://github.com/BerriAI/litellm/blob/ea8f0913c2aac4a7b4ec53585cfe9ea04a3282de/litellm/_logging.py#L11
  export LITELLM_LOG=CRITICAL

  output=$(python3 <<EOF
import litellm

# https://github.com/BerriAI/litellm/blob/11932d0576a073d83f38a418cbdf6b2d8d4ff46f/litellm/litellm_core_utils/get_llm_provider_logic.py#L322
litellm.suppress_debug_info = True
# https://docs.litellm.ai/docs/debugging/local_debugging#set-verbose
litellm.set_verbose = False

model = "bedrock/amazon.nova-pro-v1:0"
model = "bedrock/anthropic.claude-3-haiku-20240307-v1:0"
model = "bedrock/amazon.nova-lite-v1:0"
model = "groq/llama-3.2-11b-vision-preview"

isVisionModel = litellm.supports_vision(model)
print("Does %s support vision? %s" % (model, isVisionModel))
EOF
)
  echo $output
})
```

#### access validation
The `validate_access(self)` method uses `litellm.check_valid_key(model=self.model, api_key=None)` to verify access to the model. This check ensures that environment variables are correctly set with proper values.

In short, `litellm` performs a simple API request to the given LLM. If any issues arise, it simply returns `False`; otherwise, it returns `True` for a successful outcome.

##### snippets
```sh
({
  # https://docs.python.org/3/library/logging.html#logging-levels
  # https://github.com/BerriAI/litellm/blob/ea8f0913c2aac4a7b4ec53585cfe9ea04a3282de/litellm/_logging.py#L11
  export LITELLM_LOG=DEBUG
  export GROQ_API_KEY="xxxxxxxxxxxxxxxx"

  output=$(python3 <<EOF
import litellm

# https://github.com/BerriAI/litellm/blob/11932d0576a073d83f38a418cbdf6b2d8d4ff46f/litellm/litellm_core_utils/get_llm_provider_logic.py#L322
litellm.suppress_debug_info = False
# https://docs.litellm.ai/docs/debugging/local_debugging#set-verbose
litellm.set_verbose = True

model = "bedrock/amazon.nova-pro-v1:0"
model = "bedrock/anthropic.claude-3-haiku-20240307-v1:0"
model = "bedrock/amazon.nova-lite-v1:0"
model = "groq/llama-3.2-11b-vision-preview"

isAllSet = litellm.check_valid_key(model, api_key=None)
print("Does %s have everything set? %s" % (model, isAllSet))
EOF
)
  echo $output
})
```

Once these checks are completed, `py-zerox` begins using `litellm` to convert PDFs into markdown format.

Thanks!
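The overall flow described above (validate the model, then convert page images to markdown) can be sketched end to end. To keep the sketch runnable offline, `fake_supports_vision` and `fake_completion` below are hypothetical stand-ins for `litellm.supports_vision` and `litellm.completion`; the control flow, not the real py-zerox code, is what is being shown.

```python
# Illustrative end-to-end flow with offline stand-ins for litellm calls.
def fake_supports_vision(model: str) -> bool:
    # litellm consults a bundled JSON model map; here we hardcode one entry.
    return model == "groq/llama-3.2-11b-vision-preview"

def fake_completion(model: str, messages: list) -> str:
    # The real call hits the provider's API; we return canned markdown.
    return "# Page 1\n\nExtracted text..."

def pdf_page_to_markdown(model: str, page_image_url: str) -> str:
    """Validate the model first, then ask it to transcribe one page image."""
    if not fake_supports_vision(model):
        raise ValueError(f"{model} is not a vision model")
    messages = [
        {"role": "user", "content": [
            {"type": "image_url", "image_url": {"url": page_image_url}},
        ]},
    ]
    return fake_completion(model, messages)
```

In the real library, a failed validation surfaces before any paid API call is made, which is the point of running both checks up front.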