Description
In addition to the ImageFileToImageContent and PDFToImageContent components, we should add an indexing example that shows how to use these conversion components to convert image files into Haystack Documents and then write those Documents into a database.
For inspiration we should consult https://github.com/deepset-ai/dc-pipeline-templates/blob/main/templates/Vision_gpt4o_en_indexing.yaml
It requires both the ImageFileToImageContent and PDFToImageContent converters, plus additional components (e.g. ChatPromptBuilders and ChatGenerators) to caption the images with an LLM, and then further components to convert the image caption and ImageContent.meta back into a Haystack Document. We may therefore want to consider adding a DocumentBuilder component to help with this process.
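To make the last step concrete, here is a rough sketch of what a DocumentBuilder-style helper could do: pair each LLM caption with the metadata of the image it describes. It is plain Python and does not use the Haystack API. `ImageContentStub`, `DocumentStub`, and `build_documents` are hypothetical names made up for illustration, and the real component may look quite different.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ImageContentStub:
    """Hypothetical stand-in for an image content object; only meta matters here."""
    meta: dict[str, Any] = field(default_factory=dict)


@dataclass
class DocumentStub:
    """Hypothetical stand-in for a Document with text content and metadata."""
    content: str
    meta: dict[str, Any] = field(default_factory=dict)


def build_documents(
    captions: list[str], images: list[ImageContentStub]
) -> list[DocumentStub]:
    """Pair each LLM caption with the metadata of the image it describes."""
    if len(captions) != len(images):
        raise ValueError("Expected exactly one caption per image.")
    # Copy each meta dict so editing a document later does not mutate the image.
    return [
        DocumentStub(content=caption, meta=dict(image.meta))
        for caption, image in zip(captions, images)
    ]


# Assumed example data: one PDF page that was converted to an image and captioned.
images = [ImageContentStub(meta={"file_path": "report.pdf", "page_number": 1})]
docs = build_documents(["A bar chart of quarterly revenue."], images)
```

Carrying the image metadata over to the Document is what lets a retrieved caption be traced back to its source file and page.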
This would be valuable both for exploring the full flow and as material to share with users.
We would like to make two examples:
- One using image/text embedders like CLIP (a preliminary example can be found in the PR description of feat: Add ImageFileToDocument converter haystack-experimental#336)
- One using LLMContentExtractor/LLMDocumentEnricher and only text retrieval (a preliminary example can be found in the PR description of feat: Add LLMDocumentContentExtractor to enable Vision-based LLMs to describe/convert an image into text haystack-experimental#338)
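For the second variant, the idea is that once an LLM has turned each image into text, ordinary text retrieval is enough. The sketch below illustrates that with a toy token-overlap ranking in plain Python. The `indexed` mapping, the file paths, and the `retrieve` helper are assumptions made up for illustration; a real example would use a proper retriever (e.g. BM25 or text embeddings) over a document store.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, indexed: dict[str, str], top_k: int = 1) -> list[str]:
    """Rank image paths by how many query tokens appear in their extracted text."""
    query_tokens = tokenize(query)
    scored = [
        (len(query_tokens & tokenize(text)), path)
        for path, text in indexed.items()
    ]
    # Keep only entries sharing at least one token; highest overlap first,
    # ties broken by path so the ordering is deterministic.
    matches = sorted(
        (item for item in scored if item[0] > 0),
        key=lambda item: (-item[0], item[1]),
    )
    return [path for _, path in matches[:top_k]]


# Hypothetical output of the extraction step: image path -> LLM-generated text.
indexed = {
    "charts/revenue.png": "Bar chart of quarterly revenue by region.",
    "photos/office.jpg": "A photo of an open-plan office with desks.",
}
results = retrieve("quarterly revenue chart", indexed)
```

No image embeddings are involved at query time; the images are only reachable through the text that was extracted from them, which is the trade-off against the CLIP-based variant.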