Create Image Indexing Example #9321

@sjrl

In addition to the ImageFileToImageContent and PDFToImageContent components, we should add an indexing example that shows how to use these conversion components to convert image files into Haystack Documents and then write those Documents into a database.
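
Because two different converters are involved (one for image files, one for PDFs), an indexing example first has to split incoming files by type. The following is a minimal sketch of that step in plain Python using only the standard library. The function name `route_by_mime_type` and the bucket names are hypothetical and chosen for illustration; in an actual Haystack pipeline a routing component would presumably play this role.

```python
import mimetypes
from collections import defaultdict
from pathlib import Path


def route_by_mime_type(paths):
    """Group file paths into 'image', 'pdf', and 'unclassified' buckets.

    Hypothetical helper: the MIME type is guessed from the file extension
    only, so the files do not need to exist on disk.
    """
    buckets = defaultdict(list)
    for path in paths:
        mime, _ = mimetypes.guess_type(str(path))
        if mime == "application/pdf":
            buckets["pdf"].append(Path(path))
        elif mime is not None and mime.startswith("image/"):
            buckets["image"].append(Path(path))
        else:
            buckets["unclassified"].append(Path(path))
    return dict(buckets)


# Example file names are made up for illustration.
routed = route_by_mime_type(["scan.pdf", "photo.jpg", "chart.png", "notes.txt"])
```

Each bucket would then be handed to the matching converter: the `image` bucket to the image-file converter and the `pdf` bucket to the PDF converter.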

For inspiration we should consult https://github.com/deepset-ai/dc-pipeline-templates/blob/main/templates/Vision_gpt4o_en_indexing.yaml

It requires both the ImageFileToImageContent and PDFToImageContent converters, plus some additional components (e.g. ChatPromptBuilders and ChatGenerators) to perform image captioning with an LLM, and then further components to convert the image caption and ImageContent.meta back into a Haystack Document. We may therefore want to consider adding a DocumentBuilder component to help with this process.
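
To make the last step concrete, here is a rough sketch of what a DocumentBuilder-style helper might do: pair each LLM-generated caption with the metadata of the image it describes. This is plain Python with stand-in dataclasses, not Haystack's real classes, so that it runs without dependencies. `ImageContentStub`, `DocumentStub`, and `build_documents` are all hypothetical names, and the actual component interface is still to be designed.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ImageContentStub:
    """Stand-in for an image content object; only its metadata is used here."""
    meta: dict[str, Any] = field(default_factory=dict)


@dataclass
class DocumentStub:
    """Stand-in for a Document holding text content plus metadata."""
    content: str
    meta: dict[str, Any] = field(default_factory=dict)


def build_documents(captions, images):
    """Pair each caption with the metadata of the image it describes.

    Assumes captions[i] was generated for images[i].
    """
    if len(captions) != len(images):
        raise ValueError("Expected exactly one caption per image.")
    return [
        # Copy the metadata so later edits to the Document do not
        # mutate the original image object.
        DocumentStub(content=caption.strip(), meta=dict(image.meta))
        for caption, image in zip(captions, images)
    ]


# Example values are made up for illustration.
docs = build_documents(
    ["  A cat sleeping on a sofa. "],
    [ImageContentStub(meta={"file_path": "cat.jpg"})],
)
```

The resulting objects carry the caption as searchable text and keep `file_path` (or any other metadata) so a retrieved Document can be traced back to its source image.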

This would be valuable both for exploring the full flow and as material to share with users.

We would like to make two examples:

  1. One using image/text embedders like CLIP (a preliminary example can be found in the PR description of "feat: Add ImageFileToDocument converter", haystack-experimental#336)
  2. One using LLMContentExtractor/LLMDocumentEnricher and only text retrieval (a preliminary example can be found in the PR description of "feat: Add LLMDocumentContentExtractor to enable Vision-based LLMs to describe/convert an image into text", haystack-experimental#338)
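
The difference between the two examples can be illustrated with a toy sketch in plain Python. All values here are made up: the three-dimensional vectors stand in for what a CLIP-style embedder would produce, the captions stand in for LLM-generated text, and simple token overlap stands in for a real text retriever. None of this reflects the actual Haystack components.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Variant 1: images and text queries are embedded into one shared space,
# so retrieval compares the query vector with stored image vectors.
image_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "invoice.pdf#page=1": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]  # hypothetical embedding of "sleeping cat"
best_by_embedding = max(
    image_index, key=lambda name: cosine(query_vector, image_index[name])
)

# Variant 2: each image is first turned into text, and retrieval runs
# over that text only.
caption_index = {
    "cat.jpg": "a cat sleeping on a sofa",
    "invoice.pdf#page=1": "an invoice listing three line items and a total",
}


def overlap(query, text):
    """Number of distinct lowercase tokens shared by query and text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


best_by_text = max(
    caption_index, key=lambda name: overlap("sleeping cat", caption_index[name])
)
```

In the first variant the index stores vectors derived from the images themselves; in the second it stores only text, so any existing text retriever can be reused unchanged.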

Labels: P1 (High priority, add to the next sprint)
