Create Image Indexing Example #9321

In addition to the `ImageFileToImageContent` and `PDFToImageContent` components, we should add an indexing example that shows how to use these converters to turn image files into Haystack Documents and then write those Documents into a database.

For inspiration we should consult https://github.com/deepset-ai/dc-pipeline-templates/blob/main/templates/Vision_gpt4o_en_indexing.yaml

It requires both the `ImageFileToImageContent` and `PDFToImageContent` converters, plus additional components (e.g. ChatPromptBuilders and ChatGenerators) to caption the images with an LLM. Further components are then needed to convert the image caption and `ImageContent.meta` back into a Haystack Document, so we may want to consider adding a `DocumentBuilder` component to help with this step.
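To make the last step concrete, here is a minimal sketch of what a `DocumentBuilder` could do. It is not an existing Haystack API: the `Document` class is a plain-Python stand-in for Haystack's, and the `run` signature (`captions`, `metas`) and the returned `documents` key are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Stand-in for a Haystack Document, reduced to content and meta."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


class DocumentBuilder:
    """Hypothetical component: pairs each LLM caption with its image's meta."""

    def run(self, captions: List[str], metas: List[Dict[str, Any]]) -> Dict[str, List[Document]]:
        # Captions and metas must line up one-to-one, in the same order.
        if len(captions) != len(metas):
            raise ValueError("captions and metas must have the same length")
        # Copy each meta dict so the resulting Documents do not share state
        # with the ImageContent objects they were derived from.
        documents = [
            Document(content=caption, meta=dict(meta))
            for caption, meta in zip(captions, metas)
        ]
        return {"documents": documents}


builder = DocumentBuilder()
result = builder.run(
    captions=["A bar chart of quarterly revenue."],
    metas=[{"file_path": "report.pdf", "page_number": 2}],
)
```

In a real pipeline the captions would come from the ChatGenerator replies and the metas from the `ImageContent` objects, and the resulting Documents would be passed on to a writer component.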

This would be valuable both for exploring the full flow and as material to share with users.

We would like to make two examples:
1. One using image/text embedders like CLIP (a preliminary example can be found in the PR description of https://github.com/deepset-ai/haystack-experimental/pull/336)
2. One using `LLMContentExtractor`/`LLMDocumentEnricher` and only text retrieval (a preliminary example can be found in the PR description of https://github.com/deepset-ai/haystack-experimental/pull/338)
