
End-to-end PDF RAG: FastAPI + Streamlit UI, Qdrant, and RAG workflows powered by LangChain/LangGraph. Dockerized with caching, optional GPU, and Prometheus/Grafana/Loki.


mertafacan/end-to-end-pdf-rag-system


🚀 End-to-End PDF RAG System

🎬 Streamlit Demo

📊 Grafana Dashboard & Prometheus Alerts


Python 3.12+ · FastAPI · Streamlit App · Docker · Qdrant (Vector DB) · Grafana Dashboard · Prometheus Metrics · LangChain · LangGraph · Loki Logs

📊 Project Overview

This project is a Retrieval-Augmented Generation (RAG) system built to simplify document management and information access. Users upload PDF files and then ask questions, which the system answers from the content of those documents. The example use case is information retrieval from HR documents.


⚡ Core Components

| Component | Path | Contents |
|---|---|---|
| 🌐 API Layer | src/api.py | REST API endpoints, session management, monitoring & metrics collection |
| 🧠 Core Logic | src/helper_func.py | PDF processing & text extraction, RAG workflow orchestration, model management/optimization, caching |
| 🖥️ Web UI | src/app.py | Streamlit-based UI, document upload & management, Q&A interaction interface, result visualization |
| 📝 Logging | src/loki_logger.py | Loki integration, Trace ID tracking, structured logging, performance analysis |
| 📊 Monitoring | - | Grafana dashboards, Prometheus metrics, automatic alert rules, real-time monitoring |

🎯 Goals & Features

  • Intelligent RAG Workflow: retrieval, rerank, reflection, multi-hop support
  • Performance: caching, GPU support, asynchronous processing, model warmup/preloading
  • Monitoring & Logging: Prometheus, Grafana, Loki integration
  • Scalability: containerization
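The caching feature listed above can be illustrated with a minimal sketch. This is not the project's implementation: the class name `EmbeddingCache`, the hit/miss counters, and the stand-in `toy_embed` function are all hypothetical, and a real setup would wrap an actual embedding model.

```python
import hashlib
from typing import Callable


class EmbeddingCache:
    """Memoize an expensive embedding function, keyed by a hash of the text."""

    def __init__(self, embed_fn: Callable[[str], list[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: dict[str, list[float]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, text: str) -> list[float]:
        # Hash the content so identical chunks share one cache entry.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        vector = self._embed_fn(text)
        self._store[key] = vector
        return vector


def toy_embed(text: str) -> list[float]:
    # Hypothetical stand-in for a real model: [length, vowel count].
    vowels = sum(ch in "aeiou" for ch in text.lower())
    return [float(len(text)), float(vowels)]
```

Calling `EmbeddingCache(toy_embed).get(...)` twice with the same text computes the vector once and serves the second call from memory, which is the kind of saving a cache gives when the same chunk or question is seen repeatedly.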

✨ Tech Stack

🏗️ Architecture & Infrastructure

  • Python 3.12+: modern language features and type hints
  • Docker & Docker Compose: containerized, reproducible services
  • Grafana & Prometheus: metrics collection and visualization
  • Loki: structured log aggregation and querying

🌐 Application Layer

  • FastAPI + Uvicorn: high-performance API layer
  • Streamlit: interactive web UI

💻 Development Environment

  • uv: fast package/env management and command runner (uv sync, uv run)

🚀 Setup & Run

Requirements

  • Python 3.12+
  • uv (recommended package/env manager)
  • Docker & Docker Compose
pip install uv # if not installed

Steps

1) Clone the Repository

git clone https://github.com/mertafacan/end-to-end-pdf-rag-system.git
cd end-to-end-pdf-rag-system

2) Configure Environment Variables

cp .env.example .env

3) Install Dependencies (uv)

# Create the environment
uv venv

# Activate the environment

# Linux/Mac:
source .venv/bin/activate

# Windows:
.venv\Scripts\activate

# Install dependencies
uv sync

4) Start Docker Services

cd config
docker-compose up -d

5) Start the Application

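With uv, run each command in its own terminal, starting from the repository root (step 4 leaves the shell in config/, so run cd .. first):

cd src && uv run uvicorn api:app --port 8000 --reload
cd src && uv run streamlit run app.py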
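Once the API is running, a quick way to exercise it without the Streamlit UI is a small client script. The sketch below uses only the standard library and targets the POST /ask endpoint on port 8000. The JSON field name `question` and the assumption that the response body is JSON are guesses, not confirmed by this README; check src/api.py for the actual request schema.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # port taken from the uvicorn command above


def build_ask_request(question: str, base_url: str = API_BASE) -> urllib.request.Request:
    """Build a POST /ask request. The 'question' field name is an assumption."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, timeout: float = 30.0) -> dict:
    """Send the question and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(build_ask_request(question), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect or adjust if the real schema differs.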

Available Services


πŸ—οΈ Project Architecture

πŸ”§ Architecture

flowchart TB
  U[User] --> C[Streamlit Client]

  C -- Upload PDF --> INDEX[POST /index]
  INDEX --> CH[PDF / pages / chunks]
  CH --> EMB[Embedding]
  EMB --> VDB[Qdrant Vector DB]

  C -- Question --> ASK[POST /ask]
  ASK --> RET[Retriever - Qdrant]
  RET --> RER[Optional Reranker - CrossEncoder]
  RET --> LG[LangGraph - retrieve / decide / generate / reflect]
  RER --> LG
  LG --> LLM[LLM - ChatLiteLLM]
  LLM --> C

  PROM[Prometheus /metrics] --- GRAF[Grafana Dashboard]
  LOKI[Loki & Console Logs - trace_id] --- GRAF
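The question path in the flowchart (retrieve, optionally rerank, then generate) can be sketched in plain Python. Everything here is a toy stand-in: word overlap replaces Qdrant similarity search, a length-based tiebreak replaces the CrossEncoder, and a formatted string replaces the LLM call. The function names are hypothetical and do not come from src/helper_func.py.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float = 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 3) -> list[Chunk]:
    """Toy retriever: score each chunk by the number of words shared with the question."""
    q_words = set(question.lower().split())
    scored = [Chunk(c, float(len(q_words & set(c.lower().split())))) for c in chunks]
    scored.sort(key=lambda c: c.score, reverse=True)
    return scored[:top_k]


def rerank(candidates: list[Chunk]) -> list[Chunk]:
    """Toy reranker: keep score order, prefer shorter chunks on ties."""
    return sorted(candidates, key=lambda c: (-c.score, len(c.text)))


def answer(question: str, chunks: list[str], use_reranker: bool = True) -> str:
    """Run retrieve -> optional rerank -> generate, mirroring the flowchart."""
    candidates = retrieve(question, chunks)
    if use_reranker:
        candidates = rerank(candidates)
    if not candidates or candidates[0].score == 0:
        return "No relevant context found."
    # A real system would pass the context to the LLM here.
    return f"Context used: {candidates[0].text}"
```

The `use_reranker` flag reflects the two arrows out of the retriever in the diagram: results reach the generation step either directly or through the reranker.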

📁 Directory Structure

src/
├── api.py              # FastAPI endpoints
├── app.py              # Streamlit UI
├── helper_func.py      # Business logic
├── loki_logger.py      # Logging system
└── uploaded_docs/      # Uploaded documents

config/
├── alert_rules.yml     # Prometheus alert rules
├── docker-compose.yml  # Docker services (Qdrant, Prometheus, Grafana, Loki)
├── loki.yml            # Loki log server configuration
└── prometheus.yml      # Prometheus metrics collection configuration

grafana/
└── provisioning/
    ├── dashboards/
    │   ├── dashboards.yml              # Dashboard provisioning
    │   ├── PDF rag-loki-logs.json      # Loki log dashboard
    │   └── rag-system-dashboard.json   # System dashboard
    └── datasources/
        └── prometheus.yml              # Prometheus & Loki data sources

🧩 Core Components & Responsibilities

src/api.py: API Layer

Exposes the REST API endpoints, handles session management and authentication/authorization, and collects metrics.

src/helper_func.py: Business Logic

PDF processing and text extraction, coordination of the RAG workflow, model management/optimization, and caching.
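As an illustration of the chunking step that sits between text extraction and embedding, here is a minimal character-based splitter with overlap. The function name, the default sizes, and the character-based strategy are assumptions for the sketch; the project may split by tokens, sentences, or pages instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

Overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of storing some text twice.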

src/app.py: Web UI

Document upload and management screens, Q&A interaction, and visualization of results (Streamlit).

src/loki_logger.py: Logging

Structured logging integrated with Loki, Trace ID tracking, and a rich log format for performance analysis.
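To show what structured logging with a trace ID can look like, here is a minimal sketch built on the standard `logging` module. It writes JSON lines to the console only. It does not ship logs to Loki, and the names `JsonFormatter`, `get_logger`, and `new_trace_id` are hypothetical rather than taken from src/loki_logger.py.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including an optional trace_id."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # trace_id is attached per call via logging's `extra` argument.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)


def new_trace_id() -> str:
    """Generate a random 32-character hex ID to correlate logs for one request."""
    return uuid.uuid4().hex


def get_logger(name: str = "rag") -> logging.Logger:
    """Return a logger that emits JSON lines to the console (stderr)."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A request handler would create one ID with `new_trace_id()` and pass it on every call, for example `get_logger().info("question received", extra={"trace_id": trace_id})`, so that all lines for that request can be filtered together.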

Configuration

  • config/alert_rules.yml: Prometheus alert rules (FastAPI latency, error rate, Qdrant, disk/RAM).
  • config/prometheus.yml: Metrics collection (FastAPI, Qdrant, system).
  • config/loki.yml: Loki logging configuration.
  • config/docker-compose.yml: Services: Qdrant, Prometheus, Grafana, Loki.

Grafana

  • grafana/provisioning/dashboards/*.json: Automatic dashboard provisioning (logs, system, RAG).
  • grafana/provisioning/datasources/prometheus.yml: Prometheus & Loki data sources.

Highlights

  • Alerts: latency (>1s), error rate (>10%), Qdrant health, disk/RAM.
  • Dashboards: real-time metrics & log visualization.
  • Logging: structured logs, Trace ID tracking.
  • Metrics: HTTP requests, Qdrant queries, LLM calls, resource usage.
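The alert thresholds listed above (latency over 1s, error rate over 10%) can be expressed as a small check in Python to make the conditions concrete. This is only an illustration of the logic. The real rules live in config/alert_rules.yml and are evaluated by Prometheus; the alert names and the use of average latency (rather than a percentile) are assumptions.

```python
from dataclasses import dataclass

LATENCY_THRESHOLD_S = 1.0     # "latency (>1s)" from the highlights
ERROR_RATE_THRESHOLD = 0.10   # "error rate (>10%)" from the highlights


@dataclass
class WindowStats:
    """Request statistics over one evaluation window."""
    total_requests: int
    failed_requests: int
    avg_latency_s: float


def firing_alerts(stats: WindowStats) -> list[str]:
    """Return the names (hypothetical) of alerts whose threshold is exceeded."""
    alerts: list[str] = []
    if stats.avg_latency_s > LATENCY_THRESHOLD_S:
        alerts.append("HighLatency")
    # Guard against division by zero when the window had no traffic.
    if stats.total_requests > 0:
        error_rate = stats.failed_requests / stats.total_requests
        if error_rate > ERROR_RATE_THRESHOLD:
            alerts.append("HighErrorRate")
    return alerts
```

Both comparisons are strict, so a window sitting exactly at 1s or exactly at 10% does not fire.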

📬 Contact

Mert Afacan – https://www.linkedin.com/in/mert-afacan/ – mert0afacan@gmail.com
