Grafana Dashboard & Prometheus Alerts
This project is a modern Retrieval-Augmented Generation (RAG) system built to simplify document management and information access. Users upload PDF files and then ask questions about their contents. As an example use case, it demonstrates information retrieval from HR documents.
| Component | Path | Contents |
|---|---|---|
| API Layer | src/api.py | REST API endpoints, session management, monitoring & metrics collection |
| Core Logic | src/helper_func.py | PDF processing & text extraction, RAG workflow orchestration, model management/optimization, caching |
| Web UI | src/app.py | Streamlit-based UI, document upload & management, Q&A interaction interface, result visualization |
| Logging | src/loki_logger.py | Loki integration, Trace ID tracking, structured logging, performance analysis |
| Monitoring | - | Grafana dashboards, Prometheus metrics, automatic alert rules, real-time monitoring |
- Intelligent RAG Workflow: retrieval, rerank, reflection, multi-hop support
- Performance: caching, GPU support, asynchronous processing, model warmup/preloading
- Monitoring & Logging: Prometheus, Grafana, Loki integration
- Scalability: containerization
- Python 3.12+: modern language features and type hints
- Docker & Docker Compose: containerized, reproducible services
- Grafana & Prometheus: metrics collection and visualization
- Loki: structured log aggregation and querying
- FastAPI + Uvicorn: high-performance API layer
- Streamlit: interactive web UI
- uv: fast package/env management and command runner (uv sync, uv run)
- Python 3.12+
- uv (recommended package/env manager)
- Docker & Docker Compose
pip install uv # if not installed
git clone https://github.com/mertafacan/end-to-end-pdf-rag-system.git
cd end-to-end-pdf-rag-system
cp .env.example .env
# Create the environment
uv venv
# Activate the environment
# Linux/Mac:
source .venv/bin/activate
# Windows:
.venv\Scripts\activate
# Install dependencies
uv sync
cd config
docker-compose up -d
with uv:
cd src && uv run uvicorn api:app --port 8000 --reload
cd src && uv run streamlit run app.py
- API: http://localhost:8000/docs
- Web: http://localhost:8501
- Grafana: http://localhost:3000
- Prometheus: http://localhost:9090
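Once the API is running, questions can also be sent from a script instead of the Web UI. The following is a minimal sketch using only the Python standard library. It assumes the `POST /ask` endpoint accepts a JSON body with a `question` field; that field name and the helper names `build_ask_request` and `ask` are hypothetical, so check the schema at http://localhost:8000/docs before relying on it.

```python
import json
from urllib import request

API_BASE = "http://localhost:8000"  # API address from the list above


def build_ask_request(question: str, base_url: str = API_BASE) -> request.Request:
    """Build (but do not send) a POST /ask request.

    The JSON field name "question" is an assumption, not the confirmed schema.
    """
    body = json.dumps({"question": question}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, timeout: float = 30.0) -> dict:
    """Send the question to the running API and decode the JSON response."""
    with request.urlopen(build_ask_request(question), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect or adjust once the real request schema is known.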
flowchart TB
U[User] --> C[Streamlit Client]
C -- Upload PDF --> INDEX[POST /index]
INDEX --> CH[PDF / pages / chunks]
CH --> EMB[Embedding]
EMB --> VDB[Qdrant Vector DB]
C -- Question --> ASK[POST /ask]
ASK --> RET[Retriever - Qdrant]
RET --> RER[Optional Reranker - CrossEncoder]
RET --> LG[LangGraph - retrieve / decide / generate / reflect]
RER --> LG
LG --> LLM[LLM - ChatLiteLLM]
LLM --> C
PROM[Prometheus /metrics] --- GRAF[Grafana Dashboard]
LOKI[Loki & Console Logs - trace_id] --- GRAF
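The question path in the diagram (retrieve, optional rerank, generate, reflect) can be read as a bounded loop. The sketch below shows that control flow in plain Python; it is not the LangGraph API and not the project's actual code. The callables, the `max_hops` parameter, and the convention that `reflect` returns a follow-up query (or `None` when the answer is good enough) are all assumptions made for illustration.

```python
from typing import Callable, List, Optional

Retriever = Callable[[str], List[str]]
Reranker = Callable[[str, List[str]], List[str]]
Generator = Callable[[str, List[str]], str]
# Returns a follow-up query, or None when the answer is judged sufficient.
Reflector = Callable[[str, str], Optional[str]]


def answer_question(
    question: str,
    retrieve: Retriever,
    generate: Generator,
    reflect: Reflector,
    rerank: Optional[Reranker] = None,
    max_hops: int = 2,
) -> str:
    """Retrieve, optionally rerank, generate, then reflect; repeat up to max_hops times."""
    query = question
    context: List[str] = []
    answer = ""
    for _ in range(max_hops):
        docs = retrieve(query)
        if rerank is not None:
            docs = rerank(query, docs)
        # Accumulate context across hops, skipping chunks already seen.
        for doc in docs:
            if doc not in context:
                context.append(doc)
        answer = generate(question, context)
        follow_up = reflect(question, answer)
        if follow_up is None:
            break
        query = follow_up  # multi-hop: search again with the refined query
    return answer
```

Capping the loop with `max_hops` keeps latency bounded even when the reflection step keeps asking for more context.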
src/
├── api.py                 # FastAPI endpoints
├── app.py                 # Streamlit UI
├── helper_func.py         # Business logic
├── loki_logger.py         # Logging system
└── uploaded_docs/         # Uploaded documents
config/
├── alert_rules.yml        # Prometheus alert rules
├── docker-compose.yml     # Docker services (Qdrant, Prometheus, Grafana, Loki)
├── loki.yml               # Loki log server configuration
└── prometheus.yml         # Prometheus metrics collection configuration
grafana/
└── provisioning/
    ├── dashboards/
    │   ├── dashboards.yml             # Dashboard provisioning
    │   ├── PDF rag-loki-logs.json     # Loki log dashboard
    │   └── rag-system-dashboard.json  # System dashboard
    └── datasources/
        └── prometheus.yml             # Prometheus & Loki data sources
API Layer (src/api.py): Exposes REST API endpoints, handles session management and authentication/authorization, and collects metrics.
Core Logic (src/helper_func.py): Handles PDF processing and text extraction, coordinates the RAG workflow, manages and optimizes models, and caches results.
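To illustrate the caching idea, here is a minimal sketch of an in-memory answer cache. It is not the project's implementation: the class name `AnswerCache`, the key scheme (a hash of document ID plus normalized question), and the LRU eviction policy are assumptions chosen for the example.

```python
import hashlib
from collections import OrderedDict
from typing import Callable


class AnswerCache:
    """Small LRU cache keyed by a hash of (document id, normalized question)."""

    def __init__(self, max_entries: int = 128) -> None:
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self._max_entries = max_entries

    @staticmethod
    def make_key(doc_id: str, question: str) -> str:
        # Lowercase and collapse whitespace so trivially different phrasings share a key.
        normalized = " ".join(question.lower().split())
        return hashlib.sha256(f"{doc_id}\n{normalized}".encode("utf-8")).hexdigest()

    def get_or_compute(self, doc_id: str, question: str, compute: Callable[[], str]) -> str:
        key = self.make_key(doc_id, question)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        answer = compute()  # cache miss: run the expensive RAG pipeline
        self._entries[key] = answer
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return answer
```

Including the document ID in the key prevents an answer computed for one PDF from being served for the same question asked about another.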
Web UI (src/app.py): Provides Streamlit screens for document upload and management, Q&A interaction, and visualization of results.
Logging (src/loki_logger.py): Provides structured logging integrated with Loki, Trace ID tracking, and a rich log format for performance analysis.
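As a rough illustration of structured logging with a Trace ID, the sketch below formats each log record as one JSON object and attaches the ID of the current request. It uses only the standard library and does not ship anything to Loki. The names `JsonFormatter`, `trace_id_var`, `new_trace_id`, and the `duration_ms` field are hypothetical and are not taken from `src/loki_logger.py`.

```python
import contextvars
import json
import logging
import uuid

# Holds the trace ID for the request currently being handled.
trace_id_var: contextvars.ContextVar[str] = contextvars.ContextVar("trace_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including the current trace ID."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "trace_id": trace_id_var.get(),
        }
        # Optional numeric field for performance analysis, passed via `extra=`.
        duration = getattr(record, "duration_ms", None)
        if duration is not None:
            payload["duration_ms"] = duration
        return json.dumps(payload)


def new_trace_id() -> str:
    """Generate a trace ID and bind it to the current context."""
    trace_id = uuid.uuid4().hex
    trace_id_var.set(trace_id)
    return trace_id
```

A handler configured with `JsonFormatter()` would then emit lines that a log aggregator can filter by `trace_id`, for example after calling `new_trace_id()` at the start of each request and logging with `extra={"duration_ms": ...}`.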
- config/alert_rules.yml: Prometheus alert rules (FastAPI latency, error rate, Qdrant, disk/RAM).
- config/prometheus.yml: Metrics collection (FastAPI, Qdrant, system).
- config/loki.yml: Loki logging configuration.
- config/docker-compose.yml: Services (Qdrant, Prometheus, Grafana, Loki).
- grafana/provisioning/dashboards/*.json: Automatic dashboard provisioning (logs, system, RAG).
- grafana/provisioning/datasources/prometheus.yml: Prometheus & Loki data sources.
- Alerts: latency (>1s), error rate (>10%), Qdrant health, disk/RAM.
- Dashboards: real-time metrics & log visualization.
- Logging: structured logs, Trace ID tracking.
- Metrics: HTTP requests, Qdrant queries, LLM calls, resource usage.
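To make the alert thresholds above concrete, the following sketch evaluates them over a window of request samples in plain Python. The real rules live in `config/alert_rules.yml` and are evaluated by Prometheus, so this is only an illustration. It assumes that latency means the mean request duration in the window and that errors are 5xx responses; the alert names `HighLatency` and `HighErrorRate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

LATENCY_THRESHOLD_S = 1.0    # "latency (>1s)" from the list above
ERROR_RATE_THRESHOLD = 0.10  # "error rate (>10%)" from the list above


@dataclass(frozen=True)
class RequestSample:
    duration_s: float
    status_code: int


def firing_alerts(samples: Sequence[RequestSample]) -> List[str]:
    """Return the names of alerts whose condition holds over the given window."""
    if not samples:
        return []
    alerts: List[str] = []
    # Assumption: compare the mean latency; a real rule might use a percentile instead.
    mean_latency = sum(s.duration_s for s in samples) / len(samples)
    if mean_latency > LATENCY_THRESHOLD_S:
        alerts.append("HighLatency")
    # Assumption: only 5xx responses count as errors.
    error_rate = sum(1 for s in samples if s.status_code >= 500) / len(samples)
    if error_rate > ERROR_RATE_THRESHOLD:
        alerts.append("HighErrorRate")
    return alerts
```

Both comparisons are strict, so a window sitting exactly at a threshold (for example, an error rate of exactly 10%) does not fire.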
Mert Afacan | https://www.linkedin.com/in/mert-afacan/ | mert0afacan@gmail.com