A full-stack Retrieval-Augmented Generation (RAG) application that allows users to upload documents and ask questions about them using AI. The system uses LangChain, ChromaDB for vector storage, and supports both Anthropic and OpenAI LLMs.
- Document Upload: Upload PDFs, TXT, and DOCX files with drag-and-drop support
- Smart Chunking: Automatically splits documents into optimal chunks for retrieval
- Vector Search: Uses ChromaDB for fast semantic search across documents
- AI-Powered Answers: Leverages Claude or GPT models for context-aware responses
- Source Citations: Every answer includes references to source documents with page numbers
- Conversation Memory: Maintains conversation context for follow-up questions
- Beautiful UI: Modern React interface with real-time updates
```
1. Upload Document
        ↓
2. Parse & Chunk (with overlap)
        ↓
3. Generate Embeddings (Sentence Transformers)
        ↓
4. Store in Vector DB (ChromaDB)
        ↓
5. User Question → Semantic Search
        ↓
6. Retrieve Relevant Chunks
        ↓
7. LLM Context + Question → Answer
        ↓
8. Return Answer + Source Citations
```
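In code, the ingestion half of this flow (steps 2-4) boils down to something like the sketch below. This is illustrative, not the project's actual `document_processor.py`; the function name, collection name, and storage path are assumptions:

```python
# ingest_sketch.py - illustrative only; the real logic lives in
# backend/app/services/document_processor.py and vector_store.py.
import chromadb
from langchain.text_splitter import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer

def ingest(doc_id: str, text: str) -> int:
    # 2. Split into overlapping chunks so context survives chunk boundaries.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_text(text)

    # 3. Generate embeddings locally with Sentence Transformers.
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    embeddings = model.encode(chunks).tolist()

    # 4. Persist chunks + embeddings in ChromaDB for later semantic search.
    client = chromadb.PersistentClient(path="backend/app/storage/chroma_db")
    collection = client.get_or_create_collection("documents")
    collection.add(
        ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embeddings,
        metadatas=[{"source": doc_id, "chunk": i} for i in range(len(chunks))],
    )
    return len(chunks)
```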
- FastAPI 0.109.0 - High-performance Python web framework
- LangChain 0.1.0 - RAG pipeline orchestration
- ChromaDB 0.4.22 - Vector database for embeddings
- Sentence Transformers 2.2.2 - Local embedding generation
- Pydantic 2.5.0 - Data validation and settings
- Anthropic/OpenAI - LLM providers
- React 18.2.0 - Modern UI library
- TypeScript 5.3.3 - Type-safe development
- Vite 5.0.11 - Fast build tool
- Axios 1.6.5 - HTTP client
- React Dropzone 14.2.3 - File upload with drag & drop
- React Markdown 9.0.1 - Render formatted responses
- Lucide React - Beautiful icons
```
KnowledgeAssist RAG/
├── backend/                          # FastAPI backend
│   ├── app/
│   │   ├── api/
│   │   │   ├── routes/               # API endpoints (upload, chat, documents)
│   │   │   └── models/               # Request/response models
│   │   ├── services/
│   │   │   ├── vector_store.py       # ChromaDB integration
│   │   │   ├── document_processor.py # File parsing & chunking
│   │   │   └── rag_service.py        # RAG pipeline
│   │   ├── core/                     # Configuration
│   │   └── storage/                  # File and vector storage
│   ├── requirements.txt
│   └── .env.example
│
├── frontend/                         # React frontend
│   ├── src/
│   │   ├── components/
│   │   │   ├── FileUploader.tsx      # Drag-and-drop upload
│   │   │   ├── ChatWindow.tsx        # Chat interface
│   │   │   ├── Message.tsx           # Message display
│   │   │   └── SourceCitation.tsx    # Source references
│   │   ├── services/                 # API client
│   │   ├── types/                    # TypeScript definitions
│   │   └── styles/                   # CSS files
│   ├── package.json
│   └── vite.config.ts
│
└── README.md
```
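The `services/` layer does the heavy lifting. As a rough sketch of the query path (steps 5-8 of the flow above, approximately what `rag_service.py` implements; the function name and prompt wording are assumptions):

```python
# query_sketch.py - a hedged sketch of the query path; not the
# project's actual rag_service.py.
import chromadb
from anthropic import Anthropic
from sentence_transformers import SentenceTransformer

def answer(question: str, k: int = 4) -> tuple[str, list[dict]]:
    # 5-6. Embed the question and retrieve the k most similar chunks.
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    client = chromadb.PersistentClient(path="backend/app/storage/chroma_db")
    collection = client.get_or_create_collection("documents")
    results = collection.query(
        query_embeddings=[model.encode(question).tolist()], n_results=k
    )
    chunks, sources = results["documents"][0], results["metadatas"][0]

    # 7. Pass the retrieved context plus the question to the LLM.
    context = "\n\n".join(chunks)
    response = Anthropic().messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
        }],
    )
    # 8. Return the answer together with citation metadata.
    return response.content[0].text, sources
```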
Before you begin, ensure you have:
- Python 3.9 or newer
- Node.js 18 or newer (required by Vite 5) and npm
- An Anthropic or OpenAI API key
Open a terminal and run:
```bash
# Navigate to backend directory
cd backend

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Create environment file
cp .env.example .env
```

Edit the `.env` file and add your API key:
```env
# For Anthropic (Claude)
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxx

# OR for OpenAI (GPT)
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-xxxxxxxxxxxxx
```

Start the backend server:
```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

You should see:

```
INFO:     Uvicorn running on http://0.0.0.0:8000
INFO:     Application startup complete.
```
Keep this terminal running!
Open a new terminal window:
```bash
# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# (Optional) Create environment file
cp .env.example .env

# Start development server
npm run dev
```

You should see:

```
VITE v5.0.11  ready in 500 ms

➜  Local:   http://localhost:5173/
```
Open your browser and go to: http://localhost:5173
You should see the Knowledge Assist RAG interface! 🎉
- Drag and drop PDF, TXT, or DOCX files into the upload area
- Wait for processing (you'll see the number of chunks created)
- Multiple files can be uploaded
- Type your question in the chat input
- Press Enter or click Send
- The AI will respond based on your documents
- Click on the sources toggle to see which document chunks were used
- Each source shows:
- Document name
- Page number (for PDFs)
- Relevant text snippet
- Continue the conversation naturally
- The system maintains context for related questions
After uploading a document:
- "What are the main topics discussed in this document?"
- "Can you summarize the key points?"
- "What does the document say about [specific topic]?"
- "List the important dates or numbers mentioned."
- "What conclusions does the author draw?"
Visit http://localhost:8000/docs for interactive API documentation (Swagger UI).
- `POST /api/v1/upload/` - Upload a single file
- `POST /api/v1/upload/batch` - Upload multiple files
- `POST /api/v1/chat/` - Send a question and get an answer
- `DELETE /api/v1/chat/conversation/{id}` - Clear conversation history
- `GET /api/v1/documents/` - List uploaded documents
- `DELETE /api/v1/documents/{id}` - Delete a document
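A quick way to exercise these endpoints from Python. The paths match the list above, but the field names (`file`, `question`) are assumptions; check the Swagger UI at `/docs` for the real schemas:

```python
# api_sketch.py - calling the API with the requests library.
# Field names ("file", "question") are assumptions; see /docs for
# the actual request/response schemas.
import requests

BASE = "http://localhost:8000/api/v1"

# Upload a single document.
with open("report.pdf", "rb") as f:
    upload = requests.post(f"{BASE}/upload/", files={"file": f})
print(upload.json())

# Ask a question about the uploaded documents.
chat = requests.post(f"{BASE}/chat/", json={"question": "Summarize the key points."})
print(chat.json())

# List uploaded documents.
print(requests.get(f"{BASE}/documents/").json())
```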
Backend (`backend/.env`):

| Variable | Description | Default |
|---|---|---|
| `LLM_PROVIDER` | AI provider (`anthropic`/`openai`) | `anthropic` |
| `ANTHROPIC_API_KEY` | Anthropic API key | - |
| `OPENAI_API_KEY` | OpenAI API key | - |
| `LLM_MODEL` | Model to use | `claude-3-5-sonnet-20241022` |
| `CHUNK_SIZE` | Text chunk size | `1000` |
| `CHUNK_OVERLAP` | Overlap between chunks | `200` |
| `RETRIEVAL_K` | Number of chunks to retrieve | `4` |
| `MAX_UPLOAD_SIZE` | Max file size in bytes | `10485760` (10 MB) |
| `EMBEDDING_MODEL` | Embedding model | `sentence-transformers/all-MiniLM-L6-v2` |
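For reference, a minimal sketch of how a `pydantic-settings` class could map these variables (the project's actual `core/config.py` may differ):

```python
# config_sketch.py - how these env vars might be loaded; a sketch,
# not the project's actual configuration module.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Field names match the env vars case-insensitively.
    model_config = SettingsConfigDict(env_file=".env")

    llm_provider: str = "anthropic"
    anthropic_api_key: str = ""
    openai_api_key: str = ""
    llm_model: str = "claude-3-5-sonnet-20241022"
    chunk_size: int = 1000
    chunk_overlap: int = 200
    retrieval_k: int = 4
    max_upload_size: int = 10_485_760  # 10 MB
    embedding_model: str = "sentence-transformers/all-MiniLM-L6-v2"

settings = Settings()  # populated from .env / the environment
```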
Frontend (`frontend/.env`):

| Variable | Description | Default |
|---|---|---|
| `VITE_API_BASE_URL` | Backend API URL | `http://localhost:8000` |
Problem: `ModuleNotFoundError`
- Solution: Make sure you activated the virtual environment and ran `pip install -r requirements.txt`
Problem: ChromaDB initialization fails
- Solution: Delete `backend/app/storage/chroma_db/` and restart
Problem: Out of memory when loading embeddings
- Solution: Use a smaller embedding model or reduce batch size
Problem: Invalid API key
- Solution: Check that you've correctly set your API key in `backend/.env`
Problem: `ENOENT: no such file or directory`
- Solution: Make sure you're in the `frontend` directory and ran `npm install`
Problem: CORS errors
- Solution: Ensure the backend's `ALLOWED_ORIGINS` includes your frontend URL
Problem: File upload fails
- Solution: Check file size limits and supported file types
Problem: Cannot connect to backend
- Solution: Ensure the backend is running on port 8000
Error: Rate limit exceeded
- Solution: You've hit your API provider's rate limit. Wait a few minutes and try again.
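If rate limits bite repeatedly, a small retry-with-backoff wrapper around the LLM call helps. A generic sketch, not part of the project:

```python
# backoff_sketch.py - generic retry helper; illustrative only.
import time

def with_backoff(call, retries: int = 3):
    """Run `call`, retrying with exponential backoff on failure."""
    for attempt in range(retries):
        try:
            return call()
        except Exception as exc:  # e.g. a provider's RateLimitError
            if attempt == retries - 1:
                raise
            wait = 2 ** attempt  # 1s, 2s, 4s, ...
            print(f"Request failed ({exc}); retrying in {wait}s")
            time.sleep(wait)
```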
Run tests:

```bash
cd backend
pytest
```

Format code:

```bash
black app/
```

Type checking:

```bash
cd frontend
npm run build
```

Linting:

```bash
npm run lint
```

Before deploying to production:
- Add authentication/authorization
- Implement rate limiting
- Set up database for document metadata (PostgreSQL)
- Use production vector store (Pinecone, Weaviate)
- Add file virus scanning
- Implement user quotas
- Set up HTTPS
- Configure CDN for frontend
- Add error tracking (Sentry)
- Set up CI/CD pipeline
- Add comprehensive test suite
- Implement caching (Redis)
- Configure backup strategy
- Set up a production database (PostgreSQL recommended)
- Use a production-grade vector store (Pinecone, Weaviate, or managed ChromaDB)
- Use a reverse proxy (Nginx)
- Set up HTTPS with SSL certificates
Example with Docker:
```bash
cd backend
docker build -t rag-backend .
docker run -p 8000:8000 --env-file .env rag-backend
```

For the frontend:

1. Build for production:

```bash
cd frontend
npm run build
```

2. Serve the `dist` folder with a static file server or CDN
3. Update `VITE_API_BASE_URL` to point to your production API
Backend:
- ✅ FastAPI application with CORS and lifecycle management
- ✅ Pydantic request/response models with validation
- ✅ File upload endpoints (single and batch)
- ✅ Chat endpoint with conversation support
- ✅ Document management endpoints
- ✅ VectorStoreService: ChromaDB integration
- ✅ DocumentProcessor: PDF, TXT, DOCX support with chunking
- ✅ RAGService: Complete LangChain RAG pipeline
- ✅ Embedding generation with Sentence Transformers
- ✅ Support for both Anthropic and OpenAI LLMs
- ✅ Comprehensive error handling

Frontend:
- ✅ FileUploader: Drag-and-drop with react-dropzone
- ✅ ChatWindow: Full chat interface with auto-scroll
- ✅ Message: Individual message display with markdown
- ✅ SourceCitation: Expandable source references
- ✅ Type-safe API client with Axios
- ✅ Loading states and error handling
- ✅ Modern, responsive CSS design
- ✅ Mobile-friendly layout

Features:
- ✅ PDF support with page numbers
- ✅ TXT and DOCX support
- ✅ File size and type validation
- ✅ Automatic text chunking with overlap
- ✅ Vector embedding generation
- ✅ Semantic search with ChromaDB
- ✅ Conversational context management
- ✅ Source citations with metadata
- ✅ Markdown response formatting
- ✅ Typing indicators
- Document Storage: Files are stored locally (use S3/cloud storage for production)
- Vector Store: ChromaDB is local (use managed service for scale)
- No Authentication: Open access (add auth for production)
- No Persistence: Conversation history is in-memory (see the sketch after this list)
- Rate Limiting: None implemented (add for production)
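The persistence limitation amounts to conversation history living in an in-process structure, roughly like the sketch below (illustrative; the actual implementation may differ). Restart the server and the history is gone:

```python
# memory_sketch.py - why history is lost on restart: it lives in an
# in-process dict (illustrative; the actual code may differ).
from collections import defaultdict

_conversations: dict[str, list[dict]] = defaultdict(list)

def remember(conversation_id: str, role: str, content: str) -> None:
    _conversations[conversation_id].append({"role": role, "content": content})

def history(conversation_id: str) -> list[dict]:
    # Empty after every process restart; nothing is written to disk.
    return _conversations[conversation_id]
```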
- Backend Setup: 5-10 minutes
- Frontend Setup: 5 minutes
- First Document Upload: 1-2 minutes (embeddings download)
- Total Time to Running: ~15 minutes
- API Documentation: http://localhost:8000/docs
- Frontend Dev Server: http://localhost:5173
- Backend Health Check: http://localhost:8000/health
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License.
- Built with LangChain
- Vector storage by ChromaDB
- Embeddings by Sentence Transformers
- UI components by Lucide Icons
For issues and questions:
- Check the troubleshooting section above
- Review the API documentation
- Open an issue on GitHub
To stop the servers:
- Backend: Press `Ctrl+C` in the backend terminal
- Frontend: Press `Ctrl+C` in the frontend terminal
To deactivate the Python virtual environment:
```bash
deactivate
```