An LLM-powered augmented generation suite leveraging LangChain, Ollama, and vector databases to enhance response quality through caching, contextual memory, and retrieval-based methods. This collection of Jupyter notebooks showcases modular techniques for building intelligent, memory-efficient generative systems with semantic retrieval over external data.


LLM Augmented Generation

🚀 Augmented Generation Experiments with LLMs

Welcome to Augmented Generation with LLMs, a curated collection of interactive Colab notebooks exploring different approaches to enhancing Large Language Model (LLM) outputs through context-, cache-, and retrieval-based techniques. Built with LangChain, Ollama, vector databases, and more, this repo demonstrates practical patterns for improving LLM response quality and memory efficiency.


🧠 Notebooks Included

📌 1. Cache_Augmented_Generation.ipynb

  • Concept: Enhances LLM inference by reusing previously computed results with a smart cache layer.
  • Tech Stack:
    • Pickling for storing & retrieving cached outputs
    • In-memory caching logic
    • Minimal recomputation, blazing speed ⚡
  • ✅ Ideal for repetitive or FAQ-style inputs; a minimal sketch of the pattern follows.
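
Below is a minimal sketch of this caching pattern, assuming a LangChain-style LLM object with an `invoke` method; the file name `llm_cache.pkl` and the helper names are illustrative, not the notebook's actual code.

```python
import hashlib
import os
import pickle

CACHE_PATH = "llm_cache.pkl"  # illustrative cache file name

def load_cache() -> dict:
    """Load the pickled prompt -> answer cache, or start fresh."""
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH, "rb") as f:
            return pickle.load(f)
    return {}

def save_cache(cache: dict) -> None:
    """Persist the cache to disk with pickle."""
    with open(CACHE_PATH, "wb") as f:
        pickle.dump(cache, f)

def cached_generate(llm, prompt: str) -> str:
    """Return a cached answer for a repeated prompt; otherwise
    call the LLM once and store the result for next time."""
    cache = load_cache()
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key in cache:
        return cache[key]        # cache hit: no recomputation
    answer = llm.invoke(prompt)  # cache miss: one real LLM call
    cache[key] = answer
    save_cache(cache)
    return answer
```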

🧩 2. Context_Augmented_Generation.ipynb

  • Concept: Augments responses using relevant context from previous interactions or documents.
  • Tech Stack:
    • Custom context memory
    • LangChain’s prompt management
    • Context-based generation pipeline
  • 📚 Boosts response richness and continuity; see the sketch below.
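
As a rough illustration, here is one way to wire a rolling context window into a LangChain prompt, assuming the `langchain-community` Ollama wrapper; the model name and the six-turn memory size are assumptions, not the notebook's exact setup.

```python
from langchain.prompts import PromptTemplate
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")  # any model already pulled via `ollama pull`

prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Use the conversation so far to answer the question.\n\n"
        "Context:\n{context}\n\n"
        "Question: {question}\nAnswer:"
    ),
)

history = []  # simple custom context memory: a list of past turns

def ask(question: str) -> str:
    """Inject recent turns into the prompt, then record the new turn."""
    context = "\n".join(history[-6:])  # keep roughly the last 3 Q/A pairs
    answer = llm.invoke(prompt.format(context=context, question=question))
    history.extend([f"Q: {question}", f"A: {answer}"])
    return answer
```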

🔍 3. Simple_Retrieval_Augmented_Generation.ipynb

  • Concept: Integrates Vector DBs and Embeddings to retrieve relevant chunks from external data for precise answering.
  • Tech Stack:
    • LangChain + FAISS/Chroma
    • Embedding models via Ollama
    • Retrieval-Augmented Generation (RAG) flow
  • 🔎 Perfect for knowledge-based systems and document Q&A; a sketch of the flow follows.
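
A minimal sketch of this RAG flow, assuming the `langchain-community` FAISS and Ollama wrappers, a locally pulled `nomic-embed-text` embedding model, and an illustrative `knowledge.txt` source file:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import FAISS

with open("knowledge.txt") as f:  # illustrative external document
    text = f.read()

# 1. Split the document into overlapping chunks that fit in the prompt.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_text(text)

# 2. Embed each chunk and index the vectors in FAISS.
embeddings = OllamaEmbeddings(model="nomic-embed-text")
store = FAISS.from_texts(chunks, embeddings)

llm = Ollama(model="llama3")

# 3. Retrieve the most similar chunks and ground the answer in them.
def rag_answer(question: str, k: int = 3) -> str:
    hits = store.similarity_search(question, k=k)
    context = "\n\n".join(doc.page_content for doc in hits)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm.invoke(prompt)
```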

🛠️ Tech Stack

| Tech / Tool              | Purpose                              |
| ------------------------ | ------------------------------------ |
| LLMs                     | Generative responses                 |
| LangChain                | Chaining prompts, memory, and tools  |
| Ollama                   | Lightweight local LLMs               |
| Vector DB (FAISS/Chroma) | Fast document retrieval              |
| Embeddings               | Semantic search capability           |
| Pickling                 | Output caching                       |
| Cache memory             | Efficient reuse of responses         |
| Jupyter Notebook         | Interactive development              |

📸 Screenshots

🧪 Live Testing with Cached Responses

[Screenshot: cache demo]

🔗 Context Injection Flow

[Screenshot: context demo]

📚 Retrieval-Augmented Answer

[Screenshot: RAG demo]


🧭 How to Run

  1. Open any notebook in Jupyter Notebook or Google Colab.
  2. Follow the instructions in each cell.
  3. Make sure the required models are pulled via Ollama and the necessary libraries are installed (langchain, faiss-cpu, etc.); a quick sanity check is sketched below.
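
For step 3, a quick way to confirm the key libraries are importable before running anything (the package list is an assumption about a typical setup; adjust it to whatever each notebook imports):

```python
import importlib.util

# faiss-cpu installs the module under the name "faiss".
for pkg in ("langchain", "langchain_community", "faiss"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'installed' if found else 'missing'}")
```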

🌟 Star this repo if you find it helpful!

Questions or contributions? Open an issue or PR anytime.
