llama-cpp
Here are 189 public repositories matching this topic...
Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.
Updated Jul 28, 2025 - Dart
 
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level (see the sketch below).
Updated Oct 26, 2025 - TypeScript
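For context, schema-constrained generation with node-llama-cpp looks roughly like the following. This is a minimal sketch based on the library's documented v3 API; the model path and the schema itself are placeholders, not taken from the repo.

```ts
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
// Placeholder path: point this at any local GGUF model file.
const model = await llama.loadModel({modelPath: "models/my-model.gguf"});
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// Compile a JSON schema into a grammar, so sampling can only ever
// produce tokens that keep the output valid against the schema.
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        name: {type: "string"},
        releaseYear: {type: "number"}
    }
} as const);

const answer = await session.prompt("Name one open LLM and its release year.", {
    grammar
});
console.log(grammar.parse(answer)); // parsed, schema-conforming object
```

Because the constraint is applied at the sampling step rather than by validating afterwards, malformed JSON is never generated in the first place.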
 
Build and run AI agents using Docker Compose. A collection of ready-to-use examples for orchestrating open-source LLMs, tools, and agent runtimes.
Updated Oct 24, 2025 - TypeScript
 
llama.cpp Rust bindings
Updated Jun 27, 2024 - Rust
 
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss (a rough arithmetic check follows below). Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
Updated May 21, 2025 - Python
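As a back-of-the-envelope check on that 59% figure (my own arithmetic, not from the repo's docs): an FP16 cache spends 16 bits per element on keys and 16 on values, while 8-bit keys with 4-bit values need 12 bits in total:

$$1 - \frac{8 + 4}{16 + 16} = 1 - \frac{12}{32} = 62.5\%$$

The per-block scale metadata that quantized formats carry claws back a few points of that raw saving, which is consistent with the ~59% the project reports.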
 
This repo showcases how to run a model locally and offline, free of OpenAI dependencies.
Updated Jul 12, 2024 - Python
 
Review/check GGUF files and estimate memory usage and maximum tokens per second (see the KV-cache sketch below).
Updated Aug 18, 2025 - Go
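The dominant runtime costs such a tool has to estimate are the weights (roughly the GGUF file size) and the KV cache, which grows with context length. A minimal sketch of the standard KV-cache formula, using made-up Llama-8B-style hyperparameters rather than values parsed from a real file:

```ts
// Hypothetical model hyperparameters; a real checker reads these from
// the GGUF header (block_count, attention.head_count_kv, etc.).
const nLayers = 32;      // transformer blocks
const nCtx = 8192;       // context length you plan to run with
const nHeadKV = 8;       // KV heads (fewer than query heads under GQA)
const headDim = 128;     // per-head embedding dimension
const bytesPerElem = 2;  // f16 cache entries

// Keys and values each store nCtx * nHeadKV * headDim elements per layer.
const kvCacheBytes = 2 * nLayers * nCtx * nHeadKV * headDim * bytesPerElem;
console.log(`KV cache ≈ ${(kvCacheBytes / 1024 ** 3).toFixed(2)} GiB`); // ≈ 1.00 GiB
```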
 
Local ML voice chat using high-end models.
Updated Oct 25, 2025 - C++
 
Making offline AI models accessible to all types of edge devices.
Updated Feb 12, 2024 - Dart
 
LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.
Updated Jun 10, 2023 - Python
 