vision-ai

Here are 36 public repositories matching this topic...

GetStream / Vision-Agents

Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.

ai realtime tts agents stt ai-agents video-ai voice-ai vision-ai agentic-ai video-agents

Updated Nov 3, 2025
Python

instill-ai / console

📺 Instill Console for 🔮 Instill Core: https://github.com/instill-ai/instill-core

console ui computer-vision deep-learning frontend image-classification object-detection structured-data data-pipeline no-code model-serving vdp unstructured-data data-connector vision-ai versatile-data-pipeline

Updated Nov 2, 2025
TypeScript

athrael-soju / Snappy

Snappy: A vision-first document retrieval using ColPali embeddings - Search PDFs with FastAPI, Next.js 16, Qdrant, and React 19.2

Updated Nov 3, 2025
TypeScript

yihong1120 / YOLOv8-License-Plate-Insights

This repository demonstrates YOLOv8-based license plate recognition with GCP Vision AI integration, enabling versatile real-world applications like vehicle identification, traffic monitoring, and geospatial analysis while capturing vital media metadata for enhanced insights.

Updated Feb 1, 2024
Jupyter Notebook

pej0918 / SK-RD4AD

[CVPRW'25] Official Code For "SK-RD4AD: Skip-Connected Reverse Distillation for One-Class Anomaly Detection"

computer-vision anomaly-detection industrial-ai one-class-classification vision-ai skip-connection cvpr-workshop-2025

Updated Jul 7, 2025
Python

choudaryhussainali / MCQ_Grading_Bot

MCQ_Grading_Bot is an AI-powered tool that grades solved MCQ exam sheets from images using Gemini Vision. It extracts student info, checks answers, calculates score, and displays detailed results—all through a simple Gradio interface in Colab.

python machine-learning ocr image-processing pillow edtech gradio educational-technology ai-project ai-in-education vision-ai google-generative-ai grading-bot automated-grading mcq-grading exam-evaluation exam-checking mcq-checker answer-sheet-evaluation

Updated Jun 19, 2025
Jupyter Notebook

josharsh / md-pdf-md

Bidirectional Markdown↔PDF converter with AI-powered vision. MD→PDF with beautiful themes, PDF→MD with LLaVA - open source & privacy-first

Updated Oct 30, 2025
TypeScript

Navy10021 / MDDenseResNet

MDDenseResNet: Enhanced Malware Detection Using DNNs

deep-neural-networks deep-learning-algorithms malware-analysis cyber-security malware-detection-framework vision-ai

Updated Jul 27, 2025
Jupyter Notebook

ShihabYasin / STGAN

STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing

python research gan vision-ai

Updated Apr 19, 2020
Python

dineshtripathi / documind-engineering

Hybrid AI orchestration stack combining local LLMs (Ollama), vector search (Qdrant), and Azure AI Foundry for scalable RAG, Agentic AI, and Vision. Built with .NET 8 and Python.

python dotnet routing inference orchestrator open-ai rag vision-ai qdrant hybrid-ai ollama qwen mistral-7b agentic-ai azure-ai-foundry phi3-mini

Updated Oct 12, 2025
Python

simonyang0608 / DeeperSimon

General vision AI defect detection engine for MLops process/simulations

python opencv detection pytorch classification segmentation shell-scripting defect-detection mlops vision-ai

Updated Mar 5, 2025
Python

go-park-mail-ru / 2023_2_OND_team

Backend of the Pinterest project by the OND team

Updated Mar 2, 2024
Go

s59mz / eagle-eye-ai

Eagle-Eye-AI is a project designed for the Kria KR260 board that enables AI-driven camera tracking and face detection.

machine-learning deep-learning ros2 follow-camera vision-ai kria zynq-ultrascale kr260

Updated Sep 7, 2025
Tcl

Supershivam07 / Vision-AI

Updated Aug 18, 2025
Jupyter Notebook

srvaroa / ai-camera

People detection and notifications based on the Raspberry Pi + AI Camera

ai raspberry-pi-camera rasbperry-pi vision-ai

Updated Feb 3, 2025
Python

dj-ayush / MetaSynAI

MetaSynAI is an AI‑driven accessibility framework that enables seamless interaction through voice commands, hand gestures, and eye‑tracking, offering a modern and inclusive way to control web interfaces.