FastAPI backend for Arabic Speech Recognition, Text Correction, and Text-to-Speech (TTS), powered by DeepAr & AraFix models.

Cairo Dictionary AI – Backend with FastAPI

FastAPI backend for Arabic Speech Recognition, Text Correction, and Text-to-Speech (TTS). It is built on our own models, DeepAr for Arabic speech-to-text and AraFix for text correction, both included in this project as Git submodules. The models are also published under our CUAIStudents HuggingFace organization, where you can explore the datasets and checkpoints used in training. This backend exposes them as REST APIs for speech recognition, text correction, and voice generation.

Installation

Make sure ffmpeg and Python are installed on your system. The commands below install Python 3.11.

macOS:

# Install Homebrew (if not installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install ffmpeg
brew install ffmpeg

# Install Python (if needed)
brew install python@3.11

Ubuntu/Debian:

# Update package list
sudo apt update

# Install ffmpeg
sudo apt install ffmpeg

# Install Python
sudo apt install python3.11 python3.11-venv

Clone with submodules (includes DeepAr + AraFix):

git clone --recurse-submodules -b git-submodules https://github.com/NourhanMahmoudd/Cairo-Dictionary-AI-FastAPI-Backend.git <any_dir>
cd <any_dir>/backend

Create a virtual environment and install dependencies:

python3 -m venv venv       # or python3.11 / python, depending on how Python is installed
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt
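
Before starting the server, it can help to confirm that the prerequisites above are actually visible from the active environment. The following is a minimal sketch, not part of the project: the function name `missing_prerequisites` is hypothetical, and treating Python 3.11 as the minimum version is an assumption based only on the install commands above.

```python
import shutil
import sys


def missing_prerequisites(which=shutil.which, version=sys.version_info):
    """Return a list of human-readable problems; an empty list means ready.

    `which` and `version` are injectable so the check can be tested
    without changing the real environment.
    """
    problems = []
    # ffmpeg must be on PATH for audio handling.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # Assumption: 3.11 is the minimum, since that is what the README installs.
    if tuple(version[:2]) < (3, 11):
        problems.append(
            f"Python 3.11+ expected, found {version[0]}.{version[1]}"
        )
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current interpreter and PATH; printing each returned string is enough for a quick preflight check.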

Environment

Copy the example config:

cp .env.example .env

Edit .env if needed. Most values are optional.

Run the API

uvicorn app.main:app --reload --reload-exclude 'logs/*'

Once the server is running:

  • API Documentation: http://localhost:8000/docs
  • Health Check: http://localhost:8000/health
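
Model loading can take a while, so a script that depends on the API may want to poll the health check before sending requests. This is a rough sketch using only the standard library. It assumes that `/health` answers with HTTP 200 once the service is ready, which the README does not state. The helper names `http_status` and `wait_until_healthy` are hypothetical.

```python
import time
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # uvicorn's default host and port


def http_status(url, timeout=2.0):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503).
        return exc.code
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return None


def wait_until_healthy(base_url=BASE_URL, attempts=10, delay=1.0,
                       get_status=http_status, sleep=time.sleep):
    """Poll <base_url>/health until it returns 200 or attempts run out."""
    for attempt in range(attempts):
        if get_status(f"{base_url}/health") == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # back off before the next probe
    return False
```

`wait_until_healthy()` returns `True` as soon as one probe succeeds and `False` after the last failed attempt.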

Endpoints

Speech-to-Text

POST /api/v1/speech/transcribe

  • Input: an uploaded audio file (file).

  • Optional: reference_text, a reference transcript used for accuracy comparison.
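
As an illustration, the request for this endpoint could be built with the standard library alone. The sketch below assumes the audio goes in a multipart form field named `file` and that `reference_text` is a plain form field; both names come from the list above, but their exact encoding is an assumption, so check `/docs` for the real schema. The function name `build_transcribe_request` and the default `audio/wav` content type are hypothetical.

```python
import uuid
import urllib.request

TRANSCRIBE_URL = "http://localhost:8000/api/v1/speech/transcribe"


def build_transcribe_request(audio, filename, reference_text=None,
                             content_type="audio/wav", url=TRANSCRIBE_URL):
    """Build a multipart/form-data POST carrying audio bytes.

    `audio` is the raw file content as bytes. The form field names
    ("file", "reference_text") are assumptions taken from the README.
    """
    boundary = uuid.uuid4().hex
    chunks = []
    if reference_text is not None:
        header = (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="reference_text"\r\n\r\n'
        )
        chunks.append(
            header.encode("utf-8") + reference_text.encode("utf-8") + b"\r\n"
        )
    header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    )
    chunks.append(header.encode("utf-8") + audio + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing boundary
    return urllib.request.Request(
        url,
        data=b"".join(chunks),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen`. The response format is not described here, so inspect it before relying on specific keys.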

Text Correction & Comparison

  • POST /api/v1/nlp/correct → returns the corrected Arabic text and an accuracy score.

  • POST /api/v1/nlp/compare → compares a transcription against a reference text.
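
Both text endpoints can probably be called with a JSON body. The README does not list the request fields, so the sketch below only shows the transport. The helper names `build_json_post` and `send_json` are hypothetical, and any payload keys you pass (for example `{"text": ...}`) are assumptions to verify against the schemas shown at `/docs`.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000/api/v1"


def build_json_post(path, payload, base_url=API_BASE):
    """Build a JSON POST request for a path such as "/nlp/correct"."""
    # ensure_ascii=False keeps Arabic text readable in the request body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}{path}",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def send_json(request, timeout=30.0):
    """Send the request and decode the response, assumed to be JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send_json(build_json_post("/nlp/correct", {"text": "..."}))` would post a single hypothetical `text` field to the correction endpoint.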

Text-to-Speech

POST /api/v1/tts/generate

  • Input: text + optional lang, voice, rate, volume.

  • Returns: MP3 audio stream.

📁 Project Structure

project-root/
│
├── backend/                  # Backend API
│   ├── app/
│   │   ├── helper/          # Helper modules
│   │   ├── main.py          # FastAPI application
│   │   ├── models/          # ML model integration
│   │   │   ├── AraFix-V3.0/ # AraFix model files
│   │   │   ├── DeepAr/      # DeepAr model files
│   │   │   ├── araFix.py    # AraT5 Text Corrector
│   │   │   └── deepAr.py    # Arabic Whisper model
│   │   ├── routes/          # API endpoints
│   │   │   ├── speech.py    # Speech transcription endpoints
│   │   │   ├── text.py      # Text correction endpoint
│   │   │   └── tts.py       # Text-to-speech endpoint
│   │   ├── schemas/         # Pydantic models
│   │   └── utils/           # Utility functions
│   │       └── comparison.py # Comparison utilities
│   ├── logs_config.yaml     # Logging configuration
│   ├── requirements.txt     # Python dependencies
│   └── .env                 # Environment variables
└── README.md               # This file

Made with ❤️ for Arabic speech recognition
