Multimodal Emotion Recognition System

A comprehensive, real-time emotion recognition system that combines Facial Emotion Recognition (FER) and Textual Emotion Recognition (TER) with advanced multimodal fusion capabilities and Furhat robot integration. This system provides accurate emotion detection through computer vision and natural language processing, with support for interactive robotics applications.

🌟 Key Features

  • 🎭 Facial Emotion Recognition: Real-time emotion detection from camera feed using CNN models trained on FER2013
  • 💬 Textual Emotion Recognition: Voice-to-text emotion analysis using DistilBERT-based models
  • 🔀 Multimodal Fusion: Advanced fusion strategies (confidence-based, weighted average, and formula-based)
  • 🤖 Furhat Integration: Complete social robot platform integration with real-time emotion feedback
  • ⚡ Real-time Processing: Live emotion recognition with interactive GUI interfaces
  • 🛡️ Robust Architecture: Comprehensive error handling and fallback mechanisms
  • 📊 Comprehensive Testing: Full test suite with system health checks and validation

🧪 Testing & Validation

The system includes comprehensive testing capabilities to ensure reliability:

Run All Tests

# Run comprehensive system tests
python tests/test_multimodal_system.py    # Complete system validation
python tests/test_fer_model.py           # FER component testing
python tests/test_furhat_integration.py  # Furhat integration testing
python tests/test_formula_fusion.py      # Fusion algorithm testing
python tests/test_corrected_formula.py   # Mathematical validation

System Health Check

# Validate system configuration and dependencies
python demos/demo_multimodal_usage.py

The test suite validates:

  • Model loading and initialization
  • Camera and microphone availability
  • Emotion fusion algorithm accuracy
  • Robot integration functionality
  • Error handling and fallback mechanisms

πŸ“ Project Architecture

fer_and_ter_model/
├── src/                          # Source code
│   ├── fer/                      # Facial Emotion Recognition
│   │   └── camera_fer_inference.py
│   ├── ter/                      # Textual Emotion Recognition
│   │   ├── voice_ter_inference.py
│   │   └── setup_voice_ter.py
│   ├── multimodal/               # Multimodal Fusion
│   │   └── multimodal_emotion_inference.py
│   ├── furhat/                   # Furhat Robot Integration
│   │   └── furhat_multimodal_emotion_inference.py
│   └── utils/                    # Shared utilities
├── models/                       # Trained models
│   ├── fer2013_final_model.pth
│   └── ter_distilbert_model/
├── datasets/                     # Dataset files
│   └── multimodal_emotion_dataset.json
├── notebooks/                    # Jupyter notebooks
│   ├── fer2013_model_training.ipynb
│   ├── multimodal_emotion_fusion.ipynb
│   ├── multimodal_emotion_recognition.ipynb
│   └── textual_emotion_recognition_distilbert.ipynb
├── tests/                        # Test scripts
│   ├── test_corrected_formula.py
│   ├── test_fer_model.py
│   ├── test_formula_fusion.py
│   ├── test_furhat_integration.py
│   └── test_multimodal_system.py
├── demos/                        # Demo scripts
│   ├── demo_furhat_usage.py
│   ├── demo_multimodal_usage.py
│   ├── demo_usage.py
│   └── demo_voice_ter.py
├── docs/                         # Documentation
│   ├── DATASET_SETUP.md
│   ├── FER_PROJECT_SUMMARY.md
│   ├── FURHAT_INTEGRATION_SUMMARY.md
│   ├── IMPLEMENTATION_SUMMARY.md
│   ├── JUNIE.md
│   └── README_*.md files
├── requirements/                 # Requirements files
│   ├── requirements_backup.txt
│   ├── requirements_camera_inference.txt
│   ├── requirements_furhat.txt
│   ├── requirements_multimodal.txt
│   └── requirements_voice_ter.txt
└── README.md                     # This file

🚀 Quick Start

Prerequisites

  • Python 3.8+ (3.11+ recommended)
  • Webcam for facial emotion recognition
  • Microphone for voice/text emotion recognition
  • GPU support recommended for optimal performance

Installation

  1. Clone the repository:

    git clone https://github.com/kudosscience/fer_and_ter_model.git
    cd fer_and_ter_model
  2. Install the package:

    # Install with all components
    pip install -e ".[all]"
    
    # OR install specific components:
    pip install -e ".[multimodal]"    # For multimodal processing
    pip install -e ".[furhat]"        # For Furhat robot integration
    pip install -e ".[camera_inference]"  # For FER only
    pip install -e ".[voice_ter]"     # For TER only
  3. Alternative: Install from requirements files:

    # Choose based on your use case:
    pip install -r requirements/requirements_multimodal.txt     # Recommended
    pip install -r requirements/requirements_furhat.txt         # For robot integration
    pip install -r requirements/requirements_camera_inference.txt  # FER only
    pip install -r requirements/requirements_voice_ter.txt      # TER only

Quick Demo

# Run the comprehensive multimodal demo
python demos/demo_multimodal_usage.py

# Try individual components
python demos/demo_usage.py           # Basic FER demo
python demos/demo_voice_ter.py       # TER demo  
python demos/demo_furhat_usage.py    # Furhat integration demo

Console Commands

After installation, you can use these console commands:

# Run individual components
fer-camera              # Launch facial emotion recognition
ter-voice              # Launch textual emotion recognition  
multimodal-emotion     # Launch multimodal system
furhat-emotion         # Launch Furhat integration
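
These commands are installed as console-script entry points when the package is installed with pip. Below is a minimal sketch of how such a mapping is typically declared with setuptools; the module paths and function names are illustrative assumptions, not copied from the project's packaging config.

# setup.py sketch (illustrative; the project's actual entry points live in its packaging config)
from setuptools import setup, find_packages

setup(
    name="fer_and_ter_model",
    package_dir={"": "src"},
    packages=find_packages(where="src"),
    entry_points={
        "console_scripts": [
            # hypothetical module:function targets, shown only to explain the mechanism
            "fer-camera=fer.camera_fer_inference:main",
            "ter-voice=ter.voice_ter_inference:main",
            "multimodal-emotion=multimodal.multimodal_emotion_inference:main",
            "furhat-emotion=furhat.furhat_multimodal_emotion_inference:main",
        ],
    },
)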

🧩 System Components

🎭 FER (Facial Emotion Recognition)

  • Real-time Processing: Live camera feed emotion detection with GUI overlay
  • CNN Architecture: Deep learning model trained on FER2013 dataset
  • 7 Emotion Classes: Happy, Sad, Angry, Fear, Surprise, Disgust, Neutral
  • High Accuracy: Optimized model with robust preprocessing pipeline
  • Fallback Support: Graceful handling of camera unavailability

Usage:

# Direct script execution
python src/fer/camera_fer_inference.py

# Console command
fer-camera
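
For orientation, here is a minimal sketch of the preprocessing and inference steps a FER2013-style model requires (48x48 grayscale input, as described under Model Information below). The class ordering and the assumption that the .pth file holds a full serialized model are illustrative, not taken from camera_fer_inference.py.

# Minimal FER inference sketch (illustrative; not the exact camera_fer_inference.py logic)
import cv2
import numpy as np
import torch

EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]  # assumed class order

model = torch.load("models/fer2013_final_model.pth", map_location="cpu")  # assumes a full serialized model
model.eval()

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()
if ret:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)              # FER2013 models expect grayscale
    face = cv2.resize(gray, (48, 48)).astype(np.float32) / 255.0
    tensor = torch.from_numpy(face).unsqueeze(0).unsqueeze(0)   # shape (1, 1, 48, 48)
    with torch.no_grad():
        probs = torch.softmax(model(tensor), dim=1)[0]
    print(EMOTIONS[int(probs.argmax())], round(float(probs.max()), 3))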

💬 TER (Textual Emotion Recognition)

  • Voice-to-Text Pipeline: Real-time speech recognition and emotion analysis
  • DistilBERT Model: State-of-the-art transformer-based emotion classification
  • Multi-emotion Support: Comprehensive emotion category detection
  • Robust Processing: Advanced text preprocessing and normalization
  • Audio Fallback: Multiple audio input handling strategies

Usage:

# Direct script execution  
python src/ter/voice_ter_inference.py

# Console command
ter-voice
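
A minimal sketch of the text-classification step, assuming models/ter_distilbert_model/ contains a Hugging Face sequence-classification checkpoint (speech capture and preprocessing are omitted; label names come from the saved config).

# Minimal TER inference sketch (illustrative; not the exact voice_ter_inference.py logic)
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_DIR = "models/ter_distilbert_model"
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
model.eval()

text = "I am really happy with how this turned out"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
print(model.config.id2label[int(probs.argmax())], round(float(probs.max()), 3))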

🔀 Multimodal Fusion System

The core innovation of this system is combining FER and TER predictions for higher accuracy than either modality alone:

  • Multiple Fusion Strategies (see the sketch below):
    • Confidence-based: Selects the prediction with the highest confidence score
    • Weighted Average: Combines predictions with 60% facial, 40% textual weighting
    • Formula-based: Mathematical fusion using a custom algorithm
  • Real-time Integration: Simultaneous processing of visual and audio streams
  • Adaptive Fallback: Works with a single modality when needed
  • Interactive GUI: Live visualization of both streams and fusion results
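
As a rough illustration of how two per-emotion probability distributions can be fused, here is a simplified sketch of the weighted-average and confidence-based strategies (the actual implementations, including the formula-based strategy, live in src/multimodal/ and are validated by the tests above).

# Illustrative fusion helpers (simplified; see src/multimodal/ and tests/ for the real strategies)
def weighted_average_fusion(fer_probs, ter_probs, w_face=0.6, w_text=0.4):
    """Combine per-emotion probabilities with fixed modality weights."""
    emotions = set(fer_probs) | set(ter_probs)
    fused = {e: w_face * fer_probs.get(e, 0.0) + w_text * ter_probs.get(e, 0.0) for e in emotions}
    return max(fused, key=fused.get), fused

def confidence_based_fusion(fer_probs, ter_probs):
    """Pick whichever modality is more confident about its top emotion."""
    fer_top = max(fer_probs, key=fer_probs.get)
    ter_top = max(ter_probs, key=ter_probs.get)
    return fer_top if fer_probs[fer_top] >= ter_probs[ter_top] else ter_top

fer = {"Happy": 0.7, "Neutral": 0.2, "Sad": 0.1}
ter = {"Happy": 0.4, "Neutral": 0.5, "Sad": 0.1}
print(weighted_average_fusion(fer, ter))  # Happy wins: 0.6*0.7 + 0.4*0.4 = 0.58
print(confidence_based_fusion(fer, ter))  # 'Happy' (FER's 0.7 beats TER's 0.5)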

Usage:

# Default multimodal processing
python src/multimodal/multimodal_emotion_inference.py

# With specific fusion strategy
python src/multimodal/multimodal_emotion_inference.py --fusion confidence_based
python src/multimodal/multimodal_emotion_inference.py --fusion weighted_average  
python src/multimodal/multimodal_emotion_inference.py --fusion formula_based

# Console command
multimodal-emotion

🤖 Furhat Robot Integration

Complete social robotics platform integration for interactive emotion recognition:

  • Furhat Remote API: Official SDK integration for robot communication
  • Interactive Responses: Robot gestures, speech, and LED feedback based on detected emotions
  • Voice Integration: Uses robot's microphone for natural interaction
  • Real-time Feedback: Immediate emotional responses and social cues
  • Robust Connection: Graceful fallback when robot unavailable

Robot Capabilities:

  • Emotional gesture mapping (BigSmile, Frown, Surprised, etc.)
  • LED color changes reflecting emotional states
  • Speech synthesis for emotion acknowledgment
  • Interactive conversation flow
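
A minimal sketch of this kind of feedback loop using the furhat-remote-api Python package; the robot IP, gesture names, and emotion-to-colour mapping below are illustrative assumptions rather than the project's exact configuration.

# Illustrative Furhat feedback sketch (assumes the furhat-remote-api package and a reachable robot)
from furhat_remote_api import FurhatRemoteAPI

furhat = FurhatRemoteAPI("192.168.1.100")  # hypothetical robot IP; the Remote API skill must be running

# Hypothetical emotion -> (gesture, LED colour) mapping, for illustration only
FEEDBACK = {
    "Happy":    ("BigSmile",   (0, 255, 0)),
    "Sad":      ("ExpressSad", (0, 0, 255)),
    "Surprise": ("Surprise",   (255, 255, 0)),
}

def react(emotion: str) -> None:
    gesture, (r, g, b) = FEEDBACK.get(emotion, ("Nod", (255, 255, 255)))
    furhat.gesture(name=gesture)                      # play a facial gesture
    furhat.set_led(red=r, green=g, blue=b)            # reflect the emotion in the LED ring
    furhat.say(text=f"You seem {emotion.lower()}.")   # spoken acknowledgment

react("Happy")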

Usage:

# Furhat integration (requires robot connection)
python src/furhat/furhat_multimodal_emotion_inference.py

# With fusion strategy
python src/furhat/furhat_multimodal_emotion_inference.py --fusion formula_based

# Console command  
furhat-emotion

📚 Comprehensive Documentation

Detailed documentation for each component is available in the docs/ directory:

Core Documentation

Component-Specific Guides

βš™οΈ Advanced Configuration

Custom Model Usage

# Use custom FER model
python src/multimodal/multimodal_emotion_inference.py \
    --fer_model ./path/to/custom_fer_model.pth

# Use custom TER model  
python src/multimodal/multimodal_emotion_inference.py \
    --ter_model ./path/to/custom_ter_model/

Fusion Strategy Selection

# Confidence-based fusion (chooses most confident prediction)
python src/multimodal/multimodal_emotion_inference.py --fusion confidence_based

# Weighted average fusion (60% facial, 40% textual)  
python src/multimodal/multimodal_emotion_inference.py --fusion weighted_average

# Formula-based fusion (mathematical optimization)
python src/multimodal/multimodal_emotion_inference.py --fusion formula_based

Performance Optimization

# GPU acceleration (if available)
export CUDA_VISIBLE_DEVICES=0
python src/multimodal/multimodal_emotion_inference.py

# CPU-only mode
export CUDA_VISIBLE_DEVICES=""
python src/multimodal/multimodal_emotion_inference.py
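
Within the scripts, the PyTorch-based components generally run on whichever device is visible; the following is a minimal sketch of that selection logic, included as an assumption for clarity.

# Illustrative device selection (the project's actual logic may differ)
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running inference on: {device}")
# model.to(device)  # move loaded models to the selected device before inference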

🔧 Development & Extension

Project Structure Overview

The system follows a modular architecture allowing easy extension and modification:

  • src/ - Core source code with modular components
  • models/ - Pre-trained models (FER CNN, TER DistilBERT)
  • datasets/ - Training and evaluation datasets
  • tests/ - Comprehensive test suite
  • demos/ - Usage examples and demonstrations
  • docs/ - Detailed technical documentation
  • requirements/ - Component-specific dependency management

Adding New Components

  1. Create module in appropriate src/ subdirectory
  2. Add requirements to relevant requirements file
  3. Create tests in tests/ directory
  4. Add demo in demos/ directory
  5. Update documentation in docs/

Contributing

This project uses professional development practices:

  • Modular Design: Each component is self-contained and reusable
  • Comprehensive Testing: Full test coverage with validation scripts
  • Documentation: Extensive documentation for all components
  • Package Management: Proper Python packaging with setuptools
  • Console Integration: Command-line tools for easy usage

πŸ› οΈ Troubleshooting

Common Issues

Camera not detected:

# Test camera access
python -c "import cv2; print('Camera available:', cv2.VideoCapture(0).isOpened())"

Microphone issues:

# Test microphone access  
python -c "import speech_recognition as sr; print('Microphone available:', len(sr.Microphone.list_microphone_names()) > 0)"

Model loading errors:

  • Ensure models are downloaded and placed in the models/ directory
  • Check file permissions and paths
  • Verify CUDA availability for GPU models (quick check below)
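
A quick check for PyTorch GPU availability, in the same style as the camera and microphone checks above:

# Test PyTorch and CUDA availability
python -c "import torch; print('PyTorch:', torch.__version__, '| CUDA available:', torch.cuda.is_available())"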

Furhat connection issues:

  • Verify robot IP address and port
  • Check network connectivity
  • Ensure Furhat Remote API service is running

Performance Tips

  • Use GPU acceleration when available
  • Close other applications using camera/microphone
  • Ensure adequate lighting for facial recognition
  • Use external microphone for better voice recognition
  • Run system health check before important usage

📊 Model Information

FER Model (Facial Emotion Recognition)

  • Architecture: Custom CNN trained on FER2013
  • Input: 48x48 grayscale facial images
  • Output: 7 emotion classes (Happy, Sad, Angry, Fear, Surprise, Disgust, Neutral)
  • Accuracy: Optimized for real-time performance with robust preprocessing

TER Model (Textual Emotion Recognition)

  • Architecture: DistilBERT-based transformer
  • Input: Text transcribed from speech
  • Output: Multi-dimensional emotion classification
  • Features: Advanced text preprocessing and normalization

Fusion Algorithms

  • Confidence-based: Selects highest confidence prediction
  • Weighted Average: Optimized 60/40 facial/textual weighting
  • Formula-based: Mathematical fusion using correlation analysis

🎯 Use Cases

  • Research: Emotion recognition research and experimentation
  • Education: Teaching multimodal AI and emotion recognition
  • Healthcare: Patient emotion monitoring and therapy assistance
  • Human-Computer Interaction: Emotional interfaces and feedback systems
  • Social Robotics: Interactive robots with emotional intelligence
  • Accessibility: Emotion-aware assistive technologies

📈 Performance Metrics

The system has been validated across multiple scenarios:

  • Real-time Processing: < 100ms latency for combined FER+TER
  • Accuracy: Improved performance through multimodal fusion
  • Robustness: Graceful degradation with single modality
  • Scalability: Modular architecture supports easy extension

🔗 Related Projects

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👤 Author

Henry Ward

πŸ™ Acknowledgments

  • FER2013 dataset contributors
  • Hugging Face Transformers library
  • OpenCV community
  • Furhat Robotics platform
  • PyTorch and scikit-learn teams

📞 Support

For issues, questions, or contributions:

  1. Check the Issues page
  2. Review the comprehensive documentation in docs/
  3. Run the system health check: python demos/demo_multimodal_usage.py
  4. Create a new issue with detailed information

Built with ❤️ for multimodal emotion recognition research and applications
