Cammotion
Overview
Cammotion is a sophisticated real-time emotion detection application powered by the EMO-AffectNetModel, a state-of-the-art ResNet50-based neural network achieving 66.49% accuracy on the AffectNet validation set. The application features an immersive circular camera interface with dynamic starburst visualizations that represent detected emotions in real-time.
Built with a modern Django backend and Vite-powered frontend, Cammotion processes your camera feed locally on your device, ensuring complete privacy while delivering smooth performance at ~10 FPS. The system supports multi-face detection and provides individual emotion analysis for each detected face.
This project demonstrates the practical application of cross-corpus visual emotion recognition research, integrating models trained on multiple emotion datasets (AffectNet, RAVDESS, CREMA-D, SAVEE, RAMAS, IEMOCAP, Aff-Wild2) into a production-ready web application.
Architecture
At a high level, the Vite frontend captures camera frames and POSTs them to the Django REST API, which runs face detection and EMO-AffectNetModel inference, then returns per-face emotion predictions for the frontend to render as the circular starburst visualization.
Features
Immersive Circular Interface: Dynamic circular camera feed with starburst emotion beams extending outward
Multi-Face Support: Detects and analyzes emotions for multiple faces simultaneously with individual tracking
Real-Time Performance: ~10 FPS processing with live bandwidth and frame count monitoring
Color-Coded Emotions: Each emotion has a unique color with beam length reflecting confidence scores
Advanced Face Detection: RetinaFace integration via batch-face library with OpenCV Haar cascade fallback
Privacy-First Design: All emotion recognition processed locally - no data sent to external servers
Hot Module Replacement: Vite-powered development with instant updates during development
Tech Stack
Backend
- Django 4.2+ with Django REST Framework
- Python 3.8+
- Uvicorn/Gunicorn for serving
- CORS headers for API access
Machine Learning
- TensorFlow/Keras for inference
- EMO-AffectNetModel (ResNet50)
- VGGFace2 preprocessing
- Trained on 7 emotion datasets
Computer Vision
- batch-face (RetinaFace wrapper)
- OpenCV for video processing
- Haar cascade fallback detection
- 224x224 RGB image input
Frontend
- Vite for build tooling
- Vanilla JavaScript ES6+
- WebRTC for camera access
- Canvas 2D API for rendering
How It Works
1. Frame Capture & Encoding
The Vite frontend captures video frames from your camera using WebRTC APIs. Each frame is drawn to a canvas element, encoded as base64, and sent to the Django backend via a POST request to the /api/detect/ endpoint.
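To make the wire format concrete, here is a minimal Python sketch of the same capture-and-POST flow, using OpenCV and requests in place of the browser's WebRTC and canvas APIs. The /api/detect/ path comes from the description above; the JSON field name and the response shape are illustrative assumptions.

```python
# Minimal sketch of the capture-and-POST flow, using OpenCV and requests
# in place of the browser's WebRTC/canvas APIs. The endpoint path comes
# from the description above; the "image" field name is an assumption.
import base64

import cv2
import requests

cap = cv2.VideoCapture(0)                     # default camera
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("could not read a frame from the camera")

# JPEG-encode the frame, then base64 it (mirrors canvas.toDataURL())
ok, jpeg = cv2.imencode(".jpg", frame)
payload = {"image": base64.b64encode(jpeg.tobytes()).decode("ascii")}

resp = requests.post("http://localhost:8000/api/detect/", json=payload)
print(resp.json())  # per-face emotion predictions with confidence scores
```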
2. Face Detection Pipeline
The backend uses RetinaFace (via batch-face library) as the primary face detector, with OpenCV's Haar cascade classifier as a fallback. RetinaFace provides superior accuracy for challenging poses and lighting conditions, detecting facial landmarks and bounding boxes for all visible faces.
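A sketch of this primary/fallback strategy might look like the following. The batch-face calls follow that library's documented usage (a RetinaFace detector returning (box, landmarks, score) tuples), but treat the exact signature as an assumption; the fallback path uses OpenCV's bundled frontal-face Haar cascade.

```python
# Sketch of the primary/fallback detection strategy described above.
# batch_face's RetinaFace returns (box, landmarks, score) tuples per its
# docs; the exact call signature here is an assumption, not verified code.
import cv2

def detect_faces(frame_bgr):
    """Return a list of [x1, y1, x2, y2] face boxes."""
    try:
        from batch_face import RetinaFace
        detector = RetinaFace(gpu_id=-1)        # -1 selects CPU inference
        faces = detector(frame_bgr, cv=True)    # cv=True marks BGR input
        return [box for box, landmarks, score in faces]
    except Exception:
        # Fallback: OpenCV Haar cascade on the grayscale frame
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        rects = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        return [[x, y, x + w, y + h] for (x, y, w, h) in rects]
```

In a production loop the detector would be constructed once and reused across frames; it is built inside the function here only to keep the sketch self-contained.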
3. Preprocessing & Inference
Each detected face is cropped, resized to 224x224 pixels, and normalized using VGGFace2 preprocessing. The result is fed to the EMO-AffectNetModel (ResNet50 architecture), whose forward pass produces a 7-dimensional softmax probability distribution over the emotion categories.
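Sketched in Keras terms, the per-face step could look like this. The mean values follow the common VGGFace2 (keras-vggface, version 2) convention; the model file name and the emotion label order are illustrative assumptions, not confirmed details of Cammotion's pipeline.

```python
# Hypothetical sketch of the per-face preprocessing and inference step.
# Mean subtraction follows the common VGGFace2 convention; the model path
# and label order below are assumptions for illustration only.
import numpy as np
import tensorflow as tf

EMOTIONS = ["neutral", "happiness", "sadness", "surprise",
            "fear", "disgust", "anger"]  # assumed 7-class ordering

model = tf.keras.models.load_model("EMO-AffectNetModel.h5")  # hypothetical path

def predict_emotion(face_rgb):
    """Map an RGB face crop to a dict of emotion -> probability."""
    face = tf.image.resize(face_rgb, (224, 224)).numpy().astype("float32")
    face = face[..., ::-1]                                  # RGB -> BGR
    face -= np.array([91.4953, 103.8827, 131.0912])         # VGGFace2 means
    probs = model.predict(face[None, ...], verbose=0)[0]    # softmax output
    return dict(zip(EMOTIONS, probs.tolist()))
```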
4. Real-Time Visualization
The API returns emotion predictions with confidence scores for each detected face. The frontend renders these results in a circular interface with color-coded starburst beams extending from each face. Beam length corresponds to confidence scores, creating an immersive visualization that updates at ~10 FPS. Bandwidth and frame rate metrics are displayed in real-time.
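The starburst mapping itself is simple geometry: spread the seven emotions evenly around the circle and extend each beam in proportion to its confidence. The sketch below shows only that mapping; the radii and scaling constants are illustrative, and the actual rendering happens on the frontend via the Canvas 2D API.

```python
# Geometry-only sketch of the starburst mapping: each emotion gets an
# evenly spaced angle, and beam length grows with its confidence score.
# The radius and max_beam values are illustrative, not the app's values.
import math

def starburst_beams(center, radius, scores, max_beam=120.0):
    """Yield (x1, y1, x2, y2) beam segments, one per emotion score in [0, 1]."""
    cx, cy = center
    n = len(scores)
    for i, (emotion, conf) in enumerate(scores.items()):
        angle = 2 * math.pi * i / n               # evenly spaced directions
        length = radius + conf * max_beam         # beam length tracks confidence
        yield (cx + radius * math.cos(angle), cy + radius * math.sin(angle),
               cx + length * math.cos(angle), cy + length * math.sin(angle))
```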
Performance & Research
Model Accuracy
The EMO-AffectNetModel achieves state-of-the-art cross-corpus performance (66.49% accuracy on the AffectNet validation set) through training on seven diverse emotion datasets: AffectNet, RAVDESS, CREMA-D, SAVEE, RAMAS, IEMOCAP, and Aff-Wild2.
Research Foundation
Cammotion integrates the EMO-AffectNetModel from the published research:
"In Search of a Robust Facial Expressions Recognition Model: A Large-Scale Visual Cross-Corpus Study"
Published in Neurocomputing, 2022
Development Story
Cammotion was created to bridge the gap between academic emotion recognition research and practical web applications. The challenge was taking the EMO-AffectNetModel—a research model trained on multiple emotion datasets—and integrating it into a production-ready system with real-time performance constraints.
The architecture combines Django's robust backend capabilities with Vite's modern development experience. Key engineering decisions included implementing RetinaFace for superior face detection (with OpenCV fallback for robustness), optimizing the inference pipeline to maintain ~10 FPS, and designing an immersive circular interface that makes emotion visualization intuitive and engaging.
Privacy was fundamental to the design. While the frontend communicates with the backend API for inference, the entire system runs on your own infrastructure—no third-party services, no external tracking. Your facial data remains under your control.
Use Cases
User Experience Research: Analyze emotional responses to digital products and interfaces
Accessibility: Help individuals learn to recognize and express emotions
Content Testing: Measure emotional engagement with media and content
Education: Demonstrate AI and computer vision concepts in an interactive way