Cammotion
Overview
Cammotion is a sophisticated real-time emotion detection application powered by the EMO-AffectNetModel, a state-of-the-art ResNet50-based neural network achieving 66.49% accuracy on the AffectNet validation set. The application features an immersive circular camera interface with dynamic starburst visualizations that represent detected emotions in real-time.
Built with a modern Django backend and Vite-powered frontend, Cammotion processes your camera feed locally on your device, ensuring complete privacy while delivering smooth performance at ~10 FPS. The system supports multi-face detection and provides individual emotion analysis for each detected face.
This project demonstrates the practical application of cross-corpus visual emotion recognition research, integrating models trained on multiple emotion datasets (AffectNet, RAVDESS, CREMA-D, SAVEE, RAMAS, IEMOCAP, Aff-Wild2) into a production-ready web application.
Architecture
At a high level, the Vite frontend captures camera frames and POSTs them to the Django REST API, which runs face detection and EMO-AffectNetModel inference, then returns per-face emotion predictions for the frontend to render as the circular starburst visualization.
Features
Immersive Circular Interface: Dynamic circular camera feed with starburst emotion beams extending outward
Multi-Face Support: Detects and analyzes emotions for multiple faces simultaneously with individual tracking
Real-Time Performance: ~10 FPS processing with live bandwidth and frame count monitoring
Color-Coded Emotions: Each emotion has a unique color with beam length reflecting confidence scores
Advanced Face Detection: RetinaFace integration via batch-face library with OpenCV Haar cascade fallback
Privacy-First Design: All emotion recognition processed locally - no data sent to external servers
Hot Module Replacement: Vite-powered development with instant updates during development
Tech Stack
Backend
- Django 4.2+ with Django REST Framework
- Python 3.8+
- Uvicorn/Gunicorn for serving
- CORS headers for API access
Machine Learning
- TensorFlow/Keras for inference
- EMO-AffectNetModel (ResNet50)
- VGGFace2 preprocessing
- Trained on 7 emotion datasets
Computer Vision
- batch-face (RetinaFace wrapper)
- OpenCV for video processing
- Haar cascade fallback detection
- 224x224 RGB image input
Frontend
- Vite for build tooling
- Vanilla JavaScript ES6+
- WebRTC for camera access
- Canvas 2D API for rendering
How It Works
1. Frame Capture & Encoding
The Vite frontend captures video frames from your camera using WebRTC APIs. Each frame is drawn to a canvas element, encoded as base64, and sent to the Django backend via a POST request to the /api/detect/ endpoint.
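To make the wire format concrete, here is a minimal Python sketch of the same capture-and-POST flow, using OpenCV and requests in place of the browser's WebRTC and canvas APIs. The /api/detect/ path comes from the description above; the JSON field name and the response shape are illustrative assumptions.

```python
# Minimal sketch of the capture-and-POST flow, using OpenCV and requests
# in place of the browser's WebRTC/canvas APIs. The endpoint path comes
# from the description above; the "image" field name is an assumption.
import base64

import cv2
import requests

cap = cv2.VideoCapture(0)                     # default camera
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("could not read a frame from the camera")

# JPEG-encode the frame, then base64 it (mirrors canvas.toDataURL())
ok, jpeg = cv2.imencode(".jpg", frame)
payload = {"image": base64.b64encode(jpeg.tobytes()).decode("ascii")}

resp = requests.post("http://localhost:8000/api/detect/", json=payload)
print(resp.json())  # per-face emotion predictions with confidence scores
```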
2. Face Detection Pipeline
The backend uses RetinaFace (via batch-face library) as the primary face detector, with OpenCV's Haar cascade classifier as a fallback. RetinaFace provides superior accuracy for challenging poses and lighting conditions, detecting facial landmarks and bounding boxes for all visible faces.
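A sketch of this primary/fallback strategy might look like the following. The batch-face calls follow that library's documented usage (a RetinaFace detector returning (box, landmarks, score) tuples), but treat the exact signature as an assumption; the fallback path uses OpenCV's bundled frontal-face Haar cascade.

```python
# Sketch of the primary/fallback detection strategy described above.
# batch_face's RetinaFace returns (box, landmarks, score) tuples per its
# docs; the exact call signature here is an assumption, not verified code.
import cv2

def detect_faces(frame_bgr):
    """Return a list of [x1, y1, x2, y2] face boxes."""
    try:
        from batch_face import RetinaFace
        detector = RetinaFace(gpu_id=-1)        # -1 selects CPU inference
        faces = detector(frame_bgr, cv=True)    # cv=True marks BGR input
        return [box for box, landmarks, score in faces]
    except Exception:
        # Fallback: OpenCV Haar cascade on the grayscale frame
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        rects = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        return [[x, y, x + w, y + h] for (x, y, w, h) in rects]
```

In a production loop the detector would be constructed once and reused across frames; it is built inside the function here only to keep the sketch self-contained.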
3. Preprocessing & Inference
Each detected face is cropped, resized to 224x224 pixels, and normalized using VGGFace2 preprocessing. The result is fed to the EMO-AffectNetModel (ResNet50 architecture), whose forward pass produces a 7-dimensional softmax probability distribution over the emotion categories.
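Sketched in Keras terms, the per-face step could look like this. The mean values follow the common VGGFace2 (keras-vggface, version 2) convention; the model file name and the emotion label order are illustrative assumptions, not confirmed details of Cammotion's pipeline.

```python
# Hypothetical sketch of the per-face preprocessing and inference step.
# Mean subtraction follows the common VGGFace2 convention; the model path
# and label order below are assumptions for illustration only.
import numpy as np
import tensorflow as tf

EMOTIONS = ["neutral", "happiness", "sadness", "surprise",
            "fear", "disgust", "anger"]  # assumed 7-class ordering

model = tf.keras.models.load_model("EMO-AffectNetModel.h5")  # hypothetical path

def predict_emotion(face_rgb):
    """Map an RGB face crop to a dict of emotion -> probability."""
    face = tf.image.resize(face_rgb, (224, 224)).numpy().astype("float32")
    face = face[..., ::-1]                                  # RGB -> BGR
    face -= np.array([91.4953, 103.8827, 131.0912])         # VGGFace2 means
    probs = model.predict(face[None, ...], verbose=0)[0]    # softmax output
    return dict(zip(EMOTIONS, probs.tolist()))
```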
4. Real-Time Visualization
The API returns emotion predictions with confidence scores for each detected face. The frontend renders these results in a circular interface with color-coded starburst beams extending from each face. Beam length corresponds to confidence scores, creating an immersive visualization that updates at ~10 FPS. Bandwidth and frame rate metrics are displayed in real-time.
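The starburst mapping itself is simple geometry: spread the seven emotions evenly around the circle and extend each beam in proportion to its confidence. The sketch below shows only that mapping; the radii and scaling constants are illustrative, and the actual rendering happens on the frontend via the Canvas 2D API.

```python
# Geometry-only sketch of the starburst mapping: each emotion gets an
# evenly spaced angle, and beam length grows with its confidence score.
# The radius and max_beam values are illustrative, not the app's values.
import math

def starburst_beams(center, radius, scores, max_beam=120.0):
    """Yield (x1, y1, x2, y2) beam segments, one per emotion score in [0, 1]."""
    cx, cy = center
    n = len(scores)
    for i, (emotion, conf) in enumerate(scores.items()):
        angle = 2 * math.pi * i / n               # evenly spaced directions
        length = radius + conf * max_beam         # beam length tracks confidence
        yield (cx + radius * math.cos(angle), cy + radius * math.sin(angle),
               cx + length * math.cos(angle), cy + length * math.sin(angle))
```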
Performance & Research
Model Accuracy
The EMO-AffectNetModel achieves state-of-the-art cross-corpus performance (66.49% accuracy on the AffectNet validation set) through training on seven diverse emotion datasets: AffectNet, RAVDESS, CREMA-D, SAVEE, RAMAS, IEMOCAP, and Aff-Wild2.
Research Foundation
Cammotion integrates the EMO-AffectNetModel from the published research:
"In Search of a Robust Facial Expressions Recognition Model: A Large-Scale Visual Cross-Corpus Study"
Published in Neurocomputing, 2022
Development Story
Cammotion was created to bridge the gap between academic emotion recognition research and practical web applications. The challenge was taking the EMO-AffectNetModel—a research model trained on multiple emotion datasets—and integrating it into a production-ready system with real-time performance constraints.
The architecture combines Django's robust backend capabilities with Vite's modern development experience. Key engineering decisions included implementing RetinaFace for superior face detection (with OpenCV fallback for robustness), optimizing the inference pipeline to maintain ~10 FPS, and designing an immersive circular interface that makes emotion visualization intuitive and engaging.
Privacy was fundamental to the design. While the frontend communicates with the backend API for inference, the entire system runs on your own infrastructure—no third-party services, no external tracking. Your facial data remains under your control.
Use Cases
User Experience Research: Analyze emotional responses to digital products and interfaces
Accessibility: Help individuals learn to recognize and express emotions
Content Testing: Measure emotional engagement with media and content
Education: Demonstrate AI and computer vision concepts in an interactive way