A customer in automotive component manufacturing wanted to build a multi-modal distress recognition system for the passenger transportation industry.
When a passenger is distressed or feels threatened by the driver's behavior, the in-car gateway device, which captures a continuous audio stream, analyzes the audio, recognizes the stress signal through voice emotion detection, and triggers an event to the server.
Server-side components then analyze the video from 5 seconds before to 5 seconds after the event. With the help of a human action recognition and classification model, they assess the threat level and escalate to security personnel or according to the rules set in the notification engine.
In parallel, the server-side component sends a notification to a parent or guardian to join the live view and check whether the passenger is threatened and in real trouble.
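The 5-seconds-before to 5-seconds-after video window can be illustrated with a small rolling buffer. This is only a sketch under stated assumptions: the `Frame` and `FrameBuffer` names, the 30-second retention, and the one-frame-per-second demo are hypothetical and not taken from the actual system.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Frame:
    timestamp: float  # seconds since the start of the recording
    data: bytes       # encoded image payload (placeholder in this sketch)


class FrameBuffer:
    """Hypothetical rolling buffer that keeps recent frames so that footage
    from before an audio-triggered event is still available."""

    def __init__(self, retention_s: float = 30.0):  # retention is an assumption
        self.retention_s = retention_s
        self._frames = deque()

    def add(self, frame: Frame) -> None:
        self._frames.append(frame)
        # Drop frames that are older than the retention period.
        while frame.timestamp - self._frames[0].timestamp > self.retention_s:
            self._frames.popleft()

    def clip(self, event_ts: float, before_s: float = 5.0, after_s: float = 5.0):
        """Return the frames between event_ts - before_s and event_ts + after_s."""
        start, end = event_ts - before_s, event_ts + after_s
        return [f for f in self._frames if start <= f.timestamp <= end]


# Demo: one frame per second for 21 seconds, event raised at t = 10 s.
buffer = FrameBuffer()
for t in range(21):
    buffer.add(Frame(timestamp=float(t), data=b""))

event_clip = buffer.clip(event_ts=10.0)
print(len(event_clip), event_clip[0].timestamp, event_clip[-1].timestamp)
```

In a setup like this, the clip can only be cut once the 5 seconds after the event have been recorded, so the analysis would start slightly after the trigger arrives.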
Identification of basic emotions such as Anger, Happiness, Sadness, Neutral, and Fear
Trigger a notification to the server when negative emotions are detected in conversations between the occupants of the vehicle
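The second objective, raising a server notification on negative emotions, could be sketched as below. The grouping of Anger, Fear, and Sad as "negative", the 3-of-5 voting rule, and the `DistressTrigger` name are all assumptions made for illustration; the source does not describe how the real trigger was debounced.

```python
from collections import deque

# Assumption: which of the five labels count as "negative".
NEGATIVE_EMOTIONS = {"anger", "fear", "sad"}


class DistressTrigger:
    """Hypothetical debouncer: fire once when enough of the most recent
    audio windows were classified as a negative emotion."""

    def __init__(self, window: int = 5, min_negative: int = 3):
        self.min_negative = min_negative
        self._recent = deque(maxlen=window)  # True for each negative window
        self._active = False                 # True while distress persists

    def update(self, emotion: str) -> bool:
        """Feed one predicted label; return True only when a new event starts."""
        self._recent.append(emotion.lower() in NEGATIVE_EMOTIONS)
        distressed = sum(self._recent) >= self.min_negative
        fire = distressed and not self._active
        self._active = distressed
        return fire


# Demo: per-window labels as they might come from the emotion model.
labels = ["Neutral", "Anger", "Fear", "Neutral", "Anger", "Anger", "Happiness"]
trigger = DistressTrigger()
events = [trigger.update(label) for label in labels]
print(events)
```

Voting over several windows is one way to avoid sending an event for a single misclassified window; the right thresholds would depend on the window length and the model's error rate.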
TECHNOLOGIES / TOOLS
Mel Coefficients & Mel Spectrograms as Audio Features
Used a hierarchical classifier to classify the emotion
Evaluated multiple datasets (RAVDESS, SAVEE, IEMOCAP); we finally used RAVDESS only, since model performance with all three datasets together was very poor
Successfully ported the model to platforms such as Qualcomm QCS603 and QCS605
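Two of the items above, Mel-scale features and hierarchical classification, can be sketched in a few lines. The Mel conversion uses one commonly used formula (2595 * log10(1 + f / 700)); other variants exist. The two-stage hierarchy shown here (negative vs. non-negative first, then the specific emotion) is an assumption, since the source does not say how the hierarchy was organized, and the lambda "models" are placeholders rather than trained classifiers.

```python
import math
from typing import Callable, Dict, List, Sequence


def hz_to_mel(freq_hz: float) -> float:
    """Convert frequency in Hz to the Mel scale (one common variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies (Hz) of n_bands filters spaced evenly in Mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * i) for i in range(1, n_bands + 1)]


Features = Sequence[float]
Classifier = Callable[[Features], str]


class HierarchicalClassifier:
    """Hypothetical two-stage classifier: a coarse model picks a group,
    then the model registered for that group picks the final label."""

    def __init__(self, coarse: Classifier, fine: Dict[str, Classifier]):
        self.coarse = coarse
        self.fine = fine

    def predict(self, features: Features) -> str:
        group = self.coarse(features)
        return self.fine[group](features)


# Placeholder stage models with made-up thresholds, for illustration only.
model = HierarchicalClassifier(
    coarse=lambda x: "negative" if x[0] > 0.5 else "non_negative",
    fine={
        "negative": lambda x: "Anger" if x[1] > 0.5 else "Fear",
        "non_negative": lambda x: "Happiness" if x[1] > 0.5 else "Neutral",
    },
)

centers = mel_band_centers(n_bands=4, f_min=0.0, f_max=8000.0)
print([round(c) for c in centers])
print(model.predict([0.9, 0.9]), model.predict([0.1, 0.2]))
```

Spacing the bands evenly in Mel rather than in Hz places more filters at low frequencies, which is the reason Mel-based features are a common choice for speech.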