Multimodal perception for enhancing human computer interaction through real-world affect recognition
Abstract
Human-computer interaction can benefit from real-world affect recognition in applications such as healthcare and assistive robotics. Humans express emotions through various modalities, of which audio and visual cues are the most significant. A unimodal approach that relies only on speech or only on visual data is challenging in natural, dynamic environments. The proposed methodology integrates a pretrained model with a convolutional neural network (CNN) to provide a robust initialization point and to address the limited availability of facial expression data. The multimodal framework enhances discriminative power by combining visual scores with speech. This work addresses the challenges at each stage of the real-world affect recognition framework, including data preprocessing, feature extraction, feature fusion, and final classification. A 1D-CNN is trained on spectral and prosodic audio features, while deep visual features are processed using a 2D-CNN. The proposed system's performance was evaluated on the extended Cohn-Kanade (CK+), acted-facial-expressions in-the-wild (AFEW), and real-world-affective-face-database (RAF) datasets, which are commonly used in facial expression recognition research. Experimental results indicate that 2% to 5% of visual data from natural settings were undetected, and that including the audio modality improved performance by providing relevant and supplementary information.
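The combination of visual scores with speech can be illustrated with a minimal sketch of score-level (late) fusion. The abstract does not state the fusion rule, so the weighted sum used here is an assumption, as are the function names `fuse_scores` and `predict` and the `visual_weight` parameter. The sketch also shows one plausible way the audio modality could cover clips in which no face is detected.

```python
from typing import List, Optional, Sequence


def fuse_scores(
    audio_probs: Sequence[float],
    visual_probs: Optional[Sequence[float]],
    visual_weight: float = 0.6,  # hypothetical weight, not taken from the paper
) -> List[float]:
    """Weighted-sum late fusion of per-class probabilities (assumed rule).

    visual_probs is None when no face was detected in the clip; in that
    case the audio scores are returned unchanged.
    """
    if visual_probs is None:
        return list(audio_probs)
    if len(audio_probs) != len(visual_probs):
        raise ValueError("both modalities must score the same classes")
    fused = [
        visual_weight * v + (1.0 - visual_weight) * a
        for a, v in zip(audio_probs, visual_probs)
    ]
    total = sum(fused)
    if total == 0:
        raise ValueError("scores must not all be zero")
    # Renormalize so the fused scores sum to one.
    return [f / total for f in fused]


def predict(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Illustrative values only: three emotion classes scored by each modality.
audio = [0.2, 0.7, 0.1]
visual = [0.5, 0.3, 0.2]
label_both = predict(fuse_scores(audio, visual))
label_audio_only = predict(fuse_scores(audio, None))
```

In this toy example the visual scores alone favor the first class, while the fused scores favor the second, which is the kind of supplementary evidence the audio modality is described as providing.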
Keywords
Affect recognition; Audio-visual data; Deep neural network; Multimodal; Preprocessing
DOI: http://doi.org/10.11591/ijeecs.v38.i1.pp428-438
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).