UniMSE: a unified approach for multimodal sentiment analysis leveraging the CMU-MOSI dataset
Abstract
This paper explores multimodal sentiment analysis using the CMU-MOSI dataset to enhance emotion detection through a unified approach called UniMSE. Traditional sentiment analysis, often reliant on single modalities such as text, faces limitations in capturing complex emotional nuances. UniMSE overcomes these challenges by integrating text, audio, and visual cues, significantly improving sentiment classification accuracy. The study reviews key datasets and compares leading models, showcasing the strengths of multimodal approaches. UniMSE leverages task formalization, pre-trained modality fusion, and multimodal contrastive learning, achieving superior performance on widely used benchmarks like MOSI and MOSEI. Additionally, the paper addresses the difficulties in effectively fusing diverse modalities and interpreting non-verbal signals, including sarcasm and tone. Future research directions are proposed to further advance multimodal sentiment analysis, with potential applications in areas like social media monitoring and mental health assessment. This work highlights UniMSE's contribution to developing more empathetic artificial intelligence (AI) systems capable of understanding complex emotional expressions.
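To make the abstract's description of UniMSE more concrete, the sketch below shows one way a tri-modal model can fuse text, audio, and visual features and combine a sentiment regression objective with a contrastive term. This is an illustrative assumption, not the authors' implementation: the module names, feature dimensions, and the InfoNCE-style loss are hypothetical choices that merely mirror the ideas of modality fusion and multimodal contrastive learning named above.

```python
# Illustrative sketch only: a minimal tri-modal fusion encoder with an
# InfoNCE-style contrastive term, loosely following the ideas the abstract
# attributes to UniMSE (modality fusion + multimodal contrastive learning).
# All names, dimensions, and the loss form are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriModalFusion(nn.Module):
    def __init__(self, text_dim=768, audio_dim=74, visual_dim=35, hidden=128):
        super().__init__()
        # Project each modality into a shared space before fusion.
        self.text_proj = nn.Linear(text_dim, hidden)
        self.audio_proj = nn.Linear(audio_dim, hidden)
        self.visual_proj = nn.Linear(visual_dim, hidden)
        # Simple concatenation-based fusion followed by a regression head
        # (MOSI/MOSEI sentiment scores lie in [-3, 3]).
        self.fuse = nn.Sequential(nn.Linear(3 * hidden, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, 1)

    def forward(self, text, audio, visual):
        t = F.normalize(self.text_proj(text), dim=-1)
        a = F.normalize(self.audio_proj(audio), dim=-1)
        v = F.normalize(self.visual_proj(visual), dim=-1)
        fused = self.fuse(torch.cat([t, a, v], dim=-1))
        return self.head(fused).squeeze(-1), t, a

def info_nce(anchor, positive, temperature=0.07):
    # Matching (text, audio) pairs from the same clip act as positives;
    # every other pairing in the batch acts as a negative.
    logits = anchor @ positive.t() / temperature
    targets = torch.arange(anchor.size(0))
    return F.cross_entropy(logits, targets)

if __name__ == "__main__":
    model = TriModalFusion()
    text = torch.randn(8, 768)    # e.g., sentence embeddings
    audio = torch.randn(8, 74)    # e.g., acoustic features
    visual = torch.randn(8, 35)   # e.g., facial/visual features
    score, t, a = model(text, audio, visual)
    labels = torch.empty(8).uniform_(-3, 3)  # sentiment intensity targets
    loss = F.l1_loss(score, labels) + 0.1 * info_nce(t, a)
    loss.backward()
    print(f"combined loss: {loss.item():.4f}")
```

In this toy setup the contrastive term simply encourages text and audio representations of the same clip to align; the actual UniMSE formulation described in the paper may differ in modalities paired, loss weighting, and fusion architecture.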
Keywords
Data fusion; Emotion recognition; MOSI; Sentiment analysis; UniMSE
DOI: http://doi.org/10.11591/ijeecs.v39.i3.pp2032-2042