Abstract
We present our work on sentiment prediction using the benchmark MOSI dataset from the CMU-MultimodalDataSDK. Previous work on multimodal sentiment analysis has focused on input-level feature fusion or decision-level fusion. Here, we propose an intermediate-level feature fusion, which merges weights from each modality (audio, video, and text) during training, followed by additional training. Moreover, we tested principal component analysis (PCA) for feature selection. We found that applying PCA increases unimodal performance, and that multimodal fusion outperforms unimodal models. Our experiments show that our proposed intermediate-level feature fusion outperforms other fusion techniques, achieving the best performance with an overall binary accuracy of 74.0% on video+text modalities. Our work also improves feature selection for unimodal sentiment analysis, while proposing a novel and effective multimodal fusion architecture for this task.
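The PCA feature-selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-modality feature dimensions, the number of retained components, and the concatenation used to stand in for fusion are all assumptions for demonstration, and the actual MOSI features and the proposed intermediate-level weight-merging fusion differ.

```python
import numpy as np

def pca_reduce(X, k):
    """Project feature matrix X (n_samples x n_features) onto its top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)                        # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                           # (n_samples, k)

# Hypothetical per-modality features; dimensions are illustrative, not MOSI's.
rng = np.random.default_rng(0)
audio = rng.normal(size=(100, 74))
video = rng.normal(size=(100, 47))
text = rng.normal(size=(100, 300))

# Apply PCA per modality, then (as a placeholder for fusion) concatenate the reduced features.
fused = np.concatenate([pca_reduce(m, 20) for m in (audio, video, text)], axis=1)
print(fused.shape)  # (100, 60)
```

In the paper's setting, each reduced unimodal representation would instead feed a network whose weights are merged mid-training, rather than being concatenated at the input.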
Original language | English |
---|---|
Title of host publication | ACL 2018 - First Grand Challenge and Workshop on Human Multimodal Language (Challenge-HML) |
Subtitle of host publication | Proceedings of the Workshop - July 20, 2018 Melbourne, Australia |
Editors | Amir Zadeh, Louis-Philippe Morency, Paul Pu Liang, Soujanya Poria, Erik Cambria, Stefan Scherer |
Place of Publication | Stroudsburg PA USA |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 64-72 |
Number of pages | 9 |
ISBN (Electronic) | 9781948087469 |
Publication status | Published - 2018 |
Externally published | Yes |
Event | Grand Challenge and Workshop on Human Multimodal Language 2018 - Melbourne, Australia Duration: 20 Jul 2018 → 20 Jul 2018 http://multicomp.cs.cmu.edu/acl2018multimodalchallenge/ |
Conference
Conference | Grand Challenge and Workshop on Human Multimodal Language 2018 |
---|---|
Abbreviated title | Challenge-HML 2018 |
Country/Territory | Australia |
City | Melbourne |
Period | 20/07/18 → 20/07/18 |
Internet address | http://multicomp.cs.cmu.edu/acl2018multimodalchallenge/ |