
Joint Research Team's Work Accepted by a CCF-Recommended International Academic Conference

Posted by: 張鈺歆    Date: 2022-04-04

上海對(duì)外經(jīng)貿(mào)大學(xué)人工智能與變革管理研究院聯(lián)合研究團(tuán)隊(duì)付子旺(碩士生)、許晴(碩士生)、劉峰*、齊佳音*等撰寫(xiě)的論文“NHFNet: A Non-Homogeneous Fusion Network for Multimodal Sentiment Analysis”被中國(guó)計(jì)算機(jī)學(xué)會(huì)(CCF)推薦B類(lèi)國(guó)際學(xué)術(shù)會(huì)議—2022 IEEE International Conference on Multimedia and Expo(ICME)錄用。該研究提出了一種創(chuàng)新的非同階的多模態(tài)融合網(wǎng)絡(luò)來(lái)實(shí)現(xiàn)信息密度不同的三種模態(tài)的交互,利用帶有attention聚合融合模塊和跨模態(tài)attention融合來(lái)實(shí)現(xiàn)公平和高效的交互。該網(wǎng)絡(luò)設(shè)計(jì)了一種創(chuàng)新的融合策略來(lái)實(shí)現(xiàn)低階信號(hào)特征的強(qiáng)化,克服了成對(duì)注意力的二次復(fù)雜度,提高了模態(tài)間互補(bǔ)信息整合的能力。研究團(tuán)隊(duì)在CMU-MOSEI數(shù)據(jù)集上分別設(shè)定了對(duì)齊和非對(duì)齊兩種實(shí)驗(yàn)環(huán)境,實(shí)驗(yàn)結(jié)果表明,NHFNet優(yōu)于目前最先進(jìn)的融合策略。


Paper Abstract

Abstract: 

Fusion technology is crucial for multimodal sentiment analysis. Recent attention-based fusion methods demonstrate high performance and strong robustness. However, these approaches ignore the difference in information density among the three modalities, i.e., visual and audio carry low-level signal features whereas text carries high-level semantic features. To this end, we propose a non-homogeneous fusion network (NHFNet) to achieve multimodal information interaction. Specifically, a fusion module with attention aggregation is designed to fuse the visual and audio modalities and enhance them to high-level semantic features. Then, cross-modal attention is used to achieve mutual information reinforcement between the text modality and the audio-visual fusion. NHFNet compensates for the differences in information density across modalities, enabling their fair interaction. To verify the effectiveness of the proposed method, we set up aligned and unaligned experiments on the CMU-MOSEI dataset. The experimental results show that the proposed method outperforms the state of the art.
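
For readers who want a concrete picture of the two-stage interaction described above, the following is a minimal sketch of that flow in PyTorch. It is not the authors' implementation: the class name NHFNetSketch, the feature dimensions, the single-layer encoder, and the mean-pooling head are illustrative assumptions; only the order of operations (attention-aggregation fusion of visual and audio, then cross-modal attention with text) follows the description above.

import torch
import torch.nn as nn

class NHFNetSketch(nn.Module):
    """Illustrative sketch only, not the published NHFNet code."""
    def __init__(self, d_model=128, n_heads=4):
        super().__init__()
        # Stage 1: fuse the low-level visual and audio streams with self-attention,
        # lifting them toward higher-level semantic features (approximated here by
        # one transformer encoder layer over the concatenated audio-visual sequence).
        self.av_fusion = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            num_layers=1,
        )
        # Stage 2: cross-modal attention, with the text sequence as queries attending
        # to the audio-visual fusion, so the two representations reinforce each other.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, 1)  # sentiment score

    def forward(self, text, visual, audio):
        # All inputs: (batch, seq_len, d_model) after per-modality projection.
        av = self.av_fusion(torch.cat([visual, audio], dim=1))
        fused, _ = self.cross_attn(query=text, key=av, value=av)
        return self.head(fused.mean(dim=1))  # pool over time, predict sentiment

# Toy usage: batch of 2, 20 time steps, 128-dim features per modality.
model = NHFNetSketch()
t, v, a = (torch.randn(2, 20, 128) for _ in range(3))
print(model(t, v, a).shape)  # torch.Size([2, 1])

Because the visual and audio sequences are fused before any interaction with text, the text stream only needs to attend to a single fused sequence, which is how the design avoids the quadratic cost of fully pairwise cross-modal attention.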