Abstract
Recent advances in deep learning have enabled the deployment of large, multimodal models for depression analysis, but the increasing complexity of such models has resulted in slow deployment times. This work proposes multi-stream, audio-only models for depression analysis that use transformer weights attended to by low-level descriptors (LLDs) through an attention-weighted sum. The approach rests on the hypothesis that handcrafted feature sets can reduce the need for extensive transformer pre-training. Extensive experimentation on the DAIC-WOZ test dataset shows that a combination of an audio spectrogram transformer (AST) and a Mel-frequency cepstral coefficient (MFCC) based convolutional neural network (AST-MFCC) achieves the highest accuracy in our suite of models but reports marginally lower macro F1 scores than both a naive AST and pure LLD-based models, suggesting that injecting extra feature streams makes models more conservative and limits false positives. The naive transformer-based and LLD-based models, however, are surprisingly more effective at flagging depressed patients, at the cost of an acceptable number of false positives. Taken together, our results suggest that adding extra feature streams gives existing models a distinct and controllable discriminating power and can assist lightweight models in low-data, audio-only settings.
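The trade-off described in the abstract, where one model has higher accuracy but lower macro F1 than another, can be illustrated with a small sketch. The confusion counts below are hypothetical toy values chosen for illustration only; they are not results from the paper, and the helper names `f1` and `summarize` are assumptions introduced here.

```python
def f1(tp, fp, fn):
    """F1 score for one class from its true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize(tp, fn, fp, tn):
    """Accuracy, macro F1, and positive-class recall for a binary confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    f1_pos = f1(tp, fp, fn)
    # For the negative class the roles swap: TN are its hits,
    # FN are its false positives, and FP are its false negatives.
    f1_neg = f1(tn, fn, fp)
    macro_f1 = (f1_pos + f1_neg) / 2
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "macro_f1": macro_f1, "recall": recall}


# Hypothetical toy data: 100 recordings, 30 depressed (positive), 70 not.
# "conservative" raises few false alarms but misses many positives.
conservative = summarize(tp=12, fn=18, fp=2, tn=68)
# "sensitive" flags most positives but raises more false alarms.
sensitive = summarize(tp=24, fn=6, fp=16, tn=54)

for name, scores in (("conservative", conservative), ("sensitive", sensitive)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

With these toy counts the conservative model scores higher on accuracy (0.80 against 0.78) but lower on macro F1 (about 0.709 against 0.758), because macro F1 weights the minority depressed class equally with the majority class and so penalizes missed positives more heavily than accuracy does.
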
Original language | English |
---|---|
Title of host publication | 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) |
Editors | Ting-Lan Lin, Yoshinobu Kajikawa, Zhaoxia Yin |
Place of Publication | Piscataway NJ USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 1323-1328 |
Number of pages | 6 |
ISBN (Electronic) | 9798350300673 |
ISBN (Print) | 9798350300680 |
Publication status | Published - 2023 |
Event | Annual Summit and Conference of the Asia-Pacific-Signal-and-Information-Processing-Association (APSIPA) 2023, Taipei, Taiwan; Duration: 31 Oct 2023 → 3 Nov 2023; Conference number: 15th; Website: https://www.apsipa2023.org/ ; Proceedings: https://ieeexplore.ieee.org/xpl/conhome/10317071/proceeding |
Conference
Conference | Annual Summit and Conference of the Asia-Pacific-Signal-and-Information-Processing-Association (APSIPA) 2023 |
---|---|
Abbreviated title | APSIPA 2023 |
Country/Territory | Taiwan |
City | Taipei |
Period | 31/10/23 → 3/11/23 |