Deep Encoded Linguistic and Acoustic Cues for Attention Based End to End Speech Emotion Recognition

Swapnil Bhosale; Rupayan Chakraborty; Sunil Kumar Kopparapu

DOI: https://doi.org/10.1109/icassp40776.2020.9054621



Published: May 1, 2020

Abstract

An end-to-end model with convolutional layers and a multi-head self-attention mechanism is proposed for the Speech Emotion Recognition (SER) task. As inputs, we propose to use both deep encoded linguistic features, which carry the language-related context of emotion, and audio spectrograms, which represent the acoustic cues. To facilitate the deep linguistic feature representation, we use outputs from the intermediate layers of a...
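The abstract is truncated here, so the paper's exact architecture is not recoverable from this page. As a rough illustration only, the sketch below shows generic multi-head scaled dot-product self-attention applied to per-frame acoustic and linguistic feature vectors that have been concatenated. Everything in it is an assumption made for illustration: the helper names (`matmul`, `softmax`, `random_matrix`, `multi_head_self_attention`), the feature sizes, the random untrained weights, the concatenation-based fusion, and the mean pooling. The convolutional layers mentioned in the abstract are omitted, and this should not be read as the authors' implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random matrix; stands in for learned weights or real features."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def multi_head_self_attention(x, num_heads, rng):
    """Apply multi-head self-attention to x (T frames, each of size d_model).

    Weights are random and untrained (assumption: illustration only).
    Returns a T x d_model matrix.
    """
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        wq = random_matrix(d_model, d_head, rng)
        wk = random_matrix(d_model, d_head, rng)
        wv = random_matrix(d_model, d_head, rng)
        q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
        k_t = [list(col) for col in zip(*k)]  # transpose K to d_head x T
        scores = matmul(q, k_t)               # T x T frame-to-frame similarity
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        heads.append(matmul(weights, v))      # T x d_head per head
    # Concatenate the heads along the feature axis, then project back to d_model.
    concat = [sum((h[t] for h in heads), []) for t in range(len(x))]
    wo = random_matrix(d_model, d_model, rng)
    return matmul(concat, wo)


rng = random.Random(0)
T, d_acoustic, d_linguistic = 5, 6, 2  # hypothetical sizes

# Stand-ins for spectrogram-derived and linguistic features, assumed time-aligned.
acoustic = random_matrix(T, d_acoustic, rng)
linguistic = random_matrix(T, d_linguistic, rng)

# Assumed fusion: concatenate the two feature vectors frame by frame.
fused = [a + l for a, l in zip(acoustic, linguistic)]

attended = multi_head_self_attention(fused, num_heads=4, rng=rng)

# Mean-pool over time to get one utterance-level vector for a classifier.
pooled = [sum(col) / T for col in zip(*attended)]
```

In this sketch each frame attends to every other frame, so the pooled vector mixes acoustic and linguistic information across the whole utterance. A trained system would learn the projection weights and feed the pooled vector to an emotion classifier.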

