Original paper
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Pages: 646 - 650
Published: Apr 27, 2022
Abstract
Audio classification is an important task of mapping audio samples into their corresponding labels. Recently, the transformer model with self-attention mechanisms has been adopted in this field. However, existing audio transformers require large amounts of GPU memory and long training times, while also relying on pretrained vision models to achieve high performance, which limits the model's scalability in audio tasks. To combat these problems, we introduce...