Original paper
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Published: Jun 1, 2022
Abstract
We present CSWin Transformer, an efficient and effective Transformer-based backbone for general-purpose vision tasks. A challenging issue in Transformer design is that global self-attention is very expensive to compute whereas local self-attention often limits the field of interactions of each token. To address this issue, we develop the Cross-Shaped Window self-attention mechanism for computing self-attention in the horizontal and vertical...
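To make the cost trade-off described in the abstract concrete, the following minimal sketch counts query-key pairs under three attention layouts on an h x w token grid. The function names, the grid size, the window size, and the stripe width are illustrative assumptions rather than values from the text above, and the stripe layout is only a simplified reading of attention computed along horizontal and vertical directions.

```python
def global_pairs(h: int, w: int) -> int:
    """Query-key pairs when every token attends to every other token."""
    n = h * w
    return n * n


def window_pairs(h: int, w: int, win: int) -> int:
    """Pairs when each token attends only inside its own win x win window."""
    if h % win or w % win:
        raise ValueError("grid must be divisible by the window size")
    return h * w * win * win


def stripe_pairs(h: int, w: int, sw: int) -> int:
    """Pairs when each token attends inside a horizontal stripe of sw rows
    that spans the full width (the vertical case is symmetric)."""
    if h % sw:
        raise ValueError("grid height must be divisible by the stripe width")
    return h * w * sw * w


if __name__ == "__main__":
    h = w = 56          # assumed token grid, for illustration only
    print("global:", global_pairs(h, w))
    print("window:", window_pairs(h, w, win=7))
    print("stripe:", stripe_pairs(h, w, sw=1))
```

Under these assumptions, the windowed and striped layouts both need far fewer pairs than global attention, while a stripe lets each token reach across the whole row, which a small square window does not.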