A high-performance batched matrix multiplication framework for GPUs under unbalanced input distribution

Volume: 78, Issue: 2, Pages: 1741 - 1758
Published: Jun 21, 2021
Abstract
In the past few decades, general matrix multiplication (GEMM), the core component of the Basic Linear Algebra Subprograms (BLAS) library, has played a vital role in fields such as machine learning, image processing, and fluid dynamics. Because applications in these fields often decompose a problem into many smaller sub-problems, today's BLAS libraries provide batched GEMM routines to achieve high performance in this scenario....
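The abstract refers to the batched GEMM routines that modern BLAS libraries expose for many small, independent multiplications. As a point of reference only (this is not the paper's framework), the sketch below shows the fixed-size, strided-batched interface in cuBLAS; the matrix size, batch count, and fill values are illustrative assumptions. Note that this interface requires all matrices in the batch to share the same dimensions, which is exactly the uniformity the paper's "unbalanced input distribution" setting relaxes.

```c
/* Minimal sketch (not the paper's framework): one cuBLAS call that runs a
 * batch of small, uniformly sized GEMMs. Sizes and values are illustrative. */
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const int n = 16;          /* each matrix is n x n        */
    const int batch = 1024;    /* number of independent GEMMs */
    const size_t elems = (size_t)n * n * batch;

    /* Host batches: A filled with 1.0, B with 2.0; C receives the results. */
    float *hA = (float *)malloc(elems * sizeof(float));
    float *hB = (float *)malloc(elems * sizeof(float));
    float *hC = (float *)malloc(elems * sizeof(float));
    for (size_t i = 0; i < elems; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }

    float *dA, *dB, *dC;
    cudaMalloc(&dA, elems * sizeof(float));
    cudaMalloc(&dB, elems * sizeof(float));
    cudaMalloc(&dC, elems * sizeof(float));
    cudaMemcpy(dA, hA, elems * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, elems * sizeof(float), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    const float alpha = 1.0f, beta = 0.0f;
    /* One call launches all `batch` products C_i = alpha * A_i * B_i + beta * C_i;
     * consecutive matrices are n*n elements apart (the stride). */
    cublasSgemmStridedBatched(handle,
                              CUBLAS_OP_N, CUBLAS_OP_N,
                              n, n, n,
                              &alpha,
                              dA, n, (long long)n * n,
                              dB, n, (long long)n * n,
                              &beta,
                              dC, n, (long long)n * n,
                              batch);

    cudaMemcpy(hC, dC, elems * sizeof(float), cudaMemcpyDeviceToHost);
    printf("C[0](0,0) = %f (expected %d)\n", hC[0], 2 * n);

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
```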