A coordinated tiling and batching framework for efficient GEMM on GPUs

Published: Feb 16, 2019
Abstract
General matrix multiplication (GEMM) plays a paramount role in a broad range of domains such as deep learning, scientific computing, and image processing. The primary optimization method is to partition the matrix into many tiles and exploit the parallelism within and between tiles. The tiling hierarchy closely mirrors the thread hierarchy on GPUs. In practice, GPUs can fully unleash their computing power only when the matrix size is large and...
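To make the tiling idea in the abstract concrete, here is a minimal CPU-side sketch in plain Python. It is not the paper's framework: the function name `tiled_gemm` and the `tile` parameter are hypothetical, chosen only for illustration. It assumes square tiles and shows how the output matrix can be partitioned so that each output tile is computed independently (parallelism between tiles), while the inner loops cover the work inside a tile (parallelism within a tile).

```python
def tiled_gemm(a, b, tile=2):
    """Compute C = A x B tile by tile (illustrative sketch, not a GPU kernel)."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of A and B must match")
    c = [[0.0] * n for _ in range(m)]
    # Each (i0, j0) pair selects one output tile. On a GPU, such a tile
    # would typically be assigned to one thread block.
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            # Walk the shared K dimension one tile at a time.
            for k0 in range(0, k, tile):
                # Work inside the tile. On a GPU, the threads of a block
                # would share these element-wise updates.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c
```

With a small matrix, the number of output tiles is small, which hints at why a single small GEMM may leave much of a GPU idle: for a 2x2 output and `tile=2` there is only one output tile to schedule.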