The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems

Volume: 108, Pages: 495 - 504
Published: Jan 1, 2017
Abstract
A current trend in high-performance computing is to decompose a large linear algebra problem into batches containing thousands of smaller problems that can be solved independently, and then to collate the results. To standardize the interface to these routines, the community is developing an extension to the BLAS standard (the batched BLAS), enabling users to perform thousands of small BLAS operations in parallel whilst making efficient use of...