Locality-Aware CTA Clustering for Modern GPUs

Ang Li; Shuaiwen Leon Song; Weifeng Liu; Xu Liu; Akash Kumar; Henk Corporaal

DOI: https://doi.org/10.1145/3037697.3037709

Published: Apr 4, 2017

Abstract

Cache is designed to exploit locality; however, the role of on-chip L1 data caches on modern GPUs is often awkward. The locality among global memory requests from different SMs (Streaming Multiprocessors) is predominantly harvested by the commonly-shared L2 with long access latency; while the in-core locality, which is crucial for performance delivery, is handled explicitly by user-controlled scratchpad memory. In this work, we disclose another...
