Clairvoyant prefetching for distributed machine learning I/O
Published: Nov 13, 2021
Abstract
I/O is emerging as a major bottleneck for machine learning training, especially in distributed environments. Indeed, at large scale, I/O takes as much as 85% of training time. Addressing this I/O bottleneck necessitates careful optimization, as optimal data ingestion pipelines differ between systems, and require a delicate balance between access to local storage, external filesystems, and remote nodes. We introduce NoPFS, a machine learning I/O...