Table of Contents
1 Introduction
2 Background and Related Works
3 Design
Summary
Recent advances in deep learning models come at the price of formidable training cost. DeepSpeed Data Efficiency is a framework that makes better use of training data in order to increase training efficiency and improve model quality. It combines two data efficiency techniques: efficient data sampling through a general curriculum learning library, and efficient data routing through a novel random layerwise token dropping technique. The framework achieves significant savings in data, training time, and cost while maintaining high model quality. DeepSpeed Data Efficiency is easy to use and tune, which allows it to be applied to a variety of tasks.
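The two techniques named above can be illustrated with a small, framework-free sketch. This is not DeepSpeed's API: every function name, the linear pacing schedule, and the use of sentence length as a difficulty metric are assumptions chosen for illustration. The first part samples only from examples that are currently "easy enough" (curriculum learning); the second applies a layer to a random subset of token positions and lets the remaining tokens bypass it (the general idea behind random layerwise token dropping).

```python
import random


def pacing_threshold(step, total_steps, d_min, d_max):
    """Hypothetical linear pacing: maximum difficulty allowed at a given step."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    if frac >= 1.0:
        return d_max
    return d_min + frac * (d_max - d_min)


def curriculum_sample(examples, difficulty, step, total_steps, batch_size, rng):
    """Sample a batch from the examples whose difficulty is within the threshold."""
    scores = [difficulty(e) for e in examples]
    limit = pacing_threshold(step, total_steps, min(scores), max(scores))
    eligible = [e for e, s in zip(examples, scores) if s <= limit]
    return [rng.choice(eligible) for _ in range(batch_size)]


def layer_with_token_dropping(tokens, layer_fn, keep, rng):
    """Apply layer_fn to `keep` random token positions; the rest skip the layer."""
    kept = sorted(rng.sample(range(len(tokens)), min(keep, len(tokens))))
    processed = layer_fn([tokens[i] for i in kept])
    out = list(tokens)  # dropped tokens are carried through unchanged
    for i, value in zip(kept, processed):
        out[i] = value
    return out


rng = random.Random(0)

# Curriculum sampling: difficulty is assumed to be the word count.
sentences = ["a b", "a b c d", "a b c d e f g h"]
word_count = lambda s: len(s.split())
early = curriculum_sample(sentences, word_count, step=0, total_steps=100,
                          batch_size=4, rng=rng)
late = curriculum_sample(sentences, word_count, step=100, total_steps=100,
                         batch_size=4, rng=rng)

# Token dropping: a toy "layer" that scales its inputs by 10.
hidden = [1.0, 2.0, 3.0, 4.0]
out = layer_with_token_dropping(hidden, lambda xs: [x * 10 for x in xs],
                                keep=2, rng=rng)
```

At step 0 only the shortest sentence is eligible, while at the final step all sentences are. In the token dropping call, exactly two of the four positions pass through the toy layer and the other two are copied unchanged, so the layer does roughly half the work. A real implementation would operate on batched tensors and would choose which layers and how many tokens to drop according to a schedule.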