Data-centric AI Tutorial (KDD’23)

By Daochen Zha et al.

Table of Contents

Introduction and overview
What is data-centric AI (DCAI)? Why is it needed? Challenges? Overview of the methods?
Training data development
How to properly prepare the training data?
How to efficiently and effectively label data?
Inference data development
How to construct evaluation data to provide model insights?
How to engineer input data to unlock model capabilities?
Data maintenance & DCAI Benchmark
What efforts have been made or are in progress to support DCAI?
Data bias and fairness
Bias/fairness issues in data and the corresponding debiasing methods
DCAI in industry and summary
What are the challenges in industry? How have we addressed them? What remains to be done? What are the future directions?

Summary

The Data-centric AI Tutorial (KDD’23) examines data-centric AI (DCAI) and why engineering data matters for AI systems. It covers training data development, inference data development, data maintenance, bias and fairness, and the application of DCAI in industry. It emphasizes properly preparing, labeling, and maintaining training data, and constructing evaluation data that provides insight into model behavior. The tutorial also addresses data bias and fairness issues, the corresponding debiasing methods, and future directions for DCAI.