Making Sense of Noisy Data: Why and How?

By Grace Y. Yi et al.

Table of Contents

Statistical Science: Statistical Inference, Modeling, Data
Classification of Data
By collection method: by design, by observation
By size: "small" data, "big" data
By quality: "good" quality (complete and no error), "bad" quality (incomplete and error-prone)
Measurement Error Examples and Sources
Example 1 - Cost Concern Example (Case-Control Study, Carroll et al. 1993)
Example 2 - Protection of Privacy Example (Survey Data, Hwang 1986)
Example 3 - Reporting Error Example (Survey Data, Bollinger 1998)
Example 4 - Imaging Data Prostate Cancer (e.g., Ward et al. 2012)
Other Examples: measuring radiation dose; measuring exposure to arsenic in drinking water, dust in the workplace, radon gas in the home, and other environmental hazards
Some Sources of Measurement Error (Yi 2017)
Impact of Ignoring Measurement Error
General Classification
Research Monographs
Shameless Promotion
Summary and Take Home Messages
Noisy Data - Missing Value
Missing Data: Sources and Impact
Handling Missing Value and Measurement Error Separately - Comparisons from an Example
Ideal Longitudinal Data
Common Challenges
Accounting for Response Missingness Only
Accounting for Covariate Error Only
Comparisons
Accounting for Covariate Error Only
Handling Missing Value and Measurement Error Simultaneously - Two Examples
Causal Inference
Accounting for 2 Features
Summary and Take Home Messages
Concurrent Features: measurement error, missing values, high dimensionality
High Dimensionality and Irrelevant Measurements

Summary

This document examines the challenges of analyzing noisy data, focusing on measurement error and missing values. It presents examples of measurement error from case-control studies, surveys, and imaging data, describes the impact of ignoring that error, and compares handling missing values and measurement error separately with handling them simultaneously. It also covers high dimensionality and irrelevant measurements, which can occur alongside measurement error and missing values, and stresses the need for dimensionality reduction techniques. The central message is that understanding and correcting for measurement error is critical to reliable data analysis.
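The impact of ignoring measurement error can be illustrated with a small simulation. The sketch below is not taken from the original slides; it assumes a classical additive error model, W = X + U, in a simple linear regression, and every name and parameter value in it (ols_slope, simulate, beta, sd_u, and so on) is hypothetical and chosen only for illustration. Under these assumptions, regressing the response on the error-prone W instead of the true X shrinks the estimated slope toward zero by roughly the factor var(X) / (var(X) + var(U)).

```python
import random


def ols_slope(x, y):
    """Least-squares slope from a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=2.0, sd_x=1.0, sd_u=1.0, sd_e=0.5, seed=0):
    """Return (slope using true X, slope using error-prone W).

    Assumed (illustrative) model, not from the source:
      Y = beta * X + e,  e ~ N(0, sd_e^2)
      W = X + U,         U ~ N(0, sd_u^2), independent of X and e
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sd_x) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, sd_e) for xi in x]
    w = [xi + rng.gauss(0.0, sd_u) for xi in x]  # observed surrogate for X
    return ols_slope(x, y), ols_slope(w, y)


slope_true, slope_naive = simulate()

# Attenuation factor implied by the assumed variances (sd_x = sd_u = 1).
attenuation = 1.0 ** 2 / (1.0 ** 2 + 1.0 ** 2)

print(f"slope using true X:        {slope_true:.3f}")
print(f"slope using error-prone W: {slope_naive:.3f}")
print(f"beta * attenuation factor: {2.0 * attenuation:.3f}")
```

With equal variances for X and U, the naive slope should land near half of the true coefficient, while the slope based on the true X stays close to it. Correction methods aim to undo this kind of bias; this sketch only demonstrates the problem in the simplest setting.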