Preface.- 1 Topic-focused Introduction to R and Data Sets Used.- 2 Distribution, Pre-analysis of Missing Values and Data Quality.- 3 Detection of the Missing Values Mechanism with Tests and Models.- 4 Visualisation of Missing Values.- 5 General Considerations on Univariate Methods, Single and Multiple Imputation.- 6 Deductive Imputation and Outlier Replacement.- 7 Imputation Without a Model.- 8 Model-based Methods.- 9 Non-linear Methods.- 10 Methods for Compositional Data.- 11 Evaluation of the Quality of Imputation.- 12 Simulation of Data for Simulation Studies.
Matthias Templ is a Professor at the Institute for Competitiveness and Communication at the University of Applied Sciences and Arts Northwestern Switzerland in Olten, and a lecturer at ETH Zurich and the Vienna University of Technology, where he was awarded the venia docendi (habilitation) in statistics. His main research interests include computational statistics, compositional data analysis, robust statistics, imputation of missing values and anonymization of data. He is the Editor-in-Chief of the Austrian Journal of Statistics and (co-)author of four books including Statistical Disclosure Control for Microdata and Applied Compositional Data Analysis. He is also an author and the maintainer of several R packages, such as the R package sdcMicro for statistical disclosure control, the package robCompositions for robust analysis of compositional data, the simPop package for simulation of synthetic data, and the VIM package for visualization and imputation of missing values.
This book explores visualization and imputation techniques for missing values and presents practical applications using the statistical software R. It explains the concepts behind common imputation methods, with a focus on visualization, the description of data problems and practical solutions in R, and it includes modern methods of robust imputation, imputation based on deep learning and imputation for complex data. By describing the advantages, disadvantages and pitfalls of each method, the book gives a clear picture of which imputation methods are applicable to a given data set.
The material covered includes the pre-analysis of data, visualization of missing values in incomplete data, single and multiple imputation, deductive imputation and outlier replacement, model-based methods (including methods based on robust estimates), non-linear methods such as tree-based and deep learning methods, and imputation of compositional data. It also covers the evaluation of imputation quality, from visual diagnostics to precision measures, coverage rates and prediction performance, and describes different model- and design-based simulation designs for this evaluation. The book features a topic-focused introduction to R, and R code is provided in each chapter to explain the practical application of the described methodology.
Addressed to researchers, practitioners and students who work with incomplete data, the book offers an introduction to the subject as well as a discussion of recent developments in the field. It is suitable for beginners to the topic and advanced readers alike.