E-book
Author Templ, Matthias, author.

Title Visualization and imputation of missing values : with applications in R / Matthias Templ
Published Cham, Switzerland : Springer, [2023]

Description 1 online resource (xxii, 462 pages) : illustrations (some color)
Series Statistics and computing
Contents Intro -- Preface -- Limitations -- Mathematical Notation -- Code Excerpt at GitHub -- Acknowledgement -- Reference -- Contents -- 1 Topic-Focused Introduction to R and Data sets Used -- 1.1 Resources and Software for the Imputation of Missing Values -- 1.1.1 Amelia -- 1.1.2 mi -- 1.1.3 mice (and BaBooN) -- 1.1.4 missMDA -- 1.1.5 missForest and missRanger -- 1.1.6 robCompositions -- 1.1.7 VIM -- 1.2 The Statistics Environment R -- 1.3 Simple Calculations in R -- 1.4 Installation of and Updates -- 1.5 Help -- 1.6 The R Workspace and the Working Directory -- 1.7 Data Types
1.8 Generic Functions, Methods, and Classes -- 1.9 A Note on Functions with the Same Name in Different Packages -- 1.10 Basic Data Manipulation with the dplyr Package -- 1.10.1 Pipes -- 1.10.2 dplyr-tibbles -- 1.10.3 dplyr-Selection of Rows -- 1.10.4 dplyr-Order -- 1.10.5 dplyr-Selection of Columns -- 1.10.6 dplyr-Uniqueness -- 1.10.7 dplyr-Creating Variables -- 1.10.8 dplyr-Grouping and Summary Statistics -- 1.10.9 dplyr-Window Functions -- 1.11 Data Manipulation with the data.table Package -- 1.11.1 data.table-Variable Construction -- 1.11.2 data.table-Indexing/Subsetting
1.11.3 data.table-Keys -- 1.11.4 data.table-Fast Subsetting -- 1.11.5 data.table-Calculations in Groups -- 1.12 Data Sets -- 1.12.1 Census Data from UCI -- 1.12.2 Airquality -- 1.12.3 Breast Cancer -- 1.12.4 Brittleness Index -- 1.12.5 Kola C-horizon Data -- 1.12.6 Colic Horse Data -- 1.12.7 New York Collision Data -- 1.12.8 Diabetes -- 1.12.9 Austrian EU-SILC Data -- 1.12.10 Food Consumption -- 1.12.11 Pulp Lignin -- 1.12.12 Structural Business Statistics Data -- 1.12.13 Mammal Sleep Data -- 1.12.14 West Pacific Tropical Atmosphere Ocean Data -- 1.12.15 Wine Tasting and Price of Wines
1.12.16 Further Data Sets -- References -- 2 Distribution, Pre-analysis of Missing Values and Data Quality -- 2.1 Introduction -- 2.2 How Does Missing Data Arise? -- 2.2.1 Surveys in Official Statistics and Surveys Obtained with Questionnaires -- 2.2.2 Comment on Structural Zeros and Non-applicable Questions in a Questionnaire -- 2.2.3 Missing Values from Measuring Experiments -- 2.2.4 Censored Values -- 2.2.5 Monotone Missingness -- 2.3 Missing Value Mechanisms -- 2.3.1 Missing at Random (MAR) -- 2.3.2 Missing Completely at Random (MCAR) -- 2.3.3 Missing Not at Random (MNAR) -- 2.3.4 Example
2.3.5 Summary on MCAR, MAR, and MNAR -- 2.4 Limitations for the Detection of the Missing Value Mechanisms -- 2.5 Kinds of Attributes -- 2.5.1 Binary and Nominal Variables and Related Distances -- 2.5.2 Ordered Categorical Variables -- 2.5.3 Count Variables, Continuous Variables, Semi-continuous Variables, and Related Distances -- 2.5.4 The Gower Distance -- 2.6 Data Quality and Consistency of Data -- 2.6.1 Outliers -- 2.6.1.1 Outliers in Relation with Other Data Problems -- 2.6.1.2 What Is an Outlier and When an Outlier Should Be Deleted and Imputed? -- 2.6.1.3 Univariate Methods
Summary This book explores visualization and imputation techniques for missing values and presents practical applications using the statistical software R. It explains the concepts of common imputation methods with a focus on visualization, description of data problems and practical solutions using R, including modern methods of robust imputation, imputation based on deep learning and imputation for complex data. By describing the advantages, disadvantages and pitfalls of each method, the book presents a clear picture of which imputation methods are applicable to a specific data set. The material covered includes the pre-analysis of data, visualization of missing values in incomplete data, single and multiple imputation, deductive imputation and outlier replacement, model-based methods including methods based on robust estimates, non-linear methods such as tree-based and deep learning methods, imputation of compositional data, imputation quality evaluation from visual diagnostics to precision measures, coverage rates and prediction performance, and a description of different model- and design-based simulation designs for the evaluation. The book also features a topic-focused introduction to R, and R code is provided in each chapter to explain the practical application of the described methodology. Addressed to researchers, practitioners and students who work with incomplete data, the book offers an introduction to the subject as well as a discussion of recent developments in the field. It is suitable for beginners to the topic and advanced readers alike.
Bibliography Includes bibliographical references and index
Notes Description based on online resource; title from digital title page (viewed on January 19, 2024)
Subject Information visualization -- Data processing
R (Computer program language)
Missing observations (Statistics) -- Data processing
Form Electronic book
ISBN 3031300734
9783031300738