E-book
Author Stram, Daniel O., author

Title Design, analysis, and interpretation of genome-wide association scans / Daniel O. Stram
Published New York, NY : Springer, 2014

Description 1 online resource (xv, 334 pages) : illustrations
Series Statistics for Biology and Health, 1431-8776
Contents Introduction to Genome-Wide Association Studies -- Topics of Quantitative Genetics -- An Introduction to Association Studies -- Correcting for Hidden Population Structure in Single Marker Association Testing and Estimation -- Haplotype Imputation for Association Analysis -- SNP Imputation for Association Studies -- Design of Large-scale Genetic Association Studies, Sample Size and Power -- Post-GWAS Analyses
Machine generated contents note: 1. Introduction -- 1.1. Historical Perspective -- 1.2. DNA Basics -- 1.2.1. Organization of Chromosomes -- 1.2.2. Organization of DNA -- 1.2.3. DNA and Protein -- 1.3. Types of Genetic Variation -- 1.3.1. Single-Nucleotide Variants and Polymorphisms -- 1.3.2. Insertions/Deletions -- 1.3.3. Larger Structural Variants -- 1.3.4. Exonic Variation and Disease -- 1.3.5. Non-exonic SNPs and Disease -- 1.3.6. SNP Haplotypes -- 1.3.7. Microsatellites -- 1.3.8. Mitochondrial Variation -- 1.4. Overview of Genotyping Methods -- 1.4.1. SNP Calling -- 1.5. Overview of GWAS Genotype Arrays -- 1.6. Software and Data Resources -- 1.7. Web Resources -- 1.7.1. Basic Genomics -- 1.7.2. GWAS Associations -- 1.7.3. Annotation -- 1.8. Hardware and Operating Systems -- 1.9. Data Example -- 1.9.1. Save Your Work -- References --
2. Topics in Quantitative Genetics -- 2.1. Distribution of a Single Diallelic Variant in a Randomly Mixing Population -- 2.1.1. Hardy–Weinberg Equilibrium -- 2.1.2. Random Samples of Unrelated Individuals -- 2.1.3. Joint Distribution Between Relatives of Allele Counts for a Single SNP -- 2.1.4. Coefficients of Kinship and of Inbreeding -- 2.2. Relationship Between Identity by State and Identity by Descent for a Single Diallelic Marker -- 2.3. Estimating IBD Probabilities from Genotype Data -- 2.4. Covariance Matrix for a Single Allele in Nonrandomly Mixing Populations -- 2.4.1. Hidden Structure and Correlation -- 2.4.2. Effects of Incomplete Admixture on the Covariance Matrix of a Single Variant -- 2.5. Direct Estimation of Differentiation Parameter F from Genotype Data -- 2.5.1. Relatedness Revisited -- 2.5.2. Estimation of Allele Frequencies -- 2.6. Allele Frequency Distributions -- 2.6.1. Initial Mutations and Common Ancestors -- 2.6.2. Mutations and the Coalescent -- 2.6.3. Allelic Distribution of Genetic Variants -- 2.6.4. Allele Distributions Under Population Increase and Selection -- 2.7. Recombination and Linkage Disequilibrium -- 2.7.1. Quantification of Recombination -- 2.7.2. Phased Versus Unphased Data and LD Estimation -- 2.7.3. Hidden Population Structure -- 2.7.4. Pseudo-LD Induced by Hidden Structure and Relatedness -- 2.8. Covering the Genome for Common Alleles -- 2.8.1. High-Throughput Sequencing -- 2.9. Principal Components Analysis -- 2.9.1. Display of Principal Components for the HapMap Phase 3 Samples -- 2.10. Chapter Summary -- Data and Software Exercises -- References --
3. Introduction to Association Analysis -- 3.1. Single Marker Associations -- 3.1.1. Dominant, Recessive, and Co-dominant Effects -- 3.2. Regression Analysis and Generalized Linear Models in Genetic Analysis -- 3.3. Tests of Hypotheses for Genotype Data Using Generalized Linear Models -- 3.3.1. Test of Hypothesis regarding Genotype Effects Testing Using Logistic Regression in Case–Control Analysis -- 3.3.2. Interpreting Regression Equation Coefficients -- 3.4. Summary of Maximum Likelihood Estimation, Wald Tests, Likelihood Ratio Tests, Score Tests, and Sufficient Statistics -- 3.4.1. Properties of Log Likelihood Functions -- 3.4.2. Score Tests -- 3.4.3. Likelihood Ratio Tests -- 3.4.4. Wald Tests -- 3.4.5. Fisher's Scoring Procedure for Finding the MLE -- 3.4.6. Scores and Information for Normal and Binary Regression -- 3.4.7. Score Tests of β = 0 for Linear and Logistic Models -- 3.4.8. Matrix Formulae for Estimators in OLS Regression -- 3.5. Covariates, Interactions, and Confounding -- 3.6. Conditional Logistic Regression -- 3.6.1. Breaking the Matching in Logistic Regression of Matched Data -- 3.6.2. Parent Affected-Offspring Design -- 3.7. Case-Only Analyses -- 3.7.1. Case-Only Analyses of Disease Subtype -- 3.7.2. Case-Only Analysis of Gene × Environment and Gene × Gene Interactions -- 3.8. Non-independent Phenotypes -- 3.8.1. OLS Estimation When Phenotypes Are Correlated -- 3.9. Needs of a GWAS Analysis -- 3.9.1. Hardware Requirements for GWAS -- 3.9.2. Software Solutions -- 3.10. Multiple Comparisons Problem -- 3.11. Behavior of the Bonferroni Correction with Non-Independent Tests -- 3.12. Reliability of Small p-Values -- 3.12.1. Test of a Single Binomial Proportion -- 3.12.2. Test of a Difference in Binomial Proportions -- 3.13. Chapter Summary -- Appendix -- References --
4. Correcting for Hidden Population Structure in Single Marker Association Testing and Estimation -- 4.1. Effects of Hidden Population Structure on the Behavior of Statistical Tests for Association -- 4.1.1. Effects on Inference Induced by Correlated Phenotypes -- 4.1.2. Influences of Latent Variables -- 4.1.3. Hidden Structure as a Latent Variable -- 4.1.4. Polygenes, Latent Structure, Hidden Relatedness, and Confounding -- 4.1.5. Hidden Non-mixing Strata -- 4.1.6. Admixture -- 4.1.7. Polygenes and Cryptic Relatedness -- 4.2. Correcting for the Effects of Hidden Structure and Relatedness -- 4.2.1. Genomic Control -- 4.2.2. Regression-based Adjustment for Leading Principal Components -- 4.2.3. Implementation of Principal Components Adjustment Methods -- 4.2.4. Random Effects Models -- 4.2.5. Retrospective Methods -- 4.3. Comparison of Correction Methods by Simulation -- 4.3.1. Comparison of the Mixed Model and Retrospective Approach for Binary (case–control) Outcomes -- 4.3.2. Conclusions -- 4.4. Behavior of the Genomic Control Parameter as Sample Size increases -- 4.5. Removing Related Individuals as Part of Quality Control, Is It Needed? -- 4.6. Chapter Summary -- Data and Software Exercises -- References --
5. Haplotype Imputation for Association Analysis -- 5.1. Role of Haplotypes in Association Testing -- 5.2. Haplotypes, LD Blocks, and Haplotype Uncertainty -- 5.3. Haplotype Frequency Estimation and Imputation -- 5.3.1. Small Numbers of SNPs -- 5.3.2. Haplotype Uncertainty -- 5.4. Haplotype Frequency Estimation for Larger Numbers of SNPs -- 5.4.1. Partition-Ligation EM Algorithm -- 5.4.2. Phasing Large Numbers of SNPs -- 5.5. Regression Analysis Using Haplotypes as Explanatory Variables -- 5.5.1. Expectation Substitution -- 5.5.2. Fitting Dominant, Recessive, or Two Degrees of Freedom Models for the Effect of Haplotypes -- 5.6. Dealing with Uncertainty in Haplotype Estimation in Association Testing -- 5.6.1. Full Likelihood Estimation of Risk Parameters and Haplotype Frequencies -- 5.6.2. Ascertainment in Case–Control Studies -- 5.6.3. Example: Expectation-Substitution Method -- 5.7. Haplotype Analysis Genome-Wide -- 5.7.1. Studies of Homogeneous Non-admixed Populations -- 5.7.2. Four-Gamete Rule for Fast Block Definition -- 5.7.3. Multiple Comparisons in Haplotype Analysis -- 5.8. Multiple Populations -- 5.9. Chapter Summary -- References --
6. SNP Imputation for Association Studies -- 6.1. Role of Imputed SNPs in Association Testing -- 6.2. EM Algorithm and SNP Imputation -- 6.3. Phasing Large Numbers of SNPs for the Reference Panel -- 6.4. Brief Introduction to Hidden Markov Models -- 6.4.1. Baum–Welch Algorithm -- 6.5. Large-Scale Imputation Using HMMs -- 6.6. Using an HMM to Impute Missing Genotype Data when Both the Reference Panel and Study Genotypes Are Phased -- 6.7. Using an HMM to Phase Reference or Main Study Genotypes -- 6.7.1. Initializing and Updating the Current List of Haplotypes -- 6.8. Practical Issues in Large-Scale SNP Imputation -- 6.8.1. Assessing Imputation Accuracy -- 6.8.2. Imputing Rare SNPs -- 6.8.3. Use of Cosmopolitan Reference Panels -- 6.9. Estimating Relative Risks for Imputed SNPs -- 6.9.1. Expectation Substitution -- 6.10. Chapter Summary -- 6.10.1. Links -- References --
7. Design of Large-Scale Genetic Association Studies, Sample Size, and Power
Summary This book presents the statistical aspects of designing, analyzing, and interpreting the results of genome-wide association scans (GWAS) for genetic causes of disease using unrelated subjects. Particular detail is given to the practical aspects of employing the bioinformatics and data handling methods necessary to prepare data for statistical analysis. The goal in writing this book is to give statisticians, epidemiologists, and students in these fields the tools to design a powerful genome-wide study based on current technology, and to show readers how to analyze the resulting study. The book provides a compendium of well-established statistical methods based upon single SNP associations. It also provides an introduction to more advanced statistical methods and issues. Because technology such as large-scale SNP arrays is changing quickly, the text also offers lessons for future use with sequencing data. Its emphasis on statistical concepts that apply to the problem of finding disease associations irrespective of the technology ensures its future applicability. The author includes current bioinformatics tools while outlining the additional tools, issues, and needs that will arise from the extensive databases of future large-scale sequencing projects.
Bibliography Includes bibliographical references and index
Notes Online resource; title from PDF title page (SpringerLink, viewed November 25, 2013)
Subject Human population genetics -- Statistical methods
Bioinformatics.
Population genetics.
Human beings.
Genetics, Population
Humans
Computational Biology
Homo sapiens (species)
MATHEMATICS -- Applied.
MATHEMATICS -- Probability & Statistics -- General.
Form Electronic book
ISBN 9781461494430
1461494435
1461494427
9781461494423