Title Computational methods for next generation sequencing data analysis / edited by Ion Măndoiu, Alexander Zelikovsky
Published Hoboken, New Jersey : John Wiley & Sons, 2016


Description 1 online resource
Series Wiley series in bioinformatics: computational techniques and engineering
Contents List of Contributors
Preface
PART I COMPUTING AND EXPERIMENTAL INFRASTRUCTURE FOR NGS
1 Cloud Computing for NGS Data Analysis; Xuan Guo, Ning Yu, Bing Li, and Yi Pan 1.1 Introduction 1.2 Challenges for NGS data analysis 1.3 Background for cloud computing and its programming models 1.3.1 Overview of cloud computing 1.3.2 Cloud service providers 1.3.3 Programming models 1.4 Cloud computing services for NGS data analysis 1.4.1 Hardware as a Service (HaaS) 1.4.2 Platform as a Service (PaaS) 1.4.3 Software as a Service (SaaS) 1.4.4 Data as a Service (DaaS) 1.5 Conclusions and future directions References
2 Analysis of Environmental Sequence Information Using MetaPathways; Niels W. Hanson, Kishori M. Konwar, Shang-Ju Wu, and Steven J. Hallam 2.1 Introduction & Overview 2.2 Background 2.3 MetaPathways Processes 2.3.1 Open Reading Frame (ORF) Prediction 2.3.2 Functional Annotation 2.3.3 Analysis Modules 2.3.4 ePGDB construction 2.4 Big Data Processing 2.4.1 A Master-worker model for grid distribution 2.4.2 GUI and Data Integration 2.5 Downstream Analyses 2.5.1 Large Table Comparisons 2.5.2 Pathway Tools Cellular Overview 2.5.3 R Statistics 2.5.4 VennDiagram 2.5.5 Clustering and Relating Samples by Pathways 2.5.6 Faceting Variables with ggplot2 2.6 Conclusions References
3 Pooling Strategy for Massive Viral Sequencing; Pavel Skums, Alexander Artyomenko, Olga Glebova, Sumathi Ramachandran, David S. Campo, Zoya Dimitrova, Ion I. Măndoiu, Alex Zelikovsky and Yuri Khudyakov 3.1 Introduction 3.2 Design of Pools for Big Viral Data 3.2.1 Pool design optimization formulation 3.2.2 Greedy heuristic for VSPD problem 3.2.3 The tabu search heuristic for the OCBG problem 3.3 Deconvolution of viral samples from pools 3.3.1 Deconvolution using generalized intersections and differences of pools 3.3.2 Maximum likelihood k-clustering 3.4 Performance of pooling methods on simulated data 3.4.1 Performance of the viral sample pool design algorithm 3.4.2 Performance of the pool deconvolution algorithm 3.5 Experimental validation of pooling strategy 3.5.1 Experimental pools and sequencing 3.5.2 Results 3.6 Conclusion References
4 Applications of High-Fidelity Sequencing Protocol to RNA Viruses; Serghei Mangul, Nicholas C. Wu, Ekaterina Nenastyeva, Nicholas Mancuso, Alex Zelikovsky, Ren Sun and Eleazar Eskin 4.1 Introduction 4.2 High-fidelity sequencing protocol 4.3 Assembly of high-fidelity sequencing data 4.3.1 Consensus construction 4.3.2 Reads mapping 4.3.3 Viral Genome Assembler (VGA) 4.3.4 Viral population quantification 4.4 Performance of VGA on simulated data 4.5 Performance of existing viral assemblers on simulated consensus error-corrected reads 4.6 Performance of VGA on real HIV data 4.6.1 Validation of de novo consensus 4.7 Comparison of alignment on error-corrected reads 4.8 Evaluation of error correction tools based on high-fidelity sequencing reads 4.9 Acknowledgement References
PART II GENOMICS AND EPIGENOMICS
5 Scaffolding Algorithms; Igor Mandric, James Lindsay,
Chong Chu and Yufeng Wu 8.1 Background 8.2 Methods 8.2.1 Signatures of long indels in sequence reads 8.2.2 Methods for estimating long indels 8.2.3 Methods for finding long indels with exact breakpoints 8.2.4 Combined approaches 8.3 Applications 8.4 Conclusions and future directions 8.5 Acknowledgment References
9 NGS Data Analysis for Genome-Wide DNA Methylation Studies; Jeong-Hyeon Choi and Huidong Shi 9.1 Introduction 9.2 Enrichment-based approaches 9.2.1 Data analysis procedure 9.2.2 Available approaches 9.3 Bisulfite treatment-based approaches 9.3.1 Data analysis procedure 9.3.2 Available approaches 9.4 Conclusion References
10 Bisulfite-Conversion-Based Methods for DNA Methylation Sequencing Data Analysis; Elena Harris and Stefano Lonardi 10.1 Introduction 10.2 The problem of mapping BS-treated reads 10.3 Algorithmic approaches to the problem of mapping BS-treated reads 10.4 Methylation estimation 10.5 Possible biases in estimation of methylation level 10.6 Bisulfite conversion rate 10.7 Reduced representation bisulfite sequencing 10.8 Accuracy as a performance measurement References
PART III TRANSCRIPTOMICS
11 Computational Methods for Transcript Assembly from RNA-seq Reads; Stefan Canzar and Liliana Florea 11.1 Introduction 11.2 De novo assembly 11.2.1 Pre-processing of reads 11.2.2 The de Bruijn graph for RNA-seq read assembly 11.2.3 Contig assembly 11.2.4 Filtering and error correction 11.2.5 Variations 11.3 Genome-based assembly 11.3.1 Candidate isoforms 11.3.2 Minimality 11.3.3 Accuracy 11.3.4 Completeness 11.3.5 Extensions
Summary Introduces readers to core algorithmic techniques for next-generation sequencing (NGS) data analysis and discusses a wide range of computational techniques and applications. This book provides an in-depth survey of some of the recent developments in NGS and discusses mathematical and computational challenges in various application areas of NGS technologies. The 18 chapters featured in this book have been authored by bioinformatics experts and represent the latest work in leading labs actively contributing to the fast-growing field of NGS. The book is divided into four parts: Part I focuses on computing and experimental infrastructure for NGS analysis, including chapters on cloud computing, modular pipelines for metabolic pathway reconstruction, pooling strategies for massive viral sequencing, and high-fidelity sequencing protocols. Part II concentrates on analysis of DNA sequencing data, covering the classic scaffolding problem, detection of genomic variants, including insertions and deletions, and analysis of DNA methylation sequencing data. Part III is devoted to analysis of RNA-seq data. This part discusses algorithms and compares software tools for transcriptome assembly along with methods for detection of alternative splicing and tools for transcriptome quantification and differential expression analysis. Part IV explores computational tools for NGS applications in microbiomics, including a discussion on error correction of NGS reads from viral populations, methods for viral quasispecies reconstruction, and a survey of state-of-the-art methods and future trends in microbiome analysis.
Computational Methods for Next Generation Sequencing Data Analysis:
- Reviews computational techniques such as new combinatorial optimization methods, data structures, high performance computing, machine learning, and inference algorithms
- Discusses the mathematical and computational challenges in NGS technologies
- Covers NGS error correction, de novo genome and transcriptome assembly, variant detection from NGS reads, and more
This text is a reference for biomedical professionals interested in expanding their knowledge of computational techniques for NGS data analysis. The book is also useful for graduate and post-graduate students in bioinformatics.
Bibliography Includes bibliographical references and index
Notes Print version record and CIP data provided by publisher
Subject Nucleotide sequence -- Methodology
Nucleotide sequence -- Data processing
Base Sequence
MEDICAL -- Anatomy.
SCIENCE -- Life Sciences -- Human Anatomy & Physiology.
Form Electronic book
Author Măndoiu, Ion, editor
Zelikovsky, Alexander, editor
LC no. 2016014704
ISBN 9781119272168