E-book
Author Wan, Shibiao, author

Title Machine learning for protein subcellular localization prediction / Shibiao Wan, Man-Wai Mak
Published Berlin, Germany ; Boston, Massachusetts : De Gruyter, 2015
©2015

Description 1 online resource (210 pages)
Contents Preface -- Contents -- List of Abbreviations -- 1 Introduction -- 1.1 Proteins and their subcellular locations -- 1.2 Why computationally predict protein subcellular localization? -- 1.2.1 Significance of the subcellular localization of proteins -- 1.2.2 Conventional wet-lab techniques -- 1.2.3 Computational prediction of protein subcellular localization -- 1.3 Organization of this book -- 2 Overview of subcellular localization prediction -- 2.1 Sequence-based methods -- 2.1.1 Composition-based methods -- 2.1.2 Sorting signal-based methods -- 2.1.3 Homology-based methods -- 2.2 Knowledge-based methods -- 2.2.1 GO-term extraction -- 2.2.2 GO-vector construction -- 2.3 Limitations of existing methods -- 2.3.1 Limitations of sequence-based methods -- 2.3.2 Limitations of knowledge-based methods -- 3 Legitimacy of using gene ontology information -- 3.1 Direct table lookup? -- 3.1.1 Table lookup procedure for single-label prediction -- 3.1.2 Table-lookup procedure for multi-label prediction -- 3.1.3 Problems of table lookup -- 3.2 Using only cellular component GO terms? -- 3.3 Equivalent to homologous transfer? -- 3.4 More reasons for using GO information -- 4 Single-location protein subcellular localization -- 4.1 Extracting GO from the Gene Ontology Annotation Database -- 4.1.1 Gene Ontology Annotation Database -- 4.1.2 Retrieval of GO terms -- 4.1.3 Construction of GO vectors -- 4.1.4 Multiclass SVM classification -- 4.2 FusionSVM: Fusion of gene ontology and homology-based features -- 4.2.1 InterProGOSVM: Extracting GO from InterProScan -- 4.2.2 PairProSVM: A homology-based method -- 4.2.3 Fusion of InterProGOSVM and PairProSVM -- 4.3 Summary -- 5 From single- to multi-location -- 5.1 Significance of multi-location proteins -- 5.2 Multi-label classification -- 5.2.1 Algorithm-adaptation methods
5.2.2 Problem transformation methods -- 5.2.3 Multi-label classification in bioinformatics -- 5.3 mGOASVM: A predictor for both single- and multi-location proteins -- 5.3.1 Feature extraction -- 5.3.2 Multi-label multiclass SVM classification -- 5.4 AD-SVM: An adaptive decision multi-label predictor -- 5.4.1 Multi-label SVM scoring -- 5.4.2 Adaptive decision for SVM (AD-SVM) -- 5.4.3 Analysis of AD-SVM -- 5.5 mPLR-Loc: A multi-label predictor based on penalized logistic regression -- 5.5.1 Single-label penalized logistic regression -- 5.5.2 Multi-label penalized logistic regression -- 5.5.3 Adaptive decision for LR (mPLR-Loc) -- 5.6 Summary -- 6 Mining deeper on GO for protein subcellular localization -- 6.1 Related work -- 6.2 SS-Loc: Using semantic similarity over GO -- 6.2.1 Semantic similarity measures -- 6.2.2 SS vector construction -- 6.3 HybridGO-Loc: Hybridizing GO frequency and semantic similarity features -- 6.3.1 Hybridization of two GO features -- 6.3.2 Multi-label multiclass SVM classification -- 6.4 Summary -- 7 Ensemble random projection for large-scale predictions -- 7.1 Random projection -- 7.2 RP-SVM: A multi-label classifier with ensemble random projection -- 7.2.1 Ensemble multi-label classifier -- 7.2.2 Multi-label classification -- 7.3 R3P-Loc: A compact predictor based on ridge regression and ensemble random projection -- 7.3.1 Limitation of using current databases -- 7.3.2 Creating compact databases -- 7.3.3 Single-label ridge regression -- 7.3.4 Multi-label ridge regression -- 7.4 Summary -- 8 Experimental setup -- 8.1 Prediction of single-label proteins -- 8.1.1 Datasets construction -- 8.1.2 Performance metrics -- 8.2 Prediction of multi-label proteins -- 8.2.1 Dataset construction -- 8.2.2 Datasets analysis -- 8.2.3 Performance metrics -- 8.3 Statistical evaluation methods -- 8.4 Summary
9 Results and analysis -- 9.1 Performance of GOASVM -- 9.1.1 Comparing GO vector construction methods -- 9.1.2 Performance of successive-search strategy -- 9.1.3 Comparing with methods based on other features -- 9.1.4 Comparing with state-of-the-art GO methods -- 9.1.5 GOASVM using old GOA databases -- 9.2 Performance of FusionSVM -- 9.2.1 Comparing GO vector construction and normalization methods -- 9.2.2 Performance of PairProSVM -- 9.2.3 Performance of FusionSVM -- 9.2.4 Effect of the fusion weights on the performance of FusionSVM -- 9.3 Performance of mGOASVM -- 9.3.1 Kernel selection and optimization -- 9.3.2 Term-frequency for mGOASVM -- 9.3.3 Multi-label properties for mGOASVM -- 9.3.4 Further analysis of mGOASVM -- 9.3.5 Comparing prediction results of novel proteins -- 9.4 Performance of AD-SVM -- 9.5 Performance of mPLR-Loc -- 9.5.1 Effect of adaptive decisions on mPLR-Loc -- 9.5.2 Effect of regularization on mPLR-Loc -- 9.6 Performance of HybridGO-Loc -- 9.6.1 Comparing different features -- 9.7 Performance of RP-SVM -- 9.7.1 Performance of ensemble random projection -- 9.7.2 Comparison with other dimension-reduction methods -- 9.7.3 Performance of single random-projection -- 9.7.4 Effect of dimensions and ensemble size -- 9.8 Performance of R3P-Loc -- 9.8.1 Performance on the compact databases -- 9.8.2 Effect of dimensions and ensemble size -- 9.8.3 Performance of ensemble random projection -- 9.9 Comprehensive comparison of proposed predictors -- 9.9.1 Comparison of benchmark datasets -- 9.9.2 Comparison of novel datasets -- 9.10 Summary -- 10 Properties of the proposed predictors -- 10.1 Noise data in the GOA Database -- 10.2 Analysis of single-label predictors -- 10.2.1 GOASVM vs FusionSVM -- 10.2.2 Can GOASVM be combined with PairProSVM? -- 10.3 Advantages of mGOASVM -- 10.3.1 GO-vector construction
10.3.2 GO subspace selection -- 10.3.3 Capability of handling multi-label problems -- 10.4 Analysis for HybridGO-Loc -- 10.4.1 Semantic similarity measures -- 10.4.2 GO-frequency features vs SS features -- 10.4.3 Bias analysis -- 10.5 Analysis for RP-SVM -- 10.5.1 Legitimacy of using RP -- 10.5.2 Ensemble random projection for robust performance -- 10.6 Comparing the proposed multi-label predictors -- 10.7 Summary -- 11 Conclusions and future directions -- 11.1 Conclusions -- 11.2 Future directions -- A Webservers for protein subcellular localization -- A.1 GOASVM webserver -- A.2 mGOASVM webserver -- A.3 HybridGO-Loc webserver -- A.4 mPLR-Loc webserver -- B Support vector machines -- B.1 Binary SVM classification -- B.2 One-vs-rest SVM classification -- C Proof of no bias in LOOCV -- D Derivatives for penalized logistic regression -- Bibliography -- Index
Summary For bioinformaticians, computational biologists, and wet-lab biologists, the authors present recent machine learning approaches to protein subcellular localization prediction, together with a systematic scheme for improving predictor performance.
Analysis Bioinformatics
Computer Science
Proteomics
Bibliography Includes bibliographical references and index
Notes English
Print version record
Subject Proteins -- Physiological transport -- Data processing
Machine learning.
Probabilities -- Data processing
Carrier proteins.
Artificial intelligence.
Probabilities.
Carrier Proteins
Artificial Intelligence
Probability
Machine Learning
artificial intelligence.
probability.
Technology & Engineering -- Signals & Signal Processing.
Form Electronic book
Author Mak, M. W., author
ISBN 9781501501500
150150150X
1501501526
9781501501524