Description |
1 online resource |
Contents |
Intro; Preface; Acknowledgements; Contents; Abbreviations; Symbols; Our Research Theme: Discrimination of two classes (n Cases and p Variables) by Eight LDFs and QDF; Two Facts of Theory; Six Mathematical Programming (MP)-Based LDFs by LINGO; Two Methods; Important Statistics; LINGO Programs:; Discriminant Functions by JMP; Seven Problems and Four Facts; 1 New Theory of Discriminant Analysis and Cancer Gene Analysis; 1.1 Introduction; 1.2 Fundamental of Theory; 1.2.1 The Motivation of Our Research; 1.2.2 IP-OLDF Based on MNM Criterion and Two Facts; 1.2.3 Simple Example |
|
1.2.4 Ordinary LP Solution; 1.3 Five Serious Problems and Three Excuses; 1.3.1 Four Problems; 1.3.2 Problem5; 1.3.3 Three Excuses of Cancer Gene Analysis; 1.4 Four OLDFs and MNM Instead of NM; 1.4.1 Revised IP-OLDF and the Defects of Number of Misclassifications; 1.4.2 Revised LP-OLDF and Revised IPLP-OLDF; 1.4.3 Hard-Margin SVM (H-SVM); 1.4.4 Soft-Margin SVM (S-SVM); 1.4.5 Statisticians Claim for MP-Based LDFs; 1.5 Matryoshka Feature Selection Method (Method2) and RatioSV; 1.5.1 Method2; 1.5.2 RatioSV: Measurement of the Degree of Linear Separability; 1.5.3 Six Famous Microarrays |
|
1.5.4 How to Develop Method2 (a Surprising 54-Day Research Diary); 1.5.5 Results of Six Microarrays; 1.5.6 The Reason for Natural Feature Selection; 1.5.7 Two New Facts; 1.6 Validation of Method2 by Common Data; 1.6.1 Matryoshka Structure of Swiss Banknote Data; 1.6.2 Validation of LINGO Program3 Results; 1.6.3 Validation of Method2 by Japanese 44 Cars Data; 1.6.4 Examination of Duplicate Data; 1.7 Conclusion; References; 2 Overview of Cancer Gene Diagnosis; 2.1 Introduction; 2.2 Cancer Gene Diagnosis; 2.3 Analysis of 64 SMs Obtained by Alon's Microarray; 2.3.1 Analysis of 64 SMs |
|
2.3.2 Analysis of RipDS8 by Standard Statistical Methods; 2.4 Analysis of 64 RipDSs Data; 2.4.1 Examination of 64 RipDSs and RatioSV of RIP; 2.4.2 Ward Cluster Analysis of RipDSs New Data; 2.4.3 PCA Results of New Data; 2.5 The 130 BGSs of Alon's Microarray; 2.5.1 Results by Standard Statistical Methods; 2.5.2 Examination of RipDSs of 130 BGSs; 2.5.3 Examination of RipDSs New Data by PCA and Cluster Analysis; 2.5.4 Summary; 2.6 Other Five Microarrays; 2.6.1 Singh's Microarray; 2.6.2 Golub Microarray; 2.6.3 Tian's Microarray; 2.6.4 Chiaretti Microarray; 2.6.5 Shipp Microarray; 2.7 Conclusion |
Summary |
This book shows how to decompose high-dimensional microarrays into small subspaces (Small Matryoshkas, SMs), statistically analyze them, and perform cancer gene diagnosis. It is useful to genetics experts, anyone who analyzes genetic data, and students, who can use it as a practical textbook. Discriminant analysis is the best approach for microarrays consisting of normal and cancer classes. Microarrays are linearly separable data (LSD, Fact 3). However, because most linear discriminant functions (LDFs) cannot theoretically discriminate LSD and their error rates are high, no one had discovered Fact 3 until now. Hard-margin SVM (H-SVM) and Revised IP-OLDF (RIP) can find Fact 3 easily. LSD has the Matryoshka structure and is easily decomposed into many SMs (Fact 4). Because all SMs are small samples and are LSD, statistical methods can analyze them easily, yet they do not yield useful results. By contrast, H-SVM and RIP can completely discriminate the two classes in each SM. RatioSV is the ratio of the SV distance to the discriminant range. The maximum RatioSVs of the six microarrays are over 11.67%. This fact shows that the SVs separate the two classes by a window width (11.67%). Such an easy discrimination problem has remained unresolved since 1970. The reason is revealed by the facts presented here, so this book can be read and enjoyed like a mystery novel. Many studies point out that it is difficult to separate signal and noise in a high-dimensional gene space. However, the definition of the signal is not clear. Convincing evidence is presented that LSD is a signal. Statistical analysis of the genes contained in an SM cannot provide useful information, but it shows that the discriminant scores (DSs) obtained by RIP or H-SVM are readily LSD. For example, the Alon microarray has 2,000 genes, which can be divided into 66 SMs. If the 66 DSs are used as variables, the result is a 66-dimensional data set. These signal data can be analyzed by principal component analysis and cluster analysis to find malignancy indicators. |
Notes |
Online resource; title from PDF title page (EBSCO, viewed May 16, 2019) |
Subject |
Protein microarrays -- Statistical methods
|
|
SCIENCE -- Life Sciences -- Biochemistry.
|
Form |
Electronic book
|
ISBN |
9789811359972 |
|
9789811359989 |
|
9789811359996 |
|
9811359970 |
|
9811359989 |
|
9811359997 |
|