E-book
Author Healey, Christopher G. (Christopher Graham), 1967- author.

Title Disk-based algorithms for big data / Christopher G. Healey
Published Boca Raton : CRC Press, [2017]

Description 1 online resource (xx, 184 pages) : color illustrations
Contents Chapter 1. Physical disk storage -- Chapter 2. File management -- Chapter 3. Sorting -- Chapter 4. Searching -- Chapter 5. Disk-based sorting -- Chapter 6. Disk-based searching -- Chapter 7. Storage technology -- Chapter 8. Distributed hash tables -- Chapter 9. Large file systems -- Chapter 10. NoSQL storage
Machine generated contents note:
ch. 1 Physical Disk Storage -- 1.1. Physical Hard Disk -- 1.2. Clusters -- 1.2.1. Block Allocation -- 1.3. Access Cost -- 1.4. Logical to Physical -- 1.5. Buffer Management
ch. 2 File Management -- 2.1. Logical Components -- 2.1.1. Positioning Components -- 2.2. Identifying Records -- 2.2.1. Secondary Keys -- 2.3. Sequential Access -- 2.3.1. Improvements -- 2.4. Direct Access -- 2.4.1. Binary Search -- 2.5. File Management -- 2.5.1. Record Deletion -- 2.5.2. Fixed-Length Deletion -- 2.5.3. Variable-Length Deletion -- 2.6. File Indexing -- 2.6.1. Simple Indices -- 2.6.2. Index Management -- 2.6.3. Large Index Files -- 2.6.4. Secondary Key Index -- 2.6.5. Secondary Key Index Improvements
ch. 3 Sorting -- 3.1. Heapsort -- 3.2. Mergesort -- 3.3. Timsort
ch. 4 Searching -- 4.1. Linear Search -- 4.2. Binary Search -- 4.3. Binary Search Tree -- 4.4. k-d Tree -- 4.4.1. k-d Tree Index -- 4.4.2. Search -- 4.4.3. Performance -- 4.5. Hashing -- 4.5.1. Collisions -- 4.5.2. Hash Functions -- 4.5.3. Hash Value Distributions -- 4.5.4. Estimating Collisions -- 4.5.5. Managing Collisions -- 4.5.6. Progressive Overflow -- 4.5.7. Multirecord Buckets
ch. 5 Disk-Based Sorting -- 5.1. Disk-Based Mergesort -- 5.1.1. Basic Mergesort -- 5.1.2. Timing -- 5.1.3. Scalability -- 5.2. Increased Memory -- 5.3. More Hard Drives -- 5.4. Multistep Merge -- 5.5. Increased Run Lengths -- 5.5.1. Replacement Selection -- 5.5.2. Average Run Size -- 5.5.3. Cost -- 5.5.4. Dual Hard Drives
ch. 6 Disk-Based Searching -- 6.1. Improved Binary Search -- 6.1.1. Self-Correcting BSTs -- 6.1.2. Paged BSTs -- 6.2. B-Tree -- 6.2.1. Search -- 6.2.2. Insertion -- 6.2.3. Deletion -- 6.3. B* Tree -- 6.4. B+ Tree -- 6.4.1. Prefix Keys -- 6.5. Extendible Hashing -- 6.5.1. Trie -- 6.5.2. Radix Tree -- 6.6. Hash Tries -- 6.6.1. Trie Insertion -- 6.6.2. Bucket Insertion -- 6.6.3. Full Trie -- 6.6.4. Trie Size -- 6.6.5. Trie Deletion -- 6.6.6. Trie Performance
ch. 7 Storage Technology -- 7.1. Optical Drives -- 7.1.1. Compact Disc -- 7.1.2. Digital Versatile Disc -- 7.1.3. Blu-ray Disc -- 7.2. Solid State Drives -- 7.2.1. Floating Gate Transistors -- 7.2.2. Read-Write-Erase -- 7.2.3. SSD Controller -- 7.2.4. Advantages -- 7.3. Holographic Storage -- 7.3.1. Holograms -- 7.3.2. Data Holograms -- 7.3.3. Commercialization -- 7.4. Molecular Memory -- 7.5. MRAM
ch. 8 Distributed Hash Tables -- 8.1. History -- 8.2. Keyspace -- 8.3. Keyspace Partitioning -- 8.4. Overlay Network -- 8.5. Chord -- 8.5.1. Keyspace -- 8.5.2. Keyspace Partitioning -- 8.5.3. Overlay Network -- 8.5.4. Addition -- 8.5.5. Failure
ch. 9 Large File Systems -- 9.1. RAID -- 9.1.1. Parity -- 9.2. ZFS -- 9.2.1. Fault Tolerance -- 9.2.2. Self-Healing -- 9.2.3. Snapshots -- 9.3. GFS -- 9.3.1. Architecture -- 9.3.2. Master Metadata -- 9.3.3. Mutations -- 9.3.4. Fault Tolerance -- 9.4. Hadoop -- 9.4.1. MapReduce -- 9.4.2. MapReduce Implementation -- 9.4.3. HDFS -- 9.4.4. Pig -- 9.4.5. Hive -- 9.5. Cassandra -- 9.5.1. Design -- 9.5.2. Improvements -- 9.5.3. Query Language -- 9.6. Presto
ch. 10 NoSQL Storage -- 10.1. Graph Databases -- 10.1.1. Neo4j -- 10.1.2. Caching -- 10.1.3. Query Languages -- 10.2. Document Databases -- 10.2.1. SQL Versus NoSQL -- 10.2.2. MongoDB -- 10.2.3. Indexing -- 10.2.4. Query Languages
Appendix A Order Notation -- A.1. Θ-Notation -- A.2. O-Notation -- A.3. Ω-Notation -- A.4. Insertion Sort -- A.5. Shell Sort
Appendix B Assignment 1: Search -- B.1. Key and Seek Lists -- B.2. Program Execution -- B.3. In-Memory Sequential Search -- B.4. In-Memory Binary Search -- B.5. On-Disk Sequential Search -- B.6. On-Disk Binary Search -- B.7. Programming Environment -- B.7.1. Reading Binary Integers -- B.7.2. Measuring Time -- B.7.3. Writing Results -- B.8. Supplemental Material -- B.9. Hand-In Requirements
Appendix C Assignment 2: Indices -- C.1. Student File -- C.2. Program Execution -- C.3. In-Memory Primary Key Index -- C.4. In-Memory Availability List -- C.4.1. First Fit -- C.4.2. Best Fit -- C.4.3. Worst Fit -- C.5. User Interface -- C.5.1. Add -- C.5.2. Find -- C.5.3. Del -- C.5.4. End -- C.6. Programming Environment -- C.6.1. Writing Results -- C.7. Supplemental Material -- C.8. Hand-In Requirements
Appendix D Assignment 3: Mergesort -- D.1. Index File -- D.2. Program Execution -- D.3. Available Memory -- D.4. Basic Mergesort -- D.5. Multistep Mergesort -- D.6. Replacement Selection Mergesort -- D.7. Programming Environment -- D.7.1. Measuring Time -- D.7.2. Writing Results -- D.8. Supplemental Material -- D.9. Hand-In Requirements
Appendix E Assignment 4: B-Trees -- E.1. Index File -- E.2. Program Execution -- E.3. B-Tree Nodes -- E.3.1. Root Node Offset -- E.4. User Interface -- E.4.1. Add -- E.4.2. Find -- E.4.3. Print -- E.4.4. End -- E.5. Programming Environment -- E.6. Supplemental Material -- E.7. Hand-In Requirements
Summary Designed for senior undergraduate and graduate students, as well as professionals, this book provides a foundational discussion of physical storage devices and explains how algorithm performance is affected by the underlying storage system. -- Edited summary from book
Notes "A Chapman & Hall book."
Bibliography Includes bibliographical references and index
Notes Print version record
Subject Big data.
Disk access (Computer science)
COMPUTERS -- Data Processing.
Form Electronic book
LC no. 2016029517
ISBN 9781315302867
1315302861
9781315302874
131530287X