E-book
Author Chambers, Bill (William Andrew), author

Title Spark : the definitive guide : big data processing made simple / Bill Chambers and Matei Zaharia
Edition First edition
Published Sebastopol, CA : O'Reilly Media, [2018]
©2018
Online access available from:
Safari O'Reilly books online

Description 1 online resource (xxvi, 576 pages) : illustrations
Contents Part 1. Gentle overview of big data and Spark. What is Apache Spark? -- A gentle introduction to Spark -- A tour of Spark's toolset -- Part 2. Structured APIs : DataFrames, SQL, and datasets. Structured API overview -- Basic structured operations -- Working with different types of data -- Aggregations -- Joins -- Data sources -- Spark SQL -- Datasets -- Part 3. Low-level APIs. Resilient distributed datasets (RDDs) -- Advanced RDDs -- Distributed shared variables -- Part 4. Production applications. How Spark runs on a cluster -- Developing Spark applications -- Deploying Spark -- Monitoring and debugging -- Performance tuning -- Part 5. Streaming. Stream processing fundamentals -- Structured streaming basics -- Event-time and stateful processing -- Structured streaming in production -- Part 6. Advanced analytics and machine learning. Advanced analytics and machine learning overview -- Preprocessing and feature engineering -- Classification -- Regression -- Recommendation -- Unsupervised learning -- Graph analytics -- Deep learning -- Part 7. Ecosystem. Language specifics : Python (PySpark) and R (SparkR and sparklyr) -- Ecosystem and community
Notes Includes index
Online resource; title from title page (Safari, viewed May 22, 2017)
Subject Spark (Electronic resource : Apache Software Foundation)
Big data.
Data mining.
Information retrieval.
Form Electronic book
Author Zaharia, Matei, author
ISBN 1491912200
1491912294 (electronic bk.)
1491912308 (electronic bk.)
9781491912201
9781491912294 (electronic bk.)
9781491912300 (electronic bk.)