Description |
1 online resource (1 volume) : illustrations |
Contents |
Analyzing big data -- Introduction to data analysis with Scala and Spark -- Recommending music and the Audioscrobbler data set -- Predicting forest cover with decision trees -- Anomaly detection in network traffic with K-means clustering -- Understanding Wikipedia with latent semantic analysis -- Analyzing co-occurrence networks with GraphX -- Geospatial and temporal data analysis on the New York City taxi trip data -- Estimating financial risk through Monte Carlo simulation -- Analyzing genomics data and the BDG project -- Analyzing neuroimaging data with PySpark and Thunder |
Summary |
The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by presenting examples and a set of self-contained patterns for performing large-scale data analysis with Spark. You'll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques (classification, collaborative filtering, and anomaly detection, among others) to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you'll find these patterns useful for working on your own data applications. |
Notes |
Previous edition published: 2015 |
|
Includes index |
|
Description based on online resource; title from title page (Safari, viewed June 19, 2017) |
SUBJECT |
Spark (Electronic resource : Apache Software Foundation) http://id.loc.gov/authorities/names/no2015027445
|
|
Spark (Electronic resource : Apache Software Foundation) fast |
Subject |
Big data.
|
|
Data mining -- Computer programs
|
|
Big data
|
Form |
Electronic book
|
Author |
Laserson, Uri, author
|
|
Owen, Sean, author
|
|
Wills, Josh, author
|
ISBN |
9781491972946 |
|
1491972947 |
|
1491972955 |
|
9781491972953 |
|