E-book
Author Radtka, Zachary, author

Title Hadoop with Python / Zachary Radtka & Donald Miner
Edition First edition
Published Sebastopol, CA : O'Reilly Media, 2015
©2016

Description 1 online resource (1 volume) : illustrations
Summary Hadoop is mostly written in Java, but that doesn't exclude the use of other programming languages with this distributed storage and processing framework, particularly Python. With this concise book, you'll learn how to use Python with the Hadoop Distributed File System (HDFS), MapReduce, the Apache Pig platform and Pig Latin script, and the Apache Spark cluster-computing framework. Authors Zachary Radtka and Donald Miner from the data science firm Miner and Kasch take you through the basic concepts behind Hadoop, MapReduce, Pig, and Spark. Then, through multiple examples and use cases, you'll learn how to work with these technologies by applying various Python tools. Use the Python library Snakebite to access HDFS programmatically from within Python applications. Write MapReduce jobs in Python with mrjob, the Python MapReduce library. Extend Pig Latin with user-defined functions (UDFs) in Python. Use the Spark Python API (PySpark) to write Spark programs with Python. Learn how to use the Luigi Python workflow scheduler to manage MapReduce jobs and Pig scripts. Zachary Radtka, a platform engineer at Miner and Kasch, has extensive experience creating custom analytics that run on petabyte-scale data sets.
Notes Online resource; title from title page (viewed January 3, 2019)
Subject Apache Hadoop. http://id.loc.gov/authorities/names/n2013024279
Apache Hadoop (FAST)
Subject Python (Computer program language)
Form Electronic book
Author Miner, Donald, author.
ISBN 9781491942277
1491942274
1492048437
9781492048435