A Comprehensive Guide to Apache Hadoop: Storage and Analysis at Scale
Price: $40.99
(as of May 20, 2023 21:39:10 UTC)
Unlock the potential of big data with the Apache Hadoop ecosystem. In the fourth edition of this definitive guide, author Tom White provides a comprehensive introduction to using and maintaining distributed systems with Hadoop 2.
This book is an essential resource for programmers and administrators working with large-scale datasets. New chapters cover the YARN resource manager and related Apache projects such as Flume, Parquet, Crunch, and Spark. The edition also covers recent changes to the Hadoop ecosystem and includes case studies on healthcare systems and genomics data processing.
Discover the fundamental components of Apache Hadoop, including the MapReduce processing framework, the Hadoop Distributed File System (HDFS), and the YARN resource manager. Learn how to develop applications with MapReduce and set up and maintain a Hadoop cluster running on YARN.
You will also explore data formats such as Avro and Parquet, and learn to move data with tools such as Flume (streaming ingestion) and Sqoop (bulk transfer). Understand how to use high-level data processing tools like Pig, Hive, Crunch, and Spark with Hadoop. Finally, learn about the HBase distributed database and the ZooKeeper distributed configuration service.
Publisher : O’Reilly Media; 4th edition (May 5, 2015)
Language : English
Paperback : 754 pages
ISBN-10 : 1491901632
ISBN-13 : 978-1491901632
Item Weight : 2.78 pounds
Dimensions : 7 x 1.51 x 9.19 inches