Hadoop brought the capability to store massive amounts of data in a distributed environment and to process that data effectively. It is a distributed data processing system that supports distributed file systems and offers a way to parallelize and execute programs on a cluster of machines. It can be installed on a cluster built from a large number of commodity machines, which in turn reduces the overall cost of the solution. Apache Hadoop has already been adopted by technology giants such as Yahoo, Facebook, Twitter, and LinkedIn to address their big data needs, and it is making inroads across all industrial sectors. This book covers a basic understanding of Hadoop, including Hadoop technologies, architecture, and process flow.
This book also covers key Hadoop ecosystem components such as MapReduce, Hive, and HDFS. It provides an introduction to YARN and explains how it differs from older versions of Hadoop. The book is written for beginner and intermediate developers who want an introduction to Hadoop.
Table of Contents
1. Big Data
3. The Hadoop Distributed Filesystem (HDFS)
4. Getting Started with Hadoop
5. Interface to Access HDFS File System
9. Getting Started with Hive