Apache Hadoop

Hadoop Platform Components

The Hadoop platform comprises the following components:

  • Hadoop Common: A set of common utilities that support the other Hadoop modules.

  • Hadoop Distributed File System (HDFS): A distributed file system which provides high-throughput access to data.

  • YARN: A framework for job scheduling and cluster resource management.

  • MapReduce: A YARN-based system for parallel processing of large data sets.

  • Tez: A generalized data-flow programming framework, built on Hadoop YARN, that provides a flexible engine for executing an arbitrary DAG of tasks to process data for both batch and interactive use cases.

  • ZooKeeper: A high-performance coordination service for distributed applications.
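
To make the MapReduce entry above more concrete, here is a minimal sketch of the map, shuffle, and reduce phases as a word count. It is plain single-process Python, not the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. On a real cluster the framework would run many map and reduce tasks in parallel across nodes and perform the shuffle itself.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

Each map call depends only on its own input line and each reduce call only on its own key, which is what would allow a framework to distribute the work across a cluster.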
