HDFS
Last updated
The Hadoop Distributed File System (HDFS) allows users to store large amounts of data across multiple server instances. An HDFS cluster consists of a NameNode, which manages the file system metadata, and DataNodes, which store the actual data. Client applications contact the NameNode for file metadata or file modifications, while the actual file I/O is performed directly with the DataNodes.
Fault-tolerant and horizontally scalable
Provides distributed storage and distributed processing on commodity hardware.
Can scale linearly as the data footprint grows
Provides users with UNIX shell-like commands to interact with HDFS
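As a sketch of the UNIX shell-like interface mentioned above, the following commands show a typical round trip of creating a directory, uploading a file, and reading it back. The paths (`/user/alice`, `report.txt`) are illustrative placeholders, and the commands assume a running HDFS cluster with the Hadoop client on the PATH.

```shell
# Create a home directory in HDFS (hypothetical user "alice")
hdfs dfs -mkdir -p /user/alice

# Copy a local file into HDFS
hdfs dfs -put report.txt /user/alice/

# List the directory contents
hdfs dfs -ls /user/alice

# Print the file's contents to stdout
hdfs dfs -cat /user/alice/report.txt

# Remove the file when done
hdfs dfs -rm /user/alice/report.txt
```

These mirror their UNIX counterparts (`mkdir`, `cp`, `ls`, `cat`, `rm`), which is what makes the HDFS shell approachable for users already familiar with a UNIX environment.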
For additional details, see the HDFS documentation on the Apache Hadoop website.