NameNode
HDFS uses a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks and these blocks are stored in a set of DataNodes.
The NameNode is the arbitrator and repository for all HDFS metadata. Data never flows through the NameNode and it is primarily used to execute file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes.
The NameNode is thus the critical component of the HDFS architecture. As the master server, it maintains the file system namespace, which includes information about files, directories, permissions, and the locations of data blocks on the DataNodes.
Because HDFS is designed to store vast amounts of data across a distributed system, the NameNode plays a pivotal role in managing the data's structure and access. It does not, however, store the actual data (i.e., the content of files). It stores only the metadata, while the data blocks themselves are stored on DataNodes.
Key Responsibilities of the NameNode
File System Metadata Management:
The NameNode stores all the metadata for the files in the HDFS. This metadata includes the directory structure, file names, permissions, replication factors, and block locations for each file.
Namespace: This is the directory structure of HDFS that the NameNode manages (similar to a traditional file system directory).
Block Mapping: HDFS splits files into smaller chunks called blocks. The NameNode tracks where each block of a file is stored across various DataNodes in the cluster.
File-to-Block Mapping:
The NameNode tracks which blocks make up each file and on which DataNodes those blocks reside. HDFS files are divided into blocks of 128 MB by default in Hadoop 2.x and later (64 MB in Hadoop 1.x; configurable via dfs.blocksize).
The NameNode doesn’t store the file content; it only stores the block location and the file’s metadata (i.e., the mapping between files and their blocks).
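The file-to-block mapping described above can be sketched in a few lines of Python. This is a toy model for illustration only: the class and method names (SimpleNameNode, add_file, get_block_locations) are invented and are not part of any real Hadoop API.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the HDFS default in Hadoop 2.x+

class SimpleNameNode:
    """Toy model of the in-memory metadata a NameNode keeps.
    It stores only mappings, never file contents."""

    def __init__(self):
        self.file_to_blocks = {}   # path -> [block_id, ...]
        self.block_locations = {}  # block_id -> [datanode, ...]
        self._next_block = 0

    def add_file(self, path, size_bytes, datanodes):
        """Split a file of size_bytes into fixed-size blocks and record
        which DataNodes hold each block's replicas."""
        num_blocks = max(1, -(-size_bytes // BLOCK_SIZE))  # ceiling division
        blocks = []
        for _ in range(num_blocks):
            block_id = f"blk_{self._next_block}"
            self._next_block += 1
            self.block_locations[block_id] = list(datanodes)
            blocks.append(block_id)
        self.file_to_blocks[path] = blocks

    def get_block_locations(self, path):
        """Metadata-only lookup: return (block_id, locations) pairs.
        The NameNode never returns the file contents themselves."""
        return [(b, self.block_locations[b]) for b in self.file_to_blocks[path]]
```

For example, a 300 MB file becomes three blocks under a 128 MB block size: two full blocks plus a 44 MB tail.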
Block Replication Management:
HDFS uses replication to ensure data reliability and fault tolerance. By default, each block of data is replicated three times (this can be configured).
The NameNode monitors the number of replicas of each block and ensures that the replication factor is maintained, especially in the case of node failures. If a DataNode goes down, the NameNode schedules the replication of blocks to ensure the desired replication level.
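The replica-counting logic described above can be sketched as follows. This is a hedged simplification: in real Hadoop this work is done by the NameNode's internal BlockManager, and the function below is purely illustrative.

```python
REPLICATION_FACTOR = 3  # the HDFS default, configurable via dfs.replication

def find_under_replicated(block_locations, live_datanodes):
    """Return {block_id: missing_replica_count}, counting only replicas
    that sit on DataNodes still known to be alive. Blocks with a positive
    count would be queued for re-replication."""
    under = {}
    for block_id, nodes in block_locations.items():
        live_replicas = [n for n in nodes if n in live_datanodes]
        missing = REPLICATION_FACTOR - len(live_replicas)
        if missing > 0:
            under[block_id] = missing
    return under
```

If DataNode dn4 fails, any block that had a replica on dn4 drops below the target factor and shows up in the result with the number of replicas to re-create.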
Data Integrity:
DataNodes store checksums alongside every block, and clients verify them when reading. When a corrupt replica is detected and reported, the NameNode marks that replica as corrupt and schedules re-replication of the block from a healthy DataNode to restore data consistency.
Access Control:
The NameNode controls file access permissions. It checks whether a client has the proper permissions to access, modify, or delete a file.
Namespace Operations:
Operations such as creating files, renaming files, deleting files, and moving files between directories are handled by the NameNode.
When a client wants to access a file, the NameNode provides the block location information, allowing the client to directly access the blocks stored on the DataNodes.
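The read path just described, in which the NameNode hands out block locations and the client fetches data directly from DataNodes, can be sketched like this. The dict-based "DataNode storage" stands in for real DataNode daemons; nothing here is the actual Hadoop client API.

```python
def read_file(namenode_metadata, datanode_storage, path):
    """Illustrative HDFS read path.

    namenode_metadata: path -> [(block_id, [datanode, ...]), ...]
    datanode_storage:  datanode -> {block_id: bytes}
    """
    data = b""
    for block_id, locations in namenode_metadata[path]:
        # Step 1 (already done): the NameNode supplied `locations`.
        # Step 2: fetch the block directly from the first DataNode
        # that actually holds it -- no data flows through the NameNode.
        for dn in locations:
            if block_id in datanode_storage.get(dn, {}):
                data += datanode_storage[dn][block_id]
                break
    return data
```

Note that the NameNode structure is consulted only for metadata; the bytes come straight from the DataNodes.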
Components of the NameNode
FSImage:
The FSImage is a file that contains the entire namespace (directory structure) and metadata of the HDFS. It includes information about files, directories, blocks, and permissions.
FSImage is stored in the NameNode's local storage and is periodically written to disk as a snapshot of the file system's state.
EditLog:
The EditLog stores all the changes made to the namespace (such as file creation, deletion, or modification) since the last checkpoint.
Each modification made to the file system is logged in the EditLog, and periodically, these logs are merged into the FSImage to create an up-to-date snapshot of the file system state.
Checkpoint:
Periodically, the NameNode performs a checkpointing operation, where the EditLog is applied to the FSImage, and a new version of the FSImage is created.
This checkpointing process helps to reduce the recovery time in case of NameNode failure.
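The checkpointing step above amounts to replaying the EditLog over the FSImage and starting a fresh log. The sketch below uses invented record formats (a dict for the image, tuples for log entries) purely to show the mechanics; real FSImage and EditLog files are binary formats with many more operation types.

```python
def checkpoint(fsimage, edit_log):
    """Apply every logged namespace change to a copy of the FSImage,
    producing a new image and an empty EditLog.

    fsimage:  {path: metadata_dict}
    edit_log: list of (op, path, arg) tuples
    """
    new_image = dict(fsimage)
    for op, path, arg in edit_log:
        if op == "create":
            new_image[path] = arg            # arg is the file's metadata
        elif op == "delete":
            new_image.pop(path, None)
        elif op == "rename":
            new_image[arg] = new_image.pop(path)  # arg is the new path
    return new_image, []
```

After a crash, the NameNode only has to replay whatever small EditLog accumulated since the last checkpoint, rather than the full history, which is exactly why checkpointing shortens recovery time.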
How the NameNode Works in HDFS
File Writing Process:
When a client wants to write data to HDFS, it first interacts with the NameNode to obtain information about where to store the file’s blocks.
The client is then directed to the DataNodes where the file’s data will be written.
The NameNode also ensures that blocks are replicated on multiple DataNodes for fault tolerance.
File Reading Process:
When a client wants to read a file from HDFS, it contacts the NameNode to get the location of the blocks for that file.
Once the client has the block locations, it communicates directly with the DataNodes to retrieve the blocks of the file.
The NameNode doesn’t serve the file contents; it only serves the metadata.
Block Replication:
The NameNode constantly monitors the replication levels of the blocks and ensures that if a DataNode fails, the missing blocks are replicated from other DataNodes to maintain the desired replication factor.
Failure Recovery:
If a DataNode goes down, the NameNode will detect it via heartbeats (sent regularly by DataNodes to the NameNode). Once the NameNode identifies that a DataNode is unavailable, it will trigger the re-replication of the lost blocks to other healthy DataNodes.
This ensures data availability even in the event of hardware or node failures.
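The heartbeat-based failure detection described above can be sketched as a simple timestamp check. Real HDFS declares a DataNode dead after roughly ten minutes without a heartbeat (derived from dfs.heartbeat.interval and related settings); the 600-second threshold below mirrors that default, but the function itself is illustrative.

```python
DEAD_AFTER_SECONDS = 600  # roughly the HDFS default dead-node timeout

def detect_dead_datanodes(last_heartbeat, now):
    """last_heartbeat: datanode -> timestamp (seconds) of its last heartbeat.
    Returns the set of DataNodes considered dead at time `now`.
    Blocks on these nodes would then be queued for re-replication."""
    return {dn for dn, ts in last_heartbeat.items()
            if now - ts > DEAD_AFTER_SECONDS}
```

Once a node lands in this set, its replicas no longer count toward each block's replication factor, which is what triggers the re-replication described above.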
Fault Tolerance in the NameNode
Single Point of Failure: The NameNode is a critical component in HDFS. Since it stores all the metadata and the block locations of the entire file system, a failure of the NameNode could bring down the entire HDFS system.
High Availability: In a production environment, it's common to run the NameNode in a Hadoop High Availability (HA) configuration, with a Standby NameNode that can take over if the Active NameNode fails. (Note that this is distinct from the Secondary NameNode described below.)
The Active NameNode serves all client requests, while the Standby NameNode stays in sync with it by tailing the shared edit log (typically hosted on a quorum of JournalNodes) and by receiving block reports directly from the DataNodes.
In case of failure, the Standby NameNode becomes the Active NameNode and takes over the responsibilities.
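The failover hand-off can be reduced to a tiny sketch: the standby is promoted to active, and the old active is demoted. Real HA failover involves ZooKeeper-based leader election and fencing of the old active, none of which is modeled here; the class is purely illustrative.

```python
class HAPair:
    """Toy model of an Active/Standby NameNode pair."""

    def __init__(self, active="nn1", standby="nn2"):
        self.active = active
        self.standby = standby

    def failover(self):
        """Promote the standby to active and demote the old active.
        Returns the name of the newly active NameNode."""
        self.active, self.standby = self.standby, self.active
        return self.active
```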
Secondary NameNode (Not a Backup)
The Secondary NameNode is often misunderstood as a backup for the NameNode, but it is not. Its primary role is to checkpoint the namespace by merging the EditLog with the FSImage to create a new FSImage.
The Secondary NameNode does not serve client requests or replace the NameNode if it fails. It simply helps with the housekeeping of the HDFS metadata to ensure that the EditLog doesn't grow indefinitely.
Challenges and Considerations
Memory Usage: The NameNode stores all the metadata of the file system in memory, so it requires significant memory resources, especially in large clusters.
Scalability: The NameNode can become a bottleneck as the number of files and directories grows, since all metadata must fit in its heap. Hadoop 2.x addresses this with HDFS Federation, which allows multiple independent NameNodes to each manage a portion of the overall namespace.
Single Point of Failure: Unless it is deployed in an HA configuration as described above, the NameNode remains a single point of failure for metadata management.