Why do HDFS clusters have only a single NameNode?
Posted by grautur on Programmers
Published on 2012-04-04T03:07:34Z
hadoop
I'm trying to better understand how Hadoop works, and I'm reading:
The NameNode is a Single Point of Failure for the HDFS Cluster. HDFS is not currently a High Availability system. When the NameNode goes down, the file system goes offline. There is an optional SecondaryNameNode that can be hosted on a separate machine. It only creates checkpoints of the namespace by merging the edits file into the fsimage file and does not provide any real redundancy. Hadoop 0.21+ has a BackupNameNode that is part of a plan to have an HA name service, but it needs active contributions from the people who want it (i.e. you) to make it Highly Available.
from http://wiki.apache.org/hadoop/NameNode
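As a rough illustration of the checkpointing that passage describes, here is a toy sketch in Python. It is not Hadoop's real API or on-disk format: the namespace is modeled as a plain dict, the edits file as a list of tuples, and the names `apply_edit` and `checkpoint` are hypothetical, chosen only for this example.

```python
# Toy model only: every name and structure here is hypothetical,
# not Hadoop's actual API or file formats.

def apply_edit(namespace, edit):
    """Apply one logged operation to a namespace dict (path -> block count)."""
    op, path, *args = edit
    if op == "create":
        namespace[path] = args[0]
    elif op == "delete":
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint(fsimage, edits):
    """Merge the edit log into a copy of fsimage; return (new image, empty log)."""
    merged = dict(fsimage)
    for edit in edits:
        apply_edit(merged, edit)
    return merged, []


# NameNode state: the last saved image plus the edits logged since then.
fsimage = {"/a": 1}
edits = [("create", "/b", 2), ("delete", "/a")]

# A checkpoint merges the edits into the image and starts a fresh log.
secondary_copy, edits = checkpoint(fsimage, edits)
fsimage = secondary_copy

# More changes arrive after the checkpoint was taken.
edits.append(("create", "/c", 3))

# The live namespace is the image plus the newer edits.
live_view, _ = checkpoint(fsimage, edits)
```

In this sketch `secondary_copy` never sees `/c`, because that file was logged after the checkpoint. That lag is consistent with the quote's point that the SecondaryNameNode only creates checkpoints and does not provide real redundancy.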
So why is the NameNode a single point of failure? What is bad or difficult about having a complete duplicate of the NameNode running as well?