
Posted to hdfs-dev@hadoop.apache.org by "Yonghwan Kim (JIRA)" <ji...@apache.org> on 2013/06/30 05:38:19 UTC

[jira] [Created] (HDFS-4945) A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS

Yonghwan Kim created HDFS-4945:
----------------------------------

Summary: A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS
Key: HDFS-4945
URL: https://issues.apache.org/jira/browse/HDFS-4945
Project: Hadoop HDFS
Issue Type: New Feature
Components: auto-failover
Affects Versions: HA branch (HDFS-1623)
Reporter: Yonghwan Kim

Hadoop has recently attracted much attention from engineers and researchers as an emerging and effective framework for Big Data.
HDFS (Hadoop Distributed File System) can manage huge amounts of data while guaranteeing high performance and reliability
using only commodity hardware.

However, HDFS requires a single master node, called the NameNode, to manage the entire namespace (that is, all the i-nodes)
of a file system. This creates a SPOF (Single Point Of Failure) problem because the file system becomes inaccessible
when the NameNode fails. (HDFS-2064)

The single NameNode is also a performance bottleneck, since every access request to the file system has to contact the
NameNode. Hadoop 2.0 resolves the SPOF problem by introducing manual failover based on two NameNodes, Active and Standby.
However, the performance bottleneck remains, since all access requests still have to contact the Active NameNode
during normal operation. It may also lose the advantage of using commodity hardware, since the two NameNodes have to
share highly reliable, sophisticated storage.

We propose a new HDFS architecture to resolve all the problems mentioned above.
The proposed architecture has the following features and advantages.

1. Multiple NameNodes (not restricted to two) can be used to improve availability.
The entire namespace of a file system is partitioned into several fragments, and replicas of each fragment are
distributed among the NameNodes. When each fragment has k replicas, the file system can tolerate up to
floor(k/2 - 1) faulty NameNodes.

2. Multiple NameNodes can be used to improve performance. The performance bottleneck caused by a single
NameNode can be avoided by assigning different NameNodes as the primaries (or entry points)
of different fragments.

3. The highly reliable storage shared by the NameNodes is no longer needed, because a message-based consistency
mechanism is introduced among the NameNodes. The architecture requires only commodity hardware.
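
As a rough illustration of features 1 and 2, the following Python sketch shows one way fragments, replicas, and primaries could be laid out. It is not taken from the proposal: the round-robin replica placement, the mapping of paths to fragments by top-level directory, and all names (Fragment, build_fragments, fragment_for_path, tolerated_faults, nn1..nn5) are assumptions made for this example only. The fault bound is the floor(k/2 - 1) stated above; clamping it at zero for small k is also an assumption.

```python
import math
import zlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Fragment:
    fragment_id: int
    replicas: Tuple[str, ...]  # NameNodes holding a replica of this fragment
    primary: str               # entry point for requests to this fragment


def tolerated_faults(k: int) -> int:
    """Bound stated in the proposal, floor(k/2 - 1); clamped at 0 (assumption)."""
    return max(0, math.floor(k / 2 - 1))


def build_fragments(namenodes: List[str], num_fragments: int, k: int) -> List[Fragment]:
    """Place k replicas of each fragment round-robin (hypothetical policy).

    The first replica holder is chosen as the primary, so consecutive
    fragments get different primaries and the request load is spread out.
    """
    n = len(namenodes)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of NameNodes")
    fragments = []
    for fid in range(num_fragments):
        replicas = tuple(namenodes[(fid + i) % n] for i in range(k))
        fragments.append(Fragment(fid, replicas, replicas[0]))
    return fragments


def fragment_for_path(path: str, num_fragments: int) -> int:
    """Map a path to a fragment by hashing its top-level directory (assumption)."""
    top = path.strip("/").split("/")[0]
    return zlib.crc32(top.encode("utf-8")) % num_fragments


if __name__ == "__main__":
    namenodes = ["nn1", "nn2", "nn3", "nn4", "nn5"]
    fragments = build_fragments(namenodes, num_fragments=5, k=4)
    frag = fragments[fragment_for_path("/user/alice/data.txt", len(fragments))]
    print("primary:", frag.primary)
    print("replicas:", frag.replicas)
    print("tolerated faulty NameNodes:", tolerated_faults(4))
```

With five NameNodes and five fragments, each NameNode is the primary of exactly one fragment in this layout, so no single node serves every request. A real design would also need the message-based consistency mechanism of feature 3, which this sketch does not attempt.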

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Created] (HDFS-4945) A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS

Posted by Azuryy Yu <az...@gmail.com>.

Hi,

For your first point: if you deploy QJM HA, the NameNodes don't need to share
highly reliable, sophisticated storage.
