Posted to hdfs-dev@hadoop.apache.org by Sujee Maniyam <su...@sujee.net> on 2012/09/05 00:37:06 UTC

current direction in namenode HA

Hello devs,

I am trying to understand the current state / direction of  namenode
HA implementation.

For using shared directory, I see the following options
(from http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/
  and  https://issues.apache.org/jira/browse/HDFS-3278)

1) rely on external HA filer
2) multiple edit directories
3) book keeper
4) keep edits in HDFS / quorum based

Is there going to be an 'official / supported' method, or is it going
to be a configurable choice when setting up a cluster?

thanks
Sujee
http://sujee.net

Re: current direction in namenode HA

Posted by "Aaron T. Myers" <at...@cloudera.com>.
Hi Sujee,

On Tue, Sep 4, 2012 at 3:37 PM, Sujee Maniyam <su...@sujee.net> wrote:

> I am trying to understand the current state / direction of  namenode
> HA implementation.
>

Thanks for the interest.


>
> For using shared directory, I see the following options
> (from
> http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/
>   and  https://issues.apache.org/jira/browse/HDFS-3278)
>
> 1) rely on external HA filer
>

This is currently committed / released.


>  2) multiple edit directories
>

This was an option that was considered (and mentioned in the blog post you
referenced) but I don't know of anyone actively working on this.


> 3) book keeper
>

This is being actively developed. There's no umbrella JIRA or label
that I'm aware of that tracks this work, but if you search the HDFS JIRA
for "bookkeeper" or "bkjm" (BookKeeper journal manager) you should find
more info.


>  4) keep edits in HDFS / quorum based
>

This is also being actively developed. See this JIRA for more info:
https://issues.apache.org/jira/browse/HDFS-3077


>
> Is there going to be an 'official / supported' method, or is it going
> to be a configurable choice when setting up a cluster?
>

At least for the time being, only option 1 is released. Options 3 and 4
will likely be completed and released soon. Once that happens, all three
will coexist in HDFS as mutually exclusive choices: you pick one of them
as the shared edits storage when configuring an HA HDFS cluster.
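
To make the "configurable choice" concrete: in Hadoop 2 the shared edits
location is set through the dfs.namenode.shared.edits.dir property, and, as
far as I know, the URI scheme of its value is what selects the storage back
end. The sketch below is illustrative Python, not Hadoop code. The helper
name, the mapping table, and the example host names and paths are all
assumptions made for illustration, and the "bookkeeper" and "qjournal"
schemes are the ones used by the BKJM and HDFS-3077 work, so they may still
change before release.

```python
from urllib.parse import urlsplit

# Hypothetical table: URI scheme of dfs.namenode.shared.edits.dir mapped to
# the option numbers used in this thread. Option 2 (multiple edit
# directories) has no entry because nobody is known to be working on it.
SCHEME_TO_OPTION = {
    "file": "option 1: shared directory on an external HA filer (e.g. NFS)",
    "bookkeeper": "option 3: BookKeeper journal manager (BKJM)",
    "qjournal": "option 4: quorum-based journal (HDFS-3077)",
}


def describe_shared_edits(uri: str) -> str:
    """Return which option from this thread a shared edits URI selects."""
    scheme = urlsplit(uri).scheme
    if scheme not in SCHEME_TO_OPTION:
        raise ValueError(f"unrecognized shared edits scheme: {scheme!r}")
    return SCHEME_TO_OPTION[scheme]


# Example values; every host name and path here is a placeholder.
EXAMPLES = [
    "file:///mnt/filer1/dfs/ha-name-dir-shared",
    "bookkeeper://zk1:2181;zk2:2181;zk3:2181/hdfsjournal",
    "qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster",
]

if __name__ == "__main__":
    for value in EXAMPLES:
        print(value, "->", describe_shared_edits(value))
```

Only one value is configured per cluster, which is why the options are
mutually exclusive even though all of them ship in the same HDFS release.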

I hope that clears things up.

--
Aaron T. Myers
Software Engineer, Cloudera

Re: current direction in namenode HA

Posted by Uma Maheswara Rao G <ha...@gmail.com>.
Small correction:
  For your option 4, you can take a look at HDFS-3077.
  Work is actively under way in that umbrella JIRA.

@Ted, thanks for adding the link.

Regards,
Uma

On Wed, Sep 5, 2012 at 4:39 AM, Uma Maheswara Rao G <ha...@gmail.com> wrote:

> Hi Sujee,
>
> Thanks a lot for your interest in HA.
>
> for #1
> If you can invest in NFS filers, this is one option. If you want to try
> it, you can use the released Hadoop-2 version.
>   However, #2 and #3 avoid this external hardware dependency.
>
> for #2 you can take a look at HDFS-3399
>   We have been testing with BookKeeper for the last 2-3 months and it is
> going well. BK is making progress on autorecovery and security.
> Autorecovery is almost done (BOOKKEEPER-237) and will be released in
> BK 4.2 very soon. BK has already started work on security as well. The
> integration will come out with the next hadoop-2 release. Also attached
> are the tested scenarios from HDFS-3399 for your reference if you want to
> take a look. There is also a subtask in that umbrella JIRA for user
> manual information.
>
>
> for #3 you can take a look at HDFS-3077
>    Work is actively under way in this umbrella JIRA.
>
>
> for #4
> I am not sure anyone is working on it.
>
> The advantage here is that you can plug in whichever shared storage you
> want.
>
> Regards,
> Uma
>

Re: current direction in namenode HA

Posted by Ted Yu <yu...@gmail.com>.
Uma:
Attachments are stripped on this mailing list.
I guess you were trying to attach this file:
https://issues.apache.org/jira/secure/attachment/12538911/BKTestDoc.pdf

On Tue, Sep 4, 2012 at 4:09 PM, Uma Maheswara Rao G <ha...@gmail.com> wrote:

> Hi Sujee,
>
> Thanks a lot for your interest in HA.
>
> for #1
> If you can invest in NFS filers, this is one option. If you want to try
> it, you can use the released Hadoop-2 version.
>   However, #2 and #3 avoid this external hardware dependency.
>
> for #2 you can take a look at HDFS-3399
>   We have been testing with BookKeeper for the last 2-3 months and it is
> going well. BK is making progress on autorecovery and security.
> Autorecovery is almost done (BOOKKEEPER-237) and will be released in
> BK 4.2 very soon. BK has already started work on security as well. The
> integration will come out with the next hadoop-2 release. Also attached
> are the tested scenarios from HDFS-3399 for your reference if you want to
> take a look. There is also a subtask in that umbrella JIRA for user
> manual information.
>
>
> for #3 you can take a look at HDFS-3077
>    Work is actively under way in this umbrella JIRA.
>
>
> for #4
> I am not sure anyone is working on it.
>
> The advantage here is that you can plug in whichever shared storage you
> want.
>
> Regards,
> Uma

Re: current direction in namenode HA

Posted by Uma Maheswara Rao G <ha...@gmail.com>.
Hi Sujee,

Thanks a lot for your interest in HA.

for #1
If you can invest in NFS filers, this is one option. If you want to try
it, you can use the released Hadoop-2 version.
  However, #2 and #3 avoid this external hardware dependency.

for #2 you can take a look at HDFS-3399
  We have been testing with BookKeeper for the last 2-3 months and it is
going well. BK is making progress on autorecovery and security.
Autorecovery is almost done (BOOKKEEPER-237) and will be released in
BK 4.2 very soon. BK has already started work on security as well. The
integration will come out with the next hadoop-2 release. Also attached
are the tested scenarios from HDFS-3399 for your reference if you want to
take a look. There is also a subtask in that umbrella JIRA for user
manual information.


for #3 you can take a look at HDFS-3077
   Work is actively under way in this umbrella JIRA.


for #4
I am not sure anyone is working on it.

The advantage here is that you can plug in whichever shared storage you want.

Regards,
Uma
