Posted to common-dev@hadoop.apache.org by Sangmin Lee <sa...@gmail.com> on 2009/03/25 20:07:53 UTC

Highly Available HDFS ???

Hi all,

I am wondering whether there are any efforts or plans for HA (Highly
Available) HDFS out there.
Currently, the NameNode is a single point of failure, and recovery requires
human intervention.
In addition, the recovered NameNode may not be the same as the one before the
failure.
Are there any plans or ongoing efforts to improve this?

Thanks,
Sangmin

Re: Highly Available HDFS ???

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
With the recently introduced BackupNode, HA becomes more feasible,
although it is not done yet and, as Sanjay mentioned, the plans in that direction are not clear.

There is a presentation related to HA on the Hadoop wiki:
http://wiki.apache.org/hadoop/HadoopPresentations
http://files.meetup.com/1228907/Hadoop%20Namenode%20High%20Availability.pptx

Thanks,
--Konstantin


Sanjay Radia wrote:
> 
> On Mar 25, 2009, at 12:07 PM, Sangmin Lee wrote:
> 
>> Hi all,
>>
>> I am wondering if there is any effort or plans on HA (Highly 
>> Available) HDFS
>> out there.
>> Currently, NameNode is single point of failure and recovery requires 
>> human
>> intervention.
>>
> Many (and probably most) users of hadoop are using hdfs for batch 
> processing.
> As a result HA for name node has not received as high a priority as 
> other projects since
> batch jobs can wait while the name node is restarting.
> Clearly this is not acceptable for non-batch use of hdfs.
> 
> 
> Suresh has a rough prototype of HA'ed Namenode using linux HA that he is 
> planning put in contrib one of these days (it is low priority
> background task for him).
> 
> Sorry that I don't have a better answer.
> 
> sanjay
> 
>>
>> In addition, the recovered NameNode may not same as one before the 
>> failure.
>> Is there any plans or ongoing effort to improve this?
>>
>> Thanks,
>> Sangmin
>>
> 
> 

Re: Highly Available HDFS ???

Posted by Dhruba Borthakur <dh...@gmail.com>.
We are running a real-timeish cluster that is configured as two overlapping
hdfs clusters. The namenodes run on two different machines but the datanodes
run on the same set of slave machines. (Each slave machine actually runs
two datanode instances.) The entire storage space is shared between the two
clusters and, at the same time, provides higher availability because there
are two namenodes. The downside is that there are two separate namespaces,
and the application has to handle this.

thanks,
dhruba
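
To illustrate what "the application has to handle this" could look like, here is a minimal sketch in Python. It is not taken from the thread and does not use any real HDFS client API: `InMemoryNamespace`, `DualNamespaceClient`, and `NamespaceUnavailable` are hypothetical names, and the in-memory store only stands in for one namenode's namespace. The assumption is that the application tries each namespace in order and accepts that the two are not kept in sync.

```python
class NamespaceUnavailable(Exception):
    """Raised when a namespace's namenode cannot be reached (hypothetical)."""


class InMemoryNamespace:
    """Stand-in for one HDFS namespace; this is not a real HDFS client."""

    def __init__(self, name):
        self.name = name
        self.files = {}
        self.up = True  # flip to False to simulate a namenode outage

    def write(self, path, data):
        if not self.up:
            raise NamespaceUnavailable(self.name)
        self.files[path] = data

    def read(self, path):
        if not self.up:
            raise NamespaceUnavailable(self.name)
        return self.files[path]  # KeyError if the path is not in this namespace


class DualNamespaceClient:
    """Tries each namespace in order; the namespaces are NOT replicated."""

    def __init__(self, namespaces):
        self.namespaces = list(namespaces)

    def write(self, path, data):
        # Write to the first namespace whose namenode is reachable and
        # report which one accepted the file.
        for ns in self.namespaces:
            try:
                ns.write(path, data)
                return ns.name
            except NamespaceUnavailable:
                continue
        raise NamespaceUnavailable("all namespaces are down")

    def read(self, path):
        # Look in every reachable namespace, since the file may live in either.
        any_reachable = False
        for ns in self.namespaces:
            try:
                return ns.read(path)
            except NamespaceUnavailable:
                continue
            except KeyError:
                any_reachable = True
        if any_reachable:
            raise FileNotFoundError(path)
        raise NamespaceUnavailable("all namespaces are down")
```

In this sketch, new writes keep working while one namenode is down, but files that were written only to the unavailable namespace cannot be read until it comes back. That is the trade-off of having two separate namespaces rather than one replicated namenode.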

On Wed, Mar 25, 2009 at 12:25 PM, Sanjay Radia <sr...@yahoo-inc.com> wrote:

>
> On Mar 25, 2009, at 12:07 PM, Sangmin Lee wrote:
>
>> Hi all,
>>
>> I am wondering if there is any effort or plans on HA (Highly Available)
>> HDFS
>> out there.
>> Currently, NameNode is single point of failure and recovery requires human
>> intervention.
>>
> Many (and probably most) users of hadoop are using hdfs for batch
> processing.
> As a result HA for name node has not received as high a priority as other
> projects since
> batch jobs can wait while the name node is restarting.
> Clearly this is not acceptable for non-batch use of hdfs.
>
>
> Suresh has a rough prototype of HA'ed Namenode using linux HA that he is
> planning put in contrib one of these days (it is low priority
> background task for him).
>
> Sorry that I don't have a better answer.
>
> sanjay
>
>
>
>> In addition, the recovered NameNode may not same as one before the
>> failure.
>> Is there any plans or ongoing effort to improve this?
>>
>> Thanks,
>> Sangmin
>>
>>
>

Re: Highly Available HDFS ???

Posted by Sanjay Radia <sr...@yahoo-inc.com>.
On Mar 25, 2009, at 12:07 PM, Sangmin Lee wrote:

> Hi all,
>
> I am wondering if there is any effort or plans on HA (Highly  
> Available) HDFS
> out there.
> Currently, NameNode is single point of failure and recovery requires  
> human
> intervention.
>
Many (and probably most) users of Hadoop are using HDFS for batch
processing.
As a result, HA for the name node has not received as high a priority as
other projects, since batch jobs can wait while the name node is restarting.
Clearly this is not acceptable for non-batch use of HDFS.


Suresh has a rough prototype of an HA'ed Namenode using Linux HA that he
is planning to put in contrib one of these days (it is a low-priority
background task for him).

Sorry that I don't have a better answer.

sanjay

>
> In addition, the recovered NameNode may not same as one before the
> failure.
> Is there any plans or ongoing effort to improve this?
>
> Thanks,
> Sangmin
>