Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2011/12/07 08:10:18 UTC

HDFS Backup nodes

Does hadoop 0.20.205 support configuring HDFS backup nodes?

Thanks,
Praveenesh

Re: HDFS Backup nodes

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, Dec 13, 2011 at 10:42 PM, M. C. Srivas <mc...@gmail.com> wrote:
> Any simple file meta-data test will cause the NN to spiral to death with
> infinite GC.  For example, try creating many, many files. Or even simply
> "stat" a bunch of files continuously.

Sure. If I run "dd if=/dev/zero of=foo" my laptop will "spiral to
death" also. I think this is what you're referring to -- continuously
write files until it is out of RAM.

This is a well understood design choice of HDFS. It is not designed as
general purpose storage for small files, and if you run tests against
it assuming it is, you'll get bad results. I agree there.

>
> The real FUD going on is refusing to acknowledge that there is indeed a
> real problem.

Yes, if you use HDFS for workloads for which it was never designed,
you'll have a problem. If you stick to commonly accepted best
practices I think you'll find the same thing that hundreds of other
companies have found: HDFS is stable and reliable and has no such "GC
of death" problems when used as intended.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera

Re: HDFS Backup nodes

Posted by "M. C. Srivas" <mc...@gmail.com>.
On Tue, Dec 13, 2011 at 6:19 PM, Todd Lipcon <to...@cloudera.com> wrote:

> On Sun, Dec 11, 2011 at 10:47 PM, M. C. Srivas <mc...@gmail.com> wrote:
> > But if you use a Netapp, then the likelihood of the Netapp crashing is
> > lower than the likelihood of a garbage-collection-of-death happening in
> the
> > NN.
>
> This is pure FUD.


> I've never seen a "garbage collection of death" in any NN with a heap
> smaller than 40GB, and only a small handful of times on larger
> heaps. So, unless you're running a 4000 node cluster, you shouldn't be
> concerned with this. And the existence of many 4000 node clusters
> running fine on HDFS indicates that a properly tuned NN does just
> fine.
>

Any simple file meta-data test will cause the NN to spiral to death with
infinite GC.  For example, try creating many, many files. Or even simply
"stat" a bunch of files continuously.

The real FUD going on is refusing to acknowledge that there is indeed a
real problem.



>
> [Disclaimer: I don't spread FUD regardless of vendor affiliation.]
>
> -Todd
>
> >
> > [ disclaimer:  I don't work for Netapp, I work for MapR ]
> >
> >
> > On Wed, Dec 7, 2011 at 4:30 PM, randy <ra...@comcast.net> wrote:
> >
> >> Thanks Joey. We've had enough problems with nfs (mainly under very high
> >> load) that we thought it might be riskier to use it for the NN.
> >>
> >> randy
> >>
> >>
> >> On 12/07/2011 06:46 PM, Joey Echeverria wrote:
> >>
> >>> Hey Rand,
> >>>
> >>> It will mark that storage directory as failed and ignore it from then
> >>> on. In order to do this correctly, you need a couple of options
> >>> enabled on the NFS mount to make sure that it doesn't retry
> >>> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
> >>> options set.
> >>>
> >>> -Joey
> >>>
> >>> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
> >>>
> >>>> What happens then if the nfs server fails or isn't reachable? Does
> hdfs
> >>>> lock up? Does it gracefully ignore the nfs copy?
> >>>>
> >>>> Thanks,
> >>>> randy
> >>>>
> >>>> ----- Original Message -----
> >>>> From: "Joey Echeverria"<jo...@cloudera.com>
> >>>> To: common-user@hadoop.apache.org
> >>>> Sent: Wednesday, December 7, 2011 6:07:58 AM
> >>>> Subject: Re: HDFS Backup nodes
> >>>>
> >>>> You should also configure the Namenode to use an NFS mount for one of
> >>>> its storage directories. That will give the most up-to-date backup of
> >>>> the metadata in case of total node failure.
> >>>>
> >>>> -Joey
> >>>>
> >>>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<praveenesh@gmail.com
> >
> >>>>  wrote:
> >>>>
> >>>>> This means we are still relying on the Secondary NameNode ideology for
> >>>>> the Namenode's backup.
> >>>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all
> the
> >>>>> time?
> >>>>>
> >>>>> Thanks,
> >>>>> Praveenesh
> >>>>>
> >>>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<
> >>>>> maheswara@huawei.com>wrote:
> >>>>>
> >>>>>  AFAIK the backup node was introduced from version 0.21 onwards.
> >>>>>> ________________________________________
> >>>>>> From: praveenesh kumar [praveenesh@gmail.com]
> >>>>>> Sent: Wednesday, December 07, 2011 12:40 PM
> >>>>>> To: common-user@hadoop.apache.org
> >>>>>> Subject: HDFS Backup nodes
> >>>>>>
> >>>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Praveenesh
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Joseph Echeverria
> >>>> Cloudera, Inc.
> >>>> 443.305.9434
> >>>>
> >>>
> >>>
> >>>
> >>>
> >>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Re: HDFS Backup nodes

Posted by Todd Lipcon <to...@cloudera.com>.
On Sun, Dec 11, 2011 at 10:47 PM, M. C. Srivas <mc...@gmail.com> wrote:
> But if you use a Netapp, then the likelihood of the Netapp crashing is
> lower than the likelihood of a garbage-collection-of-death happening in the
> NN.

This is pure FUD.

I've never seen a "garbage collection of death" in any NN with a heap
smaller than 40GB, and only a small handful of times on larger
heaps. So, unless you're running a 4000 node cluster, you shouldn't be
concerned with this. And the existence of many 4000 node clusters
running fine on HDFS indicates that a properly tuned NN does just
fine.
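For context, the tuning in question is mostly a matter of heap size and
collector choice. As a rough, illustrative sketch only (the heap size is a
placeholder; size it to your namespace), the settings go in hadoop-env.sh:

  # hadoop-env.sh - illustrative NN JVM settings, not a recommendation
  export HADOOP_NAMENODE_OPTS="-Xmx8g -XX:+UseConcMarkSweepGC \
      -XX:+CMSParallelRemarkEnabled -verbose:gc $HADOOP_NAMENODE_OPTS"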

[Disclaimer: I don't spread FUD regardless of vendor affiliation.]

-Todd

>
> [ disclaimer:  I don't work for Netapp, I work for MapR ]
>
>
> On Wed, Dec 7, 2011 at 4:30 PM, randy <ra...@comcast.net> wrote:
>
>> Thanks Joey. We've had enough problems with nfs (mainly under very high
>> load) that we thought it might be riskier to use it for the NN.
>>
>> randy
>>
>>
>> On 12/07/2011 06:46 PM, Joey Echeverria wrote:
>>
>>> Hey Rand,
>>>
>>> It will mark that storage directory as failed and ignore it from then
>>> on. In order to do this correctly, you need a couple of options
>>> enabled on the NFS mount to make sure that it doesn't retry
>>> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
>>> options set.
>>>
>>> -Joey
>>>
>>> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
>>>
>>>> What happens then if the nfs server fails or isn't reachable? Does hdfs
>>>> lock up? Does it gracefully ignore the nfs copy?
>>>>
>>>> Thanks,
>>>> randy
>>>>
>>>> ----- Original Message -----
>>>> From: "Joey Echeverria"<jo...@cloudera.com>
>>>> To: common-user@hadoop.apache.org
>>>> Sent: Wednesday, December 7, 2011 6:07:58 AM
>>>> Subject: Re: HDFS Backup nodes
>>>>
>>>> You should also configure the Namenode to use an NFS mount for one of
> >>>> its storage directories. That will give the most up-to-date backup of
>>>> the metadata in case of total node failure.
>>>>
>>>> -Joey
>>>>
>>>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<pr...@gmail.com>
>>>>  wrote:
>>>>
> >>>>> This means we are still relying on the Secondary NameNode ideology for
> >>>>> the Namenode's backup.
> >>>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all the
> >>>>> time?
>>>>>
>>>>> Thanks,
>>>>> Praveenesh
>>>>>
>>>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<
>>>>> maheswara@huawei.com>wrote:
>>>>>
>>>>>  AFAIK backup node introduced in 0.21 version onwards.
>>>>>> ______________________________**__________
>>>>>> From: praveenesh kumar [praveenesh@gmail.com]
>>>>>> Sent: Wednesday, December 07, 2011 12:40 PM
>>>>>> To: common-user@hadoop.apache.org
>>>>>> Subject: HDFS Backup nodes
>>>>>>
> >>>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>>>>>
>>>>>> Thanks,
>>>>>> Praveenesh
>>>>>>
>>>>>>
>>>>
>>>>
>>>> --
>>>> Joseph Echeverria
>>>> Cloudera, Inc.
>>>> 443.305.9434
>>>>
>>>
>>>
>>>
>>>
>>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: HDFS Backup nodes

Posted by Todd Lipcon <to...@cloudera.com>.
On Wed, Dec 14, 2011 at 10:00 AM, Scott Carey <sc...@richrelevance.com> wrote:
>>As of today, there is no option except to use NFS.  And as you yourself
>>mention, the first HA prototype when it comes out will require NFS.
>
> How will it 'require' NFS?  Won't any 'remote, high availability storage'
> work?  NFS is unreliable, in my experience, unless:
> ...
> A solution with a brief 'stall' in service while a SAN mount switches over,
> or similar with DRBD, should be possible and data-safe. If this is being
> built to truly 'require' NFS, that is no better for me than the current
> situation, which we manage using OS-level tools for failover that will
> temporarily break clients but resume availability quickly thereafter.
> Where I would like the most help from Hadoop is in making the failover
> transparent to clients, not in solving the reliable storage problem or
> failover scenarios that storage and OS vendors already handle.

Currently our requirement is that we can have two client machines
"mount" the storage, though only one needs to have it mounted rw at a
time. This is certainly doable with DRBD in conjunction with a
clustered filesystem like GFS2. I believe Dhruba was doing some
experimentation with an approach like this.

It's not currently provided for, but it wouldn't be very difficult to
extend the design so that the standby didn't even need read access
until the failover event. It would just cause a longer failover period
since the standby would have more edits to "catch up" with, etc. I
don't think anyone's currently working on this, but if you wanted to
contribute I can point you in the right direction. If you happen to be
at the SF HUG tonight, grab me and I'll give you the rundown on what
would be needed.

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera

Re: HDFS Backup nodes

Posted by Konstantin Boudnik <co...@apache.org>.
On Wed, Dec 14, 2011 at 10:09AM, Scott Carey wrote:
> 
> On 12/13/11 11:28 PM, "Konstantin Boudnik" <co...@apache.org> wrote:
> 
> >On Tue, Dec 13, 2011 at 11:00PM, M. C. Srivas wrote:
> >> Suresh,
> >> 
> >> As of today, there is no option except to use NFS.  And as you yourself
> >> mention, the first HA prototype when it comes out will require NFS.
> >
> >
> >NFS just happens to be readily available in any data center and doesn't
> >require much of the extra investment on top of what exists.
> 
> That is a false assumption.  I'm not buying a NetApp filer just for this.
>  We have no NFS, nor do we want any.  If we ever use it, it won't be in the data
> center with Hadoop!

It isn't a false assumption, it is a reasonable one based on experience.
You don't need a NetApp for NFS, you can have a Thumper or whatever. I am not
saying NFS is the only or the best - all I said is that it is pretty common ;) I
would opt for a BK or Jini-Spaces-like solution any day, though.

Cos


Re: HDFS Backup nodes

Posted by Scott Carey <sc...@richrelevance.com>.

On 12/13/11 11:28 PM, "Konstantin Boudnik" <co...@apache.org> wrote:

>On Tue, Dec 13, 2011 at 11:00PM, M. C. Srivas wrote:
>> Suresh,
>> 
>> As of today, there is no option except to use NFS.  And as you yourself
>> mention, the first HA prototype when it comes out will require NFS.
>
>
>NFS just happens to be readily available in any data center and doesn't
>require much of the extra investment on top of what exists.

That is a false assumption.  I'm not buying a NetApp filer just for this.
 We have no NFS, nor do we want any.  If we ever use it, it won't be in the data
center with Hadoop!



Re: HDFS Backup nodes

Posted by Konstantin Boudnik <co...@apache.org>.
On Tue, Dec 13, 2011 at 11:00PM, M. C. Srivas wrote:
> Suresh,
> 
> As of today, there is no option except to use NFS.  And as you yourself
> mention, the first HA prototype when it comes out will require NFS.

Well, in the interest of full disclosure, NFS is just one of the options and
not the only one. Any auxiliary storage will do. Distributed in-memory
redundant storage for sub-second failover? Sure, GigaSpaces has done this for
years using the very mature Jini.

NFS just happens to be readily available in any data center and doesn't
require much of the extra investment on top of what exists. NFS comes with its
own set of problems of course. First and foremost is No-File-Security which
requires use of something like Kerberos for third-party user management. And
when paired with something like LinuxTaskController it can produce some very
interesting effects.

Cos

> (a) I wasn't aware that Bookkeeper had progressed that far. I wonder
> whether it would be able to keep up with the data rates that are required in
> order to hold the NN log without falling behind.
> 
> (b) I do know Karthik Ranga at FB just started a design to put the NN data
> in HDFS itself, but that is in very preliminary design stages with no real
> code there.
> 
> The problem is that the HA code written with NFS in mind is very different
> from the HA code written with HDFS in mind, which are both quite different
> from the code that is written with Bookkeeper in mind. Essentially the
> three options will form three different implementations, since the failure
> modes of each of the back-ends are different. Am I totally off base?
> 
> thanks,
> Srivas.
> 
> 
> 
> 
> On Tue, Dec 13, 2011 at 11:00 AM, Suresh Srinivas <su...@hortonworks.com>wrote:
> 
> > Srivas,
> >
> > As you may know already, NFS is just being used in the first prototype for
> > HA.
> >
> > Two options for editlog store are:
> > 1. Using BookKeeper. Work has already completed on trunk towards this. This
> > will replace the need for NFS to store the editlogs and is highly available.
> > This solution will also be used for HA.
> > 2. We also have a short-term goal to enable editlogs going to HDFS itself.
> > The work is in progress.
> >
> > Regards,
> > Suresh
> >
> >
> > >
> > > ---------- Forwarded message ----------
> > > From: M. C. Srivas <mc...@gmail.com>
> > > Date: Sun, Dec 11, 2011 at 10:47 PM
> > > Subject: Re: HDFS Backup nodes
> > > To: common-user@hadoop.apache.org
> > >
> > >
> > > You are out of luck if you don't want to use NFS, and yet want redundancy
> > > for the NN.  Even the new "NN HA" work being done by the community will
> > > require NFS ... and the NFS itself needs to be HA.
> > >
> > > But if you use a Netapp, then the likelihood of the Netapp crashing is
> > > lower than the likelihood of a garbage-collection-of-death happening in
> > the
> > > NN.
> > >
> > > [ disclaimer:  I don't work for Netapp, I work for MapR ]
> > >
> > >
> > > On Wed, Dec 7, 2011 at 4:30 PM, randy <ra...@comcast.net> wrote:
> > >
> > > > Thanks Joey. We've had enough problems with nfs (mainly under very high
> > > > load) that we thought it might be riskier to use it for the NN.
> > > >
> > > > randy
> > > >
> > > >
> > > > On 12/07/2011 06:46 PM, Joey Echeverria wrote:
> > > >
> > > >> Hey Rand,
> > > >>
> > > >> It will mark that storage directory as failed and ignore it from then
> > > >> on. In order to do this correctly, you need a couple of options
> > > >> enabled on the NFS mount to make sure that it doesn't retry
> > > >> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
> > > >> options set.
> > > >>
> > > >> -Joey
> > > >>
> > > >> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
> > > >>
> > > >>> What happens then if the nfs server fails or isn't reachable? Does
> > hdfs
> > > >>> lock up? Does it gracefully ignore the nfs copy?
> > > >>>
> > > >>> Thanks,
> > > >>> randy
> > > >>>
> > > >>> ----- Original Message -----
> > > >>> From: "Joey Echeverria"<jo...@cloudera.com>
> > > >>> To: common-user@hadoop.apache.org
> > > >>> Sent: Wednesday, December 7, 2011 6:07:58 AM
> > > >>> Subject: Re: HDFS Backup nodes
> > > >>>
> > > >>> You should also configure the Namenode to use an NFS mount for one of
> > >>> its storage directories. That will give the most up-to-date backup of
> > > >>> the metadata in case of total node failure.
> > > >>>
> > > >>> -Joey
> > > >>>
> > > >>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<
> > praveenesh@gmail.com>
> > > >>>  wrote:
> > > >>>
> > >>>> This means we are still relying on the Secondary NameNode ideology for
> > >>>> the Namenode's backup.
> > >>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all
> > > the
> > >>>> time?
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Praveenesh
> > > >>>>
> > > >>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<
> > > >>>> maheswara@huawei.com>wrote:
> > > >>>>
> > > >>>>  AFAIK the backup node was introduced from version 0.21 onwards.
> > > >>>>> ________________________________________
> > > >>>>> From: praveenesh kumar [praveenesh@gmail.com]
> > > >>>>> Sent: Wednesday, December 07, 2011 12:40 PM
> > > >>>>> To: common-user@hadoop.apache.org
> > > >>>>> Subject: HDFS Backup nodes
> > > >>>>>
> > > >>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Praveenesh
> > > >>>>>
> > > >>>>>
> > > >>>
> > > >>>
> > > >>> --
> > > >>> Joseph Echeverria
> > > >>> Cloudera, Inc.
> > > >>> 443.305.9434
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >
> > >
> > >
> >

Re: HDFS Backup nodes

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, Dec 13, 2011 at 11:00 PM, M. C. Srivas <mc...@gmail.com> wrote:
> (a) I wasn't aware that Bookkeeper had progressed that far. I wonder
> whether it would be able to keep up with the data rates that are required in
> order to hold the NN log without falling behind.

It's a good question - but one for which data is readily available.
Reading from Flavio Junqueira's slides from the Hadoop In China
conference a few weeks ago, he can maintain ~50k TPS with <20ms
latency, with 128-byte transactions. Given that HDFS does batch
multiple transactions per commit (standard group commit techniques), we
might imagine 4KB transactions, where it looks like about 5K TPS,
equating to around 20MB/sec throughput. These transaction rates should
be plenty for the edit logging use case in my experience.
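(As a sanity check on that arithmetic: 5,000 tx/sec x 4 KB/tx = 20,000
KB/sec, i.e. roughly 20MB/sec; at the other data point, 50,000 tx/sec x
128 bytes/tx comes to about 6.4MB/sec.)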

>
> (b) I do know Karthik Ranga at FB just started a design to put the NN data
> in HDFS itself, but that is in very preliminary design stages with no real
> code there.

Agreed. But it's not particularly complex either ... things can move
from "preliminary design" to working code in short timelines.

>
> The problem is that the HA code written with NFS in mind is very different
> from the HA code written with HDFS in mind, which are both quite different
> from the code that is written with Bookkeeper in mind. Essentially the
> three options will form three different implementations, since the failure
> modes of each of the back-ends are different. Am I totally off base?

Actually since the beginning of the HA project we have been keeping in
mind that NFS is only a step along the way. The shared edits storage
only has to have the following very basic operations:
- write and append to files ("log segments")
- read from closed files
- fence another writer (which can also be implemented with STONITH)

As I understand it, BK supports all of the above and in fact the BK
team has a working prototype of journal storage in BK. The interface
is already made pluggable as of last month. So this is not far-off
brainstorming but rather a very real implementation that's coming very
soon to stable releases.
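To illustrate how small that contract is, the three operations amount to
roughly the following interface. This is a hypothetical sketch with invented
names, not the actual pluggable-journal API:

  import java.io.IOException;
  import java.io.InputStream;
  import java.io.OutputStream;

  // Hypothetical shape of a shared-edits backend; names are invented.
  interface SharedEditsStorage {
      // write and append to a log segment starting at the given transaction
      OutputStream startLogSegment(long firstTxId) throws IOException;
      // read back a closed (finalized) segment
      InputStream readFinalizedSegment(long firstTxId) throws IOException;
      // fence any other writer (could instead be done with STONITH)
      void fenceOtherWriters() throws IOException;
  }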

-Todd

> On Tue, Dec 13, 2011 at 11:00 AM, Suresh Srinivas <su...@hortonworks.com>wrote:
>
>> Srivas,
>>
>> As you may know already, NFS is just being used in the first prototype for
>> HA.
>>
>> Two options for editlog store are:
>> 1. Using BookKeeper. Work has already completed on trunk towards this. This
>> will replace the need for NFS to store the editlogs and is highly available.
>> This solution will also be used for HA.
>> 2. We also have a short-term goal to enable editlogs going to HDFS itself.
>> The work is in progress.
>>
>> Regards,
>> Suresh
>>
>>
>> >
>> > ---------- Forwarded message ----------
>> > From: M. C. Srivas <mc...@gmail.com>
>> > Date: Sun, Dec 11, 2011 at 10:47 PM
>> > Subject: Re: HDFS Backup nodes
>> > To: common-user@hadoop.apache.org
>> >
>> >
>> > You are out of luck if you don't want to use NFS, and yet want redundancy
>> > for the NN.  Even the new "NN HA" work being done by the community will
>> > require NFS ... and the NFS itself needs to be HA.
>> >
>> > But if you use a Netapp, then the likelihood of the Netapp crashing is
>> > lower than the likelihood of a garbage-collection-of-death happening in
>> the
>> > NN.
>> >
>> > [ disclaimer:  I don't work for Netapp, I work for MapR ]
>> >
>> >
>> > On Wed, Dec 7, 2011 at 4:30 PM, randy <ra...@comcast.net> wrote:
>> >
>> > > Thanks Joey. We've had enough problems with nfs (mainly under very high
>> > > load) that we thought it might be riskier to use it for the NN.
>> > >
>> > > randy
>> > >
>> > >
>> > > On 12/07/2011 06:46 PM, Joey Echeverria wrote:
>> > >
>> > >> Hey Rand,
>> > >>
>> > >> It will mark that storage directory as failed and ignore it from then
>> > >> on. In order to do this correctly, you need a couple of options
>> > >> enabled on the NFS mount to make sure that it doesn't retry
>> > >> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
>> > >> options set.
>> > >>
>> > >> -Joey
>> > >>
>> > >> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
>> > >>
>> > >>> What happens then if the nfs server fails or isn't reachable? Does
>> hdfs
>> > >>> lock up? Does it gracefully ignore the nfs copy?
>> > >>>
>> > >>> Thanks,
>> > >>> randy
>> > >>>
>> > >>> ----- Original Message -----
>> > >>> From: "Joey Echeverria"<jo...@cloudera.com>
>> > >>> To: common-user@hadoop.apache.org
>> > >>> Sent: Wednesday, December 7, 2011 6:07:58 AM
>> > >>> Subject: Re: HDFS Backup nodes
>> > >>>
>> > >>> You should also configure the Namenode to use an NFS mount for one of
>> > >>> its storage directories. That will give the most up-to-date backup of
>> > >>> the metadata in case of total node failure.
>> > >>>
>> > >>> -Joey
>> > >>>
>> > >>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<
>> praveenesh@gmail.com>
>> > >>>  wrote:
>> > >>>
>> > >>>> This means we are still relying on the Secondary NameNode ideology for
>> > >>>> the Namenode's backup.
>> > >>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all
>> > the
>> > >>>> time?
>> > >>>>
>> > >>>> Thanks,
>> > >>>> Praveenesh
>> > >>>>
>> > >>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<
>> > >>>> maheswara@huawei.com>wrote:
>> > >>>>
>> > >>>>  AFAIK the backup node was introduced from version 0.21 onwards.
>> > >>>>> ________________________________________
>> > >>>>> From: praveenesh kumar [praveenesh@gmail.com]
>> > >>>>> Sent: Wednesday, December 07, 2011 12:40 PM
>> > >>>>> To: common-user@hadoop.apache.org
>> > >>>>> Subject: HDFS Backup nodes
>> > >>>>>
>> > >>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>> Praveenesh
>> > >>>>>
>> > >>>>>
>> > >>>
>> > >>>
>> > >>> --
>> > >>> Joseph Echeverria
>> > >>> Cloudera, Inc.
>> > >>> 443.305.9434
>> > >>>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >
>> >
>> >
>>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: HDFS Backup nodes

Posted by Scott Carey <sc...@richrelevance.com>.

On 12/13/11 11:00 PM, "M. C. Srivas" <mc...@gmail.com> wrote:

>Suresh,
>
>As of today, there is no option except to use NFS.  And as you yourself
>mention, the first HA prototype when it comes out will require NFS.

How will it 'require' NFS?  Won't any 'remote, high availability storage'
work?  NFS is unreliable, in my experience, unless:
* It's a NetApp
* It's based on Solaris
(caveat: I have only used 5 NFS solution types over the last decade, and
the issues are not data integrity, rather availability from a client
perspective)


A solution with a brief 'stall' in service while a SAN mount switches over,
or similar with DRBD, should be possible and data-safe. If this is being
built to truly 'require' NFS, that is no better for me than the current
situation, which we manage using OS-level tools for failover that will
temporarily break clients but resume availability quickly thereafter.
Where I would like the most help from Hadoop is in making the failover
transparent to clients, not in solving the reliable storage problem or
failover scenarios that storage and OS vendors already handle.

>
>(a) I wasn't aware that Bookkeeper had progressed that far. I wonder
>whether it would be able to keep up with the data rates that are required
>in
>order to hold the NN log without falling behind.
>
>(b) I do know Karthik Ranga at FB just started a design to put the NN data
>in HDFS itself, but that is in very preliminary design stages with no real
>code there.
>
>The problem is that the HA code written with NFS in mind is very different
>from the HA code written with HDFS in mind, which are both quite different
>from the code that is written with Bookkeeper in mind. Essentially the
>three options will form three different implementations, since the failure
>modes of each of the back-ends are different. Am I totally off base?
>
>thanks,
>Srivas.
>
>
>
>
>On Tue, Dec 13, 2011 at 11:00 AM, Suresh Srinivas
><su...@hortonworks.com>wrote:
>
>> Srivas,
>>
>> As you may know already, NFS is just being used in the first prototype
>>for
>> HA.
>>
>> Two options for editlog store are:
>> 1. Using BookKeeper. Work has already completed on trunk towards this.
>>This
>> will replace the need for NFS to store the editlogs and is highly
>>available.
>> This solution will also be used for HA.
>> 2. We also have a short-term goal to enable editlogs going to HDFS
>>itself.
>> The work is in progress.
>>
>> Regards,
>> Suresh
>>
>>
>> >
>> > ---------- Forwarded message ----------
>> > From: M. C. Srivas <mc...@gmail.com>
>> > Date: Sun, Dec 11, 2011 at 10:47 PM
>> > Subject: Re: HDFS Backup nodes
>> > To: common-user@hadoop.apache.org
>> >
>> >
>> > You are out of luck if you don't want to use NFS, and yet want
>>redundancy
>> > for the NN.  Even the new "NN HA" work being done by the community
>>will
>> > require NFS ... and the NFS itself needs to be HA.
>> >
>> > But if you use a Netapp, then the likelihood of the Netapp crashing is
>> > lower than the likelihood of a garbage-collection-of-death happening
>>in
>> the
>> > NN.
>> >
>> > [ disclaimer:  I don't work for Netapp, I work for MapR ]
>> >
>> >
>> > On Wed, Dec 7, 2011 at 4:30 PM, randy <ra...@comcast.net> wrote:
>> >
>> > > Thanks Joey. We've had enough problems with nfs (mainly under very
>>high
>> > > load) that we thought it might be riskier to use it for the NN.
>> > >
>> > > randy
>> > >
>> > >
>> > > On 12/07/2011 06:46 PM, Joey Echeverria wrote:
>> > >
>> > >> Hey Rand,
>> > >>
>> > >> It will mark that storage directory as failed and ignore it from
>>then
>> > >> on. In order to do this correctly, you need a couple of options
>> > >> enabled on the NFS mount to make sure that it doesn't retry
>> > >> infinitely. I usually run with the
>>tcp,soft,intr,timeo=10,retrans=10
>> > >> options set.
>> > >>
>> > >> -Joey
>> > >>
>> > >> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
>> > >>
>> > >>> What happens then if the nfs server fails or isn't reachable? Does
>> hdfs
>> > >>> lock up? Does it gracefully ignore the nfs copy?
>> > >>>
>> > >>> Thanks,
>> > >>> randy
>> > >>>
>> > >>> ----- Original Message -----
>> > >>> From: "Joey Echeverria"<jo...@cloudera.com>
>> > >>> To: common-user@hadoop.apache.org
>> > >>> Sent: Wednesday, December 7, 2011 6:07:58 AM
>> > >>> Subject: Re: HDFS Backup nodes
>> > >>>
>> > >>> You should also configure the Namenode to use an NFS mount for
>>one of
>> > >>> its storage directories. That will give the most up-to-date backup
>>of
>> > >>> the metadata in case of total node failure.
>> > >>>
>> > >>> -Joey
>> > >>>
>> > >>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<
>> praveenesh@gmail.com>
>> > >>>  wrote:
>> > >>>
>> > >>>> This means we are still relying on the Secondary NameNode ideology
>>for
>> > >>>> the Namenode's backup.
>> > >>>> Is OS-mirroring of the Namenode a good alternative to keep it alive
>>all
>> > the
>> > >>>> time?
>> > >>>>
>> > >>>> Thanks,
>> > >>>> Praveenesh
>> > >>>>
>> > >>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<
>> > >>>> maheswara@huawei.com>wrote:
>> > >>>>
>> > >>>>  AFAIK the backup node was introduced from version 0.21 onwards.
>> > >>>>> ________________________________________
>> > >>>>> From: praveenesh kumar [praveenesh@gmail.com]
>> > >>>>> Sent: Wednesday, December 07, 2011 12:40 PM
>> > >>>>> To: common-user@hadoop.apache.org
>> > >>>>> Subject: HDFS Backup nodes
>> > >>>>>
>> > >>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>> Praveenesh
>> > >>>>>
>> > >>>>>
>> > >>>
>> > >>>
>> > >>> --
>> > >>> Joseph Echeverria
>> > >>> Cloudera, Inc.
>> > >>> 443.305.9434
>> > >>>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >
>> >
>> >
>>


Re: HDFS Backup nodes

Posted by "M. C. Srivas" <mc...@gmail.com>.
Suresh,

As of today, there is no option except to use NFS.  And as you yourself
mention, the first HA prototype when it comes out will require NFS.

(a) I wasn't aware that Bookkeeper had progressed that far. I wonder
whether it would be able to keep up with the data rates that are required in
order to hold the NN log without falling behind.

(b) I do know Karthik Ranga at FB just started a design to put the NN data
in HDFS itself, but that is in very preliminary design stages with no real
code there.

The problem is that the HA code written with NFS in mind is very different
from the HA code written with HDFS in mind, which are both quite different
from the code that is written with Bookkeeper in mind. Essentially the
three options will form three different implementations, since the failure
modes of each of the back-ends are different. Am I totally off base?

thanks,
Srivas.




On Tue, Dec 13, 2011 at 11:00 AM, Suresh Srinivas <su...@hortonworks.com>wrote:

> Srivas,
>
> As you may know already, NFS is just being used in the first prototype for
> HA.
>
> Two options for editlog store are:
> 1. Using BookKeeper. Work has already completed on trunk towards this. This
> will replace the need for NFS to store the editlogs and is highly available.
> This solution will also be used for HA.
> 2. We also have a short-term goal to enable editlogs going to HDFS itself.
> The work is in progress.
>
> Regards,
> Suresh
>
>
> >
> > ---------- Forwarded message ----------
> > From: M. C. Srivas <mc...@gmail.com>
> > Date: Sun, Dec 11, 2011 at 10:47 PM
> > Subject: Re: HDFS Backup nodes
> > To: common-user@hadoop.apache.org
> >
> >
> > You are out of luck if you don't want to use NFS, and yet want redundancy
> > for the NN.  Even the new "NN HA" work being done by the community will
> > require NFS ... and the NFS itself needs to be HA.
> >
> > But if you use a Netapp, then the likelihood of the Netapp crashing is
> > lower than the likelihood of a garbage-collection-of-death happening in
> the
> > NN.
> >
> > [ disclaimer:  I don't work for Netapp, I work for MapR ]
> >
> >
> > On Wed, Dec 7, 2011 at 4:30 PM, randy <ra...@comcast.net> wrote:
> >
> > > Thanks Joey. We've had enough problems with nfs (mainly under very high
> > > load) that we thought it might be riskier to use it for the NN.
> > >
> > > randy
> > >
> > >
> > > On 12/07/2011 06:46 PM, Joey Echeverria wrote:
> > >
> > >> Hey Rand,
> > >>
> > >> It will mark that storage directory as failed and ignore it from then
> > >> on. In order to do this correctly, you need a couple of options
> > >> enabled on the NFS mount to make sure that it doesn't retry
> > >> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
> > >> options set.
> > >>
> > >> -Joey
> > >>
> > >> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
> > >>
> > >>> What happens then if the nfs server fails or isn't reachable? Does
> hdfs
> > >>> lock up? Does it gracefully ignore the nfs copy?
> > >>>
> > >>> Thanks,
> > >>> randy
> > >>>
> > >>> ----- Original Message -----
> > >>> From: "Joey Echeverria"<jo...@cloudera.com>
> > >>> To: common-user@hadoop.apache.org
> > >>> Sent: Wednesday, December 7, 2011 6:07:58 AM
> > >>> Subject: Re: HDFS Backup nodes
> > >>>
> > >>> You should also configure the Namenode to use an NFS mount for one of
> > >>> its storage directories. That will give the most up-to-date backup of
> > >>> the metadata in case of total node failure.
> > >>>
> > >>> -Joey
> > >>>
> > >>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<
> praveenesh@gmail.com>
> > >>>  wrote:
> > >>>
> > >>>> This means we are still relying on the Secondary NameNode ideology for
> > >>>> the Namenode's backup.
> > >>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all
> > the
> > >>>> time?
> > >>>>
> > >>>> Thanks,
> > >>>> Praveenesh
> > >>>>
> > >>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<
> > >>>> maheswara@huawei.com>wrote:
> > >>>>
> > >>>>  AFAIK the backup node was introduced from version 0.21 onwards.
> > >>>>> ________________________________________
> > >>>>> From: praveenesh kumar [praveenesh@gmail.com]
> > >>>>> Sent: Wednesday, December 07, 2011 12:40 PM
> > >>>>> To: common-user@hadoop.apache.org
> > >>>>> Subject: HDFS Backup nodes
> > >>>>>
> > >>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Praveenesh
> > >>>>>
> > >>>>>
> > >>>
> > >>>
> > >>> --
> > >>> Joseph Echeverria
> > >>> Cloudera, Inc.
> > >>> 443.305.9434
> > >>>
> > >>
> > >>
> > >>
> > >>
> > >
> >
> >
>

Re: HDFS Backup nodes

Posted by Suresh Srinivas <su...@hortonworks.com>.
Srivas,

As you may know already, NFS is just being used in the first prototype for
HA.

Two options for editlog store are:
1. Using BookKeeper. Work has already completed on trunk towards this. This
will replace the need for NFS to store the editlogs and is highly available.
This solution will also be used for HA.
2. We also have a short-term goal to enable editlogs going to HDFS itself.
The work is in progress.

Regards,
Suresh


>
> ---------- Forwarded message ----------
> From: M. C. Srivas <mc...@gmail.com>
> Date: Sun, Dec 11, 2011 at 10:47 PM
> Subject: Re: HDFS Backup nodes
> To: common-user@hadoop.apache.org
>
>
> You are out of luck if you don't want to use NFS, and yet want redundancy
> for the NN.  Even the new "NN HA" work being done by the community will
> require NFS ... and the NFS itself needs to be HA.
>
> But if you use a Netapp, then the likelihood of the Netapp crashing is
> lower than the likelihood of a garbage-collection-of-death happening in the
> NN.
>
> [ disclaimer:  I don't work for Netapp, I work for MapR ]
>
>
> On Wed, Dec 7, 2011 at 4:30 PM, randy <ra...@comcast.net> wrote:
>
> > Thanks Joey. We've had enough problems with nfs (mainly under very high
> > load) that we thought it might be riskier to use it for the NN.
> >
> > randy
> >
> >
> > On 12/07/2011 06:46 PM, Joey Echeverria wrote:
> >
> >> Hey Rand,
> >>
> >> It will mark that storage directory as failed and ignore it from then
> >> on. In order to do this correctly, you need a couple of options
> >> enabled on the NFS mount to make sure that it doesn't retry
> >> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
> >> options set.
> >>
> >> -Joey
> >>
> >> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
> >>
> >>> What happens then if the nfs server fails or isn't reachable? Does hdfs
> >>> lock up? Does it gracefully ignore the nfs copy?
> >>>
> >>> Thanks,
> >>> randy
> >>>
> >>> ----- Original Message -----
> >>> From: "Joey Echeverria"<jo...@cloudera.com>
> >>> To: common-user@hadoop.apache.org
> >>> Sent: Wednesday, December 7, 2011 6:07:58 AM
> >>> Subject: Re: HDFS Backup nodes
> >>>
> >>> You should also configure the Namenode to use an NFS mount for one of
> >>> its storage directories. That will give the most up-to-date backup of
> >>> the metadata in case of total node failure.
> >>>
> >>> -Joey
> >>>
> >>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<pr...@gmail.com>
> >>>  wrote:
> >>>
> >>>> This means we are still relying on the Secondary NameNode ideology for
> >>>> the Namenode's backup.
> >>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all
> the
> >>>> time?
> >>>>
> >>>> Thanks,
> >>>> Praveenesh
> >>>>
> >>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<
> >>>> maheswara@huawei.com>wrote:
> >>>>
> >>>>  AFAIK the backup node was introduced from version 0.21 onwards.
> >>>>> ________________________________________
> >>>>> From: praveenesh kumar [praveenesh@gmail.com]
> >>>>> Sent: Wednesday, December 07, 2011 12:40 PM
> >>>>> To: common-user@hadoop.apache.org
> >>>>> Subject: HDFS Backup nodes
> >>>>>
> >>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
> >>>>>
> >>>>> Thanks,
> >>>>> Praveenesh
> >>>>>
> >>>>>
> >>>
> >>>
> >>> --
> >>> Joseph Echeverria
> >>> Cloudera, Inc.
> >>> 443.305.9434
> >>>
> >>
> >>
> >>
> >>
> >
>
>

Re: HDFS Backup nodes

Posted by "M. C. Srivas" <mc...@gmail.com>.
You are out of luck if you don't want to use NFS, and yet want redundancy
for the NN.  Even the new "NN HA" work being done by the community will
require NFS ... and the NFS itself needs to be HA.

But if you use a Netapp, then the likelihood of the Netapp crashing is
lower than the likelihood of a garbage-collection-of-death happening in the
NN.

[ disclaimer:  I don't work for Netapp, I work for MapR ]


On Wed, Dec 7, 2011 at 4:30 PM, randy <ra...@comcast.net> wrote:

> Thanks Joey. We've had enough problems with nfs (mainly under very high
> load) that we thought it might be riskier to use it for the NN.
>
> randy
>
>
> On 12/07/2011 06:46 PM, Joey Echeverria wrote:
>
>> Hey Rand,
>>
>> It will mark that storage directory as failed and ignore it from then
>> on. In order to do this correctly, you need a couple of options
>> enabled on the NFS mount to make sure that it doesn't retry
>> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
>> options set.
>>
>> -Joey
>>
>> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
>>
>>> What happens then if the nfs server fails or isn't reachable? Does hdfs
>>> lock up? Does it gracefully ignore the nfs copy?
>>>
>>> Thanks,
>>> randy
>>>
>>> ----- Original Message -----
>>> From: "Joey Echeverria"<jo...@cloudera.com>
>>> To: common-user@hadoop.apache.org
>>> Sent: Wednesday, December 7, 2011 6:07:58 AM
>>> Subject: Re: HDFS Backup nodes
>>>
>>> You should also configure the Namenode to use an NFS mount for one of
>>> its storage directories. That will give the most up-to-date backup of
>>> the metadata in case of total node failure.
>>>
>>> -Joey
>>>
>>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<pr...@gmail.com>
>>>  wrote:
>>>
>>>> This means we are still relying on the Secondary NameNode ideology for
>>>> the Namenode's backup.
>>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all the
>>>> time?
>>>>
>>>> Thanks,
>>>> Praveenesh
>>>>
>>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<
>>>> maheswara@huawei.com>wrote:
>>>>
>>>>  AFAIK the backup node was introduced from version 0.21 onwards.
>>>>> ________________________________________
>>>>> From: praveenesh kumar [praveenesh@gmail.com]
>>>>> Sent: Wednesday, December 07, 2011 12:40 PM
>>>>> To: common-user@hadoop.apache.org
>>>>> Subject: HDFS Backup nodes
>>>>>
>>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>>>>
>>>>> Thanks,
>>>>> Praveenesh
>>>>>
>>>>>
>>>
>>>
>>> --
>>> Joseph Echeverria
>>> Cloudera, Inc.
>>> 443.305.9434
>>>
>>
>>
>>
>>
>

Re: HDFS Backup nodes

Posted by Harsh J <ha...@cloudera.com>.
Randy,

On recent releases (CDH3u2 here for example), you also have
"dfs.name.dir.restore", a boolean flag that will automatically try to
enable previously failed name directories upon every checkpoint if
possible. Hence if you have an SNN running, and your NFS failed at some
point and got marked as FAILED on your NN web UI, if the NFS is back
up again before the next checkpoint interval, it will be auto-restored
after the NN deems it to be in a writable state again.
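In hdfs-site.xml that is a single boolean property (a sketch, using the
flag named above):

  <property>
    <name>dfs.name.dir.restore</name>
    <!-- retry previously failed name directories at each checkpoint -->
    <value>true</value>
  </property>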

On Thu, Dec 8, 2011 at 6:00 AM, randy <ra...@comcast.net> wrote:
> Thanks Joey. We've had enough problems with nfs (mainly under very high
> load) that we thought it might be riskier to use it for the NN.
>
> randy
>
>
> On 12/07/2011 06:46 PM, Joey Echeverria wrote:
>>
>> Hey Rand,
>>
>> It will mark that storage directory as failed and ignore it from then
>> on. In order to do this correctly, you need a couple of options
>> enabled on the NFS mount to make sure that it doesn't retry
>> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
>> options set.
>>
>> -Joey
>>
>> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
>>>
>>> What happens then if the nfs server fails or isn't reachable? Does hdfs
>>> lock up? Does it gracefully ignore the nfs copy?
>>>
>>> Thanks,
>>> randy
>>>
>>> ----- Original Message -----
>>> From: "Joey Echeverria"<jo...@cloudera.com>
>>> To: common-user@hadoop.apache.org
>>> Sent: Wednesday, December 7, 2011 6:07:58 AM
>>> Subject: Re: HDFS Backup nodes
>>>
>>> You should also configure the Namenode to use an NFS mount for one of
>>> its storage directories. That will give the most up-to-date backup of
>>> the metadata in case of total node failure.
>>>
>>> -Joey
>>>
>>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<pr...@gmail.com>
>>>  wrote:
>>>>
>>>> This means we are still relying on the Secondary NameNode ideology for
>>>> the Namenode's backup.
>>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all the
>>>> time?
>>>>
>>>> Thanks,
>>>> Praveenesh
>>>>
>>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao
>>>> G<ma...@huawei.com>wrote:
>>>>
>>>>> AFAIK the backup node was introduced from version 0.21 onwards.
>>>>> ________________________________________
>>>>> From: praveenesh kumar [praveenesh@gmail.com]
>>>>> Sent: Wednesday, December 07, 2011 12:40 PM
>>>>> To: common-user@hadoop.apache.org
>>>>> Subject: HDFS Backup nodes
>>>>>
>>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>>>>
>>>>> Thanks,
>>>>> Praveenesh
>>>>>
>>>
>>>
>>>
>>> --
>>> Joseph Echeverria
>>> Cloudera, Inc.
>>> 443.305.9434
>>
>>
>>
>>
>



-- 
Harsh J

Re: HDFS Backup nodes

Posted by randy <ra...@comcast.net>.
Thanks Joey. We've had enough problems with nfs (mainly under very high 
load) that we thought it might be riskier to use it for the NN.

randy

On 12/07/2011 06:46 PM, Joey Echeverria wrote:
> Hey Rand,
>
> It will mark that storage directory as failed and ignore it from then
> on. In order to do this correctly, you need a couple of options
> enabled on the NFS mount to make sure that it doesn't retry
> infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
> options set.
>
> -Joey
>
> On Wed, Dec 7, 2011 at 12:37 PM,<ra...@comcast.net>  wrote:
>> What happens then if the nfs server fails or isn't reachable? Does hdfs lock up? Does it gracefully ignore the nfs copy?
>>
>> Thanks,
>> randy
>>
>> ----- Original Message -----
>> From: "Joey Echeverria"<jo...@cloudera.com>
>> To: common-user@hadoop.apache.org
>> Sent: Wednesday, December 7, 2011 6:07:58 AM
>> Subject: Re: HDFS Backup nodes
>>
>> You should also configure the Namenode to use an NFS mount for one of
>> its storage directories. That will give the most up-to-date backup of
>> the metadata in case of total node failure.
>>
>> -Joey
>>
>> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar<pr...@gmail.com>  wrote:
>>> This means we are still relying on the Secondary NameNode ideology for
>>> the Namenode's backup.
>>> Is OS-mirroring of the Namenode a good alternative to keep it alive all the
>>> time?
>>>
>>> Thanks,
>>> Praveenesh
>>>
>>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G<ma...@huawei.com>wrote:
>>>
>>>> AFAIK the backup node was introduced from version 0.21 onwards.
>>>> ________________________________________
>>>> From: praveenesh kumar [praveenesh@gmail.com]
>>>> Sent: Wednesday, December 07, 2011 12:40 PM
>>>> To: common-user@hadoop.apache.org
>>>> Subject: HDFS Backup nodes
>>>>
>>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>>>
>>>> Thanks,
>>>> Praveenesh
>>>>
>>
>>
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434
>
>
>


Re: HDFS Backup nodes

Posted by Joey Echeverria <jo...@cloudera.com>.
Hey Rand,

It will mark that storage directory as failed and ignore it from then
on. In order to do this correctly, you need a couple of options
enabled on the NFS mount to make sure that it doesn't retry
infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10
options set.
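As a concrete sketch, an fstab entry using those options could look like
the following (the filer host, export and mount point are placeholders):

  # /etc/fstab - illustrative NFS mount for one NN storage directory
  filer:/export/nn-meta  /mnt/nn-meta  nfs  tcp,soft,intr,timeo=10,retrans=10  0 0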

-Joey

On Wed, Dec 7, 2011 at 12:37 PM,  <ra...@comcast.net> wrote:
> What happens then if the nfs server fails or isn't reachable? Does hdfs lock up? Does it gracefully ignore the nfs copy?
>
> Thanks,
> randy
>
> ----- Original Message -----
> From: "Joey Echeverria" <jo...@cloudera.com>
> To: common-user@hadoop.apache.org
> Sent: Wednesday, December 7, 2011 6:07:58 AM
> Subject: Re: HDFS Backup nodes
>
> You should also configure the Namenode to use an NFS mount for one of
> its storage directories. That will give the most up-to-date backup of
> the metadata in case of total node failure.
>
> -Joey
>
> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar <pr...@gmail.com> wrote:
>> This means we are still relying on the Secondary NameNode ideology for
>> the Namenode's backup.
>> Is OS-mirroring of the Namenode a good alternative to keep it alive all the
>> time?
>>
>> Thanks,
>> Praveenesh
>>
>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G <ma...@huawei.com>wrote:
>>
>>> AFAIK the backup node was introduced from version 0.21 onwards.
>>> ________________________________________
>>> From: praveenesh kumar [praveenesh@gmail.com]
>>> Sent: Wednesday, December 07, 2011 12:40 PM
>>> To: common-user@hadoop.apache.org
>>> Subject: HDFS Backup nodes
>>>
>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>>
>>> Thanks,
>>> Praveenesh
>>>
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434

RE: HDFS Backup nodes

Posted by Jorn Argelo - Ephorus <Jo...@ephorus.com>.
Hi Koji,

This was on CDH3u1. For the record, I had the dfs.name.dir.restore flag which
Harsh mentioned enabled as well.

Jorn

-----Original Message-----
From: Koji Noguchi [mailto:knoguchi@yahoo-inc.com] 
Sent: Wednesday, December 7, 2011 5:59 PM
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

Hi Jorn, 

Which hadoop version were you using when you hit that issue?

Koji


On 12/7/11 5:25 AM, "Jorn Argelo - Ephorus" <Jo...@ephorus.com>
wrote:

> Just to add to that note - we've run into an issue where the NFS share
> was out of sync (the namenode storage failed even though the NFS share
> was working), but the other local metadata was fine. At the restart of
> the namenode it picked the NFS share's fsimage even if it was out of
> sync. This had the effect that loads of blocks were marked as invalid
> and deleted by the datanodes, and the namenode never came out of safe
> mode because it was missing blocks. The Hadoop documentation says it
> always picks the most recent version of the fsimage but in my case
this
> doesn't seem to have happened. Maybe a bug? With that said I've been
> having issues with NFS before (the NFS namenode storage always failed
> every hour even if the cluster was idle).
> 
> Now since this was just test data it wasn't all that important ... but
> if that happened with your production cluster you've got yourself a
> problem. I've moved away from NFS and I'm using DRBD instead. Not
having
> any problems anymore whatsoever.
> 
> YMMV.
> 
> Jorn
> 
> -----Original Message-----
> From: Joey Echeverria [mailto:joey@cloudera.com]
> Sent: Wednesday, December 7, 2011 12:08 PM
> To: common-user@hadoop.apache.org
> Subject: Re: HDFS Backup nodes
> 
> You should also configure the Namenode to use an NFS mount for one of
> it's storage directories. That will give the most up-to-date back of
> the metadata in case of total node failure.
> 
> -Joey
> 
> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar
<pr...@gmail.com>
> wrote:
>> This means we are still relying on the Secondary NameNode ideology for
>> the Namenode's backup.
>> Is OS-mirroring of the Namenode a good alternative to keep it alive all
> the
>> time?
>> 
>> Thanks,
>> Praveenesh
>> 
>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G
> <ma...@huawei.com>wrote:
>> 
>>> AFAIK the backup node was introduced from version 0.21 onwards.
>>> ________________________________________
>>> From: praveenesh kumar [praveenesh@gmail.com]
>>> Sent: Wednesday, December 07, 2011 12:40 PM
>>> To: common-user@hadoop.apache.org
>>> Subject: HDFS Backup nodes
>>> 
>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>> 
>>> Thanks,
>>> Praveenesh
>>> 
> 
> 


Re: HDFS Backup nodes

Posted by Koji Noguchi <kn...@yahoo-inc.com>.
Hi Jorn, 

Which hadoop version were you using when you hit that issue?

Koji


On 12/7/11 5:25 AM, "Jorn Argelo - Ephorus" <Jo...@ephorus.com> wrote:

> Just to add to that note - we've run into an issue where the NFS share
> was out of sync (the namenode storage failed even though the NFS share
> was working), but the other local metadata was fine. At the restart of
> the namenode it picked the NFS share's fsimage even if it was out of
> sync. This had the effect that loads of blocks were marked as invalid
> and deleted by the datanodes, and the namenode never came out of safe
> mode because it was missing blocks. The Hadoop documentation says it
> always picks the most recent version of the fsimage but in my case this
> doesn't seem to have happened. Maybe a bug? With that said I've been
> having issues with NFS before (the NFS namenode storage always failed
> every hour even if the cluster was idle).
> 
> Now since this was just test data it wasn't all that important ... but
> if that happened with your production cluster you've got yourself a
> problem. I've moved away from NFS and I'm using DRBD instead. Not having
> any problems anymore whatsoever.
> 
> YMMV.
> 
> Jorn
> 
> -----Original Message-----
> From: Joey Echeverria [mailto:joey@cloudera.com]
> Sent: Wednesday, December 7, 2011 12:08 PM
> To: common-user@hadoop.apache.org
> Subject: Re: HDFS Backup nodes
> 
> You should also configure the Namenode to use an NFS mount for one of
> it's storage directories. That will give the most up-to-date back of
> the metadata in case of total node failure.
> 
> -Joey
> 
> On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar <pr...@gmail.com>
> wrote:
>> This means we are still relying on the Secondary NameNode ideology for
>> the Namenode's backup.
>> Is OS-mirroring of the Namenode a good alternative to keep it alive all
> the
>> time?
>> 
>> Thanks,
>> Praveenesh
>> 
>> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G
> <ma...@huawei.com>wrote:
>> 
>>> AFAIK the backup node was introduced from version 0.21 onwards.
>>> ________________________________________
>>> From: praveenesh kumar [praveenesh@gmail.com]
>>> Sent: Wednesday, December 07, 2011 12:40 PM
>>> To: common-user@hadoop.apache.org
>>> Subject: HDFS Backup nodes
>>> 
>>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>> 
>>> Thanks,
>>> Praveenesh
>>> 
> 
> 


RE: HDFS Backup nodes

Posted by Jorn Argelo - Ephorus <Jo...@ephorus.com>.
Just to add to that note - we've run into an issue where the NFS share
was out of sync (the namenode storage failed even though the NFS share
was working), but the other local metadata was fine. At the restart of
the namenode it picked the NFS share's fsimage even if it was out of
sync. This had the effect that loads of blocks were marked as invalid
and deleted by the datanodes, and the namenode never came out of safe
mode because it was missing blocks. The Hadoop documentation says it
always picks the most recent version of the fsimage but in my case this
doesn't seem to have happened. Maybe a bug? With that said I've been
having issues with NFS before (the NFS namenode storage always failed
every hour even if the cluster was idle).

Now since this was just test data it wasn't all that important ... but
if that happened with your production cluster you've got yourself a
problem. I've moved away from NFS and I'm using DRBD instead. Not having
any problems anymore whatsoever.
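For anyone curious, a minimal DRBD resource for mirroring the NN metadata
volume looks roughly like this (a sketch only; hostnames, devices and
addresses are placeholders):

  # /etc/drbd.d/nn-meta.res - illustrative only
  resource nn-meta {
    protocol C;            # synchronous replication, no acknowledged write lost
    on nn-primary {
      device    /dev/drbd0;
      disk      /dev/sdb1;
      address   10.0.0.1:7789;
      meta-disk internal;
    }
    on nn-standby {
      device    /dev/drbd0;
      disk      /dev/sdb1;
      address   10.0.0.2:7789;
      meta-disk internal;
    }
  }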

YMMV.

Jorn

-----Original Message-----
From: Joey Echeverria [mailto:joey@cloudera.com] 
Sent: Wednesday, December 7, 2011 12:08 PM
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

You should also configure the Namenode to use an NFS mount for one of
its storage directories. That will give the most up-to-date backup of
the metadata in case of total node failure.

-Joey

On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar <pr...@gmail.com>
wrote:
> This means we are still relying on the Secondary NameNode ideology for
> the Namenode's backup.
> Is OS-mirroring of the Namenode a good alternative to keep it alive all
the
> time?
>
> Thanks,
> Praveenesh
>
> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G
<ma...@huawei.com>wrote:
>
>> AFAIK the backup node was introduced from version 0.21 onwards.
>> ________________________________________
>> From: praveenesh kumar [praveenesh@gmail.com]
>> Sent: Wednesday, December 07, 2011 12:40 PM
>> To: common-user@hadoop.apache.org
>> Subject: HDFS Backup nodes
>>
>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>
>> Thanks,
>> Praveenesh
>>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434

Re: HDFS Backup nodes

Posted by ra...@comcast.net.
What happens then if the nfs server fails or isn't reachable? Does hdfs lock up? Does it gracefully ignore the nfs copy?

Thanks,
randy

----- Original Message -----
From: "Joey Echeverria" <jo...@cloudera.com>
To: common-user@hadoop.apache.org
Sent: Wednesday, December 7, 2011 6:07:58 AM
Subject: Re: HDFS Backup nodes

You should also configure the Namenode to use an NFS mount for one of
its storage directories. That will give the most up-to-date backup of
the metadata in case of total node failure.

-Joey

On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar <pr...@gmail.com> wrote:
> This means we are still relying on the Secondary NameNode ideology for
> the Namenode's backup.
> Is OS-mirroring of the Namenode a good alternative to keep it alive all the
> time?
>
> Thanks,
> Praveenesh
>
> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G <ma...@huawei.com>wrote:
>
>> AFAIK the backup node was introduced from version 0.21 onwards.
>> ________________________________________
>> From: praveenesh kumar [praveenesh@gmail.com]
>> Sent: Wednesday, December 07, 2011 12:40 PM
>> To: common-user@hadoop.apache.org
>> Subject: HDFS Backup nodes
>>
>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>
>> Thanks,
>> Praveenesh
>>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434

Re: HDFS Backup nodes

Posted by Joey Echeverria <jo...@cloudera.com>.
You should also configure the Namenode to use an NFS mount for one of
its storage directories. That will give the most up-to-date backup of
the metadata in case of total node failure.
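Concretely, that means listing the NFS mount alongside the local directory
in the comma-separated dfs.name.dir value in hdfs-site.xml (paths here are
illustrative):

  <property>
    <name>dfs.name.dir</name>
    <!-- the NN writes its image and edits to every directory listed -->
    <value>/data/1/dfs/nn,/mnt/nn-meta/dfs/nn</value>
  </property>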

-Joey

On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar <pr...@gmail.com> wrote:
> This means we are still relying on the Secondary NameNode ideology for
> the Namenode's backup.
> Is OS-mirroring of the Namenode a good alternative to keep it alive all the
> time?
>
> Thanks,
> Praveenesh
>
> On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G <ma...@huawei.com>wrote:
>
>> AFAIK the backup node was introduced from version 0.21 onwards.
>> ________________________________________
>> From: praveenesh kumar [praveenesh@gmail.com]
>> Sent: Wednesday, December 07, 2011 12:40 PM
>> To: common-user@hadoop.apache.org
>> Subject: HDFS Backup nodes
>>
>> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>>
>> Thanks,
>> Praveenesh
>>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434

RE: HDFS Backup nodes

Posted by Sagar Shukla <sa...@persistent.co.in>.
Yes ... if you are looking for high uptime, then keeping the Namenode OS-mirror always running would be the best way to go.

We might need to explore the capabilities of the HDFS backup node further to see how it can be utilized.

Thanks,
Sagar

-----Original Message-----
From: praveenesh kumar [mailto:praveenesh@gmail.com] 
Sent: Wednesday, December 07, 2011 1:47 PM
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

This means we are still relying on the Secondary NameNode ideology for the Namenode's backup.
Is OS-mirroring of the Namenode a good alternative to keep it alive all the time?

Thanks,
Praveenesh

On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G <ma...@huawei.com>wrote:

> AFAIK the backup node was introduced from version 0.21 onwards.
> ________________________________________
> From: praveenesh kumar [praveenesh@gmail.com]
> Sent: Wednesday, December 07, 2011 12:40 PM
> To: common-user@hadoop.apache.org
> Subject: HDFS Backup nodes
>
> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>
> Thanks,
> Praveenesh
>


Re: HDFS Backup nodes

Posted by praveenesh kumar <pr...@gmail.com>.
This means we are still relying on the Secondary NameNode ideology for
the Namenode's backup.
Is OS-mirroring of the Namenode a good alternative to keep it alive all the
time?

Thanks,
Praveenesh

On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G <ma...@huawei.com>wrote:

> AFAIK the backup node was introduced from version 0.21 onwards.
> ________________________________________
> From: praveenesh kumar [praveenesh@gmail.com]
> Sent: Wednesday, December 07, 2011 12:40 PM
> To: common-user@hadoop.apache.org
> Subject: HDFS Backup nodes
>
> Does hadoop 0.20.205 support configuring HDFS backup nodes?
>
> Thanks,
> Praveenesh
>

RE: HDFS Backup nodes

Posted by Uma Maheswara Rao G <ma...@huawei.com>.
AFAIK the backup node was introduced from version 0.21 onwards.
________________________________________
From: praveenesh kumar [praveenesh@gmail.com]
Sent: Wednesday, December 07, 2011 12:40 PM
To: common-user@hadoop.apache.org
Subject: HDFS Backup nodes

Does hadoop 0.20.205 support configuring HDFS backup nodes?

Thanks,
Praveenesh
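For reference, on 0.21+ the backup node gets its own addresses in
hdfs-site.xml and is started with a dedicated startup option; a sketch
(host and ports are illustrative, check the docs for your release):

  <property>
    <name>dfs.namenode.backup.address</name>
    <value>backup-host:50100</value>
  </property>
  <property>
    <name>dfs.namenode.backup.http-address</name>
    <value>backup-host:50105</value>
  </property>

Then, on the backup machine:

  bin/hdfs namenode -backup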