Posted to common-user@hadoop.apache.org by Stas Oskin <st...@gmail.com> on 2009/10/21 16:43:35 UTC

Secondary NameNodes or NFS exports?

Hi.

I want to keep checkpoint data on several separate machines for backup,
and am deliberating between exporting those machines' disks via NFS or
actually running Secondary NameNodes there.

Can anyone advise which would be better in my case?

Regards.
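
(For reference, the two setups being weighed correspond roughly to the
classic HDFS settings sketched below. This is a minimal sketch assuming
the 0.18/0.19-era property names; the paths are hypothetical:)

  <!-- hadoop-site.xml, a minimal sketch; paths are hypothetical -->

  <!-- Option 1: have the NameNode itself also write its image and edit
       log to an NFS mount; dfs.name.dir takes a comma-separated list
       and replicates the name table into every listed directory -->
  <property>
    <name>dfs.name.dir</name>
    <value>/local/hadoop/dfs/name,/mnt/nfs-backup/dfs/name</value>
  </property>

  <!-- Option 2: run a Secondary NameNode on the backup machine and
       point its checkpoint directory at a local disk there -->
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/local/hadoop/dfs/namesecondary</value>
  </property>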

Re: Secondary NameNodes or NFS exports?

Posted by Jason Venner <ja...@gmail.com>.
In my test case, the checkpoints take a few seconds or less.


Re: Secondary NameNodes or NFS exports?

Posted by Todd Lipcon <to...@cloudera.com>.
How long does the checkpoint take? It seems possible to me that if the
2NN checkpoint takes longer than the checkpoint interval, multiple
checkpoints will overlap and might trigger this. (This is conjecture,
so definitely worth testing.)

-Todd


Re: Secondary NameNodes or NFS exports?

Posted by Jason Venner <ja...@gmail.com>.
I agree, it seems very wrong; that is why I need a block of time to
really verify the behavior.

My test case is the following, and the same test fails on 18.3, 19.0,
and 19.1:

Set up a single-node cluster: 1 namenode, 1 datanode, 1 secondary, all
on the same machine.
Set the checkpoint interval to 2 minutes (120 seconds).

Make a few files, wait, and verify that a checkpoint can happen.

Recursively start copying a deep tree into HDFS, and watch the
checkpoint fail with a timestamp error.

The code explicitly uses edits.new for the checkpoint verification
timestamp.

The window is the time from when the edit log is rolled to when the
merged fsimage is returned.
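
(For anyone reproducing this: a minimal sketch of how the 2-minute
checkpoint interval would be set, assuming the 0.18/0.19-era property
names; fs.checkpoint.size is shown for completeness:)

  <!-- hadoop-site.xml on the single test machine -->
  <property>
    <name>fs.checkpoint.period</name>
    <value>120</value>  <!-- checkpoint every 2 minutes (in seconds) -->
  </property>
  <property>
    <name>fs.checkpoint.size</name>
    <value>67108864</value>  <!-- also checkpoint once the edit log
                                  reaches this size in bytes -->
  </property>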


Re: Secondary NameNodes or NFS exports?

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Jason,

This analysis seems fairly unlikely - are you claiming that no edits can be merged if files are being created?  Isn't this what edits.new is for?

We roll the edit log successfully during periods of high transfer, when a new file is being created every second or so.

We have had issues with unmergeable edits before - there might be some race conditions in this area.

Brian


Re: Secondary NameNodes or NFS exports?

Posted by Jason Venner <ja...@gmail.com>.
I have no current solution.
When I can block out a few days, I am going to instrument the code a
bit more to verify my understanding.

I believe the issue is that the timestamp is being checked against the
active edit log (the new one created when the checkpoint started)
rather than the timestamp of the rolled (old) edit log.
As long as no transactions have hit, the timestamps are the same.



Jobs stop at 0%

Posted by Raymond Jennings III <ra...@yahoo.com>.
I have recently been seeing a problem where jobs stop at map 0% that previously worked fine (with no code changes). Restarting Hadoop on the cluster solves the problem, but there is nothing in the log files to indicate what is wrong. Has anyone seen something similar?

Re: Secondary NameNodes or NFS exports?

Posted by Stas Oskin <st...@gmail.com>.
Hi.

What was your solution to this then?

Regards.


Re: Secondary NameNodes or NFS exports?

Posted by Jason Venner <ja...@gmail.com>.
I have dug into this more; it turns out the problem is unrelated to
NFS or Solaris.
The issue is that if there is a metadata change while the secondary is
rebuilding the fsimage, the rebuilt image is rejected.
On our production cluster, there is almost never a moment where there
is not a file being created or altered, and as such the secondary can
never make a fresh fsimage for the cluster.

I have checked this with several Hadoop variants and with vanilla
distributions, with the namenode, secondary, and a datanode all
running on the same machine.


Re: Secondary NameNodes or NFS exports?

Posted by Stas Oskin <st...@gmail.com>.
Hi.

You mean, you couldn't recover the NameNode from checkpoints because of
timestamps?

Regards.


Re: Secondary NameNodes or NFS exports?

Posted by Jason Venner <ja...@gmail.com>.
We have been having some trouble with the secondary on a cluster that
has one edit log partition on an NFS server, with the namenode
rejecting the merged images due to timestamp mismatches.



Re: Secondary NameNodes or NFS exports?

Posted by Stas Oskin <st...@gmail.com>.
Hi.

Thanks for the advice; it seems that the initial approach of having a
single SecNameNode writing to the exports is the way to go.

By the way, I asked this already, but wanted to clarify:

* Is it possible to set how often the SecNameNode checkpoints the data
(and what is the setting, by the way)?

* It's possible to let the NameNode write to the exports as well,
together with the local disk, which ensures the latest possible
metadata in case of a disk crash (compared to periodic checkpointing),
but it's going to slow down operations due to network reads/writes.

Thanks again.
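
(A minimal sketch of both knobs asked about here, assuming the classic
property names; the values and the NFS path are illustrative:)

  <!-- how often the secondary checkpoints: fs.checkpoint.period,
       in seconds (default 3600) -->
  <property>
    <name>fs.checkpoint.period</name>
    <value>3600</value>
  </property>

  <!-- NameNode writing to an NFS export alongside local disk:
       dfs.name.dir takes a comma-separated list and writes to every
       listed directory, hence the possible slowdown over the network -->
  <property>
    <name>dfs.name.dir</name>
    <value>/local/dfs/name,/mnt/nfs/dfs/name</value>
  </property>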


Re: Secondary NameNodes or NFS exports?

Posted by Patrick Angeles <pa...@gmail.com>.
From what I understand, it's rather tricky to set up multiple secondary
namenodes. In either case, running multiple 2ndary NNs doesn't get you much.
See this thread:
http://www.mail-archive.com/core-user@hadoop.apache.org/msg06280.html


Re: Secondary NameNodes or NFS exports?

Posted by Stas Oskin <st...@gmail.com>.
To clarify: the choice is either to let a single SecNameNode write to
multiple NFS exports, or to actually have multiple SecNameNodes.

Thanks again.
