Posted to common-user@hadoop.apache.org by jiang licht <li...@yahoo.com> on 2010/05/18 02:10:09 UTC

dfs.name.dir capacity for namenode backup?

I am considering using a second machine to keep a
redundant copy of the HDFS metadata by setting dfs.name.dir in hdfs-site.xml like this (as in YDN):

<property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
    <final>true</final>
</property>

where the two directories are on different machines, so that /mnt/namenode-backup keeps a copy of the HDFS file system metadata and its machine can take over as the namenode if the first machine fails.
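To be concrete, the failover I have in mind is roughly the following (the paths match the config above; the exact daemon command may differ by Hadoop version):

    # on the backup machine (adjust the source path to wherever the
    # exported copy actually lives locally), copy the latest metadata
    # into its own local name directory
    cp -r /mnt/namenode-backup/. /home/hadoop/dfs/name/

    # repoint clients and datanodes (fs.default.name or DNS) at this
    # machine, then start a namenode here with the same hdfs-site.xml
    $HADOOP_HOME/bin/hadoop-daemon.sh start namenode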

So, my question is: how much space will this HDFS metadata consume? I guess it is proportional to the HDFS capacity. What is that ratio, or what size would it be for a 150 TB HDFS?

Thanks,
Michael



Re: dfs.name.dir capacity for namenode backup?

Posted by Todd Lipcon <to...@cloudera.com>.
Yes, we recommend at least one local directory and one NFS directory for
dfs.name.dir in production environments. This allows an up-to-date recovery
of NN metadata if the NN should fail. In future versions the BackupNode
functionality will move us one step closer to not needing NFS for production
deployments.

Note that the NFS directory does not need to be anything fancy - you can
simply use an NFS mount on another normal Linux box.
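For example, something along these lines (the hostname and export path here are just placeholders):

    # on the namenode host, mount a directory exported by the other box
    mkdir -p /mnt/namenode-backup
    mount -t nfs backup-host:/export/namenode-backup /mnt/namenode-backup

and then keep /mnt/namenode-backup as the second entry in dfs.name.dir, exactly as in the config quoted below.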

-Todd

On Tue, May 18, 2010 at 11:19 AM, Andrew Nguyen <an...@ucsfcti.org> wrote:

> Sorry to hijack, but after following this thread I had a related question
> about the secondary location of dfs.name.dir.
>
> Is the approach outlined below the preferred/suggested way to do this?  Is
> this what people mean when they say, "stick it on NFS"?
>
> Thanks!
>
> On May 17, 2010, at 11:14 PM, Todd Lipcon wrote:
>
> > On Mon, May 17, 2010 at 5:10 PM, jiang licht <li...@yahoo.com>
> > wrote:
> >
> >> I am considering using a second machine to keep a
> >> redundant copy of the HDFS metadata by setting dfs.name.dir in
> >> hdfs-site.xml like this (as in YDN):
> >>
> >> <property>
> >>   <name>dfs.name.dir</name>
> >>   <value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
> >>   <final>true</final>
> >> </property>
> >>
> >> where the two directories are on different machines, so that
> >> /mnt/namenode-backup keeps a copy of the HDFS file system metadata and
> >> its machine can take over as the namenode if the first machine fails.
> >>
> >> So, my question is: how much space will this HDFS metadata consume? I
> >> guess it is proportional to the HDFS capacity. What is that ratio, or
> >> what size would it be for a 150 TB HDFS?
> >>
> >
> > On the order of a few GB, max (you really need double the size of your
> > image, so it has tmp space when downloading a checkpoint or performing an
> > upgrade). But on any disk you can buy these days you'll have plenty of
> > space.
> >
> > -Todd
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: dfs.name.dir capacity for namenode backup?

Posted by Andrew Nguyen <an...@ucsfcti.org>.
Sorry to hijack, but after following this thread I had a related question about the secondary location of dfs.name.dir.

Is the approach outlined below the preferred/suggested way to do this?  Is this what people mean when they say, "stick it on NFS"?

Thanks!

On May 17, 2010, at 11:14 PM, Todd Lipcon wrote:

> On Mon, May 17, 2010 at 5:10 PM, jiang licht <li...@yahoo.com> wrote:
> 
>> I am considering using a second machine to keep a
>> redundant copy of the HDFS metadata by setting dfs.name.dir in
>> hdfs-site.xml like this (as in YDN):
>> 
>> <property>
>>   <name>dfs.name.dir</name>
>>   <value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
>>   <final>true</final>
>> </property>
>> 
>> where the two directories are on different machines, so that
>> /mnt/namenode-backup keeps a copy of the HDFS file system metadata and its
>> machine can take over as the namenode if the first machine fails.
>> 
>> So, my question is: how much space will this HDFS metadata consume? I guess
>> it is proportional to the HDFS capacity. What is that ratio, or what size
>> would it be for a 150 TB HDFS?
>> 
> 
> On the order of a few GB, max (you really need double the size of your
> image, so it has tmp space when downloading a checkpoint or performing an
> upgrade). But on any disk you can buy these days you'll have plenty of
> space.
> 
> -Todd
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera


Re: dfs.name.dir capacity for namenode backup?

Posted by Todd Lipcon <to...@cloudera.com>.
On Mon, May 17, 2010 at 5:10 PM, jiang licht <li...@yahoo.com> wrote:

> I am considering using a second machine to keep a
> redundant copy of the HDFS metadata by setting dfs.name.dir in
> hdfs-site.xml like this (as in YDN):
>
> <property>
>    <name>dfs.name.dir</name>
>    <value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
>    <final>true</final>
> </property>
>
> where the two directories are on different machines, so that
> /mnt/namenode-backup keeps a copy of the HDFS file system metadata and its
> machine can take over as the namenode if the first machine fails.
>
> So, my question is: how much space will this HDFS metadata consume? I guess
> it is proportional to the HDFS capacity. What is that ratio, or what size
> would it be for a 150 TB HDFS?
>

On the order of a few GB, max (you really need double the size of your
image, so it has tmp space when downloading a checkpoint or performing an
upgrade). But on any disk you can buy these days you'll have plenty of
space.
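
If you want a concrete number for your own cluster, just check the on-disk size of an existing name directory, e.g.:

    du -sh /home/hadoop/dfs/name

The fsimage and edits files in there are what get mirrored to every directory listed in dfs.name.dir.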

-Todd


-- 
Todd Lipcon
Software Engineer, Cloudera