Posted to common-user@hadoop.apache.org by Ankur Sethi <as...@i411.com> on 2007/07/14 20:53:54 UTC
New user question
I am about to attempt setting up a Hadoop filesystem for an application.
HDFS has a single point of failure: the namenode. Can you explain the
steps necessary for bringing HDFS back up in case of a namenode failure?
Before asking this question I went through these pages:
http://wiki.apache.org/lucene-hadoop-data/attachments/HadoopPresentations/attachments/HDFSDescription.pdf
and http://lucene.apache.org/hadoop/hdfs_design.html
These describe the overall architecture and the fact that one can have
secondary namenodes.
Let's say the machine just died.
From the documentation: "The Namenode machine is a single point of failure
for an HDFS cluster. If the Namenode machine fails, manual intervention is
necessary. Currently, automatic restart and failover of the Namenode software
to another machine is not supported."
So what is this manual intervention? I am confused about this. All the
nodes have a configuration file with the master namenode set, so one
should bring up a machine with the same name/IP address.
Then what? Can one bring up the new machine, start a namenode server, and
have it repopulate on its own?
Sorry if this has been asked before. I did research on the mailing list and
the FAQ page and the documentation before asking this.
Thanks,
Ankur
RE: New user question
Posted by Ankur Sethi <as...@i411.com>.
I see that no one has responded, so I take this as confirmation that the
data of the entire cluster is lost when the namenode data is lost.
I suppose we will have a secondary namenode as a backup, but we can see
that Hadoop has a long way to go. Wow, it looks like the folks at Google
have put a lot of hard work into this.
Ankur
-----Original Message-----
From: amalagaura@gmail.com [mailto:amalagaura@gmail.com] On Behalf Of Ankur
Sethi
Sent: Saturday, July 14, 2007 6:51 PM
To: hadoop-user@lucene.apache.org
Subject: Re: New user question
In case the namenode data is lost the data of the entire cluster is lost?
On 7/14/07, Raghu Angadi <ra...@yahoo-inc.com> wrote:
>
>
> You can specify multiple directories for Namenode data, in which case
> the image is written to all the directories. You can also use an NFS
> mount, RAID, or a similar approach.
>
> Raghu.
>
> Ankur Sethi wrote:
> > Thank you for the information.
> >
> > I want to consider a worst-case scenario where the namenode fails. So
> > you are suggesting copying the dfs.name.dir directory; we can take
> > regular backups of this? Shouldn't HDFS be truly fault tolerant in this
> > regard? If you have 500 machines, shouldn't it replicate the essential
> > data in case of failure?
> >
> > The Google File System replicates server-critical information as well.
> >
> > Let's say one did not have the dfs.name.dir backed up, what would
> happen?
> >
> > Thanks,
> > Ankur
>
Re: New user question
Posted by Raghu Angadi <ra...@yahoo-inc.com>.
You can specify multiple directories for Namenode data, in which case
the image is written to all the directories. You can also use an NFS
mount, RAID, or a similar approach.
Raghu.
Ankur Sethi wrote:
> Thank you for the information.
>
> I want to consider a worst-case scenario where the namenode fails. So you
> are suggesting copying the dfs.name.dir directory; we can take regular
> backups of this? Shouldn't HDFS be truly fault tolerant in this regard?
> If you have 500 machines, shouldn't it replicate the essential data in
> case of failure?
>
> The Google File System replicates server-critical information as well.
>
> Let's say one did not have the dfs.name.dir backed up, what would happen?
>
> Thanks,
> Ankur
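Raghu's suggestion above (multiple dfs.name.dir directories, one of them on
an NFS mount) can be sketched as a hadoop-site.xml fragment. The paths here
are hypothetical examples, not recommendations; /mnt/nfs/namenode stands in
for whatever NFS mount your cluster uses:

```xml
<!-- hadoop-site.xml: the namenode image is written to every listed dir -->
<property>
  <name>dfs.name.dir</name>
  <!-- comma-separated list; second path assumed to be an NFS mount -->
  <value>/data/dfs/name,/mnt/nfs/namenode/dfs/name</value>
</property>
```

If the local disk dies, the copy on the NFS mount can seed a replacement
namenode.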
Re: New user question
Posted by Ankur Sethi <an...@gmail.com>.
Thank you for the information.
I want to consider a worst-case scenario where the namenode fails. So you
are suggesting copying the dfs.name.dir directory; we can take regular
backups of this? Shouldn't HDFS be truly fault tolerant in this regard? If
you have 500 machines, shouldn't it replicate the essential data in case
of failure?
The Google File System replicates server-critical information as well.
Let's say one did not have dfs.name.dir backed up; what would happen?
Thanks,
Ankur
On 7/14/07, Raghu Angadi <ra...@yahoo-inc.com> wrote:
>
> Ankur Sethi wrote:
>
> > Then what? Can one bring up the new machine, start a namenode server,
> > and have it repopulate on its own?
>
> If you bring up the new Namenode with the same hostname and IP, then you
> don't need to restart the Datanodes. If the hostname changes, then you
> need to edit the configuration, distribute the configuration to the other
> nodes, and restart the whole cluster.
>
> Before bringing up the new Namenode, you need to copy the ${dfs.name.dir}
> directory from the original Namenode. By default, this is set to
> ${hadoop.tmp.dir}/dfs/name. This directory holds the filesystem image for
> the cluster.
>
> Raghu.
>
> > Sorry if this has been asked before. I did research on the mailing list
> and
> > the FAQ page and the documentation before asking this.
> >
> > Thanks,
> > Ankur
>
>
Re: New user question
Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Ankur Sethi wrote:
> Then what? Can one bring up the new machine, start a namenode server, and
> have it repopulate on its own?
If you bring up the new Namenode with the same hostname and IP, then you
don't need to restart the Datanodes. If the hostname changes, then you
need to edit the configuration, distribute the configuration to the other
nodes, and restart the whole cluster.
Before bringing up the new Namenode, you need to copy the ${dfs.name.dir}
directory from the original Namenode. By default, this is set to
${hadoop.tmp.dir}/dfs/name. This directory holds the filesystem image for
the cluster.
Raghu.
> Sorry if this has been asked before. I did research on the mailing list and
> the FAQ page and the documentation before asking this.
>
> Thanks,
> Ankur
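Raghu's recovery steps can be sketched as a shell session. Everything here
is illustrative: the /tmp paths are throwaway stand-ins for a real
dfs.name.dir backup, the fsimage/edits files are placeholders for the real
image contents, and the commented-out start command should be checked
against the bin/ scripts shipped with your Hadoop version.

```shell
#!/bin/sh
# Illustrative recovery of a namenode image from a backup copy.
# On a real cluster, NAME_DIR is dfs.name.dir on the replacement machine
# and BACKUP is your saved copy (e.g. from an NFS mount).
BACKUP=/tmp/demo-backup/dfs/name
NAME_DIR=/tmp/demo-new-namenode/dfs/name
rm -rf /tmp/demo-backup /tmp/demo-new-namenode   # clean slate for the demo

# Stand-in for an existing backup of the namenode image.
mkdir -p "$BACKUP"
touch "$BACKUP/fsimage" "$BACKUP/edits"

# 1. Restore the image directory onto the new namenode machine.
mkdir -p "$(dirname "$NAME_DIR")"
cp -r "$BACKUP" "$(dirname "$NAME_DIR")"

# 2. Start the namenode. With the same hostname/IP, the datanodes
#    reconnect without a full cluster restart (per Raghu's note above).
# bin/hadoop-daemon.sh start namenode
ls "$NAME_DIR"
```

Without any copy of dfs.name.dir to restore in step 1, the namespace (and
hence the cluster's data) is unrecoverable, which is why Raghu suggests
writing the image to multiple directories in the first place.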