Posted to common-user@hadoop.apache.org by Tomislav Poljak <tp...@gmail.com> on 2008/11/03 14:10:48 UTC

Re: SecondaryNameNode on separate machine

Hi,
Thank you all for your time and your answers!

Now SecondaryNameNode connects to the NameNode (after I configured
dfs.http.address to point to the NN's http server, i.e. the NN hostname on
port 50070) and creates (transfers) edits and fsimage from the NameNode.
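
For reference, the relevant part of hadoop-site.xml on the SNN now looks
roughly like this (the hostname is invented here; 50070 is the default http
port, and the RPC port is just the one we happen to use):

    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode.example.com:9000/</value>
    </property>
    <property>
      <name>dfs.http.address</name>
      <value>namenode.example.com:50070</value>
    </property>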

Can you explain to me a little bit more how NameNode failover should work
now?

For example, SecondaryNameNode now stores fsimage and edits in (SNN's)
dirX, and let's say the NameNode goes down (its disk becomes unreadable). Now I
create/dedicate a new machine for the NameNode (and also change DNS to point to
this new NameNode machine as the NameNode host) and take the data dir dirX from
the SNN and copy it to the new NameNode. How do I configure the new NameNode to
use the data from dirX (do I configure dfs.name.dir to point to dirX and start
the new NameNode)?

Thanks,
        Tomislav



On Fri, 2008-10-31 at 11:38 -0700, Konstantin Shvachko wrote:
> True, dfs.http.address is the NN Web UI address.
> This is where the NN http server runs. Besides the Web UI there is also
> a servlet running on that server which is used to transfer the image
> and edits from the NN to the secondary using http GET.
> So the SNN uses both addresses, fs.default.name and dfs.http.address.
> 
> When SNN finishes the checkpoint the primary needs to transfer the
> resulting image back. This is done via the http server running on SNN.
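
So if I read this right, the transfer is plain HTTP GET requests against the
getimage servlet on those addresses. The parameter names below are just what
shows up in the SNN's logs and stack traces, so treat them as an illustration
rather than a documented API:

    http://<dfs.http.address>/getimage?getimage=1    fetch fsimage from the NN
    http://<dfs.http.address>/getimage?getedit=1     fetch edits from the NN
    http://<dfs.http.address>/getimage?putimage=1&port=50090&machine=<snn-host>&token=<...>

The last one asks the NN to pull the merged image back from the SNN's own
http server, which I believe listens on 50090 by default.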
> 
> Answering Tomislav's question:
> The difference between fs.default.name and dfs.http.address is that
> fs.default.name is the name-node's RPC address, where clients and
> data-nodes connect, while dfs.http.address is the NN's http server
> address, where our browsers connect, but it is also used for
> transferring the image and edits files.
> 
> --Konstantin
> 
> Otis Gospodnetic wrote:
> > Konstantin & Co, please correct me if I'm wrong, but looking at hadoop-default.xml makes me think that dfs.http.address is only the URL for the NN *Web UI*.  In other words, this is where people go to look at the NN.
> > 
> > The secondary NN must then be using only the Primary NN URL specified in fs.default.name.  This URL looks like hdfs://name-node-hostname-here/.  Something in Hadoop then knows the exact port for the Primary NN based on the URI scheme (e.g. "hdfs://") in this URL.
> > 
> > Is this correct?
> > 
> > 
> > Thanks,
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
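
On the port question: if I understand it right, when the fs.default.name URI
has no explicit port, the name-node's RPC port defaults to 8020. For example:

    <property>
      <name>fs.default.name</name>
      <value>hdfs://name-node-hostname-here/</value>
      <!-- RPC port defaults to 8020 when omitted, if I remember right -->
    </property>

This is from memory, though, so double-check it against your version.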
> > 
> > 
> > 
> > ----- Original Message ----
> >> From: Tomislav Poljak <tp...@gmail.com>
> >> To: core-user@hadoop.apache.org
> >> Sent: Thursday, October 30, 2008 1:52:18 PM
> >> Subject: Re: SecondaryNameNode on separate machine
> >>
> >> Hi,
> >> can you, please, explain the difference between fs.default.name and
> >> dfs.http.address (like how and when SecondaryNameNode uses
> >> fs.default.name, and how/when dfs.http.address)? I have set them both to the
> >> same (namenode's) hostname:port. Is this correct (or does dfs.http.address
> >> need some other port)?
> >>
> >> Thanks,
> >>
> >> Tomislav
> >>
> >> On Wed, 2008-10-29 at 16:10 -0700, Konstantin Shvachko wrote:
> >>> SecondaryNameNode uses the http protocol to transfer the image and the edits
> >>> from the primary name-node and vice versa.
> >>> So the secondary does not access local files on the primary directly.
> >>> The primary NN should know the secondary's http address.
> >>> And the secondary NN needs to know both fs.default.name and dfs.http.address
> >>> of the primary.
> >>> In general we usually create one configuration file hadoop-site.xml
> >>> and copy it to all other machines. So you don't need to set up different
> >>> values for all servers.
> >>>
> >>> Regards,
> >>> --Konstantin
> >>>
> >>> Tomislav Poljak wrote:
> >>>> Hi,
> >>>> I'm not clear on how SecondaryNameNode communicates with the NameNode
> >>>> (if deployed on a separate machine). Does SecondaryNameNode use a direct
> >>>> connection (over some port and protocol), or is it enough for
> >>>> SecondaryNameNode to have access to the data which the NameNode writes
> >>>> locally on disk?
> >>>>
> >>>> Tomislav
> >>>>
> >>>> On Wed, 2008-10-29 at 09:08 -0400, Jean-Daniel Cryans wrote:
> >>>>> I think a lot of the confusion comes from this thread:
> >>>>> http://www.nabble.com/NameNode-failover-procedure-td11711842.html
> >>>>>
> >>>>> Particularly because the wiki was updated with wrong information, not
> >>>>> maliciously I'm sure. This information is now gone for good.
> >>>>>
> >>>>> Otis, your solution is pretty much like the one given by Dhruba Borthakur
> >>>>> and augmented by Konstantin Shvachko later in the thread but I never did it
> >>>>> myself.
> >>>>>
> >>>>> One thing should be clear though, the NN is and will remain a SPOF (just
> >>>>> like HBase's Master) as long as a distributed manager service (like
> >>>>> Zookeeper) is not plugged into Hadoop to help with failover.
> >>>>>
> >>>>> J-D
> >>>>>
> >>>>> On Wed, Oct 29, 2008 at 2:12 AM, Otis Gospodnetic <
> >>>>> otis_gospodnetic@yahoo.com> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> So what is the "recipe" for avoiding NN SPOF using only what comes with
> >>>>>> Hadoop?
> >>>>>>
> >>>>>> From what I can tell, I think one has to do the following two things:
> >>>>>>
> >>>>>> 1) configure primary NN to save namespace and xa logs to multiple dirs, one
> >>>>>> of which is actually on a remotely mounted disk, so that the data actually
> >>>>>> lives on a separate disk on a separate box.  This saves namespace and xa
> >>>>>> logs on multiple boxes in case of primary NN hardware failure.
> >>>>>>
> >>>>>> 2) configure secondary NN to periodically merge fsimage+edits and create
> >>>>>> the fsimage checkpoint.  This really is a second NN process running on
> >>>>>> another box.  It sounds like this secondary NN has to somehow have access to
> >>>>>> fsimage & edits files from the primary NN server.
> >>>>>>
> >>>>>> http://hadoop.apache.org/core/docs/r0.18.1/hdfs_user_guide.html#Secondary+NameNode
> >>>>>> does not describe the best practice around that - the recommended way to
> >>>>>> give secondary NN access to primary NN's fsimage and edits files.  Should
> >>>>>> one mount a disk from the primary NN box to the secondary NN box to get
> >>>>>> access to those files?  Or is there a simpler way?
> >>>>>> In any case, this checkpoint is just a merge of fsimage+edits files and
> >>>>>> again is there in case the box with the primary NN dies.  That's what's
> >>>>>> described on
> >>>>>> http://hadoop.apache.org/core/docs/r0.18.1/hdfs_user_guide.html#Secondary+NameNode
> >>>>>> more or less.
> >>>>>> Is this sufficient, or are there other things one has to do to eliminate NN
> >>>>>> SPOF?
> >>>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Otis
> >>>>>> --
> >>>>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
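
For what it's worth, I map the two steps above to hadoop-site.xml roughly as
follows (paths invented; 3600 is what I believe hadoop-default.xml uses for
fs.checkpoint.period):

    <!-- on the primary NN: one local dir plus a remotely mounted one -->
    <property>
      <name>dfs.name.dir</name>
      <value>/local/hadoop/name,/mnt/nn-backup/name</value>
    </property>

    <!-- on the secondary NN: where checkpoints land, and how often (seconds) -->
    <property>
      <name>fs.checkpoint.dir</name>
      <value>/local/hadoop/namesecondary</value>
    </property>
    <property>
      <name>fs.checkpoint.period</name>
      <value>3600</value>
    </property>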
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> ----- Original Message ----
> >>>>>>> From: Jean-Daniel Cryans 
> >>>>>>> To: core-user@hadoop.apache.org
> >>>>>>> Sent: Tuesday, October 28, 2008 8:14:44 PM
> >>>>>>> Subject: Re: SecondaryNameNode on separate machine
> >>>>>>>
> >>>>>>> Tomislav.
> >>>>>>>
> >>>>>>> Contrary to popular belief the secondary namenode does not provide failover,
> >>>>>>> it's only used to do what is described here:
> >>>>>>>
> >>>>>>> http://hadoop.apache.org/core/docs/r0.18.1/hdfs_user_guide.html#Secondary+NameNode
> >>>>>>> So the term "secondary" does not mean "a second one" but is more like "a
> >>>>>>> second part of".
> >>>>>>>
> >>>>>>> J-D
> >>>>>>>
> >>>>>>> On Tue, Oct 28, 2008 at 9:44 AM, Tomislav Poljak wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>> I'm trying to implement NameNode failover (or at least NameNode local
> >>>>>>>> data backup), but it is hard since there is no official documentation.
> >>>>>>>> Pages on this subject have been created, but are still empty:
> >>>>>>>>
> >>>>>>>> http://wiki.apache.org/hadoop/NameNodeFailover
> >>>>>>>> http://wiki.apache.org/hadoop/SecondaryNameNode
> >>>>>>>>
> >>>>>>>> I have been browsing the web and the hadoop mailing list to see how this
> >>>>>>>> should be implemented, but I got even more confused. People are asking
> >>>>>>>> whether we even need SecondaryNameNode etc. (since the NameNode can write
> >>>>>>>> local data to multiple locations, so one of those locations can be a
> >>>>>>>> mounted disk from another machine). I think I understand the motivation for
> >>>>>>>> SecondaryNameNode (to create a snapshot of NameNode data every n
> >>>>>>>> seconds/hours), but setting up (deploying and running) SecondaryNameNode on
> >>>>>>>> a different machine than the NameNode is not as trivial as I expected.
> >>>>>>>> First I found that if I need to run SecondaryNameNode on a machine other
> >>>>>>>> than the NameNode, I should change the masters file on the NameNode (change
> >>>>>>>> localhost to the SecondaryNameNode host) and set some properties in
> >>>>>>>> hadoop-site.xml on the SecondaryNameNode (fs.default.name,
> >>>>>>>> fs.checkpoint.dir, fs.checkpoint.period etc.)
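
Concretely, what I had at that point looked roughly like this (hostnames and
paths changed):

    conf/masters on the NameNode:

        snn.example.com

    hadoop-site.xml on the SecondaryNameNode:

        <property>
          <name>fs.default.name</name>
          <value>hdfs://namenode.example.com:9000/</value>
        </property>
        <property>
          <name>fs.checkpoint.dir</name>
          <value>/local/hadoop/namesecondary</value>
        </property>
        <property>
          <name>fs.checkpoint.period</name>
          <value>3600</value>
        </property>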
> >>>>>>>>
> >>>>>>>> This was enough to start SecondaryNameNode when starting the NameNode with
> >>>>>>>> bin/start-dfs.sh, but it didn't create an image on the SecondaryNameNode.
> >>>>>>>> Then I found that I need to set dfs.http.address to the NameNode address
> >>>>>>>> (so now I have the NameNode address in both fs.default.name and
> >>>>>>>> dfs.http.address).
> >>>>>>>>
> >>>>>>>> Now I get the following exception:
> >>>>>>>>
> >>>>>>>> 2008-10-28 09:18:00,098 ERROR NameNode.Secondary - Exception in
> >>>>>>>> doCheckpoint:
> >>>>>>>> 2008-10-28 09:18:00,098 ERROR NameNode.Secondary -
> >>>>>>>> java.net.SocketException: Unexpected end of file from server
> >>>>>>>>
> >>>>>>>> My questions are the following:
> >>>>>>>> How do I resolve this problem (this exception)?
> >>>>>>>> Do I need an additional property in SecondaryNameNode's hadoop-site.xml or
> >>>>>>>> NameNode's hadoop-site.xml?
> >>>>>>>>
> >>>>>>>> How should NameNode failover work ideally? Is it like this:
> >>>>>>>>
> >>>>>>>> SecondaryNameNode runs on a separate machine than the NameNode and stores
> >>>>>>>> the NameNode's data (fsimage and edits) locally in fs.checkpoint.dir.
> >>>>>>>> When the NameNode machine crashes, we start a NameNode on the machine where
> >>>>>>>> SecondaryNameNode was running and we set dfs.name.dir to
> >>>>>>>> fs.checkpoint.dir. Also we need to change how DNS resolves the NameNode
> >>>>>>>> hostname (change from the primary to the secondary).
> >>>>>>>>
> >>>>>>>> Is this correct?
> >>>>>>>>
> >>>>>>>> Tomislav
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>
> > 
> > 


Re: SecondaryNameNode on separate machine

Posted by yossale <yo...@gmail.com>.
>Now SecondaryNameNode connects to the NameNode (after I configured
>dfs.http.address to point to the NN's http server, i.e. the NN hostname on port 50070)
>and creates (transfers) edits and fsimage from the NameNode.

It didn't work for me - I get an error: 
java.io.FileNotFoundException:
http://192.168.30.5:50070/getimage?putimage=1&port=50090&machine=127.0.0.1&token=-16:1173009257:0:1226503705000:1226503705207
        at
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1168)
        at
org.apache.hadoop.dfs.TransferFsImage.getFileClient(TransferFsImage.java:150)
        at
org.apache.hadoop.dfs.SecondaryNameNode.putFSImage(SecondaryNameNode.java:271)
        at
org.apache.hadoop.dfs.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:311)
        at
org.apache.hadoop.dfs.SecondaryNameNode.run(SecondaryNameNode.java:216)
        at java.lang.Thread.run(Thread.java:595)

And when I run the http request directly (in the browser), I receive this:
GetImage failed. java.io.IOException: Namenode is not expecting an new image
UPLOAD_START
	at
org.apache.hadoop.dfs.FSImage.validateCheckpointUpload(FSImage.java:1193)
	at org.apache.hadoop.dfs.GetImageServlet.doGet(GetImageServlet.java:57)
        ...... 

If it is a mundane thing (i.e. "no need to checkpoint now"), why does it
throw an error? What is the "UPLOAD_START" at the end of the message? (If it
failed, how come it starts?) But more importantly - how do I get rid of
it?

Thanks!


-- 
View this message in context: http://www.nabble.com/SecondaryNameNode-on-separate-machine-tp20207482p20463349.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.


Re: SecondaryNameNode on separate machine

Posted by Tomislav Poljak <tp...@gmail.com>.
Konstantin,

it works, thanks a lot!

Tomislav


On Mon, 2008-11-03 at 11:13 -0800, Konstantin Shvachko wrote:
> You can either do what you just described with dfs.name.dir = dirX,
> or you can start the name-node with the -importCheckpoint option.
> This is an automation for copying image files from the secondary to the primary.
> 
> See here:
> http://hadoop.apache.org/core/docs/current/commands_manual.html#namenode
> http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Secondary+NameNode
> http://issues.apache.org/jira/browse/HADOOP-2585#action_12584755
> 
> --Konstantin
> 


Re: SecondaryNameNode on separate machine

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
You can either do what you just described with dfs.name.dir = dirX,
or you can start the name-node with the -importCheckpoint option.
This is an automation for copying image files from the secondary to the primary.
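
Roughly, the two options look like this; paths are invented, and please
verify the details against the docs below:

    # option 1: manual copy
    scp -r snn-host:/path/to/dirX /new-nn/path/to/dirX
    # then set dfs.name.dir=/new-nn/path/to/dirX in hadoop-site.xml
    # and start the name-node normally

    # option 2: -importCheckpoint
    # set fs.checkpoint.dir to the copied dirX and make sure dfs.name.dir
    # points at an empty directory, then:
    bin/hadoop namenode -importCheckpoint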

See here:
http://hadoop.apache.org/core/docs/current/commands_manual.html#namenode
http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Secondary+NameNode
http://issues.apache.org/jira/browse/HADOOP-2585#action_12584755

--Konstantin

Tomislav Poljak wrote:
> Hi,
> Thank you all for your time and your answers!
> 
> Now SecondaryNameNode connects to the NameNode (after I configured
> dfs.http.address to point to the NN's http server, i.e. the NN hostname on
> port 50070) and creates (transfers) edits and fsimage from the NameNode.
> 
> Can you explain to me a little bit more how NameNode failover should work
> now?
> 
> For example, SecondaryNameNode now stores fsimage and edits in (SNN's)
> dirX, and let's say the NameNode goes down (its disk becomes unreadable). Now I
> create/dedicate a new machine for the NameNode (and also change DNS to point to
> this new NameNode machine as the NameNode host) and take the data dir dirX from
> the SNN and copy it to the new NameNode. How do I configure the new NameNode to
> use the data from dirX (do I configure dfs.name.dir to point to dirX and start
> the new NameNode)?
> 
> Thanks,
>         Tomislav