Posted to common-user@hadoop.apache.org by Rakhi Khatwani <ra...@gmail.com> on 2009/05/14 15:03:26 UTC

Setting up another machine as secondary node

Hi,
     I want to set up a cluster of 5 nodes in such a way that
node1 - master
node2 - secondary namenode
node3 - slave
node4 - slave
node5 - slave


How do we go about that?
There is no property in hadoop-env where I can set the IP address for the
secondary namenode.

If I set node-1 and node-2 in masters, then when we start dfs, the namenode
and secondary namenode processes are present on both machines, but I think
only node1 is active,
and my namenode failover operation fails.

Any suggestions?

Regards,
Rakhi

Re: Setting up another machine as secondary node

Posted by Billy Pearson <sa...@pearsonwholesale.com>.
The secondary namenode is not a fail-over for the namenode.
http://wiki.apache.org/hadoop/FAQ#7

Billy




"Rakhi Khatwani" <ra...@gmail.com> 
wrote in message 
news:384813770905180134h1a503cc5k123a885aebc4ba28@mail.gmail.com...
> Hi,
>    I successfully set up the secondary name node thing.
> but i am having issues whn i perform the failover.
>
> i have 5 nodes
> node a - master
> node b - slave
> node c - slave
> node d - slave
> node e - secondary name node
>
> Following are my steps:
>
> 1. configuration is as follows:
> for all the nodes:
> conf/master: node e
> conf/slaves: node b
>                   node c
>                   node d
> conf/hadoop-site: default-name: node a,
>                          job-tracker: node a
>
> 2. ./start-dfs
>
> 3. added a couple of files in hadoop fs.
> 4. kill the namenode
>
> 5. changed the following properties for hadoop-site for node b, node c, 
> node
> d, node e
>                         default-name: node e,
>                          job-tracker: node e.
>
>
>
> TRIAL1:
> 6 .copied the name dir from node a to node e
>
> 7 .executed the following command
> ./hadoop namenode -importCheckpoint
>
> i get the following exception:
> 05/18 13:54:59 INFO metrics.RpcMetrics: Initializing RPC Metrics with
> hostName=NameNode, port=44444
> 09/05/18 13:54:59 INFO namenode.NameNode: Namenode up at: germapp/
> 192.168.0.1:44444
> 09/05/18 13:54:59 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=NameNode, sessionId=null
> 09/05/18 13:54:59 INFO metrics.NameNodeMetrics: Initializing
> NameNodeMeterics using context
> object:org.apache.hadoop.metrics.spi.NullContext
> 09/05/18 13:54:59 INFO namenode.FSNamesystem: fsOwner=ithurs,ithurs
> 09/05/18 13:54:59 INFO namenode.FSNamesystem: supergroup=supergroup
> 09/05/18 13:54:59 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 09/05/18 13:54:59 INFO metrics.FSNamesystemMetrics: Initializing
> FSNamesystemMetrics using context
> object:org.apache.hadoop.metrics.spi.NullContext
> 09/05/18 13:54:59 INFO namenode.FSNamesystem: Registered
> FSNamesystemStatusMBean
> 09/05/18 13:54:59 ERROR namenode.FSNamesystem: FSNamesystem initialization
> failed.
> java.io.IOException: Cannot import image from a checkpoint.  NameNode
> already contains an image in /tmp/hadoop-ithurs/dfs/name
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
> 09/05/18 13:54:59 INFO ipc.Server: Stopping server on 44444
> 09/05/18 13:54:59 ERROR namenode.NameNode: java.io.IOException: Cannot
> import image from a checkpoint.  NameNode already contains an image in
> /tmp/hadoop-ithurs/dfs/name
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>
>
> TRIAL 2:
> 6. skip copying:
> 7 .executed the following command
> ./hadoop namenode -importCheckpoint
>
> i get the following exception:
> 6. org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
> Directory /tmp/hadoop-ithurs/dfs/name is in an inconsistent state: storage
> directory does not exist or is not accessible.
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
> 09/05/18 14:13:41 INFO ipc.Server: Stopping server on 44444
> 09/05/18 14:13:41 ERROR namenode.NameNode:
> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: 
> Directory
> /tmp/hadoop-ithurs/dfs/name is in an inconsistent state: storage directory
> does not exist or is not accessible.
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>
> TRIAL 3:
> 6. create a new directory name in /tmp/hadoop-ithurs/dfs/
> 7 .executed the following command
> ./hadoop namenode -importCheckpoint
>
> i get the following exception
> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: 
> Directory
> /tmp/hadoop-ithurs/dfs/namesecondary is in an inconsistent state:
> /tmp/hadoop-ithurs/dfs/namesecondary/image does not exist.
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.isConversionNeeded(FSImage.java:645)
>        at
> org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:590)
>        at
> org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:61)
>        at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:369)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
> 09/05/18 14:15:12 INFO ipc.Server: Stopping server on 44444
> 09/05/18 14:15:12 ERROR namenode.NameNode:
> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: 
> Directory
> /tmp/hadoop-ithurs/dfs/namesecondary is in an inconsistent state:
> /tmp/hadoop-ithurs/dfs/namesecondary/image does not exist.
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.isConversionNeeded(FSImage.java:645)
>        at
> org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:590)
>        at
> org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:61)
>        at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:369)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>        at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>        at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>
>
> any pointers/suggesstions?
>
> Thanks,
> Raakhi
>
>
>
> On Fri, May 15, 2009 at 6:21 AM, jason hadoop 
> <ja...@gmail.com>wrote:
>
>> the masters file only contains the secondary namenodes.
>> when you start-dfs.sh or start-all, the namenode, which is the master, is
>> started on the local machine, and secondary namenodes are started on each
>> host listed in conf/masters
>>
>> This now confusing pattern is probably the result of some historical
>> requirement that we are unaware of.
>>
>> Here are the relevant lines from bin/start-dfs.sh
>>
>> # start dfs daemons
>> # start namenode after datanodes, to minimize time namenode is up w/o 
>> data
>> # note: datanodes will log connection errors until namenode starts
>> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
>> $nameStartOpt
>> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
>> $dataStartOpt
>> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
>> secondarynamenode
>>
>>
>> On Thu, May 14, 2009 at 11:36 PM, Ninad Raut 
>> <hbase.user.ninad@gmail.com
>> >wrote:
>>
>> > But if we have two master in the master file we have master and 
>> > secondary
>> > node, *both *processes getting started on the two servers listed. Cant 
>> > we
>> > have master and secondary node started seperately on two machines??
>> >
>> > On Fri, May 15, 2009 at 9:39 AM, jason hadoop 
>> > <jason.hadoop@gmail.com
>> > >wrote:
>> >
>> > > I agree with billy. conf/masters is misleading as the place for
>> secondary
>> > > namenodes.
>> > >
>> > > On Thu, May 14, 2009 at 8:38 PM, Billy Pearson
>> > > <sa...@pearsonwholesale.com>wrote:
>> > >
>> > > > I thank the secondary namenode is set in the masters file in the 
>> > > > conf
>> > > > folder
>> > > > misleading
>> > > >
>> > > > Billy
>> > > >
>> > > >
>> > > >
>> > > > "Rakhi Khatwani" 
>> > > > <ra...@gmail.com> wrote in 
>> > > > message
>> > > > news:384813770905140603g4d552834gcef2db3028a00191@mail.gmail.com...
>> > > >
>> > > >  Hi,
>> > > >>    I wanna set up a cluster of 5 nodes in such a way that
>> > > >> node1 - master
>> > > >> node2 - secondary namenode
>> > > >> node3 - slave
>> > > >> node4 - slave
>> > > >> node5 - slave
>> > > >>
>> > > >>
>> > > >> How do we go about that?
>> > > >> there is no property in hadoop-env where i can set the ip-address
>> for
>> > > >> secondary name node.
>> > > >>
>> > > >> if i set node-1 and node-2 in masters, and when we start dfs, in
>> both
>> > > the
>> > > >> m/cs, the namenode n secondary namenode processes r present. but i
>> > think
>> > > >> only node1 is active.
>> > > >> n my namenode fail over operation fails.
>> > > >>
>> > > >> ny suggesstions?
>> > > >>
>> > > >> Regards,
>> > > >> Rakhi
>> > > >>
>> > > >>
>> > > >
>> > > >
>> > >
>> > >
>> > > --
>> > > Alpha Chapters of my book on Hadoop are available
>> > > http://www.apress.com/book/view/9781430219422
>> > > www.prohadoopbook.com a community for Hadoop Professionals
>> > >
>> >
>>
>>
>>
>> --
>> Alpha Chapters of my book on Hadoop are available
>> http://www.apress.com/book/view/9781430219422
>> www.prohadoopbook.com a community for Hadoop Professionals
>>
> 



Re: Setting up another machine as secondary node

Posted by Rakhi Khatwani <ra...@gmail.com>.
Hi,
    I successfully set up the secondary namenode,
but I am having issues when I perform the failover.

I have 5 nodes:
node a - master
node b - slave
node c - slave
node d - slave
node e - secondary name node

Following are my steps:

1. Configuration is as follows,
for all the nodes:
conf/masters: node e
conf/slaves: node b
                   node c
                   node d
conf/hadoop-site: default-name: node a,
                          job-tracker: node a

2. ./start-dfs.sh

3. Added a couple of files to the Hadoop fs.
4. Killed the namenode.

5. Changed the following properties in hadoop-site on node b, node c, node
d, and node e:
                         default-name: node e,
                          job-tracker: node e.



TRIAL 1:
6. Copied the name dir from node a to node e.

7. Executed the following command:
./hadoop namenode -importCheckpoint

I get the following exception:
05/18 13:54:59 INFO metrics.RpcMetrics: Initializing RPC Metrics with
hostName=NameNode, port=44444
09/05/18 13:54:59 INFO namenode.NameNode: Namenode up at: germapp/
192.168.0.1:44444
09/05/18 13:54:59 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=NameNode, sessionId=null
09/05/18 13:54:59 INFO metrics.NameNodeMetrics: Initializing
NameNodeMeterics using context
object:org.apache.hadoop.metrics.spi.NullContext
09/05/18 13:54:59 INFO namenode.FSNamesystem: fsOwner=ithurs,ithurs
09/05/18 13:54:59 INFO namenode.FSNamesystem: supergroup=supergroup
09/05/18 13:54:59 INFO namenode.FSNamesystem: isPermissionEnabled=true
09/05/18 13:54:59 INFO metrics.FSNamesystemMetrics: Initializing
FSNamesystemMetrics using context
object:org.apache.hadoop.metrics.spi.NullContext
09/05/18 13:54:59 INFO namenode.FSNamesystem: Registered
FSNamesystemStatusMBean
09/05/18 13:54:59 ERROR namenode.FSNamesystem: FSNamesystem initialization
failed.
java.io.IOException: Cannot import image from a checkpoint.  NameNode
already contains an image in /tmp/hadoop-ithurs/dfs/name
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
09/05/18 13:54:59 INFO ipc.Server: Stopping server on 44444
09/05/18 13:54:59 ERROR namenode.NameNode: java.io.IOException: Cannot
import image from a checkpoint.  NameNode already contains an image in
/tmp/hadoop-ithurs/dfs/name
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)


TRIAL 2:
6. Skipped the copying.
7. Executed the following command:
./hadoop namenode -importCheckpoint

I get the following exception:
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
Directory /tmp/hadoop-ithurs/dfs/name is in an inconsistent state: storage
directory does not exist or is not accessible.
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
09/05/18 14:13:41 INFO ipc.Server: Stopping server on 44444
09/05/18 14:13:41 ERROR namenode.NameNode:
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory
/tmp/hadoop-ithurs/dfs/name is in an inconsistent state: storage directory
does not exist or is not accessible.
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)

TRIAL 3:
6. Created a new, empty "name" directory in /tmp/hadoop-ithurs/dfs/.
7. Executed the following command:
./hadoop namenode -importCheckpoint

I get the following exception:
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory
/tmp/hadoop-ithurs/dfs/namesecondary is in an inconsistent state:
/tmp/hadoop-ithurs/dfs/namesecondary/image does not exist.
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.isConversionNeeded(FSImage.java:645)
        at
org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:590)
        at
org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:61)
        at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:369)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
09/05/18 14:15:12 INFO ipc.Server: Stopping server on 44444
09/05/18 14:15:12 ERROR namenode.NameNode:
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory
/tmp/hadoop-ithurs/dfs/namesecondary is in an inconsistent state:
/tmp/hadoop-ithurs/dfs/namesecondary/image does not exist.
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.isConversionNeeded(FSImage.java:645)
        at
org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:590)
        at
org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:61)
        at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:369)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
        at
org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)


Any pointers/suggestions?

Thanks,
Raakhi
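
For comparison, a rough sketch of the state that -importCheckpoint appears to
expect, judging from the three errors above (the /tmp paths are only the
defaults shown in the logs, and the steps are an educated guess, not a
verified recipe):

# dfs.name.dir must exist but contain no image (Trials 1 and 2), and
# fs.checkpoint.dir (the namesecondary dir) must contain a valid checkpoint,
# including its image/ subdirectory (Trial 3):
rm -rf /tmp/hadoop-ithurs/dfs/name              # drop any stale image
mkdir -p /tmp/hadoop-ithurs/dfs/name            # ...but the directory must exist
ls /tmp/hadoop-ithurs/dfs/namesecondary/image   # sanity-check the checkpoint
./hadoop namenode -importCheckpoint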



On Fri, May 15, 2009 at 6:21 AM, jason hadoop <ja...@gmail.com>wrote:

> the masters file only contains the secondary namenodes.
> when you start-dfs.sh or start-all, the namenode, which is the master, is
> started on the local machine, and secondary namenodes are started on each
> host listed in conf/masters
>
> This now confusing pattern is probably the result of some historical
> requirement that we are unaware of.
>
> Here are the relevant lines from bin/start-dfs.sh
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>
> On Thu, May 14, 2009 at 11:36 PM, Ninad Raut <hbase.user.ninad@gmail.com
> >wrote:
>
> > But if we have two master in the master file we have master and secondary
> > node, *both *processes getting started on the two servers listed. Cant we
> > have master and secondary node started seperately on two machines??
> >
> > On Fri, May 15, 2009 at 9:39 AM, jason hadoop <jason.hadoop@gmail.com
> > >wrote:
> >
> > > I agree with billy. conf/masters is misleading as the place for
> secondary
> > > namenodes.
> > >
> > > On Thu, May 14, 2009 at 8:38 PM, Billy Pearson
> > > <sa...@pearsonwholesale.com>wrote:
> > >
> > > > I thank the secondary namenode is set in the masters file in the conf
> > > > folder
> > > > misleading
> > > >
> > > > Billy
> > > >
> > > >
> > > >
> > > > "Rakhi Khatwani" <ra...@gmail.com> wrote in message
> > > > news:384813770905140603g4d552834gcef2db3028a00191@mail.gmail.com...
> > > >
> > > >  Hi,
> > > >>    I wanna set up a cluster of 5 nodes in such a way that
> > > >> node1 - master
> > > >> node2 - secondary namenode
> > > >> node3 - slave
> > > >> node4 - slave
> > > >> node5 - slave
> > > >>
> > > >>
> > > >> How do we go about that?
> > > >> there is no property in hadoop-env where i can set the ip-address
> for
> > > >> secondary name node.
> > > >>
> > > >> if i set node-1 and node-2 in masters, and when we start dfs, in
> both
> > > the
> > > >> m/cs, the namenode n secondary namenode processes r present. but i
> > think
> > > >> only node1 is active.
> > > >> n my namenode fail over operation fails.
> > > >>
> > > >> ny suggesstions?
> > > >>
> > > >> Regards,
> > > >> Rakhi
> > > >>
> > > >>
> > > >
> > > >
> > >
> > >
> > > --
> > > Alpha Chapters of my book on Hadoop are available
> > > http://www.apress.com/book/view/9781430219422
> > > www.prohadoopbook.com a community for Hadoop Professionals
> > >
> >
>
>
>
> --
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
> www.prohadoopbook.com a community for Hadoop Professionals
>

Re: Setting up another machine as secondary node

Posted by jason hadoop <ja...@gmail.com>.
The masters file only contains the secondary namenodes.
When you run start-dfs.sh or start-all.sh, the namenode, which is the master, is
started on the local machine, and secondary namenodes are started on each
host listed in conf/masters.

This now-confusing pattern is probably the result of some historical

Here are the relevant lines from bin/start-dfs.sh

# start dfs daemons
# start namenode after datanodes, to minimize time namenode is up w/o data
# note: datanodes will log connection errors until namenode starts
"$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
$nameStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
$dataStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
secondarynamenode
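
For example, on the 5-node layout from the original post (a sketch only; the
node names are the placeholders used in this thread, with node2 as the
intended secondary namenode):

# conf/masters -- hosts on which start-dfs.sh starts a secondarynamenode
node2

# conf/slaves -- hosts on which start-dfs.sh starts datanodes
node3
node4
node5

# the namenode itself runs on whichever machine you invoke this from:
bin/start-dfs.sh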


On Thu, May 14, 2009 at 11:36 PM, Ninad Raut <hb...@gmail.com>wrote:

> But if we have two master in the master file we have master and secondary
> node, *both *processes getting started on the two servers listed. Cant we
> have master and secondary node started seperately on two machines??
>
> On Fri, May 15, 2009 at 9:39 AM, jason hadoop <jason.hadoop@gmail.com
> >wrote:
>
> > I agree with billy. conf/masters is misleading as the place for secondary
> > namenodes.
> >
> > On Thu, May 14, 2009 at 8:38 PM, Billy Pearson
> > <sa...@pearsonwholesale.com>wrote:
> >
> > > I thank the secondary namenode is set in the masters file in the conf
> > > folder
> > > misleading
> > >
> > > Billy
> > >
> > >
> > >
> > > "Rakhi Khatwani" <ra...@gmail.com> wrote in message
> > > news:384813770905140603g4d552834gcef2db3028a00191@mail.gmail.com...
> > >
> > >  Hi,
> > >>    I wanna set up a cluster of 5 nodes in such a way that
> > >> node1 - master
> > >> node2 - secondary namenode
> > >> node3 - slave
> > >> node4 - slave
> > >> node5 - slave
> > >>
> > >>
> > >> How do we go about that?
> > >> there is no property in hadoop-env where i can set the ip-address for
> > >> secondary name node.
> > >>
> > >> if i set node-1 and node-2 in masters, and when we start dfs, in both
> > the
> > >> m/cs, the namenode n secondary namenode processes r present. but i
> think
> > >> only node1 is active.
> > >> n my namenode fail over operation fails.
> > >>
> > >> ny suggesstions?
> > >>
> > >> Regards,
> > >> Rakhi
> > >>
> > >>
> > >
> > >
> >
> >
> > --
> > Alpha Chapters of my book on Hadoop are available
> > http://www.apress.com/book/view/9781430219422
> > www.prohadoopbook.com a community for Hadoop Professionals
> >
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals

Re: Setting up another machine as secondary node

Posted by Ninad Raut <hb...@gmail.com>.
But if we have two masters in the masters file, both the master and secondary
node processes get started on the two servers listed. Can't we
have the master and secondary node started separately on two machines?

On Fri, May 15, 2009 at 9:39 AM, jason hadoop <ja...@gmail.com>wrote:

> I agree with billy. conf/masters is misleading as the place for secondary
> namenodes.
>
> On Thu, May 14, 2009 at 8:38 PM, Billy Pearson
> <sa...@pearsonwholesale.com>wrote:
>
> > I thank the secondary namenode is set in the masters file in the conf
> > folder
> > misleading
> >
> > Billy
> >
> >
> >
> > "Rakhi Khatwani" <ra...@gmail.com> wrote in message
> > news:384813770905140603g4d552834gcef2db3028a00191@mail.gmail.com...
> >
> >  Hi,
> >>    I wanna set up a cluster of 5 nodes in such a way that
> >> node1 - master
> >> node2 - secondary namenode
> >> node3 - slave
> >> node4 - slave
> >> node5 - slave
> >>
> >>
> >> How do we go about that?
> >> there is no property in hadoop-env where i can set the ip-address for
> >> secondary name node.
> >>
> >> if i set node-1 and node-2 in masters, and when we start dfs, in both
> the
> >> m/cs, the namenode n secondary namenode processes r present. but i think
> >> only node1 is active.
> >> n my namenode fail over operation fails.
> >>
> >> ny suggesstions?
> >>
> >> Regards,
> >> Rakhi
> >>
> >>
> >
> >
>
>
> --
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
> www.prohadoopbook.com a community for Hadoop Professionals
>

Re: Setting up another machine as secondary node

Posted by jason hadoop <ja...@gmail.com>.
I agree with Billy. conf/masters is misleading as the place for secondary
namenodes.

On Thu, May 14, 2009 at 8:38 PM, Billy Pearson
<sa...@pearsonwholesale.com>wrote:

> I thank the secondary namenode is set in the masters file in the conf
> folder
> misleading
>
> Billy
>
>
>
> "Rakhi Khatwani" <ra...@gmail.com> wrote in message
> news:384813770905140603g4d552834gcef2db3028a00191@mail.gmail.com...
>
>  Hi,
>>    I wanna set up a cluster of 5 nodes in such a way that
>> node1 - master
>> node2 - secondary namenode
>> node3 - slave
>> node4 - slave
>> node5 - slave
>>
>>
>> How do we go about that?
>> there is no property in hadoop-env where i can set the ip-address for
>> secondary name node.
>>
>> if i set node-1 and node-2 in masters, and when we start dfs, in both the
>> m/cs, the namenode n secondary namenode processes r present. but i think
>> only node1 is active.
>> n my namenode fail over operation fails.
>>
>> ny suggesstions?
>>
>> Regards,
>> Rakhi
>>
>>
>
>


-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals

Re: Setting up another machine as secondary node

Posted by Billy Pearson <sa...@pearsonwholesale.com>.
I think the secondary namenode being set in the masters file in the conf folder
is misleading.

Billy



"Rakhi Khatwani" <ra...@gmail.com> 
wrote in message 
news:384813770905140603g4d552834gcef2db3028a00191@mail.gmail.com...
> Hi,
>     I wanna set up a cluster of 5 nodes in such a way that
> node1 - master
> node2 - secondary namenode
> node3 - slave
> node4 - slave
> node5 - slave
>
>
> How do we go about that?
> there is no property in hadoop-env where i can set the ip-address for
> secondary name node.
>
> if i set node-1 and node-2 in masters, and when we start dfs, in both the
> m/cs, the namenode n secondary namenode processes r present. but i think
> only node1 is active.
> n my namenode fail over operation fails.
>
> ny suggesstions?
>
> Regards,
> Rakhi
> 



Re: Setting up another machine as secondary node

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
I don't think you will find any step-by-step instructions.
Somebody has already mentioned in replies below that secondary node is NOT
a fail-over node. You can read about it here:
http://wiki.apache.org/hadoop/FAQ#7
http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Secondary+NameNode

In fact, Secondary NameNode is a checkpointer only: it cannot process
heartbeats from data-nodes or "ls" commands from hdfs clients.
What you probably meant to do after the NameNode failed on node1 is:
stop the Secondary node on node2 and then start the real
NameNode on node2. You will also have to restart the data-nodes to redirect
them to the new name-node.

Another way to model a fail-over is to play with the Backup node, which
is only available in trunk (not in 0.19, which you seem to be using), and
is supposed to replace secondary node in 0.21.

The Backup node is a real name-node, and it can start processing heartbeats
and client commands if you redirect them to the Backup node.
I guess nobody has tried it yet, so please share your experience.

Regards,
--Konstantin
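
A rough sketch of that manual switch-over, following the steps above (the
node1/node2 names are the placeholders from this thread, and the property
edit is an assumption about a typical 0.19 hadoop-site.xml, not a verified
procedure):

# on node2, the former secondary:
bin/hadoop-daemon.sh stop secondarynamenode
# make sure dfs.name.dir on node2 holds a copy of the checkpointed image,
# then start the real NameNode:
bin/hadoop-daemon.sh start namenode

# on each data-node: edit conf/hadoop-site.xml so fs.default.name points
# at node2 instead of node1, then restart the DataNode:
bin/hadoop-daemon.sh stop datanode
bin/hadoop-daemon.sh start datanode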


Rakhi Khatwani wrote:
> Hi,
>       Thanks for the suggestions. but my scenario is a little different.
> i am doin a POC on namenode failover.
> 
> i have a 5 cluster node setup in which one acts as a master, 3 acts as
> slaves and the last one, the secondary node.
> 
> i start my hadoop dfs, write something into it... and later kill my
> namenode. (tryin to produce a real worls scenario where my namenode fails
> due to some hardware error).
> 
> so my aim is to start the secondary node as the primary m/c.
> so tht the dfs is intact (by copyin the checkpoint info)
> and all the slave pcs becoming the slaves of the secondary namenode now.
> 
> 1. Can this be achieved without shuttin down the cluster?... i have read
> this somewhere... but coudnt achieve it.
> 
> 2. Whats the step by step instruction to achieve it?.. i hv google it, got a
> lot of different opinions n m totally confused now.
> 
> Thanks,
> Raakhi
> 
> 
> 
> 
> On Tue, May 26, 2009 at 11:27 PM, Konstantin Shvachko <sh...@yahoo-inc.com>wrote:
> 
>> Hi Rakhi,
>>
>> This is because your name-node is trying to -importCheckpoint from a
>> directory,
>> which is locked by secondary name-node.
>> The secondary node is also running in your case, right?
>> You should use -importCheckpoint as the last resort, when name-node's
>> directories
>> are damaged.
>> In regular case you start name-node with
>> ./hadoop-daemon.sh start namenode
>>
>> Thanks,
>> --Konstantin
>>
>>
>> Rakhi Khatwani wrote:
>>
>>> Hi,
>>>       I followed the instructions suggested by you all. but i still
>>> come across this exception when i use the following command:
>>> ./hadoop-daemon.sh start namenode -importCheckpoint
>>>
>>> the exception is as follows:
>>> 2009-05-26 14:43:48,004 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting NameNode
>>> STARTUP_MSG:   host = germapp/192.168.0.1
>>> STARTUP_MSG:   args = [-importCheckpoint]
>>> STARTUP_MSG:   version = 0.19.0
>>> STARTUP_MSG:   build =
>>> https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r
>>> 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
>>> ************************************************************/
>>> 2009-05-26 14:43:48,147 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>>> Initializing RPC Metrics with hostName=NameNode, port=44444
>>> 2009-05-26 14:43:48,154 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>>> germapp/192.168.0.1:44444
>>> 2009-05-26 14:43:48,160 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>>> 2009-05-26 14:43:48,166 INFO
>>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>>> Initializing NameNodeMeterics using context
>>> object:org.apache.hadoop.metrics.spi.NullContext
>>> 2009-05-26 14:43:48,316 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> fsOwner=ithurs,ithurs
>>> 2009-05-26 14:43:48,317 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> supergroup=supergroup
>>> 2009-05-26 14:43:48,317 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> isPermissionEnabled=true
>>> 2009-05-26 14:43:48,343 INFO
>>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>>> Initializing FSNamesystemMetrics using context
>>> object:org.apache.hadoop.metrics.spi.NullContext
>>> 2009-05-26 14:43:48,347 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>>> FSNamesystemStatusMBean
>>> 2009-05-26 14:43:48,455 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>>> /tmp/hadoop-ithurs/dfs/name is not formatted.
>>> 2009-05-26 14:43:48,455 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>> 2009-05-26 14:43:48,457 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage
>>> /tmp/hadoop-ithurs/dfs/namesecondary. The directory is already locked.
>>> 2009-05-26 14:43:48,460 ERROR
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>>> initialization failed.
>>> java.io.IOException: Cannot lock storage
>>> /tmp/hadoop-ithurs/dfs/namesecondary. The directory is already locked.
>>>        at
>>> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:510)
>>>        at
>>> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>> 2009-05-26 14:43:48,464 INFO org.apache.hadoop.ipc.Server: Stopping
>>> server on 44444
>>> 2009-05-26 14:43:48,466 ERROR
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException:
>>> Cannot lock storage /tmp/hadoop-ithurs/dfs/namesecondary. The
>>> directory is already locked.
>>>        at
>>> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:510)
>>>        at
>>> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>>        at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>>
>>> 2009-05-26 14:43:48,468 INFO
>>> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down NameNode at germapp/192.168.0.1
>>> ************************************************************/
>>>
>>> any pointers/suggestions?
>>> Thanks,
>>> Raakhi
>>>
>>> On 5/20/09, Aaron Kimball <aa...@cloudera.com> wrote:
>>>
>>>> See this regarding instructions on configuring a 2NN on a separate
>>>> machine
>>>> from the NN:
>>>>
>>>> http://www.cloudera.com/blog/2009/02/10/multi-host-secondarynamenode-configuration/
>>>>
>>>> - Aaron
>>>>
>>>> On Thu, May 14, 2009 at 10:42 AM, Koji Noguchi
>>>> <kn...@yahoo-inc.com>wrote:
>>>>
>>>>  Before 0.19, fsimage/edits were on the same directory.
>>>>> So whenever secondary finishes checkpointing, it copies back the fsimage
>>>>> while namenode still kept on writing to the edits file.
>>>>>
>>>>> Usually we observed some latency on the namenode side during that time.
>>>>>
>>>>> HADOOP-3948 would probably help after 0.19 or later.
>>>>>
>>>>> Koji
>>>>>
>>>>> -----Original Message-----
>>>>> From: Brian Bockelman [mailto:bbockelm@cse.unl.edu]
>>>>> Sent: Thursday, May 14, 2009 10:32 AM
>>>>> To: core-user@hadoop.apache.org
>>>>> Subject: Re: Setting up another machine as secondary node
>>>>>
>>>>> Hey Koji,
>>>>>
>>>>> It's an expensive operation - for the secondary namenode, not the
>>>>> namenode itself, right?  I don't particularly care if I stress out a
>>>>> dedicated node that doesn't have to respond to queries ;)
>>>>>
>>>>> Locally we checkpoint+backup fairly frequently (not 5 minutes ...
>>>>> maybe less than the default hour) due to sheer paranoia of losing
>>>>> metadata.
>>>>>
>>>>> Brian
>>>>>
>>>>> On May 14, 2009, at 12:25 PM, Koji Noguchi wrote:
>>>>>
>>>>>  The secondary namenode takes a snapshot
>>>>>>> at 5 minute (configurable) intervals,
>>>>>>>
>>>>>>>  This is a bit too aggressive.
>>>>>> Checkpointing is still an expensive operation.
>>>>>> I'd say every hour or even every day.
>>>>>>
>>>>>> Isn't the default 3600 seconds?
>>>>>>
>>>>>> Koji
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: jason hadoop [mailto:jason.hadoop@gmail.com]
>>>>>> Sent: Thursday, May 14, 2009 7:46 AM
>>>>>> To: core-user@hadoop.apache.org
>>>>>> Subject: Re: Setting up another machine as secondary node
>>>>>>
>>>>>> any machine put in the conf/masters file becomes a secondary namenode.
>>>>>>
>>>>>> At some point there was confusion on the safety of more than one
>>>>>> machine,
>>>>>> which I believe was settled, as many are safe.
>>>>>>
>>>>>> The secondary namenode takes a snapshot at 5 minute (configurable)
>>>>>> intervals, rebuilds the fsimage and sends that back to the namenode.
>>>>>> There is some performance advantage of having it on the local machine,
>>>>>> and
>>>>>> some safety advantage of having it on an alternate machine.
>>>>>> Could someone who remembers speak up on the single vrs multiple
>>>>>> secondary
>>>>>> namenodes?
>>>>>>
>>>>>>
>>>>>> On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>  First of all, the secondary namenode is not a what you might think a
>>>>>>> secondary is - it's not failover device.  It does make a copy of the
>>>>>>> filesystem metadata periodically, and it integrates the edits into
>>>>>>> the
>>>>>>> image.  It does *not* provide failover.
>>>>>>>
>>>>>>> Second, you specify its IP address in hadoop-site.xml.  This is where
>>>>>>>
>>>>>> you
>>>>>>
>>>>>>> can override the defaults set in hadoop-default.xml.
>>>>>>>
>>>>>>> dbr
>>>>>>>
>>>>>>> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
>>>>>>>
>>>>>> <rakhi.khatwani@gmail.com
>>>>>>
>>>>>>> wrote:
>>>>>>>> Hi,
>>>>>>>>   I wanna set up a cluster of 5 nodes in such a way that
>>>>>>>> node1 - master
>>>>>>>> node2 - secondary namenode
>>>>>>>> node3 - slave
>>>>>>>> node4 - slave
>>>>>>>> node5 - slave
>>>>>>>>
>>>>>>>>
>>>>>>>> How do we go about that?
>>>>>>>> there is no property in hadoop-env where i can set the ip-address
>>>>>>>>
>>>>>>> for
>>>>>>> secondary name node.
>>>>>>>> if i set node-1 and node-2 in masters, and when we start dfs, in
>>>>>>>>
>>>>>>> both the
>>>>>>> m/cs, the namenode n secondary namenode processes r present. but i
>>>>>>> think
>>>>>>> only node1 is active.
>>>>>>>> n my namenode fail over operation fails.
>>>>>>>>
>>>>>>>> ny suggesstions?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Rakhi
>>>>>>>>
>>>>>>>>
>>>>>> --
>>>>>> Alpha Chapters of my book on Hadoop are available
>>>>>> http://www.apress.com/book/view/9781430219422
>>>>>> www.prohadoopbook.com a community for Hadoop Professionals
>>>>>>
>>>>>
> 

Re: Setting up another machine as secondary node

Posted by Rakhi Khatwani <ra...@gmail.com>.
Hi,
      Thanks for the suggestions, but my scenario is a little different:
I am doing a POC on namenode failover.

I have a 5-node cluster setup in which one node acts as the master, 3 act as
slaves, and the last one is the secondary node.

I start my Hadoop dfs, write something into it... and later kill my
namenode (trying to reproduce a real-world scenario where my namenode fails
due to some hardware error).

So my aim is to start the secondary node as the primary machine,
so that the dfs is intact (by copying the checkpoint info)
and all the slave PCs become slaves of the secondary namenode.

1. Can this be achieved without shutting down the cluster? I have read
this somewhere... but couldn't achieve it.

2. What are the step-by-step instructions to achieve it? I have googled it, got a
lot of different opinions, and am totally confused now.

Thanks,
Raakhi




On Tue, May 26, 2009 at 11:27 PM, Konstantin Shvachko <sh...@yahoo-inc.com>wrote:

> Hi Rakhi,
>
> This is because your name-node is trying to -importCheckpoint from a
> directory,
> which is locked by secondary name-node.
> The secondary node is also running in your case, right?
> You should use -importCheckpoint as the last resort, when name-node's
> directories
> are damaged.
> In regular case you start name-node with
> ./hadoop-daemon.sh start namenode
>
> Thanks,
> --Konstantin
>
>
> Rakhi Khatwani wrote:
>
>> Hi,
>>       I followed the instructions suggested by you all. but i still
>> come across this exception when i use the following command:
>> ./hadoop-daemon.sh start namenode -importCheckpoint
>>
>> the exception is as follows:
>> 2009-05-26 14:43:48,004 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting NameNode
>> STARTUP_MSG:   host = germapp/192.168.0.1
>> STARTUP_MSG:   args = [-importCheckpoint]
>> STARTUP_MSG:   version = 0.19.0
>> STARTUP_MSG:   build =
>> https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r
>> 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
>> ************************************************************/
>> 2009-05-26 14:43:48,147 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>> Initializing RPC Metrics with hostName=NameNode, port=44444
>> 2009-05-26 14:43:48,154 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
>> germapp/192.168.0.1:44444
>> 2009-05-26 14:43:48,160 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>> Initializing JVM Metrics with processName=NameNode, sessionId=null
>> 2009-05-26 14:43:48,166 INFO
>> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
>> Initializing NameNodeMeterics using context
>> object:org.apache.hadoop.metrics.spi.NullContext
>> 2009-05-26 14:43:48,316 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> fsOwner=ithurs,ithurs
>> 2009-05-26 14:43:48,317 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> supergroup=supergroup
>> 2009-05-26 14:43:48,317 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> isPermissionEnabled=true
>> 2009-05-26 14:43:48,343 INFO
>> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
>> Initializing FSNamesystemMetrics using context
>> object:org.apache.hadoop.metrics.spi.NullContext
>> 2009-05-26 14:43:48,347 INFO
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
>> FSNamesystemStatusMBean
>> 2009-05-26 14:43:48,455 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>> /tmp/hadoop-ithurs/dfs/name is not formatted.
>> 2009-05-26 14:43:48,455 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>> 2009-05-26 14:43:48,457 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage
>> /tmp/hadoop-ithurs/dfs/namesecondary. The directory is already locked.
>> 2009-05-26 14:43:48,460 ERROR
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
>> initialization failed.
>> java.io.IOException: Cannot lock storage
>> /tmp/hadoop-ithurs/dfs/namesecondary. The directory is already locked.
>>        at
>> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:510)
>>        at
>> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>> 2009-05-26 14:43:48,464 INFO org.apache.hadoop.ipc.Server: Stopping
>> server on 44444
>> 2009-05-26 14:43:48,466 ERROR
>> org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException:
>> Cannot lock storage /tmp/hadoop-ithurs/dfs/namesecondary. The
>> directory is already locked.
>>        at
>> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:510)
>>        at
>> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>>        at
>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
>>
>> 2009-05-26 14:43:48,468 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down NameNode at germapp/192.168.0.1
>> ************************************************************/
>>
>> any pointers/suggestions?
>> Thanks,
>> Raakhi
>>
>> On 5/20/09, Aaron Kimball <aa...@cloudera.com> wrote:
>>
>>> See this regarding instructions on configuring a 2NN on a separate
>>> machine
>>> from the NN:
>>>
>>> http://www.cloudera.com/blog/2009/02/10/multi-host-secondarynamenode-configuration/
>>>
>>> - Aaron
>>>
>>> On Thu, May 14, 2009 at 10:42 AM, Koji Noguchi
>>> <kn...@yahoo-inc.com>wrote:
>>>
>>>  Before 0.19, fsimage/edits were on the same directory.
>>>> So whenever secondary finishes checkpointing, it copies back the fsimage
>>>> while namenode still kept on writing to the edits file.
>>>>
>>>> Usually we observed some latency on the namenode side during that time.
>>>>
>>>> HADOOP-3948 would probably help after 0.19 or later.
>>>>
>>>> Koji
>>>>
>>>> -----Original Message-----
>>>> From: Brian Bockelman [mailto:bbockelm@cse.unl.edu]
>>>> Sent: Thursday, May 14, 2009 10:32 AM
>>>> To: core-user@hadoop.apache.org
>>>> Subject: Re: Setting up another machine as secondary node
>>>>
>>>> Hey Koji,
>>>>
>>>> It's an expensive operation - for the secondary namenode, not the
>>>> namenode itself, right?  I don't particularly care if I stress out a
>>>> dedicated node that doesn't have to respond to queries ;)
>>>>
>>>> Locally we checkpoint+backup fairly frequently (not 5 minutes ...
>>>> maybe less than the default hour) due to sheer paranoia of losing
>>>> metadata.
>>>>
>>>> Brian
>>>>
>>>> On May 14, 2009, at 12:25 PM, Koji Noguchi wrote:
>>>>
>>>>  The secondary namenode takes a snapshot
>>>>>> at 5 minute (configurable) intervals,
>>>>>>
>>>>>>  This is a bit too aggressive.
>>>>> Checkpointing is still an expensive operation.
>>>>> I'd say every hour or even every day.
>>>>>
>>>>> Isn't the default 3600 seconds?
>>>>>
>>>>> Koji
>>>>>
>>>>> -----Original Message-----
>>>>> From: jason hadoop [mailto:jason.hadoop@gmail.com]
>>>>> Sent: Thursday, May 14, 2009 7:46 AM
>>>>> To: core-user@hadoop.apache.org
>>>>> Subject: Re: Setting up another machine as secondary node
>>>>>
>>>>> any machine put in the conf/masters file becomes a secondary namenode.
>>>>>
>>>>> At some point there was confusion on the safety of more than one
>>>>> machine,
>>>>> which I believe was settled, as many are safe.
>>>>>
>>>>> The secondary namenode takes a snapshot at 5 minute (configurable)
>>>>> intervals, rebuilds the fsimage and sends that back to the namenode.
>>>>> There is some performance advantage of having it on the local machine,
>>>>> and
>>>>> some safety advantage of having it on an alternate machine.
>>>>> Could someone who remembers speak up on the single vrs multiple
>>>>> secondary
>>>>> namenodes?
>>>>>
>>>>>
>>>>> On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>  First of all, the secondary namenode is not a what you might think a
>>>>>> secondary is - it's not failover device.  It does make a copy of the
>>>>>> filesystem metadata periodically, and it integrates the edits into
>>>>>> the
>>>>>> image.  It does *not* provide failover.
>>>>>>
>>>>>> Second, you specify its IP address in hadoop-site.xml.  This is where
>>>>>>
>>>>> you
>>>>>
>>>>>> can override the defaults set in hadoop-default.xml.
>>>>>>
>>>>>> dbr
>>>>>>
>>>>>> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
>>>>>>
>>>>> <rakhi.khatwani@gmail.com
>>>>>
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>>   I wanna set up a cluster of 5 nodes in such a way that
>>>>>>> node1 - master
>>>>>>> node2 - secondary namenode
>>>>>>> node3 - slave
>>>>>>> node4 - slave
>>>>>>> node5 - slave
>>>>>>>
>>>>>>>
>>>>>>> How do we go about that?
>>>>>>> there is no property in hadoop-env where i can set the ip-address
>>>>>>>
>>>>>> for
>>>>>
>>>>>> secondary name node.
>>>>>>>
>>>>>>> if i set node-1 and node-2 in masters, and when we start dfs, in
>>>>>>>
>>>>>> both the
>>>>>
>>>>>> m/cs, the namenode n secondary namenode processes r present. but i
>>>>>>>
>>>>>> think
>>>>>
>>>>>> only node1 is active.
>>>>>>> n my namenode fail over operation fails.
>>>>>>>
>>>>>>> ny suggesstions?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Rakhi
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> Alpha Chapters of my book on Hadoop are available
>>>>> http://www.apress.com/book/view/9781430219422
>>>>> www.prohadoopbook.com a community for Hadoop Professionals
>>>>>
>>>>
>>>>
>>

Re: Setting up another machine as secondary node

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
Hi Rakhi,

This is because your name-node is trying to -importCheckpoint from a directory
which is locked by the secondary name-node.
The secondary name-node is also running in your case, right?
You should use -importCheckpoint only as a last resort, when the name-node's
directories are damaged.
In the regular case you start the name-node with
./hadoop-daemon.sh start namenode
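
Concretely, something like this (a rough sketch only; the paths come from
your log, and the commands assume you run them on the machine that holds
the copied checkpoint):

  # stop the secondary name-node so it releases the lock on
  # /tmp/hadoop-ithurs/dfs/namesecondary
  ./hadoop-daemon.sh stop secondarynamenode

  # point fs.checkpoint.dir in hadoop-site.xml at that directory, make sure
  # dfs.name.dir is empty, and only then, as a last resort:
  ./hadoop-daemon.sh start namenode -importCheckpoint

  # in the regular case, simply:
  ./hadoop-daemon.sh start namenode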

Thanks,
--Konstantin

Rakhi Khatwani wrote:
> Hi,
>        I followed the instructions suggested by you all. but i still
> come across this exception when i use the following command:
> ./hadoop-daemon.sh start namenode -importCheckpoint
> 
> the exception is as follows:
> 2009-05-26 14:43:48,004 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = germapp/192.168.0.1
> STARTUP_MSG:   args = [-importCheckpoint]
> STARTUP_MSG:   version = 0.19.0
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r
> 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
> ************************************************************/
> 2009-05-26 14:43:48,147 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> Initializing RPC Metrics with hostName=NameNode, port=44444
> 2009-05-26 14:43:48,154 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
> germapp/192.168.0.1:44444
> 2009-05-26 14:43:48,160 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=NameNode, sessionId=null
> 2009-05-26 14:43:48,166 INFO
> org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
> Initializing NameNodeMeterics using context
> object:org.apache.hadoop.metrics.spi.NullContext
> 2009-05-26 14:43:48,316 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> fsOwner=ithurs,ithurs
> 2009-05-26 14:43:48,317 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> supergroup=supergroup
> 2009-05-26 14:43:48,317 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> isPermissionEnabled=true
> 2009-05-26 14:43:48,343 INFO
> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
> Initializing FSNamesystemMetrics using context
> object:org.apache.hadoop.metrics.spi.NullContext
> 2009-05-26 14:43:48,347 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
> FSNamesystemStatusMBean
> 2009-05-26 14:43:48,455 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
> /tmp/hadoop-ithurs/dfs/name is not formatted.
> 2009-05-26 14:43:48,455 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
> 2009-05-26 14:43:48,457 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage
> /tmp/hadoop-ithurs/dfs/namesecondary. The directory is already locked.
> 2009-05-26 14:43:48,460 ERROR
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
> initialization failed.
> java.io.IOException: Cannot lock storage
> /tmp/hadoop-ithurs/dfs/namesecondary. The directory is already locked.
>         at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:510)
>         at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
> 2009-05-26 14:43:48,464 INFO org.apache.hadoop.ipc.Server: Stopping
> server on 44444
> 2009-05-26 14:43:48,466 ERROR
> org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException:
> Cannot lock storage /tmp/hadoop-ithurs/dfs/namesecondary. The
> directory is already locked.
>         at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:510)
>         at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
> 
> 2009-05-26 14:43:48,468 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at germapp/192.168.0.1
> ************************************************************/
> 
> any pointers/suggestions?
> Thanks,
> Raakhi
> 
> On 5/20/09, Aaron Kimball <aa...@cloudera.com> wrote:
>> See this regarding instructions on configuring a 2NN on a separate machine
>> from the NN:
>> http://www.cloudera.com/blog/2009/02/10/multi-host-secondarynamenode-configuration/
>>
>> - Aaron
>>
>> On Thu, May 14, 2009 at 10:42 AM, Koji Noguchi
>> <kn...@yahoo-inc.com>wrote:
>>
>>> Before 0.19, fsimage/edits were on the same directory.
>>> So whenever secondary finishes checkpointing, it copies back the fsimage
>>> while namenode still kept on writing to the edits file.
>>>
>>> Usually we observed some latency on the namenode side during that time.
>>>
>>> HADOOP-3948 would probably help after 0.19 or later.
>>>
>>> Koji
>>>
>>> -----Original Message-----
>>> From: Brian Bockelman [mailto:bbockelm@cse.unl.edu]
>>> Sent: Thursday, May 14, 2009 10:32 AM
>>> To: core-user@hadoop.apache.org
>>> Subject: Re: Setting up another machine as secondary node
>>>
>>> Hey Koji,
>>>
>>> It's an expensive operation - for the secondary namenode, not the
>>> namenode itself, right?  I don't particularly care if I stress out a
>>> dedicated node that doesn't have to respond to queries ;)
>>>
>>> Locally we checkpoint+backup fairly frequently (not 5 minutes ...
>>> maybe less than the default hour) due to sheer paranoia of losing
>>> metadata.
>>>
>>> Brian
>>>
>>> On May 14, 2009, at 12:25 PM, Koji Noguchi wrote:
>>>
>>>>> The secondary namenode takes a snapshot
>>>>> at 5 minute (configurable) intervals,
>>>>>
>>>> This is a bit too aggressive.
>>>> Checkpointing is still an expensive operation.
>>>> I'd say every hour or even every day.
>>>>
>>>> Isn't the default 3600 seconds?
>>>>
>>>> Koji
>>>>
>>>> -----Original Message-----
>>>> From: jason hadoop [mailto:jason.hadoop@gmail.com]
>>>> Sent: Thursday, May 14, 2009 7:46 AM
>>>> To: core-user@hadoop.apache.org
>>>> Subject: Re: Setting up another machine as secondary node
>>>>
>>>> any machine put in the conf/masters file becomes a secondary namenode.
>>>>
>>>> At some point there was confusion on the safety of more than one
>>>> machine,
>>>> which I believe was settled, as many are safe.
>>>>
>>>> The secondary namenode takes a snapshot at 5 minute (configurable)
>>>> intervals, rebuilds the fsimage and sends that back to the namenode.
>>>> There is some performance advantage of having it on the local machine,
>>>> and
>>>> some safety advantage of having it on an alternate machine.
>>>> Could someone who remembers speak up on the single vrs multiple
>>>> secondary
>>>> namenodes?
>>>>
>>>>
>>>> On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com>
>>>> wrote:
>>>>
>>>>> First of all, the secondary namenode is not a what you might think a
>>>>> secondary is - it's not failover device.  It does make a copy of the
>>>>> filesystem metadata periodically, and it integrates the edits into
>>>>> the
>>>>> image.  It does *not* provide failover.
>>>>>
>>>>> Second, you specify its IP address in hadoop-site.xml.  This is where
>>>> you
>>>>> can override the defaults set in hadoop-default.xml.
>>>>>
>>>>> dbr
>>>>>
>>>>> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
>>>> <rakhi.khatwani@gmail.com
>>>>>> wrote:
>>>>>> Hi,
>>>>>>    I wanna set up a cluster of 5 nodes in such a way that
>>>>>> node1 - master
>>>>>> node2 - secondary namenode
>>>>>> node3 - slave
>>>>>> node4 - slave
>>>>>> node5 - slave
>>>>>>
>>>>>>
>>>>>> How do we go about that?
>>>>>> there is no property in hadoop-env where i can set the ip-address
>>>> for
>>>>>> secondary name node.
>>>>>>
>>>>>> if i set node-1 and node-2 in masters, and when we start dfs, in
>>>> both the
>>>>>> m/cs, the namenode n secondary namenode processes r present. but i
>>>> think
>>>>>> only node1 is active.
>>>>>> n my namenode fail over operation fails.
>>>>>>
>>>>>> ny suggesstions?
>>>>>>
>>>>>> Regards,
>>>>>> Rakhi
>>>>>>
>>>>
>>>>
>>>> --
>>>> Alpha Chapters of my book on Hadoop are available
>>>> http://www.apress.com/book/view/9781430219422
>>>> www.prohadoopbook.com a community for Hadoop Professionals
>>>
> 

Re: Setting up another machine as secondary node

Posted by Rakhi Khatwani <ra...@gmail.com>.
Hi,
       I followed the instructions suggested by you all, but I still
come across this exception when I use the following command:
./hadoop-daemon.sh start namenode -importCheckpoint

the exception is as follows:
2009-05-26 14:43:48,004 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = germapp/192.168.0.1
STARTUP_MSG:   args = [-importCheckpoint]
STARTUP_MSG:   version = 0.19.0
STARTUP_MSG:   build =
https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r
713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
************************************************************/
2009-05-26 14:43:48,147 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
Initializing RPC Metrics with hostName=NameNode, port=44444
2009-05-26 14:43:48,154 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
germapp/192.168.0.1:44444
2009-05-26 14:43:48,160 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=NameNode, sessionId=null
2009-05-26 14:43:48,166 INFO
org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
Initializing NameNodeMeterics using context
object:org.apache.hadoop.metrics.spi.NullContext
2009-05-26 14:43:48,316 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
fsOwner=ithurs,ithurs
2009-05-26 14:43:48,317 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
supergroup=supergroup
2009-05-26 14:43:48,317 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
isPermissionEnabled=true
2009-05-26 14:43:48,343 INFO
org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
Initializing FSNamesystemMetrics using context
object:org.apache.hadoop.metrics.spi.NullContext
2009-05-26 14:43:48,347 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
FSNamesystemStatusMBean
2009-05-26 14:43:48,455 INFO
org.apache.hadoop.hdfs.server.common.Storage: Storage directory
/tmp/hadoop-ithurs/dfs/name is not formatted.
2009-05-26 14:43:48,455 INFO
org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
2009-05-26 14:43:48,457 INFO
org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage
/tmp/hadoop-ithurs/dfs/namesecondary. The directory is already locked.
2009-05-26 14:43:48,460 ERROR
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
initialization failed.
java.io.IOException: Cannot lock storage
/tmp/hadoop-ithurs/dfs/namesecondary. The directory is already locked.
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:510)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
2009-05-26 14:43:48,464 INFO org.apache.hadoop.ipc.Server: Stopping
server on 44444
2009-05-26 14:43:48,466 ERROR
org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException:
Cannot lock storage /tmp/hadoop-ithurs/dfs/namesecondary. The
directory is already locked.
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:510)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:273)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.doImportCheckpoint(FSImage.java:504)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:344)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:208)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:194)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)

2009-05-26 14:43:48,468 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at germapp/192.168.0.1
************************************************************/

any pointers/suggestions?
Thanks,
Raakhi

On 5/20/09, Aaron Kimball <aa...@cloudera.com> wrote:
> See this regarding instructions on configuring a 2NN on a separate machine
> from the NN:
> http://www.cloudera.com/blog/2009/02/10/multi-host-secondarynamenode-configuration/
>
> - Aaron
>
> On Thu, May 14, 2009 at 10:42 AM, Koji Noguchi
> <kn...@yahoo-inc.com>wrote:
>
>> Before 0.19, fsimage/edits were on the same directory.
>> So whenever secondary finishes checkpointing, it copies back the fsimage
>> while namenode still kept on writing to the edits file.
>>
>> Usually we observed some latency on the namenode side during that time.
>>
>> HADOOP-3948 would probably help after 0.19 or later.
>>
>> Koji
>>
>> -----Original Message-----
>> From: Brian Bockelman [mailto:bbockelm@cse.unl.edu]
>> Sent: Thursday, May 14, 2009 10:32 AM
>> To: core-user@hadoop.apache.org
>> Subject: Re: Setting up another machine as secondary node
>>
>> Hey Koji,
>>
>> It's an expensive operation - for the secondary namenode, not the
>> namenode itself, right?  I don't particularly care if I stress out a
>> dedicated node that doesn't have to respond to queries ;)
>>
>> Locally we checkpoint+backup fairly frequently (not 5 minutes ...
>> maybe less than the default hour) due to sheer paranoia of losing
>> metadata.
>>
>> Brian
>>
>> On May 14, 2009, at 12:25 PM, Koji Noguchi wrote:
>>
>> >> The secondary namenode takes a snapshot
>> >> at 5 minute (configurable) intervals,
>> >>
>> > This is a bit too aggressive.
>> > Checkpointing is still an expensive operation.
>> > I'd say every hour or even every day.
>> >
>> > Isn't the default 3600 seconds?
>> >
>> > Koji
>> >
>> > -----Original Message-----
>> > From: jason hadoop [mailto:jason.hadoop@gmail.com]
>> > Sent: Thursday, May 14, 2009 7:46 AM
>> > To: core-user@hadoop.apache.org
>> > Subject: Re: Setting up another machine as secondary node
>> >
>> > any machine put in the conf/masters file becomes a secondary namenode.
>> >
>> > At some point there was confusion on the safety of more than one
>> > machine,
>> > which I believe was settled, as many are safe.
>> >
>> > The secondary namenode takes a snapshot at 5 minute (configurable)
>> > intervals, rebuilds the fsimage and sends that back to the namenode.
>> > There is some performance advantage of having it on the local machine,
>> > and
>> > some safety advantage of having it on an alternate machine.
>> > Could someone who remembers speak up on the single vrs multiple
>> > secondary
>> > namenodes?
>> >
>> >
>> > On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com>
>> > wrote:
>> >
>> >> First of all, the secondary namenode is not a what you might think a
>> >> secondary is - it's not failover device.  It does make a copy of the
>> >> filesystem metadata periodically, and it integrates the edits into
>> >> the
>> >> image.  It does *not* provide failover.
>> >>
>> >> Second, you specify its IP address in hadoop-site.xml.  This is where
>> > you
>> >> can override the defaults set in hadoop-default.xml.
>> >>
>> >> dbr
>> >>
>> >> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
>> > <rakhi.khatwani@gmail.com
>> >>> wrote:
>> >>
>> >>> Hi,
>> >>>    I wanna set up a cluster of 5 nodes in such a way that
>> >>> node1 - master
>> >>> node2 - secondary namenode
>> >>> node3 - slave
>> >>> node4 - slave
>> >>> node5 - slave
>> >>>
>> >>>
>> >>> How do we go about that?
>> >>> there is no property in hadoop-env where i can set the ip-address
>> > for
>> >>> secondary name node.
>> >>>
>> >>> if i set node-1 and node-2 in masters, and when we start dfs, in
>> > both the
>> >>> m/cs, the namenode n secondary namenode processes r present. but i
>> > think
>> >>> only node1 is active.
>> >>> n my namenode fail over operation fails.
>> >>>
>> >>> ny suggesstions?
>> >>>
>> >>> Regards,
>> >>> Rakhi
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > Alpha Chapters of my book on Hadoop are available
>> > http://www.apress.com/book/view/9781430219422
>> > www.prohadoopbook.com a community for Hadoop Professionals
>>
>>
>

Re: Setting up another machine as secondary node

Posted by Aaron Kimball <aa...@cloudera.com>.
See this regarding instructions on configuring a 2NN on a separate machine
from the NN:
http://www.cloudera.com/blog/2009/02/10/multi-host-secondarynamenode-configuration/
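
The short version, as I remember it (treat this as a sketch and defer to the
post above; node1/node2 and the checkpoint path are placeholders): list the
2NN host in conf/masters on the NN, and in hadoop-site.xml on the 2NN tell it
where the NN's web interface lives, e.g.

  conf/masters on node1 (the NN):
  node2

  hadoop-site.xml on node2 (the 2NN):
  <property>
    <name>dfs.http.address</name>
    <value>node1:50070</value>
    <description>lets the secondary fetch fsimage/edits from the NN</description>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/path/on/node2/dfs/namesecondary</value>
  </property>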

- Aaron

On Thu, May 14, 2009 at 10:42 AM, Koji Noguchi <kn...@yahoo-inc.com>wrote:

> Before 0.19, fsimage/edits were on the same directory.
> So whenever secondary finishes checkpointing, it copies back the fsimage
> while namenode still kept on writing to the edits file.
>
> Usually we observed some latency on the namenode side during that time.
>
> HADOOP-3948 would probably help after 0.19 or later.
>
> Koji
>
> -----Original Message-----
> From: Brian Bockelman [mailto:bbockelm@cse.unl.edu]
> Sent: Thursday, May 14, 2009 10:32 AM
> To: core-user@hadoop.apache.org
> Subject: Re: Setting up another machine as secondary node
>
> Hey Koji,
>
> It's an expensive operation - for the secondary namenode, not the
> namenode itself, right?  I don't particularly care if I stress out a
> dedicated node that doesn't have to respond to queries ;)
>
> Locally we checkpoint+backup fairly frequently (not 5 minutes ...
> maybe less than the default hour) due to sheer paranoia of losing
> metadata.
>
> Brian
>
> On May 14, 2009, at 12:25 PM, Koji Noguchi wrote:
>
> >> The secondary namenode takes a snapshot
> >> at 5 minute (configurable) intervals,
> >>
> > This is a bit too aggressive.
> > Checkpointing is still an expensive operation.
> > I'd say every hour or even every day.
> >
> > Isn't the default 3600 seconds?
> >
> > Koji
> >
> > -----Original Message-----
> > From: jason hadoop [mailto:jason.hadoop@gmail.com]
> > Sent: Thursday, May 14, 2009 7:46 AM
> > To: core-user@hadoop.apache.org
> > Subject: Re: Setting up another machine as secondary node
> >
> > any machine put in the conf/masters file becomes a secondary namenode.
> >
> > At some point there was confusion on the safety of more than one
> > machine,
> > which I believe was settled, as many are safe.
> >
> > The secondary namenode takes a snapshot at 5 minute (configurable)
> > intervals, rebuilds the fsimage and sends that back to the namenode.
> > There is some performance advantage of having it on the local machine,
> > and
> > some safety advantage of having it on an alternate machine.
> > Could someone who remembers speak up on the single vrs multiple
> > secondary
> > namenodes?
> >
> >
> > On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com>
> > wrote:
> >
> >> First of all, the secondary namenode is not a what you might think a
> >> secondary is - it's not failover device.  It does make a copy of the
> >> filesystem metadata periodically, and it integrates the edits into
> >> the
> >> image.  It does *not* provide failover.
> >>
> >> Second, you specify its IP address in hadoop-site.xml.  This is where
> > you
> >> can override the defaults set in hadoop-default.xml.
> >>
> >> dbr
> >>
> >> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
> > <rakhi.khatwani@gmail.com
> >>> wrote:
> >>
> >>> Hi,
> >>>    I wanna set up a cluster of 5 nodes in such a way that
> >>> node1 - master
> >>> node2 - secondary namenode
> >>> node3 - slave
> >>> node4 - slave
> >>> node5 - slave
> >>>
> >>>
> >>> How do we go about that?
> >>> there is no property in hadoop-env where i can set the ip-address
> > for
> >>> secondary name node.
> >>>
> >>> if i set node-1 and node-2 in masters, and when we start dfs, in
> > both the
> >>> m/cs, the namenode n secondary namenode processes r present. but i
> > think
> >>> only node1 is active.
> >>> n my namenode fail over operation fails.
> >>>
> >>> ny suggesstions?
> >>>
> >>> Regards,
> >>> Rakhi
> >>>
> >>
> >
> >
> >
> > --
> > Alpha Chapters of my book on Hadoop are available
> > http://www.apress.com/book/view/9781430219422
> > www.prohadoopbook.com a community for Hadoop Professionals
>
>

RE: Setting up another machine as secondary node

Posted by Koji Noguchi <kn...@yahoo-inc.com>.
Before 0.19, fsimage/edits were in the same directory.
So whenever the secondary finishes checkpointing, it copies back the fsimage
while the namenode keeps writing to the edits file.

Usually we observed some latency on the namenode side during that time.

HADOOP-3948 would probably help in 0.19 or later.

Koji

-----Original Message-----
From: Brian Bockelman [mailto:bbockelm@cse.unl.edu] 
Sent: Thursday, May 14, 2009 10:32 AM
To: core-user@hadoop.apache.org
Subject: Re: Setting up another machine as secondary node

Hey Koji,

It's an expensive operation - for the secondary namenode, not the  
namenode itself, right?  I don't particularly care if I stress out a  
dedicated node that doesn't have to respond to queries ;)

Locally we checkpoint+backup fairly frequently (not 5 minutes ...  
maybe less than the default hour) due to sheer paranoia of losing  
metadata.

Brian

On May 14, 2009, at 12:25 PM, Koji Noguchi wrote:

>> The secondary namenode takes a snapshot
>> at 5 minute (configurable) intervals,
>>
> This is a bit too aggressive.
> Checkpointing is still an expensive operation.
> I'd say every hour or even every day.
>
> Isn't the default 3600 seconds?
>
> Koji
>
> -----Original Message-----
> From: jason hadoop [mailto:jason.hadoop@gmail.com]
> Sent: Thursday, May 14, 2009 7:46 AM
> To: core-user@hadoop.apache.org
> Subject: Re: Setting up another machine as secondary node
>
> any machine put in the conf/masters file becomes a secondary namenode.
>
> At some point there was confusion on the safety of more than one
> machine,
> which I believe was settled, as many are safe.
>
> The secondary namenode takes a snapshot at 5 minute (configurable)
> intervals, rebuilds the fsimage and sends that back to the namenode.
> There is some performance advantage of having it on the local machine,
> and
> some safety advantage of having it on an alternate machine.
> Could someone who remembers speak up on the single vrs multiple
> secondary
> namenodes?
>
>
> On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com>
> wrote:
>
>> First of all, the secondary namenode is not a what you might think a
>> secondary is - it's not failover device.  It does make a copy of the
>> filesystem metadata periodically, and it integrates the edits into  
>> the
>> image.  It does *not* provide failover.
>>
>> Second, you specify its IP address in hadoop-site.xml.  This is where
> you
>> can override the defaults set in hadoop-default.xml.
>>
>> dbr
>>
>> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
> <rakhi.khatwani@gmail.com
>>> wrote:
>>
>>> Hi,
>>>    I wanna set up a cluster of 5 nodes in such a way that
>>> node1 - master
>>> node2 - secondary namenode
>>> node3 - slave
>>> node4 - slave
>>> node5 - slave
>>>
>>>
>>> How do we go about that?
>>> there is no property in hadoop-env where i can set the ip-address
> for
>>> secondary name node.
>>>
>>> if i set node-1 and node-2 in masters, and when we start dfs, in
> both the
>>> m/cs, the namenode n secondary namenode processes r present. but i
> think
>>> only node1 is active.
>>> n my namenode fail over operation fails.
>>>
>>> ny suggesstions?
>>>
>>> Regards,
>>> Rakhi
>>>
>>
>
>
>
> -- 
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
> www.prohadoopbook.com a community for Hadoop Professionals


Re: Setting up another machine as secondary node

Posted by Brian Bockelman <bb...@cse.unl.edu>.
Hey Koji,

It's an expensive operation - for the secondary namenode, not the  
namenode itself, right?  I don't particularly care if I stress out a  
dedicated node that doesn't have to respond to queries ;)

Locally we checkpoint+backup fairly frequently (not 5 minutes ...  
maybe less than the default hour) due to sheer paranoia of losing  
metadata.

Brian

On May 14, 2009, at 12:25 PM, Koji Noguchi wrote:

>> The secondary namenode takes a snapshot
>> at 5 minute (configurable) intervals,
>>
> This is a bit too aggressive.
> Checkpointing is still an expensive operation.
> I'd say every hour or even every day.
>
> Isn't the default 3600 seconds?
>
> Koji
>
> -----Original Message-----
> From: jason hadoop [mailto:jason.hadoop@gmail.com]
> Sent: Thursday, May 14, 2009 7:46 AM
> To: core-user@hadoop.apache.org
> Subject: Re: Setting up another machine as secondary node
>
> any machine put in the conf/masters file becomes a secondary namenode.
>
> At some point there was confusion on the safety of more than one
> machine,
> which I believe was settled, as many are safe.
>
> The secondary namenode takes a snapshot at 5 minute (configurable)
> intervals, rebuilds the fsimage and sends that back to the namenode.
> There is some performance advantage of having it on the local machine,
> and
> some safety advantage of having it on an alternate machine.
> Could someone who remembers speak up on the single vrs multiple
> secondary
> namenodes?
>
>
> On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com>
> wrote:
>
>> First of all, the secondary namenode is not a what you might think a
>> secondary is - it's not failover device.  It does make a copy of the
>> filesystem metadata periodically, and it integrates the edits into  
>> the
>> image.  It does *not* provide failover.
>>
>> Second, you specify its IP address in hadoop-site.xml.  This is where
> you
>> can override the defaults set in hadoop-default.xml.
>>
>> dbr
>>
>> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
> <rakhi.khatwani@gmail.com
>>> wrote:
>>
>>> Hi,
>>>    I wanna set up a cluster of 5 nodes in such a way that
>>> node1 - master
>>> node2 - secondary namenode
>>> node3 - slave
>>> node4 - slave
>>> node5 - slave
>>>
>>>
>>> How do we go about that?
>>> there is no property in hadoop-env where i can set the ip-address
> for
>>> secondary name node.
>>>
>>> if i set node-1 and node-2 in masters, and when we start dfs, in
> both the
>>> m/cs, the namenode n secondary namenode processes r present. but i
> think
>>> only node1 is active.
>>> n my namenode fail over operation fails.
>>>
>>> ny suggesstions?
>>>
>>> Regards,
>>> Rakhi
>>>
>>
>
>
>
> -- 
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
> www.prohadoopbook.com a community for Hadoop Professionals


RE: Setting up another machine as secondary node

Posted by Koji Noguchi <kn...@yahoo-inc.com>.
> The secondary namenode takes a snapshot 
> at 5 minute (configurable) intervals,
>
This is a bit too aggressive.
Checkpointing is still an expensive operation.
I'd say every hour or even every day.

Isn't the default 3600 seconds?
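
For reference, the knobs live in hadoop-site.xml; roughly (the values shown
are, if I remember right, the stock defaults):

  <property>
    <name>fs.checkpoint.period</name>
    <value>3600</value>
    <description>seconds between checkpoints; 86400 for daily</description>
  </property>
  <property>
    <name>fs.checkpoint.size</name>
    <value>67108864</value>
    <description>also checkpoint once the edits log reaches this size</description>
  </property>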

Koji

-----Original Message-----
From: jason hadoop [mailto:jason.hadoop@gmail.com] 
Sent: Thursday, May 14, 2009 7:46 AM
To: core-user@hadoop.apache.org
Subject: Re: Setting up another machine as secondary node

any machine put in the conf/masters file becomes a secondary namenode.

At some point there was confusion on the safety of more than one
machine,
which I believe was settled, as many are safe.

The secondary namenode takes a snapshot at 5 minute (configurable)
intervals, rebuilds the fsimage and sends that back to the namenode.
There is some performance advantage of having it on the local machine,
and
some safety advantage of having it on an alternate machine.
Could someone who remembers speak up on the single vrs multiple
secondary
namenodes?


On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com>
wrote:

> First of all, the secondary namenode is not a what you might think a
> secondary is - it's not failover device.  It does make a copy of the
> filesystem metadata periodically, and it integrates the edits into the
> image.  It does *not* provide failover.
>
> Second, you specify its IP address in hadoop-site.xml.  This is where
you
> can override the defaults set in hadoop-default.xml.
>
> dbr
>
> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani
<rakhi.khatwani@gmail.com
> >wrote:
>
> > Hi,
> >     I wanna set up a cluster of 5 nodes in such a way that
> > node1 - master
> > node2 - secondary namenode
> > node3 - slave
> > node4 - slave
> > node5 - slave
> >
> >
> > How do we go about that?
> > there is no property in hadoop-env where i can set the ip-address
for
> > secondary name node.
> >
> > if i set node-1 and node-2 in masters, and when we start dfs, in
both the
> > m/cs, the namenode n secondary namenode processes r present. but i
think
> > only node1 is active.
> > n my namenode fail over operation fails.
> >
> > ny suggesstions?
> >
> > Regards,
> > Rakhi
> >
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals

Re: Setting up another machine as secondary node

Posted by jason hadoop <ja...@gmail.com>.
any machine put in the conf/masters file becomes a secondary namenode.

At some point there was confusion about the safety of listing more than one
machine, which I believe was settled: multiple secondaries are safe.

The secondary namenode takes a snapshot at 5 minute (configurable)
intervals, rebuilds the fsimage and sends that back to the namenode.
There is some performance advantage of having it on the local machine, and
some safety advantage of having it on an alternate machine.
Could someone who remembers speak up on single vs. multiple secondary
namenodes?
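
For example (host names purely illustrative), a conf/masters such as

  node2
  node5

would start a secondary namenode on each of those machines when
bin/start-dfs.sh is run.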


On Thu, May 14, 2009 at 6:07 AM, David Ritch <da...@gmail.com> wrote:

> First of all, the secondary namenode is not a what you might think a
> secondary is - it's not failover device.  It does make a copy of the
> filesystem metadata periodically, and it integrates the edits into the
> image.  It does *not* provide failover.
>
> Second, you specify its IP address in hadoop-site.xml.  This is where you
> can override the defaults set in hadoop-default.xml.
>
> dbr
>
> On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani <rakhi.khatwani@gmail.com
> >wrote:
>
> > Hi,
> >     I wanna set up a cluster of 5 nodes in such a way that
> > node1 - master
> > node2 - secondary namenode
> > node3 - slave
> > node4 - slave
> > node5 - slave
> >
> >
> > How do we go about that?
> > there is no property in hadoop-env where i can set the ip-address for
> > secondary name node.
> >
> > if i set node-1 and node-2 in masters, and when we start dfs, in both the
> > m/cs, the namenode n secondary namenode processes r present. but i think
> > only node1 is active.
> > n my namenode fail over operation fails.
> >
> > ny suggesstions?
> >
> > Regards,
> > Rakhi
> >
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals

Re: Setting up another machine as secondary node

Posted by David Ritch <da...@gmail.com>.
First of all, the secondary namenode is not what you might think a
secondary is - it's not a failover device.  It does make a copy of the
filesystem metadata periodically, and it integrates the edits into the
image.  It does *not* provide failover.

Second, you specify its IP address in hadoop-site.xml.  This is where you
can override the defaults set in hadoop-default.xml.
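
For example, a fragment along these lines (host names and ports are only
illustrative; the default values come from hadoop-default.xml):

  <property>
    <name>dfs.http.address</name>
    <value>node1:50070</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>node2:50090</value>
  </property>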

dbr

On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani <ra...@gmail.com>wrote:

> Hi,
>     I wanna set up a cluster of 5 nodes in such a way that
> node1 - master
> node2 - secondary namenode
> node3 - slave
> node4 - slave
> node5 - slave
>
>
> How do we go about that?
> there is no property in hadoop-env where i can set the ip-address for
> secondary name node.
>
> if i set node-1 and node-2 in masters, and when we start dfs, in both the
> m/cs, the namenode n secondary namenode processes r present. but i think
> only node1 is active.
> n my namenode fail over operation fails.
>
> ny suggesstions?
>
> Regards,
> Rakhi
>