Posted to user@hadoop.apache.org by Jean-Marc Spaggiari <je...@spaggiari.org> on 2012/12/01 03:11:33 UTC

Re: CheckPoint Node

Hi,

Is there a way to ask Hadoop to display its parameters?

I have updated the property as follows:
  <property>
    <name>dfs.name.dir</name>
    <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
  </property>

But even after I stop/start Hadoop, nothing is written to the USB
drive. So I'm wondering if there is a command line like bin/hadoop
--showparameters

Thanks,

JM

2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
> Perfect. Thanks again for your time!
>
> I will first add another drive on the Namenode because this will take
> 5 minutes. Then I will read about the migration from 1.0.3 to 2.0.x
> and most probably will use the zookeeper solution.
>
> This will take more time, so will be done over the week-end.
>
> I lost 2 hard drives this week (2 datanodes), so I'm now a bit
> concerned about the NameNode data. Just want to secure that a bit
> more.
>
> JM
>
> 2012/11/22, Harsh J <ha...@cloudera.com>:
>> Jean-Marc (Sorry if I've been spelling your name wrong),
>>
>> 0.94 does support Hadoop-2 already, and works pretty well with it, if
>> that is your only concern. You only need to use the right download (or
>> if you compile, use the -Dhadoop.profile=23 maven option).
>>
>> You will need to restart the NameNode for changes to the
>> dfs.name.dir property to take effect. A reasonably fast disk
>> is needed for quicker edit log writes (a few bytes in each round),
>> but a large or SSD-style disk is not a requisite. An external disk
>> would work fine too (instead of an NFS mount), as long as it is reliable.
>>
>> You do not need to copy data manually - just ensure that your NameNode
>> process user owns the directory and it will auto-populate the empty
>> directory on startup.
>>
>> Operationally speaking, in case 1 of the 2 disks fails, the NN Web UI (and
>> metrics as well) will indicate this (see the bottom of the NN UI page for an
>> example of what I am talking about) and the NN will continue to run with
>> the lone remaining disk. However, it's not a good idea to let it run for too
>> long without fixing/replacing the disk, for you will be losing out on
>> redundancy.
>>
>> On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari
>> <je...@spaggiari.org> wrote:
>>> Hi Harsh,
>>>
>>> Again, thanks a lot for all those details.
>>>
>>> I read the previous link and I totally understand the HA NameNode. I
>>> already have a zookeeper quorum (3 servers) that I will be able to
>>> re-use. However, I'm running HBase 0.94.2 which is not yet compatible
>>> (I think) with Hadoop 2.0.x. So I will have to go with a non-HA
>>> NameNode until I can migrate to a stable 0.96 HBase version.
>>>
>>> Can I "simply" add one directory to dfs.name.dir and restart
>>> my namenode? Is it going to feed all the required information into this
>>> directory? Or do I need to copy the data of the existing one into the
>>> new one before I restart it? Also, does it need a fast transfer rate?
>>> Or will an external hard drive (quick to be moved to another server if
>>> required) be enough?
>>>
>>>
>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>> Please follow the tips provided at
>>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3F and
>>>> http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F
>>>>
>>>> In short, if you use a non-HA NameNode setup:
>>>>
>>>> - Yes the NN is a very vital persistence point in running HDFS and its
>>>> data should be redundantly stored for safety.
>>>> - You should, in production, configure your NameNode's image and edits
>>>> disk (dfs.name.dir in 1.x+, or dfs.namenode.name.dir in 0.23+/2.x+) to
>>>> be a dedicated one with adequate free space for gradual growth, and
>>>> should configure multiple disks (with one off-machine NFS point highly
>>>> recommended for easy recovery) for adequate redundancy.
>>>>
>>>> If you instead use a HA NameNode setup (I'd highly recommend doing
>>>> this since it is now available), the presence of > 1 NameNodes and the
>>>> journal log mount or quorum setup would automatically act as
>>>> safeguards for the FS metadata.
>>>>
>>>> On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari
>>>> <je...@spaggiari.org> wrote:
>>>>> Hi Harsh,
>>>>>
>>>>> Thanks for pointing me to this link. I will take a close look at it.
>>>>>
>>>>> So with 1.x and 0.23.x, what's the impact on the data if the namenode
>>>>> server's hard drive dies? Is there any critical data stored locally? Or do I
>>>>> simply need to build a new namenode, start it and restart all my
>>>>> datanodes to get my data back?
>>>>>
>>>>> I can deal with my application not being available, but losing data
>>>>> can be a bigger issue.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> JM
>>>>>
>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>> Hey Jean,
>>>>>>
>>>>>> The 1.x, 0.23.x release lines both don't have NameNode HA features.
>>>>>> The current 2.x releases carry HA-NN abilities, and this is
>>>>>> documented
>>>>>> at
>>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>>>>>
>>>>>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari
>>>>>> <je...@spaggiari.org> wrote:
>>>>>>> Replying to myself ;)
>>>>>>>
>>>>>>> By digging a bit more I figured out that the 1.0 version is older
>>>>>>> than the 0.23.4 version, and that backupnodes are in 0.23.4.
>>>>>>> Secondarynamenodes in 1.0 are now deprecated.
>>>>>>>
>>>>>>> I'm still a bit mixed up on the way to achieve HA for the namenode
>>>>>>> (1.0 or 0.23.4) but I will continue to dig over the internet.
>>>>>>>
>>>>>>> JM
>>>>>>>
>>>>>>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm reading a bit about hadoop and I'm trying to increase the HA of
>>>>>>>> my
>>>>>>>> current cluster.
>>>>>>>>
>>>>>>>> Today I have 8 datanodes and one namenode.
>>>>>>>>
>>>>>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can see
>>>>>>>> that
>>>>>>>> a
>>>>>>>> Checkpoint node might be a good idea.
>>>>>>>>
>>>>>>>> So I'm trying to start a checkpoint node. I looked at the hadoop
>>>>>>>> online doc. There is a link to describe the command usage ("For
>>>>>>>> command usage, see namenode.") but this link is not working. Also,
>>>>>>>> if I try hadoop-daemon.sh start namenode -checkpoint as described
>>>>>>>> in the documentation, it's not starting.
>>>>>>>>
>>>>>>>> So I'm wondering, is there anywhere I can find up-to-date
>>>>>>>> documentation about the checkpoint node? I will most probably try
>>>>>>>> the BackupNode.
>>>>>>>>
>>>>>>>> I'm using hadoop 1.0.3. The options I have to start on this version
>>>>>>>> are namenode, secondarynamenode, datanode, dfsadmin, mradmin, fsck
>>>>>>>> and
>>>>>>>> fs. Should I start some secondarynamenodes instead of backupnode
>>>>>>>> and
>>>>>>>> checkpointnode?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> JM
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>
>>
>>
>> --
>> Harsh J
>>
>

Re: CheckPoint Node

Posted by Harsh J <ha...@cloudera.com>.
Hi ac,

It's usually better to start your own thread for such purposes. However,
my response is inline.

On Sat, Dec 1, 2012 at 9:41 AM, ac@hsk.hk <ac...@hsk.hk> wrote:
> Hi JM,
>
> If you migrate 1.0.3 to 2.0.x, would you mind sharing your migration steps? I ask because I also have a 1.0.4 cluster (Ubuntu 12.04, Hadoop 1.0.4, HBase 0.94.2 and ZooKeeper 3.4.4) and want to migrate it to 2.0.x in order to cope with a hardware failure of the NameNode.

The HDFS upgrade should be simple enough to do: invoke the namenode's
first startup with the -upgrade option.
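
In rough outline, the NameNode side of that would be something like the
sketch below (hedged: exact script locations depend on your 2.0.x
layout, and you should back up the name directory before you begin):

  # stop the 1.0.3 cluster, install the 2.0.x binaries, and point
  # dfs.namenode.name.dir at the existing name directory; then:
  sbin/hadoop-daemon.sh start namenode -upgrade

  # once you have verified the upgraded cluster:
  bin/hdfs dfsadmin -finalizeUpgrade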

For MR, since Apache Hadoop 2.0.x carries MR2 (which runs atop YARN
[1]), you would have to recompile any custom jobs you have against an
Apache Hadoop 2.x dependency, and use these new builds.
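
For example, if your job is built with Maven, the rebuild largely
amounts to repointing the Hadoop dependency and repackaging (a sketch;
it assumes your pom parameterizes the version as ${hadoop.version}):

  # rebuild the job jar against a 2.x release such as 2.0.2-alpha
  mvn clean package -Dhadoop.version=2.0.2-alpha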

For HBase, following HBase's hadoop-2 build instructions should
give you a usable tarball to deploy with.
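
As noted earlier in this thread, for 0.94 that is driven by the
hadoop.profile Maven option, along these lines (a sketch; check the
0.94 build documentation for the exact packaging goal):

  # build HBase 0.94 against the hadoop-2 profile instead of the default
  mvn clean install -DskipTests -Dhadoop.profile=23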

If you are open to using packages, you can also look at Apache
Bigtop [2], as they have RPM/DEB scripts for Hadoop 2.x and other
ecosystem components that you can leverage.

> I have a testing cluster ready for the migration test.
>
> Thanks
> ac

[1] - http://www.cloudera.com/blog/2012/10/mr2-and-yarn-briefly-explained/
[2] - http://bigtop.apache.org

--
Harsh J

Re: CheckPoint Node

Posted by "ac@hsk.hk" <ac...@hsk.hk>.
Hi JM,

If you migrate 1.0.3 to 2.0.x, would you mind sharing your migration steps? I ask because I also have a 1.0.4 cluster (Ubuntu 12.04, Hadoop 1.0.4, HBase 0.94.2 and ZooKeeper 3.4.4) and want to migrate it to 2.0.x in order to cope with a hardware failure of the NameNode.

I have a testing cluster ready for the migration test.
 
Thanks
ac



On 1 Dec 2012, at 10:25 AM, Jean-Marc Spaggiari wrote:

> Sorry about that. My fault.
> 
> I had put this in the core-site.xml file, but it should be in hdfs-site.xml...
> 
> I moved it and it's now working fine.
> 
> Thanks.
> 
> JM
> 
> 2012/11/30, Jean-Marc Spaggiari <je...@spaggiari.org>:
>> Hi,
>> 
>> Is there a way to ask Hadoop to display its parameters?
>> 
>> I have updated the property as follows:
>>  <property>
>>    <name>dfs.name.dir</name>
>>    <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
>>  </property>
>> 
>> But even after I stop/start Hadoop, nothing is written to the USB
>> drive. So I'm wondering if there is a command line like bin/hadoop
>> --showparameters
>> 
>> Thanks,
>> 
>> JM
>> 
>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>> Perfect. Thanks again for your time!
>>> 
>>> I will first add another drive on the Namenode because this will take
>>> 5 minutes. Then I will read about the migration from 1.0.3 to 2.0.x
>>> and most probably will use the zookeeper solution.
>>> 
>>> This will take more time, so will be done over the week-end.
>>> 
>>> I lost 2 hard drives this week (2 datanodes), so I'm now a bit
>>> concerned about the NameNode data. Just want to secure that a bit
>>> more.
>>> 
>>> JM
>>> 
>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>> Jean-Marc (Sorry if I've been spelling your name wrong),
>>>> 
>>>> 0.94 does support Hadoop-2 already, and works pretty well with it, if
>>>> that is your only concern. You only need to use the right download (or
>>>> if you compile, use the -Dhadoop.profile=23 maven option).
>>>> 
>>>> You will need to restart the NameNode for changes to the
>>>> dfs.name.dir property to take effect. A reasonably fast disk
>>>> is needed for quicker edit log writes (a few bytes in each round),
>>>> but a large or SSD-style disk is not a requisite. An external disk
>>>> would work fine too (instead of an NFS mount), as long as it is reliable.
>>>> 
>>>> You do not need to copy data manually - just ensure that your NameNode
>>>> process user owns the directory and it will auto-populate the empty
>>>> directory on startup.
>>>> 
>>>> Operationally speaking, in case 1 of the 2 disks fails, the NN Web UI (and
>>>> metrics as well) will indicate this (see the bottom of the NN UI page for an
>>>> example of what I am talking about) and the NN will continue to run with
>>>> the lone remaining disk. However, it's not a good idea to let it run for too
>>>> long without fixing/replacing the disk, for you will be losing out on
>>>> redundancy.
>>>> 
>>>> On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari
>>>> <je...@spaggiari.org> wrote:
>>>>> Hi Harsh,
>>>>> 
>>>>> Again, thanks a lot for all those details.
>>>>> 
>>>>> I read the previous link and I totally understand the HA NameNode. I
>>>>> already have a zookeeper quorum (3 servers) that I will be able to
>>>>> re-use. However, I'm running HBase 0.94.2 which is not yet compatible
>>>>> (I think) with Hadoop 2.0.x. So I will have to go with a non-HA
>>>>> NameNode until I can migrate to a stable 0.96 HBase version.
>>>>> 
>>>>> Can I "simply" add one directory to dfs.name.dir and restart
>>>>> my namenode? Is it going to feed all the required information into this
>>>>> directory? Or do I need to copy the data of the existing one into the
>>>>> new one before I restart it? Also, does it need a fast transfer rate?
>>>>> Or will an external hard drive (quick to be moved to another server if
>>>>> required) be enough?
>>>>> 
>>>>> 
>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>> Please follow the tips provided at
>>>>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3F and
>>>>>> http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F
>>>>>> 
>>>>>> In short, if you use a non-HA NameNode setup:
>>>>>> 
>>>>>> - Yes the NN is a very vital persistence point in running HDFS and its
>>>>>> data should be redundantly stored for safety.
>>>>>> - You should, in production, configure your NameNode's image and edits
>>>>>> disk (dfs.name.dir in 1.x+, or dfs.namenode.name.dir in 0.23+/2.x+) to
>>>>>> be a dedicated one with adequate free space for gradual growth, and
>>>>>> should configure multiple disks (with one off-machine NFS point highly
>>>>>> recommended for easy recovery) for adequate redundancy.
>>>>>> 
>>>>>> If you instead use a HA NameNode setup (I'd highly recommend doing
>>>>>> this since it is now available), the presence of > 1 NameNodes and the
>>>>>> journal log mount or quorum setup would automatically act as
>>>>>> safeguards for the FS metadata.
>>>>>> 
>>>>>> On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari
>>>>>> <je...@spaggiari.org> wrote:
>>>>>>> Hi Harsh,
>>>>>>> 
>>>>>>> Thanks for pointing me to this link. I will take a close look at it.
>>>>>>> 
>>>>>>> So with 1.x and 0.23.x, what's the impact on the data if the namenode
>>>>>>> server's hard drive dies? Is there any critical data stored locally? Or
>>>>>>> do I simply need to build a new namenode, start it and restart all my
>>>>>>> datanodes to get my data back?
>>>>>>> 
>>>>>>> I can deal with my application not being available, but losing data
>>>>>>> can be a bigger issue.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> JM
>>>>>>> 
>>>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>>>> Hey Jean,
>>>>>>>> 
>>>>>>>> The 1.x, 0.23.x release lines both don't have NameNode HA features.
>>>>>>>> The current 2.x releases carry HA-NN abilities, and this is
>>>>>>>> documented
>>>>>>>> at
>>>>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>>>>>>> 
>>>>>>>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari
>>>>>>>> <je...@spaggiari.org> wrote:
>>>>>>>>> Replying to myself ;)
>>>>>>>>> 
>>>>>>>>> By digging a bit more I figured out that the 1.0 version is older
>>>>>>>>> than the 0.23.4 version, and that backupnodes are in 0.23.4.
>>>>>>>>> Secondarynamenodes in 1.0 are now deprecated.
>>>>>>>>> 
>>>>>>>>> I'm still a bit mixed up on the way to achieve HA for the namenode
>>>>>>>>> (1.0 or 0.23.4) but I will continue to dig over the internet.
>>>>>>>>> 
>>>>>>>>> JM
>>>>>>>>> 
>>>>>>>>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> I'm reading a bit about hadoop and I'm trying to increase the HA
>>>>>>>>>> of
>>>>>>>>>> my
>>>>>>>>>> current cluster.
>>>>>>>>>> 
>>>>>>>>>> Today I have 8 datanodes and one namenode.
>>>>>>>>>> 
>>>>>>>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can see
>>>>>>>>>> that
>>>>>>>>>> a
>>>>>>>>>> Checkpoint node might be a good idea.
>>>>>>>>>> 
>>>>>>>>>> So I'm trying to start a checkpoint node. I looked at the hadoop
>>>>>>>>>> online doc. There is a link to describe the command usage ("For
>>>>>>>>>> command usage, see namenode.") but this link is not working. Also,
>>>>>>>>>> if I try hadoop-daemon.sh start namenode -checkpoint as described
>>>>>>>>>> in the documentation, it's not starting.
>>>>>>>>>> 
>>>>>>>>>> So I'm wondering, is there anywhere I can find up-to-date
>>>>>>>>>> documentation about the checkpoint node? I will most probably try
>>>>>>>>>> the BackupNode.
>>>>>>>>>> 
>>>>>>>>>> I'm using hadoop 1.0.3. The options I have to start on this
>>>>>>>>>> version
>>>>>>>>>> are namenode, secondarynamenode, datanode, dfsadmin, mradmin, fsck
>>>>>>>>>> and
>>>>>>>>>> fs. Should I start some secondarynamenodes instead of backupnode
>>>>>>>>>> and
>>>>>>>>>> checkpointnode?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>> JM
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Harsh J
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Harsh J
>>>> 
>>> 
>> 


Re: CheckPoint Node

Posted by Harsh J <ha...@cloudera.com>.
Hi,

On your question of how you may see configured values, I went over
some ways for Andy previously here:
http://search-hadoop.com/m/cmcAQ1FlzFp
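
One quick way from the command line is to have Hadoop's Configuration
class dump whatever it has loaded (a sketch: this prints the core
resources, i.e. core-default.xml plus core-site.xml, which also makes a
property that landed in the wrong file easy to spot):

  # dump the resolved core configuration as XML
  bin/hadoop org.apache.hadoop.conf.Configuration

  # on 2.x there is also a dedicated tool for single keys, e.g.:
  #   bin/hdfs getconf -confKey dfs.namenode.name.dir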

On Sat, Dec 1, 2012 at 7:55 AM, Jean-Marc Spaggiari
<je...@spaggiari.org> wrote:
> Sorry about that. My fault.
>
> I had put this in the core-site.xml file, but it should be in hdfs-site.xml...
>
> I moved it and it's now working fine.
>
> Thanks.
>
> JM
>
> 2012/11/30, Jean-Marc Spaggiari <je...@spaggiari.org>:
>> Hi,
>>
>> Is there a way to ask Hadoop to display its parameters?
>>
>> I have updated the property as follows:
>>   <property>
>>     <name>dfs.name.dir</name>
>>     <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
>>   </property>
>>
>> But even after I stop/start Hadoop, nothing is written to the USB
>> drive. So I'm wondering if there is a command line like bin/hadoop
>> --showparameters
>>
>> Thanks,
>>
>> JM
>>
>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>> Perfect. Thanks again for your time!
>>>
>>> I will first add another drive on the Namenode because this will take
>>> 5 minutes. Then I will read about the migration from 1.0.3 to 2.0.x
>>> and most probably will use the zookeeper solution.
>>>
>>> This will take more time, so will be done over the week-end.
>>>
>>> I lost 2 hard drives this week (2 datanodes), so I'm now a bit
>>> concerned about the NameNode data. Just want to secure that a bit
>>> more.
>>>
>>> JM
>>>
>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>> Jean-Marc (Sorry if I've been spelling your name wrong),
>>>>
>>>> 0.94 does support Hadoop-2 already, and works pretty well with it, if
>>>> that is your only concern. You only need to use the right download (or
>>>> if you compile, use the -Dhadoop.profile=23 maven option).
>>>>
>>>> You will need to restart the NameNode for changes to the
>>>> dfs.name.dir property to take effect. A reasonably fast disk
>>>> is needed for quicker edit log writes (a few bytes in each round),
>>>> but a large or SSD-style disk is not a requisite. An external disk
>>>> would work fine too (instead of an NFS mount), as long as it is reliable.
>>>>
>>>> You do not need to copy data manually - just ensure that your NameNode
>>>> process user owns the directory and it will auto-populate the empty
>>>> directory on startup.
>>>>
>>>> Operationally speaking, in case 1 of the 2 disks fails, the NN Web UI (and
>>>> metrics as well) will indicate this (see the bottom of the NN UI page for an
>>>> example of what I am talking about) and the NN will continue to run with
>>>> the lone remaining disk. However, it's not a good idea to let it run for too
>>>> long without fixing/replacing the disk, for you will be losing out on
>>>> redundancy.
>>>>
>>>> On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari
>>>> <je...@spaggiari.org> wrote:
>>>>> Hi Harsh,
>>>>>
>>>>> Again, thanks a lot for all those details.
>>>>>
>>>>> I read the previous link and I totally understand the HA NameNode. I
>>>>> already have a zookeeper quorum (3 servers) that I will be able to
>>>>> re-use. However, I'm running HBase 0.94.2 which is not yet compatible
>>>>> (I think) with Hadoop 2.0.x. So I will have to go with a non-HA
>>>>> NameNode until I can migrate to a stable 0.96 HBase version.
>>>>>
>>>>> Can I "simply" add one directory to dfs.name.dir and restart
>>>>> my namenode? Is it going to feed all the required information into this
>>>>> directory? Or do I need to copy the data of the existing one into the
>>>>> new one before I restart it? Also, does it need a fast transfer rate?
>>>>> Or will an external hard drive (quick to be moved to another server if
>>>>> required) be enough?
>>>>>
>>>>>
>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>> Please follow the tips provided at
>>>>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3F and
>>>>>> http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F
>>>>>>
>>>>>> In short, if you use a non-HA NameNode setup:
>>>>>>
>>>>>> - Yes the NN is a very vital persistence point in running HDFS and its
>>>>>> data should be redundantly stored for safety.
>>>>>> - You should, in production, configure your NameNode's image and edits
>>>>>> disk (dfs.name.dir in 1.x+, or dfs.namenode.name.dir in 0.23+/2.x+) to
>>>>>> be a dedicated one with adequate free space for gradual growth, and
>>>>>> should configure multiple disks (with one off-machine NFS point highly
>>>>>> recommended for easy recovery) for adequate redundancy.
>>>>>>
>>>>>> If you instead use a HA NameNode setup (I'd highly recommend doing
>>>>>> this since it is now available), the presence of > 1 NameNodes and the
>>>>>> journal log mount or quorum setup would automatically act as
>>>>>> safeguards for the FS metadata.
>>>>>>
>>>>>> On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari
>>>>>> <je...@spaggiari.org> wrote:
>>>>>>> Hi Harsh,
>>>>>>>
>>>>>>> Thanks for pointing me to this link. I will take a close look at it.
>>>>>>>
>>>>>>> So with 1.x and 0.23.x, what's the impact on the data if the namenode
>>>>>>> server's hard drive dies? Is there any critical data stored locally? Or
>>>>>>> do I simply need to build a new namenode, start it and restart all my
>>>>>>> datanodes to get my data back?
>>>>>>>
>>>>>>> I can deal with my application not being available, but losing data
>>>>>>> can be a bigger issue.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> JM
>>>>>>>
>>>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>>>> Hey Jean,
>>>>>>>>
>>>>>>>> The 1.x, 0.23.x release lines both don't have NameNode HA features.
>>>>>>>> The current 2.x releases carry HA-NN abilities, and this is
>>>>>>>> documented
>>>>>>>> at
>>>>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>>>>>>>
>>>>>>>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari
>>>>>>>> <je...@spaggiari.org> wrote:
>>>>>>>>> Replying to myself ;)
>>>>>>>>>
>>>>>>>>> By digging a bit more I figured out that the 1.0 version is older
>>>>>>>>> than the 0.23.4 version, and that backupnodes are in 0.23.4.
>>>>>>>>> Secondarynamenodes in 1.0 are now deprecated.
>>>>>>>>>
>>>>>>>>> I'm still a bit mixed up on the way to achieve HA for the namenode
>>>>>>>>> (1.0 or 0.23.4) but I will continue to dig over the internet.
>>>>>>>>>
>>>>>>>>> JM
>>>>>>>>>
>>>>>>>>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm reading a bit about hadoop and I'm trying to increase the HA
>>>>>>>>>> of
>>>>>>>>>> my
>>>>>>>>>> current cluster.
>>>>>>>>>>
>>>>>>>>>> Today I have 8 datanodes and one namenode.
>>>>>>>>>>
>>>>>>>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can see
>>>>>>>>>> that
>>>>>>>>>> a
>>>>>>>>>> Checkpoint node might be a good idea.
>>>>>>>>>>
>>>>>>>>>> So I'm trying to start a checkpoint node. I looked at the hadoop
>>>>>>>>>> online doc. There is a link to describe the command usage ("For
>>>>>>>>>> command usage, see namenode.") but this link is not working. Also,
>>>>>>>>>> if I try hadoop-daemon.sh start namenode -checkpoint as described
>>>>>>>>>> in the documentation, it's not starting.
>>>>>>>>>>
>>>>>>>>>> So I'm wondering, is there anywhere I can find up-to-date
>>>>>>>>>> documentation about the checkpoint node? I will most probably try
>>>>>>>>>> the BackupNode.
>>>>>>>>>>
>>>>>>>>>> I'm using hadoop 1.0.3. The options I have to start on this
>>>>>>>>>> version
>>>>>>>>>> are namenode, secondarynamenode, datanode, dfsadmin, mradmin, fsck
>>>>>>>>>> and
>>>>>>>>>> fs. Should I start some secondarynamenodes instead of backupnode
>>>>>>>>>> and
>>>>>>>>>> checkpointnode?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> JM
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>



-- 
Harsh J

Re: CheckPoint Node

Posted by "ac@hsk.hk" <ac...@hsk.hk>.
Hi JM,

If you migrate 1.0.3 to 2.0.x, could you mind to share your migration steps? it is because I also have a 1.0.4 cluster (Ubuntu 12.04, Hadoop 1.0.4, Hbase 0.94.2 and ZooKeeper 3.4.4 ) and want to migrate it to 2.0.x in order to avoid the hardware failure of the NameNode.

I have a testing cluster ready for the migration test.
 
Thanks
ac



On 1 Dec 2012, at 10:25 AM, Jean-Marc Spaggiari wrote:

> Sorry about that. My fault.
> 
> I have put this on the core-site.xml file but should be on the hdfs-site.xml...
> 
> I moved it and it's now working fine.
> 
> Thanks.
> 
> JM
> 
> 2012/11/30, Jean-Marc Spaggiari <je...@spaggiari.org>:
>> Hi,
>> 
>> Is there a way to ask Hadoop to display its parameters?
>> 
>> I have updated the property as followed:
>>  <property>
>>    <name>dfs.name.dir</name>
>>    <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
>>  </property>
>> 
>> But even if I stop/start hadoop, there is nothing written on the usb
>> drive. So I'm wondering if there is a command line like bin/hadoop
>> --showparameters
>> 
>> Thanks,
>> 
>> JM
>> 
>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>> Perfect. Thanks again for your time!
>>> 
>>> I will first add another drive on the Namenode because this will take
>>> 5 minutes. Then I will read about the migration from 1.0.3 to 2.0.x
>>> and most probably will use the zookeeper solution.
>>> 
>>> This will take more time, so will be done over the week-end.
>>> 
>>> I lost 2 hard drives this week (2 datanodes), so I'm not a bit
>>> concerned about the NameNode data. Just want to secure that a bit
>>> more.
>>> 
>>> JM
>>> 
>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>> Jean-Marc (Sorry if I've been spelling your name wrong),
>>>> 
>>>> 0.94 does support Hadoop-2 already, and works pretty well with it, if
>>>> that is your only concern. You only need to use the right download (or
>>>> if you compile, use the -Dhadoop.profile=23 maven option).
>>>> 
>>>> You will need to restart the NameNode to make changes to the
>>>> dfs.name.dir property and set it into effect. A reasonably fast disk
>>>> is needed for quicker edit log writes (few bytes worth in each round)
>>>> but a large, or SSD-style disk is not a requisite. An external disk
>>>> would work fine too (instead of an NFS), as long as it is reliable.
>>>> 
>>>> You do not need to copy data manually - just ensure that your NameNode
>>>> process user owns the directory and it will auto-populate the empty
>>>> directory on startup.
>>>> 
>>>> Operationally speaking, in case 1/2 disk fails, the NN Web UI (and
>>>> metrics as well) will indicate this (see bottom of NN UI page for an
>>>> example of what am talking about) but the NN will continue to run with
>>>> the lone remaining disk, but its not a good idea to let it run for too
>>>> long without fixing/replacing the disk, for you will be losing out on
>>>> redundancy.
>>>> 
>>>> On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari
>>>> <je...@spaggiari.org> wrote:
>>>>> Hi Harsh,
>>>>> 
>>>>> Again, thanks a lot for all those details.
>>>>> 
>>>>> I read the previous link and I totally understand the HA NameNode. I
>>>>> already have a zookeeper quorum (3 servers) that I will be able to
>>>>> re-use. However, I'm running HBase 0.94.2 which is not yet compatible
>>>>> (I think) with Hadoop 2.0.x. So I will have to go with a non-HA
>>>>> NameNode until I can migrate to a stable 0.96 HBase version.
>>>>> 
>>>>> Can I "simply" add one directory to dfs.name.dir and restart
>>>>> my namenode? Is it going to feed all the required information in this
>>>>> directory? Or do I need to copy the data of the existing one in the
>>>>> new one before I restart it? Also, does it need a fast transfert rate?
>>>>> Or will an exteral hard drive (quick to be moved to another server if
>>>>> required) be enought?
>>>>> 
>>>>> 
>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>> Please follow the tips provided at
>>>>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3Fand
>>>>>> http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F
>>>>>> 
>>>>>> In short, if you use a non-HA NameNode setup:
>>>>>> 
>>>>>> - Yes the NN is a very vital persistence point in running HDFS and its
>>>>>> data should be redundantly stored for safety.
>>>>>> - You should, in production, configure your NameNode's image and edits
>>>>>> disk (dfs.name.dir in 1.x+, or dfs.namenode.name.dir in 0.23+/2.x+) to
>>>>>> be a dedicated one with adequate free space for gradual growth, and
>>>>>> should configure multiple disks (with one off-machine NFS point highly
>>>>>> recommended for easy recovery) for adequate redundancy.
>>>>>> 
>>>>>> If you instead use a HA NameNode setup (I'd highly recommend doing
>>>>>> this since it is now available), the presence of > 1 NameNodes and the
>>>>>> journal log mount or quorum setup would automatically act as
>>>>>> safeguards for the FS metadata.
>>>>>> 
>>>>>> On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari
>>>>>> <je...@spaggiari.org> wrote:
>>>>>>> Hi Harsh,
>>>>>>> 
>>>>>>> Thanks for pointing me to this link. I will take a close look at it.
>>>>>>> 
>>>>>>> So with 1.x and 0.23.x, what's the impact on the data if the namenode
>>>>>>> server hard-drive die? Is there any critical data stored locally? Or
>>>>>>> I
>>>>>>> simply need to build a new namenode, start it and restart all my
>>>>>>> namenodes to find my data back?
>>>>>>> 
>>>>>>> I can deal with my application not beeing available, but loosing data
>>>>>>> can be a bigger issue.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> JM
>>>>>>> 
>>>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>>>> Hey Jean,
>>>>>>>> 
>>>>>>>> The 1.x, 0.23.x release lines both don't have NameNode HA features.
>>>>>>>> The current 2.x releases carry HA-NN abilities, and this is
>>>>>>>> documented
>>>>>>>> at
>>>>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>>>>>>> 
>>>>>>>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari
>>>>>>>> <je...@spaggiari.org> wrote:
>>>>>>>>> Replying to myself ;)
>>>>>>>>> 
>>>>>>>>> By digging a bit more I figured that 1.0 version is older than
>>>>>>>>> 0.23.4
>>>>>>>>> version and that backupnodes are on 0.23.4. Secondarynamenodes on
>>>>>>>>> 1.0
>>>>>>>>> are now deprecated.
>>>>>>>>> 
>>>>>>>>> I'm still a bit mixed up on the way to achieve HA for the namenode
>>>>>>>>> (1.0 or 0.23.4) but I will continue to dig over internet.
>>>>>>>>> 
>>>>>>>>> JM
>>>>>>>>> 
>>>>>>>>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> I'm reading a bit about hadoop and I'm trying to increase the HA
>>>>>>>>>> of
>>>>>>>>>> my
>>>>>>>>>> current cluster.
>>>>>>>>>> 
>>>>>>>>>> Today I have 8 datanodes and one namenode.
>>>>>>>>>> 
>>>>>>>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can see
>>>>>>>>>> that
>>>>>>>>>> a
>>>>>>>>>> Checkpoint node might be a good idea.
>>>>>>>>>> 
>>>>>>>>>> So I'm trying to start a checkpoint node. I looked at the hadoop
>>>>>>>>>> online doc. There is a link toe describe the command usage "For
>>>>>>>>>> command usage, see namenode." but this link is not working. Also,
>>>>>>>>>> if
>>>>>>>>>> I
>>>>>>>>>> try hadoop-deamon.sh start namenode -checkpoint as discribed in
>>>>>>>>>> the
>>>>>>>>>> documentation, it's not starting.
>>>>>>>>>> 
>>>>>>>>>> So I'n wondering, is there anywhere where I can find up to date
>>>>>>>>>> documentation about the checkpoint node? I will most probably try
>>>>>>>>>> the
>>>>>>>>>> BackupNode.
>>>>>>>>>> 
>>>>>>>>>> I'm using hadoop 1.0.3. The options I have to start on this
>>>>>>>>>> version
>>>>>>>>>> are namenode, secondarynamenode, datanode, dfsadmin, mradmin, fsck
>>>>>>>>>> and
>>>>>>>>>> fs. Should I start some secondarynamenodes instead of backupnode
>>>>>>>>>> and
>>>>>>>>>> checkpointnode?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>> JM
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Harsh J
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Harsh J
>>>> 
>>> 
>> 


Re: CheckPoint Node

Posted by Harsh J <ha...@cloudera.com>.
Hi,

On your question of how you may see configured values, I went over
some ways for Andy previously here:
http://search-hadoop.com/m/cmcAQ1FlzFp

On Sat, Dec 1, 2012 at 7:55 AM, Jean-Marc Spaggiari
<je...@spaggiari.org> wrote:
> Sorry about that. My fault.
>
> I have put this on the core-site.xml file but should be on the hdfs-site.xml...
>
> I moved it and it's now working fine.
>
> Thanks.
>
> JM
>
> 2012/11/30, Jean-Marc Spaggiari <je...@spaggiari.org>:
>> Hi,
>>
>> Is there a way to ask Hadoop to display its parameters?
>>
>> I have updated the property as followed:
>>   <property>
>>     <name>dfs.name.dir</name>
>>     <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
>>   </property>
>>
>> But even if I stop/start hadoop, there is nothing written on the usb
>> drive. So I'm wondering if there is a command line like bin/hadoop
>> --showparameters
>>
>> Thanks,
>>
>> JM
>>
>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>> Perfect. Thanks again for your time!
>>>
>>> I will first add another drive on the Namenode because this will take
>>> 5 minutes. Then I will read about the migration from 1.0.3 to 2.0.x
>>> and most probably will use the zookeeper solution.
>>>
>>> This will take more time, so will be done over the week-end.
>>>
>>> I lost 2 hard drives this week (2 datanodes), so I'm not a bit
>>> concerned about the NameNode data. Just want to secure that a bit
>>> more.
>>>
>>> JM
>>>
>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>> Jean-Marc (Sorry if I've been spelling your name wrong),
>>>>
>>>> 0.94 does support Hadoop-2 already, and works pretty well with it, if
>>>> that is your only concern. You only need to use the right download (or
>>>> if you compile, use the -Dhadoop.profile=23 maven option).
>>>>
>>>> You will need to restart the NameNode to make changes to the
>>>> dfs.name.dir property and set it into effect. A reasonably fast disk
>>>> is needed for quicker edit log writes (few bytes worth in each round)
>>>> but a large, or SSD-style disk is not a requisite. An external disk
>>>> would work fine too (instead of an NFS), as long as it is reliable.
>>>>
>>>> You do not need to copy data manually - just ensure that your NameNode
>>>> process user owns the directory and it will auto-populate the empty
>>>> directory on startup.
>>>>
>>>> Operationally speaking, in case 1/2 disk fails, the NN Web UI (and
>>>> metrics as well) will indicate this (see bottom of NN UI page for an
>>>> example of what am talking about) but the NN will continue to run with
>>>> the lone remaining disk, but its not a good idea to let it run for too
>>>> long without fixing/replacing the disk, for you will be losing out on
>>>> redundancy.
>>>>
>>>> On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari
>>>> <je...@spaggiari.org> wrote:
>>>>> Hi Harsh,
>>>>>
>>>>> Again, thanks a lot for all those details.
>>>>>
>>>>> I read the previous link and I totally understand the HA NameNode. I
>>>>> already have a zookeeper quorum (3 servers) that I will be able to
>>>>> re-use. However, I'm running HBase 0.94.2 which is not yet compatible
>>>>> (I think) with Hadoop 2.0.x. So I will have to go with a non-HA
>>>>> NameNode until I can migrate to a stable 0.96 HBase version.
>>>>>
>>>>> Can I "simply" add one directory to dfs.name.dir and restart
>>>>> my namenode? Is it going to feed all the required information in this
>>>>> directory? Or do I need to copy the data of the existing one in the
>>>>> new one before I restart it? Also, does it need a fast transfert rate?
>>>>> Or will an exteral hard drive (quick to be moved to another server if
>>>>> required) be enought?
>>>>>
>>>>>
>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>> Please follow the tips provided at
>>>>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3Fand
>>>>>> http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F
>>>>>>
>>>>>> In short, if you use a non-HA NameNode setup:
>>>>>>
>>>>>> - Yes the NN is a very vital persistence point in running HDFS and its
>>>>>> data should be redundantly stored for safety.
>>>>>> - You should, in production, configure your NameNode's image and edits
>>>>>> disk (dfs.name.dir in 1.x+, or dfs.namenode.name.dir in 0.23+/2.x+) to
>>>>>> be a dedicated one with adequate free space for gradual growth, and
>>>>>> should configure multiple disks (with one off-machine NFS point highly
>>>>>> recommended for easy recovery) for adequate redundancy.
>>>>>>
>>>>>> If you instead use a HA NameNode setup (I'd highly recommend doing
>>>>>> this since it is now available), the presence of > 1 NameNodes and the
>>>>>> journal log mount or quorum setup would automatically act as
>>>>>> safeguards for the FS metadata.
>>>>>>
>>>>>> On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari
>>>>>> <je...@spaggiari.org> wrote:
>>>>>>> Hi Harsh,
>>>>>>>
>>>>>>> Thanks for pointing me to this link. I will take a close look at it.
>>>>>>>
>>>>>>> So with 1.x and 0.23.x, what's the impact on the data if the namenode
>>>>>>> server hard-drive die? Is there any critical data stored locally? Or
>>>>>>> I
>>>>>>> simply need to build a new namenode, start it and restart all my
>>>>>>> namenodes to find my data back?
>>>>>>>
>>>>>>> I can deal with my application not beeing available, but loosing data
>>>>>>> can be a bigger issue.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> JM
>>>>>>>
>>>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>>>> Hey Jean,
>>>>>>>>
>>>>>>>> The 1.x, 0.23.x release lines both don't have NameNode HA features.
>>>>>>>> The current 2.x releases carry HA-NN abilities, and this is
>>>>>>>> documented
>>>>>>>> at
>>>>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>>>>>>>
>>>>>>>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari
>>>>>>>> <je...@spaggiari.org> wrote:
>>>>>>>>> Replying to myself ;)
>>>>>>>>>
>>>>>>>>> By digging a bit more I figured that 1.0 version is older than
>>>>>>>>> 0.23.4
>>>>>>>>> version and that backupnodes are on 0.23.4. Secondarynamenodes on
>>>>>>>>> 1.0
>>>>>>>>> are now deprecated.
>>>>>>>>>
>>>>>>>>> I'm still a bit mixed up on the way to achieve HA for the namenode
>>>>>>>>> (1.0 or 0.23.4) but I will continue to dig over internet.
>>>>>>>>>
>>>>>>>>> JM
>>>>>>>>>
>>>>>>>>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm reading a bit about hadoop and I'm trying to increase the HA
>>>>>>>>>> of
>>>>>>>>>> my
>>>>>>>>>> current cluster.
>>>>>>>>>>
>>>>>>>>>> Today I have 8 datanodes and one namenode.
>>>>>>>>>>
>>>>>>>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can see
>>>>>>>>>> that
>>>>>>>>>> a
>>>>>>>>>> Checkpoint node might be a good idea.
>>>>>>>>>>
>>>>>>>>>> So I'm trying to start a checkpoint node. I looked at the hadoop
>>>>>>>>>> online doc. There is a link toe describe the command usage "For
>>>>>>>>>> command usage, see namenode." but this link is not working. Also,
>>>>>>>>>> if
>>>>>>>>>> I
>>>>>>>>>> try hadoop-deamon.sh start namenode -checkpoint as discribed in
>>>>>>>>>> the
>>>>>>>>>> documentation, it's not starting.
>>>>>>>>>>
>>>>>>>>>> So I'n wondering, is there anywhere where I can find up to date
>>>>>>>>>> documentation about the checkpoint node? I will most probably try
>>>>>>>>>> the
>>>>>>>>>> BackupNode.
>>>>>>>>>>
>>>>>>>>>> I'm using hadoop 1.0.3. The options I have to start on this
>>>>>>>>>> version
>>>>>>>>>> are namenode, secondarynamenode, datanode, dfsadmin, mradmin, fsck
>>>>>>>>>> and
>>>>>>>>>> fs. Should I start some secondarynamenodes instead of backupnode
>>>>>>>>>> and
>>>>>>>>>> checkpointnode?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> JM
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>



-- 
Harsh J

Re: CheckPoint Node

Posted by "ac@hsk.hk" <ac...@hsk.hk>.
Hi JM,

If you migrate 1.0.3 to 2.0.x, could you mind to share your migration steps? it is because I also have a 1.0.4 cluster (Ubuntu 12.04, Hadoop 1.0.4, Hbase 0.94.2 and ZooKeeper 3.4.4 ) and want to migrate it to 2.0.x in order to avoid the hardware failure of the NameNode.

I have a testing cluster ready for the migration test.
 
Thanks
ac



On 1 Dec 2012, at 10:25 AM, Jean-Marc Spaggiari wrote:

> Sorry about that. My fault.
> 
> I have put this on the core-site.xml file but should be on the hdfs-site.xml...
> 
> I moved it and it's now working fine.
> 
> Thanks.
> 
> JM
> 
> 2012/11/30, Jean-Marc Spaggiari <je...@spaggiari.org>:
>> Hi,
>> 
>> Is there a way to ask Hadoop to display its parameters?
>> 
>> I have updated the property as followed:
>>  <property>
>>    <name>dfs.name.dir</name>
>>    <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
>>  </property>
>> 
>> But even if I stop/start hadoop, there is nothing written on the usb
>> drive. So I'm wondering if there is a command line like bin/hadoop
>> --showparameters
>> 
>> Thanks,
>> 
>> JM
>> 
>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>> Perfect. Thanks again for your time!
>>> 
>>> I will first add another drive on the Namenode because this will take
>>> 5 minutes. Then I will read about the migration from 1.0.3 to 2.0.x
>>> and most probably will use the zookeeper solution.
>>> 
>>> This will take more time, so will be done over the week-end.
>>> 
>>> I lost 2 hard drives this week (2 datanodes), so I'm not a bit
>>> concerned about the NameNode data. Just want to secure that a bit
>>> more.
>>> 
>>> JM
>>> 
>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>> Jean-Marc (Sorry if I've been spelling your name wrong),
>>>> 
>>>> 0.94 does support Hadoop-2 already, and works pretty well with it, if
>>>> that is your only concern. You only need to use the right download (or
>>>> if you compile, use the -Dhadoop.profile=23 maven option).
>>>> 
>>>> You will need to restart the NameNode to make changes to the
>>>> dfs.name.dir property and set it into effect. A reasonably fast disk
>>>> is needed for quicker edit log writes (few bytes worth in each round)
>>>> but a large, or SSD-style disk is not a requisite. An external disk
>>>> would work fine too (instead of an NFS), as long as it is reliable.
>>>> 
>>>> You do not need to copy data manually - just ensure that your NameNode
>>>> process user owns the directory and it will auto-populate the empty
>>>> directory on startup.
>>>> 
>>>> Operationally speaking, in case 1/2 disk fails, the NN Web UI (and
>>>> metrics as well) will indicate this (see bottom of NN UI page for an
>>>> example of what am talking about) but the NN will continue to run with
>>>> the lone remaining disk, but its not a good idea to let it run for too
>>>> long without fixing/replacing the disk, for you will be losing out on
>>>> redundancy.
>>>> 
>>>> On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari
>>>> <je...@spaggiari.org> wrote:
>>>>> Hi Harsh,
>>>>> 
>>>>> Again, thanks a lot for all those details.
>>>>> 
>>>>> I read the previous link and I totally understand the HA NameNode. I
>>>>> already have a zookeeper quorum (3 servers) that I will be able to
>>>>> re-use. However, I'm running HBase 0.94.2 which is not yet compatible
>>>>> (I think) with Hadoop 2.0.x. So I will have to go with a non-HA
>>>>> NameNode until I can migrate to a stable 0.96 HBase version.
>>>>> 
>>>>> Can I "simply" add one directory to dfs.name.dir and restart
>>>>> my namenode? Is it going to feed all the required information in this
>>>>> directory? Or do I need to copy the data of the existing one in the
>>>>> new one before I restart it? Also, does it need a fast transfert rate?
>>>>> Or will an exteral hard drive (quick to be moved to another server if
>>>>> required) be enought?
>>>>> 
>>>>> 
>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>> Please follow the tips provided at
>>>>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3Fand
>>>>>> http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F
>>>>>> 
>>>>>> In short, if you use a non-HA NameNode setup:
>>>>>> 
>>>>>> - Yes the NN is a very vital persistence point in running HDFS and its
>>>>>> data should be redundantly stored for safety.
>>>>>> - You should, in production, configure your NameNode's image and edits
>>>>>> disk (dfs.name.dir in 1.x+, or dfs.namenode.name.dir in 0.23+/2.x+) to
>>>>>> be a dedicated one with adequate free space for gradual growth, and
>>>>>> should configure multiple disks (with one off-machine NFS point highly
>>>>>> recommended for easy recovery) for adequate redundancy.
>>>>>> 
>>>>>> If you instead use a HA NameNode setup (I'd highly recommend doing
>>>>>> this since it is now available), the presence of > 1 NameNodes and the
>>>>>> journal log mount or quorum setup would automatically act as
>>>>>> safeguards for the FS metadata.
>>>>>> 
>>>>>> On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari
>>>>>> <je...@spaggiari.org> wrote:
>>>>>>> Hi Harsh,
>>>>>>> 
>>>>>>> Thanks for pointing me to this link. I will take a close look at it.
>>>>>>> 
>>>>>>> So with 1.x and 0.23.x, what's the impact on the data if the namenode
>>>>>>> server hard-drive die? Is there any critical data stored locally? Or
>>>>>>> I
>>>>>>> simply need to build a new namenode, start it and restart all my
>>>>>>> namenodes to find my data back?
>>>>>>> 
>>>>>>> I can deal with my application not beeing available, but loosing data
>>>>>>> can be a bigger issue.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> JM
>>>>>>> 
>>>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>>>> Hey Jean,
>>>>>>>> 
>>>>>>>> The 1.x, 0.23.x release lines both don't have NameNode HA features.
>>>>>>>> The current 2.x releases carry HA-NN abilities, and this is
>>>>>>>> documented
>>>>>>>> at
>>>>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>>>>>>> 
>>>>>>>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari
>>>>>>>> <je...@spaggiari.org> wrote:
>>>>>>>>> Replying to myself ;)
>>>>>>>>> 
>>>>>>>>> By digging a bit more I figured that 1.0 version is older than
>>>>>>>>> 0.23.4
>>>>>>>>> version and that backupnodes are on 0.23.4. Secondarynamenodes on
>>>>>>>>> 1.0
>>>>>>>>> are now deprecated.
>>>>>>>>> 
>>>>>>>>> I'm still a bit mixed up on the way to achieve HA for the namenode
>>>>>>>>> (1.0 or 0.23.4) but I will continue to dig over internet.
>>>>>>>>> 
>>>>>>>>> JM
>>>>>>>>> 
>>>>>>>>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> I'm reading a bit about hadoop and I'm trying to increase the HA
>>>>>>>>>> of
>>>>>>>>>> my
>>>>>>>>>> current cluster.
>>>>>>>>>> 
>>>>>>>>>> Today I have 8 datanodes and one namenode.
>>>>>>>>>> 
>>>>>>>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can see
>>>>>>>>>> that
>>>>>>>>>> a
>>>>>>>>>> Checkpoint node might be a good idea.
>>>>>>>>>> 
>>>>>>>>>> So I'm trying to start a checkpoint node. I looked at the hadoop
>>>>>>>>>> online doc. There is a link toe describe the command usage "For
>>>>>>>>>> command usage, see namenode." but this link is not working. Also,
>>>>>>>>>> if
>>>>>>>>>> I
>>>>>>>>>> try hadoop-deamon.sh start namenode -checkpoint as discribed in
>>>>>>>>>> the
>>>>>>>>>> documentation, it's not starting.
>>>>>>>>>> 
>>>>>>>>>> So I'n wondering, is there anywhere where I can find up to date
>>>>>>>>>> documentation about the checkpoint node? I will most probably try
>>>>>>>>>> the
>>>>>>>>>> BackupNode.
>>>>>>>>>> 
>>>>>>>>>> I'm using hadoop 1.0.3. The options I have to start on this
>>>>>>>>>> version
>>>>>>>>>> are namenode, secondarynamenode, datanode, dfsadmin, mradmin, fsck
>>>>>>>>>> and
>>>>>>>>>> fs. Should I start some secondarynamenodes instead of backupnode
>>>>>>>>>> and
>>>>>>>>>> checkpointnode?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>> JM
>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Harsh J
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Harsh J
>>>> 
>>> 
>> 


Re: CheckPoint Node

Posted by Harsh J <ha...@cloudera.com>.
Hi,

On your question of how you may see configured values, I went over
some ways for Andy previously here:
http://search-hadoop.com/m/cmcAQ1FlzFp

On Sat, Dec 1, 2012 at 7:55 AM, Jean-Marc Spaggiari
<je...@spaggiari.org> wrote:
> Sorry about that. My fault.
>
> I have put this on the core-site.xml file but should be on the hdfs-site.xml...
>
> I moved it and it's now working fine.
>
> Thanks.
>
> JM
>
> 2012/11/30, Jean-Marc Spaggiari <je...@spaggiari.org>:
>> Hi,
>>
>> Is there a way to ask Hadoop to display its parameters?
>>
>> I have updated the property as followed:
>>   <property>
>>     <name>dfs.name.dir</name>
>>     <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
>>   </property>
>>
>> But even if I stop/start hadoop, there is nothing written on the usb
>> drive. So I'm wondering if there is a command line like bin/hadoop
>> --showparameters
>>
>> Thanks,
>>
>> JM
>>
>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>> Perfect. Thanks again for your time!
>>>
>>> I will first add another drive on the Namenode because this will take
>>> 5 minutes. Then I will read about the migration from 1.0.3 to 2.0.x
>>> and most probably will use the zookeeper solution.
>>>
>>> This will take more time, so will be done over the week-end.
>>>
>>> I lost 2 hard drives this week (2 datanodes), so I'm not a bit
>>> concerned about the NameNode data. Just want to secure that a bit
>>> more.
>>>
>>> JM
>>>
>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>> Jean-Marc (Sorry if I've been spelling your name wrong),
>>>>
>>>> 0.94 does support Hadoop-2 already, and works pretty well with it, if
>>>> that is your only concern. You only need to use the right download (or
>>>> if you compile, use the -Dhadoop.profile=23 maven option).
>>>>
>>>> You will need to restart the NameNode to make changes to the
>>>> dfs.name.dir property and set it into effect. A reasonably fast disk
>>>> is needed for quicker edit log writes (few bytes worth in each round)
>>>> but a large, or SSD-style disk is not a requisite. An external disk
>>>> would work fine too (instead of an NFS), as long as it is reliable.
>>>>
>>>> You do not need to copy data manually - just ensure that your NameNode
>>>> process user owns the directory and it will auto-populate the empty
>>>> directory on startup.
>>>>
>>>> Operationally speaking, in case 1/2 disk fails, the NN Web UI (and
>>>> metrics as well) will indicate this (see bottom of NN UI page for an
>>>> example of what am talking about) but the NN will continue to run with
>>>> the lone remaining disk, but its not a good idea to let it run for too
>>>> long without fixing/replacing the disk, for you will be losing out on
>>>> redundancy.
>>>>
>>>> On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari
>>>> <je...@spaggiari.org> wrote:
>>>>> Hi Harsh,
>>>>>
>>>>> Again, thanks a lot for all those details.
>>>>>
>>>>> I read the previous link and I totally understand the HA NameNode. I
>>>>> already have a zookeeper quorum (3 servers) that I will be able to
>>>>> re-use. However, I'm running HBase 0.94.2 which is not yet compatible
>>>>> (I think) with Hadoop 2.0.x. So I will have to go with a non-HA
>>>>> NameNode until I can migrate to a stable 0.96 HBase version.
>>>>>
>>>>> Can I "simply" add one directory to dfs.name.dir and restart
>>>>> my namenode? Is it going to feed all the required information in this
>>>>> directory? Or do I need to copy the data of the existing one in the
>>>>> new one before I restart it? Also, does it need a fast transfert rate?
>>>>> Or will an exteral hard drive (quick to be moved to another server if
>>>>> required) be enought?
>>>>>
>>>>>
>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>> Please follow the tips provided at
>>>>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3F and
>>>>>> http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F
>>>>>>
>>>>>> In short, if you use a non-HA NameNode setup:
>>>>>>
>>>>>> - Yes, the NN is a vital persistence point for a running HDFS, and its
>>>>>> data should be stored redundantly for safety.
>>>>>> - You should, in production, configure your NameNode's image and edits
>>>>>> disk (dfs.name.dir in 1.x+, or dfs.namenode.name.dir in 0.23+/2.x+) to
>>>>>> be a dedicated one with adequate free space for gradual growth, and you
>>>>>> should configure multiple disks (with one off-machine NFS mount point
>>>>>> highly recommended for easy recovery) for adequate redundancy.
>>>>>>
>>>>>> If you instead use an HA NameNode setup (I'd highly recommend doing
>>>>>> this since it is now available), the presence of more than one NameNode
>>>>>> and the journal log mount or quorum setup would automatically act as
>>>>>> safeguards for the FS metadata.
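
For reference, a rough sketch of the quorum flavor of that setup, with 2.x
property names (mycluster and the hosts/ports are placeholders, and the
quorum journal manager only arrived in the later 2.0.x alphas, so treat
this as illustrative and check the linked document for your release):

  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
  </property>

Automatic failover additionally needs dfs.ha.automatic-failover.enabled and
ha.zookeeper.quorum.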
>>>>>>
>>>>>> On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari
>>>>>> <je...@spaggiari.org> wrote:
>>>>>>> Hi Harsh,
>>>>>>>
>>>>>>> Thanks for pointing me to this link. I will take a close look at it.
>>>>>>>
>>>>>>> So with 1.x and 0.23.x, what's the impact on the data if the namenode
>>>>>>> server's hard drive dies? Is there any critical data stored locally?
>>>>>>> Or do I simply need to build a new namenode, start it and restart all
>>>>>>> my datanodes to find my data back?
>>>>>>>
>>>>>>> I can deal with my application not being available, but losing data
>>>>>>> can be a bigger issue.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> JM
>>>>>>>
>>>>>>> 2012/11/22, Harsh J <ha...@cloudera.com>:
>>>>>>>> Hey Jean,
>>>>>>>>
>>>>>>>> Neither the 1.x nor the 0.23.x release lines have NameNode HA
>>>>>>>> features. The current 2.x releases carry HA-NN abilities, and this is
>>>>>>>> documented at
>>>>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>>>>>>>
>>>>>>>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari
>>>>>>>> <je...@spaggiari.org> wrote:
>>>>>>>>> Replying to myself ;)
>>>>>>>>>
>>>>>>>>> By digging a bit more I figured out that the 1.0 version is older
>>>>>>>>> than the 0.23.4 version, and that backupnodes are in 0.23.4.
>>>>>>>>> Secondarynamenodes in 1.0 are now deprecated.
>>>>>>>>>
>>>>>>>>> I'm still a bit mixed up on the way to achieve HA for the namenode
>>>>>>>>> (1.0 or 0.23.4), but I will continue to dig around the internet.
>>>>>>>>>
>>>>>>>>> JM
>>>>>>>>>
>>>>>>>>> 2012/11/22, Jean-Marc Spaggiari <je...@spaggiari.org>:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm reading a bit about hadoop and I'm trying to increase the HA of
>>>>>>>>>> my current cluster.
>>>>>>>>>>
>>>>>>>>>> Today I have 8 datanodes and one namenode.
>>>>>>>>>>
>>>>>>>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can see that
>>>>>>>>>> a Checkpoint node might be a good idea.
>>>>>>>>>>
>>>>>>>>>> So I'm trying to start a checkpoint node. I looked at the hadoop
>>>>>>>>>> online doc. There is a link to describe the command usage ("For
>>>>>>>>>> command usage, see namenode.") but this link is not working. Also,
>>>>>>>>>> if I try hadoop-daemon.sh start namenode -checkpoint as described in
>>>>>>>>>> the documentation, it's not starting.
>>>>>>>>>>
>>>>>>>>>> So I'm wondering, is there anywhere where I can find up-to-date
>>>>>>>>>> documentation about the checkpoint node? I will most probably try
>>>>>>>>>> the BackupNode.
>>>>>>>>>>
>>>>>>>>>> I'm using hadoop 1.0.3. The options I have to start on this version
>>>>>>>>>> are namenode, secondarynamenode, datanode, dfsadmin, mradmin, fsck
>>>>>>>>>> and fs. Should I start some secondarynamenodes instead of backupnode
>>>>>>>>>> and checkpointnode?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> JM
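
For reference, on 1.0.x the checkpointing role is filled by the secondary
namenode; a minimal sketch of running one (assuming the stock conf/ layout;
the checkpoint path below is a placeholder):

  # start the checkpointer on the chosen host
  bin/hadoop-daemon.sh start secondarynamenode

Optional tuning, with 1.x property names, goes in core-site.xml:

  <property>
    <name>fs.checkpoint.dir</name>
    <value>/data/dfs/namesecondary</value>
  </property>
  <property>
    <name>fs.checkpoint.period</name>
    <value>3600</value>
  </property>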
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>>
>>>
>>



-- 
Harsh J

Re: CheckPoint Node

Posted by "ac@hsk.hk" <ac...@hsk.hk>.
Hi JM,

If you migrate 1.0.3 to 2.0.x, would you mind sharing your migration steps? I also have a 1.0.4 cluster (Ubuntu 12.04, Hadoop 1.0.4, HBase 0.94.2 and ZooKeeper 3.4.4) and want to migrate it to 2.0.x in order to protect against a hardware failure of the NameNode.

I have a testing cluster ready for the migration test.
 
Thanks
ac



On 1 Dec 2012, at 10:25 AM, Jean-Marc Spaggiari wrote:

> Sorry about that. My fault.
> 
> I had put this in the core-site.xml file, but it should be in hdfs-site.xml...
> 
> I moved it and it's now working fine.
> 
> Thanks.
> 
> JM

Re: CheckPoint Node

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Sorry about that. My fault.

I had put this in the core-site.xml file, but it should be in hdfs-site.xml...

I moved it and it's now working fine.

Thanks.

JM
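
For reference, the fix is a matter of which file the property sits in; a
sketch of the working placement, using the values from the post quoted
below (in 1.x the dfs.* properties are read from conf/hdfs-site.xml):

  <!-- conf/hdfs-site.xml -->
  <property>
    <name>dfs.name.dir</name>
    <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
  </property>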

2012/11/30, Jean-Marc Spaggiari <je...@spaggiari.org>:
> Hi,
>
> Is there a way to ask Hadoop to display its parameters?
>
> I have updated the property as follows:
>   <property>
>     <name>dfs.name.dir</name>
>     <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
>   </property>
>
> But even if I stop/start Hadoop, nothing is written to the USB
> drive. So I'm wondering if there is a command line like bin/hadoop
> --showparameters.
>
> Thanks,
>
> JM