Posted to common-user@hadoop.apache.org by Thanh Do <th...@cs.wisc.edu> on 2013/04/18 18:05:20 UTC

why multiple checkpoint nodes?

Hi all,

The document says "Multiple checkpoint nodes may be specified in the
cluster configuration file".

Can someone clarify why we really need to run multiple checkpoint
nodes anyway? Is it possible that while checkpoint node A is doing a
checkpoint, checkpoint node B kicks in and does another checkpoint?

Thanks,
Thanh

Re: why multiple checkpoint nodes?

Posted by Bertrand Dechoux <de...@gmail.com>.
For more information, see https://issues.apache.org/jira/browse/HADOOP-7297

It has been corrected, but the stable documentation is still 1.0.4
(which predates the correction).

See
* http://hadoop.apache.org/docs/r1.0.4/hdfs_user_guide.html
* http://hadoop.apache.org/docs/r1.1.1/hdfs_user_guide.html
* http://hadoop.apache.org/docs/r1.1.2/hdfs_user_guide.html

Regards

Bertrand


On Thu, Apr 18, 2013 at 9:45 PM, Bertrand Dechoux <de...@gmail.com> wrote:

> It would be important to point to the document (which I believe is
> http://hadoop.apache.org/docs/stable/hdfs_user_guide.html) and state the
> version of Hadoop you are interested in. At one time, the documentation was
> misleading. The 1.x version didn't have checkpoint/backup nodes, only the
> secondary namenode. I don't believe it has changed, but I might be wrong (or
> the documentation still hasn't been fixed). The 2.x version will have
> namenode HA, which will be the final solution.
>
> Regards
>
> Bertrand
>
>
> On Thu, Apr 18, 2013 at 7:20 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>
>> so reliability (to prevent metadata loss) is the main motivation for
>> multiple checkpoint nodes?
>>
>> Does anybody use multiple checkpoint nodes in real life?
>>
>> Thanks
>>
>>
>> On Thu, Apr 18, 2013 at 12:07 PM, shashwat shriparv <
>> dwivedishashwat@gmail.com> wrote:
>>
>>> more checkpoint nodes means more backup of the metadata :)
>>>
>>> *Thanks & Regards    *
>>>
>>> ∞
>>> Shashwat Shriparv
>>>
>>>
>>>
>>> On Thu, Apr 18, 2013 at 9:35 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>>>
>>>> Hi all,
>>>>
>>>> The document says "Multiple checkpoint nodes may be specified in the
>>>> cluster configuration file".
>>>>
>>>> Can someone clarify why we really need to run multiple
>>>> checkpoint nodes anyway? Is it possible that while checkpoint node A is
>>>> doing a checkpoint, checkpoint node B kicks in and does another
>>>> checkpoint?
>>>>
>>>> Thanks,
>>>> Thanh
>>>>
>>>
>>>
>>
>
>
> --
> Bertrand Dechoux
>



-- 
Bertrand Dechoux

Re: why multiple checkpoint nodes?

Posted by Thanh Do <th...@cs.wisc.edu>.
Thanks guys for updating!

Yeah, I read in the thread that Checkpoint/BackupNode may get deprecated.
SNN is the way to go, then.

I still wonder: if we use multiple CheckpointNodes, we might run into a
situation where a checkpoint is ongoing but the first CheckpointNode is
slow, and then the second CheckpointNode kicks in. What would happen in
that case?


On Thu, Apr 18, 2013 at 3:07 PM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello Thanh,
>
> Just to keep you updated, the checkpoint node might get deprecated. So,
> it's always better to use the secondary namenode. More on this can be found
> here:
> https://issues.apache.org/jira/browse/HDFS-2397
> https://issues.apache.org/jira/browse/HDFS-4114
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Fri, Apr 19, 2013 at 1:15 AM, Bertrand Dechoux <de...@gmail.com> wrote:
>
>> It would be important to point to the document (which I believe is
>> http://hadoop.apache.org/docs/stable/hdfs_user_guide.html) and state the
>> version of Hadoop you are interested in. At one time, the documentation was
>> misleading. The 1.x version didn't have checkpoint/backup nodes, only the
>> secondary namenode. I don't believe it has changed, but I might be wrong (or
>> the documentation still hasn't been fixed). The 2.x version will have
>> namenode HA, which will be the final solution.
>>
>> Regards
>>
>> Bertrand
>>
>>
>> On Thu, Apr 18, 2013 at 7:20 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>>
>>> so reliability (to prevent metadata loss) is the main motivation for
>>> multiple checkpoint nodes?
>>>
>>> Does anybody use multiple checkpoint nodes in real life?
>>>
>>> Thanks
>>>
>>>
>>> On Thu, Apr 18, 2013 at 12:07 PM, shashwat shriparv <
>>> dwivedishashwat@gmail.com> wrote:
>>>
>>>> more checkpoint nodes means more backup of the metadata :)
>>>>
>>>> *Thanks & Regards    *
>>>>
>>>> ∞
>>>> Shashwat Shriparv
>>>>
>>>>
>>>>
>>>> On Thu, Apr 18, 2013 at 9:35 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> The document says "Multiple checkpoint nodes may be specified in the
>>>>> cluster configuration file".
>>>>>
>>>>> Can someone clarify why we really need to run multiple
>>>>> checkpoint nodes anyway? Is it possible that while checkpoint node A is
>>>>> doing a checkpoint, checkpoint node B kicks in and does another
>>>>> checkpoint?
>>>>>
>>>>> Thanks,
>>>>> Thanh
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Bertrand Dechoux
>>
>
>

Re: why multiple checkpoint nodes?

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Thanh,

Just to keep you updated, the checkpoint node might get deprecated. So,
it's always better to use the secondary namenode. More on this can be found
here:
https://issues.apache.org/jira/browse/HDFS-2397
https://issues.apache.org/jira/browse/HDFS-4114

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Fri, Apr 19, 2013 at 1:15 AM, Bertrand Dechoux <de...@gmail.com> wrote:

> It would be important to point to the document (which I believe is
> http://hadoop.apache.org/docs/stable/hdfs_user_guide.html) and state the
> version of Hadoop you are interested in. At one time, the documentation was
> misleading. The 1.x version didn't have checkpoint/backup nodes, only the
> secondary namenode. I don't believe it has changed, but I might be wrong (or
> the documentation still hasn't been fixed). The 2.x version will have
> namenode HA, which will be the final solution.
>
> Regards
>
> Bertrand
>
>
> On Thu, Apr 18, 2013 at 7:20 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>
>> so reliability (to prevent metadata loss) is the main motivation for
>> multiple checkpoint nodes?
>>
>> Does anybody use multiple checkpoint nodes in real life?
>>
>> Thanks
>>
>>
>> On Thu, Apr 18, 2013 at 12:07 PM, shashwat shriparv <
>> dwivedishashwat@gmail.com> wrote:
>>
>>> more checkpoint nodes means more backup of the metadata :)
>>>
>>> *Thanks & Regards    *
>>>
>>> ∞
>>> Shashwat Shriparv
>>>
>>>
>>>
>>> On Thu, Apr 18, 2013 at 9:35 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>>>
>>>> Hi all,
>>>>
>>>> The document says "Multiple checkpoint nodes may be specified in the
>>>> cluster configuration file".
>>>>
>>>> Can someone clarify why we really need to run multiple
>>>> checkpoint nodes anyway? Is it possible that while checkpoint node A is
>>>> doing a checkpoint, checkpoint node B kicks in and does another
>>>> checkpoint?
>>>>
>>>> Thanks,
>>>> Thanh
>>>>
>>>
>>>
>>
>
>
> --
> Bertrand Dechoux
>
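
Since the advice above is to rely on the secondary namenode, here is a minimal sketch of when it decides to checkpoint in Hadoop 1.x. It assumes the two settings described in the 1.x HDFS user guide, fs.checkpoint.period (default 3600 seconds) and fs.checkpoint.size (default 64 MB of edits). The class and function names are hypothetical and only illustrate the rule; this is not Hadoop's own code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckpointConfig:
    # Mirrors fs.checkpoint.period (seconds) and fs.checkpoint.size (bytes),
    # using the defaults documented for Hadoop 1.x.
    period_seconds: int = 3600
    size_bytes: int = 64 * 1024 * 1024


def should_checkpoint(seconds_since_last: float,
                      edits_size_bytes: int,
                      config: Optional[CheckpointConfig] = None) -> bool:
    """Return True when a checkpoint is due (hypothetical helper).

    A checkpoint is due when the maximum delay has elapsed, or earlier
    when the edits log has grown to the configured size. The use of >=
    at the boundary is an assumption of this sketch.
    """
    if config is None:
        config = CheckpointConfig()
    period_elapsed = seconds_since_last >= config.period_seconds
    edits_too_large = edits_size_bytes >= config.size_bytes
    return period_elapsed or edits_too_large


if __name__ == "__main__":
    # Ten minutes since the last checkpoint and a small edits log: not due.
    print(should_checkpoint(600, 1024))
    # Ten minutes since the last checkpoint but 64 MB of edits: due.
    print(should_checkpoint(600, 64 * 1024 * 1024))
```

The sketch covers a single checkpointer only. It says nothing about what the namenode does when two checkpointers overlap, which is the open question in this thread.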

Re: why multiple checkpoint nodes?

Posted by Bertrand Dechoux <de...@gmail.com>.
It would be important to point to the document (which I believe is
http://hadoop.apache.org/docs/stable/hdfs_user_guide.html) and state the
version of Hadoop you are interested in. At one time, the documentation was
misleading. The 1.x version didn't have checkpoint/backup nodes, only the
secondary namenode. I don't believe it has changed, but I might be wrong (or
the documentation still hasn't been fixed). The 2.x version will have
namenode HA, which will be the final solution.

Regards

Bertrand


On Thu, Apr 18, 2013 at 7:20 PM, Thanh Do <th...@cs.wisc.edu> wrote:

> so reliability (to prevent metadata loss) is the main motivation for
> multiple checkpoint nodes?
>
> Does anybody use multiple checkpoint nodes in real life?
>
> Thanks
>
>
> On Thu, Apr 18, 2013 at 12:07 PM, shashwat shriparv <
> dwivedishashwat@gmail.com> wrote:
>
>> more checkpoint nodes means more backup of the metadata :)
>>
>> *Thanks & Regards    *
>>
>> ∞
>> Shashwat Shriparv
>>
>>
>>
>> On Thu, Apr 18, 2013 at 9:35 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>>
>>> Hi all,
>>>
>>> The document says "Multiple checkpoint nodes may be specified in the
>>> cluster configuration file".
>>>
>>> Can someone clarify why we really need to run multiple
>>> checkpoint nodes anyway? Is it possible that while checkpoint node A is
>>> doing a checkpoint, checkpoint node B kicks in and does another
>>> checkpoint?
>>>
>>> Thanks,
>>> Thanh
>>>
>>
>>
>


-- 
Bertrand Dechoux

Re: why multiple checkpoint nodes?

Posted by Thanh Do <th...@cs.wisc.edu>.
So reliability (to prevent metadata loss) is the main motivation for
multiple checkpoint nodes?

Does anybody use multiple checkpoint nodes in real life?

Thanks


On Thu, Apr 18, 2013 at 12:07 PM, shashwat shriparv <
dwivedishashwat@gmail.com> wrote:

> more checkpoint nodes means more backup of the metadata :)
>
> *Thanks & Regards    *
>
> ∞
> Shashwat Shriparv
>
>
>
> On Thu, Apr 18, 2013 at 9:35 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>
>> Hi all,
>>
>> The document says "Multiple checkpoint nodes may be specified in the
>> cluster configuration file".
>>
>> Can some one clarify me that why we really need to run multiple
>> checkpoint nodes anyway? Is it possible that while checkpoint node A is
>> doing checkpoint, and check point node B kicks in and does another
>> checkpoint?
>>
>> Thanks,
>> Thanh
>>
>
>

Re: why multiple checkpoint nodes?

Posted by shashwat shriparv <dw...@gmail.com>.
More checkpoint nodes mean more backups of the metadata :)

*Thanks & Regards    *

∞
Shashwat Shriparv



On Thu, Apr 18, 2013 at 9:35 PM, Thanh Do <th...@cs.wisc.edu> wrote:

> Hi all,
>
> The document says "Multiple checkpoint nodes may be specified in the
> cluster configuration file".
>
> Can some one clarify me that why we really need to run multiple checkpoint
> nodes anyway? Is it possible that while checkpoint node A is doing
> checkpoint, and check point node B kicks in and does another checkpoint?
>
> Thanks,
> Thanh
>
