Posted to mapreduce-user@hadoop.apache.org by Bharath Kumar <bh...@gmail.com> on 2014/03/26 08:17:06 UTC

Decommissioning a node takes forever

Hi All,

I am a novice Hadoop user. I tried removing a node from my 2-node cluster by
adding its IP to the excludes file and running the dfsadmin -refreshNodes
command. But decommissioning is taking a very long time: I left it running
over the weekend and it still had not completed.

Your inputs will help


-- 
Warm Regards,
                         *Bharath*
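[Editor's note] For readers following the same procedure, the exclude-file mechanism the post refers to hinges on one hdfs-site.xml property. This is a minimal sketch; the file path below is an example, not a required location, and it must match whatever your cluster actually configures:

```xml
<!-- hdfs-site.xml: tell the namenode which file lists nodes to decommission.
     The path below is an example; any readable path works as long as it matches. -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```

After adding the node's hostname or IP to that file, `hdfs dfsadmin -refreshNodes` asks the namenode to re-read it, and `hdfs dfsadmin -report` shows each node's decommissioning status.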

Re: Decommissioning a node takes forever

Posted by Azuryy Yu <az...@gmail.com>.
Hi,
Which version of HDFS are you using?


On Wed, Mar 26, 2014 at 3:17 PM, Bharath Kumar <bh...@gmail.com> wrote:

>
> Hi All,
>
> I am a novice hadoop user . I tried removing a node from my cluster of 2
> nodes by adding the ip in excludes file and running dfsadmin -refreshNodes
> command . But decommissioning takes a very long time I left it over the
> weekend and still it was not complete.
>
> Your inputs will help
>
>
> --
> Warm Regards,
>                          *Bharath*
>
>
>

Re: Decommissioning a node takes forever

Posted by Bharath Kumar <bh...@gmail.com>.
Hi All,

Thanks for the excellent suggestion. Once we find the blocks whose
replication factor is higher than the actual number of datanodes and run
setrep to lower it to the datanode count, should the decommissioning proceed
smoothly? From your past experience, how long does decommissioning usually
take to complete? Can I run my MR jobs during decommissioning, and will it
affect the performance of the cluster?
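[Editor's note] The fix being discussed reduces to a single command. A sketch, assuming the 2-node cluster from this thread; note that setrep applied to / changes the target replication for everything under that path, not only the over-replicated files:

```shell
# Sketch of the setrep fix discussed above (assumes a 2-node cluster).
# setrep on / applies recursively to every file under that path.
DATANODES=2
CMD="hdfs dfs -setrep -w $DATANODES /"
echo "$CMD"   # on a real cluster you would run this; -w waits until replication settles
```

The `-w` flag makes the command block until the new replication target is reached, which is useful here because decommissioning cannot finish until the replica counts are satisfiable.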


On Thu, Mar 27, 2014 at 7:58 AM, Mingjiang Shi <ms...@gopivotal.com> wrote:

> Hi Brahma,
> It might be some files have replication factor more than the actual number
> of datanodes, so namenode will not be able to decommission the machine
> because it cannot get the replica count settled.
>
> Run the following command to check the replication factor of the files on
> the hdfs, see if any file has replication factor more than the actual
> number of datanodes.
>
> sudo -u hdfs hdfs fsck / -files -blocks | grep -B 1 repl=
>
>
>
> On Wed, Mar 26, 2014 at 3:49 PM, Brahma Reddy Battula <
> brahmareddy.battula@huawei.com> wrote:
>
>>  can you please elaborate more..?
>>
>>
>>
>> Like how many nodes are there in cluster and what's the replication
>> factor for files..?
>>
>>
>>
>> Normally decommission will be success once all the replica's from the
>> excluded node is replicated another node in  cluster(another node should be
>> availble,,)..
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Thanks & Regards
>>
>>
>>
>> Brahma Reddy Battula
>>
>>
>>   ------------------------------
>> *From:* Bharath Kumar [bharathdcs@gmail.com]
>> *Sent:* Wednesday, March 26, 2014 12:47 PM
>> *To:* user@hadoop.apache.org
>> *Subject:* Decommissioning a node takes forever
>>
>>
>>  Hi All,
>>
>>  I am a novice hadoop user . I tried removing a node from my cluster of 2
>> nodes by adding the ip in excludes file and running dfsadmin -refreshNodes
>> command . But decommissioning takes a very long time I left it over the
>> weekend and still it was not complete.
>>
>>  Your inputs will help
>>
>>
>> --
>> Warm Regards,
>>                          *Bharath*
>>
>>
>>
>
>
>
> --
> Cheers
> -MJ
>



-- 
Warm Regards,
                         *Bharath Kumar *

Re: Decommissioning a node takes forever

Posted by Mingjiang Shi <ms...@gopivotal.com>.
Hi Brahma,
It might be that some files have a replication factor higher than the actual
number of datanodes; in that case the namenode cannot finish decommissioning
the machine because the replica count can never be satisfied.

Run the following command to check the replication factor of the files on
HDFS and see whether any file has a replication factor higher than the
actual number of datanodes.

sudo -u hdfs hdfs fsck / -files -blocks | grep -B 1 repl=
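[Editor's note] To pick out only the over-replicated files from that fsck output, the `-B 1` context line (the file entry printed just before each block line) can be paired with the `repl=` value. The fsck output below is a made-up sample in the usual format; only the text filtering is being demonstrated:

```shell
# Filter fsck-style output down to files whose repl= exceeds the datanode count.
# The sample below is illustrative, not real cluster output.
cat > /tmp/fsck_sample.txt <<'EOF'
/data/a.txt 12 bytes, 1 block(s):  OK
0. blk_1_1 len=12 repl=3
/data/b.txt 34 bytes, 1 block(s):  OK
0. blk_2_1 len=34 repl=2
EOF
DATANODES=2
grep -B 1 "repl=" /tmp/fsck_sample.txt \
  | grep -v '^--$' \
  | awk -v n="$DATANODES" '/repl=/ { r=$0; sub(/.*repl=/, "", r); if (r+0 > n) print prev } { prev=$0 }'
# prints only the /data/a.txt line, since repl=3 > 2 datanodes
```

On a real cluster you would pipe the actual `hdfs fsck / -files -blocks` output through the same filter instead of the sample file.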



On Wed, Mar 26, 2014 at 3:49 PM, Brahma Reddy Battula <
brahmareddy.battula@huawei.com> wrote:

>  can you please elaborate more..?
>
>
>
> Like how many nodes are there in cluster and what's the replication factor
> for files..?
>
>
>
> Normally decommission will be success once all the replica's from the
> excluded node is replicated another node in  cluster(another node should be
> availble,,)..
>
>
>
>
>
>
>
>
>
> Thanks & Regards
>
>
>
> Brahma Reddy Battula
>
>
>   ------------------------------
> *From:* Bharath Kumar [bharathdcs@gmail.com]
> *Sent:* Wednesday, March 26, 2014 12:47 PM
> *To:* user@hadoop.apache.org
> *Subject:* Decommissioning a node takes forever
>
>
>  Hi All,
>
>  I am a novice hadoop user . I tried removing a node from my cluster of 2
> nodes by adding the ip in excludes file and running dfsadmin -refreshNodes
> command . But decommissioning takes a very long time I left it over the
> weekend and still it was not complete.
>
>  Your inputs will help
>
>
> --
> Warm Regards,
>                          *Bharath*
>
>
>



-- 
Cheers
-MJ

RE: Decommissioning a node takes forever

Posted by Brahma Reddy Battula <br...@huawei.com>.
Can you please elaborate more?

For example, how many nodes are in the cluster, and what is the replication
factor for the files?

Normally decommissioning succeeds once all the replicas from the excluded
node have been replicated to another node in the cluster (so another node
must be available).

Thanks & Regards



Brahma Reddy Battula



________________________________
From: Bharath Kumar [bharathdcs@gmail.com]
Sent: Wednesday, March 26, 2014 12:47 PM
To: user@hadoop.apache.org
Subject: Decommissioning a node takes forever


Hi All,

I am a novice hadoop user . I tried removing a node from my cluster of 2 nodes by adding the ip in excludes file and running dfsadmin -refreshNodes command . But decommissioning takes a very long time I left it over the weekend and still it was not complete.

Your inputs will help


--
Warm Regards,
                         Bharath


