Posted to mapreduce-user@hadoop.apache.org by ch huang <ju...@gmail.com> on 2014/05/06 02:39:35 UTC
issue about cluster balance
Hi, mailing list:
I have a 5-node Hadoop cluster, and yesterday I added 5 new
boxes to it. After that I started a balancer task, but it moved only 7%
of the data to the new nodes in 20 hours. I had already set
dfs.datanode.balance.bandwidthPerSec to 10M, and the threshold is 10%. Why
does the balancer task take so long?
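For a sense of scale, the bandwidth cap by itself bounds how much data a DataNode can move in a given time. A minimal sketch of that arithmetic follows, assuming the "10M" above means 10 MiB/s and that the cap applies per DataNode; the function name is made up for illustration.

```python
# Rough ceiling on balancer traffic for one DataNode.
# Assumption: "10M" in the post means 10 MiB/s for
# dfs.datanode.balance.bandwidthPerSec, applied per DataNode.
MIB = 1024 * 1024
GIB = 1024 * MIB


def max_moved_gib(bandwidth_bytes_per_sec: int, hours: float) -> float:
    """Upper bound, in GiB, on what one DataNode can move at the given cap."""
    return bandwidth_bytes_per_sec * hours * 3600 / GIB


ceiling = max_moved_gib(10 * MIB, 20)
print(f"{ceiling:.1f} GiB per DataNode in 20 hours")  # 703.1 GiB
```

Under these assumptions a single DataNode cannot move more than about 703 GiB in 20 hours, so a run lasting many hours is expected when terabytes have to be relocated.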
Re: issue about cluster balance
Posted by ch huang <ju...@gmail.com>.
I recorded the disk status before and after balancing, from one of the source
nodes and one of the destination nodes.
before
source node
Filesystem  Size  Used   Avail  Use%  Mounted on
/dev/sdd    1.8T  1009G  733G   58%   /data/1
/dev/sde    1.8T  1005G  737G   58%   /data/2
/dev/sda    1.8T  980G   762G   57%   /data/3
/dev/sdb    1.8T  980G   762G   57%   /data/4
/dev/sdc    1.8T  972G   769G   56%   /data/5
/dev/sdf    1.8T  980G   762G   57%   /data/
destination node
Filesystem  Size  Used   Avail  Use%  Mounted on
/dev/sdb    1.8T  2.0G   1.7T   1%    /data/1
/dev/sdc    1.8T  2.1G   1.7T   1%    /data/2
/dev/sdd    1.8T  2.0G   1.7T   1%    /data/3
/dev/sde    1.8T  2.2G   1.7T   1%    /data/4
/dev/sdf    1.8T  2.2G   1.7T   1%    /data/5
after
source node
Filesystem  Size  Used   Avail  Use%  Mounted on
/dev/sdd    1.8T  754G   988G   44%   /data/1
/dev/sde    1.8T  736G   1006G  43%   /data/2
/dev/sda    1.8T  730G   1011G  42%   /data/3
/dev/sdb    1.8T  721G   1020G  42%   /data/4
/dev/sdc    1.8T  721G   1021G  42%   /data/5
/dev/sdf    1.8T  723G   1019G  42%   /data/6
destination node
Filesystem  Size  Used   Avail  Use%  Mounted on
/dev/sdb    1.8T  388G   1.4T   23%   /data/1
/dev/sdc    1.8T  381G   1.4T   22%   /data/2
/dev/sdd    1.8T  378G   1.4T   22%   /data/3
/dev/sde    1.8T  375G   1.4T   22%   /data/4
/dev/sdf    1.8T  374G   1.4T   22%   /data/5
What I wonder is why the source node and the destination node do not end up
equal, for example at about 30% each. Also, the balance took
62.991929444444445 hours.
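The disk figures in this message are enough for a rough check of both points. The sketch below is only an estimate: it assumes the listings come from `df -h` (so "G" means GiB), that the two sampled nodes are representative, and that df's Use% is a fair proxy for the DFS utilization the balancer compares. The variable and function names are made up for illustration.

```python
from statistics import mean

HOURS = 62.991929444444445   # run time reported in this message
THRESHOLD = 10.0             # balancer threshold, in percentage points

# "Used" column (GiB) and "Use%" column copied from the df listings above.
src_used_before = [1009, 1005, 980, 980, 972, 980]
src_used_after = [754, 736, 730, 721, 721, 723]
dst_used_before = [2.0, 2.1, 2.0, 2.2, 2.2]
dst_used_after = [388, 381, 378, 375, 374]
src_pct_after = [44, 43, 42, 42, 42, 42]
dst_pct_after = [23, 22, 22, 22, 22]


def rate_mib_per_sec(before, after, hours):
    """Average change in used space on one node, in MiB/s."""
    moved_gib = abs(sum(after) - sum(before))
    return moved_gib * 1024 / (hours * 3600)


src_rate = rate_mib_per_sec(src_used_before, src_used_after, HOURS)
dst_rate = rate_mib_per_sec(dst_used_before, dst_used_after, HOURS)
gap = mean(src_pct_after) - mean(dst_pct_after)

print(f"source node shed        {src_rate:.1f} MiB/s")
print(f"destination node gained {dst_rate:.1f} MiB/s")
print(f"remaining gap {gap:.1f} points; band width {2 * THRESHOLD:.0f} points")
```

On these figures the source node shed about 7 MiB/s and the destination node gained about 8.5 MiB/s, both below the 10M cap. The two nodes finish about 20 percentage points apart, roughly twice the 10% threshold, which is what one would expect if the balancer stops once every node is within the threshold of the cluster average instead of driving all nodes to the same level.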
On Tue, May 6, 2014 at 12:38 PM, Rakesh R <ra...@huawei.com> wrote:
> Could you give more details, such as:
>
> - Could you convert the 7% into the total amount of data moved, in MB?
>
> - Also, could you tell me how the 7% data movement breaks down per DN?
>
> - What values are shown for the ‘over-utilized’, ‘above-average’,
> ‘below-average’, and ‘under-utilized’ nodes? The Balancer does the pairing
> based on these values.
>
> - Please tell me the cluster topology: SAME_NODE_GROUP or
> SAME_RACK. Basically, this matters when choosing the sourceNode vs
> balancerNode pairs as well as the proxy source.
>
> - Did you see all the DNs being utilized for the block movement?
>
> - Did any exceptions occur during block movement?
>
> - How many iterations ran in these hours?
>
>
>
> -Rakesh
>
>
>
> *From:* ch huang [mailto:justlooks@gmail.com]
> *Sent:* 06 May 2014 06:10
> *To:* user@hadoop.apache.org
> *Subject:* issue about cluster balance
>
>
>
> Hi, mailing list:
>
> I have a 5-node Hadoop cluster, and yesterday I added 5 new
> boxes to it. After that I started a balancer task, but it moved only 7%
> of the data to the new nodes in 20 hours. I had already set
> dfs.datanode.balance.bandwidthPerSec to 10M, and the threshold is 10%. Why
> does the balancer task take so long?
>
RE: issue about cluster balance
Posted by Rakesh R <ra...@huawei.com>.
Could you give more details, such as:
- Could you convert the 7% into the total amount of data moved, in MB?
- Also, could you tell me how the 7% data movement breaks down per DN?
- What values are shown for the ‘over-utilized’, ‘above-average’, ‘below-average’, and ‘under-utilized’ nodes? The Balancer does the pairing based on these values.
- Please tell me the cluster topology: SAME_NODE_GROUP or SAME_RACK. Basically, this matters when choosing the sourceNode vs balancerNode pairs as well as the proxy source.
- Did you see all the DNs being utilized for the block movement?
- Did any exceptions occur during block movement?
- How many iterations ran in these hours?
-Rakesh
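The four node categories mentioned above can be illustrated with a small sketch. It assumes the usual rule that a node is compared against the cluster-average utilization plus or minus the threshold. The function name, the boundary handling, and the sample numbers are illustrative and not taken from the Balancer source.

```python
def classify(utilization: float, average: float, threshold: float) -> str:
    """Place a node in one of four groups relative to the cluster average.

    All arguments are percentages. Boundary handling is an assumption.
    """
    if utilization > average + threshold:
        return "over-utilized"
    if utilization > average:
        return "above-average"
    if utilization >= average - threshold:
        return "below-average"
    return "under-utilized"


# Hypothetical cluster: old nodes near 57%, new nodes near 1%, average 29%.
for name, used in {"old-node": 57.0, "new-node": 1.0}.items():
    print(name, classify(used, average=29.0, threshold=10.0))
```

In this hypothetical cluster the old node is over-utilized and the new node is under-utilized, so they would be candidates for pairing. Once both fall inside the band around the average they count only as above-average or below-average, and with a 10% threshold they can still differ by up to 20 percentage points.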
From: ch huang [mailto:justlooks@gmail.com]
Sent: 06 May 2014 06:10
To: user@hadoop.apache.org
Subject: issue about cluster balance
Hi, mailing list:
I have a 5-node Hadoop cluster, and yesterday I added 5 new boxes to it. After that I started a balancer task, but it moved only 7% of the data to the new nodes in 20 hours. I had already set dfs.datanode.balance.bandwidthPerSec to 10M, and the threshold is 10%. Why does the balancer task take so long?