Posted to mapreduce-user@hadoop.apache.org by 何琦 <he...@mobicloud.com.cn> on 2013/05/02 09:33:35 UTC

the balance of datanodes

Hi,
I put 6.7 GB of data into HDFS, and the report shows that all of it was written to a single datanode in the cluster.
I expected the data to be spread across all of the datanodes.
My Hadoop version is 1.0.4, and my dfs.replication value is 1.

Why does this happen? Do I need to configure something to balance the data?

The result is:
[BEFORE]:
Name: 135.224.99.69:50010
Decommission Status : Normal
Configured Capacity: 505882238976 (471.14 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 28186583040 (26.25 GB)
DFS Remaining: 477695627264(444.89 GB)
DFS Used%: 0%
DFS Remaining%: 94.43%
Last contact: Thu May 02 14:57:47 CST 2013

Name: 135.224.99.66:50010
Decommission Status : Normal
Configured Capacity: 154815664128 (144.18 GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 15259037696 (14.21 GB)
DFS Remaining: 139556585472(129.97 GB)
DFS Used%: 0%
DFS Remaining%: 90.14%
Last contact: Thu May 02 14:57:48 CST 2013

[AFTER]:
Name: 135.224.99.69:50010
Decommission Status : Normal
Configured Capacity: 505882238976 (471.14 GB)
DFS Used: 6883359 (6.56 MB)
Non DFS Used: 28249618401 (26.31 GB)
DFS Remaining: 477625737216(444.82 GB)
DFS Used%: 0%
DFS Remaining%: 94.41%
Last contact: Thu May 02 15:05:32 CST 2013

Name: 135.224.99.66:50010
Decommission Status : Normal
Configured Capacity: 154815664128 (144.18 GB)
DFS Used: 7153051983 (6.66 GB)
Non DFS Used: 15267964593 (14.22 GB)
DFS Remaining: 132394647552(123.3 GB)
DFS Used%: 4.62%
DFS Remaining%: 85.52%
Last contact: Thu May 02 15:05:33 CST 2013

Re: the balance of datanodes

Posted by Harsh J <ha...@cloudera.com>.
With replication factor 1, what you see is expected if you also did
your writes from a node that runs a DN (135.224.99.66 in your case,
the node that filled up; you ran the data load there).

This is because of an HDFS write optimization: if the client finds a
local DN to write to, it writes there. That, coupled with your
replication factor being just one, leads to only the local DN
filling up.
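
To illustrate the local-first behaviour described above, here is a minimal Python sketch. It is not the actual HDFS placement code: the function names are made up for illustration, the fallback to a random datanode is a simplifying assumption, and the external client address 10.0.0.1 is hypothetical. It also assumes the load ran on 135.224.99.66, the node whose DFS Used grew in the [AFTER] report.

```python
import random

def choose_datanode(client_host, datanodes, rng):
    """Pick the target for the single replica of a block (replication = 1)."""
    # Local-first rule: a writer running on a datanode writes to itself.
    if client_host in datanodes:
        return client_host
    # Simplifying assumption: otherwise pick any datanode at random.
    return rng.choice(datanodes)

def simulate_load(client_host, datanodes, num_blocks, seed=0):
    """Count how many blocks each datanode receives for one writer."""
    rng = random.Random(seed)
    counts = {dn: 0 for dn in datanodes}
    for _ in range(num_blocks):
        counts[choose_datanode(client_host, datanodes, rng)] += 1
    return counts

DATANODES = ["135.224.99.69", "135.224.99.66"]

# Writer co-located with a datanode: every block lands on that node.
print(simulate_load("135.224.99.66", DATANODES, num_blocks=100))

# Hypothetical writer outside the cluster: blocks spread over both nodes.
print(simulate_load("10.0.0.1", DATANODES, num_blocks=100))
```

With a higher replication factor the additional replicas would go to other nodes, which is why the skew is most visible with dfs.replication set to 1.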

If you want to balance it, try running the balancer with a threshold
lower than the default of 10% (2%, for example), so that it detects
this small difference in used percentage.
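
As a rough check of why the threshold matters here, the Python sketch below applies a simplified version of the balancer's criterion to the numbers from the [AFTER] report: a node counts as balanced when its DFS Used% is within the threshold of the cluster-wide DFS Used%. The function names are illustrative, and the real balancer does more than this single comparison.

```python
# (configured capacity in bytes, DFS used in bytes) from the [AFTER] report.
NODES = {
    "135.224.99.69:50010": (505882238976, 6883359),
    "135.224.99.66:50010": (154815664128, 7153051983),
}

def utilization_pct(capacity, used):
    """DFS Used% of a single node or of the whole cluster."""
    return 100.0 * used / capacity

def cluster_utilization_pct(nodes):
    """Total DFS used over total configured capacity, as a percentage."""
    total_capacity = sum(cap for cap, _ in nodes.values())
    total_used = sum(used for _, used in nodes.values())
    return utilization_pct(total_capacity, total_used)

def nodes_outside_threshold(nodes, threshold_pct):
    """Nodes whose utilization differs from the cluster average by more
    than threshold_pct percentage points (simplified criterion)."""
    avg = cluster_utilization_pct(nodes)
    return [name for name, (cap, used) in nodes.items()
            if abs(utilization_pct(cap, used) - avg) > threshold_pct]

for threshold in (10.0, 2.0):
    print(threshold, nodes_outside_threshold(NODES, threshold))
```

The cluster-wide utilization is about 1.08%, while 135.224.99.66 sits at about 4.62%, a gap of roughly 3.5 percentage points. That is inside a 10% threshold, so the cluster already counts as balanced, but outside a 2% threshold. In Hadoop 1.x the threshold is normally passed on the command line, for example `hadoop balancer -threshold 2`; check the usage text of your version.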




-- 
Harsh J
