Posted to common-user@hadoop.apache.org by Usman Waheed <us...@opera.com> on 2009/04/27 10:36:29 UTC
Balancing datanodes - Running hadoop 0.18.3
Hi,
I sent out an email yesterday asking how to balance the
cluster after setting the replication level to 2. I have 4 datanodes and
one namenode in my setup.
Using the -R switch with -setrep did the trick, but one of my nodes
became under-utilized. I then ran hadoop balancer and it did help, but
only up to a certain extent.
Datanode 4, noted below, is now up to almost 5%, but when I try to balance
the cluster again using the "hadoop balancer" command it says that the
cluster is already balanced, which it isn't.
I wonder if there is an alternative way, or maybe over time Datanode 4
will pick up more blocks?
Any clues?
Thanks,
Usman
Name: 1
State: In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 222235858599 (206.97 GB)
Used raw bytes: 48140136448 (44.83 GB)
% used: 16.39%
Last contact: Mon Apr 27 08:34:46 UTC 2009

Name: 2
State: In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 231235100994 (215.35 GB)
Used raw bytes: 40704245760 (37.91 GB)
% used: 13.86%
Last contact: Mon Apr 27 08:34:45 UTC 2009

Name: 3
State: In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 211936026161 (197.38 GB)
Used raw bytes: 59591700480 (55.5 GB)
% used: 20.28%
Last contact: Mon Apr 27 08:34:45 UTC 2009

Name: 4
State: In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 258876991693 (241.1 GB)
Used raw bytes: 12142653440 (11.31 GB)
% used: 4.13%
Last contact: Mon Apr 27 08:34:46 UTC 2009

Re: Balancing datanodes - Running hadoop 0.18.3
Posted by Usman Waheed <us...@opera.com>.
Hi Tamir,
Thanks for the info, makes sense now :).
Cheers,
Usman
Re: Balancing datanodes - Running hadoop 0.18.3
Posted by Tamir Kamara <ta...@gmail.com>.
Hi,
The balancer works with the average utilization of all the nodes in the
cluster - in your case it's about 13%. Only nodes whose utilization is more
than 10 percentage points above or below the average will be rebalanced.
Node 4 isn't considered under-utilized because 13 - 10 = 3%, which is less
than its 4.13%. You can use a threshold other than the default
10% (hadoop balancer -threshold 5). Read more here:
http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Rebalancer
Tamir
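
The threshold arithmetic above can be checked with a short Python sketch. It uses the used and total byte counts from the datanode report in this thread. The function name outside_threshold and the dictionary layout are made up for illustration; this only mirrors the "within N percentage points of the cluster average" rule described above and is not the balancer's actual code.

```python
# Used and total raw bytes per datanode, copied from the report in this thread.
nodes = {
    "1": (48140136448, 293778976768),
    "2": (40704245760, 293778976768),
    "3": (59591700480, 293778976768),
    "4": (12142653440, 293778976768),
}


def outside_threshold(nodes, threshold):
    """Hypothetical helper: return the cluster average utilization (in %)
    and the nodes whose utilization differs from that average by more
    than `threshold` percentage points."""
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(cap for _, cap in nodes.values())
    average = 100.0 * total_used / total_capacity

    flagged = {}
    for name, (used, cap) in nodes.items():
        pct = 100.0 * used / cap
        if abs(pct - average) > threshold:
            flagged[name] = pct
    return average, flagged


# Default threshold of 10, then the tighter threshold of 5 suggested above.
average, flagged_default = outside_threshold(nodes, 10)
_, flagged_tighter = outside_threshold(nodes, 5)

print(f"cluster average: {average:.2f}%")
print("outside threshold 10:", sorted(flagged_default))
print("outside threshold 5:", sorted(flagged_tighter))
```

With the default threshold of 10 no node is flagged, which matches the "cluster is already balanced" message. With a threshold of 5, nodes 3 and 4 fall outside the band, so running the balancer with -threshold 5 should have work to do.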