Posted to common-user@hadoop.apache.org by Patai Sangbutsarakum <si...@gmail.com> on 2011/10/24 21:09:59 UTC

balance blocks between small and bigger disks in the same datanode.

Hi All,

I was looking into the FAQ, but I still have questions.
The datanodes in my production cluster are running low on space in one of the dfs.data.dir directories.


Filesystem   Size   Used  Avail  Use%  Mounted on
/dev/sda5    355G   322G    33G   91%  /hadoop1  <----
/dev/sdb1    484G   324G   161G   67%  /hadoop2
/dev/sdc1    484G   318G   167G   66%  /hadoop3

/hadoop1 has had less space from the very beginning because its drive
is shared with the operating system.
I found this FAQ entry on the wiki page:
"3.12. On an individual data node, how do you balance the blocks on the disk?

Hadoop currently does not have a method by which to do this
automatically. To do this manually:

1. Take down the HDFS
2. Use the UNIX mv command to move the individual blocks and meta
   pairs from one directory to another on each host
3. Restart the HDFS"


Question on step 1, "Take down the HDFS":
does that mean the whole cluster, OR just the datanode process on a
datanode/tasktracker host?

Question on step 2:

2.1 "moving blk and meta pair."

Do "blk and meta pairs" refer to files like these?

$ cd /hadoop1/data/current
$ ls -al *8816473533602921489*
-rw-rw-r-- 1 apps apps 1734467 Aug 27 21:03 blk_-8816473533602921489
-rw-rw-r-- 1 apps apps      63 Aug 27 21:03 blk_-8816473533602921489_78445781.meta

2.2 "from one directory to another on each host"

Does the blk (and meta) from "current" have to land in the "current"
directory of another dfs.data.dir, like this?

mv /hadoop1/data/current/*8816473533602921489* /hadoop2/data/current/

Or can the directory name on the destination side be different?


2.3 How about subdirXX?

Under /hadoop1/data/current/:
....
55G     subdir36
49G     subdir37
....

It is so tempting to move subdir36 and subdir37 because they are huge.
Should it look like this?

mv /hadoop1/data/current/subdir36/*  /hadoop2/data/current/subdir36/

Well... /hadoop2/data/current/subdir36/ also has a bunch of blk (and meta)
files and a bunch of subdirectories of its own,
which means that if I do the move, there might be some collisions?
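
Whether such a move would collide can be checked before anything is touched. The following is a minimal Python sketch and not something from the FAQ: find_collisions is a name made up for illustration, and it only compares the entry names directly inside the two directories. As the question notes, the same subdir names exist on both volumes, so nested subdirectories are the most likely hits.

```python
import os


def find_collisions(src_dir, dst_dir):
    """Return the sorted entry names that exist in both directories.

    Only the top level of each directory is compared; nothing is moved
    or modified.
    """
    src_names = set(os.listdir(src_dir))
    dst_names = set(os.listdir(dst_dir))
    return sorted(src_names & dst_names)
```

Calling find_collisions("/hadoop1/data/current/subdir36", "/hadoop2/data/current/subdir36") with the DataNode stopped would list the names that a plain mv of subdir36/* would run into. An empty list means the top-level names are distinct.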


Thanks in advance.
-P

Re: balance blocks between small and bigger disks in the same datanode.

Posted by Patai Sangbutsarakum <si...@gmail.com>.
Sorry for the late but big "Thank you", Harsh.

>You shouldn't be running into write errors with one full
> disk mount, as it will automatically be unselected for writes.
>

This gives me great peace of mind.

Regards,
P



Re: balance blocks between small and bigger disks in the same datanode.

Posted by Harsh J <ha...@cloudera.com>.
Hi,

The block writing mechanism does pay heed to remaining free space while
choosing the disk. You shouldn't be running into write errors with one full
disk mount, as it will automatically be unselected for writes.

The free-space measurement also takes into account the reservation
property you mentioned (dfs.datanode.du.reserved).

And yes, I guess it could be better to decommission and recommission+balance
if you can afford the time.

The writes are round robin in nature, in terms of disk selection, btw.
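
The behaviour described above can be pictured with a small model. This is a simplified Python sketch, not the DataNode's actual code: Volume and RoundRobinChooser are hypothetical names, and usable space is approximated as filesystem free space minus the reserve, whereas the real accounting in a given Hadoop version may differ.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    path: str
    free_bytes: int       # free space reported by the filesystem
    reserved_bytes: int   # stands in for dfs.datanode.du.reserved

    def available(self) -> int:
        # Simplification: usable space = free space minus the reserve.
        return max(0, self.free_bytes - self.reserved_bytes)


class RoundRobinChooser:
    """Visit volumes in turn, skipping any without room for the block."""

    def __init__(self, volumes):
        self.volumes = list(volumes)
        self.next_index = 0

    def choose(self, block_size: int) -> Volume:
        count = len(self.volumes)
        for _ in range(count):
            volume = self.volumes[self.next_index]
            self.next_index = (self.next_index + 1) % count
            if volume.available() > block_size:
                return volume
        raise OSError("no volume has enough space for the block")
```

In this model, with the df figures from the thread and a 33G reserve, /hadoop1 has no usable space left, so successive choose() calls alternate between /hadoop2 and /hadoop3.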


-- 
Harsh J

Re: balance blocks between small and bigger disks in the same datanode.

Posted by Patai Sangbutsarakum <si...@gmail.com>.
Good morning Harsh,
Thanks for the late-night reply ;-)

>> Quick q: were some disks added later, as part of this datanode?
No new disks were added. I just planned to offload data blocks from
that small partition to the other, bigger partitions,
but it seems to me that bringing down 130 nodes just to move blocks is
something that needs to be seriously considered, and later on,
if I run the rebalancer, /hadoop1 will fill back up again.

Is there any way to tell Hadoop to stop using _a partition_ once the free
space of that partition hits a certain limit?

As far as I have researched, this points to "dfs.datanode.du.reserved".
In this case, if I set dfs.datanode.du.reserved = (33G in bytes),
will DFS still continue using /hadoop2, /hadoop3... but not put any more
blocks on /hadoop1?
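
For the "33G in bytes" part, the arithmetic can be sketched as below. This is only an illustration: the helper names are made up, and the "G" in the df output is assumed to mean binary gigabytes (1024^3 bytes). Note also that dfs.datanode.du.reserved is documented as reserved space per volume, so a single value would apply to /hadoop2 and /hadoop3 as well, not only to /hadoop1.

```python
GIB = 1024 ** 3  # assumption: the "G" shown by df is a binary gigabyte


def gib_to_bytes(gib: int) -> int:
    """Convert whole binary gigabytes to bytes."""
    return gib * GIB


def reserved_property(gib: int) -> str:
    """Render an hdfs-site.xml fragment reserving `gib` GiB per volume."""
    return (
        "<property>\n"
        "  <name>dfs.datanode.du.reserved</name>\n"
        f"  <value>{gib_to_bytes(gib)}</value>\n"
        "</property>"
    )
```

Under that assumption, gib_to_bytes(33) gives 35433480192, which is the number that would go into the property value.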

Please suggest,
-Patai




Re: balance blocks between small and bigger disks in the same datanode.

Posted by Harsh J <ha...@cloudera.com>.
Patai,

1. HDFS as the whole service.
2.1. Yes.
2.2. Yes, the parent directory must be current.
2.3. Yes, you can move the whole subdirectory.

Quick q: were some disks added later, as part of this datanode?
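
For reference, answers 2.1 and 2.2 could be turned into a small helper like the one below. This is only a sketch: move_block_pair is a hypothetical name, the blk_<id> / blk_<id>_<genstamp>.meta naming is read off the listing in the question, and it assumes HDFS is stopped (per answer 1) while files are being moved.

```python
import glob
import os
import shutil


def move_block_pair(block_file, src_dir, dst_dir):
    """Move blk_<id> and its single .meta file from src_dir to dst_dir.

    Refuses to run if dst_dir is not under a 'current' directory, if the
    pair is incomplete, or if either name already exists in dst_dir.
    Returns the two file names that were moved.
    """
    if "current" not in os.path.normpath(dst_dir).split(os.sep):
        raise ValueError("destination must be under a 'current' directory")

    block_path = os.path.join(src_dir, block_file)
    meta_pattern = os.path.join(
        glob.escape(src_dir), glob.escape(block_file) + "_*.meta"
    )
    meta_paths = glob.glob(meta_pattern)
    if not os.path.isfile(block_path) or len(meta_paths) != 1:
        raise FileNotFoundError("expected one block file and one meta file")

    pair = [block_path, meta_paths[0]]
    # Check both targets first so a half-moved pair is never left behind.
    for path in pair:
        target = os.path.join(dst_dir, os.path.basename(path))
        if os.path.exists(target):
            raise FileExistsError(target)
    for path in pair:
        shutil.move(path, os.path.join(dst_dir, os.path.basename(path)))
    return [os.path.basename(path) for path in pair]
```

With the paths from the question, move_block_pair("blk_-8816473533602921489", "/hadoop1/data/current", "/hadoop2/data/current") would move the block and its meta file together, or raise without moving anything if the checks fail.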


-- 
Harsh J