Posted to common-user@hadoop.apache.org by Mike Andrews <mr...@xoba.com> on 2009/04/23 21:54:24 UTC

sub-optimal multiple disk usage in 0.18.3?

i have a bunch of datanodes with several disks each, and i noticed
that sometimes dfs blocks don't get evenly distributed among them. for
instance, one of my machines has 5 disks with 500 GB each, and 1 disk
with 2 TB (6 total disks). the 5 smaller disks are each 98% full,
whereas the larger one is only 12% full. it seems as though dfs should
do better by putting more of the blocks on the larger disk first. and
mapreduce jobs are failing on this machine with error
"java.io.IOException: No space left on device".

any thoughts or suggestions? thanks in advance.

-- 
permanent contact information at http://mikerandrews.com

Re: sub-optimal multiple disk usage in 0.18.3?

Posted by jason hadoop <ja...@gmail.com>.
In theory the block allocation strategy is round robin among the set of
storage locations that meet the minimum free space requirements.
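
To make the round-robin behaviour concrete, here is a minimal sketch in
Java. It is not the DataNode's actual code: the class name, the Volume
type, the reservedGb parameter and the GB-sized "blocks" are all
assumptions made for illustration. It only shows why a strict rotation
gives every eligible disk the same number of blocks regardless of its
capacity.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical round-robin chooser, not Hadoop's implementation.
public class RoundRobinVolumeSketch {

    /** One storage directory (disk). Sizes are in GB to keep the numbers readable. */
    static final class Volume {
        final String name;
        final long capacityGb;
        long usedGb;

        Volume(String name, long capacityGb, long usedGb) {
            this.name = name;
            this.capacityGb = capacityGb;
            this.usedGb = usedGb;
        }

        long freeGb() {
            return capacityGb - usedGb;
        }
    }

    private final List<Volume> volumes;
    private final long reservedGb; // assumed minimum free space to keep on each volume
    private int next = 0;          // index of the volume to try first on the next call

    RoundRobinVolumeSketch(List<Volume> volumes, long reservedGb) {
        this.volumes = volumes;
        this.reservedGb = reservedGb;
    }

    /** Picks the next volume in rotation that can hold the block without dipping into the reserve. */
    Volume choose(long blockGb) {
        for (int i = 0; i < volumes.size(); i++) {
            Volume v = volumes.get(next);
            next = (next + 1) % volumes.size();
            if (v.freeGb() - reservedGb >= blockGb) {
                v.usedGb += blockGb;
                return v;
            }
        }
        throw new IllegalStateException("no volume has enough free space");
    }

    public static void main(String[] args) {
        // Same layout as in the original post: five 500 GB disks and one 2 TB disk.
        List<Volume> disks = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            disks.add(new Volume("small" + i, 500, 0));
        }
        disks.add(new Volume("big", 2000, 0));

        RoundRobinVolumeSketch chooser = new RoundRobinVolumeSketch(disks, 10);
        for (int i = 0; i < 600; i++) {
            chooser.choose(1); // pretend each block is 1 GB
        }
        for (Volume v : disks) {
            System.out.printf("%s: %d GB used (%d%% full)%n",
                    v.name, v.usedGb, 100 * v.usedGb / v.capacityGb);
        }
    }
}
```

In this sketch every disk receives the same 100 GB, so the small disks
end up 20% full while the large one is only 5% full. The rotation
balances block counts, not percentage used; the large disk only starts
taking more once the small ones fall below the free-space threshold.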

On Thu, Apr 23, 2009 at 12:55 PM, Bhupesh Bansal <bb...@linkedin.com> wrote:

> What configuration are you using for the disks ??
>
> Best configuration is just doing a JBOD.
>
> http://www.nabble.com/RAID-vs.-JBOD-td21404366.html
>
> Best
> Bhupesh
>
>
>


-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422

Re: sub-optimal multiple disk usage in 0.18.3?

Posted by Bhupesh Bansal <bb...@linkedin.com>.
What configuration are you using for the disks?

The best configuration is plain JBOD (each disk mounted separately, no RAID).

http://www.nabble.com/RAID-vs.-JBOD-td21404366.html

Best
Bhupesh


