Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2011/12/02 06:35:57 UTC

Utilizing multiple hard disks for Hadoop HDFS?

Hi everyone,

So I have a blade server with 4x500 GB hard disks, and I want to use
all of them for Hadoop HDFS. How can I achieve this?

If I install Hadoop on one hard disk and mount the other hard disks as
normal partitions, e.g. --

/dev/sda1 -- HDD 1 -- primary partition -- Linux + Hadoop installed on it
/dev/sdb1 -- HDD 2 -- mounted at /mnt/sdb1
/dev/sdc1 -- HDD 3 -- mounted at /mnt/sdc1
/dev/sdd1 -- HDD 4 -- mounted at /mnt/sdd1
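
For concreteness, the mounts and per-disk directories would be prepared
roughly like this (a sketch; the "hadoop" user and group names are
assumptions):

sudo mkdir -p /mnt/sdb1 /mnt/sdc1 /mnt/sdd1
sudo mount /dev/sdb1 /mnt/sdb1
sudo mount /dev/sdc1 /mnt/sdc1
sudo mount /dev/sdd1 /mnt/sdd1
# create the same Hadoop directory tree on each disk, owned by the hadoop user
for m in /mnt/sdb1 /mnt/sdc1 /mnt/sdd1; do
    sudo mkdir -p "$m/tmp/hadoop-datastore/hadoop-hadoop"
    sudo chown -R hadoop:hadoop "$m/tmp"
done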

And if I create the same hadoop.tmp.dir path on each partition, say --
"/tmp/hadoop-datastore/hadoop-hadoop"

and in core-site.xml, if I configure it like --
<property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-datastore/hadoop-hadoop,/mnt/sdb1/tmp/hadoop-datastore/hadoop-hadoop,/mnt/sdc1/tmp/hadoop-datastore/hadoop-hadoop,/mnt/sdd1/tmp/hadoop-datastore/hadoop-hadoop</value>
    <description>A base for other temporary directories.</description>
</property>

Will it work?

Can I set a comma-separated list like this for dfs.data.dir as well?

Thanks,
Praveenesh

Re: Utilizing multiple hard disks for Hadoop HDFS?

Posted by Harsh J <ha...@cloudera.com>.
Comma-separated lists apply only to dfs.data.dir (HDFS) and mapred.local.dir (MR); set those two properties directly. Make sure the subdirectories are different for each, else you may accidentally wipe away your HDFS data when you restart the MR services.
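
For example, a sketch using mount points like those in your mail (the
hdfs/data and mapred/local subdirectory names are just placeholders):

In hdfs-site.xml:

<property>
    <name>dfs.data.dir</name>
    <value>/mnt/sdb1/hdfs/data,/mnt/sdc1/hdfs/data,/mnt/sdd1/hdfs/data</value>
</property>

In mapred-site.xml:

<property>
    <name>mapred.local.dir</name>
    <value>/mnt/sdb1/mapred/local,/mnt/sdc1/mapred/local,/mnt/sdd1/mapred/local</value>
</property>

Since the two lists share disks but not subdirectories, an MR restart that
cleans out mapred.local.dir cannot touch the DataNode's blocks.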

The hadoop.tmp.dir property does not accept multiple paths, and you should avoid relying on it in production; it's more of a utility property that acts as a default base path for other properties.
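
For instance, the stock *-default.xml files define the DataNode directories
relative to it, which is why leaving everything at the defaults silently puts
your blocks under /tmp:

<property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-${user.name}</value>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>${hadoop.tmp.dir}/dfs/data</value>
</property>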
