Posted to common-user@hadoop.apache.org by Rafael Turk <ra...@gmail.com> on 2008/07/30 03:37:05 UTC

Hadoop 4 disks per server

Hi All,

 I'm setting up a cluster with 4 disks per server. Is there any way to make
Hadoop aware of this setup and benefit from it?

 *** I'm not planning to use RAID on each node (only on the namenode server),
since high availability is provided by HDFS.

Thanks.
--Rafael

Re: Hadoop 4 disks per server

Posted by Rafael Turk <ra...@gmail.com>.
Thank you all!  It worked like a charm.

On Wed, Jul 30, 2008 at 3:05 PM, Konstantin Shvachko <sh...@yahoo-inc.com> wrote:

> On HDFS, see
> http://wiki.apache.org/hadoop/FAQ#15
> In addition to James's suggestion, you can also specify dfs.name.dir
> for the name-node to store extra copies of the namespace.
>
>
>
> James Moore wrote:
>
>> On Tue, Jul 29, 2008 at 6:37 PM, Rafael Turk <ra...@gmail.com>
>> wrote:
>>
>>> Hi All,
>>>
>>>  I'm setting up a cluster with 4 disks per server. Is there any way to
>>> make Hadoop aware of this setup and benefit from it?
>>>
>>
>> I believe all you need to do is give four directories (one on each
>> drive) as the value for dfs.data.dir and mapred.local.dir.  Something
>> like:
>>
>> <property>
>>  <name>dfs.data.dir</name>
>>
>>  <value>/drive1/myDfsDir,/drive2/myDfsDir,/drive3/myDfsDir,/drive4/myDfsDir</value>
>>  <description>Determines where on the local filesystem a DFS data node
>>  should store its blocks.  If this is a comma-delimited
>>  list of directories, then data will be stored in all named
>>  directories, typically on different devices.
>>  Directories that do not exist are ignored.
>>  </description>
>> </property>
>>
>>

Re: Hadoop 4 disks per server

Posted by Konstantin Shvachko <sh...@yahoo-inc.com>.
On HDFS, see
http://wiki.apache.org/hadoop/FAQ#15
In addition to James's suggestion, you can also specify dfs.name.dir
for the name-node to store extra copies of the namespace.
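
For example, an entry along these lines should keep a redundant copy of the
namespace on a second volume (the paths here are illustrative, and the NFS
mount is just one common choice for the extra copy, not something the thread
prescribes):

```xml
<property>
  <name>dfs.name.dir</name>
  <value>/drive1/nameDir,/mnt/nfs/hadoop/nameDir</value>
  <description>Determines where on the local filesystem the DFS name node
  should store the name table.  If this is a comma-delimited list of
  directories, then the name table is replicated in all of the
  directories, for redundancy.
  </description>
</property>
```

Losing the namespace means losing the whole filesystem, so the second copy is
cheap insurance even on a RAIDed namenode.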


James Moore wrote:
> On Tue, Jul 29, 2008 at 6:37 PM, Rafael Turk <ra...@gmail.com> wrote:
>> Hi All,
>>
>>  I'm setting up a cluster with 4 disks per server. Is there any way to make
>> Hadoop aware of this setup and benefit from it?
> 
> I believe all you need to do is give four directories (one on each
> drive) as the value for dfs.data.dir and mapred.local.dir.  Something
> like:
> 
> <property>
>   <name>dfs.data.dir</name>
>   <value>/drive1/myDfsDir,/drive2/myDfsDir,/drive3/myDfsDir,/drive4/myDfsDir</value>
>   <description>Determines where on the local filesystem a DFS data node
>   should store its blocks.  If this is a comma-delimited
>   list of directories, then data will be stored in all named
>   directories, typically on different devices.
>   Directories that do not exist are ignored.
>   </description>
> </property>
> 

Re: Hadoop 4 disks per server

Posted by James Moore <ja...@gmail.com>.
On Tue, Jul 29, 2008 at 6:37 PM, Rafael Turk <ra...@gmail.com> wrote:
> Hi All,
>
>  I'm setting up a cluster with 4 disks per server. Is there any way to make
> Hadoop aware of this setup and benefit from it?

I believe all you need to do is give four directories (one on each
drive) as the value for dfs.data.dir and mapred.local.dir.  Something
like:

<property>
  <name>dfs.data.dir</name>
  <value>/drive1/myDfsDir,/drive2/myDfsDir,/drive3/myDfsDir,/drive4/myDfsDir</value>
  <description>Determines where on the local filesystem a DFS data node
  should store its blocks.  If this is a comma-delimited
  list of directories, then data will be stored in all named
  directories, typically on different devices.
  Directories that do not exist are ignored.
  </description>
</property>
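
The mapred.local.dir property mentioned above can be filled in the same way.
A sketch using the same illustrative drive paths (the description paraphrases
the property's intent rather than quoting the shipped default):

```xml
<property>
  <name>mapred.local.dir</name>
  <value>/drive1/myMapredDir,/drive2/myMapredDir,/drive3/myMapredDir,/drive4/myMapredDir</value>
  <description>The local directory where MapReduce stores intermediate
  data files.  A comma-separated list of directories on different
  devices spreads disk i/o.  Directories that do not exist are ignored.
  </description>
</property>
```

Keeping the map/reduce scratch space on all four drives spreads the shuffle
and sort i/o the same way dfs.data.dir spreads block storage.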

-- 
James Moore | james@restphone.com
Ruby and Ruby on Rails consulting
blog.restphone.com

Re: Hadoop 4 disks per server

Posted by Allen Wittenauer <aw...@yahoo-inc.com>.
On 7/29/08 6:37 PM, "Rafael Turk" <ra...@gmail.com> wrote:
>  I'm setting up a cluster with 4 disks per server. Is there any way to make
> Hadoop aware of this setup and benefit from it?

    This is how we run our nodes.  You just need to list the four file
systems in the configuration files and the datanode and map/red processes
will know what to do.
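
The advice above boils down to producing one comma-separated value per
property.  A small helper can assemble it from a list of mount points (the
mount points and subdirectory name are illustrative, matching the earlier
example):

```shell
#!/bin/sh
# Build a comma-separated dfs.data.dir value with one subdirectory
# per physical drive (mount points are illustrative).
dirs=""
for d in /drive1 /drive2 /drive3 /drive4; do
  dirs="${dirs:+$dirs,}$d/myDfsDir"
done
echo "$dirs"
# prints /drive1/myDfsDir,/drive2/myDfsDir,/drive3/myDfsDir,/drive4/myDfsDir
```

Paste the printed value into dfs.data.dir (and a similar one into
mapred.local.dir) so the datanode and map/red processes use all four spindles.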