Posted to hdfs-user@hadoop.apache.org by Tao Xiao <xi...@gmail.com> on 2013/12/16 08:02:20 UTC

How to set "hadoop.tmp.dir" if I have multiple disks per node?

I have ten disks per node, and I don't know what value I should set for
"hadoop.tmp.dir". Some say this property refers to a location on the local
disk, while others say it refers to a directory in HDFS. I'm confused.
Can anyone explain it?

I want to spread I/O across the ten disks, so should I set
"hadoop.tmp.dir" to a comma-separated list of directories (each on a
different disk)?

Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Tao Xiao <xi...@gmail.com>.
Yes, hadoop.tmp.dir is both local and HDFS.


2013/12/17 Raviteja Chirala <rt...@gmail.com>

> If I am not wrong, hadoop.tmp.dir applies to both the local file system
> and HDFS. Whatever directory you use locally, create the same path in HDFS.
>
>
> On Mon, Dec 16, 2013 at 5:05 PM, Tao Xiao <xi...@gmail.com> wrote:
>
>> Thanks very much, I suppose I know what I should do with
>>
>>
>> On Mon, Dec 16, 2013 at 5:27 PM, Vinayakumar B <vi...@huawei.com> wrote:
>>
>>>  Hi,
>>>
>>>
>>>
>>> *hadoop.tmp.dir* is not the setting you are looking for to spread
>>> disk I/O.
>>>
>>> It is the default base directory (a single directory, not a list)
>>> used when you have not configured your own directories for processes
>>> such as the NameNode, DataNode and NodeManager.
>>>
>>>
>>>
>>> The settings that accept comma-separated values are as follows:
>>>
>>> 1. *dfs.namenode.name.dir* for the NameNode, in *hdfs-site.xml*
>>> 2. *dfs.datanode.data.dir* for the DataNode, in *hdfs-site.xml*
>>> 3. *yarn.nodemanager.local-dirs* for the NodeManager, in *yarn-site.xml*
>>>
>>>
>>>
>>> Please note that all of the above settings are for Hadoop 2.x.
>>>
>>>
>>>
>>> Configure different subdirectories if you are using the same disk for
>>> multiple processes, for example:
>>>
>>>     /hadoop/data1/dfs/data
>>>     /hadoop/data1/yarn/nm-local-dir
>>>
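A minimal sketch of the three settings listed above, assuming data disks mounted at /mnt/disk1 through /mnt/disk3 (hypothetical mount points borrowed from the question) and the per-process subdirectory layout suggested in the reply. Each <property> belongs inside the <configuration> element of the file named in the comment above it. Note that the NameNode writes a full copy of its metadata to every directory in dfs.namenode.name.dir, so that list gives redundancy rather than spreading I/O; the DataNode and NodeManager lists are the ones that distribute load across disks.

```xml
<!-- hdfs-site.xml: the NameNode keeps an identical copy of its metadata in each directory -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/mnt/disk1/dfs/name,/mnt/disk2/dfs/name</value>
</property>

<!-- hdfs-site.xml: the DataNode distributes block storage across these directories -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/mnt/disk1/dfs/data,/mnt/disk2/dfs/data,/mnt/disk3/dfs/data</value>
</property>

<!-- yarn-site.xml: the NodeManager uses these directories for localized files and container scratch data -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/mnt/disk1/yarn/nm-local-dir,/mnt/disk2/yarn/nm-local-dir,/mnt/disk3/yarn/nm-local-dir</value>
</property>
```

With ten disks per node, the DataNode and NodeManager values would carry ten entries each, one per mount point.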
>>>
>>>
>>>
>>>
>>> Cheers,
>>>
>>> Vinayakumar B
>>>
>>> *From:* Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
>>> *Sent:* 16 December 2013 14:42
>>> *To:* user@hadoop.apache.org
>>> *Subject:* Re: How to set "hadoop.tmp.dir" if I have multiple disks per
>>> node?
>>>
>>>
>>>
>>> Thanks.
>>>
>>> In order to spread I/O among multiple disks, should I assign a
>>> comma-separated list of directories which are located on different disks to
>>> "hadoop.tmp.dir"?
>>>
>>> For example:
>>>
>>>   <property>
>>>     <name>hadoop.tmp.dir</name>
>>>     <value>/mnt/disk1/hadoop_tmp_dir,/mnt/disk2/hadoop_tmp_dir,/mnt/disk3/hadoop_tmp_dir</value>
>>>   </property>
>>>
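Going by the reply quoted earlier in this thread, hadoop.tmp.dir takes a single base directory rather than a list. A sketch of what that would look like in core-site.xml, reusing one of the hypothetical paths from the example above:

```xml
<!-- core-site.xml: one base directory only; other defaults are derived from it -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/disk1/hadoop_tmp_dir</value>
</property>
```

Spreading I/O across disks is then handled by the per-daemon directory lists, not by this property.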
>>>
>>>
>>> 2013/12/16 Shekhar Sharma <sh...@gmail.com>
>>>
>>> hadoop.tmp.dir is a directory created on the local file system.
>>> For example, suppose you have set the hadoop.tmp.dir property to
>>> /home/training/hadoop.
>>>
>>> This directory will be created when you format the NameNode by running
>>> the command
>>> hadoop namenode -format
>>>
>>> When you open this folder, you will see two subfolders, dfs and mapred.
>>>
>>> The /home/training/hadoop/mapred folder will also exist on HDFS.
>>>
>>> Hope this clears it up.
>>> Regards,
>>> Som Shekhar Sharma
>>> +91-8197243810
>>>
>>>
>>>
>>> On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <dr...@gmail.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > Make sure to also set mapred.local.dir to the same set of output
>>> > directories; this is where the intermediate key-value pairs are stored!
>>> >
>>> > Regards, Dieter
>>> >
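For reference, mapred.local.dir is the Hadoop 1.x (MRv1) property name, set in mapred-site.xml, and it accepts a comma-separated list of directories. A sketch, assuming the same hypothetical /mnt/diskN mount points used in the question; the mapred/local subdirectory name is only illustrative:

```xml
<!-- mapred-site.xml (Hadoop 1.x): intermediate map output is spread across these directories -->
<property>
  <name>mapred.local.dir</name>
  <value>/mnt/disk1/mapred/local,/mnt/disk2/mapred/local,/mnt/disk3/mapred/local</value>
</property>
```

On a Hadoop 2.x cluster running YARN, task scratch space is, as far as I know, taken from yarn.nodemanager.local-dirs instead, so this property matters mainly for MRv1 deployments.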
>>> >
>>> > 2013/12/16 Tao Xiao <xi...@gmail.com>
>>> >>
>>> >> I have ten disks per node, and I don't know what value I should set for
>>> >> "hadoop.tmp.dir". Some say this property refers to a location on the
>>> >> local disk, while others say it refers to a directory in HDFS. I'm
>>> >> confused. Can anyone explain it?
>>> >>
>>> >> I want to spread I/O across the ten disks, so should I set
>>> >> "hadoop.tmp.dir" to a comma-separated list of directories (each on a
>>> >> different disk)?
>>> >
>>> >
>>>
>>>
>>>
>>
>>
>

Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Raviteja Chirala <rt...@gmail.com>.
If I am not wrong, hadoop.tmp.dir applies to both the local file system and HDFS. Whatever directory you use locally, create the same path in HDFS.



Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Tao Xiao <xi...@gmail.com>.
Thanks very much, I suppose I know what I should do with



Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Tao Xiao <xi...@gmail.com>.
Thanks very much, I suppose I know what I should do with


On Mon, Dec 16, 2013 at 5:27 PM, Vinayakumar B <vi...@huawei.com>wrote:

>  Hi,
>
>
>
> *hadoop.tmp.dir* is not the exact configuration you are looking for
> spreading the disk I/O
>
>
>
> This is the default base directory ( its single directory not multiple)
> used in case you didn’t configure your own directories for processes such
> as NameNode, DataNode and NodeManager.
>
>
>
> Exact configurations where you need to configure comma separated values
> are as follows.
>
> 1.       *dfs.namenode.name.dir* for namenode in *hdfs-site.xml*
>
> 2.       *dfs.datanode.data.dir* for datanode in *hdfs-site.xml*
>
> 3.       *yarn.nodemanager.local-dirs* for NodeManager in *yarn-site.xml*
>
>
>
> Please note all above configurations are for Hadoop 2.x
>
>
>
> Configure different subdirectories if you are using same disk for multiple
> processes.
>
>                 Ex: /hadoop/data1/dfs/data
>
>                         And
>
>                      /hadoop/data1/yarn/nm-local-dir
>
>
>
>
>
> Cheers,
>
> Vinayakumar B
>
> *From:* Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
> *Sent:* 16 December 2013 14:42
> *To:* user@hadoop.apache.org
> *Subject:* Re: How to set "hadoop.tmp.dir" if I have multiple disks per
> node?
>
>
>
> Thanks.
>
> In order to spread I/O among multiple disks, should I assign a
> comma-separated list of directories which are located on different disks to
> "hadoop.tmp.dir"?
>
> for example,
>
>  <property>
>       <name>hadoop.tmp.dir</name>
>       <value>/mnt/disk1/hadoop_tmp_dir,/mnt/disk2/hadoop_tmp_dir,/mnt/disk3/hadoop_tmp_dir</value>
>  </property>
>
>
>
> 2013/12/16 Shekhar Sharma <sh...@gmail.com>
>
> hadoop.tmp.dir is a directory created on local file system
> For example if you have set hadoop.tmp.dir property to
> /home/training/hadoop
>
> This directory will be created when you format the namenode by running
> the command
> hadoop namenode -format
>
> When you open this folder
>
>
> you will see two subfolders dfs and mapred.
>
> the /home/training/hadoop/mapred folder will be on HDFS also
>
> Hope this clears
> Regards,
> Som Shekhar Sharma
> +91-8197243810
>
>
>
> On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <dr...@gmail.com>
> wrote:
> > Hi,
> >
> > Make sure to also set mapred.local.dir to the same set of output
> > directories; this is where the intermediate key-value pairs are stored!
> >
> > Regards, Dieter
> >
> >
> > 2013/12/16 Tao Xiao <xi...@gmail.com>
> >>
> >> I have ten disks per node,and I don't know what value I should set to
> >> "hadoop.tmp.dir". Some said this property refers to a location in local
> disk
> >> while some other said it refers to a directory in HDFS. I'm confused,
> who
> >> can explain it ?
> >>
> >> I want to spread I/O since I have ten disks per node, so should I set a
> >> comma-separated list of directories (which are on different disks) to
> >> "hadoop.tmp.dir" ?
> >
> >
>
>
>

RE: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Vinayakumar B <vi...@huawei.com>.
Hi,

hadoop.tmp.dir is not the configuration you are looking for if you want to spread disk I/O.

This is the default base directory (a single directory, not multiple) that is used when you have not configured your own directories for processes such as the NameNode, DataNode and NodeManager.
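
To illustrate the point above, here is a minimal sketch of how a single hadoop.tmp.dir value in core-site.xml acts as the base for other defaults in Hadoop 2.x. The path /hadoop/tmp is only an assumed example. The fallback values in the comment are the ones listed in the stock hdfs-default.xml and yarn-default.xml, so please check them against your own release.

```xml
<!-- core-site.xml: a sketch, assuming /hadoop/tmp exists on one local disk -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- one directory only, not a comma-separated list -->
    <value>/hadoop/tmp</value>
  </property>
  <!--
    If left unset, these Hadoop 2.x properties fall back to paths under it:
      dfs.namenode.name.dir       = file://${hadoop.tmp.dir}/dfs/name
      dfs.datanode.data.dir       = file://${hadoop.tmp.dir}/dfs/data
      yarn.nodemanager.local-dirs = ${hadoop.tmp.dir}/nm-local-dir
  -->
</configuration>
```

Because everything falls back to one directory, leaving only hadoop.tmp.dir configured puts all of that I/O on a single disk.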

The configurations that accept comma-separated values are as follows.

1.       dfs.namenode.name.dir for the NameNode in hdfs-site.xml

2.       dfs.datanode.data.dir for the DataNode in hdfs-site.xml

3.       yarn.nodemanager.local-dirs for the NodeManager in yarn-site.xml

Please note that all of the above configurations are for Hadoop 2.x.

Configure different subdirectories if you are using the same disk for multiple processes.
                Ex: /hadoop/data1/dfs/data
                        And
                     /hadoop/data1/yarn/nm-local-dir
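
A minimal sketch of how these settings might look on a node with three data disks. The mount points /hadoop/data2 and /hadoop/data3 are hypothetical and added only for illustration (only /hadoop/data1 appears in the example above). Each disk gets separate subdirectories for the DataNode and the NodeManager.

```xml
<!-- hdfs-site.xml: DataNode block storage, one directory per disk -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/hadoop/data1/dfs/data,/hadoop/data2/dfs/data,/hadoop/data3/dfs/data</value>
</property>

<!-- yarn-site.xml: NodeManager local (intermediate) files, one directory per disk -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/hadoop/data1/yarn/nm-local-dir,/hadoop/data2/yarn/nm-local-dir,/hadoop/data3/yarn/nm-local-dir</value>
</property>
```

Note that dfs.namenode.name.dir behaves differently: as the stock hdfs-default.xml describes it, a comma-separated list there replicates the name table into every listed directory for redundancy rather than spreading load across them.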


Cheers,
Vinayakumar B
From: Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
Sent: 16 December 2013 14:42
To: user@hadoop.apache.org
Subject: Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Thanks.

In order to spread I/O among multiple disks, should I assign a comma-separated list of directories which are located on different disks to "hadoop.tmp.dir"?
for example,
 <property>
      <name>hadoop.tmp.dir</name>
      <value>/mnt/disk1/hadoop_tmp_dir,/mnt/disk2/hadoop_tmp_dir,/mnt/disk3/hadoop_tmp_dir</value>
 </property>

2013/12/16 Shekhar Sharma <sh...@gmail.com>
hadoop.tmp.dir is a directory created on the local file system.
For example, suppose you have set the hadoop.tmp.dir property to /home/training/hadoop.

This directory will be created when you format the namenode by running
the command
hadoop namenode -format

When you open this folder,
you will see two subfolders, dfs and mapred.

The /home/training/hadoop/mapred folder will be on HDFS also.

Hope this clears things up.
Regards,
Som Shekhar Sharma
+91-8197243810


On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <dr...@gmail.com> wrote:
> Hi,
>
> Make sure to also set mapred.local.dir to the same set of output
> directories; this is where the intermediate key-value pairs are stored!
>
> Regards, Dieter
>
>
> 2013/12/16 Tao Xiao <xi...@gmail.com>
>>
>> I have ten disks per node, and I don't know what value I should set for
>> "hadoop.tmp.dir". Some said this property refers to a location on local disk
>> while others said it refers to a directory in HDFS. I'm confused; who
>> can explain it?
>>
>> I want to spread I/O since I have ten disks per node, so should I set a
>> comma-separated list of directories (which are on different disks) for
>> "hadoop.tmp.dir"?
>
>


Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Azuryy Yu <az...@gmail.com>.
Hi Tao,

No, you need to set mapred.local.dir in mapred-site.xml to a comma-separated
list of paths to spread I/O.
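
As a sketch of that suggestion for Hadoop 1.x (MRv1), assuming the three mount points from the question and a made-up mapred/local subdirectory name:

```xml
<!-- mapred-site.xml (Hadoop 1.x / MRv1): intermediate map output, one directory per disk -->
<property>
  <name>mapred.local.dir</name>
  <value>/mnt/disk1/mapred/local,/mnt/disk2/mapred/local,/mnt/disk3/mapred/local</value>
</property>
```

On a YARN cluster (Hadoop 2.x), yarn.nodemanager.local-dirs in yarn-site.xml plays this role instead.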


On Mon, Dec 16, 2013 at 5:11 PM, Tao Xiao <xi...@gmail.com> wrote:

> Thanks.
>
> In order to spread I/O among multiple disks, should I assign a
> comma-separated list of directories which are located on different disks to
> "hadoop.tmp.dir"?
>
> for example,
>
>  <property>
>       <name>hadoop.tmp.dir</name>
>       <value>/mnt/disk1/hadoop_tmp_dir,/mnt/disk2/hadoop_tmp_dir,/mnt/disk3/hadoop_tmp_dir</value>
>  </property>
>
>
> 2013/12/16 Shekhar Sharma <sh...@gmail.com>
>
>> hadoop.tmp.dir is a directory created on local file system
>> For example if you have set hadoop.tmp.dir property to
>> /home/training/hadoop
>>
>> This directory will be created when you format the namenode by running
>> the command
>> hadoop namenode -format
>>
>> When you open this folder
>>
>>
>> you will see two subfolders dfs and mapred.
>>
>> the /home/training/hadoop/mapred folder will be on HDFS also
>>
>> Hope this clears
>> Regards,
>> Som Shekhar Sharma
>> +91-8197243810
>>
>>
>> On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <dr...@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > Make sure to also set mapred.local.dir to the same set of output
>> > directories, this is were the intermediate key-value pairs are stored!
>> >
>> > Regards, Dieter
>> >
>> >
>> > 2013/12/16 Tao Xiao <xi...@gmail.com>
>> >>
>> >> I have ten disks per node,and I don't know what value I should set to
>> >> "hadoop.tmp.dir". Some said this property refers to a location in
>> local disk
>> >> while some other said it refers to a directory in HDFS. I'm confused,
>> who
>> >> can explain it ?
>> >>
>> >> I want to spread I/O since I have ten disks per node, so should I set a
>> >> comma-separated list of directories (which are on different disks) to
>> >> "hadoop.tmp.dir" ?
>> >
>> >
>>
>
>

Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Tao Xiao <xi...@gmail.com>.
Thanks.

In order to spread I/O among multiple disks, should I assign a
comma-separated list of directories which are located on different disks to
"hadoop.tmp.dir"?

for example,

 <property>
      <name>hadoop.tmp.dir</name>
      <value>/mnt/disk1/hadoop_tmp_dir,/mnt/disk2/hadoop_tmp_dir,/mnt/disk3/hadoop_tmp_dir</value>
 </property>


2013/12/16 Shekhar Sharma <sh...@gmail.com>

> hadoop.tmp.dir is a directory created on local file system
> For example if you have set hadoop.tmp.dir property to
> /home/training/hadoop
>
> This directory will be created when you format the namenode by running
> the command
> hadoop namenode -format
>
> When you open this folder
>
>
> you will see two subfolders dfs and mapred.
>
> the /home/training/hadoop/mapred folder will be on HDFS also
>
> Hope this clears
> Regards,
> Som Shekhar Sharma
> +91-8197243810
>
>
> On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <dr...@gmail.com>
> wrote:
> > Hi,
> >
> > Make sure to also set mapred.local.dir to the same set of output
> > directories, this is were the intermediate key-value pairs are stored!
> >
> > Regards, Dieter
> >
> >
> > 2013/12/16 Tao Xiao <xi...@gmail.com>
> >>
> >> I have ten disks per node,and I don't know what value I should set to
> >> "hadoop.tmp.dir". Some said this property refers to a location in local
> disk
> >> while some other said it refers to a directory in HDFS. I'm confused,
> who
> >> can explain it ?
> >>
> >> I want to spread I/O since I have ten disks per node, so should I set a
> >> comma-separated list of directories (which are on different disks) to
> >> "hadoop.tmp.dir" ?
> >
> >
>

Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Shekhar Sharma <sh...@gmail.com>.
hadoop.tmp.dir is a directory created on the local file system.
For example, suppose you have set the hadoop.tmp.dir property to
/home/training/hadoop.

This directory will be created when you format the NameNode by running
the command
hadoop namenode -format

When you open this folder, you will see two subfolders, dfs and mapred.

The /home/training/hadoop/mapred path will also exist on HDFS.
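
As a rough sketch of why both answers come up (assuming Hadoop 1.x
defaults; the path is just the example above), hadoop.tmp.dir takes a
single directory in core-site.xml, and several other properties default
to locations under it:

  <!-- core-site.xml: a single base directory, not a comma-separated list -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/training/hadoop</value>
  </property>

  <!-- Hadoop 1.x defaults derived from it (shown for reference only):
       dfs.name.dir      = ${hadoop.tmp.dir}/dfs/name      (local disk)
       dfs.data.dir      = ${hadoop.tmp.dir}/dfs/data      (local disk)
       mapred.local.dir  = ${hadoop.tmp.dir}/mapred/local  (local disk)
       mapred.system.dir = ${hadoop.tmp.dir}/mapred/system (resolved on the
                           default file system, which is usually HDFS)
  -->

So the same path string can show up both locally and in HDFS, depending
on which property uses it.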

Hope this clears things up.
Regards,
Som Shekhar Sharma
+91-8197243810


On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <dr...@gmail.com> wrote:
> Hi,
>
> Make sure to also set mapred.local.dir to the same set of output
> directories; this is where the intermediate key-value pairs are stored!
>
> Regards, Dieter
>
>
> 2013/12/16 Tao Xiao <xi...@gmail.com>
>>
>> I have ten disks per node, and I don't know what value I should set for
>> "hadoop.tmp.dir". Some said this property refers to a location on the local
>> disk, while others said it refers to a directory in HDFS. I'm confused; who
>> can explain it?
>>
>> I want to spread I/O since I have ten disks per node, so should I set a
>> comma-separated list of directories (which are on different disks) for
>> "hadoop.tmp.dir"?
>
>

Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by Dieter De Witte <dr...@gmail.com>.
Hi,

Make sure to also set mapred.local.dir to the same set of output
directories; this is where the intermediate key-value pairs are stored!
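
A minimal sketch of what that might look like in mapred-site.xml on
Hadoop 1.x, assuming three disks mounted at the hypothetical paths
/mnt/disk1 to /mnt/disk3 (the mapred_local subdirectory name is made up
for illustration):

  <property>
    <name>mapred.local.dir</name>
    <value>/mnt/disk1/mapred_local,/mnt/disk2/mapred_local,/mnt/disk3/mapred_local</value>
  </property>

Unlike hadoop.tmp.dir, this property accepts a comma-separated list, so
the intermediate data is spread across the listed disks.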

Regards, Dieter


2013/12/16 Tao Xiao <xi...@gmail.com>

> I have ten disks per node, and I don't know what value I should set for
> "hadoop.tmp.dir". Some said this property refers to a location on the local
> disk, while others said it refers to a directory in HDFS. I'm confused;
> who can explain it?
>
> I want to spread I/O since I have ten disks per node, so should I set a
> comma-separated list of directories (which are on different disks) for
> "hadoop.tmp.dir"?
>

Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Posted by shashwat shriparv <dw...@gmail.com>.
You can set hadoop.tmp.dir to a directory, or you can mount a disk and put
its path (like /mnt) in the configuration file.

You should also set the right permissions on the mounted disk.

*Thanks & Regards    *

∞
Shashwat Shriparv



On Mon, Dec 16, 2013 at 12:32 PM, Tao Xiao <xi...@gmail.com> wrote:

> I have ten disks per node, and I don't know what value I should set for
> "hadoop.tmp.dir". Some said this property refers to a location on the local
> disk, while others said it refers to a directory in HDFS. I'm confused;
> who can explain it?
>
> I want to spread I/O since I have ten disks per node, so should I set a
> comma-separated list of directories (which are on different disks) for
> "hadoop.tmp.dir"?
>
