Posted to common-user@hadoop.apache.org by "S. Nunes" <sn...@gmail.com> on 2008/01/08 20:08:04 UTC

Limit the space used by hadoop on a slave node

Hi,

I'm trying to install hadoop on a set of computers that are not
exclusively dedicated to run hadoop.
Our goal is to use these computers in the hadoop cluster when they are
inactive (during night).

I would like to know if it is possible to limit the space used by
hadoop at a slave node.
Something like "hadoop.tmp.dir.max". I do not want hadoop to use all
the available disk space.

Thanks in advance for any help on this issue,

--
Sérgio Nunes

Re: Limit the space used by hadoop on a slave node

Posted by Ted Dunning <td...@veoh.com>.
I have both settings but have still had disk-full problems.  I can't be sure
right now whether this occurred under 14.4 or 15.1, but I think it was 15.1.

In any case, new file creation from a non-datanode host is definitely not
well balanced and will lead to disk-full conditions if you have dramatically
different-sized partitions on the different datanodes.  Also, if you have a
small and a large partition on a single node, the small partition will fill
up and cause corruption.  I had to go to single partitions on all nodes to
avoid this.
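
A toy simulation of that failure mode (this is not the actual datanode
block-placement code, just round-robin placement that ignores per-volume
free space; all numbers below are made up for illustration):

```python
# Naive round-robin block placement across volumes of unequal size:
# the small volume is overdrawn long before the large one fills.
def place_round_robin(free, block, n):
    """Place n blocks of `block` bytes round-robin; return remaining free
    bytes per volume (negative means the volume ran out of space)."""
    free = list(free)
    for i in range(n):
        free[i % len(free)] -= block  # ignores how much space is left
    return free

small, large = 5 * 10**9, 100 * 10**9      # 5 GB and 100 GB volumes
left = place_round_robin([small, large], 64 * 10**6, 200)  # 200 x 64 MB blocks
print(left[0] < 0, left[1] > 0)  # prints: True True
```

After only 200 blocks (about 6.4 GB per volume), the 5 GB partition is
already past full while the 100 GB one is barely touched, which matches
the corruption scenario described above.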

<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- 10 GB -->
  <value>10000000000</value>
  <description>Reserved space in bytes. Always leave this much space free
  for non-DFS use.</description>
</property>

<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.9f</value>
  <description>When calculating remaining space, only use this percentage of
the real available space
  </description>
</property>
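
The arithmetic behind the two settings, as paraphrased from their
descriptions (a rough sketch only; the actual datanode logic differs
between 0.14 and 0.15, per HADOOP-1463):

```python
# dfs.datanode.du.reserved: always keep this many bytes free for non-DFS use.
def usable_with_reserved(disk_free_bytes, reserved_bytes):
    return max(0, disk_free_bytes - reserved_bytes)

# dfs.datanode.du.pct: only count this fraction of the real free space.
def usable_with_pct(disk_free_bytes, pct):
    return int(disk_free_bytes * pct)

free = 50 * 10**9  # 50 GB actually free on the volume
print(usable_with_reserved(free, 10 * 10**9))  # 40000000000
print(usable_with_pct(free, 0.9))              # 45000000000
```

So with 50 GB free, the reserved setting above leaves DFS 40 GB, while a
0.9 pct setting would leave it 45 GB.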



On 1/8/08 1:30 PM, "Koji Noguchi" <kn...@yahoo-inc.com> wrote:

> We use, 
> 
> dfs.datanode.du.pct for 0.14 and dfs.datanode.du.reserved for 0.15.
> 
> Change was made in the Jira Hairong mentioned.
> https://issues.apache.org/jira/browse/HADOOP-1463
> 
> Koji
> 
>> -----Original Message-----
>> From: Ted Dunning [mailto:tdunning@veoh.com]
>> Sent: Tuesday, January 08, 2008 1:13 PM
>> To: hadoop-user@lucene.apache.org
>> Subject: Re: Limit the space used by hadoop on a slave node
>> 
>> 
>> I think I have seen related bad behavior on 15.1.
>> 
>> On 1/8/08 11:49 AM, "Hairong Kuang" <ha...@yahoo-inc.com> wrote:
>> 
>>> Has anybody tried 15.0? Please check
>>> https://issues.apache.org/jira/browse/HADOOP-1463.
>>> 
>>> Hairong
>>> -----Original Message-----
>>> From: Joydeep Sen Sarma [mailto:jssarma@facebook.com]
>>> Sent: Tuesday, January 08, 2008 11:33 AM
>>> To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
>>> Subject: RE: Limit the space used by hadoop on a slave node
>>> 
>>> at least up until 14.4, these options are broken. see
>>> https://issues.apache.org/jira/browse/HADOOP-2549
>>> 
>>> (there's a trivial patch - but i am still testing).
>>> 
>>> 
> 


RE: Limit the space used by hadoop on a slave node

Posted by Koji Noguchi <kn...@yahoo-inc.com>.
We use dfs.datanode.du.pct for 0.14 and dfs.datanode.du.reserved for 0.15.

The change was made in the Jira Hairong mentioned:
https://issues.apache.org/jira/browse/HADOOP-1463

Koji

> -----Original Message-----
> From: Ted Dunning [mailto:tdunning@veoh.com]
> Sent: Tuesday, January 08, 2008 1:13 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: Limit the space used by hadoop on a slave node
> 
> 
> I think I have seen related bad behavior on 15.1.
> 
> On 1/8/08 11:49 AM, "Hairong Kuang" <ha...@yahoo-inc.com> wrote:
> 
> > Has anybody tried 15.0? Please check
> > https://issues.apache.org/jira/browse/HADOOP-1463.
> >
> > Hairong
> > -----Original Message-----
> > From: Joydeep Sen Sarma [mailto:jssarma@facebook.com]
> > Sent: Tuesday, January 08, 2008 11:33 AM
> > To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
> > Subject: RE: Limit the space used by hadoop on a slave node
> >
> > at least up until 14.4, these options are broken. see
> > https://issues.apache.org/jira/browse/HADOOP-2549
> >
> > (there's a trivial patch - but i am still testing).
> >
> >


Re: Limit the space used by hadoop on a slave node

Posted by Ted Dunning <td...@veoh.com>.
My problem was caused purely by copying files to HDFS using [hadoop dfs
-put].  No map/reduce activity was going on at the time (and all of the jobs
I had around that time were counting jobs with very powerful reduction in
data volume due to combiner functions).


On 1/8/08 1:32 PM, "Hairong Kuang" <ha...@yahoo-inc.com> wrote:

> Most of the time dfs and map/reduce share disks. Keep in mind that the du
> options cannot control how much space map/reduce tasks take.
> Sometimes we get out-of-disk-space problems because data-intensive
> map/reduce tasks take a lot of disk space.
> 
> Hairong
> 
> -----Original Message-----
> From: Ted Dunning [mailto:tdunning@veoh.com]
> Sent: Tuesday, January 08, 2008 1:13 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: Limit the space used by hadoop on a slave node
> 
> 
> I think I have seen related bad behavior on 15.1.
> 
> On 1/8/08 11:49 AM, "Hairong Kuang" <ha...@yahoo-inc.com> wrote:
> 
>> Has anybody tried 15.0? Please check
>> https://issues.apache.org/jira/browse/HADOOP-1463.
>> 
>> Hairong
>> -----Original Message-----
>> From: Joydeep Sen Sarma [mailto:jssarma@facebook.com]
>> Sent: Tuesday, January 08, 2008 11:33 AM
>> To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
>> Subject: RE: Limit the space used by hadoop on a slave node
>> 
>> at least up until 14.4, these options are broken. see
>> https://issues.apache.org/jira/browse/HADOOP-2549
>> 
>> (there's a trivial patch - but i am still testing).
>> 
>> 
> 


RE: Limit the space used by hadoop on a slave node

Posted by Hairong Kuang <ha...@yahoo-inc.com>.
Most of the time dfs and map/reduce share disks. Keep in mind that the du
options cannot control how much space map/reduce tasks take.
Sometimes we get out-of-disk-space problems because data-intensive
map/reduce tasks take a lot of disk space.

Hairong
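
Since the du settings only constrain DFS block storage, intermediate
map/reduce output still needs watching separately. A minimal sketch of
such a check (the mapred.local.dir path below is an example, not a value
from this thread):

```python
import os

def dir_usage_bytes(root):
    # Sum file sizes under a directory tree, similar to `du -sb` --
    # useful for watching mapred.local.dir, which the dfs.datanode.du.*
    # settings do not limit.
    total = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total

# e.g. dir_usage_bytes("/tmp/hadoop/mapred/local")
```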

-----Original Message-----
From: Ted Dunning [mailto:tdunning@veoh.com] 
Sent: Tuesday, January 08, 2008 1:13 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Limit the space used by hadoop on a slave node


I think I have seen related bad behavior on 15.1.

On 1/8/08 11:49 AM, "Hairong Kuang" <ha...@yahoo-inc.com> wrote:

> Has anybody tried 15.0? Please check
> https://issues.apache.org/jira/browse/HADOOP-1463.
> 
> Hairong
> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:jssarma@facebook.com]
> Sent: Tuesday, January 08, 2008 11:33 AM
> To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
> Subject: RE: Limit the space used by hadoop on a slave node
> 
> at least up until 14.4, these options are broken. see
> https://issues.apache.org/jira/browse/HADOOP-2549
> 
> (there's a trivial patch - but i am still testing).
> 
> 


Re: Limit the space used by hadoop on a slave node

Posted by Ted Dunning <td...@veoh.com>.
I think I have seen related bad behavior on 15.1.

On 1/8/08 11:49 AM, "Hairong Kuang" <ha...@yahoo-inc.com> wrote:

> Has anybody tried 15.0? Please check
> https://issues.apache.org/jira/browse/HADOOP-1463.
> 
> Hairong
> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:jssarma@facebook.com]
> Sent: Tuesday, January 08, 2008 11:33 AM
> To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
> Subject: RE: Limit the space used by hadoop on a slave node
> 
> at least up until 14.4, these options are broken. see
> https://issues.apache.org/jira/browse/HADOOP-2549
> 
> (there's a trivial patch - but i am still testing).
> 
> 


RE: Limit the space used by hadoop on a slave node

Posted by Hairong Kuang <ha...@yahoo-inc.com>.
Has anybody tried 15.0? Please check https://issues.apache.org/jira/browse/HADOOP-1463.

Hairong
-----Original Message-----
From: Joydeep Sen Sarma [mailto:jssarma@facebook.com] 
Sent: Tuesday, January 08, 2008 11:33 AM
To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
Subject: RE: Limit the space used by hadoop on a slave node

At least up until 14.4, these options are broken; see https://issues.apache.org/jira/browse/HADOOP-2549

(There's a trivial patch, but I am still testing.)


-----Original Message-----
From: Khalil Honsali [mailto:k.honsali@gmail.com]
Sent: Tue 1/8/2008 11:21 AM
To: hadoop-user@lucene.apache.org
Subject: Re: Limit the space used by hadoop on a slave node
 
I haven't tried yet, but I've seen this:
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
  </description>
</property>
or
<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.98f</value>
  <description>When calculating remaining space, only use this percentage of the real available space
  </description>
</property>


In:
conf/hadoop-site.xml


On 09/01/2008, S. Nunes <sn...@gmail.com> wrote:
>
> Hi,
>
> I'm trying to install hadoop on a set of computers that are not 
> exclusively dedicated to run hadoop.
> Our goal is to use these computers in the hadoop cluster when they are 
> inactive (during night).
>
> I would like to know if it is possible to limit the space used by 
> hadoop at a slave node.
> Something like "hadoop.tmp.dir.max". I do not want hadoop to use all 
> the available disk space.
>
> Thanks in advance for any help on this issue,
>
> --
> Sérgio Nunes
>





RE: Limit the space used by hadoop on a slave node

Posted by Joydeep Sen Sarma <js...@facebook.com>.
At least up until 14.4, these options are broken; see https://issues.apache.org/jira/browse/HADOOP-2549

(There's a trivial patch, but I am still testing.)


-----Original Message-----
From: Khalil Honsali [mailto:k.honsali@gmail.com]
Sent: Tue 1/8/2008 11:21 AM
To: hadoop-user@lucene.apache.org
Subject: Re: Limit the space used by hadoop on a slave node
 
I haven't tried yet, but I've seen this:
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much
space free for non dfs use.
  </description>
</property>
or
<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.98f</value>
  <description>When calculating remaining space, only use this percentage of
the real available space
  </description>
</property>


In:
conf/hadoop-site.xml


On 09/01/2008, S. Nunes <sn...@gmail.com> wrote:
>
> Hi,
>
> I'm trying to install hadoop on a set of computers that are not
> exclusively dedicated to run hadoop.
> Our goal is to use these computers in the hadoop cluster when they are
> inactive (during night).
>
> I would like to know if it is possible to limit the space used by
> hadoop at a slave node.
> Something like "hadoop.tmp.dir.max". I do not want hadoop to use all
> the available disk space.
>
> Thanks in advance for any help on this issue,
>
> --
> Sérgio Nunes
>





Re: Limit the space used by hadoop on a slave node

Posted by Khalil Honsali <k....@gmail.com>.
I haven't tried yet, but I've seen this:
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much
space free for non dfs use.
  </description>
</property>
or
<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.98f</value>
  <description>When calculating remaining space, only use this percentage of
the real available space
  </description>
</property>


In:
conf/hadoop-site.xml


On 09/01/2008, S. Nunes <sn...@gmail.com> wrote:
>
> Hi,
>
> I'm trying to install hadoop on a set of computers that are not
> exclusively dedicated to run hadoop.
> Our goal is to use these computers in the hadoop cluster when they are
> inactive (during night).
>
> I would like to know if it is possible to limit the space used by
> hadoop at a slave node.
> Something like "hadoop.tmp.dir.max". I do not want hadoop to use all
> the available disk space.
>
> Thanks in advance for any help on this issue,
>
> --
> Sérgio Nunes
>


