Posted to common-user@hadoop.apache.org by "S. Nunes" <sn...@gmail.com> on 2008/01/08 20:08:04 UTC
Limit the space used by hadoop on a slave node
Hi,
I'm trying to install Hadoop on a set of computers that are not
exclusively dedicated to running Hadoop.
Our goal is to use these computers in the Hadoop cluster when they are
idle (during the night).
I would like to know if it is possible to limit the space used by
Hadoop on a slave node.
Something like "hadoop.tmp.dir.max". I do not want Hadoop to use all
the available disk space.
Thanks in advance for any help on this issue,
--
Sérgio Nunes
Re: Limit the space used by hadoop on a slave node
Posted by Ted Dunning <td...@veoh.com>.
I have both set, but have still had disk-full problems. I can't be sure right now
whether this occurred under 14.4 or 15.1, but I think it was 15.1.
In any case, new file creation from a non-datanode host is definitely not
well balanced and will lead to disk-full conditions if you have dramatically
different-sized partitions available on the different datanodes. Also, if
you have a small and a large partition available on a single node, the small
partition will fill up and cause corruption. I had to go to single
partitions on all nodes to avoid this.
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- 10 GB -->
  <value>10000000000</value>
  <description>Reserved space in bytes. Always leave this much space free
  for non-DFS use.</description>
</property>

<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.9f</value>
  <description>When calculating remaining space, only use this percentage of
  the real available space.</description>
</property>
On 1/8/08 1:30 PM, "Koji Noguchi" <kn...@yahoo-inc.com> wrote:
> We use,
>
> dfs.datanode.du.pct for 0.14 and dfs.datanode.du.reserved for 0.15.
>
> Change was made in the Jira Hairong mentioned.
> https://issues.apache.org/jira/browse/HADOOP-1463
>
> Koji
>
RE: Limit the space used by hadoop on a slave node
Posted by Koji Noguchi <kn...@yahoo-inc.com>.
We use dfs.datanode.du.pct for 0.14 and dfs.datanode.du.reserved for 0.15.
The change was made in the Jira issue Hairong mentioned:
https://issues.apache.org/jira/browse/HADOOP-1463
Koji
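To make the version split above concrete, a hypothetical conf/hadoop-site.xml fragment could set both properties, with only the one matching your release taking effect (the values below are illustrative examples, not recommendations from this thread):

```xml
<!-- Illustrative values only. -->
<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.95f</value> <!-- read by 0.14: use at most 95% of real free space -->
</property>
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>10737418240</value> <!-- read by 0.15: keep 10 GiB free per volume -->
</property>
```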
> -----Original Message-----
> From: Ted Dunning [mailto:tdunning@veoh.com]
> Sent: Tuesday, January 08, 2008 1:13 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: Limit the space used by hadoop on a slave node
>
>
> I think I have seen related bad behavior on 15.1.
Re: Limit the space used by hadoop on a slave node
Posted by Ted Dunning <td...@veoh.com>.
My problem was caused purely by copying files to HDFS using [hadoop dfs
-put]. No map-reduce activity was going on at the time (and all of the jobs
I had around that time were counting jobs with very powerful reduction
in data volumes due to combiner functions).
On 1/8/08 1:32 PM, "Hairong Kuang" <ha...@yahoo-inc.com> wrote:
> Most of the time dfs and map/reduce share disks. Keep in mind that du
> options can not control how much space that map/reduce tasks take.
> Sometimes we get the out of disk space problem because data intensive
> map/reduce tasks take a lot of disk space.
>
> Hairong
RE: Limit the space used by hadoop on a slave node
Posted by Hairong Kuang <ha...@yahoo-inc.com>.
Most of the time DFS and map/reduce share disks. Keep in mind that the du
options cannot control how much space map/reduce tasks take.
Sometimes we get out-of-disk-space problems because data-intensive
map/reduce tasks take a lot of disk space.
Hairong
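One possible workaround for the map/reduce side (a sketch only, not something prescribed in this thread): point the task scratch directories at a dedicated partition, so intermediate data is capped by that partition's size. The mount point below is an assumption:

```xml
<!-- Illustrative: /scratch/hadoop is a hypothetical dedicated partition;
     map/reduce intermediate files cannot grow beyond its size. -->
<property>
  <name>mapred.local.dir</name>
  <value>/scratch/hadoop/mapred/local</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/scratch/hadoop/tmp</value>
</property>
```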
-----Original Message-----
From: Ted Dunning [mailto:tdunning@veoh.com]
Sent: Tuesday, January 08, 2008 1:13 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Limit the space used by hadoop on a slave node
I think I have seen related bad behavior on 15.1.
Re: Limit the space used by hadoop on a slave node
Posted by Ted Dunning <td...@veoh.com>.
I think I have seen related bad behavior on 15.1.
On 1/8/08 11:49 AM, "Hairong Kuang" <ha...@yahoo-inc.com> wrote:
> Has anybody tried 15.0? Please check
> https://issues.apache.org/jira/browse/HADOOP-1463.
>
> Hairong
RE: Limit the space used by hadoop on a slave node
Posted by Hairong Kuang <ha...@yahoo-inc.com>.
Has anybody tried 15.0? Please check https://issues.apache.org/jira/browse/HADOOP-1463.
Hairong
-----Original Message-----
From: Joydeep Sen Sarma [mailto:jssarma@facebook.com]
Sent: Tuesday, January 08, 2008 11:33 AM
To: hadoop-user@lucene.apache.org; hadoop-user@lucene.apache.org
Subject: RE: Limit the space used by hadoop on a slave node
at least up until 14.4, these options are broken. see https://issues.apache.org/jira/browse/HADOOP-2549
(there's a trivial patch - but i am still testing).
RE: Limit the space used by hadoop on a slave node
Posted by Joydeep Sen Sarma <js...@facebook.com>.
At least up until 14.4, these options are broken; see https://issues.apache.org/jira/browse/HADOOP-2549
(There's a trivial patch, but I am still testing.)
-----Original Message-----
From: Khalil Honsali [mailto:k.honsali@gmail.com]
Sent: Tue 1/8/2008 11:21 AM
To: hadoop-user@lucene.apache.org
Subject: Re: Limit the space used by hadoop on a slave node
I haven't tried yet, but I've seen this:
<property>
<name>dfs.datanode.du.reserved</name>
<value>0</value>
<description>Reserved space in bytes per volume. Always leave this much
space free for non dfs use.
</description>
</property>
or
<property>
<name>dfs.datanode.du.pct</name>
<value>0.98f</value>
<description>When calculating remaining space, only use this percentage of
the real available space
</description>
</property>
In:
conf/hadoop-site.xml
Re: Limit the space used by hadoop on a slave node
Posted by Khalil Honsali <k....@gmail.com>.
I haven't tried yet, but I've seen this:
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much
  space free for non-DFS use.</description>
</property>

or

<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.98f</value>
  <description>When calculating remaining space, only use this percentage of
  the real available space.</description>
</property>
In:
conf/hadoop-site.xml
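Since a typo in hadoop-site.xml can silently disable these settings, here is a small hypothetical helper (not part of Hadoop; written only for illustration) that fails on malformed XML and pulls out whatever dfs.datanode.du.* values the file sets:

```python
# Hypothetical helper, not part of Hadoop: parse a hadoop-site.xml file
# and return the dfs.datanode.du.* properties it sets.
import xml.dom.minidom

def datanode_du_settings(path):
    doc = xml.dom.minidom.parse(path)  # raises ExpatError on malformed XML
    settings = {}
    for prop in doc.getElementsByTagName("property"):
        # Each <property> is assumed to carry non-empty <name> and <value>.
        name = prop.getElementsByTagName("name")[0].firstChild.data.strip()
        value = prop.getElementsByTagName("value")[0].firstChild.data.strip()
        if name.startswith("dfs.datanode.du."):
            settings[name] = value
    return settings
```

For example, running datanode_du_settings("conf/hadoop-site.xml") against the first snippet above would return just the dfs.datanode.du.reserved entry, ignoring unrelated properties.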
On 09/01/2008, S. Nunes <sn...@gmail.com> wrote:
>
> Hi,
>
> I'm trying to install hadoop on a set of computers that are not
> exclusively dedicated to run hadoop.
> Our goal is to use these computers in the hadoop cluster when they are
> inactive (during night).
>
> I would like to know if it is possible to limit the space used by
> hadoop at a slave node.
> Something like "hadoop.tmp.dir.max". I do not want hadoop to use all
> the available disk space.
>
> Thanks in advance for any help on this issue,
>
> --
> Sérgio Nunes
>