Posted to common-user@hadoop.apache.org by Philip <mr...@gmail.com> on 2008/12/17 19:44:51 UTC

OOME only with large datasets

I've been trying to troubleshoot an OOME we've been having.

When we run the job over a dataset that is about 700GB (~9000 files) or
larger, we get an OOME on the map jobs.  However, if we run the job over a
smaller set of the data, everything works out fine.  So my question is: what
changes in Hadoop as the size of the input set increases?

We are on hadoop 0.18.0.

Here is a stack trace produced by the job tracker.
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2882)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
        at java.lang.StringBuffer.append(StringBuffer.java:224)
        at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(DeferredDocumentImpl.java:1167)
        at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeValueString(DeferredDocumentImpl.java:1120)
        at com.sun.org.apache.xerces.internal.dom.DeferredTextImpl.synchronizeData(DeferredTextImpl.java:93)
        at com.sun.org.apache.xerces.internal.dom.CharacterDataImpl.getData(CharacterDataImpl.java:160)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:928)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:851)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:819)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:278)
        at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:446)
        at org.apache.hadoop.mapred.JobConf.getKeepFailedTaskFiles(JobConf.java:308)
        at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.setJobConf(TaskTracker.java:1506)
        at org.apache.hadoop.mapred.TaskTracker.launchTaskForJob(TaskTracker.java:727)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:721)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1306)
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:946)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1343)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2354)


Thanks,
Philip.

Re: OOME only with large datasets

Posted by Karl Anderson <kr...@monkey.org>.
On 17-Dec-08, at 10:44 AM, Philip wrote:

> I've been trying to troubleshoot an OOME we've been having.
>
> When we run the job over a dataset that is about 700GB (~9000 files)
> or larger, we get an OOME on the map jobs.  However, if we run the
> job over a smaller set of the data, everything works out fine.  So my
> question is: what changes in Hadoop as the size of the input set
> increases?
>
> We are on hadoop 0.18.0.


I don't have a real answer, but the first thing you should do is try  
using 0.18.1 or 0.17.x - I had some 0.18.0 memory/filehandle problems  
go away when I switched.  0.18.0 was not a stable release, in any case.

Karl Anderson
kra@monkey.org
http://monkey.org/~kra




Re: OOME only with large datasets

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On Dec 17, 2008, at 10:44 AM, Philip wrote:

> I've been trying to troubleshoot an OOME we've been having.
>
> When we run the job over a dataset that is about 700GB (~9000 files)
> or larger, we get an OOME on the map jobs.  However, if we run the
> job over a smaller set of the data, everything works out fine.  So my
> question is: what changes in Hadoop as the size of the input set
> increases?
>
> We are on hadoop 0.18.0.
>

I suspect the reason is that larger datasets result in more maps, and we
seem to have a memory leak in the TaskTracker that grows with the number
of maps run on a given TaskTracker.
I've opened https://issues.apache.org/jira/browse/HADOOP-4906 to track
this.

As a workaround, you could try increasing the heap size for the
TaskTracker via HADOOP_TASKTRACKER_OPTS in conf/hadoop-env.sh.
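
For example, a minimal sketch of that change in conf/hadoop-env.sh (the
2000 MB figure is only an illustration, not a recommendation; size it to
the memory actually free on your tasktracker nodes):

  # conf/hadoop-env.sh
  # Raise the maximum heap for the TaskTracker daemon only.
  # 2000m is an arbitrary example value.
  export HADOOP_TASKTRACKER_OPTS="-Xmx2000m"

The tasktrackers need to be restarted for the new option to take effect.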

Arun