Posted to common-user@hadoop.apache.org by Vijay Murthi <mu...@yahoo-inc.com> on 2006/05/25 19:16:11 UTC

Out of memory after Map tasks

I am trying to understand what happens in the interval between the map tasks finishing and the reduce tasks starting to execute. I have 2 machines, each with 4 processors and 4GB of RAM, using NFS (not DFS), to process 50GB of data. The map tasks complete successfully. After that I see the following in the tasktracker log.

"Exception in thread "Server handler 1 on 50040" java.lang.OutOfMemoryError: Java heap space"


Listed below are the configuration parameters. Am I setting the Java heap very low compared to io.sort.mb or the file buffer size? I thought the tasktracker just pushes the job to the child process; does this happen because of something like moving data? If so, is there a buffer size I can set as a limit? Also, I noticed that under each mapred.local.dir the directories for the reduce files keep growing even after the tasktracker hits the "out of memory" error.

Any feedback would be appreciated.

Thanks,
VJ



-------------------------------------------------------------------
  <name>io.sort.factor</name>
  <value>10</value>

  <name>io.sort.mb</name>
  <value>500</value>

  <name>io.skip.checksum.errors</name>
  <value>false</value>

  <name>io.file.buffer.size</name>
  <value>4096000</value>


  <name>mapred.reduce.tasks</name>
  <value>6</value>

  <name>mapred.task.timeout</name>
  <value>100000000000</value>

  <name>mapred.tasktracker.tasks.maximum</name>
  <value>3</value>

  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>

  <name>mapred.combine.buffer.size</name>
  <value>100000</value>

  <name>mapred.speculative.execution</name>
  <value>true</value>

  <name>ipc.client.timeout</name>
  <value>60000</value>

------------------------------------------------------------
# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=1024
------------------------------------------------------------

Re: Out of memory after Map tasks

Posted by Doug Cutting <cu...@apache.org>.
Vijay Murthi wrote:
> I am trying to understand what happens in the interval between the map tasks finishing and the reduce tasks starting to execute. I have 2 machines, each with 4 processors and 4GB of RAM, using NFS (not DFS), to process 50GB of data. The map tasks complete successfully. After that I see the following in the tasktracker log.
> 
> "Exception in thread "Server handler 1 on 50040" java.lang.OutOfMemoryError: Java heap space"

Are you running the current trunk?  My guess is that you are.  If so, 
then this error is "normal"; things should keep running.

Are you running a 64-bit kernel?  If not, can it really take advantage 
of all 4GB?  In my experience, 32-bit JVMs can't effectively use more 
than around 1.5GB, and a 32-bit kernel can't effectively use all 4GB, 
but I may be wrong on that last count.

> Listed below are the configuration parameters. Am I setting the Java heap very low compared to io.sort.mb or the file buffer size? I thought the tasktracker just pushes the job to the child process; does this happen because of something like moving data? If so, is there a buffer size I can set as a limit? Also, I noticed that under each mapred.local.dir the directories for the reduce files keep growing even after the tasktracker hits the "out of memory" error.

Sorting does indeed happen in the child process.

4MB buffers for file streams seem large to me.
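
The default shipped in hadoop-default.xml is 4096 bytes, as I recall. 
Something more modest, say 64k, might be a reasonable middle ground 
(my suggestion, in the same hadoop-site.xml style as your listing):

  <name>io.file.buffer.size</name>
  <value>65536</value>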

You might increase the io.sort.factor.  With 500MB for sorting and a 
sort factor of 100, each sort stream would get a 5MB buffer, plenty to 
ensure that transfer time dominates seek, since the break-even point is 
around 100kB.  So you could even use a sort factor of 500.  That would 
make sorts a lot faster.
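
To spell out the arithmetic: 500MB / 100 streams = 5MB per stream, and 
even 500MB / 500 streams = 1MB per stream, still roughly ten times the 
~100kB break-even point.  The change would just be:

  <name>io.sort.factor</name>
  <value>500</value>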

Also, why are you setting the task timeout so high?  Do you have mappers 
or reducers that take a long time per entry and are not calling 
Reporter.setStatus() regularly?  That can cause tasks to time out.
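
If so, the fix is usually a periodic status call from the map loop.  A 
rough sketch against the mapred API (the class name and the once-per-1000 
cadence are made up for illustration):

------------------------------------------------------------
import java.io.IOException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical mapper that does slow per-record work but still tells
// the framework it is alive, so the task is not killed for inactivity.
public class SlowMapper extends MapReduceBase implements Mapper {
  private long records = 0;

  public void map(WritableComparable key, Writable value,
                  OutputCollector output, Reporter reporter)
      throws IOException {
    // ... expensive per-record processing here ...
    output.collect(key, value);
    if (++records % 1000 == 0) {
      // Reporting status resets the task's timeout clock, so
      // mapred.task.timeout can stay at its default.
      reporter.setStatus("processed " + records + " records");
    }
  }
}
------------------------------------------------------------

With something like that in place you could drop mapred.task.timeout 
back to its default.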

Doug
