Posted to common-user@hadoop.apache.org by bhupesh bansal <bb...@gmail.com> on 2008/04/11 19:40:30 UTC

Mapper OutOfMemoryError Revisited !!

Hi Guys, 

I need to restart discussion around 
http://www.nabble.com/Mapper-Out-of-Memory-td14200563.html

I saw the same OOM error in my map-reduce job in the map phase.

1. I tried changing mapred.child.java.opts (bumped it to 600M).
2. io.sort.mb was kept at 100MB.
I still see the same errors. (A sketch of how these settings are applied is below.)
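
A rough sketch of how those two settings are applied (the same properties can also go into hadoop-site.xml; this snippet is illustrative, not my actual driver code):

    import org.apache.hadoop.mapred.JobConf;

    // Illustrative only: apply the two settings mentioned above on the JobConf.
    JobConf conf = new JobConf();
    conf.set("mapred.child.java.opts", "-Xmx600m");  // heap for each map/reduce child JVM
    conf.setInt("io.sort.mb", 100);                  // in-memory sort buffer size, in MB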

I checked, with a debugger, the size of "keyValBuffer" in collect(); it always stays
below io.sort.mb and is spilled to disk properly.

I tried setting the number of map tasks to a very high value so that the input
is split into smaller chunks. It helped for a while, as the map phase got a
bit further (56% instead of 5%), but I still see the problem.
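
The change was along these lines (illustrative value; as far as I know setNumMapTasks() is only a hint and the real number of maps still follows the input splits):

    // Illustrative: ask for many more map tasks so each split is smaller.
    // setNumMapTasks() is only a hint; the InputFormat's splits decide the
    // actual number of maps.
    conf.setNumMapTasks(1000);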

I tried bumping mapred.child.java.opts to 1000M and still got the same error.

I also tried adding -verbose:gc -Xloggc:/tmp/@taskid@.gc to the child opts to
get a GC log, but didn't get any log. Any idea why?
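
The flags were passed via the child opts, roughly like this (@taskid@ is supposed to be expanded by the TaskTracker to the actual task attempt id):

    // Illustrative: GC logging for each child JVM; if the substitution works,
    // every task writes /tmp/<taskid>.gc on the node that ran it, so the log
    // would be in /tmp on the TaskTracker machine, not on the submitting box.
    conf.set("mapred.child.java.opts",
             "-Xmx600m -verbose:gc -Xloggc:/tmp/@taskid@.gc");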

I tried using 'jmap -histo pid' to look at the heap; it didn't point me to any
meaningful or obvious problem.


What are the other possible memory hogs during the map phase? Is the input
file chunk kept fully in memory?


task_200804110926_0004_m_000239_0: java.lang.OutOfMemoryError: Java heap space
task_200804110926_0004_m_000239_0:      at java.util.Arrays.copyOf(Arrays.java:2786)
task_200804110926_0004_m_000239_0:      at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
task_200804110926_0004_m_000239_0:      at java.io.DataOutputStream.write(DataOutputStream.java:90)
task_200804110926_0004_m_000239_0:      at java.io.DataOutputStream.writeUTF(DataOutputStream.java:384)
task_200804110926_0004_m_000239_0:      at java.io.DataOutputStream.writeUTF(DataOutputStream.java:306)
task_200804110926_0004_m_000239_0:      at com.linkedin.Hadoop.DataObjects.SearchTrackingJoinValue.write(SearchTrackingJoinValue.java:117)
task_200804110926_0004_m_000239_0:      at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:350)
task_200804110926_0004_m_000239_0:      at com.linkedin.Hadoop.Mapper.SearchClickJoinMapper.readSearchJoinResultsObject(SearchClickJoinMapper.java:131)
task_200804110926_0004_m_000239_0:      at com.linkedin.Hadoop.Mapper.SearchClickJoinMapper.map(SearchClickJoinMapper.java:54)
task_200804110926_0004_m_000239_0:      at com.linkedin.Hadoop.Mapper.SearchClickJoinMapper.map(SearchClickJoinMapper.java:31)
task_200804110926_0004_m_000239_0:      at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
task_200804110926_0004_m_000239_0:      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
task_200804110926_0004_m_000239_0:      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1804)
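
For anyone reading the trace: the OOM is thrown while the value object serializes itself into the in-memory key/value buffer inside collect(). A made-up sketch of the kind of Writable involved (field names invented, this is not the real SearchTrackingJoinValue):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Hypothetical stand-in for the value class in the stack trace.
    public class TrackingJoinValueSketch implements Writable {
        private String query;     // invented field
        private long timestamp;   // invented field

        public void write(DataOutput out) throws IOException {
            // writeUTF() copies the encoded string into the map output buffer
            // (the ByteArrayOutputStream visible in the trace); large values
            // make that buffer grow via Arrays.copyOf, which is exactly where
            // the allocation fails above.
            out.writeUTF(query);
            out.writeLong(timestamp);
        }

        public void readFields(DataInput in) throws IOException {
            query = in.readUTF();
            timestamp = in.readLong();
        }
    }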


-- 
View this message in context: http://www.nabble.com/Mapper-OutOfMemoryError-Revisited-%21%21-tp16628173p16628173.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.