You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-user@hadoop.apache.org by Mark Kerzner <ma...@shmsoft.com> on 2012/05/23 21:16:38 UTC

Memory exception in the mapper

Hi, all,

I got the exception below in the mapper. I already have my global Hadoop
heap at 5 GB, but is there a specific other setting? Or maybe I should
troubleshoot for memory?

But the same application works in the IDE.

Thank you!

Mark

*stderr logs*

Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
Java heap space
	at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
	at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
	at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
	at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
	at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
Exception in thread "communication thread" java.lang.OutOfMemoryError:
Java heap space

Exception: java.lang.OutOfMemoryError thrown from the
UncaughtExceptionHandler in thread "communication thread"

Re: Memory exception in the mapper

Posted by Mark Kerzner <ma...@shmsoft.com>.
Thanks, Joey,

we are in beta, and I kinda need these for debugging. But as soon as we go
to production, your word is well taken. (I hope we will replace the current
primitive logging with good one (log4j is I think preferred with Hadoop),
and then we can change the log level.

Mark

On Wed, May 23, 2012 at 10:39 PM, Joey Krabacher <jk...@gmail.com>wrote:

> No problem, glad I could help.
>
> In our test environment I have lots of output and logging turned on, but as
> soon as it is on production all output and logging is reduced to the bare
> minimum.
> Basically, in production we only log caught exceptions.
>
> I would take it out unless you absolutely need it. IMHO.
> If your jobs are not mission critical and do not need to run as smooth as
> possible then it's not as important to remove those.
>
> /* Joey */
>
> On Wed, May 23, 2012 at 10:21 PM, Mark Kerzner <mark.kerzner@shmsoft.com
> >wrote:
>
> > Joey,
> >
> > that did the trick!
> >
> > Actually, I am writing to the log with System.out.println() statements,
> and
> > I write about 12,000 lines, would that be a problem? I don't really need
> > this output, so if you think it's inadvisable, I will remove that.
> >
> > Also, I hope that if I have not 6,000 maps but 12,000 or even 30,000, it
> > will still work.
> >
> > Well, I will see pretty soon, I guess, with more data.
> >
> > Again, thank you.
> >
> > Sincerely,
> > Mark
> >
> > On Wed, May 23, 2012 at 9:43 PM, Joey Krabacher <jkrabacher@gmail.com
> > >wrote:
> >
> > > Mark,
> > >
> > > Have you tried tweaking the mapred.child.java.opts property in your
> > > mapred-site.xml?
> > >
> > > <property>
> > >    <name>mapred.child.java.opts</name>
> > >    <value>-Xmx2048m</value>
> > >  </property>
> > >
> > > This might help.
> > > It looks like the fatal error came right after the log truncater fired
> > off.
> > > Are you outputting anything to the logs manually, or have you looked at
> > the
> > > user logs to see if there is anything taking up lots of room?
> > >
> > > / * Joey */
> > >
> > >
> > > On Wed, May 23, 2012 at 9:35 PM, Mark Kerzner <
> mark.kerzner@shmsoft.com
> > > >wrote:
> > >
> > > > Joey,
> > > >
> > > > my errors closely resembles this
> > > > one<
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201006.mbox/%3CAANLkTikR3DF4ce-tGIPhv9_-EVFOeD_5-T684NF4y6dd@mail.gmail.com%3E
> > > > >in
> > > > the archives. I can now be much more specific with the errors
> message,
> > > > and it is quoted below. I tried -Xmx3096. But I got the same error.
> > > >
> > > > Thank you,
> > > > Mark
> > > >
> > > >
> > > > syslog logs
> > > > 2012-05-23 20:04:52,349 WARN org.apache.hadoop.util.NativeCodeLoader:
> > > > Unable to load native-hadoop library for your platform... using
> > > > builtin-java classes where applicable
> > > > 2012-05-23 20:04:52,519 INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics:
> > > > Initializing JVM Metrics with processName=MAP, sessionId=
> > > > 2012-05-23 20:04:52,695 INFO org.apache.hadoop.util.ProcessTree:
> setsid
> > > > exited with exit code 0
> > > > 2012-05-23 20:04:52,699 INFO org.apache.hadoop.mapred.Task:  Using
> > > > ResourceCalculatorPlugin :
> > > > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@d56b37
> > > > 2012-05-23 20:04:52,813 INFO org.apache.hadoop.mapred.MapTask:
> > > io.sort.mb =
> > > > 100
> > > > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: data
> > > buffer
> > > > = 79691776/99614720
> > > > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: record
> > > > buffer = 262144/327680
> > > > 2012-05-23 20:04:53,010 WARN
> > > > org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native
> library
> > > not
> > > > loaded
> > > > 2012-05-23 20:12:29,120 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 0; bufend = 79542629; bufvoid = 99614720
> > > > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > 0;
> > > > kvend = 228; length = 327680
> > > > 2012-05-23 20:12:31,248 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 0
> > > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 79542629; bufend = 53863940; bufvoid = 99614720
> > > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 228; kvend = 431; length = 327680
> > > > 2012-05-23 20:13:03,294 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 1
> > > > 2012-05-23 20:13:48,121 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 53863940; bufend = 31696780; bufvoid = 99614720
> > > > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 431; kvend = 861; length = 327680
> > > > 2012-05-23 20:13:49,818 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 2
> > > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 31696780; bufend = 10267329; bufvoid = 99614720
> > > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 861; kvend = 1462; length = 327680
> > > > 2012-05-23 20:15:27,068 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 3
> > > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 10267329; bufend = 85241086; bufvoid = 99614720
> > > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 1462; kvend = 1642; length = 327680
> > > > 2012-05-23 20:15:54,760 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 4
> > > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 85241086; bufend = 51305930; bufvoid = 99614720
> > > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 1642; kvend = 1946; length = 327680
> > > > 2012-05-23 20:16:27,566 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 5
> > > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 51305930; bufend = 31353466; bufvoid = 99614720
> > > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 1946; kvend = 2263; length = 327680
> > > > 2012-05-23 20:16:58,076 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 6
> > > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 31353466; bufend = 10945750; bufvoid = 99614720
> > > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 2263; kvend = 2755; length = 327680
> > > > 2012-05-23 20:17:53,939 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 7
> > > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 10945750; bufend = 81838103; bufvoid = 99614720
> > > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 2755; kvend = 2967; length = 327680
> > > > 2012-05-23 20:18:21,145 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 8
> > > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 81838103; bufend = 61751422; bufvoid = 99614720
> > > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 2967; kvend = 3202; length = 327680
> > > > 2012-05-23 20:18:53,485 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 9
> > > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 61751422; bufend = 38440188; bufvoid = 99614720
> > > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 3202; kvend = 3628; length = 327680
> > > > 2012-05-23 20:19:44,346 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 10
> > > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 38440188; bufend = 18376423; bufvoid = 99614720
> > > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 3628; kvend = 4252; length = 327680
> > > > 2012-05-23 20:20:48,767 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 11
> > > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 18376423; bufend = 97890160; bufvoid = 99614720
> > > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 4252; kvend = 4488; length = 327680
> > > > 2012-05-23 20:21:20,176 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 12
> > > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 97890160; bufend = 77234938; bufvoid = 99614720
> > > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 4488; kvend = 4771; length = 327680
> > > > 2012-05-23 20:21:48,383 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 13
> > > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask:
> Spilling
> > > map
> > > > output: buffer full= true
> > > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask:
> > bufstart =
> > > > 77234938; bufend = 52962105; bufvoid = 99614720
> > > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask:
> kvstart
> > =
> > > > 4771; kvend = 5203; length = 327680
> > > > 2012-05-23 20:22:21,641 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 14
> > > > 2012-05-23 20:22:46,559 INFO org.apache.hadoop.mapred.MapTask:
> Starting
> > > > flush of map output
> > > > 2012-05-23 20:22:46,615 INFO org.apache.hadoop.mapred.MapTask:
> Finished
> > > > spill 15
> > > > 2012-05-23 20:22:47,606 INFO org.apache.hadoop.mapred.Merger: Merging
> > 16
> > > > sorted segments
> > > > 2012-05-23 20:22:47,696 INFO org.apache.hadoop.mapred.Merger:
> Merging 7
> > > > intermediate segments out of a total of 16
> > > > 2012-05-23 20:22:58,083 INFO
> > org.apache.hadoop.mapred.TaskLogsTruncater:
> > > > Initializing logs' truncater with mapRetainSize=-1 and
> > > reduceRetainSize=-1
> > > > 2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error
> > > running
> > > > child : java.lang.OutOfMemoryError: Java heap space
> > > >    at
> > org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)
> > > >    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:417)
> > > >    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
> > > >    at
> > > >
> > > >
> > >
> >
> org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
> > > >    at
> org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
> > > >    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
> > > >    at
> org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:499)
> > > >    at
> org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
> > > >    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
> > > >    at
> > > >
> > > >
> > >
> >
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
> > > >    at
> > > >
> > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
> > > >    at
> > > >
> > >
> >
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
> > > >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
> > > >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> > > >    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> > > >    at java.security.AccessController.doPrivileged(Native Method)
> > > >    at javax.security.auth.Subject.doAs(Subject.java:396)
> > > >    at
> > > >
> > > >
> > >
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
> > > >    at org.apache.hadoop.mapred.Child.main(Child.java:264)
> > > >
> > > > On Wed, May 23, 2012 at 2:47 PM, Joey Krabacher <
> jkrabacher@gmail.com
> > > > >wrote:
> > > >
> > > > > My experience with this sort of problem tells me one of two things
> > and
> > > > > maybe both:
> > > > >
> > > > > 1. there are some optimizations to the code that can be made
> > (variable
> > > > > re-creation inside of loops, etc.)
> > > > > 2. something has gone horribly wrong with the logic in the mapper.
> > > > >
> > > > > To troubleshoot I would output some log entries at specific points
> in
> > > the
> > > > > mapper (be careful not to log every execution of the mapper because
> > > this
> > > > > could cause major issues with the disk filling up and that sort of
> > > > thing.)
> > > > >
> > > > > Hope that helps.
> > > > >
> > > > > /* Joey */
> > > > >
> > > > > On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner <
> > > mark.kerzner@shmsoft.com
> > > > > >wrote:
> > > > >
> > > > > > Hi, all,
> > > > > >
> > > > > > I got the exception below in the mapper. I already have my global
> > > > Hadoop
> > > > > > heap at 5 GB, but is there a specific other setting? Or maybe I
> > > should
> > > > > > troubleshoot for memory?
> > > > > >
> > > > > > But the same application works in the IDE.
> > > > > >
> > > > > > Thank you!
> > > > > >
> > > > > > Mark
> > > > > >
> > > > > > *stderr logs*
> > > > > >
> > > > > > Exception in thread "Thread for syncLogs"
> > java.lang.OutOfMemoryError:
> > > > > > Java heap space
> > > > > >        at
> > > > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> > > > > >        at
> > > > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> > > > > >        at
> > > > > >
> org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> > > > > >        at
> > org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> > > > > >        at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> > > > > > Exception in thread "communication thread"
> > > java.lang.OutOfMemoryError:
> > > > > > Java heap space
> > > > > >
> > > > > > Exception: java.lang.OutOfMemoryError thrown from the
> > > > > > UncaughtExceptionHandler in thread "communication thread"
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Memory exception in the mapper

Posted by Joey Krabacher <jk...@gmail.com>.
No problem, glad I could help.

In our test environment I have lots of output and logging turned on, but as
soon as it is on production all output and logging is reduced to the bare
minimum.
Basically, in production we only log caught exceptions.

I would take it out unless you absolutely need it. IMHO.
If your jobs are not mission critical and do not need to run as smooth as
possible then it's not as important to remove those.

/* Joey */

On Wed, May 23, 2012 at 10:21 PM, Mark Kerzner <ma...@shmsoft.com>wrote:

> Joey,
>
> that did the trick!
>
> Actually, I am writing to the log with System.out.println() statements, and
> I write about 12,000 lines, would that be a problem? I don't really need
> this output, so if you think it's inadvisable, I will remove that.
>
> Also, I hope that if I have not 6,000 maps but 12,000 or even 30,000, it
> will still work.
>
> Well, I will see pretty soon, I guess, with more data.
>
> Again, thank you.
>
> Sincerely,
> Mark
>
> On Wed, May 23, 2012 at 9:43 PM, Joey Krabacher <jkrabacher@gmail.com
> >wrote:
>
> > Mark,
> >
> > Have you tried tweaking the mapred.child.java.opts property in your
> > mapred-site.xml?
> >
> > <property>
> >    <name>mapred.child.java.opts</name>
> >    <value>-Xmx2048m</value>
> >  </property>
> >
> > This might help.
> > It looks like the fatal error came right after the log truncater fired
> off.
> > Are you outputting anything to the logs manually, or have you looked at
> the
> > user logs to see if there is anything taking up lots of room?
> >
> > / * Joey */
> >
> >
> > On Wed, May 23, 2012 at 9:35 PM, Mark Kerzner <mark.kerzner@shmsoft.com
> > >wrote:
> >
> > > Joey,
> > >
> > > my errors closely resembles this
> > > one<
> > >
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201006.mbox/%3CAANLkTikR3DF4ce-tGIPhv9_-EVFOeD_5-T684NF4y6dd@mail.gmail.com%3E
> > > >in
> > > the archives. I can now be much more specific with the errors message,
> > > and it is quoted below. I tried -Xmx3096. But I got the same error.
> > >
> > > Thank you,
> > > Mark
> > >
> > >
> > > syslog logs
> > > 2012-05-23 20:04:52,349 WARN org.apache.hadoop.util.NativeCodeLoader:
> > > Unable to load native-hadoop library for your platform... using
> > > builtin-java classes where applicable
> > > 2012-05-23 20:04:52,519 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > > Initializing JVM Metrics with processName=MAP, sessionId=
> > > 2012-05-23 20:04:52,695 INFO org.apache.hadoop.util.ProcessTree: setsid
> > > exited with exit code 0
> > > 2012-05-23 20:04:52,699 INFO org.apache.hadoop.mapred.Task:  Using
> > > ResourceCalculatorPlugin :
> > > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@d56b37
> > > 2012-05-23 20:04:52,813 INFO org.apache.hadoop.mapred.MapTask:
> > io.sort.mb =
> > > 100
> > > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: data
> > buffer
> > > = 79691776/99614720
> > > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: record
> > > buffer = 262144/327680
> > > 2012-05-23 20:04:53,010 WARN
> > > org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
> > not
> > > loaded
> > > 2012-05-23 20:12:29,120 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 0; bufend = 79542629; bufvoid = 99614720
> > > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > 0;
> > > kvend = 228; length = 327680
> > > 2012-05-23 20:12:31,248 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 0
> > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 79542629; bufend = 53863940; bufvoid = 99614720
> > > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 228; kvend = 431; length = 327680
> > > 2012-05-23 20:13:03,294 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 1
> > > 2012-05-23 20:13:48,121 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 53863940; bufend = 31696780; bufvoid = 99614720
> > > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 431; kvend = 861; length = 327680
> > > 2012-05-23 20:13:49,818 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 2
> > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 31696780; bufend = 10267329; bufvoid = 99614720
> > > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 861; kvend = 1462; length = 327680
> > > 2012-05-23 20:15:27,068 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 3
> > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 10267329; bufend = 85241086; bufvoid = 99614720
> > > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 1462; kvend = 1642; length = 327680
> > > 2012-05-23 20:15:54,760 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 4
> > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 85241086; bufend = 51305930; bufvoid = 99614720
> > > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 1642; kvend = 1946; length = 327680
> > > 2012-05-23 20:16:27,566 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 5
> > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 51305930; bufend = 31353466; bufvoid = 99614720
> > > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 1946; kvend = 2263; length = 327680
> > > 2012-05-23 20:16:58,076 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 6
> > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 31353466; bufend = 10945750; bufvoid = 99614720
> > > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 2263; kvend = 2755; length = 327680
> > > 2012-05-23 20:17:53,939 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 7
> > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 10945750; bufend = 81838103; bufvoid = 99614720
> > > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 2755; kvend = 2967; length = 327680
> > > 2012-05-23 20:18:21,145 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 8
> > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 81838103; bufend = 61751422; bufvoid = 99614720
> > > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 2967; kvend = 3202; length = 327680
> > > 2012-05-23 20:18:53,485 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 9
> > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 61751422; bufend = 38440188; bufvoid = 99614720
> > > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 3202; kvend = 3628; length = 327680
> > > 2012-05-23 20:19:44,346 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 10
> > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 38440188; bufend = 18376423; bufvoid = 99614720
> > > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 3628; kvend = 4252; length = 327680
> > > 2012-05-23 20:20:48,767 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 11
> > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 18376423; bufend = 97890160; bufvoid = 99614720
> > > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 4252; kvend = 4488; length = 327680
> > > 2012-05-23 20:21:20,176 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 12
> > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 97890160; bufend = 77234938; bufvoid = 99614720
> > > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 4488; kvend = 4771; length = 327680
> > > 2012-05-23 20:21:48,383 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 13
> > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: Spilling
> > map
> > > output: buffer full= true
> > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask:
> bufstart =
> > > 77234938; bufend = 52962105; bufvoid = 99614720
> > > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: kvstart
> =
> > > 4771; kvend = 5203; length = 327680
> > > 2012-05-23 20:22:21,641 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 14
> > > 2012-05-23 20:22:46,559 INFO org.apache.hadoop.mapred.MapTask: Starting
> > > flush of map output
> > > 2012-05-23 20:22:46,615 INFO org.apache.hadoop.mapred.MapTask: Finished
> > > spill 15
> > > 2012-05-23 20:22:47,606 INFO org.apache.hadoop.mapred.Merger: Merging
> 16
> > > sorted segments
> > > 2012-05-23 20:22:47,696 INFO org.apache.hadoop.mapred.Merger: Merging 7
> > > intermediate segments out of a total of 16
> > > 2012-05-23 20:22:58,083 INFO
> org.apache.hadoop.mapred.TaskLogsTruncater:
> > > Initializing logs' truncater with mapRetainSize=-1 and
> > reduceRetainSize=-1
> > > 2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error
> > running
> > > child : java.lang.OutOfMemoryError: Java heap space
> > >    at
> org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)
> > >    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:417)
> > >    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
> > >    at
> > >
> > >
> >
> org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
> > >    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
> > >    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
> > >    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:499)
> > >    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
> > >    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
> > >    at
> > >
> > >
> >
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
> > >    at
> > >
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
> > >    at
> > >
> >
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
> > >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
> > >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> > >    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> > >    at java.security.AccessController.doPrivileged(Native Method)
> > >    at javax.security.auth.Subject.doAs(Subject.java:396)
> > >    at
> > >
> > >
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
> > >    at org.apache.hadoop.mapred.Child.main(Child.java:264)
> > >
> > > On Wed, May 23, 2012 at 2:47 PM, Joey Krabacher <jkrabacher@gmail.com
> > > >wrote:
> > >
> > > > My experience with this sort of problem tells me one of two things
> and
> > > > maybe both:
> > > >
> > > > 1. there are some optimizations to the code that can be made
> (variable
> > > > re-creation inside of loops, etc.)
> > > > 2. something has gone horribly wrong with the logic in the mapper.
> > > >
> > > > To troubleshoot I would output some log entries at specific points in
> > the
> > > > mapper (be careful not to log every execution of the mapper because
> > this
> > > > could cause major issues with the disk filling up and that sort of
> > > thing.)
> > > >
> > > > Hope that helps.
> > > >
> > > > /* Joey */
> > > >
> > > > On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner <
> > mark.kerzner@shmsoft.com
> > > > >wrote:
> > > >
> > > > > Hi, all,
> > > > >
> > > > > I got the exception below in the mapper. I already have my global
> > > Hadoop
> > > > > heap at 5 GB, but is there a specific other setting? Or maybe I
> > should
> > > > > troubleshoot for memory?
> > > > >
> > > > > But the same application works in the IDE.
> > > > >
> > > > > Thank you!
> > > > >
> > > > > Mark
> > > > >
> > > > > *stderr logs*
> > > > >
> > > > > Exception in thread "Thread for syncLogs"
> java.lang.OutOfMemoryError:
> > > > > Java heap space
> > > > >        at
> > > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> > > > >        at
> > > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> > > > >        at
> > > > > org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> > > > >        at
> org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> > > > >        at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> > > > > Exception in thread "communication thread"
> > java.lang.OutOfMemoryError:
> > > > > Java heap space
> > > > >
> > > > > Exception: java.lang.OutOfMemoryError thrown from the
> > > > > UncaughtExceptionHandler in thread "communication thread"
> > > > >
> > > >
> > >
> >
>

Re: Memory exception in the mapper

Posted by Mark Kerzner <ma...@shmsoft.com>.
Joey,

that did the trick!

Actually, I am writing to the log with System.out.println() statements, and
I write about 12,000 lines, would that be a problem? I don't really need
this output, so if you think it's inadvisable, I will remove that.

Also, I hope that if I have not 6,000 maps but 12,000 or even 30,000, it
will still work.

Well, I will see pretty soon, I guess, with more data.

Again, thank you.

Sincerely,
Mark

On Wed, May 23, 2012 at 9:43 PM, Joey Krabacher <jk...@gmail.com>wrote:

> Mark,
>
> Have you tried tweaking the mapred.child.java.opts property in your
> mapred-site.xml?
>
> <property>
>    <name>mapred.child.java.opts</name>
>    <value>-Xmx2048m</value>
>  </property>
>
> This might help.
> It looks like the fatal error came right after the log truncater fired off.
> Are you outputting anything to the logs manually, or have you looked at the
> user logs to see if there is anything taking up lots of room?
>
> / * Joey */
>
>
> On Wed, May 23, 2012 at 9:35 PM, Mark Kerzner <mark.kerzner@shmsoft.com
> >wrote:
>
> > Joey,
> >
> > my errors closely resembles this
> > one<
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201006.mbox/%3CAANLkTikR3DF4ce-tGIPhv9_-EVFOeD_5-T684NF4y6dd@mail.gmail.com%3E
> > >in
> > the archives. I can now be much more specific with the errors message,
> > and it is quoted below. I tried -Xmx3096. But I got the same error.
> >
> > Thank you,
> > Mark
> >
> >
> > syslog logs
> > 2012-05-23 20:04:52,349 WARN org.apache.hadoop.util.NativeCodeLoader:
> > Unable to load native-hadoop library for your platform... using
> > builtin-java classes where applicable
> > 2012-05-23 20:04:52,519 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> > Initializing JVM Metrics with processName=MAP, sessionId=
> > 2012-05-23 20:04:52,695 INFO org.apache.hadoop.util.ProcessTree: setsid
> > exited with exit code 0
> > 2012-05-23 20:04:52,699 INFO org.apache.hadoop.mapred.Task:  Using
> > ResourceCalculatorPlugin :
> > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@d56b37
> > 2012-05-23 20:04:52,813 INFO org.apache.hadoop.mapred.MapTask:
> io.sort.mb =
> > 100
> > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: data
> buffer
> > = 79691776/99614720
> > 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: record
> > buffer = 262144/327680
> > 2012-05-23 20:04:53,010 WARN
> > org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
> not
> > loaded
> > 2012-05-23 20:12:29,120 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 0; bufend = 79542629; bufvoid = 99614720
> > 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 0;
> > kvend = 228; length = 327680
> > 2012-05-23 20:12:31,248 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 0
> > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 79542629; bufend = 53863940; bufvoid = 99614720
> > 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 228; kvend = 431; length = 327680
> > 2012-05-23 20:13:03,294 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 1
> > 2012-05-23 20:13:48,121 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 53863940; bufend = 31696780; bufvoid = 99614720
> > 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 431; kvend = 861; length = 327680
> > 2012-05-23 20:13:49,818 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 2
> > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 31696780; bufend = 10267329; bufvoid = 99614720
> > 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 861; kvend = 1462; length = 327680
> > 2012-05-23 20:15:27,068 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 3
> > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 10267329; bufend = 85241086; bufvoid = 99614720
> > 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 1462; kvend = 1642; length = 327680
> > 2012-05-23 20:15:54,760 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 4
> > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 85241086; bufend = 51305930; bufvoid = 99614720
> > 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 1642; kvend = 1946; length = 327680
> > 2012-05-23 20:16:27,566 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 5
> > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 51305930; bufend = 31353466; bufvoid = 99614720
> > 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 1946; kvend = 2263; length = 327680
> > 2012-05-23 20:16:58,076 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 6
> > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 31353466; bufend = 10945750; bufvoid = 99614720
> > 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 2263; kvend = 2755; length = 327680
> > 2012-05-23 20:17:53,939 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 7
> > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 10945750; bufend = 81838103; bufvoid = 99614720
> > 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 2755; kvend = 2967; length = 327680
> > 2012-05-23 20:18:21,145 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 8
> > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 81838103; bufend = 61751422; bufvoid = 99614720
> > 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 2967; kvend = 3202; length = 327680
> > 2012-05-23 20:18:53,485 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 9
> > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 61751422; bufend = 38440188; bufvoid = 99614720
> > 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 3202; kvend = 3628; length = 327680
> > 2012-05-23 20:19:44,346 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 10
> > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 38440188; bufend = 18376423; bufvoid = 99614720
> > 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 3628; kvend = 4252; length = 327680
> > 2012-05-23 20:20:48,767 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 11
> > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 18376423; bufend = 97890160; bufvoid = 99614720
> > 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 4252; kvend = 4488; length = 327680
> > 2012-05-23 20:21:20,176 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 12
> > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 97890160; bufend = 77234938; bufvoid = 99614720
> > 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 4488; kvend = 4771; length = 327680
> > 2012-05-23 20:21:48,383 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 13
> > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: Spilling
> map
> > output: buffer full= true
> > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> > 77234938; bufend = 52962105; bufvoid = 99614720
> > 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> > 4771; kvend = 5203; length = 327680
> > 2012-05-23 20:22:21,641 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 14
> > 2012-05-23 20:22:46,559 INFO org.apache.hadoop.mapred.MapTask: Starting
> > flush of map output
> > 2012-05-23 20:22:46,615 INFO org.apache.hadoop.mapred.MapTask: Finished
> > spill 15
> > 2012-05-23 20:22:47,606 INFO org.apache.hadoop.mapred.Merger: Merging 16
> > sorted segments
> > 2012-05-23 20:22:47,696 INFO org.apache.hadoop.mapred.Merger: Merging 7
> > intermediate segments out of a total of 16
> > 2012-05-23 20:22:58,083 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
> > Initializing logs' truncater with mapRetainSize=-1 and
> reduceRetainSize=-1
> > 2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error
> running
> > child : java.lang.OutOfMemoryError: Java heap space
> >    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)
> >    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:417)
> >    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
> >    at
> >
> >
> org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
> >    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
> >    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
> >    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:499)
> >    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
> >    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
> >    at
> >
> >
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
> >    at
> > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
> >    at
> >
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
> >    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
> >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >    at java.security.AccessController.doPrivileged(Native Method)
> >    at javax.security.auth.Subject.doAs(Subject.java:396)
> >    at
> >
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
> >    at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >
> > On Wed, May 23, 2012 at 2:47 PM, Joey Krabacher <jkrabacher@gmail.com
> > >wrote:
> >
> > > My experience with this sort of problem tells me one of two things and
> > > maybe both:
> > >
> > > 1. there are some optimizations to the code that can be made (variable
> > > re-creation inside of loops, etc.)
> > > 2. something has gone horribly wrong with the logic in the mapper.
> > >
> > > To troubleshoot I would output some log entries at specific points in
> the
> > > mapper (be careful not to log every execution of the mapper because
> this
> > > could cause major issues with the disk filling up and that sort of
> > thing.)
> > >
> > > Hope that helps.
> > >
> > > /* Joey */
> > >
> > > On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner <
> mark.kerzner@shmsoft.com
> > > >wrote:
> > >
> > > > Hi, all,
> > > >
> > > > I got the exception below in the mapper. I already have my global
> > Hadoop
> > > > heap at 5 GB, but is there a specific other setting? Or maybe I
> should
> > > > troubleshoot for memory?
> > > >
> > > > But the same application works in the IDE.
> > > >
> > > > Thank you!
> > > >
> > > > Mark
> > > >
> > > > *stderr logs*
> > > >
> > > > Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> > > > Java heap space
> > > >        at
> > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> > > >        at
> > > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> > > >        at
> > > > org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> > > >        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> > > >        at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> > > > Exception in thread "communication thread"
> java.lang.OutOfMemoryError:
> > > > Java heap space
> > > >
> > > > Exception: java.lang.OutOfMemoryError thrown from the
> > > > UncaughtExceptionHandler in thread "communication thread"
> > > >
> > >
> >
>

Re: Memory exception in the mapper

Posted by Joey Krabacher <jk...@gmail.com>.
Mark,

Have you tried tweaking the mapred.child.java.opts property in your
mapred-site.xml?

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
  </property>

This might help.
It looks like the fatal error came right after the log truncater fired off.
Are you outputting anything to the logs manually, or have you looked at the
user logs to see if there is anything taking up lots of room?

/ * Joey */


On Wed, May 23, 2012 at 9:35 PM, Mark Kerzner <ma...@shmsoft.com>wrote:

> Joey,
>
> my errors closely resembles this
> one<
> http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201006.mbox/%3CAANLkTikR3DF4ce-tGIPhv9_-EVFOeD_5-T684NF4y6dd@mail.gmail.com%3E
> >in
> the archives. I can now be much more specific with the errors message,
> and it is quoted below. I tried -Xmx3096. But I got the same error.
>
> Thank you,
> Mark
>
>
> syslog logs
> 2012-05-23 20:04:52,349 WARN org.apache.hadoop.util.NativeCodeLoader:
> Unable to load native-hadoop library for your platform... using
> builtin-java classes where applicable
> 2012-05-23 20:04:52,519 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=MAP, sessionId=
> 2012-05-23 20:04:52,695 INFO org.apache.hadoop.util.ProcessTree: setsid
> exited with exit code 0
> 2012-05-23 20:04:52,699 INFO org.apache.hadoop.mapred.Task:  Using
> ResourceCalculatorPlugin :
> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@d56b37
> 2012-05-23 20:04:52,813 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb =
> 100
> 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: data buffer
> = 79691776/99614720
> 2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: record
> buffer = 262144/327680
> 2012-05-23 20:04:53,010 WARN
> org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library not
> loaded
> 2012-05-23 20:12:29,120 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 0; bufend = 79542629; bufvoid = 99614720
> 2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: kvstart = 0;
> kvend = 228; length = 327680
> 2012-05-23 20:12:31,248 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 0
> 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 79542629; bufend = 53863940; bufvoid = 99614720
> 2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 228; kvend = 431; length = 327680
> 2012-05-23 20:13:03,294 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 1
> 2012-05-23 20:13:48,121 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 53863940; bufend = 31696780; bufvoid = 99614720
> 2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 431; kvend = 861; length = 327680
> 2012-05-23 20:13:49,818 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 2
> 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 31696780; bufend = 10267329; bufvoid = 99614720
> 2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 861; kvend = 1462; length = 327680
> 2012-05-23 20:15:27,068 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 3
> 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 10267329; bufend = 85241086; bufvoid = 99614720
> 2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 1462; kvend = 1642; length = 327680
> 2012-05-23 20:15:54,760 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 4
> 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 85241086; bufend = 51305930; bufvoid = 99614720
> 2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 1642; kvend = 1946; length = 327680
> 2012-05-23 20:16:27,566 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 5
> 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 51305930; bufend = 31353466; bufvoid = 99614720
> 2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 1946; kvend = 2263; length = 327680
> 2012-05-23 20:16:58,076 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 6
> 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 31353466; bufend = 10945750; bufvoid = 99614720
> 2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 2263; kvend = 2755; length = 327680
> 2012-05-23 20:17:53,939 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 7
> 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 10945750; bufend = 81838103; bufvoid = 99614720
> 2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 2755; kvend = 2967; length = 327680
> 2012-05-23 20:18:21,145 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 8
> 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 81838103; bufend = 61751422; bufvoid = 99614720
> 2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 2967; kvend = 3202; length = 327680
> 2012-05-23 20:18:53,485 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 9
> 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 61751422; bufend = 38440188; bufvoid = 99614720
> 2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 3202; kvend = 3628; length = 327680
> 2012-05-23 20:19:44,346 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 10
> 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 38440188; bufend = 18376423; bufvoid = 99614720
> 2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 3628; kvend = 4252; length = 327680
> 2012-05-23 20:20:48,767 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 11
> 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 18376423; bufend = 97890160; bufvoid = 99614720
> 2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 4252; kvend = 4488; length = 327680
> 2012-05-23 20:21:20,176 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 12
> 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 97890160; bufend = 77234938; bufvoid = 99614720
> 2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 4488; kvend = 4771; length = 327680
> 2012-05-23 20:21:48,383 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 13
> 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: buffer full= true
> 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 77234938; bufend = 52962105; bufvoid = 99614720
> 2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 4771; kvend = 5203; length = 327680
> 2012-05-23 20:22:21,641 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 14
> 2012-05-23 20:22:46,559 INFO org.apache.hadoop.mapred.MapTask: Starting
> flush of map output
> 2012-05-23 20:22:46,615 INFO org.apache.hadoop.mapred.MapTask: Finished
> spill 15
> 2012-05-23 20:22:47,606 INFO org.apache.hadoop.mapred.Merger: Merging 16
> sorted segments
> 2012-05-23 20:22:47,696 INFO org.apache.hadoop.mapred.Merger: Merging 7
> intermediate segments out of a total of 16
> 2012-05-23 20:22:58,083 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error running
> child : java.lang.OutOfMemoryError: Java heap space
>    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)
>    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:417)
>    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
>    at
>
> org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
>    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
>    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:499)
>    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
>    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
>    at
>
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
>    at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
>    at
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
>    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>    at java.security.AccessController.doPrivileged(Native Method)
>    at javax.security.auth.Subject.doAs(Subject.java:396)
>    at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>    at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
> On Wed, May 23, 2012 at 2:47 PM, Joey Krabacher <jkrabacher@gmail.com
> >wrote:
>
> > My experience with this sort of problem tells me one of two things and
> > maybe both:
> >
> > 1. there are some optimizations to the code that can be made (variable
> > re-creation inside of loops, etc.)
> > 2. something has gone horribly wrong with the logic in the mapper.
> >
> > To troubleshoot I would output some log entries at specific points in the
> > mapper (be careful not to log every execution of the mapper because this
> > could cause major issues with the disk filling up and that sort of
> thing.)
> >
> > Hope that helps.
> >
> > /* Joey */
> >
> > On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner <mark.kerzner@shmsoft.com
> > >wrote:
> >
> > > Hi, all,
> > >
> > > I got the exception below in the mapper. I already have my global
> Hadoop
> > > heap at 5 GB, but is there a specific other setting? Or maybe I should
> > > troubleshoot for memory?
> > >
> > > But the same application works in the IDE.
> > >
> > > Thank you!
> > >
> > > Mark
> > >
> > > *stderr logs*
> > >
> > > Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> > > Java heap space
> > >        at
> > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> > >        at
> > java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> > >        at
> > > org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> > >        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> > >        at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> > > Exception in thread "communication thread" java.lang.OutOfMemoryError:
> > > Java heap space
> > >
> > > Exception: java.lang.OutOfMemoryError thrown from the
> > > UncaughtExceptionHandler in thread "communication thread"
> > >
> >
>

Re: Memory exception in the mapper

Posted by Mark Kerzner <ma...@shmsoft.com>.
Joey,

my errors closely resembles this
one<http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201006.mbox/%3CAANLkTikR3DF4ce-tGIPhv9_-EVFOeD_5-T684NF4y6dd@mail.gmail.com%3E>in
the archives. I can now be much more specific with the errors message,
and it is quoted below. I tried -Xmx3096. But I got the same error.

Thank you,
Mark


syslog logs
2012-05-23 20:04:52,349 WARN org.apache.hadoop.util.NativeCodeLoader:
Unable to load native-hadoop library for your platform... using
builtin-java classes where applicable
2012-05-23 20:04:52,519 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=MAP, sessionId=
2012-05-23 20:04:52,695 INFO org.apache.hadoop.util.ProcessTree: setsid
exited with exit code 0
2012-05-23 20:04:52,699 INFO org.apache.hadoop.mapred.Task:  Using
ResourceCalculatorPlugin :
org.apache.hadoop.util.LinuxResourceCalculatorPlugin@d56b37
2012-05-23 20:04:52,813 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb =
100
2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: data buffer
= 79691776/99614720
2012-05-23 20:04:52,998 INFO org.apache.hadoop.mapred.MapTask: record
buffer = 262144/327680
2012-05-23 20:04:53,010 WARN
org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library not
loaded
2012-05-23 20:12:29,120 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: bufstart =
0; bufend = 79542629; bufvoid = 99614720
2012-05-23 20:12:29,134 INFO org.apache.hadoop.mapred.MapTask: kvstart = 0;
kvend = 228; length = 327680
2012-05-23 20:12:31,248 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 0
2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: bufstart =
79542629; bufend = 53863940; bufvoid = 99614720
2012-05-23 20:13:01,862 INFO org.apache.hadoop.mapred.MapTask: kvstart =
228; kvend = 431; length = 327680
2012-05-23 20:13:03,294 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 1
2012-05-23 20:13:48,121 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: bufstart =
53863940; bufend = 31696780; bufvoid = 99614720
2012-05-23 20:13:48,122 INFO org.apache.hadoop.mapred.MapTask: kvstart =
431; kvend = 861; length = 327680
2012-05-23 20:13:49,818 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 2
2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: bufstart =
31696780; bufend = 10267329; bufvoid = 99614720
2012-05-23 20:15:25,618 INFO org.apache.hadoop.mapred.MapTask: kvstart =
861; kvend = 1462; length = 327680
2012-05-23 20:15:27,068 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 3
2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: bufstart =
10267329; bufend = 85241086; bufvoid = 99614720
2012-05-23 20:15:53,519 INFO org.apache.hadoop.mapred.MapTask: kvstart =
1462; kvend = 1642; length = 327680
2012-05-23 20:15:54,760 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 4
2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: bufstart =
85241086; bufend = 51305930; bufvoid = 99614720
2012-05-23 20:16:26,284 INFO org.apache.hadoop.mapred.MapTask: kvstart =
1642; kvend = 1946; length = 327680
2012-05-23 20:16:27,566 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 5
2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: bufstart =
51305930; bufend = 31353466; bufvoid = 99614720
2012-05-23 20:16:57,046 INFO org.apache.hadoop.mapred.MapTask: kvstart =
1946; kvend = 2263; length = 327680
2012-05-23 20:16:58,076 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 6
2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: bufstart =
31353466; bufend = 10945750; bufvoid = 99614720
2012-05-23 20:17:52,820 INFO org.apache.hadoop.mapred.MapTask: kvstart =
2263; kvend = 2755; length = 327680
2012-05-23 20:17:53,939 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 7
2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: bufstart =
10945750; bufend = 81838103; bufvoid = 99614720
2012-05-23 20:18:19,528 INFO org.apache.hadoop.mapred.MapTask: kvstart =
2755; kvend = 2967; length = 327680
2012-05-23 20:18:21,145 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 8
2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: bufstart =
81838103; bufend = 61751422; bufvoid = 99614720
2012-05-23 20:18:52,353 INFO org.apache.hadoop.mapred.MapTask: kvstart =
2967; kvend = 3202; length = 327680
2012-05-23 20:18:53,485 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 9
2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: bufstart =
61751422; bufend = 38440188; bufvoid = 99614720
2012-05-23 20:19:42,993 INFO org.apache.hadoop.mapred.MapTask: kvstart =
3202; kvend = 3628; length = 327680
2012-05-23 20:19:44,346 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 10
2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: bufstart =
38440188; bufend = 18376423; bufvoid = 99614720
2012-05-23 20:20:47,957 INFO org.apache.hadoop.mapred.MapTask: kvstart =
3628; kvend = 4252; length = 327680
2012-05-23 20:20:48,767 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 11
2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: bufstart =
18376423; bufend = 97890160; bufvoid = 99614720
2012-05-23 20:21:19,207 INFO org.apache.hadoop.mapred.MapTask: kvstart =
4252; kvend = 4488; length = 327680
2012-05-23 20:21:20,176 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 12
2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: bufstart =
97890160; bufend = 77234938; bufvoid = 99614720
2012-05-23 20:21:47,395 INFO org.apache.hadoop.mapred.MapTask: kvstart =
4488; kvend = 4771; length = 327680
2012-05-23 20:21:48,383 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 13
2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: Spilling map
output: buffer full= true
2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: bufstart =
77234938; bufend = 52962105; bufvoid = 99614720
2012-05-23 20:22:21,086 INFO org.apache.hadoop.mapred.MapTask: kvstart =
4771; kvend = 5203; length = 327680
2012-05-23 20:22:21,641 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 14
2012-05-23 20:22:46,559 INFO org.apache.hadoop.mapred.MapTask: Starting
flush of map output
2012-05-23 20:22:46,615 INFO org.apache.hadoop.mapred.MapTask: Finished
spill 15
2012-05-23 20:22:47,606 INFO org.apache.hadoop.mapred.Merger: Merging 16
sorted segments
2012-05-23 20:22:47,696 INFO org.apache.hadoop.mapred.Merger: Merging 7
intermediate segments out of a total of 16
2012-05-23 20:22:58,083 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error running
child : java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)
    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:417)
    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
    at
org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:499)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
    at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
    at
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1548)
    at
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1180)
    at
org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)

On Wed, May 23, 2012 at 2:47 PM, Joey Krabacher <jk...@gmail.com>wrote:

> My experience with this sort of problem tells me one of two things and
> maybe both:
>
> 1. there are some optimizations to the code that can be made (variable
> re-creation inside of loops, etc.)
> 2. something has gone horribly wrong with the logic in the mapper.
>
> To troubleshoot I would output some log entries at specific points in the
> mapper (be careful not to log every execution of the mapper because this
> could cause major issues with the disk filling up and that sort of thing.)
>
> Hope that helps.
>
> /* Joey */
>
> On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner <mark.kerzner@shmsoft.com
> >wrote:
>
> > Hi, all,
> >
> > I got the exception below in the mapper. I already have my global Hadoop
> > heap at 5 GB, but is there a specific other setting? Or maybe I should
> > troubleshoot for memory?
> >
> > But the same application works in the IDE.
> >
> > Thank you!
> >
> > Mark
> >
> > *stderr logs*
> >
> > Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> > Java heap space
> >        at
> java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> >        at
> java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> >        at
> > org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> >        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> >        at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> > Exception in thread "communication thread" java.lang.OutOfMemoryError:
> > Java heap space
> >
> > Exception: java.lang.OutOfMemoryError thrown from the
> > UncaughtExceptionHandler in thread "communication thread"
> >
>

Re: Memory exception in the mapper

Posted by Joey Krabacher <jk...@gmail.com>.
My experience with this sort of problem tells me one of two things and
maybe both:

1. there are some optimizations to the code that can be made (variable
re-creation inside of loops, etc.)
2. something has gone horribly wrong with the logic in the mapper.

To troubleshoot I would output some log entries at specific points in the
mapper (be careful not to log every execution of the mapper because this
could cause major issues with the disk filling up and that sort of thing.)

Hope that helps.

/* Joey */

On Wed, May 23, 2012 at 2:16 PM, Mark Kerzner <ma...@shmsoft.com>wrote:

> Hi, all,
>
> I got the exception below in the mapper. I already have my global Hadoop
> heap at 5 GB, but is there a specific other setting? Or maybe I should
> troubleshoot for memory?
>
> But the same application works in the IDE.
>
> Thank you!
>
> Mark
>
> *stderr logs*
>
> Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> Java heap space
>        at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
>        at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
>        at
> org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
>        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
>        at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> Exception in thread "communication thread" java.lang.OutOfMemoryError:
> Java heap space
>
> Exception: java.lang.OutOfMemoryError thrown from the
> UncaughtExceptionHandler in thread "communication thread"
>

Re: Memory exception in the mapper

Posted by Mark Kerzner <ma...@shmsoft.com>.
Arun,

Actually CDH3 is Hadoop 0.20, but with .21 backported, so I am using 0.21
API whenever I can.

Mark

On Wed, May 23, 2012 at 9:40 PM, Mark Kerzner <ma...@shmsoft.com>wrote:

> Arun,
>
> I am running the latest CDH3, which I re-installed yesterday, so I believe
> it is Hadoop 0.21.
>
> I have about 6000 maps emitted, and 16 spills, and then I see Mapper
> cleanup() being called, after which I get this error
>
> 2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error
> running child : java.lang.OutOfMemoryError: Java heap space
>     at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)
>
> Thank you,
> Mark
>
>
> On Wed, May 23, 2012 at 9:29 PM, Arun C Murthy <ac...@hortonworks.com>wrote:
>
>> What version of hadoop are you running?
>>
>> On May 23, 2012, at 12:16 PM, Mark Kerzner wrote:
>>
>> > Hi, all,
>> >
>> > I got the exception below in the mapper. I already have my global Hadoop
>> > heap at 5 GB, but is there a specific other setting? Or maybe I should
>> > troubleshoot for memory?
>> >
>> > But the same application works in the IDE.
>> >
>> > Thank you!
>> >
>> > Mark
>> >
>> > *stderr logs*
>> >
>> > Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
>> > Java heap space
>> >       at
>> java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
>> >       at
>> java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
>> >       at
>> org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
>> >       at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
>> >       at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
>> > Exception in thread "communication thread" java.lang.OutOfMemoryError:
>> > Java heap space
>> >
>> > Exception: java.lang.OutOfMemoryError thrown from the
>> > UncaughtExceptionHandler in thread "communication thread"
>>
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>>
>>
>

Re: Memory exception in the mapper

Posted by Mark Kerzner <ma...@shmsoft.com>.
Arun,

I am running the latest CDH3, which I re-installed yesterday, so I believe
it is Hadoop 0.21.

I have about 6000 maps emitted, and 16 spills, and then I see Mapper
cleanup() being called, after which I get this error

2012-05-23 20:22:58,108 FATAL org.apache.hadoop.mapred.Child: Error running
child : java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:355)

Thank you,
Mark

On Wed, May 23, 2012 at 9:29 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> What version of hadoop are you running?
>
> On May 23, 2012, at 12:16 PM, Mark Kerzner wrote:
>
> > Hi, all,
> >
> > I got the exception below in the mapper. I already have my global Hadoop
> > heap at 5 GB, but is there a specific other setting? Or maybe I should
> > troubleshoot for memory?
> >
> > But the same application works in the IDE.
> >
> > Thank you!
> >
> > Mark
> >
> > *stderr logs*
> >
> > Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> > Java heap space
> >       at
> java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> >       at
> java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> >       at
> org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> >       at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> >       at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> > Exception in thread "communication thread" java.lang.OutOfMemoryError:
> > Java heap space
> >
> > Exception: java.lang.OutOfMemoryError thrown from the
> > UncaughtExceptionHandler in thread "communication thread"
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>

Re: Memory exception in the mapper

Posted by Arun C Murthy <ac...@hortonworks.com>.
What version of hadoop are you running?

On May 23, 2012, at 12:16 PM, Mark Kerzner wrote:

> Hi, all,
> 
> I got the exception below in the mapper. I already have my global Hadoop
> heap at 5 GB, but is there a specific other setting? Or maybe I should
> troubleshoot for memory?
> 
> But the same application works in the IDE.
> 
> Thank you!
> 
> Mark
> 
> *stderr logs*
> 
> Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> Java heap space
> 	at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:76)
> 	at java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:59)
> 	at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:292)
> 	at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:365)
> 	at org.apache.hadoop.mapred.Child$3.run(Child.java:157)
> Exception in thread "communication thread" java.lang.OutOfMemoryError:
> Java heap space
> 
> Exception: java.lang.OutOfMemoryError thrown from the
> UncaughtExceptionHandler in thread "communication thread"

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/