Posted to mapreduce-user@hadoop.apache.org by Shuja Rehman <sh...@gmail.com> on 2010/07/09 19:14:55 UTC
java.lang.OutOfMemoryError: Java heap space
Hi All
I am facing a hard problem: I am running a MapReduce job using streaming,
but it fails with the following error.
Caught: java.lang.OutOfMemoryError: Java heap space
at Nodemapper5.parseXML(Nodemapper5.groovy:25)
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
I have increased the heap size in hadoop-env.sh to 2000M, and I also pass it
to the job explicitly with the following option:
-D mapred.child.java.opts=-Xmx2000M \
but it still gives the error. The same job runs fine if I run it on the shell
with a 1024M heap size:
cat file.xml | /root/Nodemapper5.groovy
Any clue?
Thanks in advance.
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
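[Editor's note] The shell test above only exercises the script with whatever default heap the groovy launcher picks, so it does not really prove the job fits in 1024M. Assuming a stock groovy launcher, which reads the JAVA_OPTS environment variable (an assumption about the poster's setup), a heap-capped local run would look like this sketch:

```shell
# Hypothetical local reproduction: the JAVA_OPTS assignment must prefix the
# groovy command (the consumer side of the pipe), not cat, to reach the JVM.
if [ -f file.xml ]; then
  cat file.xml | JAVA_OPTS="-Xmx1024m" /root/Nodemapper5.groovy
fi

# Placement sanity check: a prefix assignment is visible only to that command.
echo | FOO=bar sh -c 'printf "%s\n" "$FOO"'   # prints "bar"
```

If this capped run dies with the same OutOfMemoryError, the problem is the parser's memory use, not the cluster configuration.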
Re: java.lang.OutOfMemoryError: Java heap space
Posted by anshul goel <an...@gmail.com>.
unsubscribe
Re: java.lang.OutOfMemoryError: Java heap space
Posted by anshul goel <an...@gmail.com>.
unsubscribe
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Alex Kozlov <al...@cloudera.com>.
Honestly, no idea. I can just suggest running "hadoop jar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
-jt local -fs local ..." on both nodes and debugging.
On Mon, Jul 12, 2010 at 4:53 PM, Shuja Rehman <sh...@gmail.com> wrote:
> Alex, any guess why it fails on the master while it has more free memory
> than the slave?
>
> On Tue, Jul 13, 2010 at 3:06 AM, Shuja Rehman <sh...@gmail.com>
> wrote:
>
> > Master Node output:
> >
> >                 total     used     free  shared  buffers   cached
> > Mem:          2097328   515576  1581752       0    56060   254760
> > -/+ buffers/cache:      204756  1892572
> > Swap:          522104        0   522104
> >
> > Slave Node output:
> >                 total     used     free  shared  buffers   cached
> > Mem:          1048752   860684   188068       0   148388   570948
> > -/+ buffers/cache:      141348   907404
> > Swap:          522104       40   522064
> >
> > It seems that there is more free memory on the master.
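[Editor's note] When reading these tables, the "-/+ buffers/cache" row is the one that matters for new processes, because buffers and page cache are reclaimable. On the master, free + buffers + cached = 1581752 + 56060 + 254760 = 1892572 KB, which is exactly the second number in that row. A small sketch of pulling that column out (field positions as in the procps output quoted above):

```shell
# Extract "memory actually available" (free + buffers + cache) from the
# -/+ buffers/cache row; $4 is the adjusted-free column in this layout.
free_row='-/+ buffers/cache:   204756  1892572'
echo "$free_row" | awk '{print $4}'   # prints 1892572
```

Newer procps versions drop this row in favor of an "available" column, so the field positions here are tied to the output format shown in the thread.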
> >
> > On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> >> Maybe you do not have enough available memory on the master? What is the
> >> output of "free" on both nodes? -- Alex K
> >>
> >> On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com>
> >> wrote:
> >>
> >> > Hi
> >> > I have added the following lines to my master node's mapred-site.xml file:
> >> >
> >> > <property>
> >> >   <name>mapred.child.ulimit</name>
> >> >   <value>3145728</value>
> >> > </property>
> >> >
> >> > and ran the job again, and the job completed on the 4th attempt. I checked
> >> > at port 50030: Hadoop ran the job 3 times on the master server and it
> >> > failed, but when it ran on the 2nd node, it succeeded and produced the
> >> > desired result. Why did it fail on the master?
> >> > Thanks
> >> > Shuja
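[Editor's note] For reference, mapred.child.ulimit is specified in kilobytes and caps the virtual memory of the child process and anything it forks, which is why it reaches the forked groovy JVM when the heap option alone does not. The value above works out to 3 GB, comfortably above a 2000M heap plus JVM overhead:

```shell
# mapred.child.ulimit is in KB: confirm 3145728 KB = 3072 MB = 3 GB.
kb=3145728
echo "$((kb / 1024)) MB"          # prints "3072 MB"
echo "$((kb / 1024 / 1024)) GB"   # prints "3 GB"
```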
> >> >
> >> >
> >> > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com>
> >> wrote:
> >> >
> >> > > Hmm. It means your options are not propagated to the nodes. Can you put
> >> > > mapred.child.ulimit in the mapred-site.xml and restart the tasktrackers?
> >> > > I was under the impression that the below should be enough, though. Glad
> >> > > you got it working in local mode. -- Alex K
> >> > >
> >> > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <
> shujamughal@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hi Alex, I am using PuTTY to connect to the servers, and what I sent is
> >> > > > almost my maximum screen output. PuTTY does not let me increase the size
> >> > > > of the terminal. Is there any other way to get the complete output of
> >> > > > ps -aef?
> >> > > >
> >> > > > Now I ran the following command and, thank God, it did not fail and
> >> > > > produced the desired output.
> >> > > >
> >> > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> >> > > >   -D mapred.child.java.opts=-Xmx1024m \
> >> > > >   -D mapred.child.ulimit=3145728 \
> >> > > >   -jt local \
> >> > > >   -inputformat StreamInputFormat \
> >> > > >   -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> >> > > >   -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> >> > > >   -jobconf mapred.map.tasks=1 \
> >> > > >   -jobconf mapred.reduce.tasks=0 \
> >> > > >   -output RNC32 \
> >> > > >   -mapper /home/ftpuser1/Nodemapper5.groovy \
> >> > > >   -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> >> > > >   -file /home/ftpuser1/Nodemapper5.groovy
> >> > > >
> >> > > >
> >> > > > but when I omit -jt local, it produces the same error.
> >> > > > Thanks, Alex, for helping.
> >> > > > Regards
> >> > > > Shuja
> >> > > >
> >> > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <alexvk@cloudera.com
> >
> >> > > wrote:
> >> > > >
> >> > > > > Hi Shuja,
> >> > > > >
> >> > > > > Java listens to the last -Xmx, so if you have multiple "-Xmx ..."
> >> > > > > options on the command line, the last one is in effect. Unfortunately
> >> > > > > you have truncated command lines. Can you show us the full command
> >> > > > > line, particularly for process 26162? This seems to be causing
> >> > > > > problems.
> >> > > > >
> >> > > > > If you are running your cluster on 2 nodes, it may be that the
> >> task
> >> > was
> >> > > > > scheduled on the second node. Did you run "ps -aef" on the
> second
> >> > node
> >> > > > as
> >> > > > > well? You can see the task assignment in the JT web-UI (
> >> > > > > http://jt-name:50030, drill down to tasks).
> >> > > > >
> >> > > > > I suggest you debug your program in local mode first, however (use the
> >> > > > > "-jt local" option). Did you try the "-D mapred.child.ulimit=3145728"
> >> > > > > option? I do not see it on the command line.
> >> > > > >
> >> > > > > Alex K
> >> > > > >
> >> > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <
> >> > shujamughal@gmail.com
> >> > > > > >wrote:
> >> > > > >
> >> > > > > > Hi Alex
> >> > > > > >
> >> > > > > > I have tried with using quotes and also with -jt local but
> same
> >> > heap
> >> > > > > > error.
> >> > > > > > and here is the output of ps -aef
> >> > > > > >
> >> > > > > > UID PID PPID C STIME TTY TIME CMD
> >> > > > > > root 1 0 0 04:37 ? 00:00:00 init [3]
> >> > > > > > root 2 1 0 04:37 ? 00:00:00 [migration/0]
> >> > > > > > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> >> > > > > > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> >> > > > > > root 5 1 0 04:37 ? 00:00:00 [events/0]
> >> > > > > > root 6 1 0 04:37 ? 00:00:00 [khelper]
> >> > > > > > root 7 1 0 04:37 ? 00:00:00 [kthread]
> >> > > > > > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> >> > > > > > root 10 7 0 04:37 ? 00:00:00 [xenbus]
> >> > > > > > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> >> > > > > > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> >> > > > > > root 22 7 0 04:37 ? 00:00:00 [khubd]
> >> > > > > > root 24 7 0 04:37 ? 00:00:00 [kseriod]
> >> > > > > > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> >> > > > > > root 85 7 0 04:37 ? 00:00:00 [pdflush]
> >> > > > > > root 86 7 0 04:37 ? 00:00:00 [pdflush]
> >> > > > > > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> >> > > > > > root 88 7 0 04:37 ? 00:00:00 [aio/0]
> >> > > > > > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> >> > > > > > root 248 7 0 04:37 ? 00:00:00 [kstriped]
> >> > > > > > root 257 7 0 04:37 ? 00:00:00 [kjournald]
> >> > > > > > root 279 7 0 04:37 ? 00:00:00 [kauditd]
> >> > > > > > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> >> > > > > > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> >> > > > > > root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
> >> > > > > > root 660 7 0 04:37 ? 00:00:00 [kjournald]
> >> > > > > > root 662 7 0 04:37 ? 00:00:00 [kjournald]
> >> > > > > > root 1032 1 0 04:38 ? 00:00:00 auditd
> >> > > > > > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> >> > > > > > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> >> > > > > > root 1052 1 0 04:38 ? 00:00:00 klogd -x
> >> > > > > > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> >> > > > > > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> >> > > > > > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
> >> > > > > > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> >> > > > > > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> >> > > > > > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> >> > > > > > root 1244 1 0 04:38 ? 00:00:00 pcscd
> >> > > > > > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd --server
> >> > > > > > root 1295 1 0 04:38 ? 00:00:00 automount
> >> > > > > > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> >> > > > > > root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> >> > > > > > root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> >> > > > > > root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting connections
> >> > > > > > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> >> > > > > > root 1379 1 0 04:38 ? 00:00:00 gpm -m /dev/input/mice -t exps2
> >> > > > > > root 1410 1 0 04:38 ? 00:00:00 crond
> >> > > > > > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
> >> > > > > > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> >> > > > > > 68 1508 1 0 04:38 ? 00:00:00 hald
> >> > > > > > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> >> > > > > > root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q never
> >> > > > > > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> >> > > > > > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> >> > > > > > root 1539 1 0 04:38 ? 00:00:00 /usr/libexec/gam_server
> >> > > > > > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
> >> > > > > > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> >> > > > > > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
> >> > > > > > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> >> > > > > > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
> >> > > > > > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> >> > > > > > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
> >> > > > > > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> >> > > > > > hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> >> > > > > > hadoop 24893 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> >> > > > > > hadoop 24988 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> >> > > > > > hadoop 25085 1 0 12:01 ? 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> >> > > > > > hadoop 25175 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> >> > > > > > root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> >> > > > > > hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> >> > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> >> > > > > > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
> >> > > > > >
> >> > > > > >
> >> > > > > > The command which I am executing is:
> >> > > > > >
> >> > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> >> > > > > >   -D mapred.child.java.opts=-Xmx1024m \
> >> > > > > >   -inputformat StreamInputFormat \
> >> > > > > >   -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> >> > > > > >   -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> >> > > > > >   -jobconf mapred.map.tasks=1 \
> >> > > > > >   -jobconf mapred.reduce.tasks=0 \
> >> > > > > >   -output RNC25 \
> >> > > > > >   -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
> >> > > > > >   -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> >> > > > > >   -file /home/ftpuser1/Nodemapper5.groovy \
> >> > > > > >   -jt local
> >> > > > > >
> >> > > > > > I have noticed that all the Hadoop processes show the 2001m heap
> >> > > > > > size which I set in hadoop-env.sh. On the command line, I give 2000
> >> > > > > > in the mapper and 1024 in child.java.opts, but I think these values
> >> > > > > > (1024, 2000) are not in use.
> >> > > > > > Secondly, the following lines:
> >> > > > > >
> >> > > > > > hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> >> > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> >> > > > > >
> >> > > > > > did not appear the first time the job ran; they appeared when the
> >> > > > > > job failed for the first time and then tried to start mapping again.
> >> > > > > > I have one more question: all the Hadoop processes (namenode,
> >> > > > > > datanode, tasktracker, ...) show a 2001m heap size. Does that mean
> >> > > > > > all of these processes are using 2001m of memory?
> >> > > > > >
> >> > > > > > Regards
> >> > > > > > Shuja
> >> > > > > >
> >> > > > > >
> >> > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <
> >> alexvk@cloudera.com>
> >> > > > > wrote:
> >> > > > > >
> >> > > > > > > Hi Shuja,
> >> > > > > > >
> >> > > > > > > I think you need to enclose the invocation string in quotes.
> >> > Try:
> >> > > > > > >
> >> > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> >> > > > > > >
> >> > > > > > > Also, it would be nice to see how exactly groovy is invoked. Is
> >> > > > > > > groovy started and then gives you OOM, or does the OOM occur during
> >> > > > > > > startup? Can you see the new process with "ps -aef"?
> >> > > > > > >
> >> > > > > > > Can you run groovy in local mode? Try "-jt local" option.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > >
> >> > > > > > > Alex K
> >> > > > > > >
> >> > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <
> >> > > shujamughal@gmail.com
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hi Patrick,
> >> > > > > > > > Thanks for the explanation. I have supplied the heap size to the
> >> > > > > > > > mapper in the following way:
> >> > > > > > > >
> >> > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> >> > > > > > > >
> >> > > > > > > > but I get the same error. Any other idea?
> >> > > > > > > > Thanks
> >> > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> >> > > > > patrick@cloudera.com
> >> > > > > > > > >wrote:
> >> > > > > > > >
> >> > > > > > > > > Shuja,
> >> > > > > > > > >
> >> > > > > > > > > Those settings (mapred.child.java.opts and mapred.child.ulimit)
> >> > > > > > > > > are only used for child JVMs that get forked by the TaskTracker.
> >> > > > > > > > > You are using Hadoop streaming, which means the TaskTracker is
> >> > > > > > > > > forking a JVM for streaming, which is then forking a shell
> >> > > > > > > > > process that runs your groovy code (in another JVM).
> >> > > > > > > > >
> >> > > > > > > > > I'm not much of a groovy expert, but if there's a way you can
> >> > > > > > > > > wrap your code around the MapReduce API, that would work best.
> >> > > > > > > > > Otherwise, you can just pass the heap size in the '-mapper'
> >> > > > > > > > > argument.
> >> > > > > > > > >
> >> > > > > > > > > Regards,
> >> > > > > > > > >
> >> > > > > > > > > - Patrick
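[Editor's note] One practical way to apply Patrick's point: a heap flag appended to the -mapper string reaches the groovy script as a plain argument rather than as a JVM option, so a small wrapper script can set the heap for the groovy-side JVM instead. This is a sketch, not the poster's actual setup: the wrapper name is made up, and it assumes the groovy launcher on the task nodes honors JAVA_OPTS.

```shell
# Hypothetical wrapper: ship it with -file and point -mapper at it, so the
# groovy JVM (not the streaming child JVM) gets the 2000m heap.
cat > nodemapper_wrapper.sh <<'EOF'
#!/bin/sh
export JAVA_OPTS="-Xmx2000m"
exec /home/ftpuser1/Nodemapper5.groovy "$@"
EOF
chmod +x nodemapper_wrapper.sh
```

It would then be used as `-mapper nodemapper_wrapper.sh -file nodemapper_wrapper.sh -file /home/ftpuser1/Nodemapper5.groovy`, with the caveat that mapred.child.ulimit must still be large enough for both JVMs.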
> >> > > > > > > > >
> >> > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> >> > > > > shujamughal@gmail.com
> >> > > > > > >
> >> > > > > > > > > wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Hi Alex,
> >> > > > > > > > > >
> >> > > > > > > > > > I have updated Java to the latest available version on all
> >> > > > > > > > > > machines in the cluster, and now I run the job with this line
> >> > > > > > > > > > added:
> >> > > > > > > > > >
> >> > > > > > > > > > -D mapred.child.ulimit=3145728 \
> >> > > > > > > > > >
> >> > > > > > > > > > but I get the same error. Here is the output of this job:
> >> > > > > > > > > >
> >> > > > > > > > > > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
> >> > > > > > > > > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> >> > > > > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str=
> >> > > > > > > > > > -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml
> >> > > > > > > > > > -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:
> >> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> >> > > > > > > > > > org.apache.hadoop.util.RunJar
> >> > > > > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> >> > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> >> > > > > > > > > > -inputformat StreamInputFormat -inputreader
> >> > > > > > > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> >> > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> >> > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> >> > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy
> >> > > > > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> >> > > > > > > > > > -file /home/ftpuser1/Nodemapper5.groovy
> >> > > > > > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > Any clue?
> >> > > > > > > > > > Thanks
> >> > > > > > > > > >
> >> > > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> >> > > > > alexvk@cloudera.com>
> >> > > > > > > > > wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Hi Shuja,
> >> > > > > > > > > > >
> >> > > > > > > > > > > First, thank you for using CDH3. Can you also check what
> >> > > > > > > > > > > mapred.child.ulimit you are using? Try adding
> >> > > > > > > > > > > "-D mapred.child.ulimit=3145728" to the command line.
> >> > > > > > > > > > >
> >> > > > > > > > > > > I would also recommend upgrading Java to JDK 1.6 update 8
> >> > > > > > > > > > > at a minimum, which you can download from the Java SE
> >> > > > > > > > > > > Homepage <http://java.sun.com/javase/downloads/index.jsp>.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Let me know how it goes.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Alex K
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> >> > > > > > > > shujamughal@gmail.com
> >> > > > > > > > > > > >wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Hi Alex
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Yeah, I am running the job on a cluster of 2 machines
> >> > > > > > > > > > > > using the Cloudera distribution of Hadoop. Here is the
> >> > > > > > > > > > > > output of this command:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
> >> > > > > > > > > > > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> >> > > > > > > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str=
> >> > > > > > > > > > > > -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml
> >> > > > > > > > > > > > -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:
> >> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> >> > > > > > > > > > > > -2.1.jar org.apache.hadoop.util.RunJar
> >> > > > > > > > > > > >
> >> > > > > > > > >
> >> > > > > >
> >> > >
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> >> > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat
> >> > > > > > > StreamInputFormat
> >> > > > > > > > > > > > -inputreader StreamXmlRecordReader,begin=
> >> <mdc
> >> > > > > > > xmlns:HTML="
> >> > > > > > > > > > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> >> > > > > > > > > > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> >> > > > > > > > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> >> > > > > > > > mapred.map.tasks=1
> >> > > > > > > > > > > > -jobconf mapred.reduce.tasks=0 -output
> >> RNC11
> >> > > > -mapper
> >> > > > > > > > > > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> >> > > > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file
> /
> >> > > > > > > > > > > > home/ftpuser1/Nodemapper5.groovy
> >> > > > > > > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00
> grep
> >> > > > > > > > > Nodemapper5.groovy
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> ------------------------------------------------------------------------------------------------------------------------------
> >> > > > > > > > > > > > and what is meant by OOM and thanks for helping,
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Best Regards
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> >> > > > > > > alexvk@cloudera.com
> >> > > > > > > > >
> >> > > > > > > > > > > wrote:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > Hi Shuja,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > It looks like the OOM is happening in your code.
> >> Are
> >> > > you
> >> > > > > > > running
> >> > > > > > > > > > > > MapReduce
> >> > > > > > > > > > > > > in a cluster? If so, can you send the exact
> >> command
> >> > > line
> >> > > > > > your
> >> > > > > > > > code
> >> > > > > > > > > > is
> >> > > > > > > > > > > > > invoked with -- you can get it with a 'ps -Af |
> >> grep
> >> > > > > > > > > > > Nodemapper5.groovy'
> >> > > > > > > > > > > > > command on one of the nodes which is running the
> >> > task?
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Alex K
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> >> > > > > > > > > > shujamughal@gmail.com
> >> > > > > > > > > > > > > >wrote:
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Hi All
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > I am facing a hard problem. I am running a map
> >> > reduce
> >> > > > job
> >> > > > > > > using
> >> > > > > > > > > > > > streaming
> >> > > > > > > > > > > > > > but it fails and it gives the following error.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap
> >> space
> >> > > > > > > > > > > > > > at
> >> > Nodemapper5.parseXML(Nodemapper5.groovy:25)
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > java.lang.RuntimeException:
> >> > > > > PipeMapRed.waitOutputThreads():
> >> > > > > > > > > > > subprocess
> >> > > > > > > > > > > > > > failed with code 1
> >> > > > > > > > > > > > > > at
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> >> > > > > > > > > > > > > > at
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > at
> >> > > > > > > > > > > > >
> >> > > > > > >
> >> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> >> > > > > > > > > > > > > > at
> >> > > > > > > > >
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> >> > > > > > > > > > > > > > at
> >> > > > > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > >
> >> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> >> > > > > > > > > > > > > > at
> >> > > > > > > > > > > >
> >> > > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > at
> >> > > > > > > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> >> > > > > > > > > > > > > > at
> >> > > > > > org.apache.hadoop.mapred.Child.main(Child.java:170)
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > I have increased the heap size in
> hadoop-env.sh
> >> and
> >> > > > make
> >> > > > > it
> >> > > > > > > > > 2000M.
> >> > > > > > > > > > > Also
> >> > > > > > > > > > > > I
> >> > > > > > > > > > > > > > tell the job manually by following line.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > but it still gives the error. The same job
> runs
> >> > fine
> >> > > if
> >> > > > i
> >> > > > > > run
> >> > > > > > > > on
> >> > > > > > > > > > > shell
> >> > > > > > > > > > > > > > using
> >> > > > > > > > > > > > > > 1024M heap size like
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Any clue?????????
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Thanks in advance.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > --
> >> > > > > > > > > > > > > > Regards
> >> > > > > > > > > > > > > > Shuja-ur-Rehman Baig
> >> > > > > > > > > > > > > > _________________________________
> >> > > > > > > > > > > > > > MS CS - School of Science and Engineering
> >> > > > > > > > > > > > > > Lahore University of Management Sciences
> (LUMS)
> >> > > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> >> > > > > > > > > > > > > > Cell: +92 3214207445
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > --
> >> > > > > > > > > > > > Regards
> >> > > > > > > > > > > > Shuja-ur-Rehman Baig
> >> > > > > > > > > > > > _________________________________
> >> > > > > > > > > > > > MS CS - School of Science and Engineering
> >> > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> >> > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> >> > > > > > > > > > > > Cell: +92 3214207445
> >> > > > > > > > > > > >
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Alex, any guess why it fails on the master while it has more free memory than
the slave?
On Tue, Jul 13, 2010 at 3:06 AM, Shuja Rehman <sh...@gmail.com> wrote:
> *Master Node output:*
>
> total used free shared buffers cached
> Mem: 2097328 515576 1581752 0 56060 254760
> -/+ buffers/cache: 204756 1892572
> Swap: 522104 0 522104
>
> *Slave Node output:*
> total used free shared buffers cached
> Mem: 1048752 860684 188068 0 148388 570948
> -/+ buffers/cache: 141348 907404
> Swap: 522104 40 522064
>
> It seems that the master has more free memory than the slave.
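The "free" figures quoted above can be reconciled with a quick sketch (values in kB, copied from the listings; buffers and cache are reclaimable, so effective headroom is the "-/+ buffers/cache" free column):

```shell
# Effective free memory = free + buffers + cached, since the kernel can
# reclaim buffer/cache pages on demand; this reproduces the
# "-/+ buffers/cache" row of the 'free' output quoted above.
avail_master=$((1581752 + 56060 + 254760))
avail_slave=$((188068 + 148388 + 570948))
echo "master: ${avail_master} kB effectively free"   # 1892572 kB
echo "slave:  ${avail_slave} kB effectively free"    # 907404 kB
```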
>
>
>
> On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
>> Maybe you do not have enough available memory on the master? What is the
>> output of "*free*" on both nodes? -- Alex K
>>
>> On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com>
>> wrote:
>>
>> > Hi
>> > I have added following line to my master node mapred-site.xml file
>> >
>> > <property>
>> > <name>mapred.child.ulimit</name>
>> > <value>3145728</value>
>> > </property>
>> >
>> > and ran the job again, and wow... the job completed on the 4th attempt.
>> > I checked at port 50030: Hadoop ran the task 3 times on the master
>> > server, where it failed, but when it ran on the 2nd node it succeeded
>> > and produced the desired result. Why did it fail on the master?
>> > Thanks
>> > Shuja
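A note on units for the property above: mapred.child.ulimit is expressed in kilobytes, so 3145728 kB is 3 GB of virtual memory per child, which leaves room for a 1024m heap plus JVM overhead. A quick sanity check of the arithmetic:

```shell
# mapred.child.ulimit is in kB; convert the configured value to MB and
# compare it against the child heap ceiling (-Xmx1024m) to confirm headroom.
ulimit_kb=3145728
ulimit_mb=$((ulimit_kb / 1024))
xmx_mb=1024
echo "ulimit: ${ulimit_mb} MB, child heap: ${xmx_mb} MB"
test "$ulimit_mb" -gt "$xmx_mb" && echo "headroom OK"
```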
>> >
>> >
>> > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com>
>> wrote:
>> >
>> > > Hmm. It means your options are not propagated to the nodes. Can you put
>> > > *mapred.child.ulimit* in mapred-site.xml and restart the tasktrackers?
>> > > I was under the impression that the below should be enough, though.
>> > > Glad you got it working in local mode. -- Alex K
>> > >
>> > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <sh...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi Alex, I am using PuTTY to connect to the servers, and this is
>> > > > almost my maximum screen output which I sent; PuTTY does not allow
>> > > > me to increase the size of the terminal. Is there any other way to
>> > > > get the complete output of ps -aef?
>> > > >
>> > > > Now I ran the following command and, thank God, it did not fail and
>> > > > produced the desired output:
>> > > >
>> > > > hadoop jar \
>> > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
>> > > > -D mapred.child.java.opts=-Xmx1024m \
>> > > > -D mapred.child.ulimit=3145728 \
>> > > > -jt local \
>> > > > -inputformat StreamInputFormat \
>> > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
>> > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
>> > > > -jobconf mapred.map.tasks=1 \
>> > > > -jobconf mapred.reduce.tasks=0 \
>> > > > -output RNC32 \
>> > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
>> > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
>> > > > -file /home/ftpuser1/Nodemapper5.groovy
>> > > >
>> > > >
>> > > > but when I omit -jt local, it produces the same error.
>> > > > Thanks, Alex, for helping.
>> > > > Regards
>> > > > Shuja
>> > > >
>> > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com>
>> > > wrote:
>> > > >
>> > > > > Hi Shuja,
>> > > > >
>> > > > > Java honors the last -Xmx, so if you have multiple "-Xmx ..."
>> > > > > options on the command line, the last one wins. Unfortunately you
>> > > > > have truncated command lines. Can you show us the full command
>> > > > > line, particularly for process 26162? This one seems to be causing
>> > > > > problems.
>> > > > > If you are running your cluster on 2 nodes, it may be that the
>> task
>> > was
>> > > > > scheduled on the second node. Did you run "ps -aef" on the second
>> > node
>> > > > as
>> > > > > well? You can see the task assignment in the JT web-UI (
>> > > > > http://jt-name:50030, drill down to tasks).
>> > > > >
>> > > > > I suggest you first debug your program in local mode, however (use
>> > > > > the "*-jt local*" option). Did you try the
>> > > > > "*-D mapred.child.ulimit=3145728*" option? I do not see it on the
>> > > > > command line.
>> > > > >
>> > > > > Alex K
>> > > > >
>> > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <
>> > shujamughal@gmail.com
>> > > > > >wrote:
>> > > > >
>> > > > > > Hi Alex
>> > > > > >
>> > > > > > I have tried with using quotes and also with -jt local but same
>> > heap
>> > > > > > error.
>> > > > > > and here is the output of ps -aef
>> > > > > >
>> > > > > > UID PID PPID C STIME TTY TIME CMD
>> > > > > > root 1 0 0 04:37 ? 00:00:00 init [3]
>> > > > > > root 2 1 0 04:37 ? 00:00:00 [migration/0]
>> > > > > > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
>> > > > > > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
>> > > > > > root 5 1 0 04:37 ? 00:00:00 [events/0]
>> > > > > > root 6 1 0 04:37 ? 00:00:00 [khelper]
>> > > > > > root 7 1 0 04:37 ? 00:00:00 [kthread]
>> > > > > > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
>> > > > > > root 10 7 0 04:37 ? 00:00:00 [xenbus]
>> > > > > > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
>> > > > > > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
>> > > > > > root 22 7 0 04:37 ? 00:00:00 [khubd]
>> > > > > > root 24 7 0 04:37 ? 00:00:00 [kseriod]
>> > > > > > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
>> > > > > > root 85 7 0 04:37 ? 00:00:00 [pdflush]
>> > > > > > root 86 7 0 04:37 ? 00:00:00 [pdflush]
>> > > > > > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
>> > > > > > root 88 7 0 04:37 ? 00:00:00 [aio/0]
>> > > > > > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
>> > > > > > root 248 7 0 04:37 ? 00:00:00 [kstriped]
>> > > > > > root 257 7 0 04:37 ? 00:00:00 [kjournald]
>> > > > > > root 279 7 0 04:37 ? 00:00:00 [kauditd]
>> > > > > > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
>> > > > > > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
>> > > > > > root 635 7 0 04:37 ? 00:00:00
>> [kmpath_handlerd]
>> > > > > > root 660 7 0 04:37 ? 00:00:00 [kjournald]
>> > > > > > root 662 7 0 04:37 ? 00:00:00 [kjournald]
>> > > > > > root 1032 1 0 04:38 ? 00:00:00 auditd
>> > > > > > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
>> > > > > > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
>> > > > > > root 1052 1 0 04:38 ? 00:00:00 klogd -x
>> > > > > > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
>> > > > > > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
>> > > > > > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon
>> > --system
>> > > > > > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
>> > > > > > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
>> > > > > > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
>> > > > > > root 1244 1 0 04:38 ? 00:00:00 pcscd
>> > > > > > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd
>> > > --server
>> > > > > > root 1295 1 0 04:38 ? 00:00:00 automount
>> > > > > > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
>> > > > > > root 1326 1 0 04:38 ? 00:00:00 xinetd
>> -stayalive
>> > > > > -pidfile
>> > > > > > /var/run/xinetd.pid
>> > > > > > root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd
>> > > > > > /etc/vsftpd/vsftpd.conf
>> > > > > > root 1354 1 0 04:38 ? 00:00:00 sendmail:
>> accepting
>> > > > > > connections
>> > > > > > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01:00:00
>> > > > > > for /var/spool/clientmqueue
>> > > > > > root 1379 1 0 04:38 ? 00:00:00 gpm -m
>> > > /dev/input/mice
>> > > > -t
>> > > > > > exps2
>> > > > > > root 1410 1 0 04:38 ? 00:00:00 crond
>> > > > > > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv
>> > -daemon
>> > > > > > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
>> > > > > > 68 1508 1 0 04:38 ? 00:00:00 hald
>> > > > > > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
>> > > > > > root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd
>> -q
>> > > > never
>> > > > > > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty
>> xvc0
>> > > 9600
>> > > > > > vt100-nav
>> > > > > > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python
>> -tt
>> > > > > > /usr/sbin/yum-updatesd
>> > > > > > root 1539 1 0 04:38 ? 00:00:00
>> > > /usr/libexec/gam_server
>> > > > > > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
>> > > > > > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
>> > > > > > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
>> > > > > > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
>> > > > > > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
>> > > > > > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
>> > > > > > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
>> > > > > > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
>> > > > > > hadoop 24808 1 0 12:01 ? 00:00:02
>> > > > /usr/jdk1.6.0_03/bin/java
>> > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
>> > > -Dcom.sun.management.jmxremote
>> > > > > > -Dhadoop.lo
>> > > > > > hadoop 24893 1 0 12:01 ? 00:00:01
>> > > > /usr/jdk1.6.0_03/bin/java
>> > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
>> > > -Dcom.sun.management.jmxremote
>> > > > > > -Dhadoop.lo
>> > > > > > hadoop 24988 1 0 12:01 ? 00:00:01
>> > > > /usr/jdk1.6.0_03/bin/java
>> > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
>> > > -Dcom.sun.management.jmxremote
>> > > > > > -Dhadoop.lo
>> > > > > > hadoop 25085 1 0 12:01 ? 00:00:00
>> > > > /usr/jdk1.6.0_03/bin/java
>> > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
>> > > -Dcom.sun.management.jmxremote
>> > > > > > -Dhadoop.lo
>> > > > > > hadoop 25175 1 0 12:01 ? 00:00:01
>> > > > /usr/jdk1.6.0_03/bin/java
>> > > > > > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
>> > > > > > -Dhadoop.log.file=hadoo
>> > > > > > root 25925 21994 1 12:06 pts/2 00:00:00
>> > > > /usr/jdk1.6.0_03/bin/java
>> > > > > > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
>> > > > > > -Dhadoop.log.file=hadoop.log -
>> > > > > hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java
>> > > > > -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
>> > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
>> > > > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
>> > > > > -Dscript.name=/usr/local/groovy/b
>> > > > > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
>> > > > > >
>> > > > > >
>> > > > > > *The command which I am executing is:*
>> > > > > >
>> > > > > > hadoop jar \
>> > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
>> > > > > > -D mapred.child.java.opts=-Xmx1024m \
>> > > > > > -inputformat StreamInputFormat \
>> > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
>> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
>> > > > > > -jobconf mapred.map.tasks=1 \
>> > > > > > -jobconf mapred.reduce.tasks=0 \
>> > > > > > -output RNC25 \
>> > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
>> > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
>> > > > > > -file /home/ftpuser1/Nodemapper5.groovy \
>> > > > > > -jt local
>> > > > > >
>> > > > > > I have noticed that all the Hadoop processes show the 2001m
>> > > > > > memory size which I set in hadoop-env.sh. On the command, I give
>> > > > > > 2000 in the mapper and 1024 in child.java.opts, but I think these
>> > > > > > values (1024, 2001) are not in use. Secondly, the following lines
>> > > > > >
>> > > > > > *hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java
>> > > > > > -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
>> > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
>> > > > > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
>> > > > > > -Dscript.name=/usr/local/groovy/b*
>> > > > > >
>> > > > > > did not appear the first time the job ran; they appear after the
>> > > > > > job fails for the first time and tries to start mapping again. I
>> > > > > > have one more question: all the Hadoop processes (namenode,
>> > > > > > datanode, tasktracker...) show a 2001m heap size in the process
>> > > > > > listing. Does it mean all the processes are using 2001m of memory?
>> > > > > >
>> > > > > > Regards
>> > > > > > Shuja
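On the 2001m question: -Xmx sets each JVM's heap ceiling, not its current usage, so daemons launched with -Xmx2001m are not necessarily consuming that much. A small sketch (over one ps line quoted above) showing how to pull the configured ceiling out of the listing:

```shell
# Extract the configured heap ceiling from a ps line; -Xmx is a maximum the
# JVM may grow to, not the amount of memory currently in use.
ps_line='hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote'
echo "$ps_line" | grep -o -e '-Xmx[0-9]*m'   # prints: -Xmx2001m
```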
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <
>> alexvk@cloudera.com>
>> > > > > wrote:
>> > > > > >
>> > > > > > > Hi Shuja,
>> > > > > > >
>> > > > > > > I think you need to enclose the invocation string in quotes.
>> > > > > > > Try:
>> > > > > > >
>> > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
>> > > > > > >
>> > > > > > > Also, it would be nice to see how exactly Groovy is invoked. Is
>> > > > > > > Groovy started and then gives you OOM, or is the OOM error
>> > > > > > > during the start? Can you see the new process with "ps -aef"?
>> > > > > > >
>> > > > > > > Can you run Groovy in local mode? Try the "-jt local" option.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > >
>> > > > > > > Alex K
>> > > > > > >
>> > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <
>> > > shujamughal@gmail.com
>> > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi Patrick,
>> > > > > > > > Thanks for the explanation. I have supplied the heap size in
>> > > > > > > > the mapper in the following way:
>> > > > > > > >
>> > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
>> > > > > > > >
>> > > > > > > > but still the same error. Any other idea?
>> > > > > > > > Thanks
>> > > > > > > >
>> > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
>> > > > > patrick@cloudera.com
>> > > > > > > > >wrote:
>> > > > > > > >
>> > > > > > > > > Shuja,
>> > > > > > > > >
>> > > > > > > > > Those settings (mapred.child.java.opts and
>> > > > > > > > > mapred.child.ulimit) are only used for child JVMs that get
>> > > > > > > > > forked by the TaskTracker. You are using Hadoop streaming,
>> > > > > > > > > which means the TaskTracker is forking a JVM for streaming,
>> > > > > > > > > which is then forking a shell process that runs your groovy
>> > > > > > > > > code (in another JVM).
>> > > > > > > > >
>> > > > > > > > > I'm not much of a groovy expert, but if there's a way you
>> > > > > > > > > can wrap your code around the MapReduce API that would work
>> > > > > > > > > best. Otherwise, you can just pass the heapsize in the
>> > > > > > > > > '-mapper' argument.
>> > > > > > > > >
>> > > > > > > > > Regards,
>> > > > > > > > >
>> > > > > > > > > - Patrick
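Patrick's point can be illustrated without Hadoop: a token appended after the script name in the -mapper string reaches the script as an ordinary argument, not as a flag to the JVM that runs it, which is why "-mapper script Xmx2000m" has no effect on the heap. A stand-in sketch (the fake_mapper.sh path and the JAVA_OPTS suggestion are illustrative assumptions, not the thread's tested fix):

```shell
# Simulate how streaming hands the -mapper string to the child process:
# everything after the script name arrives as positional arguments.
cat > /tmp/fake_mapper.sh <<'EOF'
#!/bin/sh
echo "argument seen by script: $1"
EOF
chmod +x /tmp/fake_mapper.sh
/tmp/fake_mapper.sh Xmx2000m   # prints: argument seen by script: Xmx2000m

# With real streaming, one option is to raise the Groovy JVM's heap via the
# child environment (assumption: the groovy launcher honors JAVA_OPTS):
#   hadoop jar hadoop-streaming.jar -cmdenv JAVA_OPTS=-Xmx1024m ...
```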
>> > > > > > > > >
>> > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
>> > > > > shujamughal@gmail.com
>> > > > > > >
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Alex,
>> > > > > > > > > >
>> > > > > > > > > > I have updated Java to the latest available version on
>> > > > > > > > > > all machines in the cluster, and now I run the job after
>> > > > > > > > > > adding this line:
>> > > > > > > > > >
>> > > > > > > > > > -D mapred.child.ulimit=3145728 \
>> > > > > > > > > >
>> > > > > > > > > > but still the same error. Here is the output of this job:
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java
>> > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
>> > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
>> > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
>> > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
>> > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
>> > > > > > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
>> > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
>> > > > > > > > > > -inputformat StreamInputFormat -inputreader
>> > > > > > > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
>> > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
>> > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
>> > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
>> > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
>> > > > > > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > Any clue?
>> > > > > > > > > > Thanks
>> > > > > > > > > >
>> > > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
>> > > > > alexvk@cloudera.com>
>> > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi Shuja,
>> > > > > > > > > > >
>> > > > > > > > > > > First, thank you for using CDH3. Can you also check what
>> > > > > > > > > > > *mapred.child.ulimit* you are using? Try adding
>> > > > > > > > > > > "*-D mapred.child.ulimit=3145728*" to the command line.
>> > > > > > > > > > >
>> > > > > > > > > > > I would also recommend to upgrade Java to JDK 1.6 update 8 at a
>> > > > > > > > > > > minimum, which you can download from the Java SE Homepage
>> > > > > > > > > > > <http://java.sun.com/javase/downloads/index.jsp>.
>> > > > > > > > > > >
>> > > > > > > > > > > Let me know how it goes.
>> > > > > > > > > > >
>> > > > > > > > > > > Alex K
>> > > > > > > > > > >
>> > > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
>> > > > > > > > shujamughal@gmail.com
>> > > > > > > > > > > >wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi Alex
>> > > > > > > > > > > >
>> > > > > > > > > > > > Yeah, I am running a job on a cluster of 2 machines, using the
>> > > > > > > > > > > > Cloudera distribution of Hadoop. Here is the output of this command:
>> > > > > > > > > > > >
>> > > > > > > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java
>> > > > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
>> > > > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
>> > > > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
>> > > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
>> > > > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
>> > > > > > > > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
>> > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
>> > > > > > > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
>> > > > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
>> > > > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
>> > > > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
>> > > > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
>> > > > > > > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep Nodemapper5.groovy
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> ------------------------------------------------------------------------------------------------------------------------------
>> > > > > > > > > > > > Also, what is meant by OOM? Thanks for helping.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Best Regards
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
>> > > > > > > alexvk@cloudera.com
>> > > > > > > > >
>> > > > > > > > > > > wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Hi Shuja,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > It looks like the OOM is happening in your code. Are you running
>> > > > > > > > > > > > > MapReduce in a cluster? If so, can you send the exact command line
>> > > > > > > > > > > > > your code is invoked with -- you can get it with a
>> > > > > > > > > > > > > 'ps -Af | grep Nodemapper5.groovy' command on one of the nodes
>> > > > > > > > > > > > > which is running the task?
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Thanks,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Alex K
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
>> > > > > > > > > > shujamughal@gmail.com
>> > > > > > > > > > > > > >wrote:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > Hi All
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > I am facing a hard problem. I am running a map reduce job using
>> > > > > > > > > > > > > > streaming but it fails and it gives the following error.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
>> > > > > > > > > > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>> > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
>> > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
>> > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
>> > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>> > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
>> > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>> > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>> > > > > > > > > > > > > > at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > I have increased the heap size in hadoop-env.sh and made it 2000M.
>> > > > > > > > > > > > > > I also tell the job manually with the following line:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > but it still gives the error. The same job runs fine if I run it
>> > > > > > > > > > > > > > on the shell using a 1024M heap size, like:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Any clue?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Thanks in advance.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > Regards
>> > > > > > > > > > > > > > Shuja-ur-Rehman Baig
>> > > > > > > > > > > > > > _________________________________
>> > > > > > > > > > > > > > MS CS - School of Science and Engineering
>> > > > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
>> > > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
>> > > > > > > > > > > > > > Cell: +92 3214207445
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > >
>> >
>> >
>> >
>> >
>>
>
>
>
>
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Hi Ted Yu,
Yes, I have a cluster of 2 nodes and I have configured the task tracker on the
name node as well to process the files.
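One detail that trips people up in this thread: *mapred.child.ulimit* is specified in kilobytes while *-Xmx* is given in megabytes, so the two are easy to misalign and an undersized ulimit will kill the child JVM. A minimal sanity-check sketch (the helper name `ulimit_covers_heap` is made up for illustration, not part of Hadoop):

```python
def ulimit_covers_heap(ulimit_kb, xmx_mb):
    """Return True if the virtual-memory cap (mapred.child.ulimit, in KB)
    exceeds the JVM max heap (-Xmx, in MB). The JVM needs headroom beyond
    the heap for native memory, threads, and JIT-compiled code."""
    return ulimit_kb // 1024 > xmx_mb

# The values used in this thread: 3145728 KB = 3072 MB vs. a 2000 MB heap.
print(ulimit_covers_heap(3145728, 2000))  # True: the cap leaves ~1 GB of headroom
print(ulimit_covers_heap(3145728, 3200))  # False: the heap alone exceeds the cap
```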
On Tue, Jul 13, 2010 at 5:49 AM, Ted Yu <yu...@gmail.com> wrote:
> Normally a task tracker isn't run on the Name node.
> Did you configure it otherwise?
>
> On Mon, Jul 12, 2010 at 3:06 PM, Shuja Rehman <sh...@gmail.com>
> wrote:
>
> > *Master Node output:*
> >
> > total used free shared buffers cached
> > Mem: 2097328 515576 1581752 0 56060 254760
> > -/+ buffers/cache: 204756 1892572
> > Swap: 522104 0 522104
> >
> > *Slave Node output:*
> > total used free shared buffers cached
> > Mem: 1048752 860684 188068 0 148388 570948
> > -/+ buffers/cache: 141348 907404
> > Swap: 522104 40 522064
> >
> > it seems that on server there is more memory free.
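For readers comparing the two `free` outputs above: memory held by buffers and cache is reclaimable, so the effectively available memory is free + buffers + cached (the second number on the "-/+ buffers/cache" line). A small sketch of that arithmetic (the parser below is illustrative only, written against the master-node output quoted above):

```python
def parse_free(output):
    """Parse the 'Mem:' row of `free` output into a field -> KB dict."""
    for line in output.splitlines():
        if line.startswith("Mem:"):
            fields = ["total", "used", "free", "shared", "buffers", "cached"]
            return dict(zip(fields, (int(v) for v in line.split()[1:])))
    raise ValueError("no Mem: line found")

master = """\
             total       used       free     shared    buffers     cached
Mem:       2097328     515576    1581752          0      56060     254760
"""

info = parse_free(master)
# Effectively available = free + buffers + cached (matches the
# "-/+ buffers/cache" free column: 1892572 KB, roughly 1848 MB).
print((info["free"] + info["buffers"] + info["cached"]) // 1024, "MB")
```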
> >
> >
> > On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> > > Maybe you do not have enough available memory on master? What is the
> > > output
> > > of "*free*" on both nodes? -- Alex K
> > >
> > > On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com>
> > > wrote:
> > >
> > > > Hi
> > > > I have added following line to my master node mapred-site.xml file
> > > >
> > > > <property>
> > > > <name>mapred.child.ulimit</name>
> > > > <value>3145728</value>
> > > > </property>
> > > >
> > > > and ran the job again, and wow, the job got completed on the 4th attempt. I
> > > > checked at port 50030. Hadoop ran the job 3 times on the master server and it
> > > > failed, but when it ran on the 2nd node, it succeeded and produced the desired
> > > > result. Why did it fail on the master?
> > > > Thanks
> > > > Shuja
> > > >
> > > >
> > > > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com>
> > > wrote:
> > > >
> > > > > Hmm. It means your options are not propagated to the nodes. Can you put
> > > > > *mapred.child.ulimit* in the mapred-site.xml and restart the tasktrackers?
> > > > > I was under the impression that the below should be enough, though. Glad
> > > > > you got it working in local mode. -- Alex K
> > > > >
> > > > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <
> shujamughal@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Alex, I am using PuTTY to connect to the servers, and this is almost
> > > > > > my maximum screen output which I sent; PuTTY does not allow me to
> > > > > > increase the size of the terminal. Is there any other way to get the
> > > > > > complete output of ps -aef?
> > > > > >
> > > > > > Now I ran the following command and, thank God, it did not fail and
> > > > > > produced the desired output.
> > > > > >
> > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > -jt local \
> > > > > > -inputformat StreamInputFormat \
> > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > > -jobconf mapred.map.tasks=1 \
> > > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > > -output RNC32 \
> > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
> > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > > -file /home/ftpuser1/Nodemapper5.groovy
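The begin/end arguments to StreamXmlRecordReader slice the input into one record per begin…end block, and each block is handed to the mapper as a single record. A rough stand-alone illustration of that slicing (a sketch only, with a simplified begin tag; the real reader works on byte offsets across HDFS splits):

```python
def xml_records(text, begin, end):
    """Yield each begin...end block (inclusive), roughly mimicking how
    StreamXmlRecordReader carves records out of the input stream."""
    pos = 0
    while True:
        start = text.find(begin, pos)
        if start == -1:
            return                      # no more records
        stop = text.find(end, start)
        if stop == -1:
            return                      # unterminated record: drop it
        stop += len(end)
        yield text[start:stop]
        pos = stop

data = "<mdc><a>1</a></mdc>junk<mdc><a>2</a></mdc>"
print(list(xml_records(data, "<mdc", "</mdc>")))
# -> ['<mdc><a>1</a></mdc>', '<mdc><a>2</a></mdc>']
```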
> > > > > >
> > > > > >
> > > > > > but when I omit the -jt local, it produces the same error.
> > > > > > Thanks, Alex, for helping.
> > > > > > Regards
> > > > > > Shuja
> > > > > >
> > > > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <
> alexvk@cloudera.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi Shuja,
> > > > > > >
> > > > > > > Java listens to the last -Xmx, so if you have multiple "-Xmx ..."
> > > > > > > options on the command line, the last one is valid. Unfortunately you
> > > > > > > have truncated the command lines. Can you show us the full command
> > > > > > > line, particularly for the process 26162? This seems to be causing
> > > > > > > problems.
> > > > > > >
> > > > > > > If you are running your cluster on 2 nodes, it may be that the task was
> > > > > > > scheduled on the second node. Did you run "ps -aef" on the second node
> > > > > > > as well? You can see the task assignment in the JT web-UI (
> > > > > > > http://jt-name:50030, drill down to tasks).
> > > > > > >
> > > > > > > I suggest you first debug your program in local mode, however (use the
> > > > > > > "*-jt local*" option). Did you try the "*-D mapred.child.ulimit=3145728*"
> > > > > > > option? I do not see it on the command line.
> > > > > > >
> > > > > > > Alex K
> > > > > > >
> > > > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <
> > > > shujamughal@gmail.com
> > > > > > > >wrote:
> > > > > > >
> > > > > > > > Hi Alex
> > > > > > > >
> > > > > > > > I have tried using quotes and also -jt local, but I get the same
> > > > > > > > heap error. Here is the output of ps -aef:
> > > > > > > >
> > > > > > > > UID PID PPID C STIME TTY TIME CMD
> > > > > > > > root 1 0 0 04:37 ? 00:00:00 init [3]
> > > > > > > > root 2 1 0 04:37 ? 00:00:00 [migration/0]
> > > > > > > > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> > > > > > > > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> > > > > > > > root 5 1 0 04:37 ? 00:00:00 [events/0]
> > > > > > > > root 6 1 0 04:37 ? 00:00:00 [khelper]
> > > > > > > > root 7 1 0 04:37 ? 00:00:00 [kthread]
> > > > > > > > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> > > > > > > > root 10 7 0 04:37 ? 00:00:00 [xenbus]
> > > > > > > > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> > > > > > > > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> > > > > > > > root 22 7 0 04:37 ? 00:00:00 [khubd]
> > > > > > > > root 24 7 0 04:37 ? 00:00:00 [kseriod]
> > > > > > > > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> > > > > > > > root 85 7 0 04:37 ? 00:00:00 [pdflush]
> > > > > > > > root 86 7 0 04:37 ? 00:00:00 [pdflush]
> > > > > > > > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> > > > > > > > root 88 7 0 04:37 ? 00:00:00 [aio/0]
> > > > > > > > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> > > > > > > > root 248 7 0 04:37 ? 00:00:00 [kstriped]
> > > > > > > > root 257 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > > > root 279 7 0 04:37 ? 00:00:00 [kauditd]
> > > > > > > > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> > > > > > > > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> > > > > > > > root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
> > > > > > > > root 660 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > > > root 662 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > > > root 1032 1 0 04:38 ? 00:00:00 auditd
> > > > > > > > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> > > > > > > > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> > > > > > > > root 1052 1 0 04:38 ? 00:00:00 klogd -x
> > > > > > > > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> > > > > > > > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> > > > > > > > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
> > > > > > > > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> > > > > > > > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> > > > > > > > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> > > > > > > > root 1244 1 0 04:38 ? 00:00:00 pcscd
> > > > > > > > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd --server
> > > > > > > > root 1295 1 0 04:38 ? 00:00:00 automount
> > > > > > > > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> > > > > > > > root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> > > > > > > > root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> > > > > > > > root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting connections
> > > > > > > > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> > > > > > > > root 1379 1 0 04:38 ? 00:00:00 gpm -m /dev/input/mice -t exps2
> > > > > > > > root 1410 1 0 04:38 ? 00:00:00 crond
> > > > > > > > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
> > > > > > > > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> > > > > > > > 68 1508 1 0 04:38 ? 00:00:00 hald
> > > > > > > > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> > > > > > > > root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q never
> > > > > > > > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> > > > > > > > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> > > > > > > > root 1539 1 0 04:38 ? 00:00:00 /usr/libexec/gam_server
> > > > > > > > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
> > > > > > > > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> > > > > > > > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
> > > > > > > > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> > > > > > > > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
> > > > > > > > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> > > > > > > > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
> > > > > > > > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> > > > > > > > hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > > hadoop 24893 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > > hadoop 24988 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > > hadoop 25085 1 0 12:01 ? 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > > hadoop 25175 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> > > > > > > > root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> > > > > > > > hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> > > > > > > > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
> > > > > > > >
> > > > > > > >
> > > > > > > > *The command which i am executing is *
> > > > > > > >
> > > > > > > >
> > > > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > > > > -inputformat StreamInputFormat \
> > > > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > > > > -jobconf mapred.map.tasks=1 \
> > > > > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > > > > -output RNC25 \
> > > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m"\
> > > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > > > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > > > > > > -jt local
> > > > > > > >
> > > > > > > > I have noticed that all hadoop processes show the 2001 memory size
> > > > > > > > which I have set in hadoop-env.sh, and in the command I give 2000 in
> > > > > > > > the mapper and 1024 in child.java.opts, but I think these values
> > > > > > > > (1024, 2001) are not in use.
> > > > > > > > Secondly, the following lines
> > > > > > > >
> > > > > > > > *hadoop 26120 25175 14 12:06 ? 00:00:01
> > > > > > > > /usr/jdk1.6.0_03/jre/bin/java
> > > > > > > > -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05
> > > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > > > > > > > -Dscript.name=/usr/local/groovy/b*
> > > > > > > >
> > > > > > > > did not appear the first time the job ran. They appear when the job
> > > > > > > > fails for the first time and then tries to start mapping again. I have
> > > > > > > > one more question: all hadoop processes (namenode, datanode,
> > > > > > > > tasktracker...) show a 2001 heapsize in the process listing. Does that
> > > > > > > > mean all the processes are using 2001m of memory?
> > > > > > > >
> > > > > > > > Regards
> > > > > > > > Shuja
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <
> > > alexvk@cloudera.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Shuja,
> > > > > > > > >
> > > > > > > > > I think you need to enclose the invocation string in quotes. Try:
> > > > > > > > >
> > > > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > > > > > > >
> > > > > > > > > Also, it would be nice to see how exactly the groovy is invoked. Is
> > > > > > > > > groovy started and then gives you OOM, or is the OOM error during the
> > > > > > > > > start? Can you see the new process with "ps -aef"?
> > > > > > > > >
> > > > > > > > > Can you run groovy in local mode? Try the "-jt local" option.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Alex K
> > > > > > > > >
> > > > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <
> > > > > shujamughal@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Patrick,
> > > > > > > > > > Thanks for explanation. I have supply the heapsize in
> > mapper
> > > in
> > > > > the
> > > > > > > > > > following way
> > > > > > > > > >
> > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > > > > > > >
> > > > > > > > > > but still same error. Any other idea?
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> > > > > > > patrick@cloudera.com
> > > > > > > > > > >wrote:
> > > > > > > > > >
> > > > > > > > > > > Shuja,
> > > > > > > > > > >
> > > > > > > > > > > Those settings (mapred.child.jvm.opts and
> > > > mapred.child.ulimit)
> > > > > > are
> > > > > > > > only
> > > > > > > > > > > used
> > > > > > > > > > > for child JVMs that get forked by the TaskTracker. You
> > are
> > > > > using
> > > > > > > > Hadoop
> > > > > > > > > > > streaming, which means the TaskTracker is forking a JVM
> > for
> > > > > > > > streaming,
> > > > > > > > > > > which
> > > > > > > > > > > is then forking a shell process that runs your groovy
> > code
> > > > (in
> > > > > > > > another
> > > > > > > > > > > JVM).
> > > > > > > > > > >
> > > > > > > > > > > I'm not much of a groovy expert, but if there's a way
> you
> > > can
> > > > > > wrap
> > > > > > > > your
> > > > > > > > > > > code
> > > > > > > > > > > around the MapReduce API that would work best.
> Otherwise,
> > > you
> > > > > can
> > > > > > > > just
> > > > > > > > > > pass
> > > > > > > > > > > the heapsize in '-mapper' argument.
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > >
> > > > > > > > > > > - Patrick
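Patrick's advice above, passing the heap size to the Groovy child directly rather than relying on mapred.child.java.opts (which only reaches the JVM forked by the TaskTracker), can be sketched as below. This is a sketch under two assumptions to verify against your versions: Hadoop streaming's -cmdenv flag exports an environment variable into the mapper's environment, and the stock groovy launcher script reads JAVA_OPTS when starting its JVM. Paths and the script name are the ones used in this thread; the output directory name is hypothetical.

```shell
# Sketch: give the Groovy mapper its own heap via JAVA_OPTS, exported into
# the streaming child's environment with -cmdenv. The streaming JVM itself
# keeps a modest -Xmx; the Groovy JVM it forks would pick up -Xmx2000m.
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
  -D mapred.child.java.opts=-Xmx1024m \
  -cmdenv JAVA_OPTS=-Xmx2000m \
  -inputformat StreamInputFormat \
  -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
  -output RNC_out \
  -mapper /home/ftpuser1/Nodemapper5.groovy \
  -file /home/ftpuser1/Nodemapper5.groovy
```

The -inputreader option from the thread is omitted here for brevity; RNC_out stands in for whatever output directory is free.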
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> > > > > > > shujamughal@gmail.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Alex,
> > > > > > > > > > > >
> > > > > > > > > > > > I have update the java to latest available version on
> > all
> > > > > > > machines
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > cluster and now i run the job by adding this line
> > > > > > > > > > > >
> > > > > > > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > > > > > > >
> > > > > > > > > > > > but still same error. Here is the output of this job.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > > > > > org.apache.hadoop.util.RunJar
> > > > > > > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > > > > > > > > -inputformat StreamInputFormat
> > > > > > > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Any clue?
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> > > > > > > alexvk@cloudera.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Shuja,
> > > > > > > > > > > > >
> > > > > > > > > > > > > First, thank you for using CDH3. Can you also check what
> > > > > > > > > > > > > *mapred.child.ulimit* you are using? Try adding
> > > > > > > > > > > > > "*-D mapred.child.ulimit=3145728*" to the command line.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I would also recommend upgrading java to JDK 1.6 update 8 at
> > > > > > > > > > > > > a minimum, which you can download from the Java SE Homepage
> > > > > > > > > > > > > <http://java.sun.com/javase/downloads/index.jsp>.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Let me know how it goes.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Alex K
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > > > > > > > > shujamughal@gmail.com
> > > > > > > > > > > > > >wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Alex
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Yeah, I am running a job on cluster of 2 machines
> > and
> > > > > using
> > > > > > > > > > Cloudera
> > > > > > > > > > > > > > distribution of hadoop. and here is the output of
> > > this
> > > > > > > command.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > > > > > > > org.apache.hadoop.util.RunJar
> > > > > > > > > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M
> > > > > > > > > > > > > > -inputformat StreamInputFormat
> > > > > > > > > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > > > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep Nodemapper5.groovy
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > > > > > > > > and what is meant by OOM and thanks for helping,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best Regards
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > > > > > > > > alexvk@cloudera.com
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Shuja,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It looks like the OOM is happening in your
> code.
> > > Are
> > > > > you
> > > > > > > > > running
> > > > > > > > > > > > > > MapReduce
> > > > > > > > > > > > > > > in a cluster? If so, can you send the exact
> > > command
> > > > > line
> > > > > > > > your
> > > > > > > > > > code
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > invoked with -- you can get it with a 'ps -Af |
> > > grep
> > > > > > > > > > > > > Nodemapper5.groovy'
> > > > > > > > > > > > > > > command on one of the nodes which is running
> the
> > > > task?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Alex K
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman
> <
> > > > > > > > > > > > shujamughal@gmail.com
> > > > > > > > > > > > > > > >wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi All
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I am facing a hard problem. I am running a
> map
> > > > reduce
> > > > > > job
> > > > > > > > > using
> > > > > > > > > > > > > > streaming
> > > > > > > > > > > > > > > > but it fails and it gives the following
> error.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > > > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > > > > > > > > > > > > > > > subprocess failed with code 1
> > > > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > > > > > > > at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I have increased the heap size in hadoop-env.sh and make it
> > > > > > > > > > > > > > > > 2000M. Also I tell the job manually by following line.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > but it still gives the error. The same job runs fine if i run
> > > > > > > > > > > > > > > > on shell using 1024M heap size like
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Any clue?????????
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks in advance.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > Regards
> > > > > > > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > > > > > > _________________________________
> > > > > > > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > > > > > > Lahore University of Management Sciences
> (LUMS)
> > > > > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > > > > > > > >
>
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Ted Yu <yu...@gmail.com>.
Normally task tracker isn't run on Name node.
Did you configure otherwise ?
On Mon, Jul 12, 2010 at 3:06 PM, Shuja Rehman <sh...@gmail.com> wrote:
> *Master Node output:*
>
> total used free shared buffers cached
> Mem: 2097328 515576 1581752 0 56060 254760
> -/+ buffers/cache: 204756 1892572
> Swap: 522104 0 522104
>
> *Slave Node output:*
> total used free shared buffers cached
> Mem: 1048752 860684 188068 0 148388 570948
> -/+ buffers/cache: 141348 907404
> Swap: 522104 40 522064
>
> it seems that on server there is more memory free.
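A note on reading the free output above: the first row counts buffers and page cache as used, so the number that matters for starting a new JVM is the free column of the "-/+ buffers/cache" row (values are in KB). A small sketch with the figures from both nodes:

```shell
# Reclaimable buffers/cache mean the "-/+ buffers/cache" free column is the
# realistic available memory, not the first-row "free" value.
master_free_kb=1892572   # master "-/+ buffers/cache" free, from the output above
slave_free_kb=907404     # slave "-/+ buffers/cache" free, from the output above
echo "master: $((master_free_kb / 1024)) MB available"
echo "slave: $((slave_free_kb / 1024)) MB available"
```

So the master shows roughly 1848 MB available and the slave roughly 886 MB.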
>
>
> On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Maybe you do not have enough available memory on master? What is the
> > output
> > of "*free*" on both nodes? -- Alex K
> >
> > On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com>
> > wrote:
> >
> > > Hi
> > > I have added following line to my master node mapred-site.xml file
> > >
> > > <property>
> > > <name>mapred.child.ulimit</name>
> > > <value>3145728</value>
> > > </property>
> > >
> > > and ran the job again and, wow, the job completed on the 4th attempt. I
> > > checked at port 50030: Hadoop ran the task 3 times on the master server and it
> > > failed there, but when it ran on the 2nd node it succeeded and produced the
> > > desired result. Why did it fail on the master?
> > > Thanks
> > > Shuja
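[As an aside on the numbers in this exchange: mapred.child.ulimit is expressed in kilobytes and caps the child's whole virtual address space, so it must leave headroom above -Xmx for JVM overhead. A minimal sketch of that sizing arithmetic; the 512 MB overhead allowance is an illustrative assumption, not a Hadoop default:]

```shell
# Sketch: relate mapred.child.ulimit (in KB) to the child heap (-Xmx).
# Assumption: ~512 MB of native/JVM overhead on top of the Java heap.
heap_kb=$((2000 * 1024))      # -Xmx2000M expressed in KB
overhead_kb=$((512 * 1024))   # illustrative non-heap allowance
min_ulimit_kb=$((heap_kb + overhead_kb))
echo "mapred.child.ulimit should be at least ${min_ulimit_kb} KB"
```

[The 3145728 value in the property above is 3 GB in KB, which comfortably clears a 2000 MB heap under this estimate.]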
> > >
> > >
> > > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com>
> > wrote:
> > >
> > > > Hmm. It means your options are not propagated to the nodes. Can you put
> > > > *mapred.child.ulimit* in the mapred-site.xml and restart the tasktrackers?
> > > > I was under the impression that the below should be enough, though. Glad you
> > > > got it working in local mode. -- Alex K
> > > >
> > > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <shujamughal@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hi Alex, I am using PuTTY to connect to the servers, and this is almost
> > > > > the maximum screen output I could capture; PuTTY does not let me increase
> > > > > the terminal size. Is there any other way to get the complete output of
> > > > > ps -aef?
> > > > >
> > > > > Now I ran the following command and, thank God, it did not fail and
> > > > > produced the desired output.
> > > > >
> > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > -D mapred.child.ulimit=3145728 \
> > > > > -jt local \
> > > > > -inputformat StreamInputFormat \
> > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > -jobconf mapred.map.tasks=1 \
> > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > -output RNC32 \
> > > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
> > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > > >
> > > > >
> > > > > But when I omit -jt local, it produces the same error.
> > > > > Thanks, Alex, for helping.
> > > > > Regards
> > > > > Shuja
> > > > >
> > > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com>
> > > > wrote:
> > > > >
> > > > > > Hi Shuja,
> > > > > >
> > > > > > Java listens to the last xmx, so if you have multiple "-Xmx ..."
> on
> > > the
> > > > > > command line, the last is valid. Unfortunately you have
> truncated
> > > > > command
> > > > > > lines. Can you show us the full command line, particularly for
> the
> > > > > process
> > > > > > 26162? This seems to be causing problems.
> > > > > >
> > > > > > If you are running your cluster on 2 nodes, it may be that the
> task
> > > was
> > > > > > scheduled on the second node. Did you run "ps -aef" on the
> second
> > > node
> > > > > as
> > > > > > well? You can see the task assignment in the JT web-UI (
> > > > > > http://jt-name:50030, drill down to tasks).
> > > > > >
> > > > > > I suggest you first debug your program in the local mode first,
> > > however
> > > > > > (use
> > > > > > "*-jt local*" option). Did you try the "*-D
> > > > > mapred.child.ulimit=3145728*"
> > > > > > option? I do not see it on the command line.
> > > > > >
> > > > > > Alex K
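[To make the "last -Xmx wins" point above concrete, here is a tiny stand-alone sketch; it only mimics the JVM's flag handling over a sample argument list, it does not invoke Java:]

```shell
# The JVM honors the LAST -Xmx on its command line; earlier ones are ignored.
# Mimic that rule over a sample argument list.
effective=""
for arg in -Xmx2001m -Dhadoop.log.dir=/tmp -Xmx1024m; do
  case "$arg" in
    -Xmx*) effective="${arg#-Xmx}" ;;  # later flags overwrite earlier ones
  esac
done
echo "effective heap: $effective"
```

[This is why a process launched with -Xmx2001m from hadoop-env.sh followed by -Xmx1024m from mapred.child.java.opts ends up with the 1024m cap.]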
> > > > > >
> > > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <
> > > shujamughal@gmail.com
> > > > > > >wrote:
> > > > > >
> > > > > > > Hi Alex
> > > > > > >
> > > > > > > I have tried with using quotes and also with -jt local but
> same
> > > heap
> > > > > > > error.
> > > > > > > and here is the output of ps -aef
> > > > > > >
> > > > > > > UID PID PPID C STIME TTY TIME CMD
> > > > > > > root 1 0 0 04:37 ? 00:00:00 init [3]
> > > > > > > root 2 1 0 04:37 ? 00:00:00 [migration/0]
> > > > > > > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> > > > > > > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> > > > > > > root 5 1 0 04:37 ? 00:00:00 [events/0]
> > > > > > > root 6 1 0 04:37 ? 00:00:00 [khelper]
> > > > > > > root 7 1 0 04:37 ? 00:00:00 [kthread]
> > > > > > > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> > > > > > > root 10 7 0 04:37 ? 00:00:00 [xenbus]
> > > > > > > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> > > > > > > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> > > > > > > root 22 7 0 04:37 ? 00:00:00 [khubd]
> > > > > > > root 24 7 0 04:37 ? 00:00:00 [kseriod]
> > > > > > > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> > > > > > > root 85 7 0 04:37 ? 00:00:00 [pdflush]
> > > > > > > root 86 7 0 04:37 ? 00:00:00 [pdflush]
> > > > > > > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> > > > > > > root 88 7 0 04:37 ? 00:00:00 [aio/0]
> > > > > > > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> > > > > > > root 248 7 0 04:37 ? 00:00:00 [kstriped]
> > > > > > > root 257 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > > root 279 7 0 04:37 ? 00:00:00 [kauditd]
> > > > > > > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> > > > > > > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> > > > > > > root 635 7 0 04:37 ? 00:00:00
> [kmpath_handlerd]
> > > > > > > root 660 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > > root 662 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > > root 1032 1 0 04:38 ? 00:00:00 auditd
> > > > > > > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> > > > > > > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> > > > > > > root 1052 1 0 04:38 ? 00:00:00 klogd -x
> > > > > > > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> > > > > > > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> > > > > > > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon
> > > --system
> > > > > > > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> > > > > > > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> > > > > > > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> > > > > > > root 1244 1 0 04:38 ? 00:00:00 pcscd
> > > > > > > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd
> > > > --server
> > > > > > > root 1295 1 0 04:38 ? 00:00:00 automount
> > > > > > > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> > > > > > > root 1326 1 0 04:38 ? 00:00:00 xinetd
> -stayalive
> > > > > > -pidfile
> > > > > > > /var/run/xinetd.pid
> > > > > > > root 1337 1 0 04:38 ? 00:00:00
> /usr/sbin/vsftpd
> > > > > > > /etc/vsftpd/vsftpd.conf
> > > > > > > root 1354 1 0 04:38 ? 00:00:00 sendmail:
> > accepting
> > > > > > > connections
> > > > > > > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue
> > > > > runner@01
> > > > > > > :00:00
> > > > > > > for /var/spool/clientmqueue
> > > > > > > root 1379 1 0 04:38 ? 00:00:00 gpm -m
> > > > /dev/input/mice
> > > > > -t
> > > > > > > exps2
> > > > > > > root 1410 1 0 04:38 ? 00:00:00 crond
> > > > > > > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv
> > > -daemon
> > > > > > > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> > > > > > > 68 1508 1 0 04:38 ? 00:00:00 hald
> > > > > > > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> > > > > > > root 1533 1 0 04:38 ? 00:00:00
> /usr/sbin/smartd
> > -q
> > > > > never
> > > > > > > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty
> xvc0
> > > > 9600
> > > > > > > vt100-nav
> > > > > > > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python
> > -tt
> > > > > > > /usr/sbin/yum-updatesd
> > > > > > > root 1539 1 0 04:38 ? 00:00:00
> > > > /usr/libexec/gam_server
> > > > > > > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts
> /0
> > > > > > > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> > > > > > > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts
> /1
> > > > > > > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> > > > > > > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts
> /2
> > > > > > > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> > > > > > > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts
> /3
> > > > > > > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> > > > > > > hadoop 24808 1 0 12:01 ? 00:00:02
> > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
> > > > -Dcom.sun.management.jmxremote
> > > > > > > -Dhadoop.lo
> > > > > > > hadoop 24893 1 0 12:01 ? 00:00:01
> > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
> > > > -Dcom.sun.management.jmxremote
> > > > > > > -Dhadoop.lo
> > > > > > > hadoop 24988 1 0 12:01 ? 00:00:01
> > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
> > > > -Dcom.sun.management.jmxremote
> > > > > > > -Dhadoop.lo
> > > > > > > hadoop 25085 1 0 12:01 ? 00:00:00
> > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
> > > > -Dcom.sun.management.jmxremote
> > > > > > > -Dhadoop.lo
> > > > > > > hadoop 25175 1 0 12:01 ? 00:00:01
> > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
> > > > > > > -Dhadoop.log.file=hadoo
> > > > > > > root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> > > > > > > hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> > > > > > > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
> > > > > > >
> > > > > > >
> > > > > > > *The command which i am executing is *
> > > > > > >
> > > > > > >
> > > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > > > -inputformat StreamInputFormat \
> > > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > > > -jobconf mapred.map.tasks=1 \
> > > > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > > > -output RNC25 \
> > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m"\
> > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > > > > > -jt local
> > > > > > >
> > > > > > > I have noticed that all Hadoop processes show the 2001 MB heap size
> > > > > > > that I set in hadoop-env.sh. On the command line I give 2000 in the
> > > > > > > mapper and 1024 in child.java.opts, but I think these values (1024, 2001)
> > > > > > > are not in use. Secondly, the following lines
> > > > > > >
> > > > > > > *hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b*
> > > > > > >
> > > > > > > did not appear the first time the job ran; they appeared after the job
> > > > > > > failed for the first time and tried to start mapping again. I have one
> > > > > > > more question: since all Hadoop processes (namenode, datanode,
> > > > > > > tasktracker...) show a 2001 MB heap size, does that mean all the
> > > > > > > processes are using 2001 MB of memory?
> > > > > > >
> > > > > > > Regards
> > > > > > > Shuja
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <
> > alexvk@cloudera.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Shuja,
> > > > > > > >
> > > > > > > > I think you need to enclose the invocation string in quotes.
> > > Try:
> > > > > > > >
> > > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > > > > > >
> > > > > > > > Also, it would be nice to see how exactly the groovy is invoked. Is
> > > > > > > > groovy started and then gives you OOM, or is the OOM error during the
> > > > > > > > start? Can you see the new process with "ps -aef"?
> > > > > > > >
> > > > > > > > Can you run groovy in local mode? Try "-jt local" option.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Alex K
> > > > > > > >
> > > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <
> > > > shujamughal@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Patrick,
> > > > > > > > > Thanks for the explanation. I supplied the heap size to the mapper in
> > > > > > > > > the following way:
> > > > > > > > >
> > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > > > > > >
> > > > > > > > > but I still get the same error. Any other ideas?
> > > > > > > > > Thanks
> > > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> > > > > > patrick@cloudera.com
> > > > > > > > > >wrote:
> > > > > > > > >
> > > > > > > > > > Shuja,
> > > > > > > > > >
> > > > > > > > > > Those settings (mapred.child.jvm.opts and
> > > mapred.child.ulimit)
> > > > > are
> > > > > > > only
> > > > > > > > > > used
> > > > > > > > > > for child JVMs that get forked by the TaskTracker. You
> are
> > > > using
> > > > > > > Hadoop
> > > > > > > > > > streaming, which means the TaskTracker is forking a JVM
> for
> > > > > > > streaming,
> > > > > > > > > > which
> > > > > > > > > > is then forking a shell process that runs your groovy
> code
> > > (in
> > > > > > > another
> > > > > > > > > > JVM).
> > > > > > > > > >
> > > > > > > > > > I'm not much of a groovy expert, but if there's a way you
> > can
> > > > > wrap
> > > > > > > your
> > > > > > > > > > code
> > > > > > > > > > around the MapReduce API that would work best. Otherwise,
> > you
> > > > can
> > > > > > > just
> > > > > > > > > pass
> > > > > > > > > > the heapsize in '-mapper' argument.
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > >
> > > > > > > > > > - Patrick
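[One way to act on this advice is to wrap the script so the forked groovy JVM gets its own heap setting. A sketch, assuming the stock Groovy launcher honors the JAVA_OPTS environment variable; the wrapper path is hypothetical:]

```shell
# Hypothetical wrapper: give the groovy child JVM its own -Xmx via JAVA_OPTS,
# then point the streaming -mapper flag at this wrapper instead of the script.
cat > /tmp/mapper-wrapper.sh <<'EOF'
#!/bin/sh
export JAVA_OPTS="-Xmx1024m"
exec /home/ftpuser1/Nodemapper5.groovy "$@"
EOF
chmod +x /tmp/mapper-wrapper.sh
sed -n '2p' /tmp/mapper-wrapper.sh   # show the JAVA_OPTS line
```

[With the wrapper in place, '-mapper /tmp/mapper-wrapper.sh' replaces the direct groovy invocation; an 'Xmx2000m' token appended to the -mapper string is passed to the script as an argument, not to the JVM.]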
> > > > > > > > > >
> > > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> > > > > > shujamughal@gmail.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Alex,
> > > > > > > > > > >
> > > > > > > > > > > I have updated Java to the latest available version on all
> > > > > > > > > > > machines in the cluster, and now I run the job with this line added:
> > > > > > > > > > >
> > > > > > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > > > > > >
> > > > > > > > > > > but still the same error. Here is the output of this job:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728 -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc> -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14 -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Any clue?
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> > > > > > alexvk@cloudera.com>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Shuja,
> > > > > > > > > > > >
> > > > > > > > > > > > First, thank you for using CDH3. Can you also check what
> > > > > > > > > > > > *mapred.child.ulimit* you are using? Try adding "*
> > > > > > > > > > > > -D mapred.child.ulimit=3145728*" to the command line.
> > > > > > > > > > > >
> > > > > > > > > > > > I would also recommend upgrading Java to JDK 1.6 update 8 at a
> > > > > > > > > > > > minimum, which you can download from the Java SE Homepage
> > > > > > > > > > > > <http://java.sun.com/javase/downloads/index.jsp>.
> > > > > > > > > > > >
> > > > > > > > > > > > Let me know how it goes.
> > > > > > > > > > > >
> > > > > > > > > > > > Alex K
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > > > > > > > shujamughal@gmail.com
> > > > > > > > > > > > >wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Alex
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yeah, I am running a job on a cluster of 2 machines, using the
> > > > > > > > > > > > > Cloudera distribution of Hadoop. Here is the output of this command:
> > > > > > > > > > > > >
> > > > > > > > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc> -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11 -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep Nodemapper5.groovy
> > > > > > > > > > > > >
> > > > > > > > > > > > > ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > > > > > > > And what is meant by OOM? Thanks for helping.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best Regards
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > > > > > > > alexvk@cloudera.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Shuja,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It looks like the OOM is happening in your code. Are you running
> > > > > > > > > > > > > > MapReduce in a cluster? If so, can you send the exact command line
> > > > > > > > > > > > > > your code is invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > > > > > > > > Nodemapper5.groovy' command on one of the nodes which is running
> > > > > > > > > > > > > > the task?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Alex K
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > > > > > > > shujamughal@gmail.com
> > > > > > > > > > > > > > >wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi All
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I am facing a hard problem. I am running a map reduce job using
> > > > > > > > > > > > > > > streaming but it fails and it gives the following error.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> > > > > > > > > > > > > > > failed with code 1
> > > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > > > > > > at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I have increased the heap size in hadoop-env.sh and made it 2000M.
> > > > > > > > > > > > > > > I also tell the job manually with the following line:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > but it still gives the error. The same job runs fine if I run it on
> > > > > > > > > > > > > > > the shell using a 1024M heap, like:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Any clue?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks in advance.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > Regards
> > > > > > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > > > > > _________________________________
> > > > > > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > > > > > > >
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
*Master Node output:*

             total       used       free     shared    buffers     cached
Mem:       2097328     515576    1581752          0      56060     254760
-/+ buffers/cache:     204756    1892572
Swap:       522104          0     522104

*Slave Node output:*

             total       used       free     shared    buffers     cached
Mem:       1048752     860684     188068          0     148388     570948
-/+ buffers/cache:     141348     907404
Swap:       522104         40     522064

It seems that there is more free memory on the master.
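A quick way to read these numbers (an editorial sketch, not part of the original mail): the "-/+ buffers/cache" row is the one that matters, because buffers and page cache are reclaimable. Recomputing the slave's row from its "Mem:" line:

```shell
# Values copied from the slave node's "Mem:" line above (kB).
used=860684; buffers=148388; cached=570948
# Subtracting reclaimable buffers/cache gives the truly committed memory,
# which matches the 141348 shown in the "-/+ buffers/cache" row.
really_used_kb=$(( used - buffers - cached ))
echo "really used: ${really_used_kb} kB"
```

So the slave still has roughly 907 MB genuinely available despite its low "free" figure.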
On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com> wrote:

> Maybe you do not have enough available memory on master? What is the
> output of "*free*" on both nodes? -- Alex K
>
> On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com> wrote:
>
> > Hi
> >
> > I have added the following property to the mapred-site.xml file on my
> > master node:
> >
> > <property>
> >   <name>mapred.child.ulimit</name>
> >   <value>3145728</value>
> > </property>
> >
> > and ran the job again, and wow..., the job completed on the 4th attempt.
> > I checked at 50030: Hadoop ran the job 3 times on the master server,
> > where it failed, but when it ran on the 2nd node it succeeded and
> > produced the desired result. Why did it fail on the master?
> > Thanks
> > Shuja
> >
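For reference, a minimal mapred-site.xml fragment combining the two properties discussed in this thread might look as follows (a sketch using the values from the thread, not a tested configuration; the TaskTrackers must be restarted after editing):

```xml
<!-- Sketch: values taken from this thread. -->
<property>
  <name>mapred.child.java.opts</name>
  <!-- heap for TaskTracker child JVMs -->
  <value>-Xmx2000m</value>
</property>
<property>
  <name>mapred.child.ulimit</name>
  <!-- virtual-memory cap in kB (~3 GB); must exceed the heap -->
  <value>3145728</value>
</property>
```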
> >
> > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com> wrote:
> >
> > > Hmm. It means your options are not propagated to the nodes. Can you put
> > > *mapred.child.ulimit* in mapred-site.xml and restart the tasktrackers?
> > > I was under the impression that the below should be enough, though.
> > > Glad you got it working in local mode. -- Alex K
> > >
> > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <sh...@gmail.com> wrote:
> > >
> > > > Hi Alex, I am using putty to connect to the servers, and this is
> > > > almost my maximum screen output which I sent; putty does not allow me
> > > > to increase the size of the terminal. Is there any other way to get
> > > > the complete output of ps -aef?
> > > >
> > > > Now I ran the following command and, thank God, it did not fail and
> > > > produced the desired output:
> > > >
> > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > -D mapred.child.ulimit=3145728 \
> > > > -jt local \
> > > > -inputformat StreamInputFormat \
> > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > -jobconf mapred.map.tasks=1 \
> > > > -jobconf mapred.reduce.tasks=0 \
> > > > -output RNC32 \
> > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
> > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > >
> > > > but when I omit -jt local, it produces the same error.
> > > > Thanks Alex for helping.
> > > > Regards
> > > > Shuja
> > > >
> > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com> wrote:
> > > >
> > > > > Hi Shuja,
> > > > >
> > > > > Java listens to the last -Xmx, so if you have multiple "-Xmx ..."
> > > > > options on the command line, the last one is valid. Unfortunately
> > > > > you have truncated command lines. Can you show us the full command
> > > > > line, particularly for process 26162? This seems to be causing
> > > > > problems.
> > > > >
> > > > > If you are running your cluster on 2 nodes, it may be that the task
> > > > > was scheduled on the second node. Did you run "ps -aef" on the
> > > > > second node as well? You can see the task assignment in the JT
> > > > > web-UI (http://jt-name:50030, drill down to tasks).
> > > > >
> > > > > I suggest you first debug your program in local mode, however (use
> > > > > the "-jt local" option). Did you try the "-D
> > > > > mapred.child.ulimit=3145728" option? I do not see it on the command
> > > > > line.
> > > > >
> > > > > Alex K
> > > > >
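A side note on the ulimit value (an editorial sketch, not from the original mail): mapred.child.ulimit is expressed in kilobytes and caps the virtual memory of the child process and everything it forks, so it must comfortably exceed the heap plus JVM overhead:

```shell
# 3145728 kB works out to a 3 GB cap, comfortably above a 2000 MB heap.
ulimit_kb=3145728
heap_mb=2000
echo "ulimit allows $(( ulimit_kb / 1024 )) MB for a ${heap_mb} MB heap"
```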
> > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > >
> > > > > > Hi Alex
> > > > > >
> > > > > > I have tried with using quotes and also with -jt local, but I get
> > > > > > the same heap error. Here is the output of ps -aef:
> > > > > >
> > > > > > UID PID PPID C STIME TTY TIME CMD
> > > > > > root 1 0 0 04:37 ? 00:00:00 init [3]
> > > > > > root 2 1 0 04:37 ? 00:00:00 [migration/0]
> > > > > > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> > > > > > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> > > > > > root 5 1 0 04:37 ? 00:00:00 [events/0]
> > > > > > root 6 1 0 04:37 ? 00:00:00 [khelper]
> > > > > > root 7 1 0 04:37 ? 00:00:00 [kthread]
> > > > > > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> > > > > > root 10 7 0 04:37 ? 00:00:00 [xenbus]
> > > > > > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> > > > > > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> > > > > > root 22 7 0 04:37 ? 00:00:00 [khubd]
> > > > > > root 24 7 0 04:37 ? 00:00:00 [kseriod]
> > > > > > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> > > > > > root 85 7 0 04:37 ? 00:00:00 [pdflush]
> > > > > > root 86 7 0 04:37 ? 00:00:00 [pdflush]
> > > > > > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> > > > > > root 88 7 0 04:37 ? 00:00:00 [aio/0]
> > > > > > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> > > > > > root 248 7 0 04:37 ? 00:00:00 [kstriped]
> > > > > > root 257 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > root 279 7 0 04:37 ? 00:00:00 [kauditd]
> > > > > > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> > > > > > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> > > > > > root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
> > > > > > root 660 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > root 662 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > > root 1032 1 0 04:38 ? 00:00:00 auditd
> > > > > > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> > > > > > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> > > > > > root 1052 1 0 04:38 ? 00:00:00 klogd -x
> > > > > > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> > > > > > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> > > > > > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
> > > > > > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> > > > > > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> > > > > > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> > > > > > root 1244 1 0 04:38 ? 00:00:00 pcscd
> > > > > > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd --server
> > > > > > root 1295 1 0 04:38 ? 00:00:00 automount
> > > > > > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> > > > > > root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> > > > > > root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> > > > > > root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting connections
> > > > > > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> > > > > > root 1379 1 0 04:38 ? 00:00:00 gpm -m /dev/input/mice -t exps2
> > > > > > root 1410 1 0 04:38 ? 00:00:00 crond
> > > > > > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
> > > > > > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> > > > > > 68 1508 1 0 04:38 ? 00:00:00 hald
> > > > > > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> > > > > > root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q never
> > > > > > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> > > > > > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> > > > > > root 1539 1 0 04:38 ? 00:00:00 /usr/libexec/gam_server
> > > > > > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
> > > > > > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> > > > > > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
> > > > > > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> > > > > > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
> > > > > > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> > > > > > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
> > > > > > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> > > > > > hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > hadoop 24893 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > hadoop 24988 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > hadoop 25085 1 0 12:01 ? 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > hadoop 25175 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> > > > > > root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> > > > > > hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> > > > > > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
> > > > > >
> > > > > >
> > > > > > *The command which I am executing is:*
> > > > > >
> > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > > -inputformat StreamInputFormat \
> > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > > -jobconf mapred.map.tasks=1 \
> > > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > > -output RNC25 \
> > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
> > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > > > > -jt local
> > > > > >
> > > > > > I have noticed that all the hadoop processes show the 2001m memory
> > > > > > size which I set in hadoop-env.sh, while on the command line I give
> > > > > > 2000 in the mapper and 1024 in child.java.opts; I think these
> > > > > > values (1024, 2000) are not in use. Secondly, the following lines
> > > > > >
> > > > > > *hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java
> > > > > > -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
> > > > > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > > > > > -Dscript.name=/usr/local/groovy/b*
> > > > > >
> > > > > > did not appear the first time the job ran. They appeared when the
> > > > > > job failed for the first time and then tried to start mapping
> > > > > > again. I have one more question: as all hadoop processes (namenode,
> > > > > > datanode, tasktracker...) show a 2001m heapsize in the process
> > > > > > list, does that mean all the processes are using 2001m of memory?
> > > > > >
> > > > > > Regards
> > > > > > Shuja
> > > > > >
> > > > > >
> > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <alexvk@cloudera.com> wrote:
> > > > > >
> > > > > > > Hi Shuja,
> > > > > > >
> > > > > > > I think you need to enclose the invocation string in quotes. Try:
> > > > > > >
> > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > > > > >
> > > > > > > Also, it would be nice to see how exactly the groovy is invoked.
> > > > > > > Is groovy started and then gives you OOM, or is the OOM error
> > > > > > > during the start? Can you see the new process with "ps -aef"?
> > > > > > >
> > > > > > > Can you run groovy in local mode? Try the "-jt local" option.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Alex K
> > > > > > >
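The effect of the quoting can be seen in a small shell sketch (illustrative only, not from the original mail): without the quotes the shell would split the mapper value into two separate arguments to hadoop.

```shell
# Simulate the argument list hadoop receives when the -mapper value is
# quoted: the command and its flag arrive as a single argument.
set -- -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m"
echo "argc: $#"
echo "mapper value: $2"
```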
> > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Patrick,
> > > > > > > > Thanks for the explanation. I have supplied the heapsize to the
> > > > > > > > mapper in the following way:
> > > > > > > >
> > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > > > > >
> > > > > > > > but still the same error. Any other idea?
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com> wrote:
> > > > > > > >
> > > > > > > > > Shuja,
> > > > > > > > >
> > > > > > > > > Those settings (mapred.child.jvm.opts and mapred.child.ulimit)
> > > > > > > > > are only used for child JVMs that get forked by the
> > > > > > > > > TaskTracker. You are using Hadoop streaming, which means the
> > > > > > > > > TaskTracker is forking a JVM for streaming, which is then
> > > > > > > > > forking a shell process that runs your groovy code (in another
> > > > > > > > > JVM).
> > > > > > > > >
> > > > > > > > > I'm not much of a groovy expert, but if there's a way you can
> > > > > > > > > wrap your code around the MapReduce API that would work best.
> > > > > > > > > Otherwise, you can just pass the heapsize in the '-mapper'
> > > > > > > > > argument.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > > - Patrick
> > > > > > > > >
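One way around the double fork Patrick describes is to give the heap setting to the Groovy child itself. This is a hedged sketch: it assumes the stock groovy launcher honours the JAVA_OPTS environment variable, and the wrapper name mapper5.sh is hypothetical.

```shell
# Create a wrapper script that exports JAVA_OPTS (read by the standard
# groovy startup script) so the forked Groovy JVM gets a larger heap.
cat > mapper5.sh <<'EOF'
#!/bin/sh
JAVA_OPTS="-Xmx1024m"
export JAVA_OPTS
exec groovy /home/ftpuser1/Nodemapper5.groovy "$@"
EOF
chmod +x mapper5.sh
```

The job would then be submitted with `-mapper mapper5.sh -file mapper5.sh -file Nodemapper5.groovy` so both files ship with the task.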
> > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Alex,
> > > > > > > > > >
> > > > > > > > > > I have updated Java to the latest available version on all
> > > > > > > > > > machines in the cluster and now I run the job by adding
> > > > > > > > > > this line:
> > > > > > > > > >
> > > > > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > > > > >
> > > > > > > > > > but still the same error. Here is the output of this job:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/junit-4.5.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/libfb303.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:
> > > > > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > > > org.apache.hadoop.util.RunJar
> > > > > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > > > > > > -inputformat StreamInputFormat -inputreader
> > > > > > > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0
> > > > > > > > > > -output RNC14 -mapper /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> > > > > > > > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
> > > > > > > > > >
> > > > > > > > > > Any clue?
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Shuja,
> > > > > > > > > > >
> > > > > > > > > > > First, thank you for using CDH3. Can you also check what
> > > > > > > > > > > *mapred.child.ulimit* you are using? Try adding
> > > > > > > > > > > "*-D mapred.child.ulimit=3145728*" to the command line.
> > > > > > > > > > >
> > > > > > > > > > > I would also recommend upgrading Java to JDK 1.6 update 8
> > > > > > > > > > > at a minimum, which you can download from the Java SE
> > > > > > > > > > > Homepage <http://java.sun.com/javase/downloads/index.jsp>.
> > > > > > > > > > >
> > > > > > > > > > > Let me know how it goes.
> > > > > > > > > > >
> > > > > > > > > > > Alex K
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Alex
> > > > > > > > > > > >
> > > > > > > > > > > > Yeah, I am running a job on a cluster of 2 machines and
> > > > > > > > > > > > using the Cloudera distribution of hadoop, and here is
> > > > > > > > > > > > the output of this command:
> > > > > > > > > > > >
> > > > > > > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/junit-4.5.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/libfb303.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:
> > > > > > > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > > > > > org.apache.hadoop.util.RunJar
> > > > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat
> > > > > > > StreamInputFormat
> > > > > > > > > > > > -inputreader StreamXmlRecordReader,begin=
> <mdc
> > > > > > > xmlns:HTML="
> > > > > > > > > > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > > > > > > > > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > > > > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> > > > > > > > mapred.map.tasks=1
> > > > > > > > > > > > -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > -mapper
> > > > > > > > > > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > > > > > > > > > > > home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep
> > > > > > > > > Nodemapper5.groovy
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > > > > > > and what is meant by OOM? And thanks for helping,
> > > > > > > > > > > >
> > > > > > > > > > > > Best Regards
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > > > > > > alexvk@cloudera.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Shuja,
> > > > > > > > > > > > >
> > > > > > > > > > > > It looks like the OOM is happening in your code. Are you running
> > > > > > > > > > > > MapReduce in a cluster? If so, can you send the exact command line your
> > > > > > > > > > > > code is invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > > > > > > Nodemapper5.groovy' command on one of the nodes which is running the task?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Alex K
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > > > > > > shujamughal@gmail.com
> > > > > > > > > > > > > >wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi All
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am facing a hard problem. I am running a map reduce job using streaming
> > > > > > > > > > > > > > but it fails and it gives the following error.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > > > > > at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have increased the heap size in hadoop-env.sh and made it 2000M. Also I
> > > > > > > > > > > > > > tell the job manually with the following line.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > but it still gives the error. The same job runs fine if I run it on the
> > > > > > > > > > > > > > shell using a 1024M heap size, like
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Any clue?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks in advance.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Regards
> > > > > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > > > > _________________________________
> > > > > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > > > > Cell: +92 3214207445
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Alex Kozlov <al...@cloudera.com>.
Maybe you do not have enough available memory on the master? What is the output
of "*free*" on both nodes? -- Alex K
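For anyone reading along, the classic procps `free` output can also be read programmatically to see how much memory is genuinely available (free plus reclaimable buffers/cache). A rough sketch; the sample numbers below are made up for illustration:

```python
# Parse classic procps `free` output (values in kilobytes) to estimate
# available memory. The sample text is illustrative, not real output.
sample = (
    "             total       used       free     shared    buffers     cached\n"
    "Mem:       4051556    3912156     139400          0      81236    2314312\n"
)
for line in sample.splitlines():
    if line.startswith("Mem:"):
        fields = line.split()
        total_kb, used_kb, free_kb = int(fields[1]), int(fields[2]), int(fields[3])
        buffers_kb, cached_kb = int(fields[5]), int(fields[6])
        # Buffers and page cache are reclaimable, so they count as available.
        available_kb = free_kb + buffers_kb + cached_kb
print(available_kb // 1024, "MB effectively available")
```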
On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com> wrote:
> Hi
> I have added following line to my master node mapred-site.xml file
>
> <property>
> <name>mapred.child.ulimit</name>
> <value>3145728</value>
> </property>
>
> and ran the job again, and wow... the job completed on the 4th attempt. I
> checked it at port 50030. Hadoop ran the job 3 times on the master server and it failed,
> but when it ran on the 2nd node, it succeeded and produced the desired result.
> Why did it fail on the master?
> Thanks
> Shuja
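For context, *mapred.child.ulimit* is specified in kilobytes, so the value above corresponds to roughly 3 GB of virtual memory per child process, comfortably above the 2000M heap the job requests. A quick sketch of the arithmetic (plain Python, not Hadoop code):

```python
# Sanity-check the relationship between mapred.child.ulimit (kilobytes)
# and the JVM heap (-Xmx, megabytes). The ulimit caps the child's total
# virtual memory, so it must leave headroom above the heap.
ULIMIT_KB = 3145728   # the value set in mapred-site.xml above
HEAP_MB = 2000        # -Xmx2000M from the streaming command

ulimit_mb = ULIMIT_KB // 1024
print(ulimit_mb)      # 3072 MB, i.e. 3 GB
assert ulimit_mb > HEAP_MB  # ~1 GB of headroom for JVM overhead
```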
>
>
> On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Hmm. It means your options are not propagated to the nodes. Can you put
> > *mapred.child.ulimit* in the mapred-site.xml and restart the tasktrackers?
> > I was under the impression that the below should be enough, though. Glad you
> > got it working in local mode. -- Alex K
> >
> > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <sh...@gmail.com>
> > wrote:
> >
> > > Hi Alex, I am using PuTTY to connect to the servers, and this is almost my
> > > maximum screen output which I sent. PuTTY does not allow me to increase the
> > > size of the terminal. Is there any other way to get the complete output of
> > > ps -aef?
> > >
> > > Now I ran the following command and, thank God, it did not fail and produced
> > > the desired output.
> > >
> > > hadoop jar
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> \
> > > -D mapred.child.java.opts=-Xmx1024m \
> > > -D mapred.child.ulimit=3145728 \
> > > -jt local \
> > > -inputformat StreamInputFormat \
> > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > -jobconf mapred.map.tasks=1 \
> > > -jobconf mapred.reduce.tasks=0 \
> > > -output RNC32 \
> > > -mapper /home/ftpuser1/Nodemapper5.groovy \
> > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > -file /home/ftpuser1/Nodemapper5.groovy
> > >
> > >
> > > but when I omit -jt local, it produces the same error.
> > > Thanks Alex for helping
> > > Regards
> > > Shuja
> > >
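For readers unfamiliar with the begin=/end= options: they tell StreamXmlRecordReader to treat everything between the opening <mdc ...> marker and the closing </mdc> as one input record. A rough Python illustration of that splitting (a simplification, not the actual Hadoop implementation):

```python
# Rough illustration of how begin=/end= markers carve an XML stream into
# records, one per mapper input. This simplifies StreamXmlRecordReader;
# it is not the real implementation.
def split_records(text, begin, end):
    records, pos = [], 0
    while True:
        start = text.find(begin, pos)
        if start == -1:
            break
        stop = text.find(end, start)
        if stop == -1:
            break  # no closing marker: discard the trailing partial record
        records.append(text[start:stop + len(end)])
        pos = stop + len(end)
    return records

doc = "<mdc><a>1</a></mdc><mdc><a>2</a></mdc>"
recs = split_records(doc, "<mdc", "</mdc>")
assert recs == ["<mdc><a>1</a></mdc>", "<mdc><a>2</a></mdc>"]
```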
> > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com>
> > wrote:
> > >
> > > > Hi Shuja,
> > > >
> > > > Java listens to the last -Xmx, so if you have multiple "-Xmx ..." options on
> > > > the command line, the last one is valid. Unfortunately you have truncated
> > > > command lines. Can you show us the full command line, particularly for
> > > > process 26162? This seems to be causing problems.
> > > >
> > > > If you are running your cluster on 2 nodes, it may be that the task was
> > > > scheduled on the second node. Did you run "ps -aef" on the second node as
> > > > well? You can see the task assignment in the JT web-UI
> > > > (http://jt-name:50030, drill down to tasks).
> > > >
> > > > I suggest you debug your program in local mode first, however (use the
> > > > "*-jt local*" option). Did you try the "*-D mapred.child.ulimit=3145728*"
> > > > option? I do not see it on the command line.
> > > >
> > > > Alex K
> > > >
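Alex's point, that the ulimit option never made it onto the command line, can be checked mechanically. A small sketch of scanning a streaming argument list for generic -D key=value options (a hypothetical helper, not part of Hadoop):

```python
# Scan a streaming argument list for "-D key=value" pairs, the way one
# might verify that mapred.child.ulimit actually reached the command
# line. Hypothetical helper, not part of Hadoop.
def extract_d_options(args):
    opts = {}
    for i, a in enumerate(args):
        if a == "-D" and i + 1 < len(args) and "=" in args[i + 1]:
            key, _, value = args[i + 1].partition("=")
            opts[key] = value
    return opts

argv = ["-D", "mapred.child.java.opts=-Xmx1024m",
        "-D", "mapred.child.ulimit=3145728",
        "-jt", "local"]
opts = extract_d_options(argv)
assert opts["mapred.child.ulimit"] == "3145728"
```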
> > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <
> shujamughal@gmail.com
> > > > >wrote:
> > > >
> > > > > Hi Alex
> > > > >
> > > > > I have tried with using quotes and also with -jt local but same
> heap
> > > > > error.
> > > > > and here is the output of ps -aef
> > > > >
> > > > > UID PID PPID C STIME TTY TIME CMD
> > > > > root 1 0 0 04:37 ? 00:00:00 init [3]
> > > > > root 2 1 0 04:37 ? 00:00:00 [migration/0]
> > > > > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> > > > > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> > > > > root 5 1 0 04:37 ? 00:00:00 [events/0]
> > > > > root 6 1 0 04:37 ? 00:00:00 [khelper]
> > > > > root 7 1 0 04:37 ? 00:00:00 [kthread]
> > > > > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> > > > > root 10 7 0 04:37 ? 00:00:00 [xenbus]
> > > > > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> > > > > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> > > > > root 22 7 0 04:37 ? 00:00:00 [khubd]
> > > > > root 24 7 0 04:37 ? 00:00:00 [kseriod]
> > > > > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> > > > > root 85 7 0 04:37 ? 00:00:00 [pdflush]
> > > > > root 86 7 0 04:37 ? 00:00:00 [pdflush]
> > > > > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> > > > > root 88 7 0 04:37 ? 00:00:00 [aio/0]
> > > > > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> > > > > root 248 7 0 04:37 ? 00:00:00 [kstriped]
> > > > > root 257 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > root 279 7 0 04:37 ? 00:00:00 [kauditd]
> > > > > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> > > > > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> > > > > root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
> > > > > root 660 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > root 662 7 0 04:37 ? 00:00:00 [kjournald]
> > > > > root 1032 1 0 04:38 ? 00:00:00 auditd
> > > > > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> > > > > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> > > > > root 1052 1 0 04:38 ? 00:00:00 klogd -x
> > > > > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> > > > > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> > > > > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
> > > > > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> > > > > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> > > > > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> > > > > root 1244 1 0 04:38 ? 00:00:00 pcscd
> > > > > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd --server
> > > > > root 1295 1 0 04:38 ? 00:00:00 automount
> > > > > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> > > > > root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> > > > > root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> > > > > root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting connections
> > > > > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> > > > > root 1379 1 0 04:38 ? 00:00:00 gpm -m /dev/input/mice -t exps2
> > > > > root 1410 1 0 04:38 ? 00:00:00 crond
> > > > > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
> > > > > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> > > > > 68 1508 1 0 04:38 ? 00:00:00 hald
> > > > > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> > > > > root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q never
> > > > > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> > > > > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> > > > > root 1539 1 0 04:38 ? 00:00:00 /usr/libexec/gam_server
> > > > > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
> > > > > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> > > > > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
> > > > > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> > > > > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
> > > > > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> > > > > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
> > > > > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> > > > > hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > hadoop 24893 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > hadoop 24988 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > hadoop 25085 1 0 12:01 ? 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > hadoop 25175 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> > > > > root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> > > > > hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> > > > > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
> > > > >
> > > > >
> > > > > *The command which I am executing is:*
> > > > >
> > > > >
> > > > > hadoop jar
> > > > >
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > \
> > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > -inputformat StreamInputFormat \
> > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > -jobconf mapred.map.tasks=1 \
> > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > -output RNC25 \
> > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
> > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > > > -jt local
> > > > >
> > > > > I have noticed that all the hadoop processes are showing the 2001m memory size
> > > > > which I have set in hadoop-env.sh. And on the command, I give 2000 in the mapper
> > > > > and 1024 in child.java.opts, but I think these values (1024, 2001) are not in
> > > > > use. Secondly, the following lines
> > > > > secondly the following lines
> > > > >
> > > > > *hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b*
> > > > >
> > > > > did not appear the first time the job runs. They appear when the job fails
> > > > > for the first time and then tries to start mapping again. I have one more
> > > > > question: all hadoop processes (namenode, datanode, tasktracker...) are
> > > > > showing a 2001m heapsize in the process list. Does it mean all the
> > > > > processes are using 2001m of memory?
> > > > >
> > > > > Regards
> > > > > Shuja
> > > > >
> > > > >
> > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com>
> > > > wrote:
> > > > >
> > > > > > Hi Shuja,
> > > > > >
> > > > > > I think you need to enclose the invocation string in quotes.
> Try:
> > > > > >
> > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > > > >
> > > > > > Also, it would be nice to see how exactly the groovy is invoked. Is
> > > > > > groovy started and then gives you OOM, or is the OOM error during the
> > > > > > start? Can you see the new process with "ps -aef"?
> > > > > >
> > > > > > Can you run groovy in local mode? Try "-jt local" option.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Alex K
> > > > > >
> > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <
> > shujamughal@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Patrick,
> > > > > > > Thanks for the explanation. I have supplied the heapsize to the mapper in the
> > > > > > > following way
> > > > > > >
> > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > > > >
> > > > > > > but still same error. Any other idea?
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> > > > patrick@cloudera.com
> > > > > > > >wrote:
> > > > > > >
> > > > > > > > Shuja,
> > > > > > > >
> > > > > > > > Those settings (mapred.child.java.opts and mapred.child.ulimit) are only
> > > > > > > > used for child JVMs that get forked by the TaskTracker. You are using
> > > > > > > > Hadoop streaming, which means the TaskTracker is forking a JVM for
> > > > > > > > streaming, which is then forking a shell process that runs your groovy
> > > > > > > > code (in another JVM).
> > > > > > > >
> > > > > > > > I'm not much of a groovy expert, but if there's a way you can wrap your
> > > > > > > > code around the MapReduce API, that would work best. Otherwise, you can
> > > > > > > > just pass the heapsize in the '-mapper' argument.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > >
> > > > > > > > - Patrick
> > > > > > > >
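The process chain Patrick describes, and which setting governs each JVM's heap, can be summarized as follows (a schematic sketch; the labels are illustrative, not Hadoop APIs):

```python
# Schematic of the three processes involved in a streaming task and the
# setting that controls each one's heap. Labels are illustrative only.
chain = [
    ("TaskTracker",          "HADOOP_HEAPSIZE in hadoop-env.sh"),
    ("streaming child JVM",  "mapred.child.java.opts (e.g. -Xmx2000M)"),
    ("forked groovy mapper", "heap options passed via the -mapper command itself"),
]
for process, controlled_by in chain:
    print(f"{process:22s} <- {controlled_by}")
# mapred.child.java.opts only reaches the middle JVM; the groovy process
# it forks starts with its own default heap unless told otherwise.
assert chain[1][1].startswith("mapred.child.java.opts")
```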
> > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> > > > shujamughal@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Alex,
> > > > > > > > >
> > > > > > > > > I have updated Java to the latest available version on all machines in the
> > > > > > > > > cluster, and now I run the job by adding this line
> > > > > > > > >
> > > > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > > > >
> > > > > > > > > but still the same error. Here is the output of this job.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
> > > > > > > > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> > > > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > > org.apache.hadoop.util.RunJar
> > > > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > > > > > -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> > > > > > > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Any clue?
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> > > > alexvk@cloudera.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Shuja,
> > > > > > > > > >
> > > > > > > > > > First, thank you for using CDH3. Can you also check what
> > > > > > > > > > *mapred.child.ulimit* you are using? Try adding "*-D
> > > > > > > > > > mapred.child.ulimit=3145728*" to the command line.
> > > > > > > > > >
> > > > > > > > > > I would also recommend upgrading Java to JDK 1.6 update 8 at a
> > > > > > > > > > minimum, which you can download from the Java SE
> > > > > > > > > > Homepage<http://java.sun.com/javase/downloads/index.jsp>.
> > > > > > > > > >
> > > > > > > > > > Let me know how it goes.
> > > > > > > > > >
> > > > > > > > > > Alex K
> > > > > > > > > >
> > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > > > > > shujamughal@gmail.com
> > > > > > > > > > >wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Alex
> > > > > > > > > > >
> > > > > > > > > > > Yeah, I am running a job on a cluster of 2 machines and using the
> > > > > > > > > > > Cloudera distribution of hadoop, and here is the output of this command.
> > > > > > > > > > >
> > > > > > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
> > > > > > > > > > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> > > > > > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > > > > > > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > > > > > > > > >
> > > > > > > >
> > > > >
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat
> > > > > > StreamInputFormat
> > > > > > > > > > > -inputreader StreamXmlRecordReader,begin= <mdc
> > > > > > xmlns:HTML="
> > > > > > > > > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > > > > > > > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > > > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> > > > > > > mapred.map.tasks=1
> > > > > > > > > > > -jobconf mapred.reduce.tasks=0 -output RNC11
> > > -mapper
> > > > > > > > > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep
> > > > > > > > Nodemapper5.groovy
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > > > > > And what is meant by OOM? Thanks for helping.
> > > > > > > > > > >
> > > > > > > > > > > Best Regards
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > > > > > alexvk@cloudera.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Shuja,
> > > > > > > > > > > >
> > > > > > > > > > > > It looks like the OOM is happening in your code. Are
> > you
> > > > > > running
> > > > > > > > > > > MapReduce
> > > > > > > > > > > > in a cluster? If so, can you send the exact command
> > line
> > > > > your
> > > > > > > code
> > > > > > > > > is
> > > > > > > > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > > > > Nodemapper5.groovy'
> > > > > > > > > > > > command on one of the nodes which is running the
> task?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Alex K
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > > > > > shujamughal@gmail.com
> > > > > > > > > > > > >wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi All
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am facing a hard problem. I am running a map
> reduce
> > > job
> > > > > > using
> > > > > > > > > > > streaming
> > > > > > > > > > > > > but it fails and it gives the following error.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > > > > at
> Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > > > >
> > > > > > > > > > > > > java.lang.RuntimeException:
> > > > PipeMapRed.waitOutputThreads():
> > > > > > > > > > subprocess
> > > > > > > > > > > > > failed with code 1
> > > > > > > > > > > > > at
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > > > > at
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > > > > >
> > > > > > > > > > > > > at
> > > > > > > > > > > >
> > > > > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > > > > at
> > > > > > > > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > > > > at
> > > > > > > > > > > > >
> > > > > > > > >
> > > > >
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > > > > at
> > > > > > > > > > >
> > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > > > > >
> > > > > > > > > > > > > at
> > > > > > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > > > > at
> > > > > org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have increased the heap size in hadoop-env.sh and
> > > make
> > > > it
> > > > > > > > 2000M.
> > > > > > > > > > Also
> > > > > > > > > > > I
> > > > > > > > > > > > > tell the job manually by following line.
> > > > > > > > > > > > >
> > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > > > > >
> > > > > > > > > > > > > but it still gives the error. The same job runs
> fine
> > if
> > > i
> > > > > run
> > > > > > > on
> > > > > > > > > > shell
> > > > > > > > > > > > > using
> > > > > > > > > > > > > 1024M heap size like
> > > > > > > > > > > > >
> > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Any clue?????????
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks in advance.
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Regards
> > > > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > > > _________________________________
> > > > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > >
> >
>
>
>
>
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Hi
I have added the following line to my master node's mapred-site.xml file:
<property>
<name>mapred.child.ulimit</name>
<value>3145728</value>
</property>
and ran the job again, and wow: the job completed on the 4th attempt. I
checked at port 50030. Hadoop ran the job 3 times on the master server and it
failed each time, but when it ran on the 2nd node it succeeded and produced
the desired result. Why did it fail on the master?
Thanks
Shuja
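For anyone hitting the same wall, the interplay of the two knobs can be sanity-checked with plain shell arithmetic. This is a rough sketch, assuming mapred.child.ulimit is interpreted in kilobytes of virtual memory while -Xmx takes megabytes; the variable names are mine, not Hadoop's:

```shell
#!/bin/sh
# Values from the job above: the ulimit is given in KB, the heap in MB.
HEAP_MB=1024        # from -D mapred.child.java.opts=-Xmx1024m
ULIMIT_KB=3145728   # from -D mapred.child.ulimit=3145728
ULIMIT_MB=$((ULIMIT_KB / 1024))

# The ulimit must exceed the heap by enough to cover non-heap JVM
# memory (thread stacks, native buffers), or the child is killed
# before the heap ever fills.
echo "ulimit=${ULIMIT_MB}MB heap=${HEAP_MB}MB headroom=$((ULIMIT_MB - HEAP_MB))MB"
if [ "$ULIMIT_MB" -le "$HEAP_MB" ]; then
    echo "WARNING: mapred.child.ulimit is smaller than the child heap" >&2
fi
```

With the numbers above this prints `ulimit=3072MB heap=1024MB headroom=2048MB`; the headroom shrinks considerably once the mapper asks for -Xmx2000M on top of the streaming child, which may be part of why the larger heap kept failing.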
On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com> wrote:
> Hmm. It means your options are not propagated to the nodes. Can you put *
> mapred.child.ulimit* in the mapred-site.xml and restart the tasktrackers?
> I was under the impression that the below should be enough, though. Glad you got
> it working in local mode. -- Alex K
>
> On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <sh...@gmail.com>
> wrote:
>
> > Hi Alex, I am using PuTTY to connect to the servers, and this is almost
> > the maximum screen output I can send; PuTTY does not let me increase the
> > size of the terminal. Is there any other way to get the complete output
> > of ps -aef?
> >
> > Now I ran the following command and, thankfully, it did not fail and
> > produced the desired output.
> >
> > hadoop jar
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > -D mapred.child.java.opts=-Xmx1024m \
> > -D mapred.child.ulimit=3145728 \
> > -jt local \
> > -inputformat StreamInputFormat \
> > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > -input
> >
> >
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > \
> > -jobconf mapred.map.tasks=1 \
> > -jobconf mapred.reduce.tasks=0 \
> > -output RNC32 \
> > -mapper /home/ftpuser1/Nodemapper5.groovy \
> > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > -file /home/ftpuser1/Nodemapper5.groovy
> >
> >
> > but when I omit -jt local, it produces the same error.
> > Thanks, Alex, for your help.
> > Regards
> > Shuja
> >
> > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> > > Hi Shuja,
> > >
> > > Java listens to the last xmx, so if you have multiple "-Xmx ..." on the
> > > command line, the last is valid. Unfortunately you have truncated
> > command
> > > lines. Can you show us the full command line, particularly for the
> > process
> > > 26162? This seems to be causing problems.
> > >
> > > If you are running your cluster on 2 nodes, it may be that the task was
> > > scheduled on the second node. Did you run "ps -aef" on the second node
> > as
> > > well? You can see the task assignment in the JT web-UI (
> > > http://jt-name:50030, drill down to tasks).
> > >
> > > I suggest you debug your program in local mode first, however (use
> > > "*-jt local*" option). Did you try the "*-D
> > mapred.child.ulimit=3145728*"
> > > option? I do not see it on the command line.
> > >
> > > Alex K
> > >
> > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com
> > > >wrote:
> > >
> > > > Hi Alex
> > > >
> > > > I have tried using quotes and also -jt local, but I get the same heap error.
> > > > and here is the output of ps -aef
> > > >
> > > > UID PID PPID C STIME TTY TIME CMD
> > > > root 1 0 0 04:37 ? 00:00:00 init [3]
> > > > root 2 1 0 04:37 ? 00:00:00 [migration/0]
> > > > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> > > > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> > > > root 5 1 0 04:37 ? 00:00:00 [events/0]
> > > > root 6 1 0 04:37 ? 00:00:00 [khelper]
> > > > root 7 1 0 04:37 ? 00:00:00 [kthread]
> > > > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> > > > root 10 7 0 04:37 ? 00:00:00 [xenbus]
> > > > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> > > > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> > > > root 22 7 0 04:37 ? 00:00:00 [khubd]
> > > > root 24 7 0 04:37 ? 00:00:00 [kseriod]
> > > > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> > > > root 85 7 0 04:37 ? 00:00:00 [pdflush]
> > > > root 86 7 0 04:37 ? 00:00:00 [pdflush]
> > > > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> > > > root 88 7 0 04:37 ? 00:00:00 [aio/0]
> > > > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> > > > root 248 7 0 04:37 ? 00:00:00 [kstriped]
> > > > root 257 7 0 04:37 ? 00:00:00 [kjournald]
> > > > root 279 7 0 04:37 ? 00:00:00 [kauditd]
> > > > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> > > > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> > > > root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
> > > > root 660 7 0 04:37 ? 00:00:00 [kjournald]
> > > > root 662 7 0 04:37 ? 00:00:00 [kjournald]
> > > > root 1032 1 0 04:38 ? 00:00:00 auditd
> > > > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> > > > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> > > > root 1052 1 0 04:38 ? 00:00:00 klogd -x
> > > > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> > > > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> > > > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
> > > > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> > > > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> > > > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> > > > root 1244 1 0 04:38 ? 00:00:00 pcscd
> > > > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd
> --server
> > > > root 1295 1 0 04:38 ? 00:00:00 automount
> > > > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> > > > root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive
> > > -pidfile
> > > > /var/run/xinetd.pid
> > > > root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd
> > > > /etc/vsftpd/vsftpd.conf
> > > > root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting
> > > > connections
> > > > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue
> > runner@01
> > > > :00:00
> > > > for /var/spool/clientmqueue
> > > > root 1379 1 0 04:38 ? 00:00:00 gpm -m
> /dev/input/mice
> > -t
> > > > exps2
> > > > root 1410 1 0 04:38 ? 00:00:00 crond
> > > > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
> > > > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> > > > 68 1508 1 0 04:38 ? 00:00:00 hald
> > > > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> > > > root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q
> > never
> > > > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0
> 9600
> > > > vt100-nav
> > > > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt
> > > > /usr/sbin/yum-updatesd
> > > > root 1539 1 0 04:38 ? 00:00:00
> /usr/libexec/gam_server
> > > > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
> > > > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> > > > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
> > > > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> > > > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
> > > > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> > > > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
> > > > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> > > > hadoop 24808 1 0 12:01 ? 00:00:02
> > /usr/jdk1.6.0_03/bin/java
> > > > -Xmx2001m -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote
> > > > -Dhadoop.lo
> > > > hadoop 24893 1 0 12:01 ? 00:00:01
> > /usr/jdk1.6.0_03/bin/java
> > > > -Xmx2001m -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote
> > > > -Dhadoop.lo
> > > > hadoop 24988 1 0 12:01 ? 00:00:01
> > /usr/jdk1.6.0_03/bin/java
> > > > -Xmx2001m -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote
> > > > -Dhadoop.lo
> > > > hadoop 25085 1 0 12:01 ? 00:00:00
> > /usr/jdk1.6.0_03/bin/java
> > > > -Xmx2001m -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote
> > > > -Dhadoop.lo
> > > > hadoop 25175 1 0 12:01 ? 00:00:01
> > /usr/jdk1.6.0_03/bin/java
> > > > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
> > > > -Dhadoop.log.file=hadoo
> > > > root 25925 21994 1 12:06 pts/2 00:00:00
> > /usr/jdk1.6.0_03/bin/java
> > > > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > -Dhadoop.log.file=hadoop.log -
> > > > hadoop 26120 25175 14 12:06 ? 00:00:01
> > > > /usr/jdk1.6.0_03/jre/bin/java
> > > >
> > > >
> > >
> >
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > hadoop 26162 26120 89 12:06 ? 00:00:05
> > /usr/jdk1.6.0_03/bin/java
> > > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > > > -Dscript.name=/usr/local/groovy/b
> > > > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
> > > >
> > > >
> > > > *The command which i am executing is *
> > > >
> > > >
> > > > hadoop jar
> > > >
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > \
> > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > -inputformat StreamInputFormat \
> > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > -input
> > > >
> > > >
> > >
> >
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > \
> > > > -jobconf mapred.map.tasks=1 \
> > > > -jobconf mapred.reduce.tasks=0 \
> > > > -output RNC25 \
> > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m"\
> > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > > -jt local
> > > >
> > > > I have noticed that all Hadoop processes show the 2001m heap size
> > > > which I have set in hadoop-env.sh. On the command line, I give 2000 in
> > > > the mapper and 1024 in child.java.opts, but I think these values
> > > > (1024, 2001) are not in use.
> > > > secondly the following lines
> > > >
> > > > *hadoop 26120 25175 14 12:06 ? 00:00:01
> > > > /usr/jdk1.6.0_03/jre/bin/java
> > > >
> > > >
> > >
> >
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > hadoop 26162 26120 89 12:06 ? 00:00:05
> > /usr/jdk1.6.0_03/bin/java
> > > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > > > -Dscript.name=/usr/local/groovy/b*
> > > >
> > > > did not appear the first time the job ran. They appeared after the job
> > > > failed for the first time and tried to start mapping again. I have one
> > > > more question: all Hadoop processes (namenode, datanode, tasktracker...)
> > > > show a 2001m heap size. Does that mean all of these processes are
> > > > using 2001m of memory?
> > > >
> > > > Regards
> > > > Shuja
> > > >
> > > >
> > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com>
> > > wrote:
> > > >
> > > > > Hi Shuja,
> > > > >
> > > > > I think you need to enclose the invocation string in quotes. Try:
> > > > >
> > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > > >
> > > > > Also, it would be nice to see how exactly the groovy is invoked. Is
> > > > > groovy started and then gives you the OOM, or does the OOM occur
> > > > > during startup? Can you see the new process with "ps -aef"?
> > > > >
> > > > > Can you run groovy in local mode? Try "-jt local" option.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Alex K
> > > > >
> > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <
> shujamughal@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Patrick,
> > > > > > Thanks for the explanation. I have supplied the heap size to the
> > > > > > mapper in the following way
> > > > > >
> > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > > >
> > > > > > but I still get the same error. Any other idea?
> > > > > > Thanks
> > > > > >
> > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> > > patrick@cloudera.com
> > > > > > >wrote:
> > > > > >
> > > > > > > Shuja,
> > > > > > >
> > > > > > > Those settings (mapred.child.java.opts and mapred.child.ulimit)
> > are
> > > > only
> > > > > > > used
> > > > > > > for child JVMs that get forked by the TaskTracker. You are
> using
> > > > Hadoop
> > > > > > > streaming, which means the TaskTracker is forking a JVM for
> > > > streaming,
> > > > > > > which
> > > > > > > is then forking a shell process that runs your groovy code (in
> > > > another
> > > > > > > JVM).
> > > > > > >
> > > > > > > I'm not much of a groovy expert, but if there's a way you can
> > wrap
> > > > your
> > > > > > > code
> > > > > > > around the MapReduce API that would work best. Otherwise, you
> can
> > > > just
> > > > > > pass
> > > > > > > the heapsize in '-mapper' argument.
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > - Patrick
> > > > > > >
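Patrick's point about the second JVM suggests one more avenue, sketched below: the stock groovy launcher script passes JAVA_OPTS through to the java process it spawns, so the mapper's heap can be set inside the -mapper string itself, independently of mapred.child.java.opts (which only sizes the streaming child). The env wrapper and the RNC_heap_test output directory are assumptions for illustration, not something tried in this thread:

```shell
# Sketch: give the forked Groovy JVM its own heap via JAVA_OPTS.
# mapred.child.java.opts still only applies to the streaming child JVM;
# the groovy launcher reads JAVA_OPTS when it starts java.
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
  -D mapred.child.java.opts=-Xmx512m \
  -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
  -output RNC_heap_test \
  -mapper "env JAVA_OPTS=-Xmx1024m /home/ftpuser1/Nodemapper5.groovy" \
  -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
  -file /home/ftpuser1/Nodemapper5.groovy
```

The same wrapper works when testing outside Hadoop, e.g. `cat file.xml | env JAVA_OPTS=-Xmx1024m /root/Nodemapper5.groovy`, which makes it easy to compare local and cluster behavior under the same heap.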
> > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> > > shujamughal@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Alex,
> > > > > > > >
> > > > > > > > I have updated Java to the latest available version on all
> > > > > > > > machines in the cluster, and now I run the job after adding
> > > > > > > > this line
> > > > > > > >
> > > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > > >
> > > > > > > > but I still get the same error. Here is the output of this job.
> > > > > > > >
> > > > > > > >
> > > > > > > > root 7845 5674 3 01:24 pts/1 00:00:00
> > > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > /usr/lib/hadoop-0.20/con
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> -0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
> > > > > > > > r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > org.apache.hadoop.util.RunJar
> > > > > > > >
> > > > >
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > -D
> > > > > > > > mapred.child.java.opts=-Xmx2000M -D
> > mapred.child.ulimit=3145728
> > > > > > > > -inputformat StreamInputFormat -inputreader
> > > > > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> > > > mapred.map.tasks=1
> > > > > > > > -jobconf mapred.reduce.tasks=0 -output RNC14 -mapper
> > > > > > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file
> > > > > > > > /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep
> > > > > Nodemapper5.groovy
> > > > > > > >
> > > > > > > >
> > > > > > > > Any clue?
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> > > alexvk@cloudera.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Shuja,
> > > > > > > > >
> > > > > > > > > First, thank you for using CDH3. Can you also check what
> > > > > > > > > *mapred.child.ulimit* you are using? Try adding "*
> > > > > > > > > -D mapred.child.ulimit=3145728*" to the command line.
> > > > > > > > >
> > > > > > > > > I would also recommend upgrading Java to JDK 1.6 update 8 at a minimum,
> > > > > > > > > which you can download from the Java SE
> > > > > > > > > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > > > > > > > > .
> > > > > > > > >
> > > > > > > > > Let me know how it goes.
> > > > > > > > >
> > > > > > > > > Alex K
> > > > > > > > >
> > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > > > > shujamughal@gmail.com
> > > > > > > > > >wrote:
> > > > > > > > >
> > > > > > > > > > Hi Alex
> > > > > > > > > >
> > > > > > > > > > Yeah, I am running a job on cluster of 2 machines and
> using
> > > > > > Cloudera
> > > > > > > > > > distribution of hadoop. and here is the output of this
> > > command.
> > > > > > > > > >
> > > > > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00
> > > > > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > > > > -Dhadoop.log.file=hadoop.log
> > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > > > > > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > > > > > > > >
> > > > > > >
> > > >
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat
> > > > > StreamInputFormat
> > > > > > > > > > -inputreader StreamXmlRecordReader,begin= <mdc
> > > > > xmlns:HTML="
> > > > > > > > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > > > > > > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> > > > > > mapred.map.tasks=1
> > > > > > > > > > -jobconf mapred.reduce.tasks=0 -output RNC11
> > -mapper
> > > > > > > > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > > > > > > > > > home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep
> > > > > > > Nodemapper5.groovy
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > > > > and what is meant by OOM? And thanks for helping,
> > > > > > > > > >
> > > > > > > > > > Best Regards
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Shuja,
> > > > > > > > > > >
> > > > > > > > > > > It looks like the OOM is happening in your code. Are you running
> > > > > > > > > > > MapReduce in a cluster? If so, can you send the exact command line
> > > > > > > > > > > your code is invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > > > > > Nodemapper5.groovy' command on one of the nodes which is running
> > > > > > > > > > > the task?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Alex K
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi All
> > > > > > > > > > > >
> > > > > > > > > > > > I am facing a hard problem. I am running a map reduce job using
> > > > > > > > > > > > streaming but it fails and it gives the following error.
> > > > > > > > > > > >
> > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > > >
> > > > > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > > > at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > > > at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > > > >
> > > > > > > > > > > > I have increased the heap size in hadoop-env.sh and made it 2000M.
> > > > > > > > > > > > I also tell the job manually with the following line.
> > > > > > > > > > > >
> > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > > > >
> > > > > > > > > > > > but it still gives the error. The same job runs fine if I run it
> > > > > > > > > > > > on the shell using a 1024M heap size like
> > > > > > > > > > > >
> > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > > > >
> > > > > > > > > > > > Any clue?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks in advance.
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Regards
> > > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > > _________________________________
> > > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Alex Kozlov <al...@cloudera.com>.
Hmm. It means your options are not propagated to the nodes. Can you put
*mapred.child.ulimit* in mapred-site.xml and restart the tasktrackers? I
was under the impression that the below should be enough, though. Glad you got
it working in local mode. -- Alex K
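For reference, a sketch of what that suggestion could look like in mapred-site.xml (the file path and values below are illustrative assumptions, not recommendations for this particular cluster):

```xml
<!-- /etc/hadoop/conf/mapred-site.xml (sketch; adjust values to the cluster) -->
<configuration>
  <property>
    <name>mapred.child.java.opts</name>
    <!-- heap for each child JVM forked by the TaskTracker -->
    <value>-Xmx1024m</value>
  </property>
  <property>
    <name>mapred.child.ulimit</name>
    <!-- virtual memory limit for child processes, in kilobytes -->
    <value>3145728</value>
  </property>
</configuration>
```

The tasktrackers pick up the new defaults only after a restart, which is why the restart step matters.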
On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <sh...@gmail.com> wrote:
> Hi Alex, I am using PuTTY to connect to the servers, and this is almost the
> maximum screen output which I sent. PuTTY does not allow me to increase the
> size of the terminal. Is there any other way to get the complete output of
> ps -aef?
>
> Now I run the following command and, thankfully, it did not fail and produced
> the desired output.
>
> hadoop jar
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> -D mapred.child.java.opts=-Xmx1024m \
> -D mapred.child.ulimit=3145728 \
> -jt local \
> -inputformat StreamInputFormat \
> -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> -input
>
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> \
> -jobconf mapred.map.tasks=1 \
> -jobconf mapred.reduce.tasks=0 \
> -output RNC32 \
> -mapper /home/ftpuser1/Nodemapper5.groovy \
> -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> -file /home/ftpuser1/Nodemapper5.groovy
>
>
> but when I omit "-jt local", it produces the same error.
> Thanks, Alex, for helping.
> Regards
> Shuja
>
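The begin/end markers in the command above can be sanity-checked locally with standard tools before submitting a job. This is a rough stand-in for what StreamXmlRecordReader does, not the actual class, and the sample file below is made up:

```shell
# Hypothetical two-record input; each <mdc ...>...</mdc> pair is one
# record, matching the begin/end markers passed to StreamXmlRecordReader.
cat > sample.xml <<'EOF'
<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml"><cell>1</cell></mdc>
<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml"><cell>2</cell></mdc>
EOF

# Count how many records the begin marker would open.
grep -c '^<mdc' sample.xml   # prints 2
```

Piping one such record through the mapper script reproduces the single-record shell test mentioned earlier in the thread.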
> On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Hi Shuja,
> >
> > Java listens to the last -Xmx, so if you have multiple "-Xmx ..." options
> > on the command line, the last one is valid. Unfortunately you have
> > truncated command lines. Can you show us the full command line,
> > particularly for process 26162? This seems to be causing problems.
> >
> > If you are running your cluster on 2 nodes, it may be that the task was
> > scheduled on the second node. Did you run "ps -aef" on the second node as
> > well? You can see the task assignment in the JT web-UI
> > (http://jt-name:50030, drill down to tasks).
> >
> > I suggest you debug your program in local mode first, however (use the
> > "*-jt local*" option). Did you try the "*-D mapred.child.ulimit=3145728*"
> > option? I do not see it on the command line.
> >
> > Alex K
> >
> > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com
> > >wrote:
> >
> > > Hi Alex
> > >
> > > I have tried with using quotes and also with -jt local but same heap
> > > error.
> > > and here is the output of ps -aef
> > >
> > > UID PID PPID C STIME TTY TIME CMD
> > > root 1 0 0 04:37 ? 00:00:00 init [3]
> > > root 2 1 0 04:37 ? 00:00:00 [migration/0]
> > > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> > > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> > > root 5 1 0 04:37 ? 00:00:00 [events/0]
> > > root 6 1 0 04:37 ? 00:00:00 [khelper]
> > > root 7 1 0 04:37 ? 00:00:00 [kthread]
> > > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> > > root 10 7 0 04:37 ? 00:00:00 [xenbus]
> > > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> > > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> > > root 22 7 0 04:37 ? 00:00:00 [khubd]
> > > root 24 7 0 04:37 ? 00:00:00 [kseriod]
> > > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> > > root 85 7 0 04:37 ? 00:00:00 [pdflush]
> > > root 86 7 0 04:37 ? 00:00:00 [pdflush]
> > > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> > > root 88 7 0 04:37 ? 00:00:00 [aio/0]
> > > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> > > root 248 7 0 04:37 ? 00:00:00 [kstriped]
> > > root 257 7 0 04:37 ? 00:00:00 [kjournald]
> > > root 279 7 0 04:37 ? 00:00:00 [kauditd]
> > > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> > > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> > > root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
> > > root 660 7 0 04:37 ? 00:00:00 [kjournald]
> > > root 662 7 0 04:37 ? 00:00:00 [kjournald]
> > > root 1032 1 0 04:38 ? 00:00:00 auditd
> > > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> > > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> > > root 1052 1 0 04:38 ? 00:00:00 klogd -x
> > > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> > > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> > > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
> > > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> > > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> > > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> > > root 1244 1 0 04:38 ? 00:00:00 pcscd
> > > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd --server
> > > root 1295 1 0 04:38 ? 00:00:00 automount
> > > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> > > root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> > > root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> > > root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting connections
> > > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> > > root 1379 1 0 04:38 ? 00:00:00 gpm -m /dev/input/mice -t exps2
> > > root 1410 1 0 04:38 ? 00:00:00 crond
> > > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
> > > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> > > 68 1508 1 0 04:38 ? 00:00:00 hald
> > > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> > > root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q never
> > > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> > > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> > > root 1539 1 0 04:38 ? 00:00:00 /usr/libexec/gam_server
> > > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
> > > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> > > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
> > > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> > > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
> > > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> > > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
> > > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> > > hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m
> > > -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > hadoop 24893 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m
> > > -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > hadoop 24988 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m
> > > -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > hadoop 25085 1 0 12:01 ? 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m
> > > -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > hadoop 25175 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m
> > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> > > root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m
> > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> > > hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java
> > > -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
> > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> > > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
> > >
> > >
> > > *The command which i am executing is *
> > >
> > >
> > > hadoop jar
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> \
> > > -D mapred.child.java.opts=-Xmx1024m \
> > > -inputformat StreamInputFormat \
> > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > -jobconf mapred.map.tasks=1 \
> > > -jobconf mapred.reduce.tasks=0 \
> > > -output RNC25 \
> > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
> > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > -jt local
> > >
> > > I have noticed that all the hadoop processes show the 2001 memory size
> > > which I have set in hadoop-env.sh, and on the command I give 2000 in the
> > > mapper and 1024 in child.java.opts, but I think these values (1024, 2001)
> > > are not in use. Secondly, the following lines
> > >
> > > *hadoop 26120 25175 14 12:06 ? 00:00:01 /usr/jdk1.6.0_03/jre/bin/java
> > > -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
> > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b*
> > >
> > > did not appear the first time the job runs. They appear when the job
> > > failed for the first time and then again tries to start mapping. I have
> > > one more question: as all hadoop processes (namenode, datanode,
> > > tasktracker...) show 2001 heapsize in the process list, does it mean all
> > > the processes are using 2001m of memory?
> > >
> > > Regards
> > > Shuja
> > >
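On the last question: -Xmx (the 2001m visible in ps) is only a ceiling on the heap, not what a process actually consumes. A hedged way to see real usage is the resident set size reported by ps (standard procps columns; java processes will only appear if any are running):

```shell
# RSS (resident set size, in KB) is the memory a process actually holds,
# as opposed to the -Xmx ceiling visible in its command line.
ps -eo pid,rss,comm | head -5
```

Comparing the RSS of each daemon against its -Xmx flag distinguishes "allowed to grow to 2001m" from "currently using 2001m".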
> > >
> > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com>
> > wrote:
> > >
> > > > Hi Shuja,
> > > >
> > > > I think you need to enclose the invocation string in quotes. Try:
> > > >
> > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > >
> > > > Also, it would be nice to see how exactly the groovy is invoked. Is
> > > > groovy started and then gives you OOM, or is the OOM error during the
> > > > start? Can you see the new process with "ps -aef"?
> > > >
> > > > Can you run groovy in local mode? Try "-jt local" option.
> > > >
> > > > Thanks,
> > > >
> > > > Alex K
> > > >
> > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <shujamughal@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hi Patrick,
> > > > > Thanks for the explanation. I have supplied the heapsize to the mapper
> > > > > in the following way
> > > > >
> > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > >
> > > > > but still the same error. Any other idea?
> > > > > Thanks
> > > > >
> > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> > patrick@cloudera.com
> > > > > >wrote:
> > > > >
> > > > > > Shuja,
> > > > > >
> > > > > > Those settings (mapred.child.java.opts and mapred.child.ulimit) are
> > > > > > only used for child JVMs that get forked by the TaskTracker. You are
> > > > > > using Hadoop streaming, which means the TaskTracker is forking a JVM
> > > > > > for streaming, which is then forking a shell process that runs your
> > > > > > groovy code (in another JVM).
> > > > > >
> > > > > > I'm not much of a groovy expert, but if there's a way you can wrap
> > > > > > your code around the MapReduce API, that would work best. Otherwise,
> > > > > > you can just pass the heapsize in the '-mapper' argument.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > - Patrick
> > > > > >
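Patrick's point about the grandchild JVM suggests one concrete workaround: wrap the Groovy script so the heap flag travels with it. This is a sketch under assumptions (the paths match this thread, and it assumes the stock groovy launcher script reads JAVA_OPTS):

```shell
# Generate a small wrapper that sets the Groovy JVM's heap and then
# exec's the real mapper script; ship it to the job with -file and
# point -mapper at it instead of the .groovy file.
cat > run_mapper.sh <<'EOF'
#!/bin/sh
# JAVA_OPTS is honored by the standard groovy launcher script.
export JAVA_OPTS="-Xmx1024m"
exec /usr/local/groovy/bin/groovy /home/ftpuser1/Nodemapper5.groovy "$@"
EOF
chmod +x run_mapper.sh
```

Used as `-mapper ./run_mapper.sh -file run_mapper.sh -file /home/ftpuser1/Nodemapper5.groovy`, the heap setting then applies to the JVM that actually runs the mapper code.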
> > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> > shujamughal@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Alex,
> > > > > > >
> > > > > > > I have updated Java to the latest available version on all
> > > > > > > machines in the cluster, and now I run the job by adding this line
> > > > > > >
> > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > >
> > > > > > > but still the same error. Here is the output of this job.
> > > > > > >
> > > > > > >
> > > > > > > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:
> > > > > > > /usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > > > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > > > -inputformat StreamInputFormat
> > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> > > > > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
> > > > > > >
> > > > > > >
> > > > > > > Any clue?
> > > > > > > Thanks
> > > > > > >
> > > > > > > > Nodemapper5.groovy'
> > > > > > > > > > command on one of the nodes which is running the task?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Alex K
> > > > > > > > > >
> > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > > > shujamughal@gmail.com
> > > > > > > > > > >wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi All
> > > > > > > > > > >
> > > > > > > > > > > I am facing a hard problem. I am running a map reduce
> job
> > > > using
> > > > > > > > > streaming
> > > > > > > > > > > but it fails and it gives the following error.
> > > > > > > > > > >
> > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > >
> > > > > > > > > > > java.lang.RuntimeException:
> > PipeMapRed.waitOutputThreads():
> > > > > > > > subprocess
> > > > > > > > > > > failed with code 1
> > > > > > > > > > > at
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > > at
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > > >
> > > > > > > > > > > at
> > > > > > > > > >
> > > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > > at
> > > > > > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > > at
> > > > > > > > > > >
> > > > > > >
> > > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > > at
> > > > > > > > >
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > > >
> > > > > > > > > > > at
> > > > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > > at
> > > org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I have increased the heap size in hadoop-env.sh and
> make
> > it
> > > > > > 2000M.
> > > > > > > > Also
> > > > > > > > > I
> > > > > > > > > > > tell the job manually by following line.
> > > > > > > > > > >
> > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > > >
> > > > > > > > > > > but it still gives the error. The same job runs fine if
> i
> > > run
> > > > > on
> > > > > > > > shell
> > > > > > > > > > > using
> > > > > > > > > > > 1024M heap size like
> > > > > > > > > > >
> > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Any clue?????????
> > > > > > > > > > >
> > > > > > > > > > > Thanks in advance.
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Regards
> > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > _________________________________
> > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards
> > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > _________________________________
> > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > Cell: +92 3214207445
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Regards
> > > > > > > Shuja-ur-Rehman Baig
> > > > > > > _________________________________
> > > > > > > MS CS - School of Science and Engineering
> > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > Cell: +92 3214207445
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards
> > > > > Shuja-ur-Rehman Baig
> > > > > _________________________________
> > > > > MS CS - School of Science and Engineering
> > > > > Lahore University of Management Sciences (LUMS)
> > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > Cell: +92 3214207445
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Hi Alex, I am using PuTTY to connect to the servers, and what I sent is almost my
maximum screen output; PuTTY does not let me increase the terminal size. Is there
any other way to get the complete output of ps -aef?
Now I have run the following command and, thankfully, it did not fail and produced
the desired output.
hadoop jar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
-D mapred.child.java.opts=-Xmx1024m \
-D mapred.child.ulimit=3145728 \
-jt local \
-inputformat StreamInputFormat \
-inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
-input
/user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
\
-jobconf mapred.map.tasks=1 \
-jobconf mapred.reduce.tasks=0 \
-output RNC32 \
-mapper /home/ftpuser1/Nodemapper5.groovy \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer \
-file /home/ftpuser1/Nodemapper5.groovy
But when I omit -jt local, it produces the same heap-space error.
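For the distributed case, one possible workaround is to hand a heap setting to the mapper's environment with streaming's -cmdenv option. This is only a sketch: it assumes the groovy launcher honors the standard JAVA_OPTS environment variable, which is not confirmed anywhere in this thread.

```shell
# Sketch only: -cmdenv exports an environment variable into the streaming
# mapper's environment. Whether the groovy launcher reads JAVA_OPTS here
# is an assumption, not something verified in this thread.
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
  -D mapred.child.java.opts=-Xmx1024m \
  -cmdenv JAVA_OPTS=-Xmx1024m \
  -mapper /home/ftpuser1/Nodemapper5.groovy \
  ...
```

(Command fragment only; the remaining -inputreader/-input/-output arguments are as in the job above.)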
Thanks Alex for helping
Regards
Shuja
On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com> wrote:
> Hi Shuja,
>
> Java listens to the last xmx, so if you have multiple "-Xmx ..." on the
> command line, the last is valid. Unfortunately you have truncated command
> lines. Can you show us the full command line, particularly for the process
> 26162? This seems to be causing problems.
>
> If you are running your cluster on 2 nodes, it may be that the task was
> scheduled on the second node. Did you run "ps -aef" on the second node as
> well? You can see the task assignment in the JT web-UI (
> http://jt-name:50030, drill down to tasks).
>
> I suggest you first debug your program in the local mode first, however
> (use
> "*-jt local*" option). Did you try the "*-D mapred.child.ulimit=3145728*"
> option? I do not see it on the command line.
>
> Alex K
>
> On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com
> >wrote:
>
> > Hi Alex
> >
> > I have tried with using quotes and also with -jt local but same heap
> > error.
> > and here is the output of ps -aef
> >
> > UID PID PPID C STIME TTY TIME CMD
> > root 1 0 0 04:37 ? 00:00:00 init [3]
> > root 2 1 0 04:37 ? 00:00:00 [migration/0]
> > root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> > root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> > root 5 1 0 04:37 ? 00:00:00 [events/0]
> > root 6 1 0 04:37 ? 00:00:00 [khelper]
> > root 7 1 0 04:37 ? 00:00:00 [kthread]
> > root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> > root 10 7 0 04:37 ? 00:00:00 [xenbus]
> > root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> > root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> > root 22 7 0 04:37 ? 00:00:00 [khubd]
> > root 24 7 0 04:37 ? 00:00:00 [kseriod]
> > root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> > root 85 7 0 04:37 ? 00:00:00 [pdflush]
> > root 86 7 0 04:37 ? 00:00:00 [pdflush]
> > root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> > root 88 7 0 04:37 ? 00:00:00 [aio/0]
> > root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> > root 248 7 0 04:37 ? 00:00:00 [kstriped]
> > root 257 7 0 04:37 ? 00:00:00 [kjournald]
> > root 279 7 0 04:37 ? 00:00:00 [kauditd]
> > root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> > root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> > root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
> > root 660 7 0 04:37 ? 00:00:00 [kjournald]
> > root 662 7 0 04:37 ? 00:00:00 [kjournald]
> > root 1032 1 0 04:38 ? 00:00:00 auditd
> > root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> > root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> > root 1052 1 0 04:38 ? 00:00:00 klogd -x
> > root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> > root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> > dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
> > root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> > root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> > root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> > root 1244 1 0 04:38 ? 00:00:00 pcscd
> > root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd --server
> > root 1295 1 0 04:38 ? 00:00:00 automount
> > root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> > root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive
> -pidfile
> > /var/run/xinetd.pid
> > root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd
> > /etc/vsftpd/vsftpd.conf
> > root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting
> > connections
> > smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01
> > :00:00
> > for /var/spool/clientmqueue
> > root 1379 1 0 04:38 ? 00:00:00 gpm -m /dev/input/mice -t
> > exps2
> > root 1410 1 0 04:38 ? 00:00:00 crond
> > xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
> > root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> > 68 1508 1 0 04:38 ? 00:00:00 hald
> > root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> > root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q never
> > root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0 9600
> > vt100-nav
> > root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt
> > /usr/sbin/yum-updatesd
> > root 1539 1 0 04:38 ? 00:00:00 /usr/libexec/gam_server
> > root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
> > root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> > root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
> > root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> > root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
> > root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> > root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
> > root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> > hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > -Dhadoop.lo
> > hadoop 24893 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > -Dhadoop.lo
> > hadoop 24988 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > -Dhadoop.lo
> > hadoop 25085 1 0 12:01 ? 00:00:00 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > -Dhadoop.lo
> > hadoop 25175 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
> > -Dhadoop.log.file=hadoo
> > root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > -Dhadoop.log.file=hadoop.log -
> > hadoop 26120 25175 14 12:06 ? 00:00:01
> > /usr/jdk1.6.0_03/jre/bin/java
> >
> >
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
> > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > -Dscript.name=/usr/local/groovy/b
> > root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
> >
> >
> > *The command which i am executing is *
> >
> >
> > hadoop jar
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > -D mapred.child.java.opts=-Xmx1024m \
> > -inputformat StreamInputFormat \
> > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
> > http://www.w3.org/TR/REC-xml\ <http://www.w3.org/TR/REC-xml%5C> <
> http://www.w3.org/TR/REC-xml%5C>">,end=</mdc>"
> > \
> > -input
> >
> >
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > \
> > -jobconf mapred.map.tasks=1 \
> > -jobconf mapred.reduce.tasks=0 \
> > -output RNC25 \
> > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m"\
> > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > -file /home/ftpuser1/Nodemapper5.groovy \
> > -jt local
> >
> > I have noticed that the all hadoop processes showing 2001 memory size
> which
> > i have set in hadoop-env.sh. and one the command, i give 2000 in mapper
> and
> > 1024 in child.java.opts but i think these values(1024,2001) are not in
> use.
> > secondly the following lines
> >
> > *hadoop 26120 25175 14 12:06 ? 00:00:01
> > /usr/jdk1.6.0_03/jre/bin/java
> >
> >
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
> > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > -Dscript.name=/usr/local/groovy/b*
> >
> > did not appear for first time when job runs. they appear when job failed
> > for
> > first time and then again try to start mapping. I have one more question
> > which is as all hadoop processes (namenode, datanode, tasktracker...)
> > showing 2001 heapsize in process. will it means all the processes using
> > 2001m of memory??
> >
> > Regards
> > Shuja
> >
> >
> > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> > > Hi Shuja,
> > >
> > > I think you need to enclose the invocation string in quotes. Try:
> > >
> > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > >
> > > Also, it would be nice to see how exactly the groovy is invoked. Is
> > groovy
> > > started and them gives you OOM or is OOM error during the start? Can
> you
> > > see the new process with "ps -aef"?
> > >
> > > Can you run groovy in local mode? Try "-jt local" option.
> > >
> > > Thanks,
> > >
> > > Alex K
> > >
> > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <sh...@gmail.com>
> > > wrote:
> > >
> > > > Hi Patrick,
> > > > Thanks for explanation. I have supply the heapsize in mapper in the
> > > > following way
> > > >
> > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > >
> > > > but still same error. Any other idea?
> > > > Thanks
> > > >
> > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> patrick@cloudera.com
> > > > >wrote:
> > > >
> > > > > Shuja,
> > > > >
> > > > > Those settings (mapred.child.jvm.opts and mapred.child.ulimit) are
> > only
> > > > > used
> > > > > for child JVMs that get forked by the TaskTracker. You are using
> > Hadoop
> > > > > streaming, which means the TaskTracker is forking a JVM for
> > streaming,
> > > > > which
> > > > > is then forking a shell process that runs your groovy code (in
> > another
> > > > > JVM).
> > > > >
> > > > > I'm not much of a groovy expert, but if there's a way you can wrap
> > your
> > > > > code
> > > > > around the MapReduce API that would work best. Otherwise, you can
> > just
> > > > pass
> > > > > the heapsize in '-mapper' argument.
> > > > >
> > > > > Regards,
> > > > >
> > > > > - Patrick
> > > > >
> > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> shujamughal@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Alex,
> > > > > >
> > > > > > I have update the java to latest available version on all
> machines
> > in
> > > > the
> > > > > > cluster and now i run the job by adding this line
> > > > > >
> > > > > > -D mapred.child.ulimit=3145728 \
> > > > > >
> > > > > > but still same error. Here is the output of this job.
> > > > > >
> > > > > >
> > > > > > root 7845 5674 3 01:24 pts/1 00:00:00
> > > > /usr/jdk1.6.0_03/bin/java
> > > > > > -Xmx10 23m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > -Dhadoop.log.file=hadoop.log -Dha
> > doop.home.dir=/usr/lib/hadoop-0.20
> > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,co nsole
> > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > [classpath elided: /usr/lib/hadoop-0.20/conf, tools.jar, and the hadoop-0.20 lib jars]
> > > > > > org.apache.hadoop.util.RunJar
> > > > > >
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > -D
> > > > > > mapre d.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > > -inputformat StreamIn putFormat -inputreader
> > > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w
> > > > > > 3.org/TR/REC-xml">,end=</mdc>
> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> > mapred.map.tasks=1
> > > > > > -jobconf m apred.reduce.tasks=0 -output RNC14 -mapper
> > > > > > /home/ftpuser1/Nodemapper5.groovy -re ducer
> > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file
> > > > > /home/ftpuser1/Nodemapp
> > > > > > er5.groovy
> > > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep
> > > Nodemapper5.groovy
> > > > > >
> > > > > >
> > > > > > Any clue?
> > > > > > Thanks
> > > > > >
> > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> alexvk@cloudera.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi Shuja,
> > > > > > >
> > > > > > > First, thank you for using CDH3. Can you also check what m*
> > > > > > > apred.child.ulimit* you are using? Try adding "*
> > > > > > > -D mapred.child.ulimit=3145728*" to the command line.
> > > > > > >
> > > > > > > I would also recommend to upgrade java to JDK 1.6 update 8 at a
> > > > > minimum,
> > > > > > > which you can download from the Java SE
> > > > > > > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > > > > > > .
> > > > > > >
> > > > > > > Let me know how it goes.
> > > > > > >
> > > > > > > Alex K
> > > > > > >
> > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > > shujamughal@gmail.com
> > > > > > > >wrote:
> > > > > > >
> > > > > > > > Hi Alex
> > > > > > > >
> > > > > > > > Yeah, I am running a job on cluster of 2 machines and using
> > > > Cloudera
> > > > > > > > distribution of hadoop. and here is the output of this
> command.
> > > > > > > >
> > > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00
> > > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib /hadoop-0.20/logs
> > > > > > > > -Dhadoop.log.file=hadoop.log
> > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > -Dhadoop.id.str= -Dhado op.root.logger=INFO,console
> > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > > [classpath elided: /usr/lib/hadoop-0.20/conf, tools.jar, and the hadoop-0.20 lib jars] org.apache.hadoop.util.RunJar
> > > > > > > >
> > > > >
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat
> > > StreamInputFormat
> > > > > > > > -inputreader StreamXmlRecordReader,begin= <mdc
> > > xmlns:HTML="
> > > > > > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > > > > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> > > > mapred.map.tasks=1
> > > > > > > > -jobconf mapred.reduce.tasks=0 -output RNC11 -mapper
> > > > > > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > > > > > > > home/ftpuser1/Nodemapper5.groovy
> > > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep
> > > > > Nodemapper5.groovy
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > > and what is meant by OOM and thanks for helping,
> > > > > > > >
> > > > > > > > Best Regards
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > > alexvk@cloudera.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Shuja,
> > > > > > > > >
> > > > > > > > > It looks like the OOM is happening in your code. Are you
> > > running
> > > > > > > > MapReduce
> > > > > > > > > in a cluster? If so, can you send the exact command line
> > your
> > > > code
> > > > > > is
> > > > > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > Nodemapper5.groovy'
> > > > > > > > > command on one of the nodes which is running the task?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Alex K
> > > > > > > > >
> > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > > shujamughal@gmail.com
> > > > > > > > > >wrote:
> > > > > > > > >
> > > > > > > > > > Hi All
> > > > > > > > > >
> > > > > > > > > > I am facing a hard problem. I am running a map reduce job
> > > using
> > > > > > > > streaming
> > > > > > > > > > but it fails and it gives the following error.
> > > > > > > > > >
> > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > >
> > > > > > > > > > java.lang.RuntimeException:
> PipeMapRed.waitOutputThreads():
> > > > > > > subprocess
> > > > > > > > > > failed with code 1
> > > > > > > > > > at
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > at
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > >
> > > > > > > > > > at
> > > > > > > > >
> > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > at
> > > > > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > at
> > > > > > > > > >
> > > > > >
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > at
> > > > > > > >
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > >
> > > > > > > > > > at
> > > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > at
> > org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I have increased the heap size in hadoop-env.sh and make
> it
> > > > > 2000M.
> > > > > > > Also
> > > > > > > > I
> > > > > > > > > > tell the job manually by following line.
> > > > > > > > > >
> > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > >
> > > > > > > > > > but it still gives the error. The same job runs fine if i
> > run
> > > > on
> > > > > > > shell
> > > > > > > > > > using
> > > > > > > > > > 1024M heap size like
> > > > > > > > > >
> > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Any clue?????????
> > > > > > > > > >
> > > > > > > > > > Thanks in advance.
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Regards
> > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > _________________________________
> > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > >
>
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Alex Kozlov <al...@cloudera.com>.
Hi Shuja,
The JVM honors only the last -Xmx, so if you have multiple "-Xmx ..." options on the
command line, the last one wins. Unfortunately, the command lines you pasted are
truncated. Can you show us the full command line, particularly for process
26162? That one seems to be causing the problems.
If you are running your cluster on 2 nodes, it may be that the task was
scheduled on the second node. Did you run "ps -aef" on the second node as
well? You can see the task assignment in the JT web-UI (
http://jt-name:50030, drill down to tasks).
I suggest you debug your program in local mode first, however (use
the "*-jt local*" option). Did you try the "*-D mapred.child.ulimit=3145728*"
option? I do not see it on the command line.
Alex K
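Alex's rule ("the last -Xmx wins") can be sketched in shell. This is only an illustration of how the JVM resolves duplicate flags, not Hadoop code; the function name `effective_xmx` is made up for the example:

```shell
# Mimic the JVM's resolution of duplicate -Xmx flags: a later
# occurrence overrides an earlier one, so only the last value matters.
effective_xmx() {
  heap=""
  for arg in "$@"; do
    case "$arg" in
      -Xmx*) heap="${arg#-Xmx}" ;;   # strip the -Xmx prefix, keep the value
    esac
  done
  printf '%s\n' "$heap"
}

# Flags like those in the ps listings above: the later -Xmx wins.
effective_xmx -Xmx1023m -Dhadoop.log.file=hadoop.log -Xmx2000M   # prints 2000M
```

So a 2000M setting appended after an earlier 1023m would take effect, but only if both flags actually reach the same JVM's command line.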
On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <sh...@gmail.com>wrote:
> Hi Alex
>
> I have tried with using quotes and also with -jt local but same heap
> error.
> and here is the output of ps -aef
>
> UID PID PPID C STIME TTY TIME CMD
> root 1 0 0 04:37 ? 00:00:00 init [3]
> root 2 1 0 04:37 ? 00:00:00 [migration/0]
> root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
> root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
> root 5 1 0 04:37 ? 00:00:00 [events/0]
> root 6 1 0 04:37 ? 00:00:00 [khelper]
> root 7 1 0 04:37 ? 00:00:00 [kthread]
> root 9 7 0 04:37 ? 00:00:00 [xenwatch]
> root 10 7 0 04:37 ? 00:00:00 [xenbus]
> root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
> root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
> root 22 7 0 04:37 ? 00:00:00 [khubd]
> root 24 7 0 04:37 ? 00:00:00 [kseriod]
> root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
> root 85 7 0 04:37 ? 00:00:00 [pdflush]
> root 86 7 0 04:37 ? 00:00:00 [pdflush]
> root 87 7 0 04:37 ? 00:00:00 [kswapd0]
> root 88 7 0 04:37 ? 00:00:00 [aio/0]
> root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
> root 248 7 0 04:37 ? 00:00:00 [kstriped]
> root 257 7 0 04:37 ? 00:00:00 [kjournald]
> root 279 7 0 04:37 ? 00:00:00 [kauditd]
> root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
> root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
> root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
> root 660 7 0 04:37 ? 00:00:00 [kjournald]
> root 662 7 0 04:37 ? 00:00:00 [kjournald]
> root 1032 1 0 04:38 ? 00:00:00 auditd
> root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
> root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
> root 1052 1 0 04:38 ? 00:00:00 klogd -x
> root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
> root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
> dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
> root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
> root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
> root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
> root 1244 1 0 04:38 ? 00:00:00 pcscd
> root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd --server
> root 1295 1 0 04:38 ? 00:00:00 automount
> root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
> root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive -pidfile
> /var/run/xinetd.pid
> root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd
> /etc/vsftpd/vsftpd.conf
> root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting
> connections
> smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01
> :00:00
> for /var/spool/clientmqueue
> root 1379 1 0 04:38 ? 00:00:00 gpm -m /dev/input/mice -t
> exps2
> root 1410 1 0 04:38 ? 00:00:00 crond
> xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
> root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
> 68 1508 1 0 04:38 ? 00:00:00 hald
> root 1509 1508 0 04:38 ? 00:00:00 hald-runner
> root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q never
> root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0 9600
> vt100-nav
> root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt
> /usr/sbin/yum-updatesd
> root 1539 1 0 04:38 ? 00:00:00 /usr/libexec/gam_server
> root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
> root 21024 21022 0 11:27 pts/0 00:00:00 -bash
> root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
> root 21105 21103 0 11:28 pts/1 00:00:00 -bash
> root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
> root 21994 21992 0 11:47 pts/2 00:00:00 -bash
> root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
> root 22437 22433 0 11:49 pts/3 00:00:00 -bash
> hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> -Dhadoop.lo
> hadoop 24893 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> -Dhadoop.lo
> hadoop 24988 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> -Dhadoop.lo
> hadoop 25085 1 0 12:01 ? 00:00:00 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> -Dhadoop.lo
> hadoop 25175 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
> -Dhadoop.log.file=hadoo
> root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> -Dhadoop.log.file=hadoop.log -
> hadoop 26120 25175 14 12:06 ? 00:00:01
> /usr/jdk1.6.0_03/jre/bin/java
>
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
> -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> -Dscript.name=/usr/local/groovy/b
> root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
>
>
> *The command which i am executing is *
>
>
> hadoop jar
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> -D mapred.child.java.opts=-Xmx1024m \
> -inputformat StreamInputFormat \
> -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
> http://www.w3.org/TR/REC-xml\ <http://www.w3.org/TR/REC-xml%5C>">,end=</mdc>"
> \
> -input
>
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> \
> -jobconf mapred.map.tasks=1 \
> -jobconf mapred.reduce.tasks=0 \
> -output RNC25 \
> -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m"\
> -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> -file /home/ftpuser1/Nodemapper5.groovy \
> -jt local
>
> I have noticed that the all hadoop processes showing 2001 memory size which
> i have set in hadoop-env.sh. and one the command, i give 2000 in mapper and
> 1024 in child.java.opts but i think these values(1024,2001) are not in use.
> secondly the following lines
>
> *hadoop 26120 25175 14 12:06 ? 00:00:01
> /usr/jdk1.6.0_03/jre/bin/java
>
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
> -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> -Dscript.name=/usr/local/groovy/b*
>
> did not appear for first time when job runs. they appear when job failed
> for
> first time and then again try to start mapping. I have one more question
> which is as all hadoop processes (namenode, datanode, tasktracker...)
> showing 2001 heapsize in process. will it means all the processes using
> 2001m of memory??
>
> Regards
> Shuja
>
>
> On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Hi Shuja,
> >
> > I think you need to enclose the invocation string in quotes. Try:
> >
> > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> >
> > Also, it would be nice to see how exactly the groovy is invoked. Is
> groovy
> > started and them gives you OOM or is OOM error during the start? Can you
> > see the new process with "ps -aef"?
> >
> > Can you run groovy in local mode? Try "-jt local" option.
> >
> > Thanks,
> >
> > Alex K
> >
> > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <sh...@gmail.com>
> > wrote:
> >
> > > Hi Patrick,
> > > Thanks for explanation. I have supply the heapsize in mapper in the
> > > following way
> > >
> > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > >
> > > but still same error. Any other idea?
> > > Thanks
> > >
> > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com
> > > >wrote:
> > >
> > > > Shuja,
> > > >
> > > > Those settings (mapred.child.jvm.opts and mapred.child.ulimit) are
> only
> > > > used
> > > > for child JVMs that get forked by the TaskTracker. You are using
> Hadoop
> > > > streaming, which means the TaskTracker is forking a JVM for
> streaming,
> > > > which
> > > > is then forking a shell process that runs your groovy code (in
> another
> > > > JVM).
> > > >
> > > > I'm not much of a groovy expert, but if there's a way you can wrap
> your
> > > > code
> > > > around the MapReduce API that would work best. Otherwise, you can
> just
> > > pass
> > > > the heapsize in '-mapper' argument.
> > > >
> > > > Regards,
> > > >
> > > > - Patrick
> > > >
> > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <shujamughal@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hi Alex,
> > > > >
> > > > > I have update the java to latest available version on all machines
> in
> > > the
> > > > > cluster and now i run the job by adding this line
> > > > >
> > > > > -D mapred.child.ulimit=3145728 \
> > > > >
> > > > > but still same error. Here is the output of this job.
> > > > >
> > > > >
> > > > > root 7845 5674 3 01:24 pts/1 00:00:00
> > > /usr/jdk1.6.0_03/bin/java
> > > > > -Xmx10 23m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > -Dhadoop.log.file=hadoop.log -Dha
> doop.home.dir=/usr/lib/hadoop-0.20
> > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,co nsole
> > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > /usr/lib/hadoop-0.20/con
> > > > >
> > > > >
> > > >
> > >
> >
> f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
> > > > >
> > > > >
> > > >
> > >
> >
> p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
> > > > >
> > > > >
> > > >
> > >
> >
> op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
> > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
> > > > >
> > > > >
> > > >
> > >
> >
> mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
> > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
> > > > >
> > > > >
> > > >
> > >
> >
> .jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
> > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
> > > > >
> > > > >
> > > >
> > >
> >
> 8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
> > > > >
> > > > >
> > > >
> > >
> >
> core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
> > > > >
> > > > >
> > > >
> > >
> >
> b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
> > > > >
> > > > >
> > > >
> > >
> >
> time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
> > > > >
> > > > >
> > > >
> > >
> >
> ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
> > > > >
> > > > >
> > > >
> > >
> >
> op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
> > > > >
> > > > >
> > > >
> > >
> >
> -0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
> > > > >
> > > > >
> > > >
> > >
> >
> 20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
> > > > >
> > > > >
> > > >
> > >
> >
> hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
> > > > >
> > > > >
> > > >
> > >
> >
> 2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
> > > > >
> > > > >
> > > >
> > >
> >
> 20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
> > > > >
> > > > >
> > > >
> > >
> >
> /lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
> > > > > r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > org.apache.hadoop.util.RunJar
> > > > >
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > -D
> > > > > mapre d.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > -inputformat StreamIn putFormat -inputreader
> > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w
> > > > > 3.org/TR/REC-xml">,end=</mdc>
> > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> mapred.map.tasks=1
> > > > > -jobconf m apred.reduce.tasks=0 -output RNC14 -mapper
> > > > > /home/ftpuser1/Nodemapper5.groovy -re ducer
> > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file
> > > > /home/ftpuser1/Nodemapp
> > > > > er5.groovy
> > > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep
> > Nodemapper5.groovy
> > > > >
> > > > >
> > > > > Any clue?
> > > > > Thanks
> > > > >
> > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com>
> > > > wrote:
> > > > >
> > > > > > Hi Shuja,
> > > > > >
> > > > > > First, thank you for using CDH3. Can you also check what m*
> > > > > > apred.child.ulimit* you are using? Try adding "*
> > > > > > -D mapred.child.ulimit=3145728*" to the command line.
> > > > > >
> > > > > > I would also recommend to upgrade java to JDK 1.6 update 8 at a
> > > > minimum,
> > > > > > which you can download from the Java SE
> > > > > > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > > > > > .
> > > > > >
> > > > > > Let me know how it goes.
> > > > > >
> > > > > > Alex K
> > > > > >
> > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > shujamughal@gmail.com
> > > > > > >wrote:
> > > > > >
> > > > > > > Hi Alex
> > > > > > >
> > > > > > > Yeah, I am running a job on cluster of 2 machines and using
> > > Cloudera
> > > > > > > distribution of hadoop. and here is the output of this command.
> > > > > > >
> > > > > > > root 5277 5238 3 12:51 pts/2 00:00:00
> > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib /hadoop-0.20/logs
> > > > > > > -Dhadoop.log.file=hadoop.log
> > -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > -Dhadoop.id.str= -Dhado op.root.logger=INFO,console
> > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > /usr/lib/hadoop-0.20/conf:/usr/
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > > > > >
> > > >
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat
> > StreamInputFormat
> > > > > > > -inputreader StreamXmlRecordReader,begin= <mdc
> > xmlns:HTML="
> > > > > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > > > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> > > mapred.map.tasks=1
> > > > > > > -jobconf mapred.reduce.tasks=0 -output RNC11 -mapper
> > > > > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > > > > > > home/ftpuser1/Nodemapper5.groovy
> > > > > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep
> > > > Nodemapper5.groovy
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > and what is meant by OOM and thanks for helping,
> > > > > > >
> > > > > > > Best Regards
> > > > > > >
> > > > > > >
> > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > alexvk@cloudera.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Shuja,
> > > > > > > >
> > > > > > > > It looks like the OOM is happening in your code. Are you
> > running
> > > > > > > MapReduce
> > > > > > > > in a cluster? If so, can you send the exact command line
> your
> > > code
> > > > > is
> > > > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > > > Nodemapper5.groovy'
> > > > > > > > command on one of the nodes which is running the task?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Alex K
> > > > > > > >
> > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > shujamughal@gmail.com
> > > > > > > > >wrote:
> > > > > > > >
> > > > > > > > > Hi All
> > > > > > > > >
> > > > > > > > > I am facing a hard problem. I am running a map reduce job
> > using
> > > > > > > streaming
> > > > > > > > > but it fails and it gives the following error.
> > > > > > > > >
> > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > >
> > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > > > > > subprocess
> > > > > > > > > failed with code 1
> > > > > > > > > at
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > at
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > >
> > > > > > > > > at
> > > > > > > >
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > at
> > > > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > at
> > > > > > > > >
> > > > >
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > at
> > > > > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > >
> > > > > > > > > at
> > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > at
> org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I have increased the heap size in hadoop-env.sh and make it
> > > > 2000M.
> > > > > > Also
> > > > > > > I
> > > > > > > > > tell the job manually by following line.
> > > > > > > > >
> > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > >
> > > > > > > > > but it still gives the error. The same job runs fine if i
> run
> > > on
> > > > > > shell
> > > > > > > > > using
> > > > > > > > > 1024M heap size like
> > > > > > > > >
> > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Any clue?????????
> > > > > > > > >
> > > > > > > > > Thanks in advance.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards
> > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > _________________________________
> > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > Cell: +92 3214207445
> > > > > > > > >
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Hi Alex
I have tried using quotes and also -jt local, but I get the same heap error.
Here is the output of ps -aef:
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 04:37 ? 00:00:00 init [3]
root 2 1 0 04:37 ? 00:00:00 [migration/0]
root 3 1 0 04:37 ? 00:00:00 [ksoftirqd/0]
root 4 1 0 04:37 ? 00:00:00 [watchdog/0]
root 5 1 0 04:37 ? 00:00:00 [events/0]
root 6 1 0 04:37 ? 00:00:00 [khelper]
root 7 1 0 04:37 ? 00:00:00 [kthread]
root 9 7 0 04:37 ? 00:00:00 [xenwatch]
root 10 7 0 04:37 ? 00:00:00 [xenbus]
root 17 7 0 04:37 ? 00:00:00 [kblockd/0]
root 18 7 0 04:37 ? 00:00:00 [cqueue/0]
root 22 7 0 04:37 ? 00:00:00 [khubd]
root 24 7 0 04:37 ? 00:00:00 [kseriod]
root 84 7 0 04:37 ? 00:00:00 [khungtaskd]
root 85 7 0 04:37 ? 00:00:00 [pdflush]
root 86 7 0 04:37 ? 00:00:00 [pdflush]
root 87 7 0 04:37 ? 00:00:00 [kswapd0]
root 88 7 0 04:37 ? 00:00:00 [aio/0]
root 229 7 0 04:37 ? 00:00:00 [kpsmoused]
root 248 7 0 04:37 ? 00:00:00 [kstriped]
root 257 7 0 04:37 ? 00:00:00 [kjournald]
root 279 7 0 04:37 ? 00:00:00 [kauditd]
root 307 1 0 04:37 ? 00:00:00 /sbin/udevd -d
root 634 7 0 04:37 ? 00:00:00 [kmpathd/0]
root 635 7 0 04:37 ? 00:00:00 [kmpath_handlerd]
root 660 7 0 04:37 ? 00:00:00 [kjournald]
root 662 7 0 04:37 ? 00:00:00 [kjournald]
root 1032 1 0 04:38 ? 00:00:00 auditd
root 1034 1032 0 04:38 ? 00:00:00 /sbin/audispd
root 1049 1 0 04:38 ? 00:00:00 syslogd -m 0
root 1052 1 0 04:38 ? 00:00:00 klogd -x
root 1090 7 0 04:38 ? 00:00:00 [rpciod/0]
root 1158 1 0 04:38 ? 00:00:00 rpc.idmapd
dbus 1171 1 0 04:38 ? 00:00:00 dbus-daemon --system
root 1184 1 0 04:38 ? 00:00:00 /usr/sbin/hcid
root 1190 1 0 04:38 ? 00:00:00 /usr/sbin/sdpd
root 1210 1 0 04:38 ? 00:00:00 [krfcommd]
root 1244 1 0 04:38 ? 00:00:00 pcscd
root 1264 1 0 04:38 ? 00:00:00 /usr/bin/hidd --server
root 1295 1 0 04:38 ? 00:00:00 automount
root 1314 1 0 04:38 ? 00:00:00 /usr/sbin/sshd
root 1326 1 0 04:38 ? 00:00:00 xinetd -stayalive -pidfile
/var/run/xinetd.pid
root 1337 1 0 04:38 ? 00:00:00 /usr/sbin/vsftpd
/etc/vsftpd/vsftpd.conf
root 1354 1 0 04:38 ? 00:00:00 sendmail: accepting
connections
smmsp 1362 1 0 04:38 ? 00:00:00 sendmail: Queue runner@01:00:00
for /var/spool/clientmqueue
root 1379 1 0 04:38 ? 00:00:00 gpm -m /dev/input/mice -t
exps2
root 1410 1 0 04:38 ? 00:00:00 crond
xfs 1450 1 0 04:38 ? 00:00:00 xfs -droppriv -daemon
root 1482 1 0 04:38 ? 00:00:00 /usr/sbin/atd
68 1508 1 0 04:38 ? 00:00:00 hald
root 1509 1508 0 04:38 ? 00:00:00 hald-runner
root 1533 1 0 04:38 ? 00:00:00 /usr/sbin/smartd -q never
root 1536 1 0 04:38 xvc0 00:00:00 /sbin/agetty xvc0 9600
vt100-nav
root 1537 1 0 04:38 ? 00:00:00 /usr/bin/python -tt
/usr/sbin/yum-updatesd
root 1539 1 0 04:38 ? 00:00:00 /usr/libexec/gam_server
root 21022 1314 0 11:27 ? 00:00:00 sshd: root@pts/0
root 21024 21022 0 11:27 pts/0 00:00:00 -bash
root 21103 1314 0 11:28 ? 00:00:00 sshd: root@pts/1
root 21105 21103 0 11:28 pts/1 00:00:00 -bash
root 21992 1314 0 11:47 ? 00:00:00 sshd: root@pts/2
root 21994 21992 0 11:47 pts/2 00:00:00 -bash
root 22433 1314 0 11:49 ? 00:00:00 sshd: root@pts/3
root 22437 22433 0 11:49 pts/3 00:00:00 -bash
hadoop 24808 1 0 12:01 ? 00:00:02 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dhadoop.lo
hadoop 24893 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dhadoop.lo
hadoop 24988 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dhadoop.lo
hadoop 25085 1 0 12:01 ? 00:00:00 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dhadoop.lo
hadoop 25175 1 0 12:01 ? 00:00:01 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
-Dhadoop.log.file=hadoo
root 25925 21994 1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
-Dhadoop.log.file=hadoop.log -
hadoop 26120 25175 14 12:06 ? 00:00:01
/usr/jdk1.6.0_03/jre/bin/java
-Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
-classpath /usr/local/groovy/lib/groovy-1.7.3.jar
-Dscript.name=/usr/local/groovy/b
root 26185 22437 0 12:07 pts/3 00:00:00 ps -aef
*The command which I am executing is:*
hadoop jar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
-D mapred.child.java.opts=-Xmx1024m \
-inputformat StreamInputFormat \
-inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
-input
/user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
\
-jobconf mapred.map.tasks=1 \
-jobconf mapred.reduce.tasks=0 \
-output RNC25 \
-mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m"\
-reducer org.apache.hadoop.mapred.lib.IdentityReducer \
-file /home/ftpuser1/Nodemapper5.groovy \
-jt local
I have noticed that all Hadoop processes show the 2001m heap size that I set
in hadoop-env.sh. On the command line I pass 2000m to the mapper and 1024m via
mapred.child.java.opts, but I think neither of these values (1024, 2000) is in use.
Secondly, the following lines
*hadoop 26120 25175 14 12:06 ? 00:00:01
/usr/jdk1.6.0_03/jre/bin/java
-Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
hadoop 26162 26120 89 12:06 ? 00:00:05 /usr/jdk1.6.0_03/bin/java
-classpath /usr/local/groovy/lib/groovy-1.7.3.jar
-Dscript.name=/usr/local/groovy/b*
did not appear the first time the job ran; they appear after the job fails for
the first time and then tries to start mapping again. I have one more question:
since all Hadoop daemons (namenode, datanode, tasktracker, ...) show a 2001m
heap size in their process listings, does that mean each of these processes is
actually using 2001m of memory?
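One thing worth noting about the invocation above (a hedged suggestion, not tested against this cluster): in `-mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m"`, the `-Xmx2000m` is passed as an argument *to the script*, not to the JVM that runs it. The stock groovy launcher script reads the `JAVA_OPTS` environment variable, so one way to reach that grandchild JVM is streaming's `-cmdenv` option, assuming the launcher on the task nodes honors `JAVA_OPTS` as the standard one does:

```shell
# Sketch only -- not run against this cluster. The key change: drop the
# "-Xmx2000m" script argument and set JAVA_OPTS for the mapper's own JVM
# via streaming's -cmdenv option (inputreader and input args abbreviated).
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
  -D mapred.child.java.opts=-Xmx1024m \
  -cmdenv JAVA_OPTS=-Xmx1024m \
  -inputformat StreamInputFormat \
  -inputreader "StreamXmlRecordReader,begin=<mdc ...>,end=</mdc>" \
  -input <input-path> \
  -output RNC25 \
  -mapper /home/ftpuser1/Nodemapper5.groovy \
  -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
  -file /home/ftpuser1/Nodemapper5.groovy
```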
Regards
Shuja
On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com> wrote:
> Hi Shuja,
>
> I think you need to enclose the invocation string in quotes. Try:
>
> -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
>
> Also, it would be nice to see how exactly the groovy is invoked. Is groovy
> started and them gives you OOM or is OOM error during the start? Can you
> see the new process with "ps -aef"?
>
> Can you run groovy in local mode? Try "-jt local" option.
>
> Thanks,
>
> Alex K
>
> On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <sh...@gmail.com>
> wrote:
>
> > Hi Patrick,
> > Thanks for explanation. I have supply the heapsize in mapper in the
> > following way
> >
> > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> >
> > but still same error. Any other idea?
> > Thanks
> >
> > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com
> > >wrote:
> >
> > > Shuja,
> > >
> > > Those settings (mapred.child.jvm.opts and mapred.child.ulimit) are only
> > > used
> > > for child JVMs that get forked by the TaskTracker. You are using Hadoop
> > > streaming, which means the TaskTracker is forking a JVM for streaming,
> > > which
> > > is then forking a shell process that runs your groovy code (in another
> > > JVM).
> > >
> > > I'm not much of a groovy expert, but if there's a way you can wrap your
> > > code
> > > around the MapReduce API that would work best. Otherwise, you can just
> > pass
> > > the heapsize in '-mapper' argument.
> > >
> > > Regards,
> > >
> > > - Patrick
> > >
> > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <sh...@gmail.com>
> > > wrote:
> > >
> > > > Hi Alex,
> > > >
> > > > I have update the java to latest available version on all machines in
> > the
> > > > cluster and now i run the job by adding this line
> > > >
> > > > -D mapred.child.ulimit=3145728 \
> > > >
> > > > but still same error. Here is the output of this job.
> > > >
> > > >
> > > > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:
> > > > /usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:
> > > > /usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:
> > > > /usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:
> > > > /usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:
> > > > /usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:
> > > > /usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:
> > > > /usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:
> > > > /usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:
> > > > /usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:
> > > > /usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:
> > > > /usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:
> > > > /usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:
> > > > /usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:
> > > > /usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:
> > > > /usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:
> > > > /usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:
> > > > /usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > org.apache.hadoop.util.RunJar
> > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > -inputformat StreamInputFormat -inputreader
> > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
> > > >
> > > >
> > > > Any clue?
> > > > Thanks
> > > >
> > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com>
> > > wrote:
> > > >
> > > > > [...]
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Alex Kozlov <al...@cloudera.com>.
Hi Shuja,
I think you need to enclose the invocation string in quotes. Try:
-mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
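(For what it's worth: even inside quotes, anything after the script path is handed to the script as an ordinary argument, not to the JVM. A minimal sketch with a /bin/sh stand-in for the groovy script makes this visible; argv_demo.sh is a hypothetical name, not part of this thread.)

```shell
# Stand-in for the groovy mapper: just print the arguments it receives.
cat > argv_demo.sh <<'EOF'
#!/bin/sh
echo "argv: $@"
EOF
chmod +x argv_demo.sh

# "Xmx2000m" (note the missing leading dash) arrives as plain argv --
# the JVM heap settings never see it.
./argv_demo.sh Xmx2000m
# prints: argv: Xmx2000m
```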
Also, it would be nice to see how exactly groovy is invoked. Does groovy
start and then throw the OOM, or does the OOM occur during startup? Can you
see the new process with "ps -aef"?
Can you run groovy in local mode? Try "-jt local" option.
Thanks,
Alex K
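(One way to get a bigger heap to the groovy child that does not depend on argument parsing: the standard groovy launcher script honors the JAVA_OPTS environment variable, so a small wrapper can export it and exec the real mapper. A sketch -- mapper.sh and the heap value are illustrative; the groovy path is the one from this thread:)

```shell
# Hypothetical wrapper: export JAVA_OPTS so the JVM started by the groovy
# launcher gets the larger heap, then exec the real mapper script.
cat > mapper.sh <<'EOF'
#!/bin/sh
export JAVA_OPTS="-Xmx2000m"
exec /home/ftpuser1/Nodemapper5.groovy
EOF
chmod +x mapper.sh
```

The job would then be submitted with "-mapper mapper.sh -file mapper.sh -file /home/ftpuser1/Nodemapper5.groovy" so the wrapper ships with the job.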
On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <sh...@gmail.com> wrote:
> Hi Patrick,
> Thanks for explanation. I have supply the heapsize in mapper in the
> following way
>
> -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
>
> but still same error. Any other idea?
> Thanks
>
> [...]
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Hi Patrick,
Thanks for the explanation. I have supplied the heap size to the mapper in the
following way
-mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
but still same error. Any other idea?
Thanks
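(The "cat file.xml | /root/Nodemapper5.groovy" test from the original post generalizes to a full local imitation of the streaming contract -- map, sort by key, reduce over tab-separated lines -- which makes it easy to reproduce mapper failures with a controlled heap before submitting to the cluster. The awk mapper and reducer below are generic placeholders, not the thread's groovy code:)

```shell
# Local imitation of Hadoop streaming: the mapper emits key<TAB>value lines,
# sort groups them by key, and the reducer aggregates -- all off-cluster.
printf 'a b\na c\nb c\n' \
  | awk '{print $1 "\t1"}' \
  | sort \
  | awk -F'\t' '{n[$1] += $2} END {for (k in n) print k "\t" n[k]}'
```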
On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <pa...@cloudera.com>wrote:
> Shuja,
>
> Those settings (mapred.child.java.opts and mapred.child.ulimit) are only
> used
> for child JVMs that get forked by the TaskTracker. You are using Hadoop
> streaming, which means the TaskTracker is forking a JVM for streaming,
> which
> is then forking a shell process that runs your groovy code (in another
> JVM).
>
> I'm not much of a groovy expert, but if there's a way you can wrap your
> code
> around the MapReduce API that would work best. Otherwise, you can just pass
> the heapsize in '-mapper' argument.
>
> Regards,
>
> - Patrick
>
> On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <sh...@gmail.com>
> wrote:
>
> > Hi Alex,
> >
> > I have update the java to latest available version on all machines in the
> > cluster and now i run the job by adding this line
> >
> > -D mapred.child.ulimit=3145728 \
> >
> > but still same error. Here is the output of this job.
> >
> >
> > root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java
> > -Xmx10 23m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > -Dhadoop.log.file=hadoop.log -Dha doop.home.dir=/usr/lib/hadoop-0.20
> > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,co nsole
> > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> /usr/lib/hadoop-0.20/con
> >
> >
> f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
> >
> >
> p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
> >
> >
> op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
> >
> >
> sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
> >
> >
> mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
> >
> >
> sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
> >
> >
> .jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
> >
> >
> p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
> >
> >
> 8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
> >
> >
> core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
> >
> >
> b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
> >
> >
> time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
> >
> >
> ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
> >
> >
> op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
> >
> >
> -0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
> >
> >
> 20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
> >
> >
> hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
> >
> >
> 2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
> >
> >
> 20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
> >
> >
> /lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
> > r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > org.apache.hadoop.util.RunJar
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D
> > mapre d.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > -inputformat StreamIn putFormat -inputreader
> > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w
> > 3.org/TR/REC-xml">,end=</mdc>
> > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1
> > -jobconf m apred.reduce.tasks=0 -output RNC14 -mapper
> > /home/ftpuser1/Nodemapper5.groovy -re ducer
> > org.apache.hadoop.mapred.lib.IdentityReducer -file
> /home/ftpuser1/Nodemapp
> > er5.groovy
> > root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
> >
> >
> > Any clue?
> > Thanks
> >
> > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> > > Hi Shuja,
> > >
> > > First, thank you for using CDH3. Can you also check what m*
> > > apred.child.ulimit* you are using? Try adding "*
> > > -D mapred.child.ulimit=3145728*" to the command line.
> > >
> > > I would also recommend to upgrade java to JDK 1.6 update 8 at a
> minimum,
> > > which you can download from the Java SE
> > > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > > .
> > >
> > > Let me know how it goes.
> > >
> > > Alex K
> > >
> > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com
> > > >wrote:
> > >
> > > > Hi Alex
> > > >
> > > > Yeah, I am running a job on cluster of 2 machines and using Cloudera
> > > > distribution of hadoop. and here is the output of this command.
> > > >
> > > > root 5277 5238 3 12:51 pts/2 00:00:00
> > /usr/jdk1.6.0_03/bin/java
> > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib /hadoop-0.20/logs
> > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > -Dhadoop.id.str= -Dhado op.root.logger=INFO,console
> > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > /usr/lib/hadoop-0.20/conf:/usr/
> > > >
> > > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > > >
> > > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > > >
> > > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > > >
> > > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > > >
> > > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > > >
> > > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > > >
> > > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > >
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > -inputreader StreamXmlRecordReader,begin= <mdc xmlns:HTML="
> > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1
> > > > -jobconf mapred.reduce.tasks=0 -output RNC11 -mapper
> > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > > > home/ftpuser1/Nodemapper5.groovy
> > > > root 5360 5074 0 12:51 pts/1 00:00:00 grep
> Nodemapper5.groovy
> > > >
> > > >
> > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > and what is meant by OOM and thanks for helping,
> > > >
> > > > Best Regards
> > > >
> > > >
> > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <al...@cloudera.com>
> > > wrote:
> > > >
> > > > > Hi Shuja,
> > > > >
> > > > > It looks like the OOM is happening in your code. Are you running
> > > > MapReduce
> > > > > in a cluster? If so, can you send the exact command line your code
> > is
> > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > Nodemapper5.groovy'
> > > > > command on one of the nodes which is running the task?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Alex K
> > > > >
> > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > shujamughal@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > Hi All
> > > > > >
> > > > > > I am facing a hard problem. I am running a map reduce job using
> > > > streaming
> > > > > > but it fails and it gives the following error.
> > > > > >
> > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > >
> > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > > subprocess
> > > > > > failed with code 1
> > > > > > at
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > at
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > >
> > > > > > at
> > > > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > at
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > at
> > > > > >
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > at
> > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > >
> > > > > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > >
> > > > > >
> > > > > > I have increased the heap size in hadoop-env.sh and make it
> 2000M.
> > > Also
> > > > I
> > > > > > tell the job manually by following line.
> > > > > >
> > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > >
> > > > > > but it still gives the error. The same job runs fine if i run on
> > > shell
> > > > > > using
> > > > > > 1024M heap size like
> > > > > >
> > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > >
> > > > > >
> > > > > > Any clue?????????
> > > > > >
> > > > > > Thanks in advance.
> > > > > >
> > > > > > --
> > > > > > Regards
> > > > > > Shuja-ur-Rehman Baig
> > > > > > _________________________________
> > > > > > MS CS - School of Science and Engineering
> > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > Cell: +92 3214207445
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards
> > > > Shuja-ur-Rehman Baig
> > > > _________________________________
> > > > MS CS - School of Science and Engineering
> > > > Lahore University of Management Sciences (LUMS)
> > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > Cell: +92 3214207445
> > > >
> > >
> >
> >
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >
>
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
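The stack trace in the original post points at Nodemapper5.parseXML, which suggests the mapper builds the whole XML document in memory before processing it. A pull parser keeps the memory footprint flat regardless of record size. Below is a minimal Java (StAX) sketch of the idea; the class name and the <record> tag are invented for illustration and are not from the actual data:

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;

public class StreamingXmlSketch {
    // Count <record> elements without ever holding the whole tree in memory.
    // A DOM-style parse of a large record is a classic cause of heap OOMs.
    static int countRecords(String xml) throws Exception {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT
                    && r.getLocalName().equals("record")) {
                count++;
            }
        }
        r.close();
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<mdc><record/><record/></mdc>";
        System.out.println(countRecords(xml)); // prints 2
    }
}
```

The same javax.xml.stream API is callable from Groovy, so the mapper could switch to it without changing languages.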
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Patrick Angeles <pa...@cloudera.com>.
Shuja,
Those settings (mapred.child.java.opts and mapred.child.ulimit) only apply to
the child JVMs forked by the TaskTracker. You are using Hadoop Streaming, so
the TaskTracker forks a JVM for streaming, and that JVM in turn forks a shell
process that runs your Groovy code (in yet another JVM). I'm not much of a
Groovy expert, but if there is a way to wrap your code around the MapReduce
API, that would work best. Otherwise, you can pass the heap size in the
'-mapper' argument.
Regards,
- Patrick
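A quick way to see which limit actually reached the mapper process is to have the mapper report its own heap ceiling to stderr (which ends up in the task logs). This is only a debugging sketch; the class name is made up:

```java
public class HeapProbe {
    public static void main(String[] args) {
        // Report the heap ceiling of *this* JVM. Run from inside the
        // streaming mapper, this shows whether the -Xmx you intended
        // actually took effect in the process doing the parsing.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.err.println("max heap (MB): " + maxMb);
    }
}
```

The equivalent one-liner works in the Groovy mapper as well: System.err.println(Runtime.getRuntime().maxMemory()).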
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Hi Alex,
I have updated Java to the latest available version on all machines in the
cluster, and I now run the job with this additional line:
-D mapred.child.ulimit=3145728 \
but I still get the same error. Here is the output of the job:
root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java
-Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
-Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
-Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
-Dhadoop.policy.file=hadoop-policy.xml -classpath
/usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
org.apache.hadoop.util.RunJar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
-D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
-inputformat StreamInputFormat -inputreader
StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
-input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
-jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
-mapper /home/ftpuser1/Nodemapper5.groovy -reducer
org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy
Any clue?
Thanks
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Alex Kozlov <al...@cloudera.com>.
Hi Shuja,
First, thank you for using CDH3. Can you also check what mapred.child.ulimit
you are using? Try adding "-D mapred.child.ulimit=3145728" to the command
line.
I would also recommend upgrading Java to JDK 1.6 update 8 at a minimum, which
you can download from the Java SE Homepage
<http://java.sun.com/javase/downloads/index.jsp>.
Let me know how it goes.
Alex K
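For context on that number: mapred.child.ulimit is interpreted in kilobytes and caps the task's total virtual memory (heap plus stacks plus JVM overhead), so it must exceed the configured -Xmx with room to spare. The arithmetic, as a trivial sketch (class name made up):

```java
public class UlimitMath {
    public static void main(String[] args) {
        long ulimitKb = 3145728L;        // -D mapred.child.ulimit=3145728 (KB)
        long ulimitMb = ulimitKb / 1024; // = 3072 MB of virtual memory
        long heapMb = 2000;              // from -Xmx2000M
        // Roughly 1 GB of headroom for JVM overhead beyond the Java heap.
        System.out.println(ulimitMb - heapMb); // prints 1072
    }
}
```

If the ulimit were set below the heap size, the child would die with OOM-like failures no matter how large -Xmx is.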
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Hi Alex
Yes, I am running the job on a cluster of two machines, using the Cloudera
distribution of Hadoop. Here is the output of that command:
root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java
    -Xmx1023m
    -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
    -Dhadoop.log.file=hadoop.log
    -Dhadoop.home.dir=/usr/lib/hadoop-0.20
    -Dhadoop.id.str=
    -Dhadoop.root.logger=INFO,console
    -Dhadoop.policy.file=hadoop-policy.xml
    -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
    org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
    -D mapred.child.java.opts=-Xmx2000M
    -inputformat StreamInputFormat
    -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
    -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
    -jobconf mapred.map.tasks=1
    -jobconf mapred.reduce.tasks=0
    -output RNC11
    -mapper /home/ftpuser1/Nodemapper5.groovy
    -reducer org.apache.hadoop.mapred.lib.IdentityReducer
    -file /home/ftpuser1/Nodemapper5.groovy
root 5360 5074 0 12:51 pts/1 00:00:00 grep Nodemapper5.groovy
------------------------------------------------------------------------------------------------------------------------------
Also, what is meant by OOM? Thanks for helping.
Best Regards
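One thing worth noting from the ps output: mapred.child.java.opts sizes the
Java task JVM, while a streaming mapper such as Nodemapper5.groovy runs as a
separate process whose JVM heap that option does not control. A hedged
workaround is to wrap the Groovy script so its own JVM gets an explicit -Xmx;
the wrapper name below is an assumption, and JAVA_OPTS is the environment
variable the stock groovy launcher reads:

```shell
# mapred.child.java.opts sizes the Java task JVM only; the streaming
# mapper is a separate process. Wrap the Groovy script so its own JVM
# gets an explicit heap (wrapper name is an assumption; JAVA_OPTS is
# read by the stock groovy launcher).
cat > mapper_wrapper.sh <<'EOF'
#!/bin/sh
JAVA_OPTS="-Xmx1024m" exec /home/ftpuser1/Nodemapper5.groovy "$@"
EOF
chmod +x mapper_wrapper.sh
# Ship the wrapper with the job, e.g.:
#   -mapper mapper_wrapper.sh -file mapper_wrapper.sh \
#   -file /home/ftpuser1/Nodemapper5.groovy
echo "created: mapper_wrapper.sh"
```

The streaming job then launches the wrapper, and the wrapper gives the Groovy
JVM its heap independently of the Hadoop child settings.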
On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <al...@cloudera.com> wrote:
> Hi Shuja,
>
> It looks like the OOM is happening in your code. Are you running MapReduce
> in a cluster? If so, can you send the exact command line your code is
> invoked with -- you can get it with a 'ps -Af | grep Nodemapper5.groovy'
> command on one of the nodes which is running the task?
>
> Thanks,
>
> Alex K
>
> On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamughal@gmail.com
> >wrote:
>
> > Hi All
> >
> > I am facing a hard problem. I am running a map reduce job using streaming
> > but it fails and it gives the following error.
> >
> > Caught: java.lang.OutOfMemoryError: Java heap space
> > at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> >
> > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> > failed with code 1
> > at
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > at
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> >
> > at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> >
> > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >
> >
> > I have increased the heap size in hadoop-env.sh and make it 2000M. Also I
> > tell the job manually by following line.
> >
> > -D mapred.child.java.opts=-Xmx2000M \
> >
> > but it still gives the error. The same job runs fine if i run on shell
> > using
> > 1024M heap size like
> >
> > cat file.xml | /root/Nodemapper5.groovy
> >
> >
> > Any clue?????????
> >
> > Thanks in advance.
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >
>
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
Re: java.lang.OutOfMemoryError: Java heap space
Posted by Alex Kozlov <al...@cloudera.com>.
Hi Shuja,
It looks like the OOM is happening in your code. Are you running MapReduce
in a cluster? If so, can you send the exact command line your code is
invoked with -- you can get it with a 'ps -Af | grep Nodemapper5.groovy'
command on one of the nodes which is running the task?
Thanks,
Alex K
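To see at a glance which -Xmx a running task JVM actually received, the same
ps pipeline can extract the flag directly. This is only a sketch: the process
name is copied from this thread, and it must be run on a worker node while the
task is live.

```shell
# Extract the -Xmx flag from a running task JVM's command line.
# Prints a fallback message if no matching process is found.
heap=$(ps -Af | grep Nodemapper5.groovy | grep -v grep \
       | grep -o -e '-Xmx[0-9]*[kKmMgG]' | head -n 1)
echo "${heap:-no task process found}"
```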
On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <sh...@gmail.com>wrote:
> Hi All
>
> I am facing a hard problem. I am running a map reduce job using streaming
> but it fails and it gives the following error.
>
> Caught: java.lang.OutOfMemoryError: Java heap space
> at Nodemapper5.parseXML(Nodemapper5.groovy:25)
>
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 1
> at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
>
> at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
>
> I have increased the heap size in hadoop-env.sh and make it 2000M. Also I
> tell the job manually by following line.
>
> -D mapred.child.java.opts=-Xmx2000M \
>
> but it still gives the error. The same job runs fine if i run on shell
> using
> 1024M heap size like
>
> cat file.xml | /root/Nodemapper5.groovy
>
>
> Any clue?????????
>
> Thanks in advance.
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>
java.lang.OutOfMemoryError: Java heap space
Posted by Shuja Rehman <sh...@gmail.com>.
Hi All
I am facing a hard problem. I am running a MapReduce job using streaming,
but it fails with the following error.
Caught: java.lang.OutOfMemoryError: Java heap space
at Nodemapper5.parseXML(Nodemapper5.groovy:25)
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
I have increased the heap size in hadoop-env.sh to 2000M. I also pass it to
the job explicitly with the following line:
-D mapred.child.java.opts=-Xmx2000M \
but it still gives the error. The same job runs fine if I run it on the shell
with a 1024M heap, like:
cat file.xml | /root/Nodemapper5.groovy
Any clue?
Thanks in advance.
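Since the shell run works with 1024M, one way to narrow this down is to replay
that run under progressively smaller heaps and find where the script actually
starts failing. The sketch below only prints the commands to run by hand;
file names are taken from the message above, and JAVA_OPTS is assumed to be
the variable the groovy launcher reads:

```shell
# Print local repro commands at decreasing heaps; running them by hand
# reveals the mapper's real memory floor. Paths are from the message
# above and may differ on your machine.
for heap in 1024m 512m 256m; do
  echo "JAVA_OPTS=-Xmx$heap cat file.xml | /root/Nodemapper5.groovy"
done
```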
--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445