Posted to mapreduce-user@hadoop.apache.org by Shuja Rehman <sh...@gmail.com> on 2010/07/09 19:14:55 UTC

java.lang.OutOfMemoryError: Java heap space

Hi All

I am facing a hard problem. I am running a MapReduce job using streaming,
but it fails with the following error:

Caught: java.lang.OutOfMemoryError: Java heap space
	at Nodemapper5.parseXML(Nodemapper5.groovy:25)
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
failed with code 1
	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
	at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
	at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)



I have increased the heap size in hadoop-env.sh to 2000M. I also set it for
the job explicitly with the following line:

-D mapred.child.java.opts=-Xmx2000M \

but it still gives the error. The same job runs fine if I run it on the
shell with a 1024M heap size, like:

cat file.xml | /root/Nodemapper5.groovy
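
One way to make the shell test match the job's memory settings, as a sketch
(this assumes the stock Groovy launcher, which picks JAVA_OPTS up from the
environment):

    # run the same pipe, but with the mapper JVM capped explicitly
    cat file.xml | JAVA_OPTS="-Xmx1024m" /root/Nodemapper5.groovy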


Any clue?

Thanks in advance.


-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Alex Kozlov <al...@cloudera.com>.
Honestly, no idea.  I can just suggest running "*hadoop jar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
-jt local -fs local ...*" on both nodes and debugging.
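
A sketch of what that local run could look like (the input path and output
directory here are placeholders, not taken from the thread):

    hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
    -jt local -fs local \
    -input /path/to/local/file.xml \
    -output /tmp/streamout \
    -mapper /home/ftpuser1/Nodemapper5.groovy \
    -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
    -file /home/ftpuser1/Nodemapper5.groovy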

On Mon, Jul 12, 2010 at 4:53 PM, Shuja Rehman <sh...@gmail.com> wrote:

> Alex, any guess why it fails on the server while it has more free memory
> than the slave?
>
> On Tue, Jul 13, 2010 at 3:06 AM, Shuja Rehman <sh...@gmail.com> wrote:
>
> > *Master Node output:*
> >
> >    total       used       free     shared    buffers     cached
> > Mem:       2097328     515576    1581752          0      56060     254760
> > -/+ buffers/cache:     204756    1892572
> > Swap:       522104          0     522104
> >
> > *Slave Node output:*
> >   total       used       free     shared    buffers     cached
> > Mem:       1048752     860684     188068          0     148388     570948
> > -/+ buffers/cache:     141348     907404
> > Swap:       522104         40     522064
> >
> > it seems that the server (master) has more free memory.
> >
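
(Reading the "-/+ buffers/cache" rows above, which show what is actually
available to new processes: the master has 1892572 KB free against the
slave's 907404 KB, so the master does indeed look roomier.)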
> > On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com> wrote:
> >
> >> Maybe you do not have enough available memory on master?  What is the
> >> output of "*free*" on both nodes?  -- Alex K
> >>
> >> On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com> wrote:
> >>
> >> > Hi
> >> > I have added the following property to my master node's mapred-site.xml
> >> > file:
> >> >
> >> > <property>
> >> >    <name>mapred.child.ulimit</name>
> >> >    <value>3145728</value>
> >> >  </property>
> >> >
> >> > and ran the job again, and wow, the job completed on the 4th attempt. I
> >> > checked at 50030: Hadoop ran the job 3 times on the master server and it
> >> > failed, but when it ran on the 2nd node it succeeded and produced the
> >> > desired result. Why did it fail on the master?
> >> > Thanks
> >> > Shuja
> >> >
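
A unit sketch, since the value itself goes unexplained in the thread (the
KB semantics come from the MapReduce docs, not from this exchange):

    # mapred.child.ulimit is a virtual-memory cap expressed in KB
    echo $((3145728 / 1024 / 1024))   # => 3, i.e. a 3 GB address-space limit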
> >> >
> >> > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com>
> >> wrote:
> >> >
> >> > > Hmm.  It means your options are not propagated to the nodes.  Can you
> >> > > put *mapred.child.ulimit* in the mapred-site.xml and restart the
> >> > > tasktrackers?  I was under the impression that the below should be
> >> > > enough, though.  Glad you got it working in local mode.  -- Alex K
> >> > >
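
A sketch of that propagation step (the CDH3-style service names are an
assumption, not something stated in the thread):

    # after adding the property to mapred-site.xml on every node
    ssh master 'service hadoop-0.20-tasktracker restart'
    ssh slave  'service hadoop-0.20-tasktracker restart'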
> >> > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> >> > >
> >> > > > Hi Alex, I am using PuTTY to connect to the servers, and what I sent
> >> > > > is almost my maximum screen output; PuTTY does not let me increase the
> >> > > > terminal size. Is there any other way to get the complete output of
> >> > > > ps -aef?
> >> > > >
> >> > > > Now I ran the following command and, thank God, it did not fail and it
> >> > > > produced the desired output.
> >> > > >
> >> > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> >> > > > -D mapred.child.java.opts=-Xmx1024m \
> >> > > > -D mapred.child.ulimit=3145728 \
> >> > > > -jt local \
> >> > > > -inputformat StreamInputFormat \
> >> > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> >> > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> >> > > > -jobconf mapred.map.tasks=1 \
> >> > > > -jobconf mapred.reduce.tasks=0 \
> >> > > > -output RNC32 \
> >> > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
> >> > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> >> > > > -file /home/ftpuser1/Nodemapper5.groovy
> >> > > >
> >> > > > but when I omit -jt local, it produces the same error.
> >> > > > Thanks, Alex, for helping.
> >> > > > Regards
> >> > > > Shuja
> >> > > >
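
On the terminal-width question, a sketch (PID 26162 is the process from the
listing further down; /proc/<pid>/cmdline is standard Linux):

    # print one process's full command line, unwrapped
    tr '\0' ' ' < /proc/26162/cmdline; echo
    # or tell ps to ignore the terminal width
    ps -ww -p 26162 -o args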
> >> > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> >> > > >
> >> > > > > Hi Shuja,
> >> > > > >
> >> > > > > Java listens to the last -Xmx, so if you have multiple "-Xmx ..."
> >> > > > > options on the command line, the last one is valid.  Unfortunately
> >> > > > > you have truncated command lines.  Can you show us the full command
> >> > > > > line, particularly for the process 26162?  This seems to be causing
> >> > > > > problems.
> >> > > > >
> >> > > > > If you are running your cluster on 2 nodes, it may be that the task
> >> > > > > was scheduled on the second node.  Did you run "ps -aef" on the
> >> > > > > second node as well?  You can see the task assignment in the JT
> >> > > > > web-UI (http://jt-name:50030, drill down to tasks).
> >> > > > >
> >> > > > > I suggest you first debug your program in local mode, however (use
> >> > > > > the "*-jt local*" option).  Did you try the
> >> > > > > "*-D mapred.child.ulimit=3145728*" option?  I do not see it on the
> >> > > > > command line.
> >> > > > >
> >> > > > > Alex K
> >> > > > >
> >> > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> >> > > > >
> >> > > > > > Hi Alex
> >> > > > > >
> >> > > > > > I have tried using quotes and also -jt local, but I get the same
> >> > > > > > heap error. Here is the output of ps -aef:
> >> > > > > >
> >> > > > > > UID        PID  PPID  C STIME TTY          TIME CMD
> >> > > > > > root         1     0  0 04:37 ?        00:00:00 init [3]
> >> > > > > > root         2     1  0 04:37 ?        00:00:00 [migration/0]
> >> > > > > > root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
> >> > > > > > root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
> >> > > > > > root         5     1  0 04:37 ?        00:00:00 [events/0]
> >> > > > > > root         6     1  0 04:37 ?        00:00:00 [khelper]
> >> > > > > > root         7     1  0 04:37 ?        00:00:00 [kthread]
> >> > > > > > root         9     7  0 04:37 ?        00:00:00 [xenwatch]
> >> > > > > > root        10     7  0 04:37 ?        00:00:00 [xenbus]
> >> > > > > > root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
> >> > > > > > root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
> >> > > > > > root        22     7  0 04:37 ?        00:00:00 [khubd]
> >> > > > > > root        24     7  0 04:37 ?        00:00:00 [kseriod]
> >> > > > > > root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
> >> > > > > > root        85     7  0 04:37 ?        00:00:00 [pdflush]
> >> > > > > > root        86     7  0 04:37 ?        00:00:00 [pdflush]
> >> > > > > > root        87     7  0 04:37 ?        00:00:00 [kswapd0]
> >> > > > > > root        88     7  0 04:37 ?        00:00:00 [aio/0]
> >> > > > > > root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
> >> > > > > > root       248     7  0 04:37 ?        00:00:00 [kstriped]
> >> > > > > > root       257     7  0 04:37 ?        00:00:00 [kjournald]
> >> > > > > > root       279     7  0 04:37 ?        00:00:00 [kauditd]
> >> > > > > > root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
> >> > > > > > root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
> >> > > > > > root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
> >> > > > > > root       660     7  0 04:37 ?        00:00:00 [kjournald]
> >> > > > > > root       662     7  0 04:37 ?        00:00:00 [kjournald]
> >> > > > > > root      1032     1  0 04:38 ?        00:00:00 auditd
> >> > > > > > root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
> >> > > > > > root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
> >> > > > > > root      1052     1  0 04:38 ?        00:00:00 klogd -x
> >> > > > > > root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
> >> > > > > > root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
> >> > > > > > dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
> >> > > > > > root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
> >> > > > > > root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
> >> > > > > > root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
> >> > > > > > root      1244     1  0 04:38 ?        00:00:00 pcscd
> >> > > > > > root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
> >> > > > > > root      1295     1  0 04:38 ?        00:00:00 automount
> >> > > > > > root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
> >> > > > > > root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> >> > > > > > root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> >> > > > > > root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting connections
> >> > > > > > smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> >> > > > > > root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice -t exps2
> >> > > > > > root      1410     1  0 04:38 ?        00:00:00 crond
> >> > > > > > xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
> >> > > > > > root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
> >> > > > > > 68        1508     1  0 04:38 ?        00:00:00 hald
> >> > > > > > root      1509  1508  0 04:38 ?        00:00:00 hald-runner
> >> > > > > > root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q never
> >> > > > > > root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> >> > > > > > root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> >> > > > > > root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
> >> > > > > > root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
> >> > > > > > root     21024 21022  0 11:27 pts/0    00:00:00 -bash
> >> > > > > > root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
> >> > > > > > root     21105 21103  0 11:28 pts/1    00:00:00 -bash
> >> > > > > > root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
> >> > > > > > root     21994 21992  0 11:47 pts/2    00:00:00 -bash
> >> > > > > > root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
> >> > > > > > root     22437 22433  0 11:49 pts/3    00:00:00 -bash
> >> > > > > > hadoop   24808     1  0 12:01 ?        00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> >> > > > > > hadoop   24893     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> >> > > > > > hadoop   24988     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> >> > > > > > hadoop   25085     1  0 12:01 ?        00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> >> > > > > > hadoop   25175     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> >> > > > > > root     25925 21994  1 12:06 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> >> > > > > > hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> >> > > > > > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> >> > > > > > root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef
> >> > > > > >
> >> > > > > >
> >> > > > > > *The command which I am executing is:*
> >> > > > > >
> >> > > > > >
> >> > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> >> > > > > > -D mapred.child.java.opts=-Xmx1024m \
> >> > > > > > -inputformat StreamInputFormat \
> >> > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> >> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> >> > > > > > -jobconf mapred.map.tasks=1 \
> >> > > > > > -jobconf mapred.reduce.tasks=0 \
> >> > > > > > -output RNC25 \
> >> > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy  -Xmx2000m" \
> >> > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> >> > > > > > -file /home/ftpuser1/Nodemapper5.groovy \
> >> > > > > > -jt local
> >> > > > > >
> >> > > > > > I have noticed that all the Hadoop processes show the 2001m heap
> >> > > > > > size which I set in hadoop-env.sh. On the command line I give 2000
> >> > > > > > in the mapper and 1024 in child.java.opts, but I think these values
> >> > > > > > (1024, 2000) are not in use. Secondly, the following lines
> >> > > > > >
> >> > > > > > *hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> >> > > > > > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b*
> >> > > > > >
> >> > > > > > did not appear the first time the job ran. They appear when the
> >> > > > > > job fails for the first time and then tries to start mapping again.
> >> > > > > > I have one more question: all the Hadoop processes (namenode,
> >> > > > > > datanode, tasktracker, ...) show a 2001m heap size. Does that mean
> >> > > > > > all these processes are using 2001m of memory?
> >> > > > > >
> >> > > > > > Regards
> >> > > > > > Shuja
> >> > > > > >
> >> > > > > >
> >> > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <alexvk@cloudera.com> wrote:
> >> > > > > >
> >> > > > > > > Hi Shuja,
> >> > > > > > >
> >> > > > > > > I think you need to enclose the invocation string in quotes.  Try:
> >> > > > > > >
> >> > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> >> > > > > > >
> >> > > > > > > Also, it would be nice to see how exactly the groovy is invoked.
> >> > > > > > > Is groovy started and then gives you OOM, or is the OOM error
> >> > > > > > > during the start?  Can you see the new process with "ps -aef"?
> >> > > > > > >
> >> > > > > > > Can you run groovy in local mode?  Try the "-jt local" option.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > >
> >> > > > > > > Alex K
> >> > > > > > >
> >> > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > > > Hi Patrick,
> >> > > > > > > > Thanks for the explanation. I have supplied the heap size to
> >> > > > > > > > the mapper in the following way:
> >> > > > > > > >
> >> > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> >> > > > > > > >
> >> > > > > > > > but still the same error. Any other idea?
> >> > > > > > > > Thanks
> >> > > > > > > >
> >> > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com> wrote:
> >> > > > > > > >
> >> > > > > > > > > Shuja,
> >> > > > > > > > >
> >> > > > > > > > > Those settings (mapred.child.java.opts and mapred.child.ulimit)
> >> > > > > > > > > are only used for child JVMs that get forked by the TaskTracker.
> >> > > > > > > > > You are using Hadoop streaming, which means the TaskTracker forks
> >> > > > > > > > > a JVM for streaming, which then forks a shell process that runs
> >> > > > > > > > > your groovy code (in another JVM).
> >> > > > > > > > >
> >> > > > > > > > > I'm not much of a groovy expert, but if there's a way you can
> >> > > > > > > > > wrap your code around the MapReduce API, that would work best.
> >> > > > > > > > > Otherwise, you can just pass the heap size in the '-mapper'
> >> > > > > > > > > argument.
> >> > > > > > > > >
> >> > > > > > > > > Regards,
> >> > > > > > > > >
> >> > > > > > > > > - Patrick
> >> > > > > > > > >
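
One way to act on Patrick's last point is a tiny wrapper script shipped with
the job; a sketch (the wrapper name and heap value are assumptions, and it
relies on the stock Groovy launcher reading JAVA_OPTS):

    #!/bin/sh
    # nodemapper5.sh -- ship it with '-file' and point '-mapper' at it
    export JAVA_OPTS="-Xmx1024m"    # heap for the forked Groovy JVM
    exec /home/ftpuser1/Nodemapper5.groovy "$@"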
> >> > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Hi Alex,
> >> > > > > > > > > >
> >> > > > > > > > > > I have updated Java to the latest available version on all
> >> > > > > > > > > > machines in the cluster, and now I run the job with this
> >> > > > > > > > > > line added:
> >> > > > > > > > > >
> >> > > > > > > > > > -D mapred.child.ulimit=3145728 \
> >> > > > > > > > > >
> >> > > > > > > > > > but still the same error. Here is the output for this job:
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
> >> > > > > > > > > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> >> > > > > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> >> > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> >> > > > > > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> >> > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> >> > > > > > > > > > -inputformat StreamInputFormat
> >> > > > > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> >> > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> >> > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> >> > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> >> > > > > > > > > > -file /home/ftpuser1/Nodemapper5.groovy
> >> > > > > > > > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > Any clue?
> >> > > > > > > > > > Thanks
> >> > > > > > > > > >
> >> > > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Hi Shuja,
> >> > > > > > > > > > >
> >> > > > > > > > > > > First, thank you for using CDH3.  Can you also check what
> >> > > > > > > > > > > *mapred.child.ulimit* you are using?  Try adding
> >> > > > > > > > > > > "*-D mapred.child.ulimit=3145728*" to the command line.
> >> > > > > > > > > > >
> >> > > > > > > > > > > I would also recommend upgrading Java to JDK 1.6 update 8
> >> > > > > > > > > > > at a minimum, which you can download from the Java SE
> >> > > > > > > > > > > Homepage <http://java.sun.com/javase/downloads/index.jsp>.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Let me know how it goes.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Alex K
> >> > > > > > > > > > >
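
A quick check for the JDK suggestion (host names as used earlier in the
thread):

    # confirm which JVM each node actually runs
    ssh master 'java -version'
    ssh slave  'java -version'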
> >> > > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Hi Alex
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Yeah, I am running the job on a cluster of 2 machines
> >> > > > > > > > > > > > using the Cloudera distribution of Hadoop, and here is
> >> > > > > > > > > > > > the output of this command:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m
> >> > > > > > > > > > > > -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> >> > > > > > > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> >> > > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> >> > > > > > > > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> >> > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> >> > > > > > > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> >> > > > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> >> > > > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> >> > > > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer
> >> > > > > > > > > > > > -file /home/ftpuser1/Nodemapper5.groovy
> >> > > > > > > > > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> >> > > > > > > > > > > > ------------------------------------------------------------
> >> > > > > > > > > > > > and what is meant by OOM? Thanks for helping.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Best Regards
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > Hi Shuja,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > It looks like the OOM is happening in your code.  Are
> >> > > > > > > > > > > > > you running MapReduce in a cluster?  If so, can you send
> >> > > > > > > > > > > > > the exact command line your code is invoked with -- you
> >> > > > > > > > > > > > > can get it with a 'ps -Af | grep Nodemapper5.groovy'
> >> > > > > > > > > > > > > command on one of the nodes which is running the task?
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Alex K
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> >> > > > > > > > > > shujamughal@gmail.com
> >> > > > > > > > > > > > > >wrote:
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Hi All
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > I am facing a hard problem. I am running a map
> >> > reduce
> >> > > > job
> >> > > > > > > using
> >> > > > > > > > > > > > streaming
> >> > > > > > > > > > > > > > but it fails and it gives the following error.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap
> >> space
> >> > > > > > > > > > > > > >        at
> >> > Nodemapper5.parseXML(Nodemapper5.groovy:25)
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > java.lang.RuntimeException:
> >> > > > > PipeMapRed.waitOutputThreads():
> >> > > > > > > > > > > subprocess
> >> > > > > > > > > > > > > > failed with code 1
> >> > > > > > > > > > > > > >        at
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> >> > > > > > > > > > > > > >        at
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > >        at
> >> > > > > > > > > > > > >
> >> > > > > > >
> >> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> >> > > > > > > > > > > > > >        at
> >> > > > > > > > >
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> >> > > > > > > > > > > > > >        at
> >> > > > > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > >
> >> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> >> > > > > > > > > > > > > >        at
> >> > > > > > > > > > > >
> >> > > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > >        at
> >> > > > > > > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> >> > > > > > > > > > > > > >        at
> >> > > > > > org.apache.hadoop.mapred.Child.main(Child.java:170)
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > I have increased the heap size in
> hadoop-env.sh
> >> and
> >> > > > make
> >> > > > > it
> >> > > > > > > > > 2000M.
> >> > > > > > > > > > > Also
> >> > > > > > > > > > > > I
> >> > > > > > > > > > > > > > tell the job manually by following line.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > but it still gives the error. The same job
> runs
> >> > fine
> >> > > if
> >> > > > i
> >> > > > > > run
> >> > > > > > > > on
> >> > > > > > > > > > > shell
> >> > > > > > > > > > > > > > using
> >> > > > > > > > > > > > > > 1024M heap size like
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Any clue?????????
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Thanks in advance.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > --
> >> > > > > > > > > > > > > > Regards
> >> > > > > > > > > > > > > > Shuja-ur-Rehman Baig
> >> > > > > > > > > > > > > > _________________________________
> >> > > > > > > > > > > > > > MS CS - School of Science and Engineering
> >> > > > > > > > > > > > > > Lahore University of Management Sciences
> (LUMS)
> >> > > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> >> > > > > > > > > > > > > > Cell: +92 3214207445
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > --
> >> > > > > > > > > > > > Regards
> >> > > > > > > > > > > > Shuja-ur-Rehman Baig
> >> > > > > > > > > > > > _________________________________
> >> > > > > > > > > > > > MS CS - School of Science and Engineering
> >> > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> >> > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> >> > > > > > > > > > > > Cell: +92 3214207445
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > --
> >> > > > > > > > > > Regards
> >> > > > > > > > > > Shuja-ur-Rehman Baig
> >> > > > > > > > > > _________________________________
> >> > > > > > > > > > MS CS - School of Science and Engineering
> >> > > > > > > > > > Lahore University of Management Sciences (LUMS)
> >> > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> >> > > > > > > > > > Cell: +92 3214207445
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > --
> >> > > > > > > > Regards
> >> > > > > > > > Shuja-ur-Rehman Baig
> >> > > > > > > > _________________________________
> >> > > > > > > > MS CS - School of Science and Engineering
> >> > > > > > > > Lahore University of Management Sciences (LUMS)
> >> > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> >> > > > > > > > Cell: +92 3214207445
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Regards
> >> > > > > > Shuja-ur-Rehman Baig
> >> > > > > > _________________________________
> >> > > > > > MS CS - School of Science and Engineering
> >> > > > > > Lahore University of Management Sciences (LUMS)
> >> > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> >> > > > > > Cell: +92 3214207445
> >> > > > > >
> >> > > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Regards
> >> > > > Shuja-ur-Rehman Baig
> >> > > > _________________________________
> >> > > > MS CS - School of Science and Engineering
> >> > > > Lahore University of Management Sciences (LUMS)
> >> > > > Sector U, DHA, Lahore, 54792, Pakistan
> >> > > > Cell: +92 3214207445
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Regards
> >> > Shuja-ur-Rehman Baig
> >> > _________________________________
> >> > MS CS - School of Science and Engineering
> >> > Lahore University of Management Sciences (LUMS)
> >> > Sector U, DHA, Lahore, 54792, Pakistan
> >> > Cell: +92 3214207445
> >> >
> >>
> >
> >
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
Alex, any guess why it fails on server while it has more free memory than
slave.

On Tue, Jul 13, 2010 at 3:06 AM, Shuja Rehman <sh...@gmail.com> wrote:

> *Master Node output:*
>
>    total       used       free     shared    buffers     cached
> Mem:       2097328     515576    1581752          0      56060     254760
> -/+ buffers/cache:     204756    1892572
> Swap:       522104          0     522104
>
> *Slave Node output:*
>   total       used       free     shared    buffers     cached
> Mem:       1048752     860684     188068          0     148388     570948
> -/+ buffers/cache:     141348     907404
> Swap:       522104         40     522064
>
> it seems that on server there is more memory free.
>
>
>
> On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
>> Maybe you do not have enough available memory on master?  What is the
>> output
>> of "*free*" on both nodes?  -- Alex K
>>
>> On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com>
>> wrote:
>>
>> > Hi
>> > I have added following line to my master node mapred-site.xml file
>> >
>> > <property>
>> >    <name>mapred.child.ulimit</name>
>> >    <value>3145728</value>
>> >  </property>
>> >
>> > and run the job again, and wow..., the jobs get completed in 4th
>> attempt. I
>> > checked the at 50030. Hadoop runs job 3 times on master server and it
>> fails
>> > but when it run on 2nd node, it succeeded and produce the desired
>> result.
>> > Why it failed on master?
>> > Thanks
>> > Shuja
>> >
>> >
>> > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com>
>> wrote:
>> >
>> > > Hmm.  It means your options are not propagated to the nodes.  Can you
>> put
>> > *
>> > > mapred.child.ulimit* in the mapred-siet.xml and restart the
>> tasktrackers?
>> > >  I
>> > > was under impression that the below should be enough though.  Glad you
>> > got
>> > > it working in local mode.  -- Alex K
>> > >
>> > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <sh...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi Alex, I am using putty to connect to servers. and this is almost
>> my
>> > > > maximum screen output which i sent. putty is not allowed me to
>> increase
>> > > the
>> > > > size of terminal. is there any other way that i get the complete
>> output
>> > > of
>> > > > ps-aef?
>> > > >
>> > > > Now i run the following command and thnx God, it did not fails and
>> > > produce
>> > > > the desired output.
>> > > >
>> > > > hadoop jar
>> > > >
>> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
>> > \
>> > > > -D mapred.child.java.opts=-Xmx1024m \
>> > > > -D mapred.child.ulimit=3145728 \
>> > > > -jt local \
>> > > > -inputformat StreamInputFormat \
>> > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
>> > > > http://www.w3.org/TR/REC-xml\ <http://www.w3.org/TR/REC-xml%5C> <
>> http://www.w3.org/TR/REC-xml%5C> <
>> > http://www.w3.org/TR/REC-xml%5C> <
>> > > http://www.w3.org/TR/REC-xml%5C>">,end=</mdc>"
>> > > > \
>> > > > -input
>> > > >
>> > > >
>> > >
>> >
>> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
>> > > > \
>> > > > -jobconf mapred.map.tasks=1 \
>> > > > -jobconf mapred.reduce.tasks=0 \
>> > > > -output RNC32 \
>> > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
>> > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
>> > > > -file /home/ftpuser1/Nodemapper5.groovy
>> > > >
>> > > >
>> > > > but when i omit the -jt local, it produces the same error.
>> > > > Thanks Alex for helping
>> > > > Regards
>> > > > Shuja
>> > > >
>> > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com>
>> > > wrote:
>> > > >
>> > > > > Hi Shuja,
>> > > > >
>> > > > > Java listens to the last xmx, so if you have multiple "-Xmx ..."
>> on
>> > the
>> > > > > command line, the last is valid.  Unfortunately you have truncated
>> > > > command
>> > > > > lines.  Can you show us the full command line, particularly for
>> the
>> > > > process
>> > > > > 26162?  This seems to be causing problems.
>> > > > >
>> > > > > If you are running your cluster on 2 nodes, it may be that the
>> task
>> > was
>> > > > > scheduled on the second node.  Did you run "ps -aef" on the second
>> > node
>> > > > as
>> > > > > well?  You can see the task assignment in the JT web-UI (
>> > > > > http://jt-name:50030, drill down to tasks).
>> > > > >
>> > > > > I suggest you first debug your program in the local mode first,
>> > however
>> > > > > (use
>> > > > > "*-jt local*" option).  Did you try the "*-D
>> > > > mapred.child.ulimit=3145728*"
>> > > > > option?  I do not see it on the command line.
>> > > > >
>> > > > > Alex K
>> > > > >
>> > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <
>> > shujamughal@gmail.com
>> > > > > >wrote:
>> > > > >
>> > > > > > Hi Alex
>> > > > > >
>> > > > > > I have tried with using quotes  and also with -jt local but same
>> > heap
>> > > > > > error.
>> > > > > > and here is the output  of ps -aef
>> > > > > >
>> > > > > > UID        PID  PPID  C STIME TTY          TIME CMD
>> > > > > > root         1     0  0 04:37 ?        00:00:00 init [3]
>> > > > > > root         2     1  0 04:37 ?        00:00:00 [migration/0]
>> > > > > > root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
>> > > > > > root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
>> > > > > > root         5     1  0 04:37 ?        00:00:00 [events/0]
>> > > > > > root         6     1  0 04:37 ?        00:00:00 [khelper]
>> > > > > > root         7     1  0 04:37 ?        00:00:00 [kthread]
>> > > > > > root         9     7  0 04:37 ?        00:00:00 [xenwatch]
>> > > > > > root        10     7  0 04:37 ?        00:00:00 [xenbus]
>> > > > > > root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
>> > > > > > root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
>> > > > > > root        22     7  0 04:37 ?        00:00:00 [khubd]
>> > > > > > root        24     7  0 04:37 ?        00:00:00 [kseriod]
>> > > > > > root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
>> > > > > > root        85     7  0 04:37 ?        00:00:00 [pdflush]
>> > > > > > root        86     7  0 04:37 ?        00:00:00 [pdflush]
>> > > > > > root        87     7  0 04:37 ?        00:00:00 [kswapd0]
>> > > > > > root        88     7  0 04:37 ?        00:00:00 [aio/0]
>> > > > > > root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
>> > > > > > root       248     7  0 04:37 ?        00:00:00 [kstriped]
>> > > > > > root       257     7  0 04:37 ?        00:00:00 [kjournald]
>> > > > > > root       279     7  0 04:37 ?        00:00:00 [kauditd]
>> > > > > > root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
>> > > > > > root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
>> > > > > > root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
>> > > > > > root       660     7  0 04:37 ?        00:00:00 [kjournald]
>> > > > > > root       662     7  0 04:37 ?        00:00:00 [kjournald]
>> > > > > > root      1032     1  0 04:38 ?        00:00:00 auditd
>> > > > > > root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
>> > > > > > root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
>> > > > > > root      1052     1  0 04:38 ?        00:00:00 klogd -x
>> > > > > > root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
>> > > > > > root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
>> > > > > > dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
>> > > > > > root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
>> > > > > > root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
>> > > > > > root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
>> > > > > > root      1244     1  0 04:38 ?        00:00:00 pcscd
>> > > > > > root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
>> > > > > > root      1295     1  0 04:38 ?        00:00:00 automount
>> > > > > > root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
>> > > > > > root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
>> > > > > > root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
>> > > > > > root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting connections
>> > > > > > smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
>> > > > > > root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice -t exps2
>> > > > > > root      1410     1  0 04:38 ?        00:00:00 crond
>> > > > > > xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
>> > > > > > root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
>> > > > > > 68        1508     1  0 04:38 ?        00:00:00 hald
>> > > > > > root      1509  1508  0 04:38 ?        00:00:00 hald-runner
>> > > > > > root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q never
>> > > > > > root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600 vt100-nav
>> > > > > > root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
>> > > > > > root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
>> > > > > > root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
>> > > > > > root     21024 21022  0 11:27 pts/0    00:00:00 -bash
>> > > > > > root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
>> > > > > > root     21105 21103  0 11:28 pts/1    00:00:00 -bash
>> > > > > > root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
>> > > > > > root     21994 21992  0 11:47 pts/2    00:00:00 -bash
>> > > > > > root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
>> > > > > > root     22437 22433  0 11:49 pts/3    00:00:00 -bash
>> > > > > > hadoop   24808     1  0 12:01 ?        00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
>> > > > > > hadoop   24893     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
>> > > > > > hadoop   24988     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
>> > > > > > hadoop   25085     1  0 12:01 ?        00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
>> > > > > > hadoop   25175     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
>> > > > > > root     25925 21994  1 12:06 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
>> > > > > > hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
>> > > > > > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
>> > > > > > root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef
>> > > > > >
>> > > > > >
>> > > > > > *The command which i am executing is *
>> > > > > >
>> > > > > >
>> > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
>> > > > > > -D mapred.child.java.opts=-Xmx1024m \
>> > > > > > -inputformat StreamInputFormat \
>> > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
>> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
>> > > > > > -jobconf mapred.map.tasks=1 \
>> > > > > > -jobconf mapred.reduce.tasks=0 \
>> > > > > > -output RNC25 \
>> > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
>> > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
>> > > > > > -file /home/ftpuser1/Nodemapper5.groovy \
>> > > > > > -jt local
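
A side note on this invocation: anything placed after the script name inside
the -mapper string is handed to Nodemapper5.groovy as a script argument, not
to the JVM, so "-Xmx2000m" in that position never reaches the Groovy heap. A
minimal sketch of raising the Groovy heap instead, assuming the stock groovy
launcher script (which reads JVM flags from the JAVA_OPTS environment
variable):

    export JAVA_OPTS="-Xmx2000m"
    cat file.xml | /home/ftpuser1/Nodemapper5.groovy   # local smoke test, as earlier in the thread

For a distributed run, JAVA_OPTS would have to be visible in the environment
the tasktracker gives to its child processes on every node.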
>> > > > > >
>> > > > > > I have noticed that all the hadoop processes show the 2001m heap size
>> > > > > > which I have set in hadoop-env.sh. And on the command line, I give 2000m
>> > > > > > in the mapper and 1024m in child.java.opts, but I think these values
>> > > > > > (1024, 2001) are not in use.
>> > > > > > secondly the following lines
>> > > > > >
>> > > > > > *hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
>> > > > > > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b*
>> > > > > >
>> > > > > > did not appear the first time the job ran. They appear when the job has
>> > > > > > failed for the first time and then tries to start mapping again. I have
>> > > > > > one more question: all the hadoop processes (namenode, datanode,
>> > > > > > tasktracker...) show a 2001m heapsize in the process listing. Does it
>> > > > > > mean all the processes are using 2001m of memory?
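
On the 2001m question: -Xmx is only an upper bound, not actual consumption, so
five daemons started with -Xmx2001m do not necessarily use 5 x 2001m of RAM. A
quick way to check, assuming the JDK tools are on the PATH (<pid> being
whichever daemon you want to inspect):

    jps -v | grep -i tasktracker   # shows the -Xmx each Hadoop daemon was started with
    jmap -heap <pid>               # prints the configured maximum versus the heap actually in use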
>> > > > > >
>> > > > > > Regards
>> > > > > > Shuja
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <
>> alexvk@cloudera.com>
>> > > > > wrote:
>> > > > > >
>> > > > > > > Hi Shuja,
>> > > > > > >
>> > > > > > > I think you need to enclose the invocation string in quotes.  Try:
>> > > > > > >
>> > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
>> > > > > > >
>> > > > > > > Also, it would be nice to see how exactly the groovy is invoked.  Is
>> > > > > > > groovy started and then gives you OOM, or is the OOM error during the
>> > > > > > > start?  Can you see the new process with "ps -aef"?
>> > > > > > >
>> > > > > > > Can you run groovy in local mode?  Try "-jt local" option.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > >
>> > > > > > > Alex K
>> > > > > > >
>> > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <
>> > > shujamughal@gmail.com
>> > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi Patrick,
>> > > > > > > > Thanks for the explanation. I have supplied the heap size to the
>> > > > > > > > mapper in the following way
>> > > > > > > >
>> > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
>> > > > > > > >
>> > > > > > > > but still same error. Any other idea?
>> > > > > > > > Thanks
>> > > > > > > >
>> > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
>> > > > > patrick@cloudera.com
>> > > > > > > > >wrote:
>> > > > > > > >
>> > > > > > > > > Shuja,
>> > > > > > > > >
>> > > > > > > > > Those settings (mapred.child.java.opts and mapred.child.ulimit) are
>> > > > > > > > > only used for child JVMs that get forked by the TaskTracker. You are
>> > > > > > > > > using Hadoop streaming, which means the TaskTracker is forking a JVM
>> > > > > > > > > for streaming, which is then forking a shell process that runs your
>> > > > > > > > > groovy code (in another JVM).
>> > > > > > > > >
>> > > > > > > > > I'm not much of a groovy expert, but if there's a way you can wrap
>> > > > > > > > > your code around the MapReduce API that would work best. Otherwise,
>> > > > > > > > > you can just pass the heapsize in the '-mapper' argument.
>> > > > > > > > >
>> > > > > > > > > Regards,
>> > > > > > > > >
>> > > > > > > > > - Patrick
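
The forking chain Patrick describes is visible directly on the worker node; a
sketch, assuming pstree is installed, using the pids from the ps listing above:

    pstree -p 25175
    # java(25175)---java(26120)---java(26162)
    #  tasktracker    child task JVM   the JVM the groovy launcher starts

Only the middle java (the child task JVM) honors mapred.child.java.opts; the
last one takes its heap from however the groovy launcher is configured.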
>> > > > > > > > >
>> > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
>> > > > > shujamughal@gmail.com
>> > > > > > >
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Alex,
>> > > > > > > > > >
>> > > > > > > > > > I have updated java to the latest available version on all
>> > > > > > > > > > machines in the cluster and now I run the job by adding this line
>> > > > > > > > > >
>> > > > > > > > > > -D mapred.child.ulimit=3145728 \
>> > > > > > > > > >
>> > > > > > > > > > but still same error. Here is the output of this job.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
>> > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
>> > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
>> > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
>> > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
>> > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
>> > > > > > > > > > org.apache.hadoop.util.RunJar
>> > > > > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
>> > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
>> > > > > > > > > > -inputformat StreamInputFormat -inputreader
>> > > > > > > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
>> > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
>> > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
>> > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
>> > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
>> > > > > > > > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > Any clue?
>> > > > > > > > > > Thanks
>> > > > > > > > > >
>> > > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
>> > > > > alexvk@cloudera.com>
>> > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi Shuja,
>> > > > > > > > > > >
>> > > > > > > > > > > First, thank you for using CDH3.  Can you also check what
>> > > > > > > > > > > *mapred.child.ulimit* you are using?  Try adding
>> > > > > > > > > > > "*-D mapred.child.ulimit=3145728*" to the command line.
>> > > > > > > > > > >
>> > > > > > > > > > > I would also recommend to upgrade java to JDK 1.6 update 8 at
>> > > > > > > > > > > a minimum, which you can download from the Java SE
>> > > > > > > > > > > Homepage<http://java.sun.com/javase/downloads/index.jsp>.
>> > > > > > > > > > >
>> > > > > > > > > > > Let me know how it goes.
>> > > > > > > > > > >
>> > > > > > > > > > > Alex K
>> > > > > > > > > > >
>> > > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
>> > > > > > > > shujamughal@gmail.com
>> > > > > > > > > > > >wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi Alex
>> > > > > > > > > > > >
>> > > > > > > > > > > > Yeah, I am running a job on a cluster of 2 machines, using the
>> > > > > > > > > > > > Cloudera distribution of hadoop, and here is the output of
>> > > > > > > > > > > > this command.
>> > > > > > > > > > > >
>> > > > > > > > > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
>> > > > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
>> > > > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
>> > > > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
>> > > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
>> > > > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
>> > > > > > > > > > > > org.apache.hadoop.util.RunJar
>> > > > > > > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
>> > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
>> > > > > > > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
>> > > > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
>> > > > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
>> > > > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
>> > > > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
>> > > > > > > > > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
>> > > > > > > > > > > >
>> > > > > > > > > > > > ------------------------------------------------------------------------------------------------------------------------------
>> > > > > > > > > > > > and what is meant by OOM? Thanks for helping,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Best Regards
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
>> > > > > > > alexvk@cloudera.com
>> > > > > > > > >
>> > > > > > > > > > > wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Hi Shuja,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > It looks like the OOM is happening in your code.  Are you
>> > > > > > > > > > > > > running MapReduce in a cluster?  If so, can you send the
>> > > > > > > > > > > > > exact command line your code is invoked with -- you can get
>> > > > > > > > > > > > > it with a 'ps -Af | grep Nodemapper5.groovy' command on one
>> > > > > > > > > > > > > of the nodes which is running the task?
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Thanks,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Alex K
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
>> > > > > > > > > > shujamughal@gmail.com
>> > > > > > > > > > > > > >wrote:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > Hi All
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > I am facing a hard problem. I am running a map reduce job
>> > > > > > > > > > > > > > using streaming but it fails and it gives the following error.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
>> > > > > > > > > > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
>> > > > > > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
>> > > > > > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
>> > > > > > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
>> > > > > > > > > > > > > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>> > > > > > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
>> > > > > > > > > > > > > >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>> > > > > > > > > > > > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>> > > > > > > > > > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > I have increased the heap size in hadoop-env.sh and make it
>> > > > > > > > > > > > > > 2000M. Also I tell the job manually by following line.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > but it still gives the error. The same job runs fine if i
>> > > > > > > > > > > > > > run on shell using 1024M heap size like
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Any clue?????????
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Thanks in advance.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > Regards
>> > > > > > > > > > > > > > Shuja-ur-Rehman Baig
>> > > > > > > > > > > > > > _________________________________
>> > > > > > > > > > > > > > MS CS - School of Science and Engineering
>> > > > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
>> > > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
>> > > > > > > > > > > > > > Cell: +92 3214207445
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > --
>> > > > > > > > > > > > Regards
>> > > > > > > > > > > > Shuja-ur-Rehman Baig
>> > > > > > > > > > > > _________________________________
>> > > > > > > > > > > > MS CS - School of Science and Engineering
>> > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
>> > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
>> > > > > > > > > > > > Cell: +92 3214207445
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > --
>> > > > > > > > > > Regards
>> > > > > > > > > > Shuja-ur-Rehman Baig
>> > > > > > > > > > _________________________________
>> > > > > > > > > > MS CS - School of Science and Engineering
>> > > > > > > > > > Lahore University of Management Sciences (LUMS)
>> > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
>> > > > > > > > > > Cell: +92 3214207445
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Regards
>> > > > > > > > Shuja-ur-Rehman Baig
>> > > > > > > > _________________________________
>> > > > > > > > MS CS - School of Science and Engineering
>> > > > > > > > Lahore University of Management Sciences (LUMS)
>> > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
>> > > > > > > > Cell: +92 3214207445
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Regards
>> > > > > > Shuja-ur-Rehman Baig
>> > > > > > _________________________________
>> > > > > > MS CS - School of Science and Engineering
>> > > > > > Lahore University of Management Sciences (LUMS)
>> > > > > > Sector U, DHA, Lahore, 54792, Pakistan
>> > > > > > Cell: +92 3214207445
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Regards
>> > > > Shuja-ur-Rehman Baig
>> > > > _________________________________
>> > > > MS CS - School of Science and Engineering
>> > > > Lahore University of Management Sciences (LUMS)
>> > > > Sector U, DHA, Lahore, 54792, Pakistan
>> > > > Cell: +92 3214207445
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Regards
>> > Shuja-ur-Rehman Baig
>> > _________________________________
>> > MS CS - School of Science and Engineering
>> > Lahore University of Management Sciences (LUMS)
>> > Sector U, DHA, Lahore, 54792, Pakistan
>> > Cell: +92 3214207445
>> >
>>
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
Hi Ted Yu,
Yes. I have a cluster of 2 nodes, and I have configured a task tracker on the
name node as well, so that it also processes the files.
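
For context, a sketch of how that is usually done with the stock start
scripts, assuming the CDH3 layout and hypothetical hostnames: listing the
master itself in conf/slaves makes start-all.sh launch a datanode/tasktracker
on it too.

    $ cat /usr/lib/hadoop-0.20/conf/slaves
    master    # the namenode host doubles as a worker
    slave1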

On Tue, Jul 13, 2010 at 5:49 AM, Ted Yu <yu...@gmail.com> wrote:

> Normally task tracker isn't run on Name node.
> Did you configure otherwise ?
>
> On Mon, Jul 12, 2010 at 3:06 PM, Shuja Rehman <sh...@gmail.com>
> wrote:
>
> > *Master Node output:*
> >
> >   total       used       free     shared    buffers     cached
> > Mem:       2097328     515576    1581752          0      56060     254760
> > -/+ buffers/cache:     204756    1892572
> > Swap:       522104          0     522104
> >
> > *Slave Node output:*
> >  total       used       free     shared    buffers     cached
> > Mem:       1048752     860684     188068          0     148388     570948
> > -/+ buffers/cache:     141348     907404
> > Swap:       522104         40     522064
> >
> > it seems that there is more memory free on the master.
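
A note on reading free here: the row to compare is "-/+ buffers/cache", since
buffers and page cache are reclaimable. Converting those figures from KB:

    Master: 1892572 KB  =  ~1.8 GB available
    Slave :  907404 KB  =  ~0.9 GB available

So the master does have more headroom, but still less than what a child JVM
started with -Xmx2000M could grow to.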
> >
> >
> > On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> > > Maybe you do not have enough available memory on master?  What is the
> > > output
> > > of "*free*" on both nodes?  -- Alex K
> > >
> > > On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com>
> > > wrote:
> > >
> > > > Hi
> > > > I have added following line to my master node mapred-site.xml file
> > > >
> > > > <property>
> > > >    <name>mapred.child.ulimit</name>
> > > >    <value>3145728</value>
> > > >  </property>
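
For reference, mapred.child.ulimit is expressed in kilobytes (it caps the
child process's virtual memory), so this value works out to

    3145728 KB / 1024 / 1024 = 3 GB

comfortably above a 2000 MB heap plus JVM overhead.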
> > > >
> > > > and ran the job again, and wow... the job completed on the 4th attempt. I
> > > > checked at port 50030: Hadoop ran the task 3 times on the master server,
> > > > where it failed, but when it ran on the 2nd node, it succeeded and produced
> > > > the desired result. Why did it fail on the master?
> > > > Thanks
> > > > Shuja
> > > >
> > > >
> > > > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com>
> > > wrote:
> > > >
> > > > > Hmm.  It means your options are not propagated to the nodes.  Can you put
> > > > > *mapred.child.ulimit* in the mapred-site.xml and restart the tasktrackers?
> > > > > I was under the impression that the below should be enough though.  Glad
> > > > > you got it working in local mode.  -- Alex K
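
A sketch of the restart Alex asks for, assuming CDH's packaged init scripts;
run on every node after editing mapred-site.xml:

    $ service hadoop-0.20-tasktracker restart
    # equivalently: /etc/init.d/hadoop-0.20-tasktracker restart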
> > > > >
> > > > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <
> shujamughal@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Alex, I am using putty to connect to the servers, and this is almost
> > > > > > my maximum screen output which I sent. putty does not allow me to
> > > > > > increase the size of the terminal. Is there any other way that I can get
> > > > > > the complete output of ps -aef?
> > > > > >
> > > > > > Now I run the following command and, thank God, it did not fail and
> > > > > > produced the desired output.
> > > > > >
> > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > -jt local \
> > > > > > -inputformat StreamInputFormat \
> > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > > -jobconf mapred.map.tasks=1 \
> > > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > > -output RNC32 \
> > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
> > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > > -file /home/ftpuser1/Nodemapper5.groovy
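
Note that with -jt local the job runs inside the single local client JVM
(Hadoop's LocalJobRunner), so mapred.child.java.opts is never consulted there.
For the distributed case, the equivalent would be to put both settings in
mapred-site.xml on every node (a sketch mirroring the values above) and
restart the tasktrackers:

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024m</value>
    </property>
    <property>
      <name>mapred.child.ulimit</name>
      <value>3145728</value>
    </property>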
> > > > > >
> > > > > >
> > > > > > but when i omit the -jt local, it produces the same error.
> > > > > > Thanks Alex for helping
> > > > > > Regards
> > > > > > Shuja
> > > > > >
> > > > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <
> alexvk@cloudera.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi Shuja,
> > > > > > >
> > > > > > > Java listens to the last xmx, so if you have multiple "-Xmx ..." on the
> > > > > > > command line, the last is valid.  Unfortunately you have truncated
> > > > > > > command lines.  Can you show us the full command line, particularly for
> > > > > > > the process 26162?  This seems to be causing problems.
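
Alex's "last -Xmx wins" point is easy to confirm from the shell; a one-liner,
assuming the groovy launcher passes JAVA_OPTS through to the JVM it starts:

    $ JAVA_OPTS="-Xmx2000m -Xmx256m" groovy -e 'println(Runtime.runtime.maxMemory() >> 20)'
    247    # roughly 256 MB -- the later flag overrode the earlier one

The same rule applies to the task JVMs, so a stray second -Xmx appended by a
wrapper silently overrides mapred.child.java.opts.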
> > > > > > >
> > > > > > > If you are running your cluster on 2 nodes, it may be that the task was
> > > > > > > scheduled on the second node.  Did you run "ps -aef" on the second node
> > > > > > > as well?  You can see the task assignment in the JT web-UI
> > > > > > > (http://jt-name:50030, drill down to tasks).
> > > > > > >
> > > > > > > I suggest you first debug your program in local mode, however (use the
> > > > > > > "*-jt local*" option).  Did you try the "*-D mapred.child.ulimit=3145728*"
> > > > > > > option?  I do not see it on the command line.
> > > > > > >
> > > > > > > Alex K
> > > > > > >
> > > > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <
> > > > shujamughal@gmail.com
> > > > > > > >wrote:
> > > > > > >
> > > > > > > > Hi Alex
> > > > > > > >
> > > > > > > > I have tried with using quotes and also with -jt local, but I get the
> > > > > > > > same heap error. Here is the output of ps -aef:
> > > > > > > >
> > > > > > > > UID        PID  PPID  C STIME TTY          TIME CMD
> > > > > > > > root         1     0  0 04:37 ?        00:00:00 init [3]
> > > > > > > > root         2     1  0 04:37 ?        00:00:00 [migration/0]
> > > > > > > > root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
> > > > > > > > root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
> > > > > > > > root         5     1  0 04:37 ?        00:00:00 [events/0]
> > > > > > > > root         6     1  0 04:37 ?        00:00:00 [khelper]
> > > > > > > > root         7     1  0 04:37 ?        00:00:00 [kthread]
> > > > > > > > root         9     7  0 04:37 ?        00:00:00 [xenwatch]
> > > > > > > > root        10     7  0 04:37 ?        00:00:00 [xenbus]
> > > > > > > > root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
> > > > > > > > root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
> > > > > > > > root        22     7  0 04:37 ?        00:00:00 [khubd]
> > > > > > > > root        24     7  0 04:37 ?        00:00:00 [kseriod]
> > > > > > > > root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
> > > > > > > > root        85     7  0 04:37 ?        00:00:00 [pdflush]
> > > > > > > > root        86     7  0 04:37 ?        00:00:00 [pdflush]
> > > > > > > > root        87     7  0 04:37 ?        00:00:00 [kswapd0]
> > > > > > > > root        88     7  0 04:37 ?        00:00:00 [aio/0]
> > > > > > > > root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
> > > > > > > > root       248     7  0 04:37 ?        00:00:00 [kstriped]
> > > > > > > > root       257     7  0 04:37 ?        00:00:00 [kjournald]
> > > > > > > > root       279     7  0 04:37 ?        00:00:00 [kauditd]
> > > > > > > > root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
> > > > > > > > root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
> > > > > > > > root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
> > > > > > > > root       660     7  0 04:37 ?        00:00:00 [kjournald]
> > > > > > > > root       662     7  0 04:37 ?        00:00:00 [kjournald]
> > > > > > > > root      1032     1  0 04:38 ?        00:00:00 auditd
> > > > > > > > root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
> > > > > > > > root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
> > > > > > > > root      1052     1  0 04:38 ?        00:00:00 klogd -x
> > > > > > > > root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
> > > > > > > > root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
> > > > > > > > dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
> > > > > > > > root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
> > > > > > > > root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
> > > > > > > > root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
> > > > > > > > root      1244     1  0 04:38 ?        00:00:00 pcscd
> > > > > > > > root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
> > > > > > > > root      1295     1  0 04:38 ?        00:00:00 automount
> > > > > > > > root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
> > > > > > > > root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> > > > > > > > root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> > > > > > > > root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting connections
> > > > > > > > smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> > > > > > > > root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice -t exps2
> > > > > > > > root      1410     1  0 04:38 ?        00:00:00 crond
> > > > > > > > xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
> > > > > > > > root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
> > > > > > > > 68        1508     1  0 04:38 ?        00:00:00 hald
> > > > > > > > root      1509  1508  0 04:38 ?        00:00:00 hald-runner
> > > > > > > > root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q never
> > > > > > > > root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> > > > > > > > root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> > > > > > > > root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
> > > > > > > > root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
> > > > > > > > root     21024 21022  0 11:27 pts/0    00:00:00 -bash
> > > > > > > > root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
> > > > > > > > root     21105 21103  0 11:28 pts/1    00:00:00 -bash
> > > > > > > > root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
> > > > > > > > root     21994 21992  0 11:47 pts/2    00:00:00 -bash
> > > > > > > > root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
> > > > > > > > root     22437 22433  0 11:49 pts/3    00:00:00 -bash
> > > > > > > > hadoop   24808     1  0 12:01 ?        00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > > hadoop   24893     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > > hadoop   24988     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > > hadoop   25085     1  0 12:01 ?        00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > > hadoop   25175     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> > > > > > > > root     25925 21994  1 12:06 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> > > > > > > > hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > > > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> > > > > > > > root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef
> > > > > > > >
> > > > > > > >
> > > > > > > > *The command which i am executing is *
> > > > > > > >
> > > > > > > >
> > > > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > > > > -inputformat StreamInputFormat \
> > > > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > > > > -jobconf mapred.map.tasks=1 \
> > > > > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > > > > -output RNC25 \
> > > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
> > > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > > > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > > > > > > -jt local
> > > > > > > >
> > > > > > > > I have noticed that all the hadoop processes show the 2001m heap size
> > > > > > > > which I have set in hadoop-env.sh. And on the command line, I give
> > > > > > > > 2000m in the mapper and 1024m in child.java.opts, but I think these
> > > > > > > > values (1024, 2001) are not in use.
> > > > > > > > secondly the following lines
> > > > > > > >
> > > > > > > > *hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > > > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b*
> > > > > > > >
> > > > > > > > did not appear the first time the job ran. They appear when the job
> > > > > > > > has failed for the first time and then tries to start mapping again.
> > > > > > > > I have one more question: all the hadoop processes (namenode,
> > > > > > > > datanode, tasktracker...) show a 2001m heapsize in the process
> > > > > > > > listing. Does it mean all the processes are using 2001m of memory?
> > > > > > > >
> > > > > > > > Regards
> > > > > > > > Shuja
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <
> > > alexvk@cloudera.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Shuja,
> > > > > > > > >
> > > > > > > > > I think you need to enclose the invocation string in quotes.  Try:
> > > > > > > > >
> > > > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > > > > > > >
> > > > > > > > > Also, it would be nice to see how exactly the groovy is invoked.  Is
> > > > > > > > > groovy started and then gives you OOM, or is the OOM error during
> > > > > > > > > the start?  Can you see the new process with "ps -aef"?
> > > > > > > > >
> > > > > > > > > Can you run groovy in local mode?  Try "-jt local" option.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Alex K
> > > > > > > > >
> > > > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <
> > > > > shujamughal@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Patrick,
> > > > > > > > > > Thanks for the explanation. I have supplied the heap size to the
> > > > > > > > > > mapper in the following way
> > > > > > > > > >
> > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > > > > > > >
> > > > > > > > > > but still same error. Any other idea?
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> > > > > > > patrick@cloudera.com
> > > > > > > > > > >wrote:
> > > > > > > > > >
> > > > > > > > > > > Shuja,
> > > > > > > > > > >
> > > > > > > > > > > Those settings (mapred.child.java.opts and mapred.child.ulimit)
> > > > > > > > > > > are only used for child JVMs that get forked by the TaskTracker.
> > > > > > > > > > > You are using Hadoop streaming, which means the TaskTracker is
> > > > > > > > > > > forking a JVM for streaming, which is then forking a shell
> > > > > > > > > > > process that runs your groovy code (in another JVM).
> > > > > > > > > > >
> > > > > > > > > > > I'm not much of a groovy expert, but if there's a way you can
> > > > > > > > > > > wrap your code around the MapReduce API that would work best.
> > > > > > > > > > > Otherwise, you can just pass the heapsize in the '-mapper'
> > > > > > > > > > > argument.
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > >
> > > > > > > > > > > - Patrick
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> > > > > > > shujamughal@gmail.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Alex,
> > > > > > > > > > > >
> > > > > > > > > > > > I have updated java to the latest available version on all
> > > > > > > > > > > > machines in the cluster and now I run the job by adding this line
> > > > > > > > > > > >
> > > > > > > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > > > > > > >
> > > > > > > > > > > > but still same error. Here is the output of this job.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > > > > > org.apache.hadoop.util.RunJar
> > > > > > > > > > > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > > > > > > > > -inputformat StreamInputFormat -inputreader
> > > > > > > > > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Any clue?
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> > > > > > > alexvk@cloudera.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > Hi Shuja,
> >
> > First, thank you for using CDH3.  Can you also check what
> > *mapred.child.ulimit* you are using?  Try adding
> > "*-D mapred.child.ulimit=3145728*" to the command line.
> >
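> > (For reference, mapred.child.ulimit is expressed in kilobytes, so
> > 3145728 KB = 3 GB of virtual memory -- comfortably above the 2000M heap
> > requested via mapred.child.java.opts.)
> >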
> > I would also recommend upgrading Java to JDK 1.6 update 8 at a minimum,
> > which you can download from the Java SE Homepage
> > <http://java.sun.com/javase/downloads/index.jsp>.
> >
> > Let me know how it goes.
> >
> > Alex K
> >
> > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > >
> > > Hi Alex
> > >
> > > Yeah, I am running a job on a cluster of 2 machines, using the Cloudera
> > > distribution of hadoop, and here is the output of this command:
> > >
> > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc> -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11 -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > >
> > > --------------------------------------------------------------------------
> > > And what is meant by OOM? Thanks for helping.
> > >
> > > Best Regards
> > >
> > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <alexvk@cloudera.com> wrote:
> > >
> > > > Hi Shuja,
> > > >
> > > > It looks like the OOM is happening in your code.  Are you running
> > > > MapReduce in a cluster?  If so, can you send the exact command line your
> > > > code is invoked with -- you can get it with a 'ps -Af | grep
> > > > Nodemapper5.groovy' command on one of the nodes which is running the
> > > > task?
> > > >
> > > > Thanks,
> > > >
> > > > Alex K



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Ted Yu <yu...@gmail.com>.
Normally the task tracker isn't run on the name node.
Did you configure otherwise?
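
(A quick way to check on the master -- just a sketch, assuming the JDK's
jps tool is on the PATH:

jps | grep -i tasktracker

If a TaskTracker shows up there, the master is also acting as a worker node.)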

On Mon, Jul 12, 2010 at 3:06 PM, Shuja Rehman <sh...@gmail.com> wrote:

> *Master Node output:*
>
>              total       used       free     shared    buffers     cached
> Mem:       2097328     515576    1581752          0      56060     254760
> -/+ buffers/cache:     204756    1892572
> Swap:       522104          0     522104
>
> *Slave Node output:*
>
>              total       used       free     shared    buffers     cached
> Mem:       1048752     860684     188068          0     148388     570948
> -/+ buffers/cache:     141348     907404
> Swap:       522104         40     522064
>
> It seems that the master actually has more free memory.
>
> On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Maybe you do not have enough available memory on the master?  What is the
> > output of "*free*" on both nodes?  -- Alex K
> >
> > On Mon, Jul 12, 2010 at 2:08 PM, Shuja Rehman <sh...@gmail.com> wrote:
> >
> > > Hi
> > > I have added the following lines to the mapred-site.xml file on my master node:
> > >
> > > <property>
> > >   <name>mapred.child.ulimit</name>
> > >   <value>3145728</value>
> > > </property>
> > >
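> > > (The child heap limit can be propagated the same way; a minimal sketch --
> > > the -Xmx1024m value below is only an illustration, not what this job uses:
> > >
> > > <property>
> > >   <name>mapred.child.java.opts</name>
> > >   <value>-Xmx1024m</value>
> > > </property>
> > > )
> > >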
> > > and ran the job again, and wow..., the job completed on the 4th attempt.
> > > I checked at port 50030: Hadoop ran the task 3 times on the master server,
> > > where it failed, but when it ran on the 2nd node it succeeded and produced
> > > the desired result. Why did it fail on the master?
> > > Thanks
> > > Shuja
> > >
> > > On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com> wrote:
> > >
> > > > Hmm.  It means your options are not propagated to the nodes.  Can you
> > > > put *mapred.child.ulimit* in the mapred-site.xml and restart the
> > > > tasktrackers?  I was under the impression that the below should be
> > > > enough, though.  Glad you got it working in local mode.  -- Alex K
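> > > >
> > > > (A sketch of that restart, assuming the stock CDH3 init scripts -- the
> > > > service name may differ on your install:
> > > >
> > > > # run on every node that hosts a tasktracker
> > > > /etc/init.d/hadoop-0.20-tasktracker restart
> > > > )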
> > > >
> > > > On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > >
> > > > > Hi Alex, I am using putty to connect to the servers, and what I sent
> > > > > is almost my maximum screen output; putty does not let me increase
> > > > > the size of the terminal. Is there any other way to get the complete
> > > > > output of ps -aef?
> > > > >
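> > > > > (The truncation comes from ps itself when it writes to a narrow
> > > > > terminal; a sketch of a workaround on most Linux ps builds:
> > > > >
> > > > > ps -efww | grep Nodemapper5.groovy    # "ww" = unlimited line width
> > > > > )
> > > > >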
> > > > > Now I ran the following command and, thank God, it did not fail and
> > > > > produced the desired output:
> > > > >
> > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > -D mapred.child.ulimit=3145728 \
> > > > > -jt local \
> > > > > -inputformat StreamInputFormat \
> > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > -jobconf mapred.map.tasks=1 \
> > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > -output RNC32 \
> > > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
> > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > -file /home/ftpuser1/Nodemapper5.groovy
> > > > >
> > > > > but when I omit "-jt local", it produces the same error.
> > > > > Thanks Alex for helping
> > > > > Regards
> > > > > Shuja
> > > > >
> > > > > On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com> wrote:
> > > > >
> > > > > > Hi Shuja,
> > > > > >
> > > > > > Java listens to the last -Xmx, so if you have multiple "-Xmx ..."
> > > > > > options on the command line, the last one wins.  Unfortunately you
> > > > > > have truncated command lines.  Can you show us the full command line,
> > > > > > particularly for process 26162?  This one seems to be causing problems.
> > > > > >
> > > > > > If you are running your cluster on 2 nodes, it may be that the task
> > > > > > was scheduled on the second node.  Did you run "ps -aef" on the second
> > > > > > node as well?  You can see the task assignment in the JT web UI
> > > > > > (http://jt-name:50030, drill down to tasks).
> > > > > >
> > > > > > I suggest you first debug your program in local mode, however (use
> > > > > > the "*-jt local*" option).  Did you try the "*-D mapred.child.ulimit=3145728*"
> > > > > > option?  I do not see it on the command line.
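> > > > > >
> > > > > > (A quick way to see the last-one-wins behavior -- a sketch, assuming
> > > > > > the stock groovy launcher, which passes JAVA_OPTS through to the JVM:
> > > > > >
> > > > > > JAVA_OPTS="-Xmx128m -Xmx512m" groovy -e "println Runtime.runtime.maxMemory()"
> > > > > > # prints a value close to 512MB, not 128MB
> > > > > > )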
> > > > > >
> > > > > > Alex K
> > > > > >
> > > > > > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Alex
> > > > > > >
> > > > > > > I have tried using quotes and also -jt local, but I get the same
> > > > > > > heap error. Here is the output of ps -aef:
> > > > > > >
> > > > > > > UID        PID  PPID  C STIME TTY          TIME CMD
> > > > > > > root         1     0  0 04:37 ?        00:00:00 init [3]
> > > > > > > root         2     1  0 04:37 ?        00:00:00 [migration/0]
> > > > > > > root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
> > > > > > > root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
> > > > > > > root         5     1  0 04:37 ?        00:00:00 [events/0]
> > > > > > > root         6     1  0 04:37 ?        00:00:00 [khelper]
> > > > > > > root         7     1  0 04:37 ?        00:00:00 [kthread]
> > > > > > > root         9     7  0 04:37 ?        00:00:00 [xenwatch]
> > > > > > > root        10     7  0 04:37 ?        00:00:00 [xenbus]
> > > > > > > root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
> > > > > > > root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
> > > > > > > root        22     7  0 04:37 ?        00:00:00 [khubd]
> > > > > > > root        24     7  0 04:37 ?        00:00:00 [kseriod]
> > > > > > > root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
> > > > > > > root        85     7  0 04:37 ?        00:00:00 [pdflush]
> > > > > > > root        86     7  0 04:37 ?        00:00:00 [pdflush]
> > > > > > > root        87     7  0 04:37 ?        00:00:00 [kswapd0]
> > > > > > > root        88     7  0 04:37 ?        00:00:00 [aio/0]
> > > > > > > root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
> > > > > > > root       248     7  0 04:37 ?        00:00:00 [kstriped]
> > > > > > > root       257     7  0 04:37 ?        00:00:00 [kjournald]
> > > > > > > root       279     7  0 04:37 ?        00:00:00 [kauditd]
> > > > > > > root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
> > > > > > > root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
> > > > > > > root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
> > > > > > > root       660     7  0 04:37 ?        00:00:00 [kjournald]
> > > > > > > root       662     7  0 04:37 ?        00:00:00 [kjournald]
> > > > > > > root      1032     1  0 04:38 ?        00:00:00 auditd
> > > > > > > root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
> > > > > > > root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
> > > > > > > root      1052     1  0 04:38 ?        00:00:00 klogd -x
> > > > > > > root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
> > > > > > > root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
> > > > > > > dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
> > > > > > > root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
> > > > > > > root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
> > > > > > > root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
> > > > > > > root      1244     1  0 04:38 ?        00:00:00 pcscd
> > > > > > > root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
> > > > > > > root      1295     1  0 04:38 ?        00:00:00 automount
> > > > > > > root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
> > > > > > > root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
> > > > > > > root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
> > > > > > > root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting connections
> > > > > > > smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> > > > > > > root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice -t exps2
> > > > > > > root      1410     1  0 04:38 ?        00:00:00 crond
> > > > > > > xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
> > > > > > > root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
> > > > > > > 68        1508     1  0 04:38 ?        00:00:00 hald
> > > > > > > root      1509  1508  0 04:38 ?        00:00:00 hald-runner
> > > > > > > root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q never
> > > > > > > root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600 vt100-nav
> > > > > > > root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
> > > > > > > root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
> > > > > > > root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
> > > > > > > root     21024 21022  0 11:27 pts/0    00:00:00 -bash
> > > > > > > root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
> > > > > > > root     21105 21103  0 11:28 pts/1    00:00:00 -bash
> > > > > > > root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
> > > > > > > root     21994 21992  0 11:47 pts/2    00:00:00 -bash
> > > > > > > root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
> > > > > > > root     22437 22433  0 11:49 pts/3    00:00:00 -bash
> > > > > > > hadoop   24808     1  0 12:01 ?        00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > hadoop   24893     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > hadoop   24988     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > hadoop   25085     1  0 12:01 ?        00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
> > > > > > > hadoop   25175     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
> > > > > > > root     25925 21994  1 12:06 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
> > > > > > > hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
> > > > > > > root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef
> > > > > > >
> > > > > > >
> > > > > > > *The command which I am executing is:*
> > > > > > >
> > > > > > > hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > > > > > > -D mapred.child.java.opts=-Xmx1024m \
> > > > > > > -inputformat StreamInputFormat \
> > > > > > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
> > > > > > > -jobconf mapred.map.tasks=1 \
> > > > > > > -jobconf mapred.reduce.tasks=0 \
> > > > > > > -output RNC25 \
> > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy -Xmx2000m" \
> > > > > > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > > > > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > > > > > -jt local
> > > > > > >
> > > > > > > I have noticed that all the hadoop processes show the 2001m heap
> > > > > > > size which I have set in hadoop-env.sh. On the command line I give
> > > > > > > 2000 in the mapper argument and 1024 in mapred.child.java.opts, but
> > > > > > > I think these values (1024, 2000) are not in use. Secondly, the
> > > > > > > following lines
> > > > > > >
> > > > > > > *hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > > > > > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b*
> > > > > > >
> > > > > > > did not appear the first time the job ran; they appeared when the
> > > > > > > job failed for the first time and then tried to start mapping again.
> > > > > > > I have one more question: since all the hadoop processes (namenode,
> > > > > > > datanode, tasktracker...) show a 2001m heap size, does it mean all
> > > > > > > these processes are using 2001m of memory?
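> > > > > > >
> > > > > > > (Side note: the 2001m that ps shows is only the configured ceiling,
> > > > > > > i.e. the -Xmx flag, not measured usage; each JVM grows toward that
> > > > > > > limit only as it actually needs memory.)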
> > > > > > >
> > > > > > > Regards
> > > > > > > Shuja
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <alexvk@cloudera.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > > Hi Shuja,
> > > > > > > >
> > > > > > > > I think you need to enclose the invocation string in quotes.  Try:
> > > > > > > >
> > > > > > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > > > > > >
> > > > > > > > Also, it would be nice to see how exactly groovy is invoked.  Is
> > > > > > > > groovy started and then gives you OOM, or is the OOM error during
> > > > > > > > the start?  Can you see the new process with "ps -aef"?
> > > > > > > >
> > > > > > > > Can you run groovy in local mode?  Try the "-jt local" option.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Alex K
> > > > > > > >
> > > > > > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi Patrick,
> > > > > > > > > Thanks for the explanation. I have supplied the heap size to the
> > > > > > > > > mapper in the following way
> > > > > > > > >
> > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > > > > > >
> > > > > > > > > but still the same error. Any other idea?
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com> wrote:
> > > > > > > > >
> > > > > > > > > > Shuja,
> > > > > > > > > >
> > > > > > > > > > Those settings (mapred.child.java.opts and mapred.child.ulimit)
> > > > > > > > > > are only used for child JVMs that get forked by the TaskTracker.
> > > > > > > > > > You are using Hadoop Streaming, which means the TaskTracker is
> > > > > > > > > > forking a JVM for streaming, which is then forking a shell
> > > > > > > > > > process that runs your groovy code (in another JVM).
> > > > > > > > > >
> > > > > > > > > > I'm not much of a groovy expert, but if there's a way you can
> > > > > > > > > > wrap your code around the MapReduce API that would work best.
> > > > > > > > > > Otherwise, you can just pass the heap size along with the
> > > > > > > > > > '-mapper' argument (one possible variant is sketched below).
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > >
> > > > > > > > > > - Patrick
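> > > > > > > > > >
> > > > > > > > > > (One possible variant of that, sketched and untested -- it
> > > > > > > > > > assumes the stock groovy launch script, which passes the
> > > > > > > > > > JAVA_OPTS environment variable through to the JVM it forks;
> > > > > > > > > > -cmdenv is the streaming flag that sets environment variables
> > > > > > > > > > for the task:
> > > > > > > > > >
> > > > > > > > > > # give the groovy-forked JVM its heap via the task environment
> > > > > > > > > > -cmdenv JAVA_OPTS=-Xmx1024m \
> > > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy \
> > > > > > > > > > )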
> > > > > > > > > >
> > > > > > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <shujamughal@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Alex,
> > > > > > > > > > >
> > > > > > > > > > > I have updated Java to the latest available version on all
> > > > > > > > > > > machines in the cluster, and now I run the job with this
> > > > > > > > > > > line added:
> > > > > > > > > > >
> > > > > > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > > > > > >
> > > > > > > > > > > but still the same error.
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
*Master Node output:*

             total       used       free     shared    buffers     cached
Mem:       2097328     515576    1581752          0      56060     254760
-/+ buffers/cache:     204756    1892572
Swap:       522104          0     522104

*Slave Node output:*

             total       used       free     shared    buffers     cached
Mem:       1048752     860684     188068          0     148388     570948
-/+ buffers/cache:     141348     907404
Swap:       522104         40     522064

So it seems the master actually has more free memory of the two.
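
A note on reading this older free output: the number that matters for new
JVMs is the second figure on the "-/+ buffers/cache" line, since buffers and
page cache are reclaimable. A quick way to pull it out on a node, assuming
procps free as shown above:

free -m | awk '/buffers\/cache/ {print "available (MB): " $4}'

By that measure the master has roughly 1.8 GB available and the slave
roughly 0.9 GB.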


On Tue, Jul 13, 2010 at 2:57 AM, Alex Kozlov <al...@cloudera.com> wrote:

> Maybe you do not have enough available memory on the master?  What is the
> output of "*free*" on both nodes?  -- Alex K
> > > > > > root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd
> -q
> > > > never
> > > > > > root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0
> > > 9600
> > > > > > vt100-nav
> > > > > > root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python
> -tt
> > > > > > /usr/sbin/yum-updatesd
> > > > > > root      1539     1  0 04:38 ?        00:00:00
> > > /usr/libexec/gam_server
> > > > > > root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
> > > > > > root     21024 21022  0 11:27 pts/0    00:00:00 -bash
> > > > > > root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
> > > > > > root     21105 21103  0 11:28 pts/1    00:00:00 -bash
> > > > > > root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
> > > > > > root     21994 21992  0 11:47 pts/2    00:00:00 -bash
> > > > > > root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
> > > > > > root     22437 22433  0 11:49 pts/3    00:00:00 -bash
> > > > > > hadoop   24808     1  0 12:01 ?        00:00:02
> > > > /usr/jdk1.6.0_03/bin/java
> > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
> > > -Dcom.sun.management.jmxremote
> > > > > > -Dhadoop.lo
> > > > > > hadoop   24893     1  0 12:01 ?        00:00:01
> > > > /usr/jdk1.6.0_03/bin/java
> > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
> > > -Dcom.sun.management.jmxremote
> > > > > > -Dhadoop.lo
> > > > > > hadoop   24988     1  0 12:01 ?        00:00:01
> > > > /usr/jdk1.6.0_03/bin/java
> > > > > > -Xmx2001m -Dcom.sun.management.jmxremote
> > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file
> > > > > > > > /home/ftpuser1/Nodemapp
> > > > > > > > > er5.groovy
> > > > > > > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep
> > > > > > Nodemapper5.groovy
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Any clue?
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> > > > alexvk@cloudera.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Shuja,
> > > > > > > > > >
> > > > > > > > > > First, thank you for using CDH3.  Can you also check what
> > m*
> > > > > > > > > > apred.child.ulimit* you are using?  Try adding "*
> > > > > > > > > > -D mapred.child.ulimit=3145728*" to the command line.
> > > > > > > > > >
> > > > > > > > > > I would also recommend to upgrade java to JDK 1.6 update
> 8
> > at
> > > a
> > > > > > > > minimum,
> > > > > > > > > > which you can download from the Java SE
> > > > > > > > > > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > > > > > > > > > .
> > > > > > > > > >
> > > > > > > > > > Let me know how it goes.
> > > > > > > > > >
> > > > > > > > > > Alex K
> > > > > > > > > >
> > > > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > > > > > shujamughal@gmail.com
> > > > > > > > > > >wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Alex
> > > > > > > > > > >
> > > > > > > > > > > Yeah, I am running a job on cluster of 2 machines and
> > using
> > > > > > > Cloudera
> > > > > > > > > > > distribution of hadoop. and here is the output of this
> > > > command.
> > > > > > > > > > >
> > > > > > > > > > > root      5277  5238  3 12:51 pts/2    00:00:00
> > > > > > > > > /usr/jdk1.6.0_03/bin/java
> > > > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib
> > > /hadoop-0.20/logs
> > > > > > > > > > > -Dhadoop.log.file=hadoop.log
> > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > > > > > -Dhadoop.id.str= -Dhado
> > op.root.logger=INFO,console
> > > > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > > > > > /usr/lib/hadoop-0.20/conf:/usr/
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > > > > > > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > > > > > > > > >
> > > > > > > >
> > > > >
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat
> > > > > > StreamInputFormat
> > > > > > > > > > > -inputreader StreamXmlRecordReader,begin=         <mdc
> > > > > > xmlns:HTML="
> > > > > > > > > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > > > > > > > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > > > > > > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf
> > > > > > > mapred.map.tasks=1
> > > > > > > > > > > -jobconf mapred.reduce.tasks=0 -output          RNC11
> > > -mapper
> > > > > > > > > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > > > > > > > > > > home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep
> > > > > > > > Nodemapper5.groovy
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > > > > > and what is meant by OOM and thanks for helping,
> > > > > > > > > > >
> > > > > > > > > > > Best Regards
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > > > > > alexvk@cloudera.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Shuja,
> > > > > > > > > > > >
> > > > > > > > > > > > It looks like the OOM is happening in your code.  Are
> > you
> > > > > > running
> > > > > > > > > > > MapReduce
> > > > > > > > > > > > in a cluster?  If so, can you send the exact command
> > line
> > > > > your
> > > > > > > code
> > > > > > > > > is
> > > > > > > > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > > > > Nodemapper5.groovy'
> > > > > > > > > > > > command on one of the nodes which is running the
> task?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Alex K
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > > > > > shujamughal@gmail.com
> > > > > > > > > > > > >wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi All
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am facing a hard problem. I am running a map
> reduce
> > > job
> > > > > > using
> > > > > > > > > > > streaming
> > > > > > > > > > > > > but it fails and it gives the following error.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > > > >        at
> Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > > > >
> > > > > > > > > > > > > java.lang.RuntimeException:
> > > > PipeMapRed.waitOutputThreads():
> > > > > > > > > > subprocess
> > > > > > > > > > > > > failed with code 1
> > > > > > > > > > > > >        at
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > > > >        at
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > > > > >
> > > > > > > > > > > > >        at
> > > > > > > > > > > >
> > > > > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > > > >        at
> > > > > > > > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > > > >        at
> > > > > > > > > > > > >
> > > > > > > > >
> > > > >
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > > > >        at
> > > > > > > > > > >
> > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > > > > >
> > > > > > > > > > > > >        at
> > > > > > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > > > >        at
> > > > > org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have increased the heap size in hadoop-env.sh and
> > > make
> > > > it
> > > > > > > > 2000M.
> > > > > > > > > > Also
> > > > > > > > > > > I
> > > > > > > > > > > > > tell the job manually by following line.
> > > > > > > > > > > > >
> > > > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > > > > >
> > > > > > > > > > > > > but it still gives the error. The same job runs
> fine
> > if
> > > i
> > > > > run
> > > > > > > on
> > > > > > > > > > shell
> > > > > > > > > > > > > using
> > > > > > > > > > > > > 1024M heap size like
> > > > > > > > > > > > >
> > > > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Any clue?????????
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks in advance.
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Regards
> > > > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > > > _________________________________
> > > > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Regards
> > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > _________________________________
> > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards
> > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > _________________________________
> > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > Cell: +92 3214207445
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Regards
> > > > > > > Shuja-ur-Rehman Baig
> > > > > > > _________________________________
> > > > > > > MS CS - School of Science and Engineering
> > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > Cell: +92 3214207445
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards
> > > > > Shuja-ur-Rehman Baig
> > > > > _________________________________
> > > > > MS CS - School of Science and Engineering
> > > > > Lahore University of Management Sciences (LUMS)
> > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > Cell: +92 3214207445
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
Hi
I have added the following property to the mapred-site.xml file on my master node:

<property>
  <name>mapred.child.ulimit</name>
  <value>3145728</value>
</property>

and ran the job again, and wow... the job completed on the 4th attempt. I
checked the web UI at port 50030: Hadoop ran the task 3 times on the master
server, where it failed each time, but when it ran on the 2nd node it
succeeded and produced the desired result. Why did it fail on the master?
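
For the record, mapred.child.ulimit is given in kilobytes, so 3145728 KB
works out to 3 GB of virtual address space per child process, which leaves
comfortable headroom over a -Xmx1024m heap. A quick way to confirm the limit
actually reaches the task side is to ship a tiny mapper that prints its own
ulimit. This is only a sketch: it assumes /bin/sh supports ulimit -v, and the
script name, input path and output directory below are placeholders, not
taken from this job.

#!/bin/sh
# ulimit-check.sh: print the virtual-memory limit (in KB) that the forked
# streaming task runs under, then drain stdin so the framework does not
# see a broken pipe when the script exits early
ulimit -v
cat > /dev/null

hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
-D mapred.child.ulimit=3145728 \
-input /user/root/some-small-input \
-output ulimit-check-out \
-mapper ulimit-check.sh \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer \
-file ulimit-check.sh

If the setting propagates, each map's output should read 3145728 rather than
the node's default limit.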
Thanks
Shuja


On Tue, Jul 13, 2010 at 1:34 AM, Alex Kozlov <al...@cloudera.com> wrote:

> Hmm.  It means your options are not propagated to the nodes.  Can you put
> *mapred.child.ulimit* in the mapred-site.xml and restart the tasktrackers?
> I was under the impression that passing it with -D on the command line
> should be enough, though.  Glad you got it working in local mode.  -- Alex K



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Alex Kozlov <al...@cloudera.com>.
Hmm.  It means your options are not propagated to the nodes.  Can you put
*mapred.child.ulimit* in the mapred-site.xml and restart the tasktrackers?
I was under the impression that the -D option in the command below should be
enough, though.  Glad you got it working in local mode.  -- Alex K
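
P.S. Concretely, that would be something like the following on each node.
This is a sketch only: the conf path and service name assume the CDH3
packages, so adjust them if you start the daemons with the bin/ scripts
instead.

<!-- add to /etc/hadoop-0.20/conf/mapred-site.xml on every tasktracker -->
<property>
  <name>mapred.child.ulimit</name>
  <value>3145728</value>
</property>

# then restart the tasktracker so it rereads its configuration
service hadoop-0.20-tasktracker restart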

On Mon, Jul 12, 2010 at 1:24 PM, Shuja Rehman <sh...@gmail.com> wrote:

> Hi Alex, I am using PuTTY to connect to the servers, and this is close to
> the maximum screen output I can send.  PuTTY does not allow me to increase
> the size of the terminal.  Is there any other way to get the complete
> output of ps -aef?
>
> Now I ran the following command and, thank God, it did not fail and
> produced the desired output.
>
> hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> -D mapred.child.java.opts=-Xmx1024m \
> -D mapred.child.ulimit=3145728 \
> -jt local \
> -inputformat StreamInputFormat \
> -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
> http://www.w3.org/TR/REC-xml\ <http://www.w3.org/TR/REC-xml%5C>">,end=</mdc>"
> \
> -input
>
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> \
> -jobconf mapred.map.tasks=1 \
> -jobconf mapred.reduce.tasks=0 \
> -output RNC32 \
> -mapper /home/ftpuser1/Nodemapper5.groovy \
> -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> -file /home/ftpuser1/Nodemapper5.groovy
>
>
> But when I omit -jt local, it produces the same error.
> Thanks Alex for helping.
> Regards
> Shuja
>
> On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Hi Shuja,
> >
> > Java listens to the last xmx, so if you have multiple "-Xmx ..." on the
> > command line, the last is valid.  Unfortunately you have truncated
> command
> > lines.  Can you show us the full command line, particularly for the
> process
> > 26162?  This seems to be causing problems.
> >
> > If you are running your cluster on 2 nodes, it may be that the task was
> > scheduled on the second node.  Did you run "ps -aef" on the second node
> as
> > well?  You can see the task assignment in the JT web-UI (
> > http://jt-name:50030, drill down to tasks).
> >
> > I suggest you first debug your program in the local mode first, however
> > (use
> > "*-jt local*" option).  Did you try the "*-D
> mapred.child.ulimit=3145728*"
> > option?  I do not see it on the command line.
> >
> > Alex K
> >
> > On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com
> > >wrote:
> >
> > > Hi Alex
> > >
> > > I have tried with using quotes  and also with -jt local but same heap
> > > error.
> > > and here is the output  of ps -aef
> > >
> > > UID        PID  PPID  C STIME TTY          TIME CMD
> > > root         1     0  0 04:37 ?        00:00:00 init [3]
> > > root         2     1  0 04:37 ?        00:00:00 [migration/0]
> > > root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
> > > root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
> > > root         5     1  0 04:37 ?        00:00:00 [events/0]
> > > root         6     1  0 04:37 ?        00:00:00 [khelper]
> > > root         7     1  0 04:37 ?        00:00:00 [kthread]
> > > root         9     7  0 04:37 ?        00:00:00 [xenwatch]
> > > root        10     7  0 04:37 ?        00:00:00 [xenbus]
> > > root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
> > > root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
> > > root        22     7  0 04:37 ?        00:00:00 [khubd]
> > > root        24     7  0 04:37 ?        00:00:00 [kseriod]
> > > root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
> > > root        85     7  0 04:37 ?        00:00:00 [pdflush]
> > > root        86     7  0 04:37 ?        00:00:00 [pdflush]
> > > root        87     7  0 04:37 ?        00:00:00 [kswapd0]
> > > root        88     7  0 04:37 ?        00:00:00 [aio/0]
> > > root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
> > > root       248     7  0 04:37 ?        00:00:00 [kstriped]
> > > root       257     7  0 04:37 ?        00:00:00 [kjournald]
> > > root       279     7  0 04:37 ?        00:00:00 [kauditd]
> > > root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
> > > root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
> > > root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
> > > root       660     7  0 04:37 ?        00:00:00 [kjournald]
> > > root       662     7  0 04:37 ?        00:00:00 [kjournald]
> > > root      1032     1  0 04:38 ?        00:00:00 auditd
> > > root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
> > > root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
> > > root      1052     1  0 04:38 ?        00:00:00 klogd -x
> > > root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
> > > root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
> > > dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
> > > root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
> > > root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
> > > root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
> > > root      1244     1  0 04:38 ?        00:00:00 pcscd
> > > root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
> > > root      1295     1  0 04:38 ?        00:00:00 automount
> > > root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
> > > root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive
> > -pidfile
> > > /var/run/xinetd.pid
> > > root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd
> > > /etc/vsftpd/vsftpd.conf
> > > root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting
> > > connections
> > > smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue
> runner@01
> > > :00:00
> > > for /var/spool/clientmqueue
> > > root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice
> -t
> > > exps2
> > > root      1410     1  0 04:38 ?        00:00:00 crond
> > > xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
> > > root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
> > > 68        1508     1  0 04:38 ?        00:00:00 hald
> > > root      1509  1508  0 04:38 ?        00:00:00 hald-runner
> > > root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q
> never
> > > root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600
> > > vt100-nav
> > > root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt
> > > /usr/sbin/yum-updatesd
> > > root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
> > > root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
> > > root     21024 21022  0 11:27 pts/0    00:00:00 -bash
> > > root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
> > > root     21105 21103  0 11:28 pts/1    00:00:00 -bash
> > > root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
> > > root     21994 21992  0 11:47 pts/2    00:00:00 -bash
> > > root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
> > > root     22437 22433  0 11:49 pts/3    00:00:00 -bash
> > > hadoop   24808     1  0 12:01 ?        00:00:02
> /usr/jdk1.6.0_03/bin/java
> > > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > > -Dhadoop.lo
> > > hadoop   24893     1  0 12:01 ?        00:00:01
> /usr/jdk1.6.0_03/bin/java
> > > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > > -Dhadoop.lo
> > > hadoop   24988     1  0 12:01 ?        00:00:01
> /usr/jdk1.6.0_03/bin/java
> > > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > > -Dhadoop.lo
> > > hadoop   25085     1  0 12:01 ?        00:00:00
> /usr/jdk1.6.0_03/bin/java
> > > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > > -Dhadoop.lo
> > > hadoop   25175     1  0 12:01 ?        00:00:01
> /usr/jdk1.6.0_03/bin/java
> > > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
> > > -Dhadoop.log.file=hadoo
> > > root     25925 21994  1 12:06 pts/2    00:00:00
> /usr/jdk1.6.0_03/bin/java
> > > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > -Dhadoop.log.file=hadoop.log -
> > > hadoop   26120 25175 14 12:06 ?        00:00:01
> > > /usr/jdk1.6.0_03/jre/bin/java
> > >
> > >
> >
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > hadoop   26162 26120 89 12:06 ?        00:00:05
> /usr/jdk1.6.0_03/bin/java
> > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > > -Dscript.name=/usr/local/groovy/b
> > > root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef
> > >
> > >
> > > *The command which i am executing is *
> > >
> > >
> > > hadoop jar
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> \
> > > -D mapred.child.java.opts=-Xmx1024m \
> > > -inputformat StreamInputFormat \
> > > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > > -input
> > >
> > >
> >
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > \
> > > -jobconf mapred.map.tasks=1 \
> > > -jobconf mapred.reduce.tasks=0 \
> > > -output RNC25 \
> > > -mapper "/home/ftpuser1/Nodemapper5.groovy  -Xmx2000m"\
> > > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > > -file /home/ftpuser1/Nodemapper5.groovy \
> > > -jt local
> > >
> > > I have noticed that the all hadoop processes showing 2001 memory size
> > which
> > > i have set in hadoop-env.sh. and one the command, i give 2000 in mapper
> > and
> > > 1024 in child.java.opts but i think these values(1024,2001) are not in
> > use.
> > > secondly the following lines
> > >
> > > *hadoop   26120 25175 14 12:06 ?        00:00:01
> > > /usr/jdk1.6.0_03/jre/bin/java
> > >
> > >
> >
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > > hadoop   26162 26120 89 12:06 ?        00:00:05
> /usr/jdk1.6.0_03/bin/java
> > > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > > -Dscript.name=/usr/local/groovy/b*
> > >
> > > did not appear for first time when job runs. they appear when job
> failed
> > > for
> > > first time and then again try to start mapping. I have one more
> question
> > > which is as all hadoop processes (namenode, datanode, tasktracker...)
> > > showing 2001 heapsize in process. will it means  all the processes
> using
> > > 2001m of memory??
> > >
> > > Regards
> > > Shuja
> > >
> > >
> > > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com>
> > wrote:
> > >
> > > > Hi Shuja,
> > > >
> > > > I think you need to enclose the invocation string in quotes.  Try:
> > > >
> > > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > > >
> > > > Also, it would be nice to see how exactly the groovy is invoked.  Is
> > > groovy
> > > > started and them gives you OOM or is OOM error during the start?  Can
> > you
> > > > see the new process with "ps -aef"?
> > > >
> > > > Can you run groovy in local mode?  Try "-jt local" option.
> > > >
> > > > Thanks,
> > > >
> > > > Alex K
> > > >
> > > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <shujamughal@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hi Patrick,
> > > > > Thanks for explanation. I have supply the heapsize in mapper in the
> > > > > following way
> > > > >
> > > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > > >
> > > > > but still same error. Any other idea?
> > > > > Thanks
> > > > >
> > > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> > patrick@cloudera.com
> > > > > >wrote:
> > > > >
> > > > > > Shuja,
> > > > > >
> > > > > > Those settings (mapred.child.jvm.opts and mapred.child.ulimit)
> are
> > > only
> > > > > > used
> > > > > > for child JVMs that get forked by the TaskTracker. You are using
> > > Hadoop
> > > > > > streaming, which means the TaskTracker is forking a JVM for
> > > streaming,
> > > > > > which
> > > > > > is then forking a shell process that runs your groovy code (in
> > > another
> > > > > > JVM).
> > > > > >
> > > > > > I'm not much of a groovy expert, but if there's a way you can
> wrap
> > > your
> > > > > > code
> > > > > > around the MapReduce API that would work best. Otherwise, you can
> > > just
> > > > > pass
> > > > > > the heapsize in '-mapper' argument.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > - Patrick
> > > > > >
> > > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> > shujamughal@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Alex,
> > > > > > >
> > > > > > > I have update the java to latest available version on all
> > machines
> > > in
> > > > > the
> > > > > > > cluster and now i run the job by adding this line
> > > > > > >
> > > > > > > -D mapred.child.ulimit=3145728 \
> > > > > > >
> > > > > > > but still same error. Here is the output of this job.
> > > > > > >
> > > > > > >
> > > > > > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > > > -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> > > > > > >
> > > > > > >
> > > > > > > Any clue?
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> > alexvk@cloudera.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Shuja,
> > > > > > > >
> > > > > > > > First, thank you for using CDH3.  Can you also check what m*
> > > > > > > > apred.child.ulimit* you are using?  Try adding "*
> > > > > > > > -D mapred.child.ulimit=3145728*" to the command line.
> > > > > > > >
> > > > > > > > I would also recommend to upgrade java to JDK 1.6 update 8 at
> a
> > > > > > minimum,
> > > > > > > > which you can download from the Java SE
> > > > > > > > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > > > > > > > .
> > > > > > > >
> > > > > > > > Let me know how it goes.
> > > > > > > >
> > > > > > > > Alex K
> > > > > > > >
> > > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > > > shujamughal@gmail.com
> > > > > > > > >wrote:
> > > > > > > >
> > > > > > > > > Hi Alex
> > > > > > > > >
> > > > > > > > > Yeah, I am running a job on cluster of 2 machines and using
> > > > > Cloudera
> > > > > > > > > distribution of hadoop. and here is the output of this
> > command.
> > > > > > > > >
> > > > > > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> > > > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > > > > > > > > ------------------------------------------------------------------------------
> > > > > > > > > and what is meant by OOM and thanks for helping,
> > > > > > > > >
> > > > > > > > > Best Regards
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > > > alexvk@cloudera.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Shuja,
> > > > > > > > > >
> > > > > > > > > > It looks like the OOM is happening in your code.  Are you
> > > > running
> > > > > > > > > MapReduce
> > > > > > > > > > in a cluster?  If so, can you send the exact command line
> > > your
> > > > > code
> > > > > > > is
> > > > > > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > > Nodemapper5.groovy'
> > > > > > > > > > command on one of the nodes which is running the task?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Alex K
> > > > > > > > > >
> > > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > > > shujamughal@gmail.com
> > > > > > > > > > >wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi All
> > > > > > > > > > >
> > > > > > > > > > > I am facing a hard problem. I am running a map reduce
> job
> > > > using
> > > > > > > > > streaming
> > > > > > > > > > > but it fails and it gives the following error.
> > > > > > > > > > >
> > > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> > > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > > >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I have increased the heap size in hadoop-env.sh and
> make
> > it
> > > > > > 2000M.
> > > > > > > > Also
> > > > > > > > > I
> > > > > > > > > > > tell the job manually by following line.
> > > > > > > > > > >
> > > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > > >
> > > > > > > > > > > but it still gives the error. The same job runs fine if
> i
> > > run
> > > > > on
> > > > > > > > shell
> > > > > > > > > > > using
> > > > > > > > > > > 1024M heap size like
> > > > > > > > > > >
> > > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Any clue?????????
> > > > > > > > > > >
> > > > > > > > > > > Thanks in advance.
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Regards
> > > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > > _________________________________
> > > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards
> > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > _________________________________
> > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > Cell: +92 3214207445
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Regards
> > > > > > > Shuja-ur-Rehman Baig
> > > > > > > _________________________________
> > > > > > > MS CS - School of Science and Engineering
> > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > Cell: +92 3214207445
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards
> > > > > Shuja-ur-Rehman Baig
> > > > > _________________________________
> > > > > MS CS - School of Science and Engineering
> > > > > Lahore University of Management Sciences (LUMS)
> > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > Cell: +92 3214207445
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
Hi Alex, I am using PuTTY to connect to the servers, and what I sent is close
to my maximum screen output; PuTTY does not let me enlarge the terminal. Is
there any other way to capture the complete output of ps -aef?
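
For the record, ps itself can print full command lines no matter how narrow
the terminal is. A sketch, assuming the stock procps ps on Linux, where the
doubled "w" lifts the output width limit:

ps auxww | grep '[N]odemapper5'   # the [N] keeps grep from matching itself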

Now I ran the following command and, thank God, it did not fail and produced
the desired output.

hadoop jar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
-D mapred.child.java.opts=-Xmx1024m \
-D mapred.child.ulimit=3145728 \
-jt local \
-inputformat StreamInputFormat \
-inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
-input
/user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
\
-jobconf mapred.map.tasks=1 \
-jobconf mapred.reduce.tasks=0 \
-output RNC32 \
-mapper /home/ftpuser1/Nodemapper5.groovy \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer \
-file /home/ftpuser1/Nodemapper5.groovy


But when I omit -jt local, it produces the same error.
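
Given Patrick's point that streaming forks the Groovy interpreter in a JVM of
its own, one more thing worth trying is handing the heap to that JVM directly
through the environment. A sketch only; it assumes the stock groovy launcher
script, which passes JAVA_OPTS through to java, and the 1024m value is just
illustrative:

-mapper "env JAVA_OPTS=-Xmx1024m /home/ftpuser1/Nodemapper5.groovy" \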
Thanks, Alex, for helping.
Regards
Shuja

On Tue, Jul 13, 2010 at 1:01 AM, Alex Kozlov <al...@cloudera.com> wrote:

> Hi Shuja,
>
> Java listens to the last xmx, so if you have multiple "-Xmx ..." on the
> command line, the last is valid.  Unfortunately you have truncated command
> lines.  Can you show us the full command line, particularly for the process
> 26162?  This seems to be causing problems.
>
> If you are running your cluster on 2 nodes, it may be that the task was
> scheduled on the second node.  Did you run "ps -aef" on the second node as
> well?  You can see the task assignment in the JT web-UI (
> http://jt-name:50030, drill down to tasks).
>
> I suggest you first debug your program in the local mode first, however
> (use
> "*-jt local*" option).  Did you try the "*-D mapred.child.ulimit=3145728*"
> option?  I do not see it on the command line.
>
> Alex K
>
> On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <shujamughal@gmail.com
> >wrote:
>
> > Hi Alex
> >
> > I have tried with using quotes  and also with -jt local but same heap
> > error.
> > and here is the output  of ps -aef
> >
> > UID        PID  PPID  C STIME TTY          TIME CMD
> > root         1     0  0 04:37 ?        00:00:00 init [3]
> > root         2     1  0 04:37 ?        00:00:00 [migration/0]
> > root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
> > root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
> > root         5     1  0 04:37 ?        00:00:00 [events/0]
> > root         6     1  0 04:37 ?        00:00:00 [khelper]
> > root         7     1  0 04:37 ?        00:00:00 [kthread]
> > root         9     7  0 04:37 ?        00:00:00 [xenwatch]
> > root        10     7  0 04:37 ?        00:00:00 [xenbus]
> > root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
> > root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
> > root        22     7  0 04:37 ?        00:00:00 [khubd]
> > root        24     7  0 04:37 ?        00:00:00 [kseriod]
> > root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
> > root        85     7  0 04:37 ?        00:00:00 [pdflush]
> > root        86     7  0 04:37 ?        00:00:00 [pdflush]
> > root        87     7  0 04:37 ?        00:00:00 [kswapd0]
> > root        88     7  0 04:37 ?        00:00:00 [aio/0]
> > root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
> > root       248     7  0 04:37 ?        00:00:00 [kstriped]
> > root       257     7  0 04:37 ?        00:00:00 [kjournald]
> > root       279     7  0 04:37 ?        00:00:00 [kauditd]
> > root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
> > root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
> > root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
> > root       660     7  0 04:37 ?        00:00:00 [kjournald]
> > root       662     7  0 04:37 ?        00:00:00 [kjournald]
> > root      1032     1  0 04:38 ?        00:00:00 auditd
> > root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
> > root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
> > root      1052     1  0 04:38 ?        00:00:00 klogd -x
> > root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
> > root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
> > dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
> > root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
> > root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
> > root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
> > root      1244     1  0 04:38 ?        00:00:00 pcscd
> > root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
> > root      1295     1  0 04:38 ?        00:00:00 automount
> > root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
> > root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive
> -pidfile
> > /var/run/xinetd.pid
> > root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd
> > /etc/vsftpd/vsftpd.conf
> > root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting
> > connections
> > smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> > root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice -t
> > exps2
> > root      1410     1  0 04:38 ?        00:00:00 crond
> > xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
> > root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
> > 68        1508     1  0 04:38 ?        00:00:00 hald
> > root      1509  1508  0 04:38 ?        00:00:00 hald-runner
> > root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q never
> > root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600
> > vt100-nav
> > root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt
> > /usr/sbin/yum-updatesd
> > root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
> > root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
> > root     21024 21022  0 11:27 pts/0    00:00:00 -bash
> > root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
> > root     21105 21103  0 11:28 pts/1    00:00:00 -bash
> > root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
> > root     21994 21992  0 11:47 pts/2    00:00:00 -bash
> > root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
> > root     22437 22433  0 11:49 pts/3    00:00:00 -bash
> > hadoop   24808     1  0 12:01 ?        00:00:02 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > -Dhadoop.lo
> > hadoop   24893     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > -Dhadoop.lo
> > hadoop   24988     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > -Dhadoop.lo
> > hadoop   25085     1  0 12:01 ?        00:00:00 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> > -Dhadoop.lo
> > hadoop   25175     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
> > -Dhadoop.log.file=hadoo
> > root     25925 21994  1 12:06 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > -Dhadoop.log.file=hadoop.log -
> > hadoop   26120 25175 14 12:06 ?        00:00:01
> > /usr/jdk1.6.0_03/jre/bin/java
> >
> >
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java
> > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > -Dscript.name=/usr/local/groovy/b
> > root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef
> >
> >
> > *The command which i am executing is *
> >
> >
> > hadoop jar
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> > -D mapred.child.java.opts=-Xmx1024m \
> > -inputformat StreamInputFormat \
> > -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
> > -input
> >
> >
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > \
> > -jobconf mapred.map.tasks=1 \
> > -jobconf mapred.reduce.tasks=0 \
> > -output RNC25 \
> > -mapper "/home/ftpuser1/Nodemapper5.groovy  -Xmx2000m"\
> > -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> > -file /home/ftpuser1/Nodemapper5.groovy \
> > -jt local
> >
> > I have noticed that the all hadoop processes showing 2001 memory size
> which
> > i have set in hadoop-env.sh. and one the command, i give 2000 in mapper
> and
> > 1024 in child.java.opts but i think these values(1024,2001) are not in
> use.
> > secondly the following lines
> >
> > *hadoop   26120 25175 14 12:06 ?        00:00:01
> > /usr/jdk1.6.0_03/jre/bin/java
> >
> >
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> > hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java
> > -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> > -Dscript.name=/usr/local/groovy/b*
> >
> > did not appear for first time when job runs. they appear when job failed
> > for
> > first time and then again try to start mapping. I have one more question
> > which is as all hadoop processes (namenode, datanode, tasktracker...)
> > showing 2001 heapsize in process. will it means  all the processes using
> > 2001m of memory??
> >
> > Regards
> > Shuja
> >
> >
> > On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> > > Hi Shuja,
> > >
> > > I think you need to enclose the invocation string in quotes.  Try:
> > >
> > > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> > >
> > > Also, it would be nice to see how exactly the groovy is invoked.  Is
> > groovy
> > > started and them gives you OOM or is OOM error during the start?  Can
> you
> > > see the new process with "ps -aef"?
> > >
> > > Can you run groovy in local mode?  Try "-jt local" option.
> > >
> > > Thanks,
> > >
> > > Alex K
> > >
> > > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <sh...@gmail.com>
> > > wrote:
> > >
> > > > Hi Patrick,
> > > > Thanks for explanation. I have supply the heapsize in mapper in the
> > > > following way
> > > >
> > > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > > >
> > > > but still same error. Any other idea?
> > > > Thanks
> > > >
> > > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <
> patrick@cloudera.com
> > > > >wrote:
> > > >
> > > > > Shuja,
> > > > >
> > > > > Those settings (mapred.child.jvm.opts and mapred.child.ulimit) are
> > only
> > > > > used
> > > > > for child JVMs that get forked by the TaskTracker. You are using
> > Hadoop
> > > > > streaming, which means the TaskTracker is forking a JVM for
> > streaming,
> > > > > which
> > > > > is then forking a shell process that runs your groovy code (in
> > another
> > > > > JVM).
> > > > >
> > > > > I'm not much of a groovy expert, but if there's a way you can wrap
> > your
> > > > > code
> > > > > around the MapReduce API that would work best. Otherwise, you can
> > just
> > > > pass
> > > > > the heapsize in '-mapper' argument.
> > > > >
> > > > > Regards,
> > > > >
> > > > > - Patrick
> > > > >
> > > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <
> shujamughal@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Alex,
> > > > > >
> > > > > > I have update the java to latest available version on all
> machines
> > in
> > > > the
> > > > > > cluster and now i run the job by adding this line
> > > > > >
> > > > > > -D mapred.child.ulimit=3145728 \
> > > > > >
> > > > > > but still same error. Here is the output of this job.
> > > > > >
> > > > > >
> > > > > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > > -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> > > > > >
> > > > > >
> > > > > > Any clue?
> > > > > > Thanks
> > > > > >
> > > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <
> alexvk@cloudera.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi Shuja,
> > > > > > >
> > > > > > > First, thank you for using CDH3.  Can you also check what m*
> > > > > > > apred.child.ulimit* you are using?  Try adding "*
> > > > > > > -D mapred.child.ulimit=3145728*" to the command line.
> > > > > > >
> > > > > > > I would also recommend to upgrade java to JDK 1.6 update 8 at a
> > > > > minimum,
> > > > > > > which you can download from the Java SE
> > > > > > > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > > > > > > .
> > > > > > >
> > > > > > > Let me know how it goes.
> > > > > > >
> > > > > > > Alex K
> > > > > > >
> > > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > > shujamughal@gmail.com
> > > > > > > >wrote:
> > > > > > >
> > > > > > > > Hi Alex
> > > > > > > >
> > > > > > > > Yeah, I am running a job on cluster of 2 machines and using
> > > > Cloudera
> > > > > > > > distribution of hadoop. and here is the output of this
> command.
> > > > > > > >
> > > > > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log
> > > > > > > > -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > > > > org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > > > > > > > ------------------------------------------------------------------------------
> > > > > > > > and what is meant by OOM and thanks for helping,
> > > > > > > >
> > > > > > > > Best Regards
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > > alexvk@cloudera.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Shuja,
> > > > > > > > >
> > > > > > > > > It looks like the OOM is happening in your code.  Are you
> > > running
> > > > > > > > MapReduce
> > > > > > > > > in a cluster?  If so, can you send the exact command line
> > your
> > > > code
> > > > > > is
> > > > > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > > > > Nodemapper5.groovy'
> > > > > > > > > command on one of the nodes which is running the task?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Alex K
> > > > > > > > >
> > > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > > shujamughal@gmail.com
> > > > > > > > > >wrote:
> > > > > > > > >
> > > > > > > > > > Hi All
> > > > > > > > > >
> > > > > > > > > > I am facing a hard problem. I am running a map reduce job
> > > using
> > > > > > > > streaming
> > > > > > > > > > but it fails and it gives the following error.
> > > > > > > > > >
> > > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
> > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > > >        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > > >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I have increased the heap size in hadoop-env.sh and make
> it
> > > > > 2000M.
> > > > > > > Also
> > > > > > > > I
> > > > > > > > > > tell the job manually by following line.
> > > > > > > > > >
> > > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > > >
> > > > > > > > > > but it still gives the error. The same job runs fine if i
> > run
> > > > on
> > > > > > > shell
> > > > > > > > > > using
> > > > > > > > > > 1024M heap size like
> > > > > > > > > >
> > > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Any clue?????????
> > > > > > > > > >
> > > > > > > > > > Thanks in advance.
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Regards
> > > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > > _________________________________
> > > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > > Cell: +92 3214207445
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Regards
> > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > _________________________________
> > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > Cell: +92 3214207445
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Regards
> > > > > > Shuja-ur-Rehman Baig
> > > > > > _________________________________
> > > > > > MS CS - School of Science and Engineering
> > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > Cell: +92 3214207445
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards
> > > > Shuja-ur-Rehman Baig
> > > > _________________________________
> > > > MS CS - School of Science and Engineering
> > > > Lahore University of Management Sciences (LUMS)
> > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > Cell: +92 3214207445
> > > >
> > >
> >
> >
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >
>



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Alex Kozlov <al...@cloudera.com>.
Hi Shuja,

Java honors the last -Xmx it sees, so if multiple "-Xmx ..." options appear on
the command line, only the last one takes effect.  Unfortunately the command
lines you pasted are truncated.  Can you show us the full command line,
particularly for process 26162?  That one seems to be causing the problems.
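
A quick way to see which setting actually won is to ask the JVM for the heap
it was granted. A sketch, reusing the groovy launcher already on your box, and
assuming it passes JAVA_OPTS through to java, which the stock script does:

JAVA_OPTS=-Xmx64m groovy -e 'println Runtime.getRuntime().maxMemory()'
# prints roughly 64 MB worth of bytes, regardless of any earlier -Xmx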

If you are running your cluster on 2 nodes, it may be that the task was
scheduled on the second node.  Did you run "ps -aef" on the second node as
well?  You can see the task assignment in the JT web-UI (
http://jt-name:50030, drill down to tasks).

I suggest you debug your program in local mode first, however (use the
"*-jt local*" option).  Did you try the "*-D mapred.child.ulimit=3145728*"
option?  I do not see it on your command line.
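
For context, and worth double-checking against your build's documentation:
mapred.child.ulimit is a virtual-memory cap in kilobytes, inherited by the
processes the task forks, so it has to comfortably exceed the -Xmx you ask
for or the child dies at startup.  A sketch:

hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
  -D mapred.child.java.opts=-Xmx2000M \
  -D mapred.child.ulimit=3145728 \
  ...
# 3145728 KB = 3 GB of virtual memory, safely above the 2000 MB heap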

Alex K

On Mon, Jul 12, 2010 at 12:20 PM, Shuja Rehman <sh...@gmail.com>wrote:

> Hi Alex
>
> I have tried with using quotes  and also with -jt local but same heap
> error.
> and here is the output  of ps -aef
>
> UID        PID  PPID  C STIME TTY          TIME CMD
> root         1     0  0 04:37 ?        00:00:00 init [3]
> root         2     1  0 04:37 ?        00:00:00 [migration/0]
> root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
> root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
> root         5     1  0 04:37 ?        00:00:00 [events/0]
> root         6     1  0 04:37 ?        00:00:00 [khelper]
> root         7     1  0 04:37 ?        00:00:00 [kthread]
> root         9     7  0 04:37 ?        00:00:00 [xenwatch]
> root        10     7  0 04:37 ?        00:00:00 [xenbus]
> root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
> root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
> root        22     7  0 04:37 ?        00:00:00 [khubd]
> root        24     7  0 04:37 ?        00:00:00 [kseriod]
> root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
> root        85     7  0 04:37 ?        00:00:00 [pdflush]
> root        86     7  0 04:37 ?        00:00:00 [pdflush]
> root        87     7  0 04:37 ?        00:00:00 [kswapd0]
> root        88     7  0 04:37 ?        00:00:00 [aio/0]
> root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
> root       248     7  0 04:37 ?        00:00:00 [kstriped]
> root       257     7  0 04:37 ?        00:00:00 [kjournald]
> root       279     7  0 04:37 ?        00:00:00 [kauditd]
> root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
> root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
> root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
> root       660     7  0 04:37 ?        00:00:00 [kjournald]
> root       662     7  0 04:37 ?        00:00:00 [kjournald]
> root      1032     1  0 04:38 ?        00:00:00 auditd
> root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
> root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
> root      1052     1  0 04:38 ?        00:00:00 klogd -x
> root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
> root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
> dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
> root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
> root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
> root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
> root      1244     1  0 04:38 ?        00:00:00 pcscd
> root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
> root      1295     1  0 04:38 ?        00:00:00 automount
> root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
> root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive -pidfile
> /var/run/xinetd.pid
> root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd
> /etc/vsftpd/vsftpd.conf
> root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting
> connections
> smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
> root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice -t
> exps2
> root      1410     1  0 04:38 ?        00:00:00 crond
> xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
> root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
> 68        1508     1  0 04:38 ?        00:00:00 hald
> root      1509  1508  0 04:38 ?        00:00:00 hald-runner
> root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q never
> root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600
> vt100-nav
> root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt
> /usr/sbin/yum-updatesd
> root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
> root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
> root     21024 21022  0 11:27 pts/0    00:00:00 -bash
> root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
> root     21105 21103  0 11:28 pts/1    00:00:00 -bash
> root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
> root     21994 21992  0 11:47 pts/2    00:00:00 -bash
> root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
> root     22437 22433  0 11:49 pts/3    00:00:00 -bash
> hadoop   24808     1  0 12:01 ?        00:00:02 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> -Dhadoop.lo
> hadoop   24893     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> -Dhadoop.lo
> hadoop   24988     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> -Dhadoop.lo
> hadoop   25085     1  0 12:01 ?        00:00:00 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
> -Dhadoop.lo
> hadoop   25175     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
> -Dhadoop.log.file=hadoo
> root     25925 21994  1 12:06 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> -Dhadoop.log.file=hadoop.log -
> hadoop   26120 25175 14 12:06 ?        00:00:01
> /usr/jdk1.6.0_03/jre/bin/java
>
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java
> -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> -Dscript.name=/usr/local/groovy/b
> root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef
>
>
> *The command which i am executing is *
>
>
> hadoop jar
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
> -D mapred.child.java.opts=-Xmx1024m \
> -inputformat StreamInputFormat \
> -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
> http://www.w3.org/TR/REC-xml\ <http://www.w3.org/TR/REC-xml%5C>">,end=</mdc>"
> \
> -input
>
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> \
> -jobconf mapred.map.tasks=1 \
> -jobconf mapred.reduce.tasks=0 \
> -output RNC25 \
> -mapper "/home/ftpuser1/Nodemapper5.groovy  -Xmx2000m"\
> -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
> -file /home/ftpuser1/Nodemapper5.groovy \
> -jt local
>
> I have noticed that all the Hadoop processes show the 2001m heap size that
> I set in hadoop-env.sh. On the command line, I give 2000 in the mapper and
> 1024 in child.java.opts, but I think these values (1024, 2000) are not in
> use. Secondly, the following lines
>
> *hadoop   26120 25175 14 12:06 ?        00:00:01 /usr/jdk1.6.0_03/jre/bin/java
> -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
> hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java
> -classpath /usr/local/groovy/lib/groovy-1.7.3.jar
> -Dscript.name=/usr/local/groovy/b*
>
> did not appear the first time the job ran. They appear after the job fails
> for the first time and then tries to start mapping again. I have one more
> question: since all the Hadoop processes (namenode, datanode,
> tasktracker...) show a 2001m heap size, does it mean that all these
> processes are using 2001m of memory?
>
> Regards
> Shuja
>
>
> On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Hi Shuja,
> >
> > I think you need to enclose the invocation string in quotes.  Try:
> >
> > -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
> >
> > Also, it would be nice to see how exactly groovy is invoked.  Is groovy
> > started and then gives you OOM, or does the OOM occur during startup?  Can
> > you see the new process with "ps -aef"?
> >
> > Can you run groovy in local mode?  Try "-jt local" option.
> >
> > Thanks,
> >
> > Alex K
> >
> > On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <sh...@gmail.com>
> > wrote:
> >
> > > Hi Patrick,
> > > Thanks for the explanation. I have supplied the heap size to the mapper
> > > in the following way
> > >
> > > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> > >
> > > but still the same error. Any other ideas?
> > > Thanks
> > >
> > > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com
> > > >wrote:
> > >
> > > > Shuja,
> > > >
> > > > Those settings (mapred.child.java.opts and mapred.child.ulimit) are
> > > > only used
> > > > for child JVMs that get forked by the TaskTracker. You are using
> Hadoop
> > > > streaming, which means the TaskTracker is forking a JVM for
> streaming,
> > > > which
> > > > is then forking a shell process that runs your groovy code (in
> another
> > > > JVM).
> > > >
> > > > I'm not much of a groovy expert, but if there's a way you can wrap
> your
> > > > code
> > > > around the MapReduce API that would work best. Otherwise, you can
> just
> > > pass
> > > > the heapsize in '-mapper' argument.
> > > >
> > > > Regards,
> > > >
> > > > - Patrick
> > > >
> > > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <shujamughal@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hi Alex,
> > > > >
> > > > > I have updated Java to the latest available version on all machines
> > > > > in the cluster, and now I run the job after adding this line
> > > > >
> > > > > -D mapred.child.ulimit=3145728 \
> > > > >
> > > > > but still the same error. Here is the output of this job.
> > > > >
> > > > >
> > > > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/con
> > > > /usr/lib/hadoop-0.20/con
> > > > >
> > > > >
> > > >
> > >
> >
> f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
> > > > >
> > > > >
> > > >
> > >
> >
> p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
> > > > >
> > > > >
> > > >
> > >
> >
> op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
> > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
> > > > >
> > > > >
> > > >
> > >
> >
> mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
> > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
> > > > >
> > > > >
> > > >
> > >
> >
> .jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
> > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
> > > > >
> > > > >
> > > >
> > >
> >
> 8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
> > > > >
> > > > >
> > > >
> > >
> >
> core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
> > > > >
> > > > >
> > > >
> > >
> >
> b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
> > > > >
> > > > >
> > > >
> > >
> >
> time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
> > > > >
> > > > >
> > > >
> > >
> >
> ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
> > > > >
> > > > >
> > > >
> > >
> >
> op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
> > > > >
> > > > >
> > > >
> > >
> >
> -0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
> > > > >
> > > > >
> > > >
> > >
> >
> 20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
> > > > >
> > > > >
> > > >
> > >
> >
> hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
> > > > >
> > > > >
> > > >
> > >
> >
> 2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
> > > > >
> > > > >
> > > >
> > >
> >
> 20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
> > > > >
> > > > >
> > > >
> > >
> >
> /lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
> > > > > r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > > org.apache.hadoop.util.RunJar
> > > > >
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > -D
> > > > > mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > > -inputformat StreamInputFormat -inputreader
> > > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> > > > >
> > > > >
> > > > > Any clue?
> > > > > Thanks
> > > > >
> > > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com>
> > > > wrote:
> > > > >
> > > > > > Hi Shuja,
> > > > > >
> > > > > > First, thank you for using CDH3.  Can you also check what
> > > > > > *mapred.child.ulimit* you are using?  Try adding
> > > > > > "*-D mapred.child.ulimit=3145728*" to the command line.
> > > > > >
> > > > > > I would also recommend upgrading Java to JDK 1.6 update 8 at a
> > > > > > minimum, which you can download from the Java SE Homepage
> > > > > > <http://java.sun.com/javase/downloads/index.jsp>.
> > > > > >
> > > > > > Let me know how it goes.
> > > > > >
> > > > > > Alex K
> > > > > >
> > > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > > shujamughal@gmail.com
> > > > > > >wrote:
> > > > > >
> > > > > > > Hi Alex
> > > > > > >
> > > > > > > Yeah, I am running a job on a cluster of 2 machines using the
> > > > > > > Cloudera distribution of Hadoop. Here is the output of this command.
> > > > > > >
> > > > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > > /usr/lib/hadoop-0.20/conf:/usr/
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > > > > >
> > > >
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > > Also, what is meant by OOM? Thanks for helping.
> > > > > > >
> > > > > > > Best Regards
> > > > > > >
> > > > > > >
> > > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> > alexvk@cloudera.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Shuja,
> > > > > > > >
> > > > > > > > It looks like the OOM is happening in your code.  Are you
> > running
> > > > > > > MapReduce
> > > > > > > > in a cluster?  If so, can you send the exact command line
> your
> > > code
> > > > > is
> > > > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > > > Nodemapper5.groovy'
> > > > > > > > command on one of the nodes which is running the task?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Alex K
> > > > > > > >
> > > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > > shujamughal@gmail.com
> > > > > > > > >wrote:
> > > > > > > >
> > > > > > > > > Hi All
> > > > > > > > >
> > > > > > > > > I am facing a hard problem. I am running a map reduce job
> > using
> > > > > > > streaming
> > > > > > > > > but it fails and it gives the following error.
> > > > > > > > >
> > > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > > >
> > > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > > > > > subprocess
> > > > > > > > > failed with code 1
> > > > > > > > >        at
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > > >        at
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > > >
> > > > > > > > >        at
> > > > > > > >
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > > >        at
> > > > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > > >        at
> > > > > > > > >
> > > > >
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > > >        at
> > > > > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > > >
> > > > > > > > >        at
> > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > > >        at
> org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I have increased the heap size in hadoop-env.sh and make it
> > > > 2000M.
> > > > > > Also
> > > > > > > I
> > > > > > > > > tell the job manually by following line.
> > > > > > > > >
> > > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > > >
> > > > > > > > > but it still gives the error. The same job runs fine if i
> run
> > > on
> > > > > > shell
> > > > > > > > > using
> > > > > > > > > 1024M heap size like
> > > > > > > > >
> > > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Any clue?????????
> > > > > > > > >
> > > > > > > > > Thanks in advance.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Regards
> > > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > > _________________________________
> > > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > > Cell: +92 3214207445
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Regards
> > > > > > > Shuja-ur-Rehman Baig
> > > > > > > _________________________________
> > > > > > > MS CS - School of Science and Engineering
> > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > Cell: +92 3214207445
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards
> > > > > Shuja-ur-Rehman Baig
> > > > > _________________________________
> > > > > MS CS - School of Science and Engineering
> > > > > Lahore University of Management Sciences (LUMS)
> > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > Cell: +92 3214207445
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
Hi Alex

I have tried using quotes and also the -jt local option, but I get the same
heap error. Here is the output of ps -aef:

UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 04:37 ?        00:00:00 init [3]
root         2     1  0 04:37 ?        00:00:00 [migration/0]
root         3     1  0 04:37 ?        00:00:00 [ksoftirqd/0]
root         4     1  0 04:37 ?        00:00:00 [watchdog/0]
root         5     1  0 04:37 ?        00:00:00 [events/0]
root         6     1  0 04:37 ?        00:00:00 [khelper]
root         7     1  0 04:37 ?        00:00:00 [kthread]
root         9     7  0 04:37 ?        00:00:00 [xenwatch]
root        10     7  0 04:37 ?        00:00:00 [xenbus]
root        17     7  0 04:37 ?        00:00:00 [kblockd/0]
root        18     7  0 04:37 ?        00:00:00 [cqueue/0]
root        22     7  0 04:37 ?        00:00:00 [khubd]
root        24     7  0 04:37 ?        00:00:00 [kseriod]
root        84     7  0 04:37 ?        00:00:00 [khungtaskd]
root        85     7  0 04:37 ?        00:00:00 [pdflush]
root        86     7  0 04:37 ?        00:00:00 [pdflush]
root        87     7  0 04:37 ?        00:00:00 [kswapd0]
root        88     7  0 04:37 ?        00:00:00 [aio/0]
root       229     7  0 04:37 ?        00:00:00 [kpsmoused]
root       248     7  0 04:37 ?        00:00:00 [kstriped]
root       257     7  0 04:37 ?        00:00:00 [kjournald]
root       279     7  0 04:37 ?        00:00:00 [kauditd]
root       307     1  0 04:37 ?        00:00:00 /sbin/udevd -d
root       634     7  0 04:37 ?        00:00:00 [kmpathd/0]
root       635     7  0 04:37 ?        00:00:00 [kmpath_handlerd]
root       660     7  0 04:37 ?        00:00:00 [kjournald]
root       662     7  0 04:37 ?        00:00:00 [kjournald]
root      1032     1  0 04:38 ?        00:00:00 auditd
root      1034  1032  0 04:38 ?        00:00:00 /sbin/audispd
root      1049     1  0 04:38 ?        00:00:00 syslogd -m 0
root      1052     1  0 04:38 ?        00:00:00 klogd -x
root      1090     7  0 04:38 ?        00:00:00 [rpciod/0]
root      1158     1  0 04:38 ?        00:00:00 rpc.idmapd
dbus      1171     1  0 04:38 ?        00:00:00 dbus-daemon --system
root      1184     1  0 04:38 ?        00:00:00 /usr/sbin/hcid
root      1190     1  0 04:38 ?        00:00:00 /usr/sbin/sdpd
root      1210     1  0 04:38 ?        00:00:00 [krfcommd]
root      1244     1  0 04:38 ?        00:00:00 pcscd
root      1264     1  0 04:38 ?        00:00:00 /usr/bin/hidd --server
root      1295     1  0 04:38 ?        00:00:00 automount
root      1314     1  0 04:38 ?        00:00:00 /usr/sbin/sshd
root      1326     1  0 04:38 ?        00:00:00 xinetd -stayalive -pidfile
/var/run/xinetd.pid
root      1337     1  0 04:38 ?        00:00:00 /usr/sbin/vsftpd
/etc/vsftpd/vsftpd.conf
root      1354     1  0 04:38 ?        00:00:00 sendmail: accepting
connections
smmsp     1362     1  0 04:38 ?        00:00:00 sendmail: Queue runner@01:00:00
for /var/spool/clientmqueue
root      1379     1  0 04:38 ?        00:00:00 gpm -m /dev/input/mice -t
exps2
root      1410     1  0 04:38 ?        00:00:00 crond
xfs       1450     1  0 04:38 ?        00:00:00 xfs -droppriv -daemon
root      1482     1  0 04:38 ?        00:00:00 /usr/sbin/atd
68        1508     1  0 04:38 ?        00:00:00 hald
root      1509  1508  0 04:38 ?        00:00:00 hald-runner
root      1533     1  0 04:38 ?        00:00:00 /usr/sbin/smartd -q never
root      1536     1  0 04:38 xvc0     00:00:00 /sbin/agetty xvc0 9600
vt100-nav
root      1537     1  0 04:38 ?        00:00:00 /usr/bin/python -tt
/usr/sbin/yum-updatesd
root      1539     1  0 04:38 ?        00:00:00 /usr/libexec/gam_server
root     21022  1314  0 11:27 ?        00:00:00 sshd: root@pts/0
root     21024 21022  0 11:27 pts/0    00:00:00 -bash
root     21103  1314  0 11:28 ?        00:00:00 sshd: root@pts/1
root     21105 21103  0 11:28 pts/1    00:00:00 -bash
root     21992  1314  0 11:47 ?        00:00:00 sshd: root@pts/2
root     21994 21992  0 11:47 pts/2    00:00:00 -bash
root     22433  1314  0 11:49 ?        00:00:00 sshd: root@pts/3
root     22437 22433  0 11:49 pts/3    00:00:00 -bash
hadoop   24808     1  0 12:01 ?        00:00:02 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dhadoop.lo
hadoop   24893     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dhadoop.lo
hadoop   24988     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dhadoop.lo
hadoop   25085     1  0 12:01 ?        00:00:00 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dhadoop.lo
hadoop   25175     1  0 12:01 ?        00:00:01 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs
-Dhadoop.log.file=hadoo
root     25925 21994  1 12:06 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
-Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
-Dhadoop.log.file=hadoop.log -
hadoop   26120 25175 14 12:06 ?        00:00:01
/usr/jdk1.6.0_03/jre/bin/java
-Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java
-classpath /usr/local/groovy/lib/groovy-1.7.3.jar
-Dscript.name=/usr/local/groovy/b
root     26185 22437  0 12:07 pts/3    00:00:00 ps -aef


*The command which I am executing is:*


hadoop jar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
-D mapred.child.java.opts=-Xmx1024m \
-inputformat StreamInputFormat \
-inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"
http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
-input
/user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
\
-jobconf mapred.map.tasks=1 \
-jobconf mapred.reduce.tasks=0 \
-output RNC25 \
-mapper "/home/ftpuser1/Nodemapper5.groovy  -Xmx2000m"\
-reducer org.apache.hadoop.mapred.lib.IdentityReducer \
-file /home/ftpuser1/Nodemapper5.groovy \
-jt local
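
A note on the -mapper line above: everything after the script name inside the
quotes is handed to Nodemapper5.groovy as a script argument, so -Xmx2000m never
reaches the child JVM (note also the missing space between the closing quote
and the trailing backslash). A minimal sketch of an alternative, assuming the
stock Groovy launcher script, which reads JAVA_OPTS:

-mapper "env JAVA_OPTS=-Xmx2000m /home/ftpuser1/Nodemapper5.groovy" \

env(1) sets the variable and execs the script, so no extra wrapper shell is
needed.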

I have noticed that all the Hadoop processes show the 2001m heap size that I
set in hadoop-env.sh. On the command line, I give 2000 in the mapper and 1024
in child.java.opts, but I think these values (1024, 2000) are not in use.
Secondly, the following lines

*hadoop   26120 25175 14 12:06 ?        00:00:01
/usr/jdk1.6.0_03/jre/bin/java
-Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
hadoop   26162 26120 89 12:06 ?        00:00:05 /usr/jdk1.6.0_03/bin/java
-classpath /usr/local/groovy/lib/groovy-1.7.3.jar
-Dscript.name=/usr/local/groovy/b*

did not appear the first time the job ran. They appear after the job fails
for the first time and then tries to start mapping again. I have one more
question: since all the Hadoop processes (namenode, datanode,
tasktracker...) show a 2001m heap size, does it mean that all these
processes are using 2001m of memory?
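
(For what it's worth, -Xmx is only an upper bound on the heap, not memory in
use. A sketch for checking actual resident memory, assuming the Linux procps
ps:

ps -o pid,rss,args -C java

RSS is the resident set in kilobytes and will typically sit well below the
-Xmx cap.)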

Regards
Shuja


On Mon, Jul 12, 2010 at 8:51 PM, Alex Kozlov <al...@cloudera.com> wrote:

> Hi Shuja,
>
> I think you need to enclose the invocation string in quotes.  Try:
>
> -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"
>
> Also, it would be nice to see how exactly groovy is invoked.  Is groovy
> started and then gives you OOM, or does the OOM occur during startup?  Can
> you see the new process with "ps -aef"?
>
> Can you run groovy in local mode?  Try "-jt local" option.
>
> Thanks,
>
> Alex K
>
> On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <sh...@gmail.com>
> wrote:
>
> > Hi Patrick,
> > Thanks for the explanation. I have supplied the heap size to the mapper
> > in the following way
> >
> > -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
> >
> > but still the same error. Any other ideas?
> > Thanks
> >
> > On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com
> > >wrote:
> >
> > > Shuja,
> > >
> > > Those settings (mapred.child.java.opts and mapred.child.ulimit) are only
> > > used
> > > for child JVMs that get forked by the TaskTracker. You are using Hadoop
> > > streaming, which means the TaskTracker is forking a JVM for streaming,
> > > which
> > > is then forking a shell process that runs your groovy code (in another
> > > JVM).
> > >
> > > I'm not much of a groovy expert, but if there's a way you can wrap your
> > > code
> > > around the MapReduce API that would work best. Otherwise, you can just
> > pass
> > > the heapsize in '-mapper' argument.
> > >
> > > Regards,
> > >
> > > - Patrick
> > >
> > > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <sh...@gmail.com>
> > > wrote:
> > >
> > > > Hi Alex,
> > > >
> > > > I have updated Java to the latest available version on all machines
> > > > in the cluster, and now I run the job after adding this line
> > > >
> > > > -D mapred.child.ulimit=3145728 \
> > > >
> > > > but still the same error. Here is the output of this job.
> > > >
> > > >
> > > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/con
> > > >
> > > >
> > >
> >
> f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
> > > >
> > > >
> > >
> >
> p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
> > > >
> > > >
> > >
> >
> op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
> > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
> > > >
> > > >
> > >
> >
> mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
> > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
> > > >
> > > >
> > >
> >
> .jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
> > > >
> > > >
> > >
> >
> p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
> > > >
> > > >
> > >
> >
> 8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
> > > >
> > > >
> > >
> >
> core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
> > > >
> > > >
> > >
> >
> b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
> > > >
> > > >
> > >
> >
> time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
> > > >
> > > >
> > >
> >
> ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
> > > >
> > > >
> > >
> >
> op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
> > > >
> > > >
> > >
> >
> -0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
> > > >
> > > >
> > >
> >
> 20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
> > > >
> > > >
> > >
> >
> hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
> > > >
> > > >
> > >
> >
> 2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
> > > >
> > > >
> > >
> >
> 20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
> > > >
> > > >
> > >
> >
> /lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
> > > > r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > > org.apache.hadoop.util.RunJar
> > > >
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > -D
> > > > mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > > -inputformat StreamInputFormat -inputreader
> > > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> > > >
> > > >
> > > > Any clue?
> > > > Thanks
> > > >
> > > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com>
> > > wrote:
> > > >
> > > > > Hi Shuja,
> > > > >
> > > > > First, thank you for using CDH3.  Can you also check what
> > > > > *mapred.child.ulimit* you are using?  Try adding
> > > > > "*-D mapred.child.ulimit=3145728*" to the command line.
> > > > >
> > > > > I would also recommend upgrading Java to JDK 1.6 update 8 at a
> > > > > minimum, which you can download from the Java SE Homepage
> > > > > <http://java.sun.com/javase/downloads/index.jsp>.
> > > > >
> > > > > Let me know how it goes.
> > > > >
> > > > > Alex K
> > > > >
> > > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> > shujamughal@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > Hi Alex
> > > > > >
> > > > > > Yeah, I am running a job on a cluster of 2 machines using the
> > > > > > Cloudera distribution of Hadoop. Here is the output of this command.
> > > > > >
> > > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > > /usr/lib/hadoop-0.20/conf:/usr/
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > > > >
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > > Also, what is meant by OOM? Thanks for helping.
> > > > > >
> > > > > > Best Regards
> > > > > >
> > > > > >
> > > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <
> alexvk@cloudera.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi Shuja,
> > > > > > >
> > > > > > > It looks like the OOM is happening in your code.  Are you
> running
> > > > > > MapReduce
> > > > > > > in a cluster?  If so, can you send the exact command line your
> > code
> > > > is
> > > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > > Nodemapper5.groovy'
> > > > > > > command on one of the nodes which is running the task?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Alex K
> > > > > > >
> > > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > > shujamughal@gmail.com
> > > > > > > >wrote:
> > > > > > >
> > > > > > > > Hi All
> > > > > > > >
> > > > > > > > I am facing a hard problem. I am running a map reduce job
> using
> > > > > > streaming
> > > > > > > > but it fails and it gives the following error.
> > > > > > > >
> > > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > > >
> > > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > > > > subprocess
> > > > > > > > failed with code 1
> > > > > > > >        at
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > > >        at
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > > >
> > > > > > > >        at
> > > > > > >
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > > >        at
> > > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > > >        at
> > > > > > > >
> > > > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > > >        at
> > > > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > > >
> > > > > > > >        at
> > org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > > >
> > > > > > > >
> > > > > > > > I have increased the heap size in hadoop-env.sh and make it
> > > 2000M.
> > > > > Also
> > > > > > I
> > > > > > > > tell the job manually by following line.
> > > > > > > >
> > > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > > >
> > > > > > > > but it still gives the error. The same job runs fine if i run
> > on
> > > > > shell
> > > > > > > > using
> > > > > > > > 1024M heap size like
> > > > > > > >
> > > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > > >
> > > > > > > >
> > > > > > > > Any clue?????????
> > > > > > > >
> > > > > > > > Thanks in advance.
> > > > > > > >
> > > > > > > > --
> > > > > > > > Regards
> > > > > > > > Shuja-ur-Rehman Baig
> > > > > > > > _________________________________
> > > > > > > > MS CS - School of Science and Engineering
> > > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > > Cell: +92 3214207445
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Regards
> > > > > > Shuja-ur-Rehman Baig
> > > > > > _________________________________
> > > > > > MS CS - School of Science and Engineering
> > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > Cell: +92 3214207445
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards
> > > > Shuja-ur-Rehman Baig
> > > > _________________________________
> > > > MS CS - School of Science and Engineering
> > > > Lahore University of Management Sciences (LUMS)
> > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > Cell: +92 3214207445
> > > >
> > >
> >
> >
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >
>



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Alex Kozlov <al...@cloudera.com>.
Hi Shuja,

I think you need to enclose the invocation string in quotes.  Try:

-mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"

Also, it would be nice to see how exactly groovy is invoked.  Is groovy
started and then gives you OOM, or does the OOM occur during startup?  Can you
see the new process with "ps -aef"?

Can you run groovy in local mode?  Try "-jt local" option.
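
A sketch for checking which -Xmx the forked mapper JVM actually received (the
grep pattern is an assumption based on the script name used earlier in this
thread):

ps -ef | grep '[N]odemapper5' | tr ' ' '\n' | grep Xmx

The [N] in the pattern keeps grep from matching its own process.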

Thanks,

Alex K

On Mon, Jul 12, 2010 at 6:29 AM, Shuja Rehman <sh...@gmail.com> wrote:

> Hi Patrick,
> Thanks for the explanation. I have supplied the heap size to the mapper in
> the following way
>
> -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \
>
> but still the same error. Any other ideas?
> Thanks
>
> On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <patrick@cloudera.com
> >wrote:
>
> > Shuja,
> >
> > Those settings (mapred.child.java.opts and mapred.child.ulimit) are only
> > used
> > for child JVMs that get forked by the TaskTracker. You are using Hadoop
> > streaming, which means the TaskTracker is forking a JVM for streaming,
> > which
> > is then forking a shell process that runs your groovy code (in another
> > JVM).
> >
> > I'm not much of a groovy expert, but if there's a way you can wrap your
> > code
> > around the MapReduce API that would work best. Otherwise, you can just
> pass
> > the heapsize in '-mapper' argument.
> >
> > Regards,
> >
> > - Patrick
> >
> > On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <sh...@gmail.com>
> > wrote:
> >
> > > Hi Alex,
> > >
> > > I have updated Java to the latest available version on all machines in
> > > the cluster, and now I run the job after adding this line
> > >
> > > -D mapred.child.ulimit=3145728 \
> > >
> > > but still the same error. Here is the output of this job.
> > >
> > >
> > > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/con
> > >
> > >
> >
> f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
> > >
> > >
> >
> p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
> > >
> > >
> >
> op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
> > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
> > >
> > >
> >
> mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
> > >
> > >
> >
> sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
> > >
> > >
> >
> .jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
> > >
> > >
> >
> p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
> > >
> > >
> >
> 8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
> > >
> > >
> >
> core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
> > >
> > >
> >
> b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
> > >
> > >
> >
> time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
> > >
> > >
> >
> ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
> > >
> > >
> >
> op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
> > >
> > >
> >
> -0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
> > >
> > >
> >
> 20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
> > >
> > >
> >
> hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
> > >
> > >
> >
> 2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
> > >
> > >
> >
> 20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
> > >
> > >
> >
> /lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
> > > r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > > org.apache.hadoop.util.RunJar
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> -D
> > > mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > > -inputformat StreamInputFormat -inputreader
> > > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> > >
> > >
> > > Any clue?
> > > Thanks
> > >
> > > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com>
> > wrote:
> > >
> > > > Hi Shuja,
> > > >
> > > > First, thank you for using CDH3.  Can you also check what
> > > > *mapred.child.ulimit* you are using?  Try adding
> > > > "*-D mapred.child.ulimit=3145728*" to the command line.
> > > >
> > > > I would also recommend upgrading Java to JDK 1.6 update 8 at a
> > > > minimum, which you can download from the Java SE Homepage
> > > > <http://java.sun.com/javase/downloads/index.jsp>.
> > > >
> > > > Let me know how it goes.
> > > >
> > > > Alex K
> > > >
> > > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <
> shujamughal@gmail.com
> > > > >wrote:
> > > >
> > > > > Hi Alex
> > > > >
> > > > > Yeah, I am running a job on a cluster of 2 machines using the
> > > > > Cloudera distribution of Hadoop. Here is the output of this command.
> > > > >
> > > > > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > > /usr/lib/hadoop-0.20/conf:/usr/
> > > > >
> > > > >
> > > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > > > >
> > > > >
> > > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > > > >
> > > > >
> > > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > > > >
> > > > >
> > > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > > > >
> > > > >
> > > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > > > >
> > > > >
> > > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > > > >
> > > > >
> > > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > > >
> > > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > > >
> > > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > > >
> > > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > > >
> > > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > > >
> > > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > > >
> > > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > > >
> > > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > > >
> > > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > > >
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > > -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > > > > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > > > > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11
> > > > > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > > Also, what is meant by OOM? Thanks for helping.
> > > > >
> > > > > Best Regards
> > > > >
> > > > >
> > > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <alexvk@cloudera.com
> >
> > > > wrote:
> > > > >
> > > > > > Hi Shuja,
> > > > > >
> > > > > > It looks like the OOM is happening in your code.  Are you running
> > > > > MapReduce
> > > > > > in a cluster?  If so, can you send the exact command line your
> code
> > > is
> > > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > > Nodemapper5.groovy'
> > > > > > command on one of the nodes which is running the task?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Alex K
> > > > > >
> > > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > > shujamughal@gmail.com
> > > > > > >wrote:
> > > > > >
> > > > > > > Hi All
> > > > > > >
> > > > > > > I am facing a hard problem. I am running a map reduce job using
> > > > > streaming
> > > > > > > but it fails and it gives the following error.
> > > > > > >
> > > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > > >
> > > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > > > subprocess
> > > > > > > failed with code 1
> > > > > > >        at
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > > >        at
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > > >
> > > > > > >        at
> > > > > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > > >        at
> > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > > >        at
> > > > > > >
> > > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > > >        at
> > > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > > >
> > > > > > >        at
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > > >
> > > > > > >
> > > > > > > I have increased the heap size in hadoop-env.sh and make it
> > 2000M.
> > > > Also
> > > > > I
> > > > > > > tell the job manually by following line.
> > > > > > >
> > > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > > >
> > > > > > > but it still gives the error. The same job runs fine if i run
> on
> > > > shell
> > > > > > > using
> > > > > > > 1024M heap size like
> > > > > > >
> > > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > > >
> > > > > > >
> > > > > > > Any clue?????????
> > > > > > >
> > > > > > > Thanks in advance.
> > > > > > >
> > > > > > > --
> > > > > > > Regards
> > > > > > > Shuja-ur-Rehman Baig
> > > > > > > _________________________________
> > > > > > > MS CS - School of Science and Engineering
> > > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > > Cell: +92 3214207445
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards
> > > > > Shuja-ur-Rehman Baig
> > > > > _________________________________
> > > > > MS CS - School of Science and Engineering
> > > > > Lahore University of Management Sciences (LUMS)
> > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > Cell: +92 3214207445
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
Hi Patrick,
Thanks for the explanation. I have supplied the heap size to the mapper in the
following way

-mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \

but still the same error. Any other ideas?
Thanks
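
A minimal wrapper-script sketch for getting the heap to the Groovy child JVM
(the file name nodemapper.sh is hypothetical, and it assumes the stock Groovy
launcher, which honors JAVA_OPTS):

#!/bin/sh
# Hand the Groovy-launched JVM a 2000m heap; the groovy startup script
# appends JAVA_OPTS to the java command line it builds.
export JAVA_OPTS=-Xmx2000m
exec /home/ftpuser1/Nodemapper5.groovy "$@"

Ship both files and point the mapper at the wrapper:

-mapper nodemapper.sh \
-file /home/ftpuser1/Nodemapper5.groovy \
-file nodemapper.sh \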

On Mon, Jul 12, 2010 at 6:12 PM, Patrick Angeles <pa...@cloudera.com>wrote:

> Shuja,
>
> Those settings (mapred.child.java.opts and mapred.child.ulimit) are only
> used
> for child JVMs that get forked by the TaskTracker. You are using Hadoop
> streaming, which means the TaskTracker is forking a JVM for streaming,
> which
> is then forking a shell process that runs your groovy code (in another
> JVM).
>
> I'm not much of a groovy expert, but if there's a way you can wrap your
> code
> around the MapReduce API that would work best. Otherwise, you can just pass
> the heapsize in '-mapper' argument.
>
> Regards,
>
> - Patrick
>
> On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <sh...@gmail.com>
> wrote:
>
> > Hi Alex,
> >
> > I have updated Java to the latest available version on all machines in
> > the cluster, and now I run the job after adding this line
> >
> > -D mapred.child.ulimit=3145728 \
> >
> > but still the same error. Here is the output of this job.
> >
> >
> > root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> > -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
> > -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/con
> >
> >
> f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
> >
> >
> p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
> >
> >
> op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
> >
> >
> sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
> >
> >
> mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
> >
> >
> sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
> >
> >
> .jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
> >
> >
> p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
> >
> >
> 8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
> >
> >
> core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
> >
> >
> b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
> >
> >
> time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
> >
> >
> ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
> >
> >
> op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
> >
> >
> -0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
> >
> >
> 20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
> >
> >
> hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
> >
> >
> 2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
> >
> >
> 20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
> >
> >
> /lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
> > r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> > org.apache.hadoop.util.RunJar
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D
> > mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> > -inputformat StreamInputFormat -inputreader
> > StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
> > -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
> > -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14
> > -mapper /home/ftpuser1/Nodemapper5.groovy -reducer
> > org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
> > root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
> >
> >
> > Any clue?
> > Thanks
> >
> > On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> > > Hi Shuja,
> > >
> > > First, thank you for using CDH3.  Can you also check what
> > > mapred.child.ulimit you are using?  Try adding
> > > "-D mapred.child.ulimit=3145728" to the command line.
> > >
> > > I would also recommend to upgrade java to JDK 1.6 update 8 at a
> minimum,
> > > which you can download from the Java SE
> > > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > > .
> > >
> > > Let me know how it goes.
> > >
> > > Alex K
> > >
> > > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com
> > > >wrote:
> > >
> > > > Hi Alex
> > > >
> > > > Yeah, I am running a job on cluster of 2 machines and using Cloudera
> > > > distribution of hadoop. and here is the output of this command.
> > > >
> > > > root      5277  5238  3 12:51 pts/2    00:00:00
> > /usr/jdk1.6.0_03/bin/java
> > > > -Xmx1023m -Dhadoop.log.dir=/usr/lib         /hadoop-0.20/logs
> > > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > > -Dhadoop.id.str= -Dhado         op.root.logger=INFO,console
> > > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > > /usr/lib/hadoop-0.20/conf:/usr/
> > > >
> > > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > > >
> > > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > > >
> > > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > > >
> > > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > > >
> > > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > > >
> > > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > > >
> > > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > > >
> > > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > > >
> > > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > > >
> > > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > > >
> > > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > > >
> > > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > > >
> > > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > > >
> > > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > > >
> > > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > > -2.1.jar org.apache.hadoop.util.RunJar
> > > >
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > > -inputreader StreamXmlRecordReader,begin=         <mdc xmlns:HTML="
> > > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1
> > > > -jobconf mapred.reduce.tasks=0 -output          RNC11 -mapper
> > > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > > > home/ftpuser1/Nodemapper5.groovy
> > > > root      5360  5074  0 12:51 pts/1    00:00:00 grep
> Nodemapper5.groovy
> > > >
> > > >
> > > >
> > > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > > and what is meant by OOM and thanks for helping,
> > > >
> > > > Best Regards
> > > >
> > > >
> > > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <al...@cloudera.com>
> > > wrote:
> > > >
> > > > > Hi Shuja,
> > > > >
> > > > > It looks like the OOM is happening in your code.  Are you running
> > > > MapReduce
> > > > > in a cluster?  If so, can you send the exact command line your code
> > is
> > > > > invoked with -- you can get it with a 'ps -Af | grep
> > > Nodemapper5.groovy'
> > > > > command on one of the nodes which is running the task?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Alex K
> > > > >
> > > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> > shujamughal@gmail.com
> > > > > >wrote:
> > > > >
> > > > > > Hi All
> > > > > >
> > > > > > I am facing a hard problem. I am running a map reduce job using
> > > > streaming
> > > > > > but it fails and it gives the following error.
> > > > > >
> > > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > > >
> > > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > > subprocess
> > > > > > failed with code 1
> > > > > >        at
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > > >        at
> > > > > >
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > > >
> > > > > >        at
> > > > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > > >        at
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > > >        at
> > > > > >
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > > >        at
> > > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > > >
> > > > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > > >
> > > > > >
> > > > > > I have increased the heap size in hadoop-env.sh and make it
> 2000M.
> > > Also
> > > > I
> > > > > > tell the job manually by following line.
> > > > > >
> > > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > > >
> > > > > > but it still gives the error. The same job runs fine if i run on
> > > shell
> > > > > > using
> > > > > > 1024M heap size like
> > > > > >
> > > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > > >
> > > > > >
> > > > > > Any clue?????????
> > > > > >
> > > > > > Thanks in advance.
> > > > > >
> > > > > > --
> > > > > > Regards
> > > > > > Shuja-ur-Rehman Baig
> > > > > > _________________________________
> > > > > > MS CS - School of Science and Engineering
> > > > > > Lahore University of Management Sciences (LUMS)
> > > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > > Cell: +92 3214207445
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards
> > > > Shuja-ur-Rehman Baig
> > > > _________________________________
> > > > MS CS - School of Science and Engineering
> > > > Lahore University of Management Sciences (LUMS)
> > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > Cell: +92 3214207445
> > > >
> > >
> >
> >
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >
>



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Patrick Angeles <pa...@cloudera.com>.
Shuja,

Those settings (mapred.child.java.opts and mapred.child.ulimit) are only used
for the child JVMs that the TaskTracker forks. You are using Hadoop
streaming, which means the TaskTracker forks a JVM for streaming, which
in turn forks a shell process that runs your Groovy code (in yet another JVM).
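
Roughly, the process tree during a streaming task looks like this (a sketch,
not actual ps output):

TaskTracker
  -> child JVM (heap set by mapred.child.java.opts)
       -> sh -c '/home/ftpuser1/Nodemapper5.groovy'
            -> groovy's own JVM (heap NOT set by mapred.child.java.opts)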

I'm not much of a Groovy expert, but if there's a way you can wrap your code
in the MapReduce API, that would work best. Otherwise, you can simply pass
the heap size in the '-mapper' argument -- see the sketch below.
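
If you go the API route, here is a minimal sketch against the old 0.20
MapReduce API. It is purely illustrative: the class name and the key/value
types are assumptions on my part, and the only idea carried over from your
job is that each <mdc>...</mdc> record reaches map() in one piece.

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical stand-in for Nodemapper5.groovy. Because this runs inside
// the TaskTracker's child JVM, mapred.child.java.opts really does govern
// its heap.
public class NodeXmlMapper extends MapReduceBase
    implements Mapper<Text, Text, Text, Text> {

  public void map(Text key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // One <mdc>...</mdc> record is assumed to arrive per call. Parse it
    // with a streaming SAX/StAX parser instead of building a DOM, so a
    // large record never has to fit on the heap in one piece.
    String record = key.toString();
    output.collect(new Text("mdc-record-chars"),
        new Text(Integer.toString(record.length())));
  }
}

If you stay with streaming instead, the standard groovy launcher honors the
JAVA_OPTS environment variable, so something along the lines of

-mapper 'env JAVA_OPTS=-Xmx1024m /home/ftpuser1/Nodemapper5.groovy'

should size the script's own JVM (assuming the script really is started
through the regular groovy launcher and not a custom wrapper).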

Regards,

- Patrick

On Mon, Jul 12, 2010 at 4:32 AM, Shuja Rehman <sh...@gmail.com> wrote:

> Hi Alex,
>
> I have update the java to latest available version on all machines in the
> cluster and now i run the job by adding this line
>
> -D mapred.child.ulimit=3145728 \
>
> but still same error. Here is the output of this job.
>
>
> root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
> -Xmx10 23m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
> -Dhadoop.log.file=hadoop.log -Dha doop.home.dir=/usr/lib/hadoop-0.20
> -Dhadoop.id.str= -Dhadoop.root.logger=INFO,co nsole
> -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/con
>
> f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
>
> p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
>
> op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
>
> sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
>
> mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
>
> sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
>
> .jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
>
> p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
>
> 8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
>
> core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
>
> b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
>
> time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
>
> ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
>
> op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
>
> -0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
>
> 20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
>
> hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
>
> 2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
>
> 20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
>
> /lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
> r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
> org.apache.hadoop.util.RunJar
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D
> mapre d.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
> -inputformat StreamIn putFormat -inputreader
> StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w
> 3.org/TR/REC-xml">,end=</mdc>
> -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1
> -jobconf m apred.reduce.tasks=0 -output RNC14 -mapper
> /home/ftpuser1/Nodemapper5.groovy -re ducer
> org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapp
> er5.groovy
> root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy
>
>
> Any clue?
> Thanks
>
> On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Hi Shuja,
> >
> > First, thank you for using CDH3.  Can you also check what
> > mapred.child.ulimit you are using?  Try adding
> > "-D mapred.child.ulimit=3145728" to the command line.
> >
> > I would also recommend to upgrade java to JDK 1.6 update 8 at a minimum,
> > which you can download from the Java SE
> > Homepage<http://java.sun.com/javase/downloads/index.jsp>
> > .
> >
> > Let me know how it goes.
> >
> > Alex K
> >
> > On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com
> > >wrote:
> >
> > > Hi Alex
> > >
> > > Yeah, I am running a job on cluster of 2 machines and using Cloudera
> > > distribution of hadoop. and here is the output of this command.
> > >
> > > root      5277  5238  3 12:51 pts/2    00:00:00
> /usr/jdk1.6.0_03/bin/java
> > > -Xmx1023m -Dhadoop.log.dir=/usr/lib         /hadoop-0.20/logs
> > > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > > -Dhadoop.id.str= -Dhado         op.root.logger=INFO,console
> > > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > > /usr/lib/hadoop-0.20/conf:/usr/
> > >
> > >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> > >
> > >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> > >
> > >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> > >
> > >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> > >
> > >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> > >
> > >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> > >
> > >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> > >
> > >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> > >
> > >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> > >
> > >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> > >
> > >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> > >
> > >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> > >
> > >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> > >
> > >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> > >
> > >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > > -2.1.jar org.apache.hadoop.util.RunJar
> > > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > > -inputreader StreamXmlRecordReader,begin=         <mdc xmlns:HTML="
> > > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1
> > > -jobconf mapred.reduce.tasks=0 -output          RNC11 -mapper
> > > /home/ftpuser1/Nodemapper5.groovy -reducer
> > > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > > home/ftpuser1/Nodemapper5.groovy
> > > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> > >
> > >
> > >
> > >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > > and what is meant by OOM and thanks for helping,
> > >
> > > Best Regards
> > >
> > >
> > > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <al...@cloudera.com>
> > wrote:
> > >
> > > > Hi Shuja,
> > > >
> > > > It looks like the OOM is happening in your code.  Are you running
> > > MapReduce
> > > > in a cluster?  If so, can you send the exact command line your code
> is
> > > > invoked with -- you can get it with a 'ps -Af | grep
> > Nodemapper5.groovy'
> > > > command on one of the nodes which is running the task?
> > > >
> > > > Thanks,
> > > >
> > > > Alex K
> > > >
> > > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <
> shujamughal@gmail.com
> > > > >wrote:
> > > >
> > > > > Hi All
> > > > >
> > > > > I am facing a hard problem. I am running a map reduce job using
> > > streaming
> > > > > but it fails and it gives the following error.
> > > > >
> > > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > > >
> > > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> > subprocess
> > > > > failed with code 1
> > > > >        at
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > > >        at
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > > >
> > > > >        at
> > > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > > >        at
> > > > >
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > > >        at
> > > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > > >
> > > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > >
> > > > >
> > > > > I have increased the heap size in hadoop-env.sh and make it 2000M.
> > Also
> > > I
> > > > > tell the job manually by following line.
> > > > >
> > > > > -D mapred.child.java.opts=-Xmx2000M \
> > > > >
> > > > > but it still gives the error. The same job runs fine if i run on
> > shell
> > > > > using
> > > > > 1024M heap size like
> > > > >
> > > > > cat file.xml | /root/Nodemapper5.groovy
> > > > >
> > > > >
> > > > > Any clue?????????
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > > > --
> > > > > Regards
> > > > > Shuja-ur-Rehman Baig
> > > > > _________________________________
> > > > > MS CS - School of Science and Engineering
> > > > > Lahore University of Management Sciences (LUMS)
> > > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > > Cell: +92 3214207445
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
Hi Alex,

I have updated Java to the latest available version on all machines in the
cluster, and now I run the job with this line added:

-D mapred.child.ulimit=3145728 \

but I still get the same error. Here is the 'ps' output for this job:


root      7845  5674  3 01:24 pts/1    00:00:00 /usr/jdk1.6.0_03/bin/java
-Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
-Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
-Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
-Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/con
f:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoo
p-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hado
op-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/u
sr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/com
mons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/u
sr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1
.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoo
p-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.
8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-
core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/li
b/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-run
time-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/l
ib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hado
op-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop
-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.
20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/
hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-
2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.
20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr
/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.ja
r:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar
org.apache.hadoop.util.RunJar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
-D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728
-inputformat StreamInputFormat
-inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
-input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
-jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0
-output RNC14 -mapper /home/ftpuser1/Nodemapper5.groovy
-reducer org.apache.hadoop.mapred.lib.IdentityReducer
-file /home/ftpuser1/Nodemapper5.groovy
root      7930  7632  0 01:24 pts/2    00:00:00 grep Nodemapper5.groovy


Any clue?
Thanks

On Sun, Jul 11, 2010 at 3:44 AM, Alex Kozlov <al...@cloudera.com> wrote:

> Hi Shuja,
>
> First, thank you for using CDH3.  Can you also check what
> mapred.child.ulimit you are using?  Try adding
> "-D mapred.child.ulimit=3145728" to the command line.
>
> I would also recommend to upgrade java to JDK 1.6 update 8 at a minimum,
> which you can download from the Java SE
> Homepage<http://java.sun.com/javase/downloads/index.jsp>
> .
>
> Let me know how it goes.
>
> Alex K
>
> On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <shujamughal@gmail.com
> >wrote:
>
> > Hi Alex
> >
> > Yeah, I am running a job on cluster of 2 machines and using Cloudera
> > distribution of hadoop. and here is the output of this command.
> >
> > root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> > -Xmx1023m -Dhadoop.log.dir=/usr/lib         /hadoop-0.20/logs
> > -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> > -Dhadoop.id.str= -Dhado         op.root.logger=INFO,console
> > -Dhadoop.policy.file=hadoop-policy.xml -classpath
> > /usr/lib/hadoop-0.20/conf:/usr/
> >
> >
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
> >
> >
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
> >
> >
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
> >
> >
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
> >
> >
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
> >
> >
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
> >
> >
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
> >
> >
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
> >
> >
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
> >
> >
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
> >
> >
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
> >
> >
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
> >
> >
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
> >
> >
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
> >
> >
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> > -2.1.jar org.apache.hadoop.util.RunJar
> > /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> > -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> > -inputreader StreamXmlRecordReader,begin=         <mdc xmlns:HTML="
> > http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> > /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> > .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1
> > -jobconf mapred.reduce.tasks=0 -output          RNC11 -mapper
> > /home/ftpuser1/Nodemapper5.groovy -reducer
> > org.apache.hadoop.mapred.lib.IdentityReducer -file /
> > home/ftpuser1/Nodemapper5.groovy
> > root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
> >
> >
> >
> >
> ------------------------------------------------------------------------------------------------------------------------------
> > and what is meant by OOM and thanks for helping,
> >
> > Best Regards
> >
> >
> > On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <al...@cloudera.com>
> wrote:
> >
> > > Hi Shuja,
> > >
> > > It looks like the OOM is happening in your code.  Are you running
> > MapReduce
> > > in a cluster?  If so, can you send the exact command line your code is
> > > invoked with -- you can get it with a 'ps -Af | grep
> Nodemapper5.groovy'
> > > command on one of the nodes which is running the task?
> > >
> > > Thanks,
> > >
> > > Alex K
> > >
> > > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamughal@gmail.com
> > > >wrote:
> > >
> > > > Hi All
> > > >
> > > > I am facing a hard problem. I am running a map reduce job using
> > streaming
> > > > but it fails and it gives the following error.
> > > >
> > > > Caught: java.lang.OutOfMemoryError: Java heap space
> > > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > > >
> > > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads():
> subprocess
> > > > failed with code 1
> > > >        at
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > > >        at
> > > >
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > > >
> > > >        at
> > > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > > >        at
> > > > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > > >        at
> > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > > >
> > > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > >
> > > >
> > > > I have increased the heap size in hadoop-env.sh and make it 2000M.
> Also
> > I
> > > > tell the job manually by following line.
> > > >
> > > > -D mapred.child.java.opts=-Xmx2000M \
> > > >
> > > > but it still gives the error. The same job runs fine if i run on
> shell
> > > > using
> > > > 1024M heap size like
> > > >
> > > > cat file.xml | /root/Nodemapper5.groovy
> > > >
> > > >
> > > > Any clue?????????
> > > >
> > > > Thanks in advance.
> > > >
> > > > --
> > > > Regards
> > > > Shuja-ur-Rehman Baig
> > > > _________________________________
> > > > MS CS - School of Science and Engineering
> > > > Lahore University of Management Sciences (LUMS)
> > > > Sector U, DHA, Lahore, 54792, Pakistan
> > > > Cell: +92 3214207445
> > > >
> > >
> >
> >
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >
>



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Alex Kozlov <al...@cloudera.com>.
Hi Shuja,

First, thank you for using CDH3.  Can you also check what
mapred.child.ulimit you are using?  Try adding
"-D mapred.child.ulimit=3145728" to the command line.
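
One note on that value: mapred.child.ulimit is interpreted in kilobytes, so
3145728 KB / 1024 = 3072 MB, i.e. roughly a 3 GB virtual-memory ceiling.
Keep it comfortably above the -Xmx you request (2000M here) plus JVM
overhead; a ulimit smaller than that kills the child before the heap limit
is ever reached.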

I would also recommend upgrading Java to JDK 1.6 update 8 at a minimum,
which you can download from the Java SE Homepage
<http://java.sun.com/javase/downloads/index.jsp>.

Let me know how it goes.

Alex K

On Sat, Jul 10, 2010 at 12:59 PM, Shuja Rehman <sh...@gmail.com>wrote:

> Hi Alex
>
> Yeah, I am running a job on cluster of 2 machines and using Cloudera
> distribution of hadoop. and here is the output of this command.
>
> root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
> -Xmx1023m -Dhadoop.log.dir=/usr/lib         /hadoop-0.20/logs
> -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
> -Dhadoop.id.str= -Dhado         op.root.logger=INFO,console
> -Dhadoop.policy.file=hadoop-policy.xml -classpath
> /usr/lib/hadoop-0.20/conf:/usr/
>
> jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
>
> p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
>
> s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
>
> .0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
>
> r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
>
> ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
>
> adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
>
> n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
>
> ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
>
> /hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
>
> 2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
>
> /log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
>
> a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
>
> sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
>
> .20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
> -2.1.jar org.apache.hadoop.util.RunJar
> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
> -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
> -inputreader StreamXmlRecordReader,begin=         <mdc xmlns:HTML="
> http://www.w3.org/TR/REC-xml">,end=</mdc> -input
> /user/root/RNCDATA/MDFDORKUCRAR02/A20100531
> .0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1
> -jobconf mapred.reduce.tasks=0 -output          RNC11 -mapper
> /home/ftpuser1/Nodemapper5.groovy -reducer
> org.apache.hadoop.mapred.lib.IdentityReducer -file /
> home/ftpuser1/Nodemapper5.groovy
> root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy
>
>
>
> ------------------------------------------------------------------------------------------------------------------------------
> and what is meant by OOM and thanks for helping,
>
> Best Regards
>
>
> On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <al...@cloudera.com> wrote:
>
> > Hi Shuja,
> >
> > It looks like the OOM is happening in your code.  Are you running
> MapReduce
> > in a cluster?  If so, can you send the exact command line your code is
> > invoked with -- you can get it with a 'ps -Af | grep Nodemapper5.groovy'
> > command on one of the nodes which is running the task?
> >
> > Thanks,
> >
> > Alex K
> >
> > On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamughal@gmail.com
> > >wrote:
> >
> > > Hi All
> > >
> > > I am facing a hard problem. I am running a map reduce job using
> streaming
> > > but it fails and it gives the following error.
> > >
> > > Caught: java.lang.OutOfMemoryError: Java heap space
> > >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> > >
> > > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> > > failed with code 1
> > >        at
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> > >        at
> > >
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> > >
> > >        at
> > org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> > >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> > >        at
> > > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> > >        at
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> > >
> > >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > >
> > >
> > > I have increased the heap size in hadoop-env.sh and make it 2000M. Also
> I
> > > tell the job manually by following line.
> > >
> > > -D mapred.child.java.opts=-Xmx2000M \
> > >
> > > but it still gives the error. The same job runs fine if i run on shell
> > > using
> > > 1024M heap size like
> > >
> > > cat file.xml | /root/Nodemapper5.groovy
> > >
> > >
> > > Any clue?????????
> > >
> > > Thanks in advance.
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig
> > > _________________________________
> > > MS CS - School of Science and Engineering
> > > Lahore University of Management Sciences (LUMS)
> > > Sector U, DHA, Lahore, 54792, Pakistan
> > > Cell: +92 3214207445
> > >
> >
>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Shuja Rehman <sh...@gmail.com>.
Hi Alex,

Yeah, I am running the job on a cluster of 2 machines, using the Cloudera
distribution of Hadoop. Here is the output of that command:

root      5277  5238  3 12:51 pts/2    00:00:00 /usr/jdk1.6.0_03/bin/java
-Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs
-Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20
-Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
-Dhadoop.policy.file=hadoop-policy.xml -classpath
/usr/lib/hadoop-0.20/conf:/usr/
jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoo
p-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/common
s-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1
.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.ja
r:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/l
ib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/h
adoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackso
n-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-ru
ntime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib
/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.
2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib
/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-jav
a-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/u
sr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0
.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api
-2.1.jar org.apache.hadoop.util.RunJar
/usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar
-D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat
-inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc>
-input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02
-jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0
-output RNC11 -mapper /home/ftpuser1/Nodemapper5.groovy
-reducer org.apache.hadoop.mapred.lib.IdentityReducer
-file /home/ftpuser1/Nodemapper5.groovy
root      5360  5074  0 12:51 pts/1    00:00:00 grep Nodemapper5.groovy


------------------------------------------------------------------------------------------------------------------------------
And what is meant by OOM? Thanks for helping.

Best Regards


On Sun, Jul 11, 2010 at 12:30 AM, Alex Kozlov <al...@cloudera.com> wrote:

> Hi Shuja,
>
> It looks like the OOM is happening in your code.  Are you running MapReduce
> in a cluster?  If so, can you send the exact command line your code is
> invoked with -- you can get it with a 'ps -Af | grep Nodemapper5.groovy'
> command on one of the nodes which is running the task?
>
> Thanks,
>
> Alex K
>
> On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <shujamughal@gmail.com
> >wrote:
>
> > Hi All
> >
> > I am facing a hard problem. I am running a map reduce job using streaming
> > but it fails and it gives the following error.
> >
> > Caught: java.lang.OutOfMemoryError: Java heap space
> >        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
> >
> > java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> > failed with code 1
> >        at
> >
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
> >        at
> >
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
> >
> >        at
> org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
> >        at
> > org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
> >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> >
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >
> >
> > I have increased the heap size in hadoop-env.sh and make it 2000M. Also I
> > tell the job manually by following line.
> >
> > -D mapred.child.java.opts=-Xmx2000M \
> >
> > but it still gives the error. The same job runs fine if i run on shell
> > using
> > 1024M heap size like
> >
> > cat file.xml | /root/Nodemapper5.groovy
> >
> >
> > Any clue?????????
> >
> > Thanks in advance.
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > _________________________________
> > MS CS - School of Science and Engineering
> > Lahore University of Management Sciences (LUMS)
> > Sector U, DHA, Lahore, 54792, Pakistan
> > Cell: +92 3214207445
> >
>



-- 
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Alex Kozlov <al...@cloudera.com>.
Hi Shuja,

It looks like the OOM (OutOfMemoryError) is happening in your code.  Are you
running MapReduce in a cluster?  If so, can you send the exact command line
your code is invoked with?  You can get it with a 'ps -Af | grep
Nodemapper5.groovy' command on one of the nodes running the task.
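
In that listing you should see two processes of interest: the child JVM the
TaskTracker forked (check which -Xmx it was actually started with) and the
groovy process it spawned for your script, which runs in a JVM of its own.
Whichever of them executes parseXML owns the heap that is overflowing.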

Thanks,

Alex K

On Sat, Jul 10, 2010 at 10:40 AM, Shuja Rehman <sh...@gmail.com>wrote:

> Hi All
>
> I am facing a hard problem. I am running a map reduce job using streaming
> but it fails and it gives the following error.
>
> Caught: java.lang.OutOfMemoryError: Java heap space
>        at Nodemapper5.parseXML(Nodemapper5.groovy:25)
>
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 1
>        at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
>        at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
>
>        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>        at
> org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
>        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
>
> I have increased the heap size in hadoop-env.sh and make it 2000M. Also I
> tell the job manually by following line.
>
> -D mapred.child.java.opts=-Xmx2000M \
>
> but it still gives the error. The same job runs fine if i run on shell
> using
> 1024M heap size like
>
> cat file.xml | /root/Nodemapper5.groovy
>
>
> Any clue?????????
>
> Thanks in advance.
>
> --
> Regards
> Shuja-ur-Rehman Baig
> _________________________________
> MS CS - School of Science and Engineering
> Lahore University of Management Sciences (LUMS)
> Sector U, DHA, Lahore, 54792, Pakistan
> Cell: +92 3214207445
>
