Posted to user@pig.apache.org by Tanton Gibbs <ta...@gmail.com> on 2008/05/23 07:52:06 UTC
Spillable memory manager
I upgraded to Hadoop 0.17 and the latest Pig from svn.
I'm now getting a ton of lines in my log files that say:
2008-05-23 00:49:27,832 INFO
org.apache.pig.impl.util.SpillableMemoryManager: low memory handler
called init = 1441792(1408K) used = 483176072(471851K) committed =
641335296(626304K) max = 954466304(932096K)
In addition, jobs on big files are running very slowly.
Does anyone have any ideas as to what I could have screwed up?
Thanks!
Tanton
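For context on that log line: the four numbers are exactly the fields of java.lang.management.MemoryUsage for the heap, printed the way MemoryUsage.toString() formats them. A minimal sketch (class and method names here are illustrative, not Pig's actual code); the thread below suggests the manager reacts when usage approaches roughly 50% of max:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Shows where the numbers in the SpillableMemoryManager log line come from:
// they are the init/used/committed/max fields of the heap's MemoryUsage.
public class HeapReport {

    // Reproduce the "init = N(NK) used = N(NK) ..." layout of the log line.
    static String format(MemoryUsage u) {
        return String.format(
            "init = %d(%dK) used = %d(%dK) committed = %d(%dK) max = %d(%dK)",
            u.getInit(), u.getInit() / 1024,
            u.getUsed(), u.getUsed() / 1024,
            u.getCommitted(), u.getCommitted() / 1024,
            u.getMax(), u.getMax() / 1024);
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println(format(heap));
        // A manager that reacts around 50% of max would compare 'used' against:
        long threshold = heap.getMax() / 2;
        System.out.println("50% threshold = " + threshold);
    }
}
```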
Re: Spillable memory manager
Posted by Iván de Prado <iv...@gmail.com>.
I have updated to trunk and Hadoop 0.17.0. The memory limit per task is
400 MB. An OutOfMemory exception is thrown in the first reduce. I have
noticed that this Pig script worked with 1 GB of memory per task. What are
the memory requirements for Pig?
Thanks!
Iván de Prado
www.ivanprado.es
2008-05-30 11:21:29,863 INFO
org.apache.pig.impl.util.SpillableMemoryManager: low memory handler
called init = 5439488(5312K) used = 166885368(162973K) committed =
246087680(240320K) max = 279642112(273088K)
2008-05-30 11:21:33,069 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 225822592(220529K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:36,047 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 169349352(165380K) committed = 267780096(261504K) max = 279642112(273088K)
2008-05-30 11:21:39,369 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 267780072(261503K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:44,505 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 255668880(249676K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:51,019 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 265970168(259736K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:58,115 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 266914224(260658K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:01,423 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 223674352(218431K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:05,163 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 258252264(252199K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:41,457 ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce: java.lang.OutOfMemoryError: Java heap space
________________________________________________________________________
Explain:
Logical Plan:
|---LOSort ( BY GENERATE {[FLATTEN PROJECT $1]} )
|---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2],[FLATTEN PROJECT $3],[FLATTEN PROJECT $4],[FLATTEN PROJECT $5]} )
|---LOCogroup ( GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT $1]})]} )
|---LOCogroup ( GENERATE {[PROJECT $2],[*]} )
|---LOEval ( [FILTER BY ([PROJECT $6] == ['1'])] )
|---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
|---LOCogroup ( GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval ( [FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] != ['0']) AND ([PROJECT $6] != ['2']))] )
|---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
|---LOLoad ( file = /user/properazzi/flm/quotas.txt AS wqid )
|---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT $1]})]} )
|---LOCogroup ( GENERATE {[PROJECT $0],[*]} )
|---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
|---LOCogroup ( GENERATE {[*],[*]} )
|---LOEval ( GENERATE {[PROJECT $2],[PROJECT $1]} )
|---LOEval ( [FILTER BY ([PROJECT $6] == ['1'])] )
|---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
|---LOCogroup ( GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval ( [FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] != ['0']) AND ([PROJECT $6] != ['2']))] )
|---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
|---LOLoad ( file = /user/properazzi/flm/quotas.txt AS wqid )
|---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT $1]})]} )
|---LOCogroup ( GENERATE {[PROJECT $2],[*]} )
|---LOEval ( [FILTER BY (([PROJECT $6] == ['3']) OR ([PROJECT $6] == ['4']) OR ([PROJECT $6] == ['5']))] )
|---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
|---LOCogroup ( GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval ( [FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] != ['0']) AND ([PROJECT $6] != ['2']))] )
|---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
|---LOLoad ( file = /user/properazzi/flm/quotas.txt AS wqid )
|---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT $1]})]} )
|---LOCogroup ( GENERATE {[PROJECT $0],[*]} )
|---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
|---LOCogroup ( GENERATE {[*],[*]} )
|---LOEval ( GENERATE {[PROJECT $2],[PROJECT $1]} )
|---LOEval ( [FILTER BY (([PROJECT $6] == ['3']) OR ([PROJECT $6] == ['4']) OR ([PROJECT $6] == ['5']))] )
|---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
|---LOCogroup ( GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval ( [FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] != ['0']) AND ([PROJECT $6] != ['2']))] )
|---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
|---LOLoad ( file = /user/properazzi/flm/quotas.txt AS wqid )
|---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
|---LOCogroup ( GENERATE {[*],[*]} )
|---LOEval ( GENERATE {[PROJECT $2],[PROJECT $5]} )
|---LOLoad ( file = /user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS id,wid,locid,status,proptype,country,sor )
-----------------------------------------------
Physical Plan:
|---POMapreduce
Partition Function: org.apache.pig.backend.hadoop.executionengine.mapreduceExec.SortPartitioner
Map : *
Reduce : Generate(Project(1))
Grouping : Generate(Generate(Project(1)),*)
Input File(s) : /tmp/temp1398936874/tmp-1538794351
Properties :
|---POMapreduce
Map : Composite(*,Generate(Project(1)))
Reduce : Generate(FuncEval(org.apache.pig.impl.builtin.FindQuantiles(Generate(Const(1),Composite(Project(1),Sort(*))))))
Grouping : Generate(Const(all),*)
Input File(s) : /tmp/temp1398936874/tmp-1538794351
Properties :
|---POMapreduce
Map : *****
Reduce : Generate(Project(1),Project(2),Project(3),Project(4),Project(5))
Grouping : Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp-585863913, /tmp/temp1398936874/tmp-536934015, /tmp/temp1398936874/tmp23578316, /tmp/temp1398936874/tmp662497645, /tmp/temp1398936874/tmp582570364
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Combine : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(2),*)
Input File(s) : /tmp/temp1398936874/tmp-1880872512
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce : Composite(Generate(Project(1),Project(2)),Filter: COMP )
Grouping : Generate(Project(1),*)Generate(Project(0),*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump, /user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Combine : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp-1242543041
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) : /tmp/temp1398936874/tmp2015750396
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce : Composite(Generate(Project(1),Project(2)),Filter: COMP ,Generate(Project(2),Project(1)))
Grouping : Generate(Project(1),*)Generate(Project(0),*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump, /user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Combine : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(2),*)
Input File(s) : /tmp/temp1398936874/tmp-1934972255
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce : Composite(Generate(Project(1),Project(2)),Filter: OR )
Grouping : Generate(Project(1),*)Generate(Project(0),*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump, /user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Combine : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce : Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp799024189
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) : /tmp/temp1398936874/tmp1055965366
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce : Composite(Generate(Project(1),Project(2)),Filter: OR ,Generate(Project(2),Project(1)))
Grouping : Generate(Project(1),*)Generate(Project(0),*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump, /user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Generate(Project(2),Project(5)))
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) : /user/properazzi/mc/mc_20080529000002/input/partition_B.dump
Properties : pig.input.splittable:true
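For readers trying to follow the plan: the logical plan above has roughly the shape produced by Pig Latin like the following. This is an illustrative reconstruction of only the first COUNT branch; all alias names are invented, and only the operator structure, file paths, and field positions come from the plan itself.

```pig
-- Loads (fields taken from the plan's AS clauses):
parts  = LOAD '/user/properazzi/mc/mc_20080529000002/input/partition_B.dump'
         AS (id, wid, locid, status, proptype, country, sor);
quotas = LOAD '/user/properazzi/flm/quotas.txt' AS (wqid);

-- FILTER BY ($3 == '2' AND $4 != '0' AND $6 != '0' AND $6 != '2'):
live   = FILTER parts BY status == '2' AND proptype != '0'
                      AND sor != '0' AND sor != '2';

-- COGROUP on wid/wqid, then flatten both sides:
byw    = COGROUP live BY wid, quotas BY wqid;
flat   = FOREACH byw GENERATE FLATTEN($1), FLATTEN($2);

-- Keep sor == '1', group by locid, and count:
ones   = FILTER flat BY $6 == '1';
byloc  = GROUP ones BY $2;
cnt    = FOREACH byloc GENERATE $0, COUNT($1);
```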
On Fri, 30-05-2008 at 20:01 +1000, pi song wrote:
> We've already fixed the memory issue introduced in Pig-85. Could you please
> update to the latest version and try again?
>
> Pi
>
> On Wed, May 28, 2008 at 9:18 AM, pi song <pi...@gmail.com> wrote:
>
> > This might have nothing to do with Hadoop 0.17 but something else that we
> > fixed right after it. I'm investigating. Sorry for inconvenience.
> >
> > FYI,
> > Pi
> >
> >
> > On 5/28/08, Tanton Gibbs <ta...@gmail.com> wrote:
> >>
> >> I think you need to increase the amount of memory you give to java.
> >>
> >> It looks like it is currently set to 256M. I upped mine to 2G. Of
> >> course it depends on how much ram you have available.
> >>
> >> mapred.child.java.opts is the parameter
> >> mine is currently set to 2048M in my hadoop-site.xml file.
> >>
> >> For performance reasons, I upped the io.sort.mb parameter. However,
> >> if this is too close to 50% of the total memory, you will get the
> >> Spillable messages.
> >>
> >> HTH,
> >> Tanton
> >>
> >
> >
Re: Spillable memory manager
Posted by Alan Gates <ga...@YAHOO-INC.COM>.
What's currently at the top of trunk requires hadoop17.jar instead of
hadoop16.jar.
Alan.
Iván de Prado wrote:
> I did ant clean, and tried to recompile with hadoop16.jar. But I got
> these compilation errors:
>
> [javac] Compiling 241 source files to /opt/hd1/pig-trunk/build/classes
> [javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/PigOutputFormat.java:28: cannot find symbol
> [javac] symbol : class FileOutputFormat
> [javac] location: package org.apache.hadoop.mapred
> [javac] import org.apache.hadoop.mapred.FileOutputFormat;
> [javac] ^
> [javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/SortPartitioner.java:26: cannot find symbol
> [javac] symbol : class RawComparator
> [javac] location: package org.apache.hadoop.io
> [javac] import org.apache.hadoop.io.RawComparator;
> [javac] ^
> [javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/SortPartitioner.java:37: cannot find symbol
> [javac] symbol : class RawComparator
> [javac] location: class org.apache.pig.backend.hadoop.executionengine.mapreduceExec.SortPartitioner
> [javac] RawComparator comparator;
> [javac] ^
> [javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/PigOutputFormat.java:48: cannot find symbol
> [javac] symbol : variable FileOutputFormat
> [javac] location: class org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat
> [javac] Path outputDir = FileOutputFormat.getWorkOutputPath(job);
> [javac] ^
> [javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/PigOutputFormat.java:62: cannot find symbol
> [javac] symbol : variable FileOutputFormat
> [javac] location: class org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat
> [javac] String parentName = FileOutputFormat.getOutputPath(job).getName();
>
>
> Is not Pig compatible with Hadoop 0.16 anymore? Did I do something wrong when compiling?
>
> I'm using the revision 661633
>
> Iván
>
> On Fri, 30-05-2008 at 13:06 +0200, Iván de Prado wrote:
>
>> With the latest version I'm getting an error:
>>
>> java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/FileOutputFormat
>> at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat.getRecordWriter(PigOutputFormat.java:48)
>> at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupMapPipe(PigMapReduce.java:257)
>> at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:111)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
>> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.FileOutputFormat
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
>> ... 5 more
>>
>> What did I do wrong? I'm launching pig using bin/pig script. Before the update, it worked.
>>
>> Iván de Prado
>> www.ivanprado.es
>>
>>
>> On Fri, 30-05-2008 at 20:01 +1000, pi song wrote:
>>
>>> We've already fixed the memory issue introduced in Pig-85. Could you please
>>> update to the latest version and try again?
>>>
>>> Pi
>>>
>>> On Wed, May 28, 2008 at 9:18 AM, pi song <pi...@gmail.com> wrote:
>>>
>>>
>>>> This might have nothing to do with Hadoop 0.17 but something else that we
>>>> fixed right after it. I'm investigating. Sorry for inconvenience.
>>>>
>>>> FYI,
>>>> Pi
>>>>
>>>>
>>>> On 5/28/08, Tanton Gibbs <ta...@gmail.com> wrote:
>>>>
>>>>> I think you need to increase the amount of memory you give to java.
>>>>>
>>>>> It looks like it is currently set to 256M. I upped mine to 2G. Of
>>>>> course it depends on how much ram you have available.
>>>>>
>>>>> mapred.child.java.opts is the parameter
>>>>> mine is currently set to 2048M in my hadoop-site.xml file.
>>>>>
>>>>> For performance reasons, I upped the io.sort.mb parameter. However,
>>>>> if this is too close to 50% of the total memory, you will get the
>>>>> Spillable messages.
>>>>>
>>>>> HTH,
>>>>> Tanton
>>>>>
>>>>>
>>>>
>
>
Re: Spillable memory manager
Posted by Iván de Prado <iv...@properazzi.com>.
I did ant clean, and tried to recompile with hadoop16.jar. But I got
these compilation errors:
[javac] Compiling 241 source files to /opt/hd1/pig-trunk/build/classes
[javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/PigOutputFormat.java:28: cannot find symbol
[javac] symbol : class FileOutputFormat
[javac] location: package org.apache.hadoop.mapred
[javac] import org.apache.hadoop.mapred.FileOutputFormat;
[javac] ^
[javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/SortPartitioner.java:26: cannot find symbol
[javac] symbol : class RawComparator
[javac] location: package org.apache.hadoop.io
[javac] import org.apache.hadoop.io.RawComparator;
[javac] ^
[javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/SortPartitioner.java:37: cannot find symbol
[javac] symbol : class RawComparator
[javac] location: class org.apache.pig.backend.hadoop.executionengine.mapreduceExec.SortPartitioner
[javac] RawComparator comparator;
[javac] ^
[javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/PigOutputFormat.java:48: cannot find symbol
[javac] symbol : variable FileOutputFormat
[javac] location: class org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat
[javac] Path outputDir = FileOutputFormat.getWorkOutputPath(job);
[javac] ^
[javac] /opt/hd1/pig-trunk/src/org/apache/pig/backend/hadoop/executionengine/mapreduceExec/PigOutputFormat.java:62: cannot find symbol
[javac] symbol : variable FileOutputFormat
[javac] location: class org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat
[javac] String parentName = FileOutputFormat.getOutputPath(job).getName();
Is Pig no longer compatible with Hadoop 0.16? Did I do something wrong when compiling?
I'm using revision 661633.
Iván
On Fri, 30-05-2008 at 13:06 +0200, Iván de Prado wrote:
> With the latest version I'm getting an error:
>
> java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/FileOutputFormat
> at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat.getRecordWriter(PigOutputFormat.java:48)
> at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupMapPipe(PigMapReduce.java:257)
> at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:111)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.FileOutputFormat
> at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
> ... 5 more
>
> What did I do wrong? I'm launching pig using bin/pig script. Before the update, it worked.
>
> Iván de Prado
> www.ivanprado.es
>
>
> On Fri, 30-05-2008 at 20:01 +1000, pi song wrote:
> > We've already fixed the memory issue introduced in Pig-85. Could you please
> > update to the latest version and try again?
> >
> > Pi
> >
> > On Wed, May 28, 2008 at 9:18 AM, pi song <pi...@gmail.com> wrote:
> >
> > > This might have nothing to do with Hadoop 0.17 but something else that we
> > > fixed right after it. I'm investigating. Sorry for inconvenience.
> > >
> > > FYI,
> > > Pi
> > >
> > >
> > > On 5/28/08, Tanton Gibbs <ta...@gmail.com> wrote:
> > >>
> > >> I think you need to increase the amount of memory you give to java.
> > >>
> > >> It looks like it is currently set to 256M. I upped mine to 2G. Of
> > >> course it depends on how much ram you have available.
> > >>
> > >> mapred.child.java.opts is the parameter
> > >> mine is currently set to 2048M in my hadoop-site.xml file.
> > >>
> > >> For performance reasons, I upped the io.sort.mb parameter. However,
> > >> if this is too close to 50% of the total memory, you will get the
> > >> Spillable messages.
> > >>
> > >> HTH,
> > >> Tanton
> > >>
> > >
> > >
Re: Spillable memory manager
Posted by Iván de Prado <iv...@gmail.com>.
With the latest version I'm getting an error:
java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/FileOutputFormat
at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat.getRecordWriter(PigOutputFormat.java:48)
at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupMapPipe(PigMapReduce.java:257)
at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:111)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.FileOutputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
... 5 more
What did I do wrong? I'm launching Pig using the bin/pig script. It worked before the update.
Iván de Prado
www.ivanprado.es
On Fri, 30-05-2008 at 20:01 +1000, pi song wrote:
> We've already fixed the memory issue introduced in Pig-85. Could you please
> update to the latest version and try again?
>
> Pi
>
> On Wed, May 28, 2008 at 9:18 AM, pi song <pi...@gmail.com> wrote:
>
> > This might have nothing to do with Hadoop 0.17 but something else that we
> > fixed right after it. I'm investigating. Sorry for inconvenience.
> >
> > FYI,
> > Pi
> >
> >
> > On 5/28/08, Tanton Gibbs <ta...@gmail.com> wrote:
> >>
> >> I think you need to increase the amount of memory you give to java.
> >>
> >> It looks like it is currently set to 256M. I upped mine to 2G. Of
> >> course it depends on how much ram you have available.
> >>
> >> mapred.child.java.opts is the parameter
> >> mine is currently set to 2048M in my hadoop-site.xml file.
> >>
> >> For performance reasons, I upped the io.sort.mb parameter. However,
> >> if this is too close to 50% of the total memory, you will get the
> >> Spillable messages.
> >>
> >> HTH,
> >> Tanton
> >>
> >
> >
Re: Spillable memory manager
Posted by pi song <pi...@gmail.com>.
We've already fixed the memory issue introduced in PIG-85. Could you please
update to the latest version and try again?
Pi
On Wed, May 28, 2008 at 9:18 AM, pi song <pi...@gmail.com> wrote:
> This might have nothing to do with Hadoop 0.17 but something else that we
> fixed right after it. I'm investigating. Sorry for inconvenience.
>
> FYI,
> Pi
>
>
> On 5/28/08, Tanton Gibbs <ta...@gmail.com> wrote:
>>
>> I think you need to increase the amount of memory you give to java.
>>
>> It looks like it is currently set to 256M. I upped mine to 2G. Of
>> course it depends on how much ram you have available.
>>
>> mapred.child.java.opts is the parameter
>> mine is currently set to 2048M in my hadoop-site.xml file.
>>
>> For performance reasons, I upped the io.sort.mb parameter. However,
>> if this is too close to 50% of the total memory, you will get the
>> Spillable messages.
>>
>> HTH,
>> Tanton
>>
>
>
Re: Spillable memory manager
Posted by pi song <pi...@gmail.com>.
This might have nothing to do with Hadoop 0.17 but with something else that we
fixed right after it. I'm investigating. Sorry for the inconvenience.
FYI,
Pi
On 5/28/08, Tanton Gibbs <ta...@gmail.com> wrote:
>
> I think you need to increase the amount of memory you give to java.
>
> It looks like it is currently set to 256M. I upped mine to 2G. Of
> course it depends on how much ram you have available.
>
> mapred.child.java.opts is the parameter
> mine is currently set to 2048M in my hadoop-site.xml file.
>
> For performance reasons, I upped the io.sort.mb parameter. However,
> if this is too close to 50% of the total memory, you will get the
> Spillable messages.
>
> HTH,
> Tanton
>
Re: Spillable memory manager
Posted by Tanton Gibbs <ta...@gmail.com>.
I think you need to increase the amount of memory you give to Java.
It looks like it is currently set to 256M; I upped mine to 2G. Of
course, it depends on how much RAM you have available.
The parameter is mapred.child.java.opts; mine is currently set to
2048M in my hadoop-site.xml file.
For performance reasons, I upped the io.sort.mb parameter. However,
if this is too close to 50% of the total memory, you will get the
Spillable messages.
HTH,
Tanton
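Concretely, the two settings described above would look something like this in hadoop-site.xml. The -Xmx form and the io.sort.mb value of 256 are illustrative assumptions; only the 2048M figure and the 50%-of-heap rule of thumb come from this message:

```xml
<!-- hadoop-site.xml sketch of the settings described above.
     mapred.child.java.opts takes JVM flags; -Xmx2048M corresponds to
     the "2048M" mentioned here. The io.sort.mb value is illustrative:
     keep it well below 50% of the heap, or the SpillableMemoryManager
     low-memory handler will fire constantly. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048M</value>
</property>
<property>
  <name>io.sort.mb</name>
  <value>256</value>
</property>
```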
Re: Spillable memory manager
Posted by Iván de Prado <iv...@gmail.com>.
Hi Tanton,
I am having the same problem, but I get an OutOfMemory exception in
the reduce phase. Which Hadoop config parameter did you change? Is it
io.seqfile.compress.blocksize?
My current value for this parameter is:
<property>
  <name>io.seqfile.compress.blocksize</name>
  <value>1000000</value>
  <description>The minimum block size for compression in block compressed
  SequenceFiles.</description>
</property>
The logs I got:
2008-05-27 11:05:38,087 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 273852568(267434K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:05:38,802 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 273860104(267441K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:05:44,893 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 279642088(273087K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:05:52,704 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 217972656(212863K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:05:56,510 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 269271376(262960K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:03,686 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 244418296(238689K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:10,610 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 269740120(263418K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:16,370 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 271831992(265460K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:19,813 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 258029960(251982K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:23,948 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 279642104(273087K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:27,208 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 195321216(190743K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:30,932 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 266489648(260243K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:34,463 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 279642088(273087K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:38,214 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 279642112(273088K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:06:44,571 INFO org.apache.pig.impl.util.SpillableMemoryManager: low memory handler called init = 5439488(5312K) used = 268382184(262091K) committed = 279642112(273088K) max = 279642112(273088K)
2008-05-27 11:11:02,570 INFO org.apache.hadoop.mapred.TaskRunner: Communication exception: java.lang.OutOfMemoryError: Java heap space
2008-05-27 11:11:02,571 ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce: java.lang.OutOfMemoryError: Java heap space
2008-05-27 11:11:03,234 INFO org.apache.hadoop.ipc.Client: java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.FilterInputStream.read(FilterInputStream.java:116)
at org.apache.hadoop.ipc.Client$Connection$1.read(Client.java:190)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at java.io.DataInputStream.readInt(DataInputStream.java:370)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:276)
Thanks,
Iván
On Fri, 23-05-2008 at 10:23 -0500, Tanton Gibbs wrote:
> I upped my maximum memory from 1024M to 2048M and the problem went
> away. I think the problem was that my sortable memory was already set
> to 400M so it was very close to the 50% mark already.
>
> Is there a way to up the spillable threshold to 80%?
>
> On Fri, May 23, 2008 at 10:04 AM, Tanton Gibbs <ta...@gmail.com> wrote:
> > It is in a map phase. I don't think I used a custom chunker. My
> > splits are set to be 128M.
> >
> > On Fri, May 23, 2008 at 9:07 AM, pi song <pi...@gmail.com> wrote:
> >> Dear Tanton,
> >>
> >> This means the MemoryManager is not successful at reclaiming memory. Did
> >> that happen in Map phase or Reduce phase? Did you use a custom chunker? How
> >> big is your split?
> >>
> >> Pi
> >>
> >> On Fri, May 23, 2008 at 3:52 PM, Tanton Gibbs <ta...@gmail.com>
> >> wrote:
> >>
> >>> I upgraded to hadoop 17 and the latest Pig from svn.
> >>>
> >>> I'm now getting a ton of lines in my log files that say:
> >>>
> >>> 2008-05-23 00:49:27,832 INFO
> >>> org.apache.pig.impl.util.SpillableMemoryManager: low memory handler
> >>> called init = 1441792(1408K) used = 483176072(471851K) committed =
> >>> 641335296(626304K) max = 954466304(932096K)
> >>>
> >>> In addition, jobs on big files are running very slowly.
> >>>
> >>> Does anyone have any ideas as to what I could have screwed up?
> >>>
> >>> Thanks!
> >>> Tanton
> >>>
> >>
> >
Re: Spillable memory manager
Posted by Tanton Gibbs <ta...@gmail.com>.
I upped my maximum memory from 1024M to 2048M and the problem went
away. I think the problem was that my sort memory (io.sort.mb) was
already set to 400M, so it was very close to the 50% mark already.
Is there a way to up the spillable threshold to 80%?
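For anyone who wants to experiment with that: the low-memory handler is driven by the JVM's memory-threshold notifications, so a configurable fraction could be wired up along these lines. This is an illustrative sketch only, not Pig's actual SpillableMemoryManager; the class and method names are invented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

public class SpillThresholdSketch {

    // Ask the JVM to notify us when any heap pool's usage crosses the
    // given fraction of its max (e.g. 0.5 for 50%, 0.8 for 80%).
    public static void install(double fraction) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP
                    && pool.isUsageThresholdSupported()
                    && pool.getUsage().getMax() > 0) {
                pool.setUsageThreshold((long) (pool.getUsage().getMax() * fraction));
            }
        }
        NotificationListener listener = (notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED
                    .equals(notification.getType())) {
                // A real memory manager would walk its registered bags here
                // and ask the biggest ones to spill to disk.
                System.err.println("low memory handler called");
            }
        };
        ((NotificationEmitter) ManagementFactory.getMemoryMXBean())
                .addNotificationListener(listener, null, null);
    }

    public static void main(String[] args) {
        install(0.8); // notify at 80% of max instead of ~50%
    }
}
```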
On Fri, May 23, 2008 at 10:04 AM, Tanton Gibbs <ta...@gmail.com> wrote:
> It is in a map phase. I don't think I used a custom chunker. My
> splits are set to be 128M.
>
> On Fri, May 23, 2008 at 9:07 AM, pi song <pi...@gmail.com> wrote:
>> Dear Tanton,
>>
>> This means the MemoryManager is not successful at reclaiming memory. Did
>> that happen in Map phase or Reduce phase? Did you use a custom chunker? How
>> big is your split?
>>
>> Pi
>>
>> On Fri, May 23, 2008 at 3:52 PM, Tanton Gibbs <ta...@gmail.com>
>> wrote:
>>
>>> I upgraded to hadoop 17 and the latest Pig from svn.
>>>
>>> I'm now getting a ton of lines in my log files that say:
>>>
>>> 2008-05-23 00:49:27,832 INFO
>>> org.apache.pig.impl.util.SpillableMemoryManager: low memory handler
>>> called init = 1441792(1408K) used = 483176072(471851K) committed =
>>> 641335296(626304K) max = 954466304(932096K)
>>>
>>> In addition, jobs on big files are running very slowly.
>>>
>>> Does anyone have any ideas as to what I could have screwed up?
>>>
>>> Thanks!
>>> Tanton
>>>
>>
>
Re: Spillable memory manager
Posted by Tanton Gibbs <ta...@gmail.com>.
It is in the map phase. I don't think I used a custom chunker. My
splits are set to 128M.
On Fri, May 23, 2008 at 9:07 AM, pi song <pi...@gmail.com> wrote:
> Dear Tanton,
>
> This means the MemoryManager is not successful at reclaiming memory. Did
> that happen in Map phase or Reduce phase? Did you use a custom chunker? How
> big is your split?
>
> Pi
>
> On Fri, May 23, 2008 at 3:52 PM, Tanton Gibbs <ta...@gmail.com>
> wrote:
>
>> I upgraded to hadoop 17 and the latest Pig from svn.
>>
>> I'm now getting a ton of lines in my log files that say:
>>
>> 2008-05-23 00:49:27,832 INFO
>> org.apache.pig.impl.util.SpillableMemoryManager: low memory handler
>> called init = 1441792(1408K) used = 483176072(471851K) committed =
>> 641335296(626304K) max = 954466304(932096K)
>>
>> In addition, jobs on big files are running very slowly.
>>
>> Does anyone have any ideas as to what I could have screwed up?
>>
>> Thanks!
>> Tanton
>>
>
Re: Spillable memory manager
Posted by pi song <pi...@gmail.com>.
Dear Tanton,
This means the MemoryManager is not succeeding at reclaiming memory. Did
that happen in the map phase or the reduce phase? Did you use a custom
chunker? How big is your split?
Pi
On Fri, May 23, 2008 at 3:52 PM, Tanton Gibbs <ta...@gmail.com>
wrote:
> I upgraded to hadoop 17 and the latest Pig from svn.
>
> I'm now getting a ton of lines in my log files that say:
>
> 2008-05-23 00:49:27,832 INFO
> org.apache.pig.impl.util.SpillableMemoryManager: low memory handler
> called init = 1441792(1408K) used = 483176072(471851K) committed =
> 641335296(626304K) max = 954466304(932096K)
>
> In addition, jobs on big files are running very slowly.
>
> Does anyone have any ideas as to what I could have screwed up?
>
> Thanks!
> Tanton
>