Posted to user@pig.apache.org by Sam Rash <sa...@ning.com> on 2009/10/12 20:55:05 UTC
ILLUSTRATE with custom loadFunc
Hi,
I have a couple of questions about using ILLUSTRATE with a custom load
function:
1. This seems to require I manually make sure my code is in the local
classpath (doing REGISTER in the pig script only makes it available
remotely). Is this by design?
2. I have implemented a custom LoadFunc (it implements SamplableLoader, in
fact) and am getting errors when trying to use ILLUSTRATE with it.
The error is:
2009-10-12 11:54:08,057 [main] ERROR org.apache.pig.pen.ExampleGenerator - Error reading data. Unable to setup the load function.
2009-10-12 11:54:08,059 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Unable to setup the load function.
stack trace from the log file:
Pig Stack Trace
---------------
ERROR 2999: Unexpected internal error. Unable to setup the load function.
java.lang.RuntimeException: Unable to setup the load function.
    at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:96)
    at org.apache.pig.PigServer.getExamples(PigServer.java:723)
    at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:545)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:246)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
    at org.apache.pig.Main.main(Main.java:320)
It works fine when I run the job normally. Is there anything special I
need to do in order to get it to work with ILLUSTRATE?
thx,
-sr
Sam Rash
samr@ning.com
Re: ILLUSTRATE with custom loadFunc
Posted by Alan Gates <ga...@yahoo-inc.com>.
This looks like a bug. PORead is trying to access its internal bag at
line 70. Either it wasn't passed the bag or the bag it was passed is
null. Does this happen if you don't use a custom load function?
Either way I think you should file a JIRA on this one.
Alan.
On Oct 12, 2009, at 3:44 PM, Sam Rash wrote:
> Actually I think I see the problem: I am setting the input path to a
> directory and assuming bindTo() is called with an actual filename.
> This happens correctly when running in mapreduce mode, but not in local
> mode (it gets the input path as I specify it). Not sure if this is
> by design, but I can spider the dir and get a path to use if that's
> the contract of LoadFunc in local mode.
>
> So... I fixed that and now have bindTo() creating a SequenceFile.Reader on
> an actual file, but I get this error:
>
> Pig Stack Trace
> ---------------
> ERROR 2999: Unexpected internal error. null
>
> java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PORead.getNext(PORead.java:70)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
>     at org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup.accumulateData(POCogroup.java:174)
>     at org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup.getNext(POCogroup.java:93)
>     at org.apache.pig.pen.DerivedDataVisitor.visit(DerivedDataVisitor.java:182)
>     at org.apache.pig.impl.logicalLayer.LOCogroup.visit(LOCogroup.java:333)
>     at org.apache.pig.impl.logicalLayer.LOCogroup.visit(LOCogroup.java:44)
>     at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
>     at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>     at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:98)
>     at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:90)
>     at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:106)
>     at org.apache.pig.PigServer.getExamples(PigServer.java:723)
>     at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:545)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:246)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
>     at org.apache.pig.Main.main(Main.java:320)
>
>
> thx again,
> -sr
>
> Sam Rash
> samr@ning.com
Re: ILLUSTRATE with custom loadFunc
Posted by Sam Rash <sa...@ning.com>.
Actually I think I see the problem: I am setting the input path to a
directory and assuming bindTo() is called with an actual filename.
This happens correctly when running in mapreduce mode, but not in local
mode (it gets the input path as I specify it). Not sure if this is by
design, but I can spider the dir and get a path to use if that's the
contract of LoadFunc in local mode.
So... I fixed that and now have bindTo() creating a SequenceFile.Reader on
an actual file, but I get this error:
Pig Stack Trace
---------------
ERROR 2999: Unexpected internal error. null
java.lang.NullPointerException
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PORead.getNext(PORead.java:70)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
    at org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup.accumulateData(POCogroup.java:174)
    at org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup.getNext(POCogroup.java:93)
    at org.apache.pig.pen.DerivedDataVisitor.visit(DerivedDataVisitor.java:182)
    at org.apache.pig.impl.logicalLayer.LOCogroup.visit(LOCogroup.java:333)
    at org.apache.pig.impl.logicalLayer.LOCogroup.visit(LOCogroup.java:44)
    at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
    at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
    at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:98)
    at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:90)
    at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:106)
    at org.apache.pig.PigServer.getExamples(PigServer.java:723)
    at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:545)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:246)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
    at org.apache.pig.Main.main(Main.java:320)
thx again,
-sr
Sam Rash
samr@ning.com
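The "spider the dir" workaround described in the message above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread or from Pig: the class name InputPathResolver and the method resolveToFile are hypothetical, it uses java.nio on the local filesystem rather than the Hadoop FileSystem API, and the rule of skipping names that start with "." or "_" is an assumed convention for non-data files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper: turns a location that may be a directory into one concrete file.
public class InputPathResolver {

    /**
     * Returns the location itself if it is a regular file. If it is a directory,
     * returns the first regular file inside it (sorted by path), skipping names
     * that start with "." or "_" (assumed here to be non-data files).
     * Returns an empty Optional if nothing usable is found.
     */
    public static Optional<Path> resolveToFile(Path location) throws IOException {
        if (Files.isRegularFile(location)) {
            return Optional.of(location);
        }
        if (!Files.isDirectory(location)) {
            return Optional.empty();
        }
        // Files.list is not recursive; close the stream to release the directory handle.
        try (Stream<Path> children = Files.list(location)) {
            return children
                    .filter(Files::isRegularFile)
                    .filter(p -> !p.getFileName().toString().startsWith("."))
                    .filter(p -> !p.getFileName().toString().startsWith("_"))
                    .sorted()
                    .findFirst();
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory that looks like a job output directory.
        Path dir = Files.createTempDirectory("illustrate-input");
        Files.createFile(dir.resolve("part-00000"));
        Files.createFile(dir.resolve("_logs"));
        System.out.println(resolveToFile(dir));
    }
}
```

A loader could call something like this before opening its reader, so that it behaves the same whether it is handed a directory or a single file. Whether local mode is supposed to pass a directory through unchanged is the open question in the thread; the sketch does not settle it.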
On Oct 12, 2009, at 3:18 PM, Sam Rash wrote:
> Hey Alan,
>
> Ah, how do I make a load function work locally? Do I have to do
> anything special? Does local mean local execution only, or the local
> filesystem as well (i.e., could it just be failing to find inputs to
> sample)?
> The LoadFunc I have borrows heavily from https://issues.apache.org/jira/browse/PIG-911
> but has some custom bits for schemas.
>
> thx,
> -sr
>
> Sam Rash
> samr@ning.com
Re: ILLUSTRATE with custom loadFunc
Posted by Sam Rash <sa...@ning.com>.
Hey Alan,
Ah, how do I make a load function work locally? Do I have to do
anything special? Does local mean local execution only, or the local
filesystem as well (i.e., could it just be failing to find inputs to
sample)?
The LoadFunc I have borrows heavily from https://issues.apache.org/jira/browse/PIG-911
but has some custom bits for schemas.
thx,
-sr
Sam Rash
samr@ning.com
On Oct 12, 2009, at 2:29 PM, Alan Gates wrote:
> Looking at the code, the two cases I can see where it would raise this
> error are a failure to open the file or a failure in the call to bindTo
> (see POLoad.setUp()). When you say it works when you run it normally,
> does "normally" mean in local mode or Hadoop mode? ILLUSTRATE tries
> to run your function in local mode.
>
> Alan.
Re: ILLUSTRATE with custom loadFunc
Posted by Alan Gates <ga...@yahoo-inc.com>.
Looking at the code, the two cases I can see where it would raise this
error are a failure to open the file or a failure in the call to bindTo
(see POLoad.setUp()). When you say it works when you run it normally,
does "normally" mean in local mode or Hadoop mode? ILLUSTRATE tries
to run your function in local mode.
Alan.
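Since ILLUSTRATE runs the load function in local mode, and the original report says REGISTER alone did not make the class visible locally, one quick sanity check is whether the loader class can be loaded by a plain JVM started with the same classpath. A minimal sketch, assuming a hypothetical loader class name com.example.MyLoader (pass your own class name as the first argument); it says nothing about how Pig itself resolves classes:

```java
// Hypothetical diagnostic: reports whether a class is visible on the local classpath.
public class LocalClasspathCheck {

    /** Returns true if the named class can be found by this class's class loader. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers.
            Class.forName(className, false, LocalClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but a dependency of it is missing or incompatible.
            return false;
        }
    }

    public static void main(String[] args) {
        // "com.example.MyLoader" is a placeholder, not a class from the thread.
        String name = args.length > 0 ? args[0] : "com.example.MyLoader";
        System.out.println(name + " loadable locally: " + isLoadable(name));
    }
}
```

If this reports false when run with the classpath used to launch Pig, the "Unable to setup the load function" error may simply be a local classpath problem rather than anything in bindTo().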
On Oct 12, 2009, at 11:55 AM, Sam Rash wrote:
> Hi,
>
> I have a couple of questions about using ILLUSTRATE with a custom load
> function:
>
> 1. This seems to require I manually make sure my code is in the
> local classpath (doing REGISTER in the pig script only makes it
> available remotely). Is this by design?
>
> 2. I have implemented a custom LoadFunc (it implements SamplableLoader,
> in fact) and am getting errors when trying to use ILLUSTRATE with
> it. The error is:
>
>
> 2009-10-12 11:54:08,057 [main] ERROR org.apache.pig.pen.ExampleGenerator - Error reading data. Unable to setup the load function.
> 2009-10-12 11:54:08,059 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Unable to setup the load function.
>
> stack trace from the log file:
>
> Pig Stack Trace
> ---------------
> ERROR 2999: Unexpected internal error. Unable to setup the load function.
>
> java.lang.RuntimeException: Unable to setup the load function.
>     at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:96)
>     at org.apache.pig.PigServer.getExamples(PigServer.java:723)
>     at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:545)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:246)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
>     at org.apache.pig.Main.main(Main.java:320)
>
>
> It works fine when I run the job normally. Is there anything special
> I need to do in order to get it to work with ILLUSTRATE?
>
> thx,
> -sr
>
> Sam Rash
> samr@ning.com
>
>
>