Posted to user@pig.apache.org by Sam Rash <sa...@ning.com> on 2009/10/12 20:55:05 UTC

ILLUSTRATE with custom loadFunc

Hi,

I have a couple of questions about using ILLUSTRATE with a custom load
function:

1. This seems to require that I manually make sure my code is on the local
classpath (doing REGISTER in the Pig script only makes it available
remotely). Is this by design?

2. I have implemented a custom LoadFunc (it implements SamplableLoader, in
fact) and am getting errors when trying to use ILLUSTRATE with it.
The error is:


2009-10-12 11:54:08,057 [main] ERROR org.apache.pig.pen.ExampleGenerator - Error reading data. Unable to setup the load function.
2009-10-12 11:54:08,059 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Unable to setup the load function.

stack trace from the log file:

Pig Stack Trace
---------------
ERROR 2999: Unexpected internal error. Unable to setup the load function.

java.lang.RuntimeException: Unable to setup the load function.
	at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:96)
	at org.apache.pig.PigServer.getExamples(PigServer.java:723)
	at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:545)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:246)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
	at org.apache.pig.Main.main(Main.java:320)


It works fine when I run the job normally. Is there anything special I  
need to do in order to get it to work with ILLUSTRATE?

thx,
-sr

Sam Rash
samr@ning.com
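
On question 1 above, one quick way to see whether a loader class is visible to the local JVM (as opposed to only being shipped to the cluster via REGISTER) is to try loading it by name. The following is a minimal sketch, not part of Pig; the class name `com.example.MyLoader` is a hypothetical placeholder for the custom LoadFunc.

```java
public class ClasspathCheck {
    /** Returns true if the named class can be located by this JVM's class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "com.example.MyLoader" is a hypothetical name; pass the real loader class as an argument
        String name = args.length > 0 ? args[0] : "com.example.MyLoader";
        System.out.println(name + " loadable locally: " + isLoadable(name));
    }
}
```

Running this with the same classpath used to start Pig shows whether the class is reachable locally; if it is not, adding the jar to the local classpath is the workaround described in the question.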




Re: ILLUSTRATE with custom loadFunc

Posted by Alan Gates <ga...@yahoo-inc.com>.
This looks like a bug.  PORead is trying to access its internal bag at  
line 70.  Either it wasn't passed the bag or the bag it was passed is  
null.  Does this happen if you don't use a custom load function?   
Either way I think you should file a JIRA on this one.

Alan.



Re: ILLUSTRATE with custom loadFunc

Posted by Sam Rash <sa...@ning.com>.
Actually, I think I see the problem: I am setting the input path to a
directory and assuming bindTo() is called with an actual filename.
That happens correctly in mapreduce mode, but not in local mode (where
bindTo() gets the input path exactly as I specify it). Not sure if this
is by design, but I can spider the directory and pick a path to use if
that's the contract of LoadFunc in local mode.
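
The "spider the directory" workaround could look roughly like the sketch below. It is an illustration only: it uses `java.nio.file` on the local filesystem rather than the Hadoop FileSystem API a real loader would likely need, the class and method names are hypothetical, and skipping names that start with `_` or `.` is an assumption borrowed from the usual Hadoop convention for hidden files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Stream;

public class InputPathResolver {
    /**
     * If path is a regular file, returns it unchanged. If it is a directory,
     * returns the first regular file beneath it (sorted by path), skipping
     * names that start with '_' or '.'. Returns empty if nothing is found.
     */
    static Optional<Path> resolveToFile(Path path) throws IOException {
        if (Files.isRegularFile(path)) {
            return Optional.of(path);
        }
        if (!Files.isDirectory(path)) {
            return Optional.empty();
        }
        try (Stream<Path> entries = Files.walk(path)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString();
                        return !name.startsWith("_") && !name.startsWith(".");
                    })
                    .sorted()
                    .findFirst();
        }
    }

    public static void main(String[] args) throws IOException {
        Path input = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(resolveToFile(input));
    }
}
```

A loader's bindTo() could call something like this first, so that it opens a concrete file whether it is handed a file or a directory.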

So... I fixed that and now have bindTo() creating a SequenceFile.Reader on
an actual file, but I get this error:

Pig Stack Trace
---------------
ERROR 2999: Unexpected internal error. null

java.lang.NullPointerException
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PORead.getNext(PORead.java:70)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
	at org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup.accumulateData(POCogroup.java:174)
	at org.apache.pig.backend.local.executionengine.physicalLayer.relationalOperators.POCogroup.getNext(POCogroup.java:93)
	at org.apache.pig.pen.DerivedDataVisitor.visit(DerivedDataVisitor.java:182)
	at org.apache.pig.impl.logicalLayer.LOCogroup.visit(LOCogroup.java:333)
	at org.apache.pig.impl.logicalLayer.LOCogroup.visit(LOCogroup.java:44)
	at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
	at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
	at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:98)
	at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:90)
	at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:106)
	at org.apache.pig.PigServer.getExamples(PigServer.java:723)
	at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:545)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:246)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
	at org.apache.pig.Main.main(Main.java:320)


thx again,
-sr

Sam Rash
samr@ning.com





Re: ILLUSTRATE with custom loadFunc

Posted by Sam Rash <sa...@ning.com>.
Hey Alan,

Ah, how do I make a load function work locally? Do I have to do
anything special? Does local mean local execution, or the local
filesystem as well (i.e., could it just be failing to find inputs to
sample)?
The LoadFunc I have borrows heavily from https://issues.apache.org/jira/browse/PIG-911
but has some custom bits for schemas.

thx,
-sr

Sam Rash
samr@ning.com





Re: ILLUSTRATE with custom loadFunc

Posted by Alan Gates <ga...@yahoo-inc.com>.
Looking at the code, the two cases I can see where it should get this
error are if it fails to open the file or if the call to bindTo fails
(see POLoad.setUp()). When you say it works when you run it normally,
does normally mean in local mode or hadoop mode? Illustrate is trying
to run your function in local mode.

Alan.
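
To make the two failure cases above easier to tell apart, a setup routine can wrap each step separately and keep the original exception as the cause. The sketch below is purely illustrative and is not the code in POLoad.setUp(): `Loader` and `Opener` are hypothetical stand-ins, and the simplified `bindTo` signature here is an assumption, not Pig's actual LoadFunc method.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class LoadSetupSketch {
    /** Hypothetical stand-in for a loader; not Pig's LoadFunc interface. */
    interface Loader {
        void bindTo(String fileName, InputStream in) throws IOException;
    }

    /** Hypothetical stand-in for whatever opens the input. */
    interface Opener {
        InputStream open(String fileName) throws IOException;
    }

    /** Opens the input, then binds the loader, reporting each failure distinctly. */
    static void setUp(String fileName, Opener opener, Loader loader) {
        InputStream in;
        try {
            in = opener.open(fileName);          // case 1: the file cannot be opened
        } catch (IOException e) {
            throw new RuntimeException("Unable to open " + fileName, e);
        }
        try {
            loader.bindTo(fileName, in);         // case 2: the loader's bindTo fails
        } catch (IOException e) {
            throw new RuntimeException("bindTo failed for " + fileName, e);
        }
    }

    public static void main(String[] args) {
        // Both steps succeed here, so nothing is thrown.
        setUp("example-input", name -> new ByteArrayInputStream(new byte[0]), (name, in) -> { });
        System.out.println("setup completed");
    }
}
```

With the cause preserved, the log file's stack trace would show a "Caused by" section pointing at the underlying failure rather than only the generic setup message.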
