Posted to user@pig.apache.org by "Walker, Alan" <Al...@sabre.com> on 2012/04/06 22:38:50 UTC

Sending parameters to a customer load function

Hi,

I'm having some challenges with a custom load function.  It only seems to work with a no-argument (void) constructor.  The Java code has a no-argument constructor and a String constructor, much like the SimpleTextLoader example.  Any thoughts on what might be going wrong?

    public ShoppingReader() {
        parms = "";
    }

    public ShoppingReader(String tmp) {
        parms = tmp;
    }

grunt> A = LOAD '/user/alanw/*.xml' USING com.sabre.pigshop.ShoppingReader('all') AS (x);
2012-04-06 16:04:08,593 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. could not instantiate 'com.sabre.pigshop.ShoppingReader' with arguments '[all]'

Thanks,
Alan.


Re: Sending parameters to a customer load function

Posted by Norbert Burger <no...@gmail.com>.
It seems to me that Alan is only interested in writing a loader that has a
non-default constructor (one that takes arguments); he doesn't need to
create a UDF with this property.

Besides SimpleTextLoader, there are a number of examples of this in the Pig
codebase, including HBaseStorage.  My own attempt at a basic no-op loader
with a non-default constructor also worked fine.

Alan -- can you share the full loader implementation where you're seeing
this issue?  Also, what version of Pig are you using?  Error #2999 makes me
wonder whether you're hitting an uncaught exception elsewhere in your
loader implementation.
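
One way to test that theory outside Pig is a small reflection probe. The
sketch below is only an illustration: FragileReader and
ReflectiveInstantiation are hypothetical names, and it assumes (this is not
confirmed anywhere in this thread) that the loader is instantiated
reflectively through a public constructor whose parameters are all Strings.
Under that assumption, an exception thrown inside the String constructor
would surface as a generic instantiation failure unless the cause is
unwrapped.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.util.Arrays;

// Hypothetical stand-in for a loader whose String constructor fails internally.
class FragileReader {
    final String parms;

    public FragileReader() {
        parms = "";
    }

    public FragileReader(String tmp) {
        // Simulated bug: any work in the constructor that throws aborts construction.
        if (!tmp.matches("\\d+")) {
            throw new IllegalArgumentException("cannot parse '" + tmp + "'");
        }
        parms = tmp;
    }
}

class ReflectiveInstantiation {
    // Looks up a public constructor taking only Strings, invokes it,
    // and reports which step failed instead of a single generic error.
    static String tryInstantiate(Class<?> cls, String... args) {
        try {
            Class<?>[] paramTypes = new Class<?>[args.length];
            Arrays.fill(paramTypes, String.class);
            Constructor<?> ctor = cls.getConstructor(paramTypes);
            ctor.newInstance((Object[]) args);
            return "ok";
        } catch (NoSuchMethodException e) {
            return "no matching public constructor";
        } catch (InvocationTargetException e) {
            // The real problem is the wrapped cause, not the reflection call.
            return "constructor threw: " + e.getCause();
        } catch (ReflectiveOperationException e) {
            return "could not instantiate: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInstantiate(FragileReader.class));
        System.out.println(tryInstantiate(FragileReader.class, "all"));
        System.out.println(tryInstantiate(FragileReader.class, "a", "b"));
    }
}
```

Pointing a probe like this at the real loader class (with its jar on the
classpath) should show whether the String constructor is missing, not
public, or throwing.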

Norbert

Re: Sending parameters to a customer load function

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hi Alan,
When you use a loader:
A = load 'stuff' using my.pig.Loader('foo', 'bar');

the loader gets constructed with 'foo', 'bar', then it gets set up
(with the various setSignature, prepareToRead, etc. calls), and its
getNext() gets called repeatedly until there is nothing left to read.

When you use a UDF:
B = foreach A generate my.pig.UDF($0);

Pig iterates through relation A and invokes the UDF's exec method on a
tuple composed of the fields specified in your script -- in this case,
the first field in each row of A.  If you want a non-default
constructor to be used to create the UDF instance that will be exec'd on
all these tuples, you can do this through a "define" call, as I described
earlier.

Loaders (and Storers) are very different from UDFs in how they are
used and invoked, and they implement totally different interfaces.
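
To make the difference concrete, here is a toy sketch of the two
lifecycles described above. SketchLoader, SketchUdf and LifecycleSketch are
hypothetical stand-ins written for this illustration; they are not Pig's
actual interfaces, and the real execution pipeline is not a single loop.
The only point is that each object is constructed once with its String
arguments, after which the loader is drained through getNext() while the
UDF's exec is called per tuple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a loader: built once with String arguments, then drained.
class SketchLoader {
    private final String prefix;
    private Iterator<String> source;

    SketchLoader(String prefix) {            // plays the role of my.pig.Loader('foo')
        this.prefix = prefix;
    }

    void prepareToRead(List<String> rows) {  // setup step, called before any getNext()
        source = rows.iterator();
    }

    String getNext() {                       // returns null once nothing is left to read
        return source.hasNext() ? prefix + source.next() : null;
    }
}

// Hypothetical stand-in for a UDF: built once, then exec'd on every tuple.
class SketchUdf {
    private final String suffix;

    SketchUdf(String suffix) {               // the argument a "define" would supply
        this.suffix = suffix;
    }

    String exec(String field) {
        return field + suffix;
    }
}

class LifecycleSketch {
    static List<String> run(List<String> rows) {
        SketchLoader loader = new SketchLoader("row:");
        loader.prepareToRead(rows);
        SketchUdf udf = new SketchUdf("!");
        List<String> out = new ArrayList<>();
        // Drain the loader, applying the UDF to each tuple it produces.
        for (String t = loader.getNext(); t != null; t = loader.getNext()) {
            out.add(udf.exec(t));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b"))); // prints [row:a!, row:b!]
    }
}
```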

-Dmitriy

RE: Sending parameters to a customer load function

Posted by "Walker, Alan" <Al...@sabre.com>.
Dmitriy,

I have also tried that pattern for a Loader, and it doesn't find the String constructor; it only works with the void constructor.

grunt> define myreader com.sabre.pigshop.ShoppingReader('all');
grunt> A = LOAD '/user/alanw/*.xml' USING myreader() AS (x);
2012-04-09 07:33:58,502 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. could not instantiate 'com.sabre.pigshop.ShoppingReader' with arguments '[all]'


This works:

grunt> define myreader com.sabre.pigshop.ShoppingReader();
grunt> A = LOAD '/user/alanw/*.xml' USING myreader AS (x);


I haven't dug into the Pig source yet; perhaps Loader functions are treated differently from other UDFs?  Seems unlikely.

Thanks,
Alan


Re: Sending parameters to a customer load function

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hi Alan,
You can use "define" to supply an argument to a UDF constructor.

You can see an example here:
http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html#udf_define

I did just find, to my surprise, that this isn't in our documentation.
We should add that.

D
