Posted to user@pig.apache.org by Dave Viner <da...@vinertech.com> on 2010/06/12 06:52:16 UTC

"Invalid alias" from UDF output

I am having a problem getting Pig 0.7.0 to use a variable I add from a UDF.
Here's the basic Pig script:

LOGS = LOAD '$INPUT' USING PigStorage('\t') ;

IMP_SID = FOREACH IMPRESSIONS_ONLY GENERATE *,
    (($4 == 'NULL') ? null : (chararray)$4) AS my_id:chararray;

ORD_SID = FOREACH IMP_SID GENERATE *,
    MY_LOOKUP(my_id, $2, $17) AS (
    out1:int, out2:chararray, out3:chararray);

The input file is a tab-delimited file that has a variable number of columns
(but always more than 16).  When I run this, I get the following error:

grunt> ORD_SID = FOREACH IMP_SID GENERATE *,
>>    MY_LOOKUP(my_id, $2, $17) AS (
>>    out1:int, out2:chararray, out3:chararray);

2010-06-11 21:45:04,652 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1000: Error during parsing. Invalid alias: my_id in null
Details at logfile: /tmp/pig_1276317170205.log

Why is the 'my_id' value an invalid alias?

Re: "Invalid alias" from UDF output

Posted by Dave Viner <da...@vinertech.com>.
FWIW, I think I found my error here.  The AS clause applies to the entire
output, not just the additional columns.  Since I am doing:
    GENERATE *, .... AS (1colname)

the schema I supply does not cover the columns that come from the *, and I
think that is the real failure.  The * columns must somehow be specified in
the schema as well.  Or just use *no* schema.

Dave Viner
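
One way to act on the two options above (cover every column in the schema,
or attach no schema at all) is to replace the * with an explicit projection.
The following is an untested sketch, not a confirmed fix for Pig 0.7.0: the
field names f2 and f17 are made up for illustration, and IMPRESSIONS_ONLY and
MY_LOOKUP are taken from the original script as-is.

```pig
-- Project only the fields that are needed and name each one, so that
-- nothing depends on the unnamed columns that * would pull in.
-- f2 and f17 are hypothetical names for $2 and $17.
IMP_SID = FOREACH IMPRESSIONS_ONLY GENERATE
    $2 AS f2,
    $17 AS f17,
    (($4 == 'NULL') ? null : (chararray)$4) AS my_id:chararray;

-- Call the UDF on the named fields and attach no AS schema to its output.
ORD_SID = FOREACH IMP_SID GENERATE f2, f17, my_id,
    MY_LOOKUP(my_id, f2, f17);
```

This drops the columns that * would have carried along, so any others needed
downstream would have to be added to the first GENERATE by position.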


