Posted to user@pig.apache.org by Dave Viner <da...@vinertech.com> on 2010/06/12 06:52:16 UTC
"Invalid alias" from UDF output
I am having a problem getting Pig 0.7.0 to use a variable I add from a UDF.
Here's the basic pig script:
LOGS = LOAD '$INPUT' USING PigStorage('\t') ;
IMP_SID = FOREACH IMPRESSIONS_ONLY GENERATE *,
    (($4 == 'NULL') ? null : (chararray)$4) AS my_id:chararray;
ORD_SID = FOREACH IMP_SID GENERATE *,
    MY_LOOKUP(my_id, $2, $17) AS (
        out1:int, out2:chararray, out3:chararray);
The input file is a tab-delimited file that has a variable number of columns
(but always more than 16). When I run this, I get the following error:
grunt> ORD_SID = FOREACH IMP_SID GENERATE *,
>> MY_LOOKUP(my_id, $2, $17) AS (
>> out1:int, out2:chararray, out3:chararray);
2010-06-11 21:45:04,652 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1000: Error during parsing. Invalid alias: my_id in null
Details at logfile: /tmp/pig_1276317170205.log
Why is the 'my_id' value an invalid alias?
Re: "Invalid alias" from UDF output
Posted by Dave Viner <da...@vinertech.com>.
FWIW, I think I found my error here. The AS clause appears to apply to the
entire output, not just the additional columns. Since I am doing:
GENERATE *, .... AS (1colname)
that is the real failure: the columns behind the * must somehow be specified
in the schema too. Or just use *no* schema.
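For the "no schema" route, here is a rough sketch of what I mean. It is untested, and it assumes the my_id expression can simply be computed inline so that nothing has to be looked up by alias. Only positional references ($2, $4, $17) are used, and the relation and UDF names are the ones from my script above.

```pig
-- Sketch only: no AS clauses, so no alias ever has to be resolved
-- against a relation whose schema is unknown.
ORD_SID = FOREACH IMPRESSIONS_ONLY GENERATE *,
    MY_LOOKUP((($4 == 'NULL') ? null : (chararray)$4), $2, $17);

-- Optional check: show what schema (if any) Pig has for the result.
DESCRIBE ORD_SID;
```

The trade-off is that ORD_SID carries no field names, so any later statement would have to refer to its columns by position as well.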
Dave Viner