Posted to dev@pig.apache.org by "Eric Yang (JIRA)" <ji...@apache.org> on 2010/12/30 20:29:46 UTC

[jira] Commented: (PIG-1693) There needs to be a way in foreach to indicate "and all the rest of the fields"

    [ https://issues.apache.org/jira/browse/PIG-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976089#action_12976089 ] 

Eric Yang commented on PIG-1693:
--------------------------------

The proposed *+ and *- syntax could hurt readability: at first glance, users can easily confuse it with mathematical operations.  I think using ".." would be a better choice.

It should be possible to write as:

{noformat}
Z = foreach Y generate myUDF(firstcol, secondcol, thirdcol) as result, forthcol .. tenthcol;
Z = foreach Y generate firstcol, forthcol .. tenthcol;
{noformat}
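
To make the intended meaning of ".." concrete, here is a rough sketch (in Python rather than Pig, since the syntax above is only a proposal) of how a range could be expanded against the schema. The helper name expand_range is hypothetical, and the sketch assumes the range is inclusive on both ends and follows schema order; the column names are spelled as in the issue description.

```python
def expand_range(schema, start, end):
    """Return the columns from start through end, inclusive, in schema order.

    Assumption: "a .. b" means every column positioned between a and b
    in the schema, including both endpoints.
    """
    i = schema.index(start)  # raises ValueError if the column is unknown
    j = schema.index(end)
    if i > j:
        raise ValueError(f"{start!r} comes after {end!r} in the schema")
    return schema[i:j + 1]


# Schema of Y, using the ten column names from the issue description.
schema = ["firstcol", "secondcol", "thridcol", "forthcol", "fifthcol",
          "sixthcol", "seventhcol", "eigthcol", "ninethcol", "tenthcol"]

# Equivalent of: foreach Y generate firstcol, forthcol .. tenthcol;
projected = ["firstcol"] + expand_range(schema, "forthcol", "tenthcol")
```

Under these assumptions the projection keeps firstcol plus the seven columns from forthcol through tenthcol, so the user never has to list them one by one.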

Alternatively, it could be written in UDF style:

{noformat}
Z = foreach Y generate myUDF(firstcol, secondcol, thirdcol) as result, mirror(forthcol, tenthcol);
Z = foreach Y generate firstcol, mirror(forthcol, tenthcol);
{noformat}


> There needs to be a way in foreach to indicate "and all the rest of the fields"
> -------------------------------------------------------------------------------
>
>                 Key: PIG-1693
>                 URL: https://issues.apache.org/jira/browse/PIG-1693
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Daniel Dai
>             Fix For: 0.9.0
>
>
> A common use case we see in Pig is people have many columns in their data and they only want to operate on a few of them.  Consider for example if before storing data with ten columns, the user wants to perform a cast on one column:
> {code}
> ...
> Z = foreach Y generate (int)firstcol, secondcol, thridcol, forthcol, fifthcol, sixthcol, seventhcol, eigthcol, ninethcol, tenthcol;
> store Z into 'output';
> {code}
> Obviously this only gets worse as the user has more columns.  Ideally the above could be transformed to something like:
> {code}
> ...
> Z = foreach Y generate (int)firstcol, "and all the rest";
> store Z into 'output';
> {code}
