Posted to dev@pig.apache.org by "Arnab Guin (JIRA)" <ji...@apache.org> on 2013/01/16 08:08:13 UTC

[jira] [Commented] (PIG-3083) Introduce new syntax that lets you project just the columns that come from a given :: prefix

    [ https://issues.apache.org/jira/browse/PIG-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554802#comment-13554802 ] 

Arnab Guin commented on PIG-3083:
---------------------------------

Hi Jonathan,

I am submitting a patch for your request. The "a::*" notation now projects only the columns of relation 'a'. However, I must mention that this is still a work in progress in the following areas:

(1) Support for both a::* and b::* in the same assignment is still missing.

(2) It currently works only with column pruning turned off (you need to run with -t ColumnMapKeyPrune).
If anyone is an expert on column pruning, please share your thoughts. The a:: referent alias refers to an LOLoad operator, where the columns on the load are "pushed down" into the loadFunc. So it needs to be ensured that the deleted columns (in this case y and z) are not referenced by a::*. But what should the attached relational operator be in this case? It seems it can be neither the LOLoad nor the x field. Any help is appreciated.

(3) Tests still need to be added once (1) and (2) are addressed.

Thanks,
Arnab

                
> Introduce new syntax that lets you project just the columns that come from a given :: prefix
> ---------------------------------------------------------------------------------------------
>
>                 Key: PIG-3083
>                 URL: https://issues.apache.org/jira/browse/PIG-3083
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.12
>            Reporter: Jonathan Coveney
>              Labels: PIG-3078
>             Fix For: 0.12
>
>         Attachments: pig_jira_aguin_3083.patch
>
>
> This is basically a more refined approach than PIG-3078, but it is also more work. That JIRA is more of a stopgap until we do something like this.
> The idea would be to support something like the following:
> a = load 'a' as (x,y,z);
> b = load 'b'  as (x,y,z);
> c = join a by x, b by x;
> d = foreach c generate a::*;
> Obviously this is useful for any case where you have relations with columns with various prefixes.
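
For comparison, here is a minimal sketch of what the proposed a::* projection would presumably be shorthand for, written with the existing syntax. It assumes the same relations as in the description above and that the join output disambiguates the fields as a::x, a::y, a::z, b::x, b::y, b::z. The exact semantics of a::* are whatever the final patch defines, so treat the equivalence as an assumption rather than the implemented behaviour.

```pig
-- Same relations as in the issue description.
a = load 'a' as (x, y, z);
b = load 'b' as (x, y, z);
c = join a by x, b by x;

-- Proposed in PIG-3083 (not available without the patch):
--   d = foreach c generate a::*;

-- Explicit projection with existing syntax, listing each
-- field of 'a' by its disambiguated name:
d = foreach c generate a::x, a::y, a::z;
```

The explicit form has to be updated by hand whenever the schema of 'a' changes, which is the maintenance cost the prefix projection is meant to remove.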
