Posted to dev@pig.apache.org by "Prasanth J (JIRA)" <ji...@apache.org> on 2012/06/22 09:30:42 UTC

[jira] [Commented] (PIG-2765) Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator

    [ https://issues.apache.org/jira/browse/PIG-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399184#comment-13399184 ] 

Prasanth J commented on PIG-2765:
---------------------------------

This patch contains the following features:
1) RollupDimensions UDF
2) Support for the ROLLUP clause in the CUBE operator
3) Test cases for both
4) Removal of the "dimensions::" namespace from the output schema of the CUBE operator

The syntax for the CUBE operator is now:
{code}
alias = CUBE rel BY { CUBE | ROLLUP } col_ref [, { CUBE | ROLLUP } col_ref ...]
{code}

Example:
{code}
out = CUBE inp BY CUBE(a,b), ROLLUP(c,d);
{code}

The above code generates the following combinations of dimensions for each input tuple; the aggregations are then computed per combination:
(a,b,c,d)
(a,NULL,c,d)
(NULL,b,c,d)
(NULL,NULL,c,d)
(a,b,c,NULL)
(a,NULL,c,NULL)
(NULL,b,c,NULL)
(NULL,NULL,c,NULL)
(a,b,NULL,NULL)
(a,NULL,NULL,NULL)
(NULL,b,NULL,NULL)
(NULL,NULL,NULL,NULL)
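
The twelve combinations listed above are the cross product of the four CUBE(a,b) combinations and the three ROLLUP(c,d) levels. A minimal Python sketch of that enumeration follows. It is only an illustration, not the RollupDimensions UDF from the patch: the function names are hypothetical, None stands in for NULL, and the column names stand in for the actual values of each input tuple.

```python
from itertools import product


def cube_combinations(dims):
    """All 2^n combinations: each dimension is either kept or replaced by None."""
    return [
        tuple(d if keep else None for d, keep in zip(dims, mask))
        for mask in product([True, False], repeat=len(dims))
    ]


def rollup_combinations(dims):
    """The n+1 hierarchical levels: dimensions are dropped from right to left."""
    n = len(dims)
    return [tuple(dims[:k]) + (None,) * (n - k) for k in range(n, -1, -1)]


# Hypothetical stand-in for: CUBE inp BY CUBE(a,b), ROLLUP(c,d)
cube_part = cube_combinations(("a", "b"))
rollup_part = rollup_combinations(("c", "d"))

# Cross product of the two clauses, each pair flattened into one 4-tuple.
combos = [c + r for r in rollup_part for c in cube_part]

for combo in combos:
    print(combo)
```

ROLLUP(c,d) never yields (NULL,d), because a rollup only removes dimensions from the right. That is why the example produces 4 x 3 = 12 combinations, whereas CUBE(a,b,c,d) would produce 16.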


The schema for "out" will be:
{code}
out: {group: (a: bytearray,b: bytearray,c: bytearray,d: bytearray),cube: {(a: bytearray,b: bytearray,c: bytearray,d: bytearray)}}
{code}

NOTE: NULL value handling is not available in this patch. A patch for handling legitimate NULL values (that is, NULLs already present in the input dimensions) is available at https://issues.apache.org/jira/browse/PIG-2726
                
> Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator
> ---------------------------------------------------------------------------
>
>                 Key: PIG-2765
>                 URL: https://issues.apache.org/jira/browse/PIG-2765
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>         Attachments: PIG-2765.1.patch
>
>
> Implement a RollupDimensions UDF that performs aggregation from the most detailed level of dimensions to the most general level (grand total) in hierarchical order. Provide support for the ROLLUP clause in the CUBE operator.
