Posted to dev@pig.apache.org by "Ajay Garg (JIRA)" <ji...@apache.org> on 2008/07/08 12:22:31 UTC

[jira] Updated: (PIG-296) UDF for cumulative statistics

     [ https://issues.apache.org/jira/browse/PIG-296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajay Garg updated PIG-296:
--------------------------

    Attachment: cumulative.patch

Patch attached.

> UDF for cumulative statistics
> -----------------------------
>
>                 Key: PIG-296
>                 URL: https://issues.apache.org/jira/browse/PIG-296
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Ajay Garg
>            Priority: Minor
>         Attachments: cumulative.patch
>
>
> UDF for computing cumulative sum, row, rank, and dense rank. See http://twiki.corp.yahoo.com/view/YResearch/PigStatisticsCumulative for a detailed description.
> To use:
> A = load 'data' using PigStorage as ( query, freq );
> B = group A all;
> C = foreach B {
>     Ordered = order A by freq using numeric.OrderDescending;
>     generate
>         statistics.CUMULATIVE_COLUMN(Ordered, 1) as   -- Pig columns are 0-indexed, so offset 1 refers to the column freq
>                 ( query, freq, freq_cumulative_sum, freq_row, freq_rank, freq_dense_rank );
> };
> D = foreach C generate FLATTEN(A);
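
For readers without access to the linked description, here is a minimal Python sketch of what the four appended columns could contain. It is not the code in cumulative.patch. The function name `cumulative_column` and the sample data are hypothetical. The sketch assumes "row" means a 1-based row number, and that rank and dense rank follow the usual SQL-style definitions (tied values share a rank; rank leaves gaps after ties, dense rank does not). The UDF's actual behavior may differ.

```python
from typing import List, Tuple

Row = Tuple[str, int]


def cumulative_column(rows: List[Row], col: int) -> List[tuple]:
    """Append cumulative sum, row number, rank, and dense rank for column `col`.

    Assumes `rows` is already sorted on `col` (descending here), mirroring
    the `order A by freq` step in the Pig script. Hypothetical helper, not
    taken from the patch.
    """
    out = []
    running = 0          # cumulative sum so far
    rank = 0             # SQL-style rank (gaps after ties)
    dense = 0            # SQL-style dense rank (no gaps)
    prev = object()      # sentinel that never equals a real value
    for i, row in enumerate(rows, start=1):
        value = row[col]
        running += value
        if value != prev:
            rank = i     # tied rows share the row number of the first tie
            dense += 1   # advances once per distinct value
            prev = value
        out.append(row + (running, i, rank, dense))
    return out


# Made-up sample data in the (query, freq) layout used above.
data = [("a", 10), ("b", 7), ("c", 7), ("d", 3)]
ordered = sorted(data, key=lambda r: r[1], reverse=True)
result = cumulative_column(ordered, 1)  # 1 = 0-based offset of freq
for r in result:
    print(r)
```

Each output tuple has the layout (query, freq, freq_cumulative_sum, freq_row, freq_rank, freq_dense_rank), matching the field names in the `as` clause of the script.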

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.