Posted to user@pig.apache.org by jamal sasha <ja...@gmail.com> on 2013/04/27 11:32:15 UTC

pig question

Hi,
  I have data in the following format:

id1,id2, value
1 , abc, 2993
1, dhu, 9284
1,dus,2389
2, acs,29392

and so on

For each id1, I want to find the maximum value and then divide each value
by that maximum. So, for the example above:

1,abc, 2993/9284
1,dhu ,9284/9284
1,dus, 2389/9284
2,acs, 29392/max_value_for_this_id

How do I do this in Pig?
Thanks

Re: pig question

Posted by Russell Jurney <ru...@gmail.com>.
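values = LOAD 'my_path' USING PigStorage(',') AS (id1:int, id2:chararray, value:int);
-- FLATTEN keeps each row; MAX runs over the group's bag of values.
with_max = FOREACH (GROUP values BY id1) GENERATE FLATTEN(values),
    MAX(values.value) AS max_value;
-- Cast to double; int / int would truncate the ratio.
overall = FOREACH with_max GENERATE id1, id2, (double)value / max_value AS div_max;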
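
As a cross-check on the expected result, here is a minimal plain-Python
sketch (not Pig) of the same per-id1 normalization, run on the four sample
rows from the question. The function name divide_by_group_max is
hypothetical, chosen only for illustration. Note that the sample shows a
single row for id1 = 2, so its ratio comes out as 1.0 here; with the full
data ("and so on") the maximum for that id may differ.

```python
# Sample rows from the question, as (id1, id2, value) tuples.
rows = [
    (1, "abc", 2993),
    (1, "dhu", 9284),
    (1, "dus", 2389),
    (2, "acs", 29392),
]


def divide_by_group_max(rows):
    """Return (id1, id2, ratio), where ratio = value / max value for that id1."""
    # First pass: find the maximum value for each id1.
    group_max = {}
    for id1, _id2, value in rows:
        if id1 not in group_max or value > group_max[id1]:
            group_max[id1] = value
    # Second pass: divide each value by its group's maximum (float division).
    return [(id1, id2, value / group_max[id1]) for id1, id2, value in rows]


result = divide_by_group_max(rows)
for id1, id2, ratio in result:
    print(id1, id2, ratio)
```

Any Pig output for the same rows can be compared against these ratios: the
row holding the group maximum should come out as 1.0, and every other row
should fall between 0 and 1.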

Russell Jurney http://datasyndrome.com
