Posted to user@pig.apache.org by jamal sasha <ja...@gmail.com> on 2012/10/11 22:36:35 UTC

question

>
> I have a data file in the following format:
>
>
>
> User, movie, price
>
> 123,abc,22.2
>
> 123,daw,39
>
> 123,abc,99  <- Note that the user and movie are the same but the price is different
>
>
>
> I want to write a Pig script that counts how many times a user has rented a particular movie.
>
>
>
>
>
> in = LOAD 'data' USING PigStorage('\\u001') AS (user:long, movie:long, price:float);
>
>
>
> filtered_times = FILTER in BY price>0;
>
> perCust = GROUP filtered_times BY (user,movie);
>
>
>
> count = foreach perCust generate group, COUNT(filtered_times.movie);
>
> STORE count INTO 'results' using PigStorage(',');
>
>
>
> The output looks like this:
>
> (3710100987700,5460986508),14
>
>
>
> I don't want these parentheses :(
>
> I want plain output delimited by ",".

Re: question

Posted by Arun Ahuja <aa...@gmail.com>.
Instead of

count = foreach perCust generate group, COUNT(filtered_times.movie);

use

count = foreach perCust generate FLATTEN(group), COUNT(filtered_times.movie);

FLATTEN is a special operator that un-nests a tuple: it replaces the tuple
with the fields inside it, so the (user, movie) group key becomes two
top-level columns in the output.
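
For reference, here is a minimal sketch of the whole script with that change applied. It assumes comma-delimited input and numeric user and movie IDs (both are assumptions; the LOAD in the question uses a different delimiter), and the aliases rentals, paid, per_pair, counts and n_rentals are illustrative names, not ones from the original script.

```pig
-- Assumption: input is comma-delimited with numeric user and movie IDs.
rentals = LOAD 'data' USING PigStorage(',')
          AS (user:long, movie:long, price:float);

-- Keep only rows with a positive price, as in the original script.
paid = FILTER rentals BY price > 0;

-- The group key is a (user, movie) tuple.
per_pair = GROUP paid BY (user, movie);

-- FLATTEN(group) un-nests the key tuple into two top-level fields,
-- so each output row has three plain columns instead of a tuple plus a count.
counts = FOREACH per_pair GENERATE
             FLATTEN(group) AS (user, movie),
             COUNT(paid) AS n_rentals;

STORE counts INTO 'results' USING PigStorage(',');
```

With this change, the line from the question would be stored as 3710100987700,5460986508,14 rather than (3710100987700,5460986508),14.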
