Posted to user@pig.apache.org by Jonathan Coveney <jc...@gmail.com> on 2010/12/14 19:32:51 UTC

How to divide by the minimum number in a set in Pig?

I'm not sure if Pig can handle this...perhaps in this specific case there is
something more clever that can be done, although I think it points to a
bigger question.

Basically, let's say I have (whatever:chararray, icare:int), and I want to
get whatever, icare/min(all_of_icare) for each tuple. Something akin to...

loaded = LOAD 'whatever' AS (whatever:chararray, icare:int);
min_generated = FOREACH loaded GENERATE icare;
min_group = GROUP min_generated ALL;
min = FOREACH min_group GENERATE MIN(min_generated);

generated = FOREACH loaded GENERATE whatever, icare/***min***;

Obviously this code would not work as written, but I am wondering if
something in the spirit of it can be done in Pig.

Thank you for your time
Jon

Re: How to divide by the minimum number in a set in Pig?

Posted by Jonathan Coveney <jc...@gmail.com>.
Thanks guys

2010/12/14 Daniel Dai <ji...@yahoo-inc.com>

> This is what you need (on 0.8):
>
>
> loaded = LOAD 'whatever' AS (whatever:chararray, icare:int);
> min_generated = FOREACH loaded GENERATE icare;
> min_group = GROUP min_generated ALL;
> min = FOREACH min_group GENERATE MIN(min_generated) as m;
> generated = FOREACH loaded GENERATE whatever, icare/min.m;
>
> Daniel

Re: How to divide by the minimum number in a set in Pig?

Posted by Daniel Dai <ji...@yahoo-inc.com>.
This is what you need (on 0.8):

loaded = LOAD 'whatever' AS (whatever:chararray, icare:int);
min_generated = FOREACH loaded GENERATE icare;
min_group = GROUP min_generated ALL;
min = FOREACH min_group GENERATE MIN(min_generated) as m;
generated = FOREACH loaded GENERATE whatever, icare/min.m;

Daniel

Jonathan Coveney wrote:
> Also, where would the cast go?


Re: How to divide by the minimum number in a set in Pig?

Posted by Jonathan Coveney <jc...@gmail.com>.
Also, where would the cast go?

2010/12/14 Jonathan Coveney <jc...@gmail.com>

> I can use new code, yes. If I simply use the dev version of pig, will it
> support this then?

Re: How to divide by the minimum number in a set in Pig?

Posted by Alan Gates <ga...@yahoo-inc.com>.
You can download the release candidate for 0.8 from
http://people.apache.org/~olga/pig-0.8.0-candidate-0

If you want to build your own, I'd build off the 0.8 branch, which we know
to be stable. We're pouring all kinds of cool new features into trunk at the
moment, so it may not be as stable as you'd want right now.

Alan.

On Dec 14, 2010, at 10:52 AM, Jonathan Coveney wrote:

> I can use new code, yes. If I simply use the dev version of pig, will it
> support this then?


Re: How to divide by the minimum number in a set in Pig?

Posted by Jonathan Coveney <jc...@gmail.com>.
I can use new code, yes. If I simply use the dev version of pig, will it
support this then?

2010/12/14 Alan Gates <ga...@yahoo-inc.com>

> Actually, in 0.8 the code you give will work, if you cast min_generated to
> an int. 0.8 is in the release process now. Are you in a position to use
> new code?
>
> Alan.

Re: How to divide by the minimum number in a set in Pig?

Posted by Alan Gates <ga...@yahoo-inc.com>.
Actually, in 0.8 the code you give will work, if you cast
min_generated to an int. 0.8 is in the release process now. Are you
in a position to use new code?

Alan.
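
One way to read the cast suggestion, as a minimal sketch assuming the scalar
support in Pig 0.8 (a single-row relation referenced as relation.field inside
an expression): compute the minimum in its own relation, then cast the scalar
at the point where it is used. The aliases all_rows and min_rel and the output
name ratio are made up for illustration. Casting to double rather than int is
an assumption here, chosen so that the division does not truncate.

```pig
-- Sketch only; assumes Pig 0.8 or later.
loaded = LOAD 'whatever' AS (whatever:chararray, icare:int);

-- Put every row into a single group so MIN sees all icare values.
all_rows = GROUP loaded ALL;

-- min_rel holds exactly one tuple with one field, m.
min_rel = FOREACH all_rows GENERATE MIN(loaded.icare) AS m;

-- The cast goes on the scalar reference (and here on icare as well),
-- so the result is a double instead of a truncated int.
generated = FOREACH loaded GENERATE whatever,
            (double)icare / (double)min_rel.m AS ratio;
```

If integer division is what you want, (int)min_rel.m in place of the double
casts should keep both operands as ints.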

On Dec 14, 2010, at 10:32 AM, Jonathan Coveney wrote:

> I'm not sure if Pig can handle this...perhaps in this specific case there is
> something more clever that can be done, although I think it points to a
> bigger question.
>
> Basically, let's say I have (whatever:chararray, icare:int)
> I want to get whatever, icare/min(all_of_icare), for each tuple. Basically
> something akin to...
>
> loaded = LOAD 'whatever' AS (whatever:chararray, icare:int)
> min_generated = FOREACH loaded GENERATE icare;
> min_group = GROUP min_generated ALL;
> min = FOREACH min_group GENERATE MIN(min_generated);
>
> generated = FOREACH loaded GENERATE whatever, icare/***min***;
>
> obviously this code would not work, but I am wondering if something in the
> spirit of it can be done in Pig.
>
> Thank you for your time
> Jon