Posted to user@pig.apache.org by sonia gehlot <so...@gmail.com> on 2011/02/20 00:39:04 UTC

FOREACH GENERATE after if else condition

Hi Guys,

I am getting a weird error while running my Pig script.

case_state = FOREACH join_pe_pre GENERATE
    f1, f2, f3, f4,
    (
        (f5 matches '.*.facebook..*')
        ? f10
        : null
    ) as facebook_referrals,

    (
        (
            (f6 == 1)
            AND
            (f7 == 2000)
            AND
            (f8 == 1)
        )
        ? f10
        : null
    ) as cd_referrals,

    (
        (
            (f7 == 1770)
            OR
            (f7 == 1771)
        )
        ? f10
        : null
    ) as nm_referrals;

DUMP case_state;


Here, when I DUMP case_state, I get the desired results: the proper
case values for facebook_referrals, cd_referrals and nm_referrals.


gen_values = FOREACH case_state GENERATE *;

DUMP gen_values;

But after this, if I do a simple FOREACH ... GENERATE of everything again, I
get the same value for facebook_referrals, cd_referrals and nm_referrals.
All three values are the same as the value of the first if/else.
I could not figure out the possible reason for this.

Please let me know if I am doing anything wrong.

Thanks in advance.

Sonia

Re: FOREACH GENERATE after if else condition

Posted by sonia gehlot <so...@gmail.com>.
We found the patch for this issue:

Revision: http://svn.apache.org/viewvc?view=revision&revision=1057974
JIRA: https://issues.apache.org/jira/browse/PIG-1785

Sonia

On Tue, Feb 22, 2011 at 2:22 PM, Daniel Dai <ji...@yahoo-inc.com> wrote:

> I just tried your script. I can see the wrong output in 0.8 release, but it
> is fixed on current 0.8 branch (
> http://svn.apache.org/repos/asf/pig/branches/branch-0.8). Check out the
> 0.8 branch and try again.
>
> Daniel
>
>

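For anyone checking their own build against this report, the expected result can be worked out independently of Pig. The sketch below is an illustration in Python, not Pig Latin. It rebuilds google_user and other_user from the four sample rows Sonia posted in this thread, formats them the way DUMP prints tuples, and counts the rows where the reported gen_case output differs. The function names are hypothetical, and re.fullmatch is used on the assumption that Pig's matches operator has to match the whole string, as Java's String.matches does.

```python
import re

# Sample rows from the thread: name, country, website, sess_id, page_id.
ROWS = [
    ("John", "USA", "www.google.com", 1234, 900),
    ("Ron", "California", "www.facebook.com", 1432, 400),
    ("Sam", "NY", "www.orkut.com", 5432, 400),
    ("Bill", "UK", "www.google.com", 5647, 645),
]

# gen_case output reported on the 0.8.0 release (copied from the thread).
REPORTED_GEN_CASE = [
    "(John,USA,1234,1234)",
    "(Ron,California,,)",
    "(Sam,NY,,)",
    "(Bill,UK,5647,5647)",
]


def bincond(condition, value):
    """Model Pig's (condition ? value : null); None stands in for null."""
    return value if condition else None


def case_abc(row):
    """Hypothetical helper mirroring the first FOREACH ... GENERATE."""
    name, country, website, sess_id, page_id = row
    is_google = re.fullmatch(r".*.google..*", website) is not None
    google_user = bincond(is_google, sess_id)
    other_user = bincond(page_id == 400, sess_id)
    return (name, country, google_user, other_user)


def dump_line(tup):
    """Format a tuple the way DUMP shows it: nulls print as empty fields."""
    return "(" + ",".join("" if v is None else str(v) for v in tup) + ")"


# A pass-through FOREACH should not change any value, so the expected
# gen_case output is the same as the case_abc output.
expected = [dump_line(case_abc(row)) for row in ROWS]
mismatches = [(e, r) for e, r in zip(expected, REPORTED_GEN_CASE) if e != r]

for line in expected:
    print(line)
print(len(mismatches), "of", len(expected), "reported rows differ")
```

On the sample data every reported gen_case row differs from the expected one, because other_user repeats google_user, which is the symptom described in the original post.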
Re: FOREACH GENERATE after if else condition

Posted by Daniel Dai <ji...@yahoo-inc.com>.
I just tried your script. I can see the wrong output in 0.8 release, but 
it is fixed on current 0.8 branch 
(http://svn.apache.org/repos/asf/pig/branches/branch-0.8). Check out the 
0.8 branch and try again.

Daniel

Bill Graham wrote:
> Our version (I work with Sonia) is this:
>
> Apache Pig version 0.8.0-SNAPSHOT (rexported)
>
> Which is not very helpful. It's basically the 0.8.0 release with an
> early patch of PIG-1680 applied. I've tested with an unpatched 0.8.0
> and I can reproduce. Testing against the current trunk, the issue
> seems to be fixed.
>
> Does anyone know what JIRA might have fixed this, or even what part of
> the code we should look at to find the fix? If we know the patch we
> can apply it.
>
> thanks,
> Bill
>

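The version strings quoted in this thread show why the output of pig -version may not settle whether a build has the fix. One build reports an svn revision (r1073123), while the other reports only "rexported", presumably because its source tree was not an svn working copy. As a rough illustration (in Python, not part of Pig), the sketch below pulls the revision out of such a line and compares it with 1057974, the revision cited in this thread for PIG-1785. The function names are hypothetical, and a revision at or after 1057974 is only a hint: it assumes the build came from a branch that actually received that commit.

```python
import re

# Revision cited in this thread as the fix for PIG-1785.
FIX_REVISION = 1057974


def parse_revision(version_line):
    """Return the svn revision in a 'pig -version' line, or None if absent."""
    match = re.search(r"\(r(\d+)\)", version_line)
    return int(match.group(1)) if match else None


def fix_status(version_line):
    """Classify a build as 'maybe fixed', 'predates fix', or 'unknown'."""
    revision = parse_revision(version_line)
    if revision is None:
        return "unknown"
    return "maybe fixed" if revision >= FIX_REVISION else "predates fix"


for line in (
    "Apache Pig version 0.8.0-SNAPSHOT (r1073123)",
    "Apache Pig version 0.8.0-SNAPSHOT (rexported)",
):
    print(line, "->", fix_status(line))
```

For a build that reports no revision, rerunning the small sample script from this thread is a more direct check than anything the version string can give.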

Re: FOREACH GENERATE after if else condition

Posted by Bill Graham <bi...@gmail.com>.
Our version (I work with Sonia) is this:

Apache Pig version 0.8.0-SNAPSHOT (rexported)

Which is not very helpful. It's basically the 0.8.0 release with an
early patch of PIG-1680 applied. I've tested with an unpatched 0.8.0
and I can reproduce. Testing against the current trunk, the issue
seems to be fixed.

Does anyone know what JIRA might have fixed this, or even what part of
the code we should look at to find the fix? If we know the patch we
can apply it.

thanks,
Bill

On Tue, Feb 22, 2011 at 10:57 AM, Dmitriy Ryaboy <dv...@gmail.com> wrote:
> I am not sure where that version came from. Is this some CDH thing?
> Here's the output of my pig -version:
>
> Apache Pig version 0.8.0-SNAPSHOT (r1073123)
>
> -D
>
> On Tue, Feb 22, 2011 at 9:54 AM, sonia gehlot <so...@gmail.com>wrote:
>
>> Thanks Dmitriy,
>>
>> I am using Pig with pig.version=0.8.0_0.20.1_1-1. Are you using the
>> same one?
>>
>> Sonia
>>
>>
>> On Mon, Feb 21, 2011 at 2:39 PM, Dmitriy Ryaboy <dv...@gmail.com>wrote:
>>
>>> Sonia, check the svn revision.
>>>
>>> w-mbp13-dryaboy:pig-0.8 dmitriy$ svn info
>>> Path: .
>>> URL: https://svn.apache.org/repos/asf/pig/branches/branch-0.8
>>> Repository Root: https://svn.apache.org/repos/asf
>>> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
>>> Revision: 1073123
>>> Node Kind: directory
>>> Schedule: normal
>>> Last Changed Author: rding
>>> Last Changed Rev: 1072229
>>> Last Changed Date: 2011-02-18 17:20:38 -0800 (Fri, 18 Feb 2011)
>>>
>>> Also, here are the cksum results of pig jars:
>>>
>>> tw-mbp13-dryaboy:pig-0.8 dmitriy$ cksum build/pig-0.8.0-SNAPSHOT*jar
>>> 4198538967 2235407 build/pig-0.8.0-SNAPSHOT-core.jar
>>> 1926172930 3807045 build/pig-0.8.0-SNAPSHOT-withouthadoop.jar
>>> 1795234843 8902554 build/pig-0.8.0-SNAPSHOT.jar
>>>
>>>
>>>
>>> On Mon, Feb 21, 2011 at 2:14 PM, sonia gehlot <so...@gmail.com>wrote:
>>>
>>>> Hey Dmitriy,
>>>>
>>>> I tried running this again but I am getting the same error.
>>>> I will double-check my Pig version. I know it sounds like a dumb question,
>>>> but can you tell me how I can see which version of Pig I am using?
>>>>
>>>>
>>>> On Mon, Feb 21, 2011 at 11:57 AM, Dmitriy Ryaboy <dv...@gmail.com>wrote:
>>>>
>>>>> Sonia,
>>>>> I just tried your script using Pig from the 0.8 branch (didn't try it
>>>>> with the release), and got correct results. Please try that.
>>>>>
>>>>> dump case_abc:
>>>>> (John,USA,1234,)
>>>>> (Ron,California,,1432)
>>>>> (Sam,NY,,5432)
>>>>> (Bill,UK,5647,)
>>>>>
>>>>> dump gen_case:
>>>>> (John,USA,1234,)
>>>>> (Ron,California,,1432)
>>>>> (Sam,NY,,5432)
>>>>> (Bill,UK,5647,)
>>>>>
>>>>>
>>>>> On Mon, Feb 21, 2011 at 10:23 AM, Jonathan Coveney <jcoveney@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > I do not know about Sonia, but I know that when I ran into a similar
>>>>> > bug it was on trunk.
>>>>> >
>>>>> > 2011/2/21 Ramesh, Amit <am...@amazon.com>
>>>>> >
>>>>> > >
>>>>> > > This very much looks like a result of the bug that dropped schemas
>>>>> > > in the release version. I was bitten by it a couple of times, but have
>>>>> > > everything working now by pulling down a recent snapshot of the code
>>>>> > > from the svn 0.8 release branch. Quite a few major bug fixes have gone
>>>>> > > in since the original release.
>>>>> > >
>>>>> > > ~ Amit
>>>>> > >
>>>>> > > On Feb 21, 2011, at 6:47 AM, "Jonathan Coveney" <jcoveney@gmail.com>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Sonia,
>>>>> > > >
>>>>> > > > I absolutely have seen this bug, just couldn't find an easy way to
>>>>> > > > replicate it as it was buried in a bunch of code. The use case was
>>>>> > > > similar: multiple embedded ifs, some nulls, and then instead of
>>>>> > > > getting the expected output, it just repeats one. I know Dmitriy said
>>>>> > > > he'd look into it; I just want to say that this isn't an isolated
>>>>> > > > thing.
>>>>> > > >
>>>>> > > > 2011/2/20 Dmitriy Ryaboy <dv...@gmail.com>
>>>>> > > >
>>>>> > > >> This sounds like a bug. I'll check it out tomorrow.
>>>>> > > >>
>>>>> > > >> On Sat, Feb 19, 2011 at 7:04 PM, sonia gehlot <sonia.gehlot@gmail.com>
>>>>> > > >> wrote:
>>>>> > > >>
>>>>> > > >>> Here is an example; I hope this helps. I am running this on Pig 0.8.
>>>>> > > >>>
>>>>> > > >>> Sample data in text file
>>>>> > > >>> sample1.txt
>>>>> > > >>>
>>>>> > > >>> John USA www.google.com 1234 900
>>>>> > > >>> Ron California www.facebook.com 1432 400
>>>>> > > >>> Sam NY www.orkut.com 5432 400
>>>>> > > >>> Bill UK www.google.com 5647 645
>>>>> > > >>>
>>>>> > > >>>
>>>>> > > >>> abc = LOAD '/user/sgehlot/test_data/sample1.txt' as (
>>>>> > > >>>     name: chararray, country: chararray, website: chararray,
>>>>> > > >>>     sess_id: int, page_id: int);
>>>>> > > >>>
>>>>> > > >>> case_abc = FOREACH abc GENERATE
>>>>> > > >>>     name,
>>>>> > > >>>     country,
>>>>> > > >>>     ((website matches '.*.google..*') ? sess_id : null) as google_user,
>>>>> > > >>>     ((page_id == 400) ? sess_id : null) as other_user;
>>>>> > > >>>
>>>>> > > >>> DUMP case_abc;
>>>>> > > >>> ------------------
>>>>> > > >>> result of DUMP case_abc;
>>>>> > > >>>
>>>>> > > >>> (John,USA,1234,)
>>>>> > > >>> (Ron,California,,1432)
>>>>> > > >>> (Sam,NY,,5432)
>>>>> > > >>> (Bill,UK,5647,)
>>>>> > > >>>
>>>>> > > >>> -----------------
>>>>> > > >>> gen_case = FOREACH case_abc GENERATE name, country, google_user,
>>>>> > > >>> other_user;
>>>>> > > >>>
>>>>> > > >>> DUMP gen_case;
>>>>> > > >>> ---------------
>>>>> > > >>> result of DUMP gen_case;
>>>>> > > >>>
>>>>> > > >>> (John,USA,1234,1234)
>>>>> > > >>> (Ron,California,,)
>>>>> > > >>> (Sam,NY,,)
>>>>> > > >>> (Bill,UK,5647,5647)
>>>>> > > >>>
>>>>> > > >>> You can see that in the 1st DUMP the if conditions work fine. Then, in
>>>>> > > >>> the 2nd DUMP, after the 2nd FOREACH, the if conditions are messed up.
>>>>> > > >>>
>>>>> > > >>> Let me know if this makes sense.
>>>>> > > >>>
>>>>> > > >>> Sonia
>>>>> > > >>>
>>>>> > > >>>
>>>>> > > >>> On Sat, Feb 19, 2011 at 4:55 PM, Dmitriy Ryaboy <dvryaboy@gmail.com>
>>>>> > > >>> wrote:
>>>>> > > >>>
>>>>> > > >>>> Hi Sonia,
>>>>> > > >>>> Looks like something went wrong in your pasting of the Pig code.
>>>>> > > >>>> Could you try again, and also add some sample inputs/outputs?
>>>>> > > >>>>
>>>>> > > >>>> As in, contents of join_pe_pre, contents of case_state, and contents
>>>>> > > >>>> of gen_values that illustrate the problem and allow us to reproduce
>>>>> > > >>>> it.
>>>>> > > >>>>
>>>>> > > >>>> Also please tell us what version of Pig you are using.
>>>>> > > >>>>
>>>>> > > >>>> Thanks,
>>>>> > > >>>> -D
>>>>> > > >>>>
>>>>> > > >>>> On Sat, Feb 19, 2011 at 3:39 PM, sonia gehlot <sonia.gehlot@gmail.com>
>>>>> > > >>>> wrote:
>>>>> > > >>>>
>>>>> > > >>>>> Hi Guys,
>>>>> > > >>>>>
>>>>> > > >>>>> I getting wired error while running my pig script.
>>>>> > > >>>>>
>>>>> > > >>>>>  *case_state = FOREACH join_pe_pre GENERATE*
>>>>> > > >>>>>
>>>>> > > >>>>> *  f1, f2, f3, f4,   (*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  (f5 '.*.facebook..*')*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  ? f10*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  : null*
>>>>> > > >>>>>
>>>>> > > >>>>> *          ) as facebook_referrals,*
>>>>> > > >>>>>
>>>>> > > >>>>> *
>>>>> > > >>>>> *
>>>>> > > >>>>>
>>>>> > > >>>>> *          (*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  (*
>>>>> > > >>>>>
>>>>> > > >>>>> *                          (f6 == 1)*
>>>>> > > >>>>>
>>>>> > > >>>>> *                          AND*
>>>>> > > >>>>>
>>>>> > > >>>>> *                          (f7 == 2000)*
>>>>> > > >>>>>
>>>>> > > >>>>> *                          AND*
>>>>> > > >>>>>
>>>>> > > >>>>> *                          (f8 == 1)*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  )*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  ? f10*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  : null*
>>>>> > > >>>>>
>>>>> > > >>>>> *          ) as cd_referrals,*
>>>>> > > >>>>>
>>>>> > > >>>>> *
>>>>> > > >>>>> *
>>>>> > > >>>>>
>>>>> > > >>>>> *          (*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  (*
>>>>> > > >>>>>
>>>>> > > >>>>> *                          (f7 == 1770)*
>>>>> > > >>>>>
>>>>> > > >>>>> *                          OR*
>>>>> > > >>>>>
>>>>> > > >>>>> *                          (f7 == 1771)*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  )*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  ? f10*
>>>>> > > >>>>>
>>>>> > > >>>>> *                  : null*
>>>>> > > >>>>>
>>>>> > > >>>>> *          ) as nm_referrals;*
>>>>> > > >>>>>
>>>>> > > >>>>> *
>>>>> > > >>>>> *
>>>>> > > >>>>>
>>>>> > > >>>>> *DUMP case_state;*
>>>>> > > >>>>>
>>>>> > > >>>>>
>>>>> > > >>>>> Here when I am doing DUMP case_state I am getting desired
>>>>> results,
>>>>> > > >> proper
>>>>> > > >>>>> case values for facebook_referrals, cd_referrals and
>>>>> nm_referrals
>>>>> > > >>>>>
>>>>> > > >>>>>
>>>>> > > >>>>> *gen_values = FOREACH case_state GENERATE *;*
>>>>> > > >>>>>
>>>>> > > >>>>> *
>>>>> > > >>>>> *
>>>>> > > >>>>> *DUMP gen_values; *
>>>>> > > >>>>> *
>>>>> > > >>>>> *
>>>>> > > >>>>> But after this if I do simple FOREACH GENERATE everything
>>>>> again at
>>>>> > > this
>>>>> > > >>>>> moment I am getting same values for  facebook_referrals,
>>>>> > cd_referrals
>>>>> > > >>>>> and nm_referrals. All these three values are same as whatever
>>>>> the
>>>>> > > value
>>>>> > > >>>>> of
>>>>> > > >>>>> first if else.
>>>>> > > >>>>> I could able to figure out what could be the possible reason
>>>>> of
>>>>> > this.
>>>>> > > >>>>>
>>>>> > > >>>>> Please let me know if I am doing anything wrong.
>>>>> > > >>>>>
>>>>> > > >>>>> Thanks in advance.
>>>>> > > >>>>>
>>>>> > > >>>>> Sonia
>>>>> > > >>>>>
>>>>> > > >>>>
>>>>> > > >>>>
>>>>> > > >>>
>>>>> > > >>
>>>>> > >
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>

Re: FOREACH GENERATE after if else condition

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
I am not sure where that version came from. Is this some CDH thing?
Here's the output of my pig -version:

Apache Pig version 0.8.0-SNAPSHOT (r1073123)
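
A minimal sketch, not part of the original message, for pulling the svn revision out of a version line like the one above so two installs can be compared. The helper name pig_revision is hypothetical, and it assumes the revision is printed in the "(r<number>)" form shown here; a packaged build may print something different.

```shell
# Hypothetical helper: read a `pig -version` line on stdin and print the
# svn revision number, or nothing if no "(r<digits>)" token is present.
pig_revision() {
  sed -n 's/.*(r\([0-9][0-9]*\)).*/\1/p'
}

# On a machine with Pig installed (assumption): pig -version 2>&1 | pig_revision
# Demonstration using the line quoted in this message:
echo 'Apache Pig version 0.8.0-SNAPSHOT (r1073123)' | pig_revision   # prints 1073123
```

If the helper prints nothing, the version string probably does not carry an svn revision, which would be consistent with a repackaged distribution.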

-D


Re: FOREACH GENERATE after if else condition

Posted by sonia gehlot <so...@gmail.com>.
Thanks Dmitriy,

I am using Pig version pig.version=0.8.0_0.20.1_1-1. Are you also using
the same one?

Sonia


Re: FOREACH GENERATE after if else condition

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Sonia, check the svn revision.

w-mbp13-dryaboy:pig-0.8 dmitriy$ svn info
Path: .
URL: https://svn.apache.org/repos/asf/pig/branches/branch-0.8
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1073123
Node Kind: directory
Schedule: normal
Last Changed Author: rding
Last Changed Rev: 1072229
Last Changed Date: 2011-02-18 17:20:38 -0800 (Fri, 18 Feb 2011)

Also, here are the cksum results of pig jars:

tw-mbp13-dryaboy:pig-0.8 dmitriy$ cksum build/pig-0.8.0-SNAPSHOT*jar
4198538967 2235407 build/pig-0.8.0-SNAPSHOT-core.jar
1926172930 3807045 build/pig-0.8.0-SNAPSHOT-withouthadoop.jar
1795234843 8902554 build/pig-0.8.0-SNAPSHOT.jar
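
A rough sketch, not from the original message, of how the checksums above could be compared against a local jar. The helper name same_cksum is hypothetical. Note that a mismatch may not prove a different source revision, since two builds of the same code can produce jars that differ byte for byte.

```shell
# Hypothetical helper: succeed only when the CRC that cksum reports for a
# file equals the expected value.
same_cksum() {
  expected="$1"
  file="$2"
  # cksum prints "<crc> <size> <name>"; keep only the first field.
  actual=$(cksum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Example call, assuming the jar exists at this path on your machine:
# same_cksum 1795234843 build/pig-0.8.0-SNAPSHOT.jar && echo "same jar"
```

The svn revision is likely the more reliable thing to compare; the checksum is only a quick check that two people are running the identical jar file.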




Re: FOREACH GENERATE after if else condition

Posted by sonia gehlot <so...@gmail.com>.
Hey Dmitriy,

I tried running this again but I am getting the same error.
I will double-check my Pig version. I know it sounds like a dumb question, but
can you tell me how I can see which version of Pig I am using?

On Mon, Feb 21, 2011 at 11:57 AM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Sonia,
> I just tried your script using the version of Pig 8 in the 0.8 branch
> (didn't try it with the release), and got correct results. Please try that.
>
> dump case_abc:
> (John,USA,1234,)
> (Ron,California,,1432)
> (Sam,NY,,5432)
> (Bill,UK,5647,)
>
> dump gen_case:
> (John,USA,1234,)
> (Ron,California,,1432)
> (Sam,NY,,5432)
> (Bill,UK,5647,)
>
>
> On Mon, Feb 21, 2011 at 10:23 AM, Jonathan Coveney <jcoveney@gmail.com> wrote:
>
> > I do not know about sonia, but I know that when I ran into a similar bug it
> > was on trunk.
> >
> > 2011/2/21 Ramesh, Amit <am...@amazon.com>
> >
> > >
> > > This very much looks like a result of the bug that dropped schemas in the
> > > release version. I was bitten by it a couple of times, but have everything
> > > working now by pulling down a recent snapshot of the code from the svn 0.8
> > > release branch. Quite a few major bug fixes have gone in since the original
> > > release.
> > >
> > > ~ Amit
> > >
> > > On Feb 21, 2011, at 6:47 AM, "Jonathan Coveney" <jc...@gmail.com> wrote:
> > >
> > > > Sonia,
> > > >
> > > > I absolutely have seen this bug, just couldn't find an easy way to replicate
> > > > it as it was buried in a bunch of code. The use case was similar: multiple
> > > > embedded ifs, some nulls, and then instead of getting the expected output,
> > > > it just repeats one. I know dmitriy said he'd look into it I just want to
> > > > say that this isn't an isolated thing.
> > > >
> > > > 2011/2/20 Dmitriy Ryaboy <dv...@gmail.com>
> > > >
> > > >> This sounds like a bug. I'll check it out tomorrow.
> > > >>
> > > >> On Sat, Feb 19, 2011 at 7:04 PM, sonia gehlot <sonia.gehlot@gmail.com> wrote:
> > > >>
> > > >>> Here is an example, Hope this will help. I am running this on Pig 0.8
> > > >>> version.
> > > >>>
> > > >>> Sample data in text file
> > > >>> sample1.txt
> > > >>>
> > > >>> John USA www.google.com 1234 900
> > > >>> Ron California www.facebook.com 1432 400
> > > >>> Sam NY www.orkut.com 5432 400
> > > >>> Bill UK www.google.com 5647 645
> > > >>>
> > > >>>
> > > >>> abc = LOAD '/user/sgehlot/test_data/sample1.txt' as (name:
> chararray,
> > > >>> country: chararray, website: chararray, sess_id: int, page_id:
> int);
> > > >>>
> > > >>> case_abc = FOREACH abc GENERATE
> > > >>> name,
> > > >>> country,
> > > >>> ((website matches '.*.google..*') ? sess_id : null ) as
> google_user,
> > > >>> ((page_id == 400) ? sess_id : null) as other_user;
> > > >>>
> > > >>> DUMP case_abc;
> > > >>> ------------------
> > > >>> result of DUMP case_abc;
> > > >>>
> > > >>> (John,USA,1234,)
> > > >>> (Ron,California,,1432)
> > > >>> (Sam,NY,,5432)
> > > >>> (Bill,UK,5647,)
> > > >>>
> > > >>> -----------------
> > > >>> gen_case = FOREACH case_abc GENERATE name, country, google_user,
> > > >>> other_user;
> > > >>>
> > > >>> DUMP gen_case;
> > > >>> ---------------
> > > >>> result of DUMP gen_case;
> > > >>>
> > > >>> (John,USA,1234,1234)
> > > >>> (Ron,California,,)
> > > >>> (Sam,NY,,)
> > > >>> (Bill,UK,5647,5647)
> > > >>>
> > > >>> You can see in 1st DUMP if conditions are working fine. Then in 2nd
> > > dump
> > > >>> after 2nd foreach it messed up with if conditions.
> > > >>>
> > > >>> Let me know if it does make any sense.
> > > >>>
> > > >>> Sonia
> > > >>>
> > > >>>
> > > >>> On Sat, Feb 19, 2011 at 4:55 PM, Dmitriy Ryaboy <
> dvryaboy@gmail.com
> > > >>> wrote:
> > > >>>
> > > >>>> Hi Sonia,
> > > >>>> Looks like something went wrong in your pasting of the Pig code.
> > Could
> > > >> you
> > > >>>> try again, and also add some sample inputs/outputs?
> > > >>>>
> > > >>>> As in, contents of join_pe_pre, contents of case_state, and
> contents
> > > of
> > > >>>> gen_values that illustrate the problem and allow us to reproduce
> it.
> > > >>>>
> > > >>>> Also please tell us what version of Pig you are using.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> -D
> > > >>>>
> > > >>>> On Sat, Feb 19, 2011 at 3:39 PM, sonia gehlot <
> > sonia.gehlot@gmail.com
> > > >>> wrote:
> > > >>>>
> > > >>>>> Hi Guys,
> > > >>>>>
> > > >>>>> I getting wired error while running my pig script.
> > > >>>>>
> > > >>>>>  *case_state = FOREACH join_pe_pre GENERATE*
> > > >>>>>
> > > >>>>> *  f1, f2, f3, f4,   (*
> > > >>>>>
> > > >>>>> *                  (f5 '.*.facebook..*')*
> > > >>>>>
> > > >>>>> *                  ? f10*
> > > >>>>>
> > > >>>>> *                  : null*
> > > >>>>>
> > > >>>>> *          ) as facebook_referrals,*
> > > >>>>>
> > > >>>>> *
> > > >>>>> *
> > > >>>>>
> > > >>>>> *          (*
> > > >>>>>
> > > >>>>> *                  (*
> > > >>>>>
> > > >>>>> *                          (f6 == 1)*
> > > >>>>>
> > > >>>>> *                          AND*
> > > >>>>>
> > > >>>>> *                          (f7 == 2000)*
> > > >>>>>
> > > >>>>> *                          AND*
> > > >>>>>
> > > >>>>> *                          (f8 == 1)*
> > > >>>>>
> > > >>>>> *                  )*
> > > >>>>>
> > > >>>>> *                  ? f10*
> > > >>>>>
> > > >>>>> *                  : null*
> > > >>>>>
> > > >>>>> *          ) as cd_referrals,*
> > > >>>>>
> > > >>>>> *
> > > >>>>> *
> > > >>>>>
> > > >>>>> *          (*
> > > >>>>>
> > > >>>>> *                  (*
> > > >>>>>
> > > >>>>> *                          (f7 == 1770)*
> > > >>>>>
> > > >>>>> *                          OR*
> > > >>>>>
> > > >>>>> *                          (f7 == 1771)*
> > > >>>>>
> > > >>>>> *                  )*
> > > >>>>>
> > > >>>>> *                  ? f10*
> > > >>>>>
> > > >>>>> *                  : null*
> > > >>>>>
> > > >>>>> *          ) as nm_referrals;*
> > > >>>>>
> > > >>>>> *
> > > >>>>> *
> > > >>>>>
> > > >>>>> *DUMP case_state;*
> > > >>>>>
> > > >>>>>
> > > >>>>> Here when I am doing DUMP case_state I am getting desired
> results,
> > > >> proper
> > > >>>>> case values for facebook_referrals, cd_referrals and nm_referrals
> > > >>>>>
> > > >>>>>
> > > >>>>> *gen_values = FOREACH case_state GENERATE *;*
> > > >>>>>
> > > >>>>> *
> > > >>>>> *
> > > >>>>> *DUMP gen_values; *
> > > >>>>> *
> > > >>>>> *
> > > >>>>> But after this if I do simple FOREACH GENERATE everything again
> at
> > > this
> > > >>>>> moment I am getting same values for  facebook_referrals,
> > cd_referrals
> > > >>>>> and nm_referrals. All these three values are same as whatever the
> > > value
> > > >>>>> of
> > > >>>>> first if else.
> > > >>>>> I could able to figure out what could be the possible reason of
> > this.
> > > >>>>>
> > > >>>>> Please let me know if I am doing anything wrong.
> > > >>>>>
> > > >>>>> Thanks in advance.
> > > >>>>>
> > > >>>>> Sonia
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
>

Re: FOREACH GENERATE after if else condition

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Sonia,
I just tried your script with a build of Pig from the 0.8 branch (I didn't
try it with the release) and got correct results. Please try that.

dump case_abc:
(John,USA,1234,)
(Ron,California,,1432)
(Sam,NY,,5432)
(Bill,UK,5647,)

dump gen_case:
(John,USA,1234,)
(Ron,California,,1432)
(Sam,NY,,5432)
(Bill,UK,5647,)
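
One way to make the comparison between the two dumps mechanical, as a minimal sketch only: in the sample data no row both matches google and has page_id 400, so google_user and other_user should never both be set in the same tuple. The relative path 'sample1.txt', the tab delimiter, and the relation name both_set are assumptions for illustration and are not from the thread.

```pig
-- Assumption: sample1.txt is tab-delimited and sits in the working directory.
abc = LOAD 'sample1.txt' USING PigStorage('\t')
      AS (name:chararray, country:chararray, website:chararray,
          sess_id:int, page_id:int);

-- Same conditional columns as in Sonia's example.
case_abc = FOREACH abc GENERATE
    name,
    country,
    ((website matches '.*.google..*') ? sess_id : null) AS google_user,
    ((page_id == 400) ? sess_id : null) AS other_user;

-- The second projection, where the wrong values were reported.
gen_case = FOREACH case_abc GENERATE name, country, google_user, other_user;

-- both_set is a hypothetical name; it keeps rows where both columns are non-null.
both_set = FILTER gen_case BY (google_user IS NOT NULL) AND (other_user IS NOT NULL);
DUMP both_set;
```

On a build that behaves correctly this should print no tuples. With the gen_case output Sonia reported, the John and Bill rows would show up. The extra FILTER changes the query plan, so treat this as a check on the data and not as proof about where the bug lives.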


On Mon, Feb 21, 2011 at 10:23 AM, Jonathan Coveney <jc...@gmail.com> wrote:

> I do not know about sonia, but I know that when I ran into a similar bug it
> was on trunk.

Re: FOREACH GENERATE after if else condition

Posted by Jonathan Coveney <jc...@gmail.com>.
I do not know about Sonia, but I know that when I ran into a similar bug it
was on trunk.

2011/2/21 Ramesh, Amit <am...@amazon.com>

>
> This very much looks like a result of the bug that dropped schemas in the
> release version. I was bitten by it a couple of times, but have everything
> working now by pulling down a recent snapshot of the code from the svn 0.8
> release branch. Quite a few major bug fixes have gone in since the original
> release.
>
> ~ Amit

Re: FOREACH GENERATE after if else condition

Posted by "Ramesh, Amit" <am...@amazon.com>.
This very much looks like a result of the bug that dropped schemas in the release version. I was bitten by it a couple of times, but have everything working now by pulling down a recent snapshot of the code from the svn 0.8 release branch. Quite a few major bug fixes have gone in since the original release.
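
If dropped schemas are the suspect, one quick check is to compare what Pig believes the schema is before and after the second FOREACH. This is a minimal sketch that reuses Sonia's example; the relative path 'sample1.txt' and the tab delimiter are assumptions.

```pig
-- Assumption: sample1.txt is tab-delimited and sits in the working directory.
abc = LOAD 'sample1.txt' USING PigStorage('\t')
      AS (name:chararray, country:chararray, website:chararray,
          sess_id:int, page_id:int);

case_abc = FOREACH abc GENERATE
    name,
    country,
    ((website matches '.*.google..*') ? sess_id : null) AS google_user,
    ((page_id == 400) ? sess_id : null) AS other_user;

gen_case = FOREACH case_abc GENERATE name, country, google_user, other_user;

-- DESCRIBE prints the schema Pig has inferred for a relation.
DESCRIBE case_abc;
DESCRIBE gen_case;
```

I would expect both to list google_user and other_user as two separate fields. If a field name is missing or repeated in either output, that would point toward the schema handling and away from the conditions themselves.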

~ Amit

On Feb 21, 2011, at 6:47 AM, "Jonathan Coveney" <jc...@gmail.com> wrote:

> Sonia,
> 
> I absolutely have seen this bug, just couldn't find an easy way to replicate
> it as it was buried in a bunch of code. The use case was similar: multiple
> embedded ifs, some nulls, and then instead of getting the expected output,
> it just repeats one. I know dmitriy said he'd look into it I just want to
> say that this isn't an isolated thing.

Re: FOREACH GENERATE after if else condition

Posted by Jonathan Coveney <jc...@gmail.com>.
Sonia,

I have absolutely seen this bug; I just couldn't find an easy way to
replicate it, as it was buried in a bunch of code. The use case was similar:
multiple embedded ifs, some nulls, and then, instead of the expected output,
one value is simply repeated. I know Dmitriy said he'd look into it; I just
want to say that this isn't an isolated thing.
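
For anyone trying to reproduce the "embedded ifs with nulls" shape, here is a rough sketch built on Sonia's sample data. It is not the script I originally hit this with. The relative path, the tab delimiter, the relation names tiers and again, the field name tier, and the string labels are all made up for illustration.

```pig
-- Assumption: sample1.txt is tab-delimited and sits in the working directory.
abc = LOAD 'sample1.txt' USING PigStorage('\t')
      AS (name:chararray, country:chararray, website:chararray,
          sess_id:int, page_id:int);

-- One nested conditional plus one simple conditional, both with a null branch.
tiers = FOREACH abc GENERATE
    name,
    ((page_id == 400) ? 'four_hundred'
        : ((page_id == 900) ? 'nine_hundred' : null)) AS tier,
    ((website matches '.*.google..*') ? sess_id : null) AS google_user;

-- A plain projection on top, as in the failing case.
again = FOREACH tiers GENERATE name, tier, google_user;

DUMP tiers;
DUMP again;
```

If the two dumps differ, or if one conditional column simply repeats another, that matches the symptom described in this thread.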

2011/2/20 Dmitriy Ryaboy <dv...@gmail.com>

> This sounds like a bug. I'll check it out tomorrow.

Re: FOREACH GENERATE after if else condition

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
This sounds like a bug. I'll check it out tomorrow.

On Sat, Feb 19, 2011 at 7:04 PM, sonia gehlot <so...@gmail.com> wrote:

> Here is an example, Hope this will help. I am running this on Pig 0.8
> version.
>
> Sample data in text file
> sample1.txt
>
> John USA www.google.com 1234 900
> Ron California www.facebook.com 1432 400
> Sam NY www.orkut.com 5432 400
> Bill UK www.google.com 5647 645
>
>
> abc = LOAD '/user/sgehlot/test_data/sample1.txt' as (name: chararray,
> country: chararray, website: chararray, sess_id: int, page_id: int);
>
> case_abc = FOREACH abc GENERATE
>  name,
> country,
> ((website matches '.*.google..*') ? sess_id : null ) as google_user,
>  ((page_id == 400) ? sess_id : null) as other_user;
>
> DUMP case_abc;
> ------------------
> result of DUMP case_abc;
>
> (John,USA,1234,)
> (Ron,California,,1432)
> (Sam,NY,,5432)
> (Bill,UK,5647,)
>
> -----------------
> gen_case = FOREACH case_abc GENERATE name, country, google_user,
> other_user;
>
> DUMP gen_case;
> ---------------
> result of DUMP gen_case;
>
> (John,USA,1234,1234)
> (Ron,California,,)
> (Sam,NY,,)
> (Bill,UK,5647,5647)
>
> You can see in 1st DUMP if conditions are working fine. Then in 2nd dump
> after 2nd foreach it messed up with if conditions.
>
> Let me know if it does make any sense.
>
> Sonia

Re: FOREACH GENERATE after if else condition

Posted by sonia gehlot <so...@gmail.com>.
Is anyone else facing this kind of issue, or am I doing something wrong?

Thanks,
Sonia

On Sat, Feb 19, 2011 at 7:04 PM, sonia gehlot <so...@gmail.com> wrote:

> Here is an example, Hope this will help. I am running this on Pig 0.8
> version.
>
> Sample data in text file
> sample1.txt
>
> John USA www.google.com 1234 900
> Ron California www.facebook.com 1432 400
> Sam NY www.orkut.com 5432 400
> Bill UK www.google.com 5647 645
>
>
> abc = LOAD '/user/sgehlot/test_data/sample1.txt' as (name: chararray,
> country: chararray, website: chararray, sess_id: int, page_id: int);
>
> case_abc = FOREACH abc GENERATE
>  name,
> country,
> ((website matches '.*.google..*') ? sess_id : null ) as google_user,
>  ((page_id == 400) ? sess_id : null) as other_user;
>
> DUMP case_abc;
> ------------------
> result of DUMP case_abc;
>
> (John,USA,1234,)
> (Ron,California,,1432)
> (Sam,NY,,5432)
> (Bill,UK,5647,)
>
> -----------------
> gen_case = FOREACH case_abc GENERATE name, country, google_user,
> other_user;
>
> DUMP gen_case;
> ---------------
> result of DUMP gen_case;
>
> (John,USA,1234,1234)
> (Ron,California,,)
> (Sam,NY,,)
> (Bill,UK,5647,5647)
>
> You can see in 1st DUMP if conditions are working fine. Then in 2nd dump
> after 2nd foreach it messed up with if conditions.
>
> Let me know if it does make any sense.
>
> Sonia

Re: FOREACH GENERATE after if else condition

Posted by sonia gehlot <so...@gmail.com>.
Here is an example; I hope it helps. I am running this on Pig 0.8.

Sample data in the text file sample1.txt:

John USA www.google.com 1234 900
Ron California www.facebook.com 1432 400
Sam NY www.orkut.com 5432 400
Bill UK www.google.com 5647 645


-- Load the sample rows with an explicit schema.
abc = LOAD '/user/sgehlot/test_data/sample1.txt' as (name: chararray,
    country: chararray, website: chararray, sess_id: int, page_id: int);

-- Two independent conditional (bincond) columns derived from sess_id.
case_abc = FOREACH abc GENERATE
    name,
    country,
    ((website matches '.*.google..*') ? sess_id : null) as google_user,
    ((page_id == 400) ? sess_id : null) as other_user;

DUMP case_abc;
------------------
result of DUMP case_abc;

(John,USA,1234,)
(Ron,California,,1432)
(Sam,NY,,5432)
(Bill,UK,5647,)

-----------------
gen_case = FOREACH case_abc GENERATE name, country, google_user, other_user;

DUMP gen_case;
---------------
result of DUMP gen_case;

(John,USA,1234,1234)
(Ron,California,,)
(Sam,NY,,)
(Bill,UK,5647,5647)

As you can see, in the first DUMP the conditional expressions work fine.
In the second DUMP, after the second FOREACH, the conditions are mixed up:
other_user takes the same value as google_user.

Let me know if this makes sense.

Sonia
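
For reference, the intended semantics of the two FOREACH statements can be modelled outside Pig. The following is only a sketch in Python, not Pig code: the names ROWS, GOOGLE, case_abc and gen_case are illustrative, and re.fullmatch stands in for Pig's matches operator, which has to match the whole string. Because the second FOREACH is a plain projection, the model's final output equals the first DUMP.

```python
import re

# Rows from sample1.txt as (name, country, website, sess_id, page_id).
ROWS = [
    ("John", "USA", "www.google.com", 1234, 900),
    ("Ron", "California", "www.facebook.com", 1432, 400),
    ("Sam", "NY", "www.orkut.com", 5432, 400),
    ("Bill", "UK", "www.google.com", 5647, 645),
]

# Same pattern as in the Pig script; fullmatch requires a whole-string match.
GOOGLE = re.compile(r".*.google..*")


def case_abc(rows):
    """Model of the first FOREACH: two independent conditional columns."""
    out = []
    for name, country, website, sess_id, page_id in rows:
        google_user = sess_id if GOOGLE.fullmatch(website) else None
        other_user = sess_id if page_id == 400 else None
        out.append((name, country, google_user, other_user))
    return out


def gen_case(rows):
    """Model of the second FOREACH: a plain projection, values pass through."""
    return [(name, country, google_user, other_user)
            for name, country, google_user, other_user in rows]


expected = gen_case(case_abc(ROWS))
for row in expected:
    print(row)
```

Comparing this with the second DUMP above shows that only the other_user column differs from what the script asks for.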


On Sat, Feb 19, 2011 at 4:55 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Hi Sonia,
> Looks like something went wrong in your pasting of the Pig code. Could you
> try again, and also add some sample inputs/outputs?
>
> As in, contents of join_pe_pre, contents of case_state, and contents of
> gen_values that illustrate the problem and allow us to reproduce it.
>
> Also please tell us what version of Pig you are using.
>
> Thanks,
> -D

Re: FOREACH GENERATE after if else condition

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hi Sonia,
Looks like something went wrong in your pasting of the Pig code. Could you
try again, and also add some sample inputs/outputs?

As in, contents of join_pe_pre, contents of case_state, and contents of
gen_values that illustrate the problem and allow us to reproduce it.

Also please tell us what version of Pig you are using.

Thanks,
-D

On Sat, Feb 19, 2011 at 3:39 PM, sonia gehlot <so...@gmail.com> wrote:

> Hi Guys,
>
> I am getting a weird error while running my Pig script.
>
> case_state = FOREACH join_pe_pre GENERATE
>   f1, f2, f3, f4,
>   (
>     (f5 '.*.facebook..*')
>     ? f10
>     : null
>   ) as facebook_referrals,
>
>   (
>     (
>       (f6 == 1)
>       AND
>       (f7 == 2000)
>       AND
>       (f8 == 1)
>     )
>     ? f10
>     : null
>   ) as cd_referrals,
>
>   (
>     (
>       (f7 == 1770)
>       OR
>       (f7 == 1771)
>     )
>     ? f10
>     : null
>   ) as nm_referrals;
>
> DUMP case_state;
>
> When I DUMP case_state, I get the desired results: the correct values for
> facebook_referrals, cd_referrals and nm_referrals.
>
> gen_values = FOREACH case_state GENERATE *;
>
> DUMP gen_values;
>
> But if I then run a simple FOREACH ... GENERATE over everything, I get the
> same value for facebook_referrals, cd_referrals and nm_referrals: all three
> take whatever value the first if/else produced.
> I have not been able to figure out the possible reason for this.
>
> Please let me know if I am doing anything wrong.
>
> Thanks in advance.
>
> Sonia