Posted to dev@pig.apache.org by "liyunzhang_intel (JIRA)" <ji...@apache.org> on 2014/12/18 07:23:13 UTC

[jira] [Updated] (PIG-4282) Enable unit test "TestForEachNestedPlan" for spark

     [ https://issues.apache.org/jira/browse/PIG-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyunzhang_intel updated PIG-4282:
----------------------------------
    Attachment: PIG-4282.patch

The group operator returns its groups in a different order depending on the execution engine, for example "spark" versus "mapreduce".
For example, take groupdistinct.pig:
{code}
A = load 'input1.txt' as (age:int,gpa:int); 
B = group A by age;  
C = foreach B { 
 D = A.gpa; 
 E = distinct D;
 generate group, MIN(E);
};
dump C;
{code}

input1.txt is:
10	89
20	78
10	68
10	89
20	92

the mapreduce output is:
(10,68),(20,78)

the spark output is:
(20,78),(10,68)

All test cases of TestForEachNestedPlan pass in spark mode except TestInnerDistinct, which fails because of the ordering difference described above. The original test only accepts "(10,68),(20,78)" as the result. PIG-4282.patch accepts both "(10,68),(20,78)" and "(20,78),(10,68)".
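
As an illustration only, an order-insensitive check could look roughly like the sketch below. This is not the code in PIG-4282.patch; the class name OrderInsensitiveCheck and the methods counts and sameTuples are hypothetical, and the tuples are represented as plain strings to keep the example self-contained. It compares the two results as multisets, so any group order is accepted while duplicates still have to match.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Pig or of PIG-4282.patch.
public class OrderInsensitiveCheck {

    // Count how often each tuple string occurs, so duplicates are respected.
    public static Map<String, Integer> counts(List<String> tuples) {
        Map<String, Integer> result = new HashMap<>();
        for (String t : tuples) {
            result.merge(t, 1, Integer::sum);
        }
        return result;
    }

    // True when both lists hold the same tuples, regardless of order.
    public static boolean sameTuples(List<String> expected, List<String> actual) {
        return counts(expected).equals(counts(actual));
    }

    public static void main(String[] args) {
        // The two outputs quoted in this message.
        List<String> mapreduce = Arrays.asList("(10,68)", "(20,78)");
        List<String> spark = Arrays.asList("(20,78)", "(10,68)");
        System.out.println(sameTuples(mapreduce, spark)); // prints true
    }
}
```

Compared with listing every accepted ordering, a multiset comparison keeps working when the input produces more than two groups.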





> Enable unit test "TestForEachNestedPlan" for spark
> --------------------------------------------------
>
>                 Key: PIG-4282
>                 URL: https://issues.apache.org/jira/browse/PIG-4282
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: liyunzhang_intel
>         Attachments: PIG-4282.patch, TEST-org.apache.pig.test.TestForEachNestedPlan.txt
>
>
> error log is attached


