Posted to dev@pig.apache.org by "liyunzhang_intel (JIRA)" <ji...@apache.org> on 2014/12/18 07:23:13 UTC
[jira] [Updated] (PIG-4282) Enable unit test "TestForEachNestedPlan" for spark
[ https://issues.apache.org/jira/browse/PIG-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liyunzhang_intel updated PIG-4282:
----------------------------------
Attachment: PIG-4282.patch
The group operator can return its groups in a different order on different execution engines such as "spark" and "mapreduce".
For example:
groupdistinct.pig
{code}
A = load 'input1.txt' as (age:int,gpa:int);
B = group A by age;
C = foreach B {
    D = A.gpa;
    E = distinct D;
    generate group, MIN(E);
};
dump C;
{code}
input1.txt is:
10 89
20 78
10 68
10 89
20 92
the mapreduce output is:
(10,68),(20,78)
the spark output is:
(20,78),(10,68)
All test cases of TestForEachNestedPlan pass in spark mode except TestInnerDistinct, which fails because of the ordering difference described above. The original code only accepts "(10,68),(20,78)" as the result. PIG-4282.patch accepts both "(10,68),(20,78)" and "(20,78),(10,68)".
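As an illustration of the idea only (this is not the code in PIG-4282.patch), a test can compare the result tuples without depending on their order. The sketch below is a minimal one; the class name UnorderedResultCheck and the method sameTuples are hypothetical, and the tuples are assumed to be available as strings:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not taken from PIG-4282.patch.
public class UnorderedResultCheck {

    // Returns true when both lists hold the same tuple strings with the
    // same multiplicities, regardless of the order they arrive in.
    public static boolean sameTuples(List<String> expected, List<String> actual) {
        // Sort copies so the caller's lists are left untouched.
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("(10,68)", "(20,78)");
        List<String> sparkOutput = Arrays.asList("(20,78)", "(10,68)");
        // The two lists differ only in order, so the check succeeds.
        System.out.println(sameTuples(expected, sparkOutput));
    }
}
```

Sorting copies rather than comparing sets keeps duplicate tuples significant, so a result that drops or repeats a tuple would still be reported as a mismatch.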
> Enable unit test "TestForEachNestedPlan" for spark
> --------------------------------------------------
>
> Key: PIG-4282
> URL: https://issues.apache.org/jira/browse/PIG-4282
> Project: Pig
> Issue Type: Bug
> Components: spark
> Reporter: liyunzhang_intel
> Attachments: PIG-4282.patch, TEST-org.apache.pig.test.TestForEachNestedPlan.txt
>
>
> error log is attached
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)