Posted to dev@pig.apache.org by "liyunzhang_intel (JIRA)" <ji...@apache.org> on 2015/06/18 05:25:00 UTC
[jira] [Commented] (PIG-4607) Enable "TestRank1","TestRank3" unit tests in spark mode
[ https://issues.apache.org/jira/browse/PIG-4607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591176#comment-14591176 ]
liyunzhang_intel commented on PIG-4607:
---------------------------------------
Here is an example that explains why the TestRank1 and TestRank3 unit tests fail:
rank01RowNumber.pig:
{code}
A = LOAD './rank01RowNumber.txt' AS (f1:chararray,f2:int,f3:chararray);
C = rank A;
store C into './rank01RowNumber.out';
{code}
cat rank01RowNumber.txt:
{code}
A 1 N
B 2 N
C 3 M
D 4 P
E 4 Q
E 4 Q
F 8 Q
F 7 Q
F 8 T
F 8 Q
G 10 V
{code}
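For reference, RANK without a BY clause is expected to prepend a sequential row number to each tuple, which is what the test name testRank01RowNumber suggests. Below is a minimal plain-Python sketch of that expected result. It is not Pig code, the helper name is hypothetical, and it assumes the rows are read in file order:

```python
# Rows of rank01RowNumber.txt as (f1, f2, f3) tuples, in file order.
ROWS = [
    ("A", 1, "N"), ("B", 2, "N"), ("C", 3, "M"), ("D", 4, "P"),
    ("E", 4, "Q"), ("E", 4, "Q"), ("F", 8, "Q"), ("F", 7, "Q"),
    ("F", 8, "T"), ("F", 8, "Q"), ("G", 10, "V"),
]


def rank_row_number(rows):
    """Hypothetical helper: prepend a 1-based row number to every tuple."""
    return [(i, *row) for i, row in enumerate(rows, start=1)]


if __name__ == "__main__":
    for ranked in rank_row_number(ROWS):
        print(ranked)
```

Note that duplicate rows (the two "E 4 Q" lines) still get distinct numbers, because the rank here is a row number and not a value-based rank.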
The physical plan is:
{code}
#-----------------------------------------------
# Physical Plan:
#-----------------------------------------------
C: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-13
|
|---C: PORank[tuple] - scope-12
    |
    |---C: POCounter[tuple] - scope-11
        |
        |---A: New For Each(false,false,false)[bag] - scope-10
            |   |
            |   Cast[chararray] - scope-2
            |   |
            |   |---Project[bytearray][0] - scope-1
            |   |
            |   Cast[int] - scope-5
            |   |
            |   |---Project[bytearray][1] - scope-4
            |   |
            |   Cast[chararray] - scope-8
            |   |
            |   |---Project[bytearray][2] - scope-7
            |
            |---A: Load(hdfs://zly1.sh.intel.com:8020/user/root/rank01RowNumber.txt:org.apache.pig.builtin.PigStorage) - scope-0
{code}
The Spark plan is:
{code}
scope-14
#--------------------------------------------------
# Spark Plan
#--------------------------------------------------
Spark node scope-14
C: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-13
|
|---A: New For Each(false,false,false)[bag] - scope-10
    |   |
    |   Cast[chararray] - scope-2
    |   |
    |   |---Project[bytearray][0] - scope-1
    |   |
    |   Cast[int] - scope-5
    |   |
    |   |---Project[bytearray][1] - scope-4
    |   |
    |   Cast[chararray] - scope-8
    |   |
    |   |---Project[bytearray][2] - scope-7
    |
    |---A: Load(hdfs://zly1.sh.intel.com:8020/user/root/rank01RowNumber.txt:org.apache.pig.builtin.PigStorage) - scope-0--------
{code}
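The root cause is that the "POCounter" and "PORank" operators are missing from the Spark plan. The physical plan contains both (scope-11 and scope-12), but in the Spark plan the Store (scope-13) sits directly on top of the ForEach (scope-10), so no rank column is ever computed.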
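To illustrate why both operators matter, here is a rough plain-Python model of a two-stage row-number rank over partitioned data. A counter stage numbers tuples locally within each partition and records per-partition counts, and a rank stage adds the cumulative count of the preceding partitions. The split of work between the two stages is an assumption based on the operator names and their order in the physical plan, not a description of Pig's actual implementation, and all function names are hypothetical:

```python
from itertools import accumulate


def counter_stage(partitions):
    """Tag each tuple with (partition id, local 1-based counter).

    Also returns the number of tuples seen in each partition.
    """
    tagged = [
        [(pid, n, row) for n, row in enumerate(part, start=1)]
        for pid, part in enumerate(partitions)
    ]
    counts = [len(part) for part in partitions]
    return tagged, counts


def rank_stage(tagged, counts):
    """Turn local counters into global row numbers.

    The offset of a partition is the total size of all partitions before it.
    """
    offsets = [0] + list(accumulate(counts))[:-1]
    return [
        (offsets[pid] + n, *row)
        for part in tagged
        for pid, n, row in part
    ]


if __name__ == "__main__":
    # Assumed split of the first three input rows into two partitions.
    partitions = [[("A", 1, "N"), ("B", 2, "N")], [("C", 3, "M")]]
    tagged, counts = counter_stage(partitions)
    print(rank_stage(tagged, counts))
```

In this model, skipping both stages leaves the tuples unchanged, which matches a plan where Store reads straight from the ForEach.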
> Enable "TestRank1","TestRank3" unit tests in spark mode
> -------------------------------------------------------
>
> Key: PIG-4607
> URL: https://issues.apache.org/jira/browse/PIG-4607
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: kexianda
> Fix For: spark-branch
>
>
> In https://builds.apache.org/job/Pig-spark/216/#showFailuresLink, the following TestRank1 and TestRank3 unit tests fail:
> org.apache.pig.test.TestRank1.testRank02RowNumber
> org.apache.pig.test.TestRank1.testRank01RowNumber
> org.apache.pig.test.TestRank3.testRankWithSplitInMap
> org.apache.pig.test.TestRank3.testRankWithSplitInReduce
> org.apache.pig.test.TestRank3.testRankCascade