Posted to dev@pig.apache.org by "Koji Noguchi (JIRA)" <ji...@apache.org> on 2015/02/20 00:22:11 UTC
[jira] [Updated] (PIG-4426) RowNumber(simple) Rank not producing correct results
[ https://issues.apache.org/jira/browse/PIG-4426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-4426:
------------------------------
Attachment: pig-4426-v01.txt
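I haven't fully understood the Rank (mapreduce) implementation, but it seems that for a simple (row number) rank, POCounter runs on the mapper side, whereas for the other rank types it runs on the reducer side.
As a result, the following loop iterates over the wrong task count for simple ranks: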
{code:title=JobControlCompiler.java}
private void saveCounters(Job job, String operationID) {
...
for (int i=0;i<job.getJob().getNumReduceTasks();i++) {
{code}
I believe we need to use getNumMapTasks() instead for simple (row number) ranks.
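To illustrate why the loop bound matters, here is a minimal standalone sketch. It is not Pig's actual code: it assumes each task publishes one record counter and that per-task starting row numbers are built by accumulating those counters. The class name, the helper startingOffsets, and the sample counts are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the JobControlCompiler implementation.
class RankOffsetSketch {

    // Accumulates per-task record counts into the starting row number of
    // each task, visiting only the first numTasks counters.
    static List<Long> startingOffsets(long[] perTaskCounts, int numTasks) {
        List<Long> offsets = new ArrayList<>();
        long running = 0;
        for (int i = 0; i < numTasks; i++) {
            offsets.add(running);
            running += perTaskCounts[i];
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Assumed scenario: three map tasks each wrote a counter.
        long[] countsPerMapTask = {4, 3, 3};
        int numMapTasks = countsPerMapTask.length;
        int numReduceTasks = 1; // assumed value, smaller than the map count

        // Looping over the reduce-task count visits only one counter.
        System.out.println(startingOffsets(countsPerMapTask, numReduceTasks));
        // Looping over the map-task count visits every counter.
        System.out.println(startingOffsets(countsPerMapTask, numMapTasks));
    }
}
```

In this sketch, bounding the loop by the reduce-task count yields an offset for only the first task, so the remaining tasks have no starting row number. That is consistent with rows going missing from the ranked output, though the real failure mode in Pig may differ.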
The simple test (TestRank1) should also have failed, but its results weren't compared for an exact match. I changed the test slightly so that it now catches this failure.
Running unit and e2e tests with this patch.
> RowNumber(simple) Rank not producing correct results
> ----------------------------------------------------
>
> Key: PIG-4426
> URL: https://issues.apache.org/jira/browse/PIG-4426
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Attachments: pig-4426-v01.txt
>
>
> After PIG-4392, we started seeing TestRank3.testRankWithSplitInMap (and some others) fail with
> {noformat}
> Comparing actual and expected results. expected:<[(1,1,2), (1,1,2), (1,3,1), (2,1,2), (3,1,2), (3,2,3), (3,2,4), (4,2,3), (5,2,4), (5,3,1)]> but was:<[(1,1,2), (1,1,2), (3,2,3), (3,2,4), (5,3,1)]>
> {noformat}