Posted to dev@pig.apache.org by "Koji Noguchi (JIRA)" <ji...@apache.org> on 2015/02/20 00:22:11 UTC

[jira] [Updated] (PIG-4426) RowNumber(simple) Rank not producing correct results

     [ https://issues.apache.org/jira/browse/PIG-4426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-4426:
------------------------------
    Attachment: pig-4426-v01.txt

I haven't fully understood the Rank (MapReduce) implementation, but it seems that for simple (row number) rank, POCounter runs on the mapper side, whereas for the other rank types it runs on the reducer side.

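As a result, the loop in saveCounters that iterates over the number of reduce tasks looks wrong for the simple case: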
{code:title=JobControlCompiler.java}
    private void saveCounters(Job job, String operationID) {
...
            for (int i=0;i<job.getJob().getNumReduceTasks();i++) {
{code}

I believe we need to use getNumMapTasks() for simple row number ranks.
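
To illustrate why the task count matters, here is a minimal standalone sketch. It is not Pig's actual code: the class RankOffsetSketch, the method cumulativeOffsets, and the row counts are all made up for illustration, and I am assuming that each counting task records how many rows it saw and that a starting offset is then derived per task. If the loop bound is the number of reduce tasks while the counting happened in the mappers, some tasks never get an offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Pig.
final class RankOffsetSketch {

    // Computes the starting row offset for each of the first numTasks tasks,
    // given how many rows each counting task reported.
    static List<Long> cumulativeOffsets(long[] rowsPerTask, int numTasks) {
        List<Long> offsets = new ArrayList<>();
        long runningTotal = 0;
        for (int i = 0; i < numTasks && i < rowsPerTask.length; i++) {
            offsets.add(runningTotal);      // task i starts after all earlier rows
            runningTotal += rowsPerTask[i];
        }
        return offsets;
    }

    public static void main(String[] args) {
        long[] rowsPerMapTask = {4, 3, 3};  // made-up counts from three map tasks
        int numMapTasks = 3;
        int numReduceTasks = 1;             // assumed reducer count for the sketch

        // Bounded by the reducer count: only the first map task gets an offset.
        System.out.println(cumulativeOffsets(rowsPerMapTask, numReduceTasks));
        // Bounded by the map task count: every counting task gets an offset.
        System.out.println(cumulativeOffsets(rowsPerMapTask, numMapTasks));
    }
}
```

In this sketch the first call yields one offset and the second yields three, which is the kind of mismatch I suspect is behind the incomplete rank output.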

The simple test (TestRank1) should also have failed, but its results weren't compared for an exact match. I changed the test slightly so that it fails without the fix.
I am now running the unit and e2e tests with this patch.


> RowNumber(simple) Rank not producing correct results
> ----------------------------------------------------
>
>                 Key: PIG-4426
>                 URL: https://issues.apache.org/jira/browse/PIG-4426
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>         Attachments: pig-4426-v01.txt
>
>
> After PIG-4392, started seeing TestRank3.testRankWithSplitInMap (and some others) failing with 
> {noformat}
> Comparing actual and expected results.  expected:<[(1,1,2), (1,1,2), (1,3,1), (2,1,2), (3,1,2), (3,2,3), (3,2,4), (4,2,3), (5,2,4), (5,3,1)]> but was:<[(1,1,2), (1,1,2), (3,2,3), (3,2,4), (5,3,1)]>
> {noformat}


