Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2010/01/16 00:02:54 UTC

[jira] Commented: (PIG-1193) Secondary sort issue on nested desc sort

    [ https://issues.apache.org/jira/browse/PIG-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800984#action_12800984 ] 

Daniel Dai commented on PIG-1193:
---------------------------------

Diagnosis for this issue:
{code}
Reduce plan:
Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-56
|
|---New For Each(false)[bag] - 1-55
    |   |
    |   POUserFunc(sequence.CUMULATIVE)[bag] - 1-54
    |   |
    |   |---RelationToExpressionProject[bag][*] - 1-49
    |   |   |
    |   |   |---RelationToExpressionProject[bag][*] - 1-58
    |   |       |
    |   |       |---Project[tuple][1] - 1-46
    |   |
    |   |---RelationToExpressionProject[bag][*] - 1-53
    |       |
    |       |---POSort[bag]() - 1-52
    |           |   |
    |           |   Project[int][0] - 1-51
    |           |
    |           |---Project[tuple][1] - 1-50
    |
    |---Package[tuple]{chararray} - 1-43--------
{code}

We take the first input's reverse (desc) POSort and turn it into a secondary sort key. However, we do not remove the second input's POSort, so the second input to the UDF ends up reverse-sorted twice. That is why the second bag in the test case below comes out in ascending order.
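
The effect described above can be illustrated with a small model. This is only a sketch in Python, not Pig code: the function names are hypothetical, and treating the leftover POSort as a second reversal is an assumption taken from the "reverse reverse sorted" wording above, not from the Pig source.

```python
def secondary_sorted(values):
    # Model: the secondary sort key makes the reducer receive values in desc order.
    return sorted(values, reverse=True)


def leftover_sort(bag):
    # Assumption: the POSort left in the plan reverses the order once more.
    return list(reversed(bag))


def udf_inputs_buggy(values):
    delivered = secondary_sorted(values)
    first = delivered                  # first input: POSort replaced by the secondary key
    second = leftover_sort(delivered)  # second input: POSort still present in the plan
    return first, second


def udf_inputs_expected(values):
    # Presumed intent: both inputs read the already-sorted bag directly.
    delivered = secondary_sorted(values)
    return delivered, delivered


print(udf_inputs_buggy([3, 4]))
print(udf_inputs_expected([3, 4]))
```

With the input values 3 and 4 from the test case, the buggy model yields the first bag in desc order and the second in ascending order, matching the UDF input reported in the issue description.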

> Secondary sort issue on nested desc sort
> ----------------------------------------
>
>                 Key: PIG-1193
>                 URL: https://issues.apache.org/jira/browse/PIG-1193
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>
> Secondary sort handles a nested desc sort incorrectly when all of the following conditions are met:
> 1. The nested plan contains both a sort and a UDF
> 2. The UDF uses the same input tuples more than once
> 3. The input tuples are sorted in desc order
> Here is a test case:
> {code}
> register sequence.jar;
> A = load 'input' as (a0:int);
> B = group A ALL;
> C = foreach B {
>     D = order A by a0 desc;
>     generate sequence.CUMULATIVE(D,D);
> };
> dump C;
> {code}
> input file:
> {code}
> 3
> 4
> {code}
> The input for the UDF is:
> {code}
> ({(4),(3)},{(3),(4)})
> {code}
> The first bag is sorted desc, but the second is not.
