Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2010/01/16 00:02:54 UTC
[jira] Commented: (PIG-1193) Secondary sort issue on nested desc sort
[ https://issues.apache.org/jira/browse/PIG-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800984#action_12800984 ]
Daniel Dai commented on PIG-1193:
---------------------------------
Diagnosis for this issue:
{code}
Reduce plan:
Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-56
|
|---New For Each(false)[bag] - 1-55
| |
| POUserFunc(sequence.CUMULATIVE)[bag] - 1-54
| |
| |---RelationToExpressionProject[bag][*] - 1-49
| | |
| | |---RelationToExpressionProject[bag][*] - 1-58
| | |
| | |---Project[tuple][1] - 1-46
| |
| |---RelationToExpressionProject[bag][*] - 1-53
| |
| |---POSort[bag]() - 1-52
| | |
| | Project[int][0] - 1-51
| |
| |---Project[tuple][1] - 1-50
|
|---Package[tuple]{chararray} - 1-43--------
{code}
We take the first input's reverse POSort and make it the secondary sort key, but we do not remove the second input's POSort. As a result, the second input to the UDF is reverse-sorted twice, which undoes the desc order.
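The effect can be mimicked with a toy model outside of Pig. The sketch below is only an illustration under stated assumptions: it treats the secondary sort as delivering the bag to the reducer already in desc order, and it models the leftover POSort as one more reversal, matching the double reversal described above. The function names are hypothetical and are not Pig APIs.

```python
# Toy model only; none of these names are Pig APIs.

def shuffle_with_secondary_key(values):
    """Stand-in for the secondary sort: values reach the reducer in desc order."""
    return sorted(values, reverse=True)


def leftover_reverse(bag):
    """Assumed behaviour of the POSort that was not removed: flips the order again."""
    return list(reversed(bag))


def udf_inputs(values):
    """Build the two bags that the UDF receives in this model."""
    presorted = shuffle_with_secondary_key(values)
    first = presorted                     # POSort removed: shuffle order used as-is
    second = leftover_reverse(presorted)  # POSort left in place: reversed a second time
    return first, second


first, second = udf_inputs([3, 4])
print(first, second)  # [4, 3] [3, 4]
```

With the two-row input from the test case, the model gives a first bag in desc order and a second bag that is not, the same shape as the UDF input reported in the issue.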
> Secondary sort issue on nested desc sort
> ----------------------------------------
>
> Key: PIG-1193
> URL: https://issues.apache.org/jira/browse/PIG-1193
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.6.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.6.0
>
>
> Secondary sort handles a nested desc sort incorrectly when all of the following conditions are met:
> 1. We have sort and UDF in nested plan
> 2. This UDF will use the same input tuples more than once
> 3. The input tuples are sorted in desc order
> Here is a test case:
> {code}
> register sequence.jar;
> A = load 'input' as (a0:int);
> B = group A ALL;
> C = foreach B {
> D = order A by a0 desc;
> generate sequence.CUMULATIVE(D,D);
> };
> dump C;
> {code}
> input file:
> {code}
> 3
> 4
> {code}
> The input for the UDF is:
> {code}
> ({(4),(3)},{(3),(4)})
> {code}
> The first bag is sorted desc, but the second is not.