Posted to dev@pig.apache.org by "fang fang chen (JIRA)" <ji...@apache.org> on 2012/08/14 10:07:38 UTC

[jira] [Commented] (PIG-2498) e2e tests failing in some cases due to incorrect unix sort args

    [ https://issues.apache.org/jira/browse/PIG-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433984#comment-13433984 ] 

fang fang chen commented on PIG-2498:
-------------------------------------

+1
I also encountered some tests that failed with the error "Sort check failed". With this patch applied, the tests that originally failed now pass.
                
> e2e tests failing in some cases due to incorrect unix sort args
> ---------------------------------------------------------------
>
>                 Key: PIG-2498
>                 URL: https://issues.apache.org/jira/browse/PIG-2498
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.2
>            Reporter: Patrick Hunt
>            Assignee: Patrick Hunt
>         Attachments: PIG-2498.patch
>
>
> Some e2e tests are failing for me against 23 due to what I think are incorrect arguments to unix sort. For example in Order_6:
> {noformat}
> 			'num' => 6,
> 			'pig' => q\a = load ':INPATH:/singlefile/studenttab10k';
> c = order a by $0;
> store c into ':OUTPATH:';\,
> 			'sortArgs' => ['-t', '	', '+0', '-1'],
> {noformat}
> The pig job is sorting by the first column, however unix sort is being told to sort by the first and second columns.
> From the GNU sort manual (note that pos2 is _inclusive_): http://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html
> {noformat}
> '-k pos1[,pos2]'
> '--key=pos1[,pos2]'
> Specify a sort field that consists of the part of the line between pos1 and pos2 (or the end of the line, if pos2 is omitted), inclusive.
> ...
> On older systems, sort supports an obsolete origin-zero syntax '+pos1 [-pos2]' for specifying sort keys. The obsolete sequence 'sort +a.x -b.y' is equivalent to 'sort -k a+1.x+1,b' if y is '0' or absent, otherwise it is equivalent to 'sort -k a+1.x+1,b+1.y'.
> {noformat}
> I verified this by running the sort manually with +0 -1 and with +0 -0: in the first case it fails, and in the second case it passes.
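
To illustrate why checking more key columns than the Pig job sorted on produces a "Sort check failed" error, here is a minimal Python sketch. It is not the e2e harness's actual check: the helper name is_sorted_by_fields and the sample rows are hypothetical, and it compares fields as plain strings rather than using the locale collation that unix sort may apply.

```python
def is_sorted_by_fields(lines, num_fields, sep="\t"):
    """Return True if lines are in non-decreasing order of their first num_fields fields.

    Hypothetical helper for illustration only; fields are compared as plain strings.
    """
    keys = [tuple(line.rstrip("\n").split(sep)[:num_fields]) for line in lines]
    # Every adjacent pair of keys must already be in order.
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Made-up rows ordered by the first column only, as "order a by $0" guarantees.
# The second column is not ordered within the tied "alice" rows.
rows = ["alice\t30\t3.5", "alice\t19\t2.1", "bob\t25\t3.9"]

print(is_sorted_by_fields(rows, 1))  # key is the first column only
print(is_sorted_by_fields(rows, 2))  # key is the first and second columns
```

Under these assumptions the one-column check accepts the rows while the two-column check rejects them, because ordering by $0 says nothing about how ties on the first column are arranged.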
