Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2011/04/05 00:23:05 UTC
[jira] [Commented] (PIG-1963) in nested foreach, accumulative udf
taking input from order-by does not get results in order
[ https://issues.apache.org/jira/browse/PIG-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015661#comment-13015661 ]
Thejas M Nair commented on PIG-1963:
------------------------------------
The MYCONCATBAG UDF in the query in the description concatenates the entries in the bag, in the order in which they are received.
Running the query with the property pig.accumulative.batchsize=2
and the input -
{code}
100 apple
200 orange
300 strawberry
300 pear
100 apple
300 pear
400 apple
{code}
gives the output -
{code}
(100,(100)(100),(apple)(apple))
(200,(200),(orange))
(300,(300)(300)(300),(pear)(strawberry)(pear)) -- this should be (300,(300)(300)(300),(pear)(pear)(strawberry))
(400,(400),(apple))
{code}
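To make the expected result concrete, here is a minimal Python sketch of the intended semantics (group by f1, order each group by f2, then feed the sorted rows to a concatenating accumulator in batches of 2). It is only a model, not Pig's implementation: the ConcatBag class and the expected_output function are hypothetical names standing in for MYCONCATBAG and the nested foreach, and the rows are the sample input above.

```python
from collections import defaultdict

# Rows from the sample input above: (f1, f2).
ROWS = [
    (100, "apple"), (200, "orange"), (300, "strawberry"), (300, "pear"),
    (100, "apple"), (300, "pear"), (400, "apple"),
]


class ConcatBag:
    """Hypothetical stand-in for MYCONCATBAG: concatenates values in arrival order."""

    def __init__(self):
        self.parts = []

    def accumulate(self, batch):
        # Each call receives one batch; order within and across batches is kept.
        self.parts.extend(f"({value})" for value in batch)

    def get_value(self):
        return "".join(self.parts)


def expected_output(rows, batch_size=2):
    """Model of the nested foreach: sort each group by f2, then accumulate in batches."""
    groups = defaultdict(list)
    for f1, f2 in rows:
        groups[f1].append((f1, f2))

    lines = []
    for key in sorted(groups):
        ordered = sorted(groups[key], key=lambda row: row[1])  # order by f2
        concat_f1, concat_f2 = ConcatBag(), ConcatBag()
        for start in range(0, len(ordered), batch_size):
            batch = ordered[start:start + batch_size]
            concat_f1.accumulate([row[0] for row in batch])
            concat_f2.accumulate([row[1] for row in batch])
        lines.append(f"({key},{concat_f1.get_value()},{concat_f2.get_value()})")
    return lines


for line in expected_output(ROWS):
    print(line)
```

In this model the batch size has no effect on the result, because the rows are sorted before they are split into batches; the line for key 300 comes out as (300,(300)(300)(300),(pear)(pear)(strawberry)).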
> in nested foreach, accumulative udf taking input from order-by does not get results in order
> --------------------------------------------------------------------------------------------
>
> Key: PIG-1963
> URL: https://issues.apache.org/jira/browse/PIG-1963
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0, 0.9.0
> Reporter: Thejas M Nair
>
> This happens only when secondary sort is not being used for the order-by.
> For example -
> {code}
> a1 = load 'fruits.txt' as (f1:int,f2);
> a2 = load 'fruits.txt' as (f1:int,f2);
> b = cogroup a1 by f1, a2 by f1;
> d = foreach b {
>     sort1 = order a1 by f2;
>     sort2 = order a2 by f2; -- secondary sort not getting used here, MYCONCATBAG gets results in wrong order
>     generate group, MYCONCATBAG(sort1.f1), MYCONCATBAG(sort2.f2);
> }
> -- explain d;
> dump d;
> {code}