Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2011/04/08 09:58:05 UTC

[jira] [Commented] (PIG-1911) Infinite loop with accumulator function in nested foreach

    [ https://issues.apache.org/jira/browse/PIG-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017334#comment-13017334 ] 

Daniel Dai commented on PIG-1911:
---------------------------------

+1, this is definitely a fix. The accumulator is only used if there is an accumulator UDF in the nested plan, so a fix inside the UDF should be fine.

Just to help me understand better: I think fixing PORelationToExprProject is also possible. The accumulator only needs one extra bag in order for the UDF to invoke getValue(). So after exhausting all batches, sending one extra bag and then sending EOP would solve the problem as well. Is that right?

> Infinite loop with accumulator function in nested foreach
> ---------------------------------------------------------
>
>                 Key: PIG-1911
>                 URL: https://issues.apache.org/jira/browse/PIG-1911
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Olga Natkovich
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: PIG-1911.08.1.patch, PIG-1911.trunk.1.patch
>
>
> Sample script:
> register v_udf.jar;
> a = load '2records' as (f1:chararray,f2:chararray);
> b = group a by f1;
> d = foreach b { sort = order a by f1; 
>   generate org.udfs.MyCOUNT(sort) as something ; }
> dump d;
> This causes an infinite loop if MyCOUNT implements the Accumulator interface.
> The workaround is to move the function out of the nested foreach into a separate foreach statement.
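>
> A minimal sketch of that workaround, assuming the same jar, input file, and UDF as the sample script. The intermediate alias c and the field name sorted are illustrative and do not come from the original report:
> register v_udf.jar;
> a = load '2records' as (f1:chararray,f2:chararray);
> b = group a by f1;
> -- the nested foreach only sorts the bag; no UDF is called here
> c = foreach b { sort = order a by f1;
>   generate group, sort as sorted ; }
> -- the UDF is applied in a separate, non-nested foreach
> d = foreach c generate org.udfs.MyCOUNT(sorted) as something;
> dump d;
> With the UDF outside the nested block, the sorted bag is handed to MyCOUNT from an ordinary foreach, which is how the description above proposes to avoid the loop.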
