Posted to dev@pig.apache.org by "Vivek Padmanabhan (JIRA)" <ji...@apache.org> on 2011/03/17 06:11:29 UTC

[jira] Commented: (PIG-1911) Infinite loop with accumulator function in nested foreach

    [ https://issues.apache.org/jira/browse/PIG-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007815#comment-13007815 ] 

Vivek Padmanabhan commented on PIG-1911:
----------------------------------------

In this case Pig calls the getValue() and cleanup() methods over and over without terminating. The UDF source is included below for reference:
{code}
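package org.udfs; // package name as referenced in the sample script

import java.io.IOException;

import org.apache.pig.Accumulator;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class MyCOUNT extends EvalFunc<Long> implements Accumulator<Long> {
    // Running count used by the accumulator code path.
    private long intermediateCount = 0L;

    // Regular (non-accumulator) path: count tuples whose first field is non-null.
    @Override
    public Long exec(Tuple input) throws IOException {
        DataBag bag = (DataBag) input.get(0);
        long cnt = 0;
        for (Tuple t : bag) {
            if (t != null && t.size() > 0 && t.get(0) != null) {
                cnt++;
            }
        }
        return cnt;
    }

    @Override
    public Schema outputSchema(Schema input) {
        return new Schema(new Schema.FieldSchema(null, DataType.LONG));
    }

    // Accumulator path: add the matching tuples of this batch to the running count.
    @Override
    public void accumulate(Tuple b) throws IOException {
        DataBag bag = (DataBag) b.get(0);
        for (Tuple t : bag) {
            if (t != null && t.size() > 0 && t.get(0) != null) {
                intermediateCount++;
            }
        }
    }

    // Reset the running count so the next group starts from zero.
    @Override
    public void cleanup() {
        intermediateCount = 0L;
    }

    // Return the count accumulated so far.
    @Override
    public Long getValue() {
        return intermediateCount;
    }
}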
{code}
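
For comparison, here is a minimal sketch of the call sequence I would expect for one group: accumulate() once per batch, then getValue() once, then cleanup() once. It does not use Pig at all. SimpleAccumulator, CountingAccumulator and AccumulatorLifecycleSketch are hypothetical names made up for this illustration, and the driver loop is my assumption about the intended contract, not Pig's actual execution code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Pig's Accumulator interface, so the sketch
// compiles without Pig on the classpath.
interface SimpleAccumulator<T> {
    void accumulate(List<String> batch);
    T getValue();
    void cleanup();
}

// Same counting idea as MyCOUNT (count non-null values), plus call counters
// so the number of getValue()/cleanup() invocations can be observed.
class CountingAccumulator implements SimpleAccumulator<Long> {
    private long intermediateCount = 0L;
    int getValueCalls = 0;
    int cleanupCalls = 0;

    @Override
    public void accumulate(List<String> batch) {
        for (String value : batch) {
            if (value != null) {
                intermediateCount++;
            }
        }
    }

    @Override
    public Long getValue() {
        getValueCalls++;
        return intermediateCount;
    }

    @Override
    public void cleanup() {
        cleanupCalls++;
        intermediateCount = 0L;
    }
}

class AccumulatorLifecycleSketch {
    // Assumed lifecycle for one group: feed every batch, read the result
    // once, then reset the accumulator once.
    static long processGroup(CountingAccumulator acc, List<List<String>> batches) {
        for (List<String> batch : batches) {
            acc.accumulate(batch);
        }
        long result = acc.getValue();
        acc.cleanup();
        return result;
    }

    public static void main(String[] args) {
        CountingAccumulator acc = new CountingAccumulator();
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("a", null),
                Arrays.asList("b"));
        long count = processGroup(acc, batches);
        // prints: count=2 getValue calls=1 cleanup calls=1
        System.out.println("count=" + count
                + " getValue calls=" + acc.getValueCalls
                + " cleanup calls=" + acc.cleanupCalls);
    }
}
```

With the bug, the counters for getValue() and cleanup() would keep growing for a single group instead of stopping at 1 each.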

> Infinite loop with accumulator function in nested foreach
> ---------------------------------------------------------
>
>                 Key: PIG-1911
>                 URL: https://issues.apache.org/jira/browse/PIG-1911
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Olga Natkovich
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>
> Sample script:
> register v_udf.jar;
> a = load '2records' as (f1:chararray,f2:chararray);
> b = group a by f1;
> d = foreach b { sort = order a by f1; 
>   generate org.udfs.MyCOUNT(sort) as something ; }
> dump d;
> This causes an infinite loop if MyCOUNT implements the Accumulator interface.
> The workaround is to move the function call out of the nested foreach into a separate foreach statement.
