Posted to dev@pig.apache.org by "Pradeep Kamath (JIRA)" <ji...@apache.org> on 2009/01/26 19:13:59 UTC

[jira] Updated: (PIG-634) When POUnion is one of the roots of a map plan, POUnion.getNext() gives a null pointer exception

     [ https://issues.apache.org/jira/browse/PIG-634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-634:
-------------------------------

    Status: Patch Available  (was: Open)

Attached a patch that fixes POUnion.getNext() by adding the following condition:
{code}
public Result getNext(Tuple t) throws ExecException {

        if (nextReturnEOP) {
            nextReturnEOP = false ;
            return eopResult ;
        }

        // Case 1 : Normal connected plan
        if (!isInputAttached()) {
            
            if (inputs == null || inputs.size()==0) {
                // Neither does this Union have predecessors nor
                // was any input attached! This can happen when we have
                // a plan like below
                // POUnion
                // |
                // |--POLocalRearrange
                // |    |
                // |    |-POUnion (root 2)--> This union's getNext() can lead the code here
                // |
                // |--POLocalRearrange (root 1)
                
                // The inner POUnion above is a root in the plan which has 2 roots.
                // So these 2 roots would have input coming from different input
                // sources (dfs files). So certain maps would be working on input only
                // meant for "root 1" above and some maps would work on input
                // meant only for "root 2". In the former case, "root 2" would
                // neither get input attached to it nor does it have predecessors
                // which is the case which can lead us here.
                return eopResult;
            }
            ... rest of getNext
{code}

The new condition in getNext() is the check for inputs being null or empty (inputs.size() == 0). It ensures that when POUnion is one of the roots of the map plan and has no input attached, it returns EOP to its successor instead of throwing a null pointer exception.
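
The effect of the guard can be illustrated in isolation. The following is only a minimal sketch, not the actual Pig classes: Status, Op, CountingLeaf and UnionOp are hypothetical stand-ins, and the real POUnion's handling of its predecessors is replaced here by a simple in-order drain. It assumes only what the patch shows, namely that a union with no attached input and a null or empty inputs list should report EOP.

```java
import java.util.ArrayList;
import java.util.List;

public class UnionGuardSketch {

    // Hypothetical status codes; only the notion of EOP comes from the patch.
    enum Status { OK, EOP }

    /** Minimal stand-in for a physical operator (not Pig's PhysicalOperator). */
    interface Op {
        Status getNext();
    }

    /** Leaf that yields a fixed number of OK results and then EOP. */
    static final class CountingLeaf implements Op {
        private int remaining;

        CountingLeaf(int n) { remaining = n; }

        @Override
        public Status getNext() {
            if (remaining > 0) {
                remaining--;
                return Status.OK;
            }
            return Status.EOP;
        }
    }

    /** Union-like operator whose predecessor list may be null when it is a plan root. */
    static final class UnionOp implements Op {
        private final List<Op> inputs;
        private boolean inputAttached = false;
        private int current = 0;

        UnionOp(List<Op> inputs) { this.inputs = inputs; }

        void attachInput() { inputAttached = true; }

        @Override
        public Status getNext() {
            if (!inputAttached) {
                // The guard: without it, inputs.size() below would throw
                // a NullPointerException when inputs is null.
                if (inputs == null || inputs.isEmpty()) {
                    return Status.EOP;
                }
                // Assumption: drain predecessors in order (a simplification).
                while (current < inputs.size()) {
                    Status s = inputs.get(current).getNext();
                    if (s != Status.EOP) {
                        return s;
                    }
                    current++;
                }
                return Status.EOP;
            }
            // An attached input is consumed exactly once.
            inputAttached = false;
            return Status.OK;
        }
    }

    /** Counts how many OK results an operator produces before EOP. */
    static int drain(Op op) {
        int count = 0;
        while (op.getNext() == Status.OK) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // A root union that never received input: reports EOP immediately.
        System.out.println(new UnionOp(null).getNext());

        // A connected union with two predecessors yielding 2 and 1 results.
        List<Op> preds = new ArrayList<>();
        preds.add(new CountingLeaf(2));
        preds.add(new CountingLeaf(1));
        System.out.println(drain(new UnionOp(preds)));
    }
}
```

In this sketch a root union without predecessors or attached input simply ends its stream, which corresponds to a map task whose input split belongs to the other root of the plan.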

> When POUnion is one of the roots of a map plan, POUnion.getNext() gives a null pointer exception
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-634
>                 URL: https://issues.apache.org/jira/browse/PIG-634
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: types_branch
>
>         Attachments: PIG-634.patch
>
>
> POUnion.getNext() gives a null pointer exception in the following scenario (pasted from a code comment explaining the fix for this issue). If a script results in a plan like the one below, POUnion.getNext() currently gives a null pointer exception.
> {noformat}
>                 
>                 // POUnion
>                 // |
>                 // |--POLocalRearrange
>                 // |    |
>                 // |    |-POUnion (root 2)--> This union's getNext() can lead the code here
>                 // |
>                 // |--POLocalRearrange (root 1)
>                 
>                 // The inner POUnion above is a root in the plan which has 2 roots.
>                 // So these 2 roots would have input coming from different input
>                 // sources (dfs files). So certain maps would be working on input only
>                 // meant for "root 1" above and some maps would work on input
>                 // meant only for "root 2". In the former case, "root 2" would
>                 // neither get input attached to it nor does it have predecessors
>                 // which is the case which can lead us here.
> {noformat}
> A script that can produce a plan like the one above is:
> {code}
> a = load 'xyz'; 
> b = load 'abc'; 
> c = union a,b; 
> d = load 'def'; 
> e = cogroup c by $0 inner , d by $0 inner;
> dump e;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.