Posted to dev@pig.apache.org by "Pradeep Kamath (JIRA)" <ji...@apache.org> on 2009/12/04 21:08:20 UTC

[jira] Commented: (PIG-747) Logical to Physical Plan Translation fails when temporary alias are created within foreach

    [ https://issues.apache.org/jira/browse/PIG-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786124#action_12786124 ] 

Pradeep Kamath commented on PIG-747:
------------------------------------

I did some investigation; here are my observations.
Consider the following foreach segment, which is similar to the script in the issue description:
{code}
b = foreach a {
 X = 10;
 Y = X + X;
 generate Y;
};
{code}

Currently, it looks like the logical plan connects the same instance of LOConst (X) to the LOAdd (Y) twice. In LogToPhyTranslationVisitor, each successor of an operator is supposed to get a different instance of that operator as its predecessor: DependencyOrderWalkerWOSeenChk is used to visit the inner foreach plan, so a new PhysicalOperator is created each time a LogicalOperator is seen, even if it is the same instance of the LogicalOperator. However, LogToPhyTranslationVisitor also maintains LogToPhyMap, a HashMap from each LogicalOperator to its translated PhysicalOperator. Since this is a HashMap and not a MultiMap, the LOConst ends up mapped to the last POConst created, and the POAdd gets connected to that single POConst twice.
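
To illustrate the overwrite described above, here is a minimal standalone sketch. The classes LogicalOp, PhysicalOp and OverwriteSketch are hypothetical stand-ins, not the real Pig classes, and the loop only imitates what I believe the visitor does; it is not taken from the Pig source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; NOT the real Pig operator classes.
class OverwriteSketch {
    static final class LogicalOp {
        final String name;
        LogicalOp(String name) { this.name = name; }
    }

    static final class PhysicalOp {
        final String name;
        PhysicalOp(String name) { this.name = name; }
    }

    /** Translates X once per use by Y and returns the physical inputs wired to Y. */
    static List<PhysicalOp> translateAddInputs() {
        LogicalOp x = new LogicalOp("X");                 // single LOConst-like instance
        List<LogicalOp> inputsOfY = Arrays.asList(x, x);  // Y = X + X: same instance, two edges

        Map<LogicalOp, PhysicalOp> logToPhy = new HashMap<>();
        int counter = 0;
        // A walker without a "seen" check translates X once per edge,
        // but the second put() replaces the first entry for the same key.
        for (LogicalOp in : inputsOfY) {
            logToPhy.put(in, new PhysicalOp(in.name + "#" + (++counter)));
        }

        // When Y is translated, each input is looked up in the map.
        List<PhysicalOp> wired = new ArrayList<>();
        for (LogicalOp in : inputsOfY) {
            wired.add(logToPhy.get(in));
        }
        return wired;
    }

    public static void main(String[] args) {
        List<PhysicalOp> wired = translateAddInputs();
        System.out.println(wired.get(0).name + ", " + wired.get(1).name); // X#2, X#2
        System.out.println(wired.get(0) == wired.get(1));                 // true
    }
}
```

Both inputs of Y resolve to the same physical instance, which matches the "multiple outputs" error in the stack trace: one physical operator ends up with two outgoing connections.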

Options to solve this:
1) Change the design in LogToPhyTranslationVisitor to handle this by using a MultiMap - this might be pretty involved, and I am not sure of the extent of the changes required.
2) Change the parser to create copies up front in the nested foreach of the LogicalPlan, so that LogToPhyTranslation doesn't need to worry about this case - this seems cleaner, but again I am unsure how easy it is.
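
As a rough sketch of what option 1 could look like, the map can keep every translated operator per logical operator and hand out one per consumer. All names here (MultiMapSketch, LogicalOp, PhysicalOp) are hypothetical and the structure is an assumption for illustration; it is not a proposed patch and says nothing about how much of the visitor would need to change.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; NOT the real Pig operator classes.
class MultiMapSketch {
    static final class LogicalOp {
        final String name;
        LogicalOp(String name) { this.name = name; }
    }

    static final class PhysicalOp {
        final String name;
        PhysicalOp(String name) { this.name = name; }
    }

    /** Translates X once per use by Y, keeping every translation, and wires Y. */
    static List<PhysicalOp> translateAddInputs() {
        LogicalOp x = new LogicalOp("X");
        List<LogicalOp> inputsOfY = Arrays.asList(x, x);  // Y = X + X

        // One logical operator maps to a queue of physical operators (a simple multimap).
        Map<LogicalOp, Deque<PhysicalOp>> logToPhy = new HashMap<>();
        int counter = 0;
        for (LogicalOp in : inputsOfY) {
            logToPhy.computeIfAbsent(in, k -> new ArrayDeque<>())
                    .addLast(new PhysicalOp(in.name + "#" + (++counter)));
        }

        // Each consumer edge takes its own, not yet used, physical operator.
        List<PhysicalOp> wired = new ArrayList<>();
        for (LogicalOp in : inputsOfY) {
            wired.add(logToPhy.get(in).pollFirst());
        }
        return wired;
    }

    public static void main(String[] args) {
        List<PhysicalOp> wired = translateAddInputs();
        System.out.println(wired.get(0).name + ", " + wired.get(1).name); // X#1, X#2
        System.out.println(wired.get(0) == wired.get(1));                 // false
    }
}
```

Option 2 would reach the same end state from the other side: if the parser gave Y two separate copies of X, a plain one-to-one map would already yield two distinct physical operators.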



> Logical to Physical Plan Translation fails when temporary alias are created within foreach
> ------------------------------------------------------------------------------------------
>
>                 Key: PIG-747
>                 URL: https://issues.apache.org/jira/browse/PIG-747
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Viraj Bhat
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: physicalplan.txt, physicalplanprob.pig, PIG-747-1.patch
>
>
> Consider the following Pig script, which calculates a new column F inside the foreach:
> {code}
> A = load 'physicalplan.txt' as (col1,col2,col3);
> B = foreach A {
>    D = col1/col2;
>    E = col3/col2;
>    F = E - (D*D);
>    generate
>    F as newcol;
> };
> dump B;
> {code}
> This gives the following error:
> =======================================================================================================================================
> Caused by: org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException: ERROR 2015: Invalid physical operators in the physical plan
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:377)
>         at org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:63)
>         at org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:29)
>         at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:68)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:908)
>         at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:122)
>         at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:41)
>         at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
>         at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:246)
>         ... 10 more
> Caused by: org.apache.pig.impl.plan.PlanException: ERROR 0: Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide multiple outputs.  This operator does not support multiple outputs.
>         at org.apache.pig.impl.plan.OperatorPlan.connect(OperatorPlan.java:158)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:89)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:373)
>         ... 19 more
> =======================================================================================================================================
