You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org> on 2012/03/08 22:21:57 UTC

[jira] [Commented] (PIG-1798) nested foreach - alias for expression does not work

    [ https://issues.apache.org/jira/browse/PIG-1798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225534#comment-13225534 ] 

Daniel Dai commented on PIG-1798:
---------------------------------

Doesn't feel there is an easy solution (don't even think this is valid). In nested plan, we assume input bags will be streamed into nested operator (except LOGenerate). If there is a UDF, it should take tuple. However, Distinct takes the bag, which break the rule. Eg, the following script should be doable:

f = foreach g { ld = ABS(l.$0); f = filter ld by $0 > 1; generate COUNT(f);}

ABS see l as tuple stream instead of bag. This syntax is not currently supported, but totally doable. However, if you want to use a UDF which see l as a bag, you can only do it in generate statement, which takes every input as bag:

f = foreach g { f = filter l by $0 > 1; generate COUNT(Distinct(f));}

Unlink this issue from 0.10.
                
> nested foreach - alias for expression does not work
> ---------------------------------------------------
>
>                 Key: PIG-1798
>                 URL: https://issues.apache.org/jira/browse/PIG-1798
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Daniel Dai
>
> In following example, the nested foreach statement has an alias ld used for output of  distinct udf . Pig gives an error during query plan generation - 
> {code}
> grunt> l = load 'x' as (a, b);
> grunt> g = group l by a;
> grunt> f = foreach g { ld = org.apache.pig.builtin.Distinct(l); f = filter ld by $0 > 1; generate COUNT(f);}
> grunt> explain f;
> 2011-01-11 12:18:33,908 [main] WARN  org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_INT 1 time(s).
> 2011-01-11 12:18:33,908 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-01-11 12:18:33,941 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1067: Unable to explain alias f
> Details at logfile: /Users/tejas/pig_comb2/trunk/pig_1294777094048.log
> less /Users/tejas/pig_comb2/trunk/pig_1294777094048.log
> Pig Stack Trace
> ---------------
> ERROR 1067: Unable to explain alias f
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias f
>         at org.apache.pig.PigServer.explain(PigServer.java:1053)
>         at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:358)
>         at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:290)
>         at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:253)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.Explain(PigScriptParser.java:665)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:325)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>         at org.apache.pig.Main.run(Main.java:475)
>         at org.apache.pig.Main.main(Main.java:109)
> Caused by: java.lang.NullPointerException
>         at org.apache.pig.newplan.logical.ForeachInnerPlanVisitor.translateInnerPlanConnection(ForeachInnerPlanVisitor.java:87)
>         at org.apache.pig.newplan.logical.ForeachInnerPlanVisitor.visit(ForeachInnerPlanVisitor.java:245)
>         at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:114)
>         at org.apache.pig.impl.logicalLayer.LogicalOperator.visit(LogicalOperator.java:1)
>         at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:71)
>         at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>         at org.apache.pig.newplan.logical.LogicalPlanMigrationVistor.visit(LogicalPlanMigrationVistor.java:245)
>         at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:132)
>         at org.apache.pig.impl.logicalLayer.LogicalOperator.visit(LogicalOperator.java:1)
>         at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:70)
>         at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:264)
>         at org.apache.pig.PigServer.compilePp(PigServer.java:1460)
>         at org.apache.pig.PigServer.explain(PigServer.java:1022)
>         ... 10 more
> ================================================================================
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira