Posted to dev@pig.apache.org by "Santhosh Srinivasan (JIRA)" <ji...@apache.org> on 2008/08/20 14:24:47 UTC

[jira] Commented: (PIG-379) describe interferes with name resolution

    [ https://issues.apache.org/jira/browse/PIG-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623972#action_12623972 ] 

Santhosh Srinivasan commented on PIG-379:
-----------------------------------------

The describe statement kicks off the logical plan -> type checker -> optimizer process. During logical plan optimization, the schema of each operator is reset and then recomputed from the operator's own attributes and the information about its inputs. However, user-defined schemas specified with the as clause are not annotated as such in each operator. As a result, when the schema is reset in the logical optimizer, this information is lost, resulting in incorrect schemas.
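
To make the failure mode concrete, here is a minimal sketch in Java. It is a toy model, not Pig's actual code: the Operator and Field classes, and the resetAndRecompute method, are hypothetical names invented for illustration. The operator caches a schema but keeps no marker saying the aliases came from the user, so a reset followed by recomputation can only restore what the operator is able to infer on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaResetSketch {

    /** A schema field: a type plus an optional alias (null when unnamed). */
    public static final class Field {
        public final String alias;
        public final String type;

        public Field(String alias, String type) {
            this.alias = alias;
            this.type = type;
        }

        @Override
        public String toString() {
            return alias == null ? type : alias + ": " + type;
        }
    }

    /** Hypothetical operator; not one of Pig's logical operator classes. */
    public static final class Operator {
        // Types the operator can derive by itself (assumed for this sketch).
        private final List<String> inferredTypes;
        // Cached schema; nothing records that it was user-specified.
        private List<Field> schema;

        public Operator(List<String> inferredTypes, List<Field> userSchema) {
            this.inferredTypes = inferredTypes;
            this.schema = userSchema;
        }

        /** Mimics a reset: the cache is discarded and rebuilt from inferred types only. */
        public void resetAndRecompute() {
            List<Field> recomputed = new ArrayList<>();
            for (String type : inferredTypes) {
                recomputed.add(new Field(null, type)); // the alias is gone
            }
            schema = recomputed;
        }

        public String describe() {
            StringBuilder sb = new StringBuilder("{");
            for (int i = 0; i < schema.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(schema.get(i));
            }
            return sb.append('}').toString();
        }
    }

    /** Builds an operator whose cached schema carries user-supplied aliases. */
    public static Operator demoOperator() {
        return new Operator(
            Arrays.asList("chararray", "integer"),
            Arrays.asList(new Field("name", "chararray"), new Field("age", "integer")));
    }

    public static void main(String[] args) {
        Operator op = demoOperator();
        System.out.println(op.describe()); // {name: chararray,age: integer}
        op.resetAndRecompute();
        System.out.println(op.describe()); // {chararray,integer}
    }
}
```

The two printed schemas mirror the two describe outputs in the issue description: once the aliases are dropped, a later reference to name can no longer be resolved.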

There are multiple items that we need to consider:

1. Annotate each relational operator and expression operator with an attribute that denotes the presence of a user-specified schema

2. Add checks to ensure that user-specified schemas are compatible with the generated/inferred schemas, i.e.,
   a. if the user-specified types differ from the inferred types, perform the appropriate checks and type promotions
   b. if the schemas do not match, flag it as an error

3. For complex constants, the schema computation is a bit complex and involves type promotions, null introductions, etc.
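
Items 1 and 2 could be sketched roughly as follows, again in Java. This is an illustration under stated assumptions rather than a proposed patch: the reconcile method, the Field class, and the numeric widening order (int, long, float, double) are hypothetical and are not taken from Pig's type checker. A null user schema stands in for the "no user-specified schema" annotation from item 1. Item 3 (complex constants) is not covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UserSchemaSketch {

    /** A schema field: a type plus an optional alias (null when unnamed). */
    public static final class Field {
        public final String alias;
        public final String type;

        public Field(String alias, String type) {
            this.alias = alias;
            this.type = type;
        }
    }

    // Assumed widening order, for illustration only; not Pig's actual promotion rules.
    private static final List<String> NUMERIC =
        Arrays.asList("int", "long", "float", "double");

    /** True if a value of type 'from' may be treated as type 'to' in this sketch. */
    public static boolean canPromote(String from, String to) {
        if (from.equals(to)) return true;
        int f = NUMERIC.indexOf(from);
        int t = NUMERIC.indexOf(to);
        return f >= 0 && t >= 0 && f <= t;
    }

    /**
     * Recomputes a schema without losing the user's declaration.
     * userSchema == null means the operator carries no user-specified schema.
     */
    public static List<Field> reconcile(List<Field> userSchema, List<String> inferredTypes) {
        List<Field> result = new ArrayList<>();
        if (userSchema == null) {
            // Nothing declared by the user: fall back to unnamed inferred fields.
            for (String type : inferredTypes) result.add(new Field(null, type));
            return result;
        }
        if (userSchema.size() != inferredTypes.size()) {
            // Item 2b: a structural mismatch is an error.
            throw new IllegalArgumentException("schema mismatch: field counts differ");
        }
        for (int i = 0; i < userSchema.size(); i++) {
            Field declared = userSchema.get(i);
            if (!canPromote(inferredTypes.get(i), declared.type)) {
                throw new IllegalArgumentException(
                    "cannot promote " + inferredTypes.get(i) + " to " + declared.type);
            }
            // Item 2a: keep the user's alias and (possibly promoted) type.
            result.add(declared);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Field> declared = Arrays.asList(
            new Field("name", "chararray"), new Field("age", "long"));
        List<Field> schema = reconcile(declared, Arrays.asList("chararray", "int"));
        System.out.println(schema.get(1).alias + ": " + schema.get(1).type); // age: long
    }
}
```

The point of the sketch is only that the recomputation step receives the user's declaration as an input, so a reset no longer discards it; where such an annotation would live in the real operators is left open.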

> describe interferes with name resolution
> ----------------------------------------
>
>                 Key: PIG-379
>                 URL: https://issues.apache.org/jira/browse/PIG-379
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Olga Natkovich
>            Priority: Critical
>             Fix For: types_branch
>
>
> If I run the following script:
> A = load 'studenttab10k' as (name: chararray, age: int, gpa: float);
> B = foreach A generate name, age;
> describe B;
> C = filter B by age > 30;
> describe C;
> D = group C by name;
> describe D;
> I get the error below. Also notice that the schema of C no longer has names:
> {name: chararray,age: integer}
> {chararray,integer}
> java.io.IOException: Invalid alias: name in {chararray,integer}
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:254)
>         at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:422)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:82)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
>         at org.apache.pig.Main.main(Main.java:302)
> Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: name in {chararray,integer}
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:5179)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5048)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:3357)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3254)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3208)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3117)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3043)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3009)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:2911)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.GroupItem(QueryParser.java:1548)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.CogroupClause(QueryParser.java:1468)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:751)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:569)
>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:378)
>         at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:251)
> If I remove describe, I don't see any errors

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.