Posted to dev@pig.apache.org by "Vivek Padmanabhan (JIRA)" <ji...@apache.org> on 2011/02/24 07:43:38 UTC
[jira] Created: (PIG-1868) New logical plan fails when I have
complex data types from udf
New logical plan fails when I have complex data types from udf
--------------------------------------------------------------
Key: PIG-1868
URL: https://issues.apache.org/jira/browse/PIG-1868
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.8.0
Reporter: Vivek Padmanabhan
The new logical plan fails when my eval function returns complex data types.
Below is my script:
{code}
register myudf.jar;
B1 = load 'myinput' as (id:chararray,ts:int,url:chararray);
B2 = group B1 by id;
B = foreach B2 {
    Tuples = order B1 by ts;
    generate Tuples;
};
C1 = foreach B generate TransformToMyDataType(Tuples,-1,0,1) as seq: { t: ( previous, current, next ) };
C2 = foreach C1 generate FLATTEN(seq);
C3 = foreach C2 generate current.id as id;
dump C3;
{code}
On C3 it fails with the message below:
{code}
Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 45 Input: 0 Column: 1)
{code}
Below is the describe output for C1:
{code}
C1: {seq: {t: (previous: (id: chararray,ts: int,url: chararray),current: (id: chararray,ts: int,url: chararray),next: (id: chararray,ts: int,url: chararray))}}
{code}
The script works if I turn off the new logical plan or use Pig 0.7.
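For readers trying to picture the data shape, the script and describe output above suggest that each output tuple pairs a record with its neighbours in timestamp order. The following is only a sketch of that idea in plain Java: the source of TransformToMyDataType is not part of this report, so the class name `WindowSketch`, the method `window`, and the choice of null for out-of-range neighbours are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WindowSketch {

    /**
     * For each position i in an already ordered list, collects the elements
     * at i + offset for every offset. Out-of-range positions become null
     * (an assumption; the real UDF may handle edges differently).
     */
    static <T> List<List<T>> window(List<T> ordered, int... offsets) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < ordered.size(); i++) {
            List<T> row = new ArrayList<>();
            for (int off : offsets) {
                int j = i + off;
                row.add(j >= 0 && j < ordered.size() ? ordered.get(j) : null);
            }
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        // Offsets -1, 0, 1 correspond to (previous, current, next).
        List<String> urls = Arrays.asList("a", "b", "c");
        for (List<String> row : window(urls, -1, 0, 1)) {
            System.out.println(row);
        }
    }
}
```

With offsets -1, 0, 1 each row corresponds to one (previous, current, next) tuple, which is the structure that `current.id` in C3 then tries to dereference.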
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-1868) New logical plan fails when I have
complex data types from udf
Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011387#comment-13011387 ]
Thejas M Nair commented on PIG-1868:
------------------------------------
+1
[jira] Updated: (PIG-1868) New logical plan fails when I have
complex data types from udf
Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1868:
--------------------------------
Fix Version/s: 0.8.0
Assignee: Daniel Dai
[jira] Commented: (PIG-1868) New logical plan fails when I have
complex data types from udf
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006760#comment-13006760 ]
Daniel Dai commented on PIG-1868:
---------------------------------
I got the actual implementation of TransformToMyDataType from the user. The problem is that TransformToMyDataType forgets to set the twolevelaccess flag for the bag. Pig 0.9 does not require the twolevelaccess flag. However, we saw Pig 0.9 fail with a different stack:
ERROR 1000: Invalid field reference. Referenced field [guid] does not exist in schema: null.
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias C3
    at org.apache.pig.PigServer.explain(PigServer.java:993)
    at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:368)
    at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:300)
    at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:263)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.Explain(PigScriptParser.java:665)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:325)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:176)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:152)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
    at org.apache.pig.Main.run(Main.java:537)
    at org.apache.pig.Main.main(Main.java:108)
Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 1000: Invalid field reference. Referenced field [guid] does not exist in schema: null.
    at org.apache.pig.newplan.logical.visitor.ColumnAliasConversionVisitor$1.visit(ColumnAliasConversionVisitor.java:114)
    at org.apache.pig.newplan.logical.expression.DereferenceExpression.accept(DereferenceExpression.java:83)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:114)
    at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:104)
    at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:73)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1538)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1533)
    at org.apache.pig.PigServer$Graph.access$200(PigServer.java:1295)
    at org.apache.pig.PigServer.buildStorePlan(PigServer.java:1195)
    at org.apache.pig.PigServer.explain(PigServer.java:956)
[jira] Updated: (PIG-1868) New logical plan fails when I have
complex data types from udf
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-1868:
----------------------------
Affects Version/s: (was: 0.8.0)
Fix Version/s: (was: 0.8.0)
[jira] Updated: (PIG-1868) New logical plan fails when I have
complex data types from udf
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-1868:
----------------------------
Affects Version/s: 0.9.0
Fix Version/s: 0.9.0
[jira] Updated: (PIG-1868) New logical plan fails when I have
complex data types from udf
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-1868:
----------------------------
Attachment: PIG-1868-1.patch
[jira] Commented: (PIG-1868) New logical plan fails when I have
complex data types from udf
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007663#comment-13007663 ]
Daniel Dai commented on PIG-1868:
---------------------------------
The failure on trunk occurs because, in this query, TransformToMyDataType returns a more specific schema than the user-declared schema "seq: { t: ( previous, current, next ) }", and we now take only the user-declared schema. This is a regression; I will attach a patch to fix it.
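The mismatch described in this comment can be illustrated with a small self-contained model. It is only a sketch: `Field`, `merge`, and the string type names are hypothetical stand-ins rather than Pig's own schema classes, and the merge rule (keep the user's aliases, fall back to the UDF's types and nested fields where the user declared none) is an assumption about the intended behaviour, not the content of the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaMergeSketch {

    /** Hypothetical stand-in for a schema field: alias, type name, nested fields. */
    static final class Field {
        final String alias;
        final String type;        // "bytearray" stands for "no type declared"
        final List<Field> inner;  // empty for simple types

        Field(String alias, String type, List<Field> inner) {
            this.alias = alias;
            this.type = type;
            this.inner = inner;
        }

        /** Looks up a nested field by alias; returns null when it is absent. */
        Field find(String name) {
            for (Field f : inner) {
                if (f.alias.equals(name)) {
                    return f;
                }
            }
            return null;
        }
    }

    static Field field(String alias, String type, Field... inner) {
        return new Field(alias, type, Arrays.asList(inner));
    }

    /** Assumed rule: aliases come from the user, missing types and nesting from the UDF. */
    static Field merge(Field declared, Field fromUdf) {
        String type = declared.type.equals("bytearray") ? fromUdf.type : declared.type;
        if (declared.inner.isEmpty()) {
            return new Field(declared.alias, type, fromUdf.inner);
        }
        List<Field> merged = new ArrayList<>();
        for (int i = 0; i < declared.inner.size(); i++) {
            Field d = declared.inner.get(i);
            // Match positionally; keep the declared field if the UDF schema is shorter.
            merged.add(i < fromUdf.inner.size() ? merge(d, fromUdf.inner.get(i)) : d);
        }
        return new Field(declared.alias, type, merged);
    }

    /** seq: { t: ( previous, current, next ) } as written in the script. */
    static Field declaredSchema() {
        return field("seq", "bag",
            field("t", "tuple",
                field("previous", "bytearray"),
                field("current", "bytearray"),
                field("next", "bytearray")));
    }

    static Field recordTuple(String alias) {
        return field(alias, "tuple",
            field("id", "chararray"), field("ts", "int"), field("url", "chararray"));
    }

    /** The more specific schema, modelled on the describe output for C1. */
    static Field udfSchema() {
        return field("seq", "bag",
            field("t", "tuple",
                recordTuple("previous"), recordTuple("current"), recordTuple("next")));
    }

    public static void main(String[] args) {
        Field declaredCurrent = declaredSchema().find("t").find("current");
        Field mergedCurrent = merge(declaredSchema(), udfSchema()).find("t").find("current");
        System.out.println("declared only: current.id -> " + declaredCurrent.find("id"));
        System.out.println("merged:        current.id -> " + mergedCurrent.find("id").type);
    }
}
```

In this model, looking up `id` under `current` finds nothing when only the declared schema is used, which mirrors the "does not exist in schema" failure, while the merged schema resolves it to a chararray field.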
[jira] [Resolved] (PIG-1868) New logical plan fails when I have
complex data types from udf
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-1868.
-----------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Review notes: https://reviews.apache.org/r/526/
Patch committed to trunk.