Posted to dev@pig.apache.org by "Richard Ding (JIRA)" <ji...@apache.org> on 2011/03/18 23:55:29 UTC
[jira] Created: (PIG-1920) An error in new parser
An error in new parser
----------------------
Key: PIG-1920
URL: https://issues.apache.org/jira/browse/PIG-1920
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Xuefu Zhang
Fix For: 0.9.0
Run the following Pig script on trunk:
{code}
A = load 'input' as (v, u);
B = group A by $0;
C = group B by $0;
describe C;
R = foreach C generate B.A.v;
describe R;
{code}
One gets this error:
{code}
C: {group: bytearray,B: {(group: bytearray,A: {(v: bytearray,u: bytearray)})}}
[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Invalid field reference. Referenced field [v] does not exist in schema: A#19:bag{null#20:tuple(v#17:bytearray,u#18:bytearray)}.
{code}
Change the 5th line to
{code}
R = foreach C generate B.A.$0;
{code}
One gets this output:
{code}
C: {group: bytearray,B: {(group: bytearray,A: {(v: bytearray,u: bytearray)})}}
R: {{(A: {(v: bytearray,u: bytearray)})}}
{code}
This is different from the corresponding Pig 0.8 output (and wrong):
{code}
C: {group: bytearray,B: {group: bytearray,A: {v: bytearray,u: bytearray}}}
R: {{v: bytearray}}
{code}
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-1920) An error in new parser
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-1920.
-----------------------------
Resolution: Fixed
Actually, this is the right behavior; 0.8 was doing it wrong.
The schema we get for C is: C: {group: bytearray,B: {(group: bytearray,A: {(v: bytearray,u: bytearray)})}}
The schema for B is: B: {(group: bytearray,A: {(v: bytearray,u: bytearray)})}
The schema for B.A is: B:{(A: {(v: bytearray,u: bytearray)})}
Note that B.A does not flatten the bag; it results in a two-level bag. v is in the inner bag and cannot be referenced.
B.A.$0 is valid, but it is equal to B.A.
In 0.8, describing B.A.v is valid; however, when we dump it we get B.A.$0, which is not the right result.
So this bug is actually in 0.8, and we fixed it in 0.9.
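The projection rule described above can be illustrated with a toy model. What follows is a minimal Python sketch, not Pig's implementation: the names Field, Bag, project, and show are hypothetical, and the only rule assumed is that projecting a column from a bag yields a bag of one-field tuples holding that column, without flattening any inner bag.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Bag:
    # Fields of the tuple type that the bag contains.
    fields: Tuple["Field", ...]


@dataclass(frozen=True)
class Field:
    name: str
    type: Union[str, Bag]  # a scalar type name such as "bytearray", or a nested Bag


def project(bag: Bag, ref: str) -> Bag:
    """Project one column out of a bag (assumed rule, see lead-in).

    ref is a field name or a positional reference such as "$0".
    The result is a bag of one-field tuples; inner bags are not flattened.
    """
    if ref.startswith("$"):
        index = int(ref[1:])
        if index >= len(bag.fields):
            raise KeyError(f"position [{ref}] does not exist in schema")
        field = bag.fields[index]
    else:
        matches = [f for f in bag.fields if f.name == ref]
        if not matches:
            raise KeyError(f"field [{ref}] does not exist in schema")
        field = matches[0]
    return Bag((field,))


def show(t: Union[str, Bag]) -> str:
    """Render a type roughly the way describe prints a bag schema."""
    if isinstance(t, Bag):
        inner = ",".join(f"{f.name}: {show(f.type)}" for f in t.fields)
        return "{(" + inner + ")}"
    return t


# Schemas from the script in this issue.
A = Bag((Field("v", "bytearray"), Field("u", "bytearray")))
B = Bag((Field("group", "bytearray"), Field("A", A)))

B_A = project(B, "A")
print(show(B_A))  # {(A: {(v: bytearray,u: bytearray)})}

# B.A.$0 selects the only column of B.A, so it is the same as B.A.
print(project(B_A, "$0") == B_A)  # True

# B.A.v fails: the tuples of B.A have a single field named A, and v
# lives one level deeper, inside the inner bag.
try:
    project(B_A, "v")
except KeyError as err:
    print("error:", err)
```

In this model v becomes reachable only after the outer bag has been flattened in a separate step, which is consistent with the two-level bag described above.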
[jira] [Assigned] (PIG-1920) An error in new parser
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang reassigned PIG-1920:
--------------------------------
Assignee: Daniel Dai (was: Xuefu Zhang)
It seems related to the 'null#u:int' issue.