Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2011/01/28 22:42:46 UTC
[jira] Created: (PIG-1834) relation-as-scalar - uses the last
statement associated with the scalar alias
relation-as-scalar - uses the last statement associated with the scalar alias
-----------------------------------------------------------------------------
Key: PIG-1834
URL: https://issues.apache.org/jira/browse/PIG-1834
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Fix For: 0.9.0, 0.8.0
Pig allows a relation alias to be reused, i.e. to refer to different relations (statements) at different points in a script. I have not seen this in the documentation, but I have seen people writing such queries.
For example -
{code}
l = load 'x' as (a,b);
l = filter l by a > 1;
l = foreach ...
store l into 'y';
{code}
At any point in the query, the alias "l" always represents the relation most recently associated with it in the portion of the Pig query above that point.
But with the relation-as-scalar feature, the alias is resolved to the last relation associated with it in the entire script.
For example -
{code}
l = load 'x' as (a,b);
A = load 'x' as (a,b);
B = foreach A generate a, l.a as la;
l = foreach l generate a+1 as a;
store B into 'b';
{code}
The alias l used in the statement for alias B should refer to the load, but it refers to the later foreach statement -
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-16
Map Plan
l: Store(file:/tmp/temp-953430379/tmp2006282146:org.apache.pig.impl.io.InterStorage) - scope-8
|
|---l: New For Each(false)[bag] - scope-7
    |   |
    |   Add[int] - scope-5
    |   |
    |   |---Cast[int] - scope-3
    |   |   |
    |   |   |---Project[bytearray][0] - scope-2
    |   |
    |   |---Constant(1) - scope-4
    |
    |---l: Load(file:///Users/tejas/pig_type/trunk/x:org.apache.pig.builtin.PigStorage) - scope-1--------
Global sort: false
----------------
MapReduce node scope-17
Map Plan
B: Store(file:///Users/tejas/pig_type/trunk/b:org.apache.pig.builtin.PigStorage) - scope-15
|
|---B: New For Each(false,false)[bag] - scope-14
    |   |
    |   Project[bytearray][0] - scope-9
    |   |
    |   POUserFunc(org.apache.pig.impl.builtin.ReadScalars)[int] - scope-13
    |   |
    |   |---Constant(0) - scope-11
    |   |
    |   |---Constant(file:/tmp/temp-953430379/tmp2006282146) - scope-12
    |
    |---A: Load(file:///Users/tejas/pig_type/trunk/x:org.apache.pig.builtin.PigStorage) - scope-0--------
Global sort: false
----------------
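The difference between the two resolution rules described above can be illustrated with a small toy model. This is only a sketch in Python, not Pig's actual implementation: the `script` list, the tuple layout, and both function names are hypothetical and exist purely to show how "most recent definition above the reference" differs from "last definition in the entire script" for the second example query.

```python
# Toy model of alias resolution; NOT Pig's implementation.
# Each statement is (alias, definition, scalar_refs), where scalar_refs
# lists the aliases used as scalars inside that definition.
script = [
    ("l", "load 'x'", []),
    ("A", "load 'x'", []),
    ("B", "foreach A generate a, l.a", ["l"]),
    ("l", "foreach l generate a+1", []),
]


def resolve_in_order(statements):
    """Bind each scalar reference to the most recent definition above it."""
    bindings = {}  # alias -> index of its latest definition seen so far
    resolved = {}  # (statement index, referenced alias) -> definition index
    for i, (alias, _definition, scalar_refs) in enumerate(statements):
        for ref in scalar_refs:
            resolved[(i, ref)] = bindings[ref]
        bindings[alias] = i
    return resolved


def resolve_after_whole_script(statements):
    """Bind each scalar reference to the last definition in the whole script."""
    # Later entries overwrite earlier ones, so each alias maps to its final definition.
    bindings = {alias: i for i, (alias, _d, _r) in enumerate(statements)}
    resolved = {}
    for i, (_alias, _definition, scalar_refs) in enumerate(statements):
        for ref in scalar_refs:
            resolved[(i, ref)] = bindings[ref]
    return resolved


expected = resolve_in_order(script)
reported = resolve_after_whole_script(script)
print(script[expected[(2, "l")]][1])  # load 'x'
print(script[reported[(2, "l")]][1])  # foreach l generate a+1
```

In this model the first function gives the behaviour the report expects (B reads the scalar from the load), while the second reproduces the reported behaviour (B reads it from the later foreach).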
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1834) relation-as-scalar - uses the last
statement associated with the scalar alias
Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1834:
--------------------------------
Fix Version/s: (was: 0.8.0)
> relation-as-scalar - uses the last statement associated with the scalar alias
> -----------------------------------------------------------------------------
>
> Key: PIG-1834
> URL: https://issues.apache.org/jira/browse/PIG-1834
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Richard Ding
> Fix For: 0.9.0
[jira] Assigned: (PIG-1834) relation-as-scalar - uses the last
statement associated with the scalar alias
Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich reassigned PIG-1834:
-----------------------------------
Assignee: Richard Ding
> relation-as-scalar - uses the last statement associated with the scalar alias
> -----------------------------------------------------------------------------
>
> Key: PIG-1834
> URL: https://issues.apache.org/jira/browse/PIG-1834
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Richard Ding
> Fix For: 0.8.0, 0.9.0
[jira] Resolved: (PIG-1834) relation-as-scalar - uses the last
statement associated with the scalar alias
Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich resolved PIG-1834.
---------------------------------
Resolution: Fixed
> relation-as-scalar - uses the last statement associated with the scalar alias
> -----------------------------------------------------------------------------
>
> Key: PIG-1834
> URL: https://issues.apache.org/jira/browse/PIG-1834
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Richard Ding
> Fix For: 0.9.0
[jira] Commented: (PIG-1834) relation-as-scalar - uses the last
statement associated with the scalar alias
Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002836#comment-13002836 ]
Richard Ding commented on PIG-1834:
-----------------------------------
This is fixed with the new parser changes.
> relation-as-scalar - uses the last statement associated with the scalar alias
> -----------------------------------------------------------------------------
>
> Key: PIG-1834
> URL: https://issues.apache.org/jira/browse/PIG-1834
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Richard Ding
> Fix For: 0.9.0
[jira] Updated: (PIG-1834) relation-as-scalar - uses the last
statement associated with the scalar alias
Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-1834:
-------------------------------
Description:
Pig allows a relation alias to be reused, i.e. to refer to different relations (statements) at different points in a script. I have not seen this in the documentation, but I have seen people writing such queries.
For example -
{code}
l = load 'x' as (a,b);
l = filter l by a > 1;
l = foreach ...
store l into 'y';
{code}
At any point in the query, the alias "l" always represents the relation most recently associated with it in the portion of the Pig query above that point.
But with the relation-as-scalar feature, the alias is resolved to the last relation associated with it in the entire script.
For example -
{code}
l = load 'x' as (a,b);
A = load 'x' as (a,b);
B = foreach A generate a, l.a as la;
l = foreach l generate a+1 as a;
store B into 'b';
{code}
The alias l used in the statement for alias B should refer to the load, but it refers to the later foreach statement -
{code}
#--------------------------------------------------
# Map Reduce Plan
#--------------------------------------------------
MapReduce node scope-16
Map Plan
l: Store(file:/tmp/temp-953430379/tmp2006282146:org.apache.pig.impl.io.InterStorage) - scope-8
|
|---l: New For Each(false)[bag] - scope-7
    |   |
    |   Add[int] - scope-5
    |   |
    |   |---Cast[int] - scope-3
    |   |   |
    |   |   |---Project[bytearray][0] - scope-2
    |   |
    |   |---Constant(1) - scope-4
    |
    |---l: Load(file:///Users/tejas/pig_type/trunk/x:org.apache.pig.builtin.PigStorage) - scope-1--------
Global sort: false
----------------
MapReduce node scope-17
Map Plan
B: Store(file:///Users/tejas/pig_type/trunk/b:org.apache.pig.builtin.PigStorage) - scope-15
|
|---B: New For Each(false,false)[bag] - scope-14
    |   |
    |   Project[bytearray][0] - scope-9
    |   |
    |   POUserFunc(org.apache.pig.impl.builtin.ReadScalars)[int] - scope-13
    |   |
    |   |---Constant(0) - scope-11
    |   |
    |   |---Constant(file:/tmp/temp-953430379/tmp2006282146) - scope-12
    |
    |---A: Load(file:///Users/tejas/pig_type/trunk/x:org.apache.pig.builtin.PigStorage) - scope-0--------
Global sort: false
----------------
{code}
> relation-as-scalar - uses the last statement associated with the scalar alias
> -----------------------------------------------------------------------------
>
> Key: PIG-1834
> URL: https://issues.apache.org/jira/browse/PIG-1834
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Fix For: 0.8.0, 0.9.0