Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2010/12/13 22:38:01 UTC

[jira] Created: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

as_clause in foreach statement should differentiate between simple type and type within tuple
---------------------------------------------------------------------------------------------

                 Key: PIG-1765
                 URL: https://issues.apache.org/jira/browse/PIG-1765
             Project: Pig
          Issue Type: Sub-task
            Reporter: Thejas M Nair
             Fix For: 0.9.0


With the new parser changes, the following two statements are treated as the same:

f = foreach l generate a as aa :int;      -- here the column is now called aa and has type int

f = foreach l generate a as (aa :int);   -- this should mean that the column has type "tuple with column aa of type int" 

With the old parser, the second statement results in a syntax error, which is fine, because it requires a name part.

Parentheses represent a tuple in Pig. We should deprecate support for load statements that take a schema without the parentheses, such as the following example:
l = load 'x' as a:int; -- It should be written as (a:int). It is treated as such, but the syntax is inconsistent.
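
To make the intended distinction concrete, here is a minimal Python sketch of the two readings of the as clause. It is a toy model, not Pig's parser: the names Field, TupleType, parse_field and parse_as_clause are hypothetical and exist only for this illustration, and it handles only flat "name:type" entries.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class TupleType:
    """A tuple type made of named fields (hypothetical model, not Pig's classes)."""
    fields: Tuple["Field", ...]


@dataclass(frozen=True)
class Field:
    """A column: an optional name plus either a simple type name or a TupleType."""
    name: Optional[str]
    type: Union[str, TupleType]


def parse_field(text: str) -> Field:
    """Parse a single 'name:type' entry such as 'aa :int'."""
    name, type_name = (part.strip() for part in text.split(":"))
    return Field(name, type_name)


def parse_as_clause(text: str) -> Field:
    """Toy reading of an as clause in which parentheses always mean a tuple."""
    text = text.strip()
    if text.startswith("(") and text.endswith(")"):
        inner = text[1:-1]
        fields = tuple(parse_field(part) for part in inner.split(","))
        # Parenthesized form: an unnamed column whose type is a tuple.
        return Field(None, TupleType(fields))
    # Bare form: a named column with a simple type.
    return parse_field(text)


simple = parse_as_clause("aa :int")    # column aa of type int
nested = parse_as_clause("(aa :int)")  # column of type tuple(aa:int)
```

Under this reading the two results are different objects: simple is a column named aa of type int, while nested is an unnamed column whose type is a tuple containing aa:int.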





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971308#action_12971308 ] 

Alan Gates commented on PIG-1765:
---------------------------------

This is going to be very non-intuitive to people.  Based on 20+ years of mathematics, most programmers expect x = (x) to be true.  We are saying that in this situation it isn't.  I know this is not a mathematical expression, but many people will not see the difference.  I agree that consistent syntax between load as and foreach as is greatly desirable, so requiring the name or the keyword tuple in the foreach as is not good.  So maybe there is no way around this.  But we should at least make sure the documentation is very specific, and maybe even issue warnings in the (x) case, to make sure people know.



[jira] Updated: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-1765:
--------------------------------

    Assignee: Xuefu Zhang

Xuefu, can we just follow Alan's suggestion and test to make sure we are not breaking compatibility on this one?


[jira] Commented: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994942#comment-12994942 ] 

Xuefu Zhang commented on PIG-1765:
----------------------------------

The issue about foreach generate a as aa:int vs (aa:int) was once there but was quickly addressed. Right now, we have exactly the same behavior as the old parser.

In addition, B = foreach A generate (A.x); doesn't parse in the new parser, but parses in the old one. I think it should be invalid because A is used both as a relation and as a scalar. If A has only one row, B = foreach A generate x will give the same result.

We can deprecate A = load 'foo' as x:int; in the new parser.


[jira] Commented: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994717#comment-12994717 ] 

Alan Gates commented on PIG-1765:
---------------------------------

We don't know what we want to do in the long term here. Currently we do not have consistent semantics for parentheses in Pig Latin. In the long term I think I would vote for () always meaning a tuple. But we need more discussion and thought before we settle on that. We are not ready to push this into 0.9.

For 0.9 it will be adequate to make sure we don't change behavior from 0.8. That is:

{code}
B = foreach A generate (A.x);
{code}

should still give an error.  We could also easily deprecate the

{code}
A = load 'foo' as x:int;
{code}

behavior.



[jira] Resolved: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang resolved PIG-1765.
------------------------------

    Resolution: Fixed

Per the discussion above, there are no remaining issues here.


[jira] Updated: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-1765:
--------------------------------

    Fix Version/s:     (was: 0.9.0)


[jira] Commented: (PIG-1765) as_clause in foreach statement should differentiate between simple type and type within tuple

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971363#action_12971363 ] 

Daniel Dai commented on PIG-1765:
---------------------------------

If we are going to change this, we also need to address the following situation:

f = foreach l generate flatten(a) as (a0:int, a1:int);  -- here the user does not mean to generate a tuple, but to name the different components of the flattened bag; we need a way to express this
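
The two competing readings of the parenthesized list after flatten can be sketched in Python as follows. This is only an illustration under assumed names: name_columns and wrap_as_tuple are hypothetical helpers, the row is made-up sample data, and "$0" is used here simply as a placeholder key for an unnamed first column.

```python
def name_columns(row, names):
    """Reading 1: the list names each flattened column individually."""
    if len(row) != len(names):
        raise ValueError("number of names must match number of columns")
    return dict(zip(names, row))


def wrap_as_tuple(row, names):
    """Reading 2: parentheses always mean a tuple, so the result is one
    column holding a tuple whose fields carry the given names."""
    if len(row) != len(names):
        raise ValueError("number of names must match number of fields")
    return {"$0": dict(zip(names, row))}


row = (1, 2)  # one flattened record with two int components (sample data)
as_columns = name_columns(row, ["a0", "a1"])   # two top-level columns
as_tuple = wrap_as_tuple(row, ["a0", "a1"])    # one tuple-typed column
```

The first reading yields two top-level columns a0 and a1, which is what the user wants here; the second yields a single tuple-typed column, which is what a strict "parentheses mean tuple" rule would produce.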
