Posted to dev@pig.apache.org by "Viraj Bhat (JIRA)" <ji...@apache.org> on 2009/01/29 00:22:59 UTC

[jira] Created: (PIG-644) Duplicate column names in foreach do not throw parser error

Duplicate column names in foreach do not throw parser error
-----------------------------------------------------------

                 Key: PIG-644
                 URL: https://issues.apache.org/jira/browse/PIG-644
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
            Reporter: Viraj Bhat
             Fix For: types_branch


Consider the following Pig script, in which the FOREACH generates two columns that are both named b:
{code}
DATA = LOAD 'blah.txt' as (a:long, b:long);
RESULT = FOREACH DATA GENERATE a, b, (b>20?b:0) as b;
DESCRIBE RESULT;
dump RESULT;
{code}

Pig runs the script successfully and does not complain about the duplicate column names. I do not know whether the new error handling framework will handle these kinds of cases.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-644) Duplicate column names in foreach do not throw parser error

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-644:
---------------------------

    Attachment: PIG-644-1.patch

Add a SchemaAliasValidator to do this check.
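
For readers without the patch at hand, the kind of check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in PIG-644-1.patch: the class DuplicateAliasCheck and its methods are hypothetical names, the schema is reduced to a plain list of alias strings, and a null entry stands for an unnamed column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; NOT the SchemaAliasValidator from the patch.
class DuplicateAliasCheck {

    // Returns every alias that occurs more than once, in the order the
    // repeats are first seen. Null entries (unnamed columns) are skipped.
    static List<String> findDuplicateAliases(List<String> aliases) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String alias : aliases) {
            // Set.add returns false when the alias was already present.
            if (alias != null && !seen.add(alias)) {
                duplicates.add(alias);
            }
        }
        return new ArrayList<>(duplicates);
    }

    // Rejects a schema whose aliases are not unique.
    static void validate(List<String> aliases) {
        List<String> duplicates = findDuplicateAliases(aliases);
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException(
                "Duplicate schema alias: " + String.join(", ", duplicates));
        }
    }

    public static void main(String[] args) {
        // Aliases produced by: GENERATE a, b, (b>20?b:0) as b
        System.out.println(findDuplicateAliases(Arrays.asList("a", "b", "b")));
    }
}
```

With the aliases from the script in this issue (a, b, b), validate would throw and name b as the duplicate, while a schema such as (a, b, c) would pass.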

> Duplicate column names in foreach do not throw parser error
> -----------------------------------------------------------
>
>                 Key: PIG-644
>                 URL: https://issues.apache.org/jira/browse/PIG-644
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.4.0
>            Reporter: Viraj Bhat
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: blah.txt, PIG-644-1.patch


[jira] Updated: (PIG-644) Duplicate column names in foreach do not throw parser error

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-644:
---------------------------

        Fix Version/s: 0.6.0
    Affects Version/s:     (was: 0.2.0)
                       0.4.0
               Status: Patch Available  (was: Open)



[jira] Commented: (PIG-644) Duplicate column names in foreach do not throw parser error

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767528#action_12767528 ] 

Pradeep Kamath commented on PIG-644:
------------------------------------

+1




[jira] Assigned: (PIG-644) Duplicate column names in foreach do not throw parser error

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai reassigned PIG-644:
------------------------------

    Assignee: Daniel Dai



[jira] Commented: (PIG-644) Duplicate column names in foreach do not throw parser error

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766839#action_12766839 ] 

Hadoop QA commented on PIG-644:
-------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12422324/PIG-644-1.patch
  against trunk revision 826110.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 release audit.  The applied patch generated 307 release audit warnings (more than the trunk's current 305 warnings).

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/91/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/91/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/91/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/91/console

This message is automatically generated.



[jira] Updated: (PIG-644) Duplicate column names in foreach do not throw parser error

Posted by "Viraj Bhat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat updated PIG-644:
---------------------------

    Attachment: blah.txt

Sample input



[jira] Updated: (PIG-644) Duplicate column names in foreach do not throw parser error

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-644:
---------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Patch committed



[jira] Commented: (PIG-644) Duplicate column names in foreach do not throw parser error

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767558#action_12767558 ] 

Daniel Dai commented on PIG-644:
--------------------------------

Also note that this change may break existing scripts that contain duplicate schema aliases.
