Posted to dev@pig.apache.org by "Viraj Bhat (JIRA)" <ji...@apache.org> on 2009/01/29 00:22:59 UTC
[jira] Created: (PIG-644) Duplicate column names in foreach do not
throw parser error
Duplicate column names in foreach do not throw parser error
-----------------------------------------------------------
Key: PIG-644
URL: https://issues.apache.org/jira/browse/PIG-644
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Reporter: Viraj Bhat
Fix For: types_branch
Consider the following Pig script, in which the FOREACH generates two columns that are both named b:
{code}
DATA = LOAD 'blah.txt' as (a:long, b:long);
RESULT = FOREACH DATA GENERATE a, b, (b>20?b:0) as b;
DESCRIBE RESULT;
dump RESULT;
{code}
Pig runs the script successfully and does not complain about the duplicate column names. I do not know whether the new error handling framework will handle cases like this.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-644) Duplicate column names in foreach do not
throw parser error
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-644:
---------------------------
Attachment: PIG-644-1.patch
Add a SchemaAliasValidator to do this check.
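For readers without access to the patch, the kind of check described above can be sketched roughly as follows. This is a hypothetical, standalone illustration and not the code in PIG-644-1.patch: the class name DuplicateAliasCheck, its methods, and the error message are assumptions, and it works on a plain list of alias strings rather than on Pig's own schema classes.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the SchemaAliasValidator from the patch.
class DuplicateAliasCheck {

    // Returns every alias that occurs more than once, in the order the repeats are seen.
    static Set<String> findDuplicates(List<String> aliases) {
        Set<String> seen = new LinkedHashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String alias : aliases) {
            if (alias == null) {
                continue; // assumption: unnamed columns cannot clash with each other
            }
            if (!seen.add(alias)) {
                duplicates.add(alias); // add() returned false, so the alias was already present
            }
        }
        return duplicates;
    }

    // Rejects a schema whose aliases are not unique.
    static void validate(List<String> aliases) {
        Set<String> duplicates = findDuplicates(aliases);
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException("Duplicate schema alias: " + duplicates);
        }
    }

    public static void main(String[] args) {
        // Aliases produced by: FOREACH DATA GENERATE a, b, (b>20?b:0) as b;
        List<String> aliases = List.of("a", "b", "b");
        try {
            validate(aliases);
            System.out.println("schema ok");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check of this kind in place, the script from the report would be rejected unless the third column is given a distinct alias.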
> Duplicate column names in foreach do not throw parser error
> -----------------------------------------------------------
>
> Key: PIG-644
> URL: https://issues.apache.org/jira/browse/PIG-644
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.4.0
> Reporter: Viraj Bhat
> Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: blah.txt, PIG-644-1.patch
>
>
> Consider the following Pig script, in which the FOREACH generates two columns that are both named b:
> {code}
> DATA = LOAD 'blah.txt' as (a:long, b:long);
> RESULT = FOREACH DATA GENERATE a, b, (b>20?b:0) as b;
> DESCRIBE RESULT;
> dump RESULT;
> {code}
> Pig runs the script successfully and does not complain about the duplicate column names. I do not know whether the new error handling framework will handle cases like this.
[jira] Updated: (PIG-644) Duplicate column names in foreach do not
throw parser error
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-644:
---------------------------
Fix Version/s: 0.6.0
Affects Version/s: 0.4.0 (was: 0.2.0)
Status: Patch Available (was: Open)
[jira] Commented: (PIG-644) Duplicate column names in foreach do
not throw parser error
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767528#action_12767528 ]
Pradeep Kamath commented on PIG-644:
------------------------------------
+1
[jira] Assigned: (PIG-644) Duplicate column names in foreach do not
throw parser error
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai reassigned PIG-644:
------------------------------
Assignee: Daniel Dai
[jira] Commented: (PIG-644) Duplicate column names in foreach do
not throw parser error
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766839#action_12766839 ]
Hadoop QA commented on PIG-644:
-------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12422324/PIG-644-1.patch
against trunk revision 826110.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
-1 release audit. The applied patch generated 307 release audit warnings (more than the trunk's current 305 warnings).
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/91/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/91/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/91/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/91/console
This message is automatically generated.
[jira] Updated: (PIG-644) Duplicate column names in foreach do not
throw parser error
Posted by "Viraj Bhat (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Bhat updated PIG-644:
---------------------------
Attachment: blah.txt
Sample input
[jira] Updated: (PIG-644) Duplicate column names in foreach do not
throw parser error
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-644:
---------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
Patch committed
[jira] Commented: (PIG-644) Duplicate column names in foreach do
not throw parser error
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767558#action_12767558 ]
Daniel Dai commented on PIG-644:
--------------------------------
Also note that this change may break some existing scripts if they contain duplicate schema aliases.