Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2009/06/23 09:48:07 UTC
[jira] Created: (PIG-861) POJoinPackage lose tuple in large dataset
POJoinPackage lose tuple in large dataset
-----------------------------------------
Key: PIG-861
URL: https://issues.apache.org/jira/browse/PIG-861
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.2.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.4.0
Some scripts using POJoinPackage lose records when processing a large amount of input data. We do not see this problem with smaller inputs. We can reproduce the problem; however, the dataset for the test case is too big to be included here. We suspect that POJoinPackage causes the problem.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-861:
---------------------------
Status: In Progress (was: Patch Available)
[jira] Updated: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-861:
---------------------------
Affects Version/s: 0.3.0 (was: 0.2.0)
Status: Patch Available (was: Open)
[jira] Updated: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-861:
---------------------------
Attachment: PIG-861-1.patch
The problem is caused by a bug in BinStorage.java, which erroneously interprets the character \255 in the binary stream as EOF. I tested the patch on the original queries and it fixes the problem. No unit test is included since this patch does not introduce any new feature.
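For readers unfamiliar with this class of bug, here is a minimal sketch of how a data byte can be mistaken for end of stream. It is not the BinStorage code and is not taken from the patch. It assumes that \255 means the byte value 255 (0xFF) and that the reader narrowed the result of InputStream.read() to a byte before comparing it with -1. The class and method names (EofByteDemo, countBuggy, countFixed) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; illustrates the bug pattern only.
public class EofByteDemo {

    // Buggy pattern (assumed): the int returned by read() is narrowed to a
    // byte first, so the data byte 0xFF becomes -1 and looks like EOF.
    static int countBuggy(InputStream in) throws IOException {
        int count = 0;
        while (true) {
            byte b = (byte) in.read();
            if (b == -1) {
                break; // triggered by real EOF and by a 0xFF data byte
            }
            count++;
        }
        return count;
    }

    // Safe pattern: keep the value as an int. read() returns 0..255 for
    // data and -1 only at end of stream, so the two cannot collide.
    static int countFixed(InputStream in) throws IOException {
        int count = 0;
        while (true) {
            int b = in.read();
            if (b == -1) {
                break; // genuine end of stream
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x01, (byte) 0xFF, 0x02};
        // The buggy reader stops at the 0xFF byte: prints 1
        System.out.println(countBuggy(new ByteArrayInputStream(data)));
        // The fixed reader sees all three bytes: prints 3
        System.out.println(countFixed(new ByteArrayInputStream(data)));
    }
}
```

Under these assumptions, everything after the first 0xFF byte seen at the checked position is silently dropped, which would be consistent with records going missing without any error being raised.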
[jira] Commented: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727030#action_12727030 ]
Hadoop QA commented on PIG-861:
-------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12412218/PIG-861-1.patch
against trunk revision 790735.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/110/console
This message is automatically generated.
[jira] Updated: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-861:
---------------------------
Attachment: PIG-861-2.patch
Resync the patch to the latest trunk.
[jira] Updated: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-861:
---------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Patch committed.
[jira] Updated: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-861:
---------------------------
Status: Patch Available (was: In Progress)
[jira] Updated: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-861:
---------------------------
Status: Patch Available (was: In Progress)
Submit again for Hudson to pick up.
[jira] Commented: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727223#action_12727223 ]
Hudson commented on PIG-861:
----------------------------
Integrated in Pig-trunk #494 (See [http://hudson.zones.apache.org/hudson/job/Pig-trunk/494/])
: POJoinPackage lose tuple in large dataset
[jira] Commented: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727172#action_12727172 ]
Hadoop QA commented on PIG-861:
-------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12412527/PIG-861-2.patch
against trunk revision 790735.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/113/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/113/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-minerva.apache.org/113/console
This message is automatically generated.
[jira] Commented: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725880#action_12725880 ]
Olga Natkovich commented on PIG-861:
------------------------------------
+1, changes look good. Great catch!
We need to make sure all tests pass before committing.
[jira] Updated: (PIG-861) POJoinPackage lose tuple in large dataset
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-861:
---------------------------
Status: In Progress (was: Patch Available)