Posted to dev@pig.apache.org by "Pradeep Kamath (JIRA)" <ji...@apache.org> on 2010/02/24 00:50:28 UTC
[jira] Created: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
PigStorage per the new load-store redesign should support splitting of bzip files
---------------------------------------------------------------------------------
Key: PIG-1257
URL: https://issues.apache.org/jira/browse/PIG-1257
Project: Pig
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath
Fix For: 0.7.0
PigStorage, as implemented under the new load-store redesign (PIG-966), is based on TextInputFormat for reading data. TextInputFormat can read bzip data but does not support splitting bzip files. In Pig 0.6, splitting was enabled for bzip files; we should attempt to restore that feature.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Attachment: PIG-1257-2.patch
Attached new patch to address unit test failures
[jira] Commented: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838261#action_12838261 ]
Hadoop QA commented on PIG-1257:
--------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12436937/PIG-1257.patch
against trunk revision 916065.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 10 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 1 new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/215/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/215/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/215/console
This message is automatically generated.
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Status: Open (was: Patch Available)
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Attachment: blockHeaderEndsAt136500.txt.bz2
blockEndingInCR.txt.bz2
PIG-1257-3.patch
Since the last patch, I uncovered some issues with the code while testing boundary conditions. I have fixed them in the new patch, PIG-1257-3.patch, and added those boundary conditions as test cases in TestBZip.
[jira] Commented: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846080#action_12846080 ]
Pradeep Kamath commented on PIG-1257:
-------------------------------------
I ran all unit tests on my local machines and also the "test-patch" ant target:
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 12 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]
[exec]
[jira] Commented: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845627#action_12845627 ]
Hadoop QA commented on PIG-1257:
--------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12438883/recordLossblockHeaderEndsAt136500.txt.bz2
against trunk revision 923043.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/239/console
This message is automatically generated.
[jira] Commented: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846038#action_12846038 ]
Pradeep Kamath commented on PIG-1257:
-------------------------------------
In the following case in inputData, the record will end with \r, won't it? (Notice the \r in the middle, after the 2.)
{code}
"1\t2\r3\t4", // '\r' case - this will be split into two tuples
{code}
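For illustration only, here is a minimal sketch of why that input yields two tuples, assuming the reader treats a bare \r, a bare \n, and \r\n each as a single record terminator (as the test comment implies). The class and method names are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the patch: splits text into records,
// treating "\n", "\r", and "\r\n" each as one record terminator.
class LineSplitSketch {
    static List<String> splitRecords(String text) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n' || c == '\r') {
                records.add(current.toString());
                current.setLength(0);
                // Consume the '\n' of a "\r\n" pair so it counts once.
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                current.append(c);
            }
        }
        // Emit a trailing record that has no terminator.
        if (current.length() > 0) {
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        // The '\r' in the middle ends the first record.
        System.out.println(splitRecords("1\t2\r3\t4")); // two records: "1\t2" and "3\t4"
    }
}
```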
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Attachment: recordLossblockHeaderEndsAt136500.txt.bz2
The .bz2 files attached to this issue should be put in test/org/apache/pig/test/data for this patch to pass unit tests.
[jira] Commented: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838695#action_12838695 ]
Hadoop QA commented on PIG-1257:
--------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12437067/PIG-1257-2.patch
against trunk revision 916429.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/224/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/224/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/224/console
This message is automatically generated.
[jira] Commented: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846027#action_12846027 ]
Benjamin Reed commented on PIG-1257:
------------------------------------
Excellent work, Pradeep. Just one minor thing: you always append a \n before inputData in your test case, so you never test the case where a record ends with just \r.
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Status: Open (was: Patch Available)
[jira] Commented: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846083#action_12846083 ]
Benjamin Reed commented on PIG-1257:
------------------------------------
+1. You are right. Thanks, Pradeep. I think it is ready to commit.
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Status: Patch Available (was: Open)
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Status: Patch Available (was: Open)
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Status: Patch Available (was: Open)
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
Patch committed
[jira] Updated: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1257:
--------------------------------
Attachment: PIG-1257.patch
Attached patch builds an InputFormat (Bzip2TextInputFormat) on top of the existing CBZip2InputStream.
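As background on what makes bzip2 splittable at all: every compressed block in a bzip2 stream begins with the 48-bit marker 0x314159265359, and blocks are not byte-aligned, so a reader that starts in the middle of a file has to scan bit by bit for the next marker. The sketch below shows only that scanning idea. It is not taken from the patch or from CBZip2InputStream, and the class and method names are hypothetical.

```java
// Hypothetical sketch, not the patch's code: locate the next bzip2 block
// marker in a byte array, scanning one bit at a time (most significant bit first).
class BlockMagicScanSketch {
    // 48-bit marker that begins every bzip2 compressed block.
    static final long BLOCK_MAGIC = 0x314159265359L;
    static final long MASK_48 = 0xFFFFFFFFFFFFL;

    /** Returns the bit offset of the first block marker starting at or after startBit, or -1. */
    static long findBlockStart(byte[] data, long startBit) {
        long window = 0;      // the most recent (up to) 48 bits read
        int bitsSeen = 0;     // how many bits are in the window, capped at 48
        long totalBits = (long) data.length * 8;
        for (long bit = startBit; bit < totalBits; bit++) {
            int b = (data[(int) (bit >>> 3)] >> (7 - (int) (bit & 7))) & 1;
            window = ((window << 1) | b) & MASK_48;
            if (bitsSeen < 48) {
                bitsSeen++;
            }
            if (bitsSeen == 48 && window == BLOCK_MAGIC) {
                return bit - 47; // offset of the marker's first bit
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A 4-byte stream header ("BZh9") followed directly by a block marker.
        byte[] data = {'B', 'Z', 'h', '9', 0x31, 0x41, 0x59, 0x26, 0x53, 0x59};
        System.out.println(findBlockStart(data, 0)); // marker begins right after the 32 header bits
    }
}
```

A split-aware reader could use a scan like this to skip forward from its split offset to the first block it owns. How the committed Bzip2TextInputFormat assigns blocks and records to splits is not described in this thread.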
[jira] Closed: (PIG-1257) PigStorage per the new load-store
redesign should support splitting of bzip files
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai closed PIG-1257.
---------------------------