Posted to dev@pig.apache.org by "Yan Zhou (JIRA)" <ji...@apache.org> on 2009/12/09 20:58:18 UTC
[jira] Created: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
[zebra] Use of Hadoop 2.0 APIs
--------------------------------
Key: PIG-1140
URL: https://issues.apache.org/jira/browse/PIG-1140
Project: Pig
Issue Type: Improvement
Affects Versions: 0.6.0
Reporter: Yan Zhou
Fix For: 0.7.0
Currently, Zebra still uses the deprecated Hadoop 0.18 APIs. It needs to be upgraded to the 0.20 APIs.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Attachment: (was: zebra.0112)
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Attachment: zebra.0112
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802548#action_12802548 ]
Hadoop QA commented on PIG-1140:
--------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12430033/zebra.0112
against trunk revision 900926.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 78 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/183/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/183/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/183/console
This message is automatically generated.
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Status: Patch Available (was: Open)
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Attachment: zebra.0209
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Status: Patch Available (was: Open)
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Gaurav Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832287#action_12832287 ]
Gaurav Jain commented on PIG-1140:
----------------------------------
A few suggestions on the implementation:
TableLoader:
-- In the initialize() method, we should use
Configuration conf = new Configuration(false), which creates an empty object.
Configuration conf = new Configuration() populates the object from the default *.xml resources, which may contain conflicting properties.
(good to have)
-- In the seekNear() method, we might want to check tableRecordReader for null. (good to have)
-- In createIndexReader(), since we set the projection, we should not pass a null projection as in
createTableRecordReader(job, null).
It should be createTableRecordReader(job, TableInputFormat.getProjection(job)). (need to have)
-- In setLocation() and getSchema(), if we are handling paths == null, then we might want to check paths.isEmpty() as well. (good to have)
TableStorer:
-- Instead of implementing new classes (TableOutputFormat and TableOutputCommitter), we should use BasicTableOutputFormat and BasicTableOutputFormat.TableOutputCommitter in the zebra mapreduce package. (must have)
(There would be a separate jira/patch to do the same.)
-- Code from storeSchema should go into TableOutputFormat.TableOutputCommitter.cleanupJob().
-- Does Pig call OutputCommitter.abortJob() for failed jobs?
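The combined null and empty check suggested for setLocation() and getSchema() might look roughly like the sketch below. It is illustrative only: it assumes the table paths arrive as a single comma-separated String (the thread does not say what type paths has), and PathGuard and parsePaths are hypothetical names that do not exist in Zebra.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Zebra: rejects both null and empty input.
final class PathGuard {
    private PathGuard() {}

    // Splits a comma-separated location string into trimmed, non-blank paths.
    // Throws IllegalArgumentException when nothing usable is supplied.
    static List<String> parsePaths(String paths) {
        // Assumption: paths is a String; checking isEmpty() alongside null
        // catches the "" case that a null-only check would let through.
        if (paths == null || paths.isEmpty()) {
            throw new IllegalArgumentException("no input paths specified");
        }
        List<String> result = new ArrayList<>();
        for (String p : paths.split(",")) {
            String trimmed = p.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        // Input such as "," or " , " also yields no usable path.
        if (result.isEmpty()) {
            throw new IllegalArgumentException("no input paths specified");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parsePaths("/data/t1, /data/t2"));
    }
}
```

Failing fast with an exception is one possible policy; returning an empty schema or skipping the location would be equally valid choices depending on what the caller expects.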
[jira] Resolved: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yan Zhou resolved PIG-1140.
---------------------------
Resolution: Fixed
Committed to the load-store-redesign branch.
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833127#action_12833127 ]
Yan Zhou commented on PIG-1140:
-------------------------------
+1. Looks ok to me now.
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Attachment: zebra.0213
Same as previous one, but with change requested in the comment area.
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832868#action_12832868 ]
Yan Zhou commented on PIG-1140:
-------------------------------
-1. That's exactly what I meant: having a separate work-horse method. As I said, getSingleSortedSplit clones most of its logic from getSplits(), and this duplicated logic is non-trivial. I don't think the code changes would carry much risk.
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798903#action_12798903 ]
Hadoop QA commented on PIG-1140:
--------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12429913/zebra.0111
against trunk revision 896951.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 78 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
-1 release audit. The applied patch generated 482 release audit warnings (more than the trunk's current 481 warnings).
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/169/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/169/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/169/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/169/console
This message is automatically generated.
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Gaurav Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802441#action_12802441 ]
Gaurav Jain commented on PIG-1140:
----------------------------------
+1
Pig-related Zebra changes have not been migrated to the new Hadoop 0.20 API in this patch. Those will continue to work with the old Hadoop 0.18 API.
Pig is redesigning its interfaces, and those changes will be incorporated into Zebra in the next patch.
Also, in BasicTableOutputFormat the M/R commit interface is a no-op for now in this patch, as it is used exclusively for Pig interfaces.
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833177#action_12833177 ]
Xuefu Zhang commented on PIG-1140:
----------------------------------
Result from Hudson (executed manually on Load-Store-Redesign branch with patch)
[exec]
[exec] There appear to be 507 release audit warnings before the patch and 507 release audit warnings after applying the patch.
[exec]
[exec]
[exec]
[exec]
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 123 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]
[exec]
[exec]
[exec]
[exec] ======================================================================
[exec] ======================================================================
[exec] Finished build.
[exec] ======================================================================
[exec] ======================================================================
[exec]
[exec]
BUILD SUCCESSFUL
Total time: 24 minutes 15 seconds
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Chao Wang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chao Wang updated PIG-1140:
---------------------------
Status: Open (was: Patch Available)
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Gaurav Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832776#action_12832776 ]
Gaurav Jain commented on PIG-1140:
----------------------------------
+1
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Attachment: zebra.0211
Updated based on review feedback.
[jira] Assigned: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Jay Tang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jay Tang reassigned PIG-1140:
-----------------------------
Assignee: Xuefu Zhang
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Attachment: zebra.0212
Final version of the patch.
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832848#action_12832848 ]
Xuefu Zhang commented on PIG-1140:
----------------------------------
Regarding the suggestion on getSingleSortedSplit(): while it has a point, I don't think it's a must-have, especially since we only handle two cases, 1 or -1, and 1 only applies to a sorted table. Thus, separating them clearly makes better sense. If there is any logic duplication, a better way would be to abstract the logic into a common method. Nevertheless, I don't think we have to get this done immediately. Having said that, I'm going to submit a new patch with the unnecessary import mentioned above removed. Thanks.
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Attachment: (was: zebra.0111)
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832787#action_12832787 ]
Yan Zhou commented on PIG-1140:
-------------------------------
TableInputFormat.getSingleSortedSplit(...) clones most of its logic from getSplits; there should be a single work-horse function handling both the generic getSplits functionality and this special single-sorted-split functionality.
A minor issue: "import java.io.Serializable;" is unnecessary in ColumnGroup.java.
Everything else looks ok to me.
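One way to picture the "single work-horse function" idea is the sketch below. It is not Zebra code: SplitPlanner, RowRange, and computeSplits are hypothetical names, and a split is modeled simply as a row range. The only detail borrowed from the thread is that the requested split count is either 1 (a single split over a sorted table) or -1 (no limit).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Zebra code: both public entry points delegate
// to one private method so the splitting logic exists in a single place.
final class SplitPlanner {
    // A split is modeled as a half-open row range [begin, end).
    static final class RowRange {
        final long begin;
        final long end;
        RowRange(long begin, long end) { this.begin = begin; this.end = end; }
    }

    private final long totalRows;
    private final long rowsPerSplit;
    private final boolean sorted;

    SplitPlanner(long totalRows, long rowsPerSplit, boolean sorted) {
        if (totalRows < 1 || rowsPerSplit < 1) {
            throw new IllegalArgumentException("row counts must be positive");
        }
        this.totalRows = totalRows;
        this.rowsPerSplit = rowsPerSplit;
        this.sorted = sorted;
    }

    // Generic case: as many splits as the data requires.
    List<RowRange> getSplits() {
        return computeSplits(-1);
    }

    // Special case: one split covering the whole table, valid only when sorted.
    RowRange getSingleSortedSplit() {
        if (!sorted) {
            throw new IllegalStateException("single split requires a sorted table");
        }
        return computeSplits(1).get(0);
    }

    // Work-horse: maxSplits is -1 for "no limit" or 1 for "everything in one split".
    private List<RowRange> computeSplits(int maxSplits) {
        if (maxSplits != 1 && maxSplits != -1) {
            throw new IllegalArgumentException("maxSplits must be 1 or -1");
        }
        // The two cases differ only in step size; the loop itself is shared.
        long step = (maxSplits == 1) ? totalRows : rowsPerSplit;
        List<RowRange> splits = new ArrayList<>();
        for (long begin = 0; begin < totalRows; begin += step) {
            splits.add(new RowRange(begin, Math.min(begin + step, totalRows)));
        }
        return splits;
    }

    public static void main(String[] args) {
        SplitPlanner planner = new SplitPlanner(10, 4, true);
        System.out.println(planner.getSplits().size() + " splits");
    }
}
```

Keeping the two public methods separate preserves the clear distinction argued for elsewhere in the thread, while the private method removes the duplicated logic.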
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832346#action_12832346 ]
Yan Zhou commented on PIG-1140:
-------------------------------
TableLoader:
seekNear(): should build static info once and only build dynamic data on each call;
getNext(): should not need to make a copy of the Tuple as the returned value;
TableInputFormat:
setProjection(Configuration conf, String projection) seems to be a utility method and should be made private;
createTableRecordReader needs to make sure only one split is generated;
there are several unused "serialVersionUID" constants introduced;
TableRecordWriter:
Should stay inside the BasicTableOutput.java
Constructor: better to build the inserter's name outside the loop; the "patition" appears to be a typo; why not use the original "part-" prefix? Is the sequence number 0-padded at the front when necessary?
TableRecordReader:
nextKeyValue should not absorb the IOException: it should throw it without printing the stack trace.
TableRecordReader:
tableRecordWriter: should not be a member variable;
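The nextKeyValue point above, propagating the IOException rather than absorbing it, can be illustrated with a small sketch. PropagatingReader and RowSource are hypothetical stand-ins, not the Zebra or Hadoop classes; the patch's actual reader code is not shown in this thread.

```java
import java.io.IOException;

// Hypothetical stand-ins, not the Zebra or Hadoop classes.
final class PropagatingReader {
    // Minimal source of rows whose advance() may fail with an I/O error.
    interface RowSource {
        boolean advance() throws IOException;
    }

    private final RowSource source;

    PropagatingReader(RowSource source) {
        this.source = source;
    }

    // Discouraged form: the failure is printed and swallowed, so the caller
    // cannot tell an I/O error apart from a normal end of input.
    boolean nextKeyValueSwallowing() {
        try {
            return source.advance();
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }
    }

    // Form requested in the review: declare the exception and let it
    // reach the caller, which can then fail the task.
    boolean nextKeyValue() throws IOException {
        return source.advance();
    }

    public static void main(String[] args) throws IOException {
        PropagatingReader reader = new PropagatingReader(() -> true);
        System.out.println(reader.nextKeyValue());
    }
}
```

With the swallowing variant a failed read looks like a short but successful input, which risks silently truncated results; propagating the exception makes the failure visible to the framework.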
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Status: Open (was: Patch Available)
[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831090#action_12831090 ]
Xuefu Zhang commented on PIG-1140:
----------------------------------
New submission. It includes changes required for the PIG LOAD/STORE FUNC redesign. As such, the check-in should go to the PIG-LOAD_STORE-REDESIGN branch instead.
[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated PIG-1140:
-----------------------------
Attachment: zebra.0111
[jira] Closed: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai closed PIG-1140.
---------------------------