Posted to dev@pig.apache.org by "Yan Zhou (JIRA)" <ji...@apache.org> on 2009/12/09 20:58:18 UTC

[jira] Created: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

[zebra] Use of Hadoop 2.0 APIs  
--------------------------------

                 Key: PIG-1140
                 URL: https://issues.apache.org/jira/browse/PIG-1140
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.6.0
            Reporter: Yan Zhou
             Fix For: 0.7.0


Currently, Zebra still uses the deprecated Hadoop 0.18 APIs. It needs to be upgraded to the 0.20 APIs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Attachment:     (was: zebra.0112)

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Attachment: zebra.0112

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0112
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802548#action_12802548 ] 

Hadoop QA commented on PIG-1140:
--------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12430033/zebra.0112
  against trunk revision 900926.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 78 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/183/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/183/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/183/console

This message is automatically generated.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0112
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Status: Patch Available  (was: Open)

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0112
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Attachment: zebra.0209

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Status: Patch Available  (was: Open)

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0111
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Gaurav Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832287#action_12832287 ] 

Gaurav Jain commented on PIG-1140:
----------------------------------


A few suggestions on the implementation:


TableLoader: 
 -- In the initialize() method, we should use 
      
   Configuration conf = new Configuration(false), which creates an empty object. 
 
   Configuration conf = new Configuration() populates the object from the default XML resources, which may contain conflicting properties. 
 
    (good to have) 
 
 -- In the seekNear() method, we might want to check tableRecordReader for null. (good to have) 
 
 -- In createIndexReader(), since we set the projection, we should not pass a null projection as in 
     createTableRecordReader(job, null). 
     It should be createTableRecordReader(job, TableInputFormat.getProjection(job)). (need to have) 
 
 -- In setLocation() and getSchema(), if we handle paths == null, we might want to check paths.isEmpty() as well. (good to have) 
 
 
 
 
 TableStorer: 
 
 -- Instead of implementing new classes (TableOutputFormat and TableOutputCommitter), we should use BasicTableOutputFormat and BasicTableOutputFormat.TableOutputCommitter in the zebra mapreduce package. (must have) 
 
                                   (There would be a separate jira/patch to do the same.) 
 
 -- Code from storeSchema should go into TableOutputFormat.TableOutputCommitter.cleanupJob(). 
 
 -- Does Pig call OutputCommitter.abortJob() for failed jobs? 
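
As an illustration of the "good to have" guard suggested above for setLocation() and getSchema(), here is a minimal sketch in plain Java. It is not taken from the patch: the PathGuard class, its method names, and the assumption that the paths are held in a List<String> are all hypothetical, chosen only to show a combined null and isEmpty() check.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of Zebra's TableLoader.
final class PathGuard {
    private PathGuard() {}

    /** Returns true only when there is at least one path to work with. */
    static boolean hasPaths(List<String> paths) {
        return paths != null && !paths.isEmpty();
    }

    /** Fails fast with a clear message instead of a later NullPointerException. */
    static List<String> requirePaths(List<String> paths) {
        if (!hasPaths(paths)) {
            throw new IllegalArgumentException("No input paths specified");
        }
        return paths;
    }
}
```

Treating null and empty the same way keeps the callers simple: either both cases are rejected up front, or both are skipped.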
 


> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Resolved: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou resolved PIG-1140.
---------------------------

    Resolution: Fixed

Committed to the load-store-redesign branch.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211, zebra.0212, zebra.0213
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833127#action_12833127 ] 

Yan Zhou commented on PIG-1140:
-------------------------------

+1. Looks ok to me now.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211, zebra.0212, zebra.0213
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Attachment: zebra.0213

Same as the previous one, but with the change requested in the comments.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211, zebra.0212, zebra.0213
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832868#action_12832868 ] 

Yan Zhou commented on PIG-1140:
-------------------------------

-1. That's exactly what I meant: having a separate work-horse method. As I said, getSingleSortedSplit clones most of its logic from getSplits(), and this duplicated logic is non-trivial. I don't think the code changes would carry much risk. 

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211, zebra.0212
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798903#action_12798903 ] 

Hadoop QA commented on PIG-1140:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12429913/zebra.0111
  against trunk revision 896951.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 78 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 release audit.  The applied patch generated 482 release audit warnings (more than the trunk's current 481 warnings).

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/169/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/169/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/169/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/169/console

This message is automatically generated.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0111
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Gaurav Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802441#action_12802441 ] 

Gaurav Jain commented on PIG-1140:
----------------------------------


+1 

Pig-related Zebra changes have not been migrated to the new Hadoop 20 API in this patch. Those will continue to work with the old Hadoop 18 API.

Pig is redesigning its interfaces, and those changes will be incorporated into Zebra in the next patch.

Also, in BasicTableOutputFormat the M/R commit interface is a no-op for now in this patch, as it is used exclusively for the Pig interfaces.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0112
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833177#action_12833177 ] 

Xuefu Zhang commented on PIG-1140:
----------------------------------

Result from Hudson (executed manually on Load-Store-Redesign branch with patch)

     [exec] 
     [exec] There appear to be 507 release audit warnings before the patch and 507 release audit warnings after applying the patch.
     [exec] 
     [exec] 
     [exec] 
     [exec] 
     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 123 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
     [exec] 
     [exec] 
     [exec] 
     [exec] 
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]     Finished build.
     [exec] ======================================================================
     [exec] ======================================================================
     [exec] 
     [exec] 

BUILD SUCCESSFUL
Total time: 24 minutes 15 seconds


> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211, zebra.0212, zebra.0213
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Chao Wang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Wang updated PIG-1140:
---------------------------

    Status: Open  (was: Patch Available)

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0112
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Gaurav Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832776#action_12832776 ] 

Gaurav Jain commented on PIG-1140:
----------------------------------

 
+1

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Attachment: zebra.0211

Updated based on review feedback.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Assigned: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Jay Tang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Tang reassigned PIG-1140:
-----------------------------

    Assignee: Xuefu Zhang

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Attachment: zebra.0212

Final version of the patch.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211, zebra.0212
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832848#action_12832848 ] 

Xuefu Zhang commented on PIG-1140:
----------------------------------

Regarding the suggestion on getSingleSortedSplit(): while it has a point, I don't think it's a must-have, especially when we only handle two cases, 1 or -1, and 1 only applies to a sorted table. Thus, separating them clearly makes better sense. If there is any logic duplication, a better way would be to abstract the logic into a common method. Nevertheless, at this point I don't think we have to get this done immediately. Having said that, I'm going to submit a new patch with the unnecessary import mentioned above removed. Thanks.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Attachment:     (was: zebra.0111)

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0112
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832787#action_12832787 ] 

Yan Zhou commented on PIG-1140:
-------------------------------

TableInputFormat.getSingleSortedSplit(...) clones most of its logic from getSplits(); there should be a single work-horse function handling both the generic getSplits functionality and this special single-sorted-split functionality.

A minor issue:  "import java.io.Serializable;" is unnecessary in ColumnGroup.java

Everything else looks OK to me.
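
To make the "single work-horse function" idea concrete, here is a rough sketch in plain Java. It does not reproduce Zebra's TableInputFormat: the RangeSplit and SplitPlanner classes, the use of an array of block lengths as input, and the rule "one split per block" for the generic case are all assumptions made for illustration. The point is only that both public entry points delegate to one private method, so the offset arithmetic exists in a single place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a split; not Zebra's actual split class.
final class RangeSplit {
    final long start;
    final long length;

    RangeSplit(long start, long length) {
        this.start = start;
        this.length = length;
    }
}

// Hypothetical planner showing two entry points over one shared method.
final class SplitPlanner {
    private SplitPlanner() {}

    /** Generic case (assumed here): one split per block. */
    static List<RangeSplit> getSplits(long[] blockLengths) {
        return plan(blockLengths, false);
    }

    /** Special case for a sorted table: a single split covering every block. */
    static RangeSplit getSingleSortedSplit(long[] blockLengths) {
        return plan(blockLengths, true).get(0);
    }

    /** Work-horse shared by both entry points. */
    private static List<RangeSplit> plan(long[] blockLengths, boolean single) {
        List<RangeSplit> splits = new ArrayList<>();
        long offset = 0;
        for (long len : blockLengths) {
            if (!single) {
                splits.add(new RangeSplit(offset, len));
            }
            offset += len;
        }
        if (single) {
            // After the loop, offset equals the total length of all blocks.
            splits.add(new RangeSplit(0, offset));
        }
        return splits;
    }
}
```

With this shape, a later fix to how offsets are computed applies to both the generic and the single-split path at once, which is the risk the duplicated logic would otherwise carry.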

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Yan Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832346#action_12832346 ] 

Yan Zhou commented on PIG-1140:
-------------------------------

TableLoader:

   seekNear(): should build static info once and only build dynamic data for each and every call;
   getNext():     should not need to make a copy of Tuple as a returned value;

TableInputFormat:

  setProjection(Configuration conf, String projection)  seems to be a utility method and should be made private
  createTableRecordReader  needs to make sure only one split is generated
  several unused "serialVersionUID" constants are introduced;
 
TableRecordWriter:

  Should stay inside the BasicTableOutput.java
  Constructor: better to build the inserter's name outside the loop; the "patition" appears to be a typo; why not use the original "part-" prefix? Is the sequence number 0-padded at the front when necessary?

TableRecordReader:

  nextKeyValue should not absorb the IOException: it should throw it without printing the stack trace.

TableRecordReader:

  tableRecordWriter:  should not be a member variable;
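
The nextKeyValue point above can be sketched as follows. This is plain Java, not the Zebra TableRecordReader: the RowSource interface and SketchRecordReader class are hypothetical stand-ins for the reader and its underlying scanner. In Hadoop's new-API RecordReader, nextKeyValue() is declared to throw IOException (and InterruptedException), so propagating the exception fits the method's contract.

```java
import java.io.IOException;

// Hypothetical row source standing in for the underlying table scanner.
interface RowSource {
    boolean advance() throws IOException;
}

// Hypothetical reader contrasting the two ways of handling a read failure.
final class SketchRecordReader {
    private final RowSource source;

    SketchRecordReader(RowSource source) {
        this.source = source;
    }

    /** Anti-pattern: the failure is printed, then reported as end-of-input. */
    boolean nextKeyValueSwallowing() {
        try {
            return source.advance();
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }
    }

    /** Suggested shape: let the IOException reach the caller unchanged. */
    boolean nextKeyValue() throws IOException {
        return source.advance();
    }
}
```

The swallowing variant makes a failed read indistinguishable from a normal end of data, so a task could finish "successfully" with partial output; the propagating variant lets the framework see the failure and handle it.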

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Status: Open  (was: Patch Available)

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0112
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831090#action_12831090 ] 

Xuefu Zhang commented on PIG-1140:
----------------------------------

New submission. It includes the changes required for the PIG LOAD/STORE FUNC redesign. As such, the check-in should go to the PIG-LOAD_STORE-REDESIGN branch instead.

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Updated: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Xuefu Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-1140:
-----------------------------

    Attachment: zebra.0111

> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>             Fix For: 0.7.0
>
>         Attachments: zebra.0111
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.



[jira] Closed: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai closed PIG-1140.
---------------------------


> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209, zebra.0211, zebra.0212, zebra.0213
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to upgrade to its 2.0 APIs.
