
Posted to notifications@accumulo.apache.org by "Ed Kohlwey (JIRA)" <ji...@apache.org> on 2012/10/11 16:11:03 UTC

[jira] [Created] (ACCUMULO-804) Hadoop 2.0 Support

Ed Kohlwey created ACCUMULO-804:
-----------------------------------

             Summary: Hadoop 2.0 Support
                 Key: ACCUMULO-804
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-804
             Project: Accumulo
          Issue Type: Improvement
          Components: client
    Affects Versions: 1.5.0, 1.4.3
            Reporter: Ed Kohlwey
            Assignee: Billie Rinaldi


We should start thinking about Hadoop 2 support now that it is Cloudera's recommended distribution and many new Hadoop users will probably be adopting it.

When I first investigated this a few months ago, the biggest barrier seemed to be that all the Map/Reduce-related tests are implemented using pseudo-private constructors from Hadoop 1.0 that are no longer present in Hadoop 2.0.

The main strategy to fix this should probably be to adopt the Map/Reduce cluster test object for testing the various Accumulo input formats instead of instrumenting them directly. I have used this convenience object successfully on tests utilizing MockInstance, so I think it should work fine.

There may also be some filesystem API issues but I don't think they will be too severe.

The other main issue is that, once we start supporting both, we will need to actually deploy on Hadoop 1 and 2 and run the integration tests against each, so release testing will be a headache that we should think through.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ACCUMULO-804) Hadoop 2.0 Support

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ACCUMULO-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475701#comment-13475701 ] 

Drew Farris commented on ACCUMULO-804:
--------------------------------------

Trying to figure out the best way to approach Maven packaging to support both Hadoop 1.x and Hadoop 2.x. One idea is to employ a Maven module that uses profiles to specify which Hadoop dependencies to use, with those dependencies inherited transitively by the other modules.
                

[jira] [Updated] (ACCUMULO-804) Hadoop 2.0 Support

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ACCUMULO-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Drew Farris updated ACCUMULO-804:
---------------------------------

    Labels: hackathon  (was: )
    

[jira] [Commented] (ACCUMULO-804) Hadoop 2.0 Support

Posted by "John Vines (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ACCUMULO-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475704#comment-13475704 ] 

John Vines commented on ACCUMULO-804:
-------------------------------------

Hadoop isn't a compiled dependency. Couldn't we keep Hadoop 1 as the default at compile time, but have a testing procedure of some sort that exercises both?
                

[jira] [Commented] (ACCUMULO-804) Hadoop 2.0 Support

Posted by "Billie Rinaldi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ACCUMULO-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475659#comment-13475659 ] 

Billie Rinaldi commented on ACCUMULO-804:
-----------------------------------------

The test issues have already been addressed by creating a ContextFactory for ACCUMULO-564.  I would be in favor of fixing the MapReduce tests anyway and getting rid of the ContextFactory.

The remaining issues appear to be:

1) Pulling JobTracker information for org.apache.accumulo.server.monitor.servlets.DefaultServlet.  We could potentially pull ResourceManager information instead, and do something with reflection to determine when to do that (like with the ContextFactory) or we could remove that information from the Accumulo monitor.

2) Instantiating SocketInputStream in org.apache.accumulo.core.util.TTimeoutTransport.  This class should be instantiated with NetUtils.getInputStream, which also exists in Hadoop 1.  While we're at it, we might as well replace SocketOutputStream with NetUtils.getOutputStream, although it isn't broken in Hadoop 2.
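
The "do something with reflection" idea in item 1 could look roughly like the sketch below. It is only an illustration, not the ContextFactory code or the eventual fix: the class HadoopVersionProbe and its methods are hypothetical, and the two Hadoop class names are assumptions that should be checked against the releases actually targeted. The ResourceManager class is probed first on the assumption that some Hadoop 2 artifacts may still ship a class under the old JobTracker name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Sketch: pick a monitor data source by probing the classpath with reflection. */
public final class HadoopVersionProbe {

    // Assumed class names; verify them against the Hadoop versions in use.
    public static final String HADOOP1_JOB_TRACKER = "org.apache.hadoop.mapred.JobTracker";
    public static final String HADOOP2_RESOURCE_MANAGER =
        "org.apache.hadoop.yarn.server.resourcemanager.ResourceManager";

    private HadoopVersionProbe() {}

    /** Returns true if the named class can be located; the class is not initialized. */
    public static boolean isPresent(String className) {
        try {
            Class.forName(className, false, HadoopVersionProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the first candidate class name found on the classpath, if any. */
    public static Optional<String> firstPresent(List<String> candidates) {
        return candidates.stream().filter(HadoopVersionProbe::isPresent).findFirst();
    }

    /** Hypothetical helper: which scheduler the monitor should query, or "none". */
    public static String schedulerSource() {
        return firstPresent(Arrays.asList(HADOOP2_RESOURCE_MANAGER, HADOOP1_JOB_TRACKER))
            .map(name -> name.equals(HADOOP2_RESOURCE_MANAGER) ? "ResourceManager" : "JobTracker")
            .orElse("none");
    }

    public static void main(String[] args) {
        System.out.println("Scheduler information source: " + schedulerSource());
    }
}
```

Probing by class name keeps the compile-time dependency on Hadoop 1 while letting the same jar decide at runtime what to display. If neither class is found, the monitor could simply omit that section, which matches the second option in item 1.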
                

[jira] [Commented] (ACCUMULO-804) Hadoop 2.0 Support

Posted by "John Vines (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ACCUMULO-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474196#comment-13474196 ] 

John Vines commented on ACCUMULO-804:
-------------------------------------

I really think this is something we should at least try to get into 1.5.
                

[jira] [Commented] (ACCUMULO-804) Hadoop 2.0 Support

Posted by "Christopher Tubbs (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ACCUMULO-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475559#comment-13475559 ] 

Christopher Tubbs commented on ACCUMULO-804:
--------------------------------------------

If this change would result in Accumulo not working on Apache Hadoop 1.x, I think that could be disruptive enough to require a jump to version 2.0.0 ourselves. It would be very strange if you simply couldn't run Accumulo 1.5.x on the same HDFS instance that worked fine for 1.4.x. Ideally, though, we could try to support both for 1.5.0+.
                