Posted to mapreduce-issues@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2012/04/25 23:36:18 UTC

[jira] [Commented] (MAPREDUCE-4148) MapReduce should not have a compile-time dependency on HDFS

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262139#comment-13262139 ] 

Daryn Sharp commented on MAPREDUCE-4148:
----------------------------------------

Very nice!  I've been wanting to do something similar, to allow tokens to decode their own identifiers, for quite a while now.  Thoughts/suggestions:

{{Token#decodeIdentifier()}} would be _really_ useful and would keep callers from having to know about the factory, which very nearly abstracts away the details.  It could delegate to the factory as-is, just query the factory for the class, or maybe {{Token}} could even host the kind/class registration method.  A number of other places in the code, e.g. {{AbstractDelegationTokenSecretManager#renewToken(Token)}}, could benefit from such a method.
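
To make the last variant concrete, here is a rough sketch of a token class hosting the kind/class registration itself.  All names here ({{SketchToken}}, {{SketchIdentifier}}, {{OwnerIdentifier}}, {{registerKind}}) are made up for illustration and are not the patch's or Hadoop's actual API; the identifier is reduced to a single UTF string.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for a token identifier (hypothetical, not Hadoop's class).
abstract class SketchIdentifier {
    abstract void readFields(DataInputStream in) throws IOException;
}

// Example identifier that carries only an owner name.
class OwnerIdentifier extends SketchIdentifier {
    String owner;

    @Override
    void readFields(DataInputStream in) throws IOException {
        owner = in.readUTF();
    }
}

// Simplified stand-in for a token that hosts the kind -> class registration.
class SketchToken {
    private static final Map<String, Class<? extends SketchIdentifier>> KINDS =
        new ConcurrentHashMap<>();

    private final String kind;
    private final byte[] identifier;

    SketchToken(String kind, byte[] identifier) {
        this.kind = kind;
        this.identifier = identifier.clone();
    }

    static void registerKind(String kind, Class<? extends SketchIdentifier> cls) {
        KINDS.put(kind, cls);
    }

    // Returns the decoded identifier, or null if no class is registered for the kind.
    SketchIdentifier decodeIdentifier() throws IOException {
        Class<? extends SketchIdentifier> cls = KINDS.get(kind);
        if (cls == null) {
            return null;
        }
        SketchIdentifier id;
        try {
            id = cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IOException("Cannot instantiate " + cls.getName(), e);
        }
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(identifier))) {
            id.readFields(in);
        }
        return id;
    }
}
```

With something like this, a caller such as a renew path would only ever call {{token.decodeIdentifier()}} and check for null, without touching the factory.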

I'm not too fond of {{AbstractDelegationTokenIdentifier#stringifyToken(Token)}}.  It creates a circular relationship: a {{TokenIdentifier}} really shouldn't have to know about its {{Token}} wrapper.  How about removing the inversion of knowledge and updating {{Token#toString()}} to use {{decodeIdentifier()}}?  If the value is null because the class isn't available, it can print the raw bytes like it does now.
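
Roughly what I have in mind for {{toString()}}, as a sketch only.  {{PrintableToken}} and its decoder hook are hypothetical stand-ins (the hook returns null when the identifier class isn't available), and the output format is an assumption, not what {{Token}} prints today.

```java
import java.util.function.Function;

// Hypothetical stand-in for a token whose toString() prefers the decoded identifier.
final class PrintableToken {
    private final String kind;
    private final byte[] identifier;
    // Assumed decoder hook: returns null when the identifier class is unavailable.
    private final Function<byte[], Object> decoder;

    PrintableToken(String kind, byte[] identifier, Function<byte[], Object> decoder) {
        this.kind = kind;
        this.identifier = identifier.clone();
        this.decoder = decoder;
    }

    Object decodeIdentifier() {
        return decoder.apply(identifier);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("Kind: ").append(kind).append(", Ident: ");
        Object id = decodeIdentifier();
        if (id != null) {
            // The identifier describes itself; it never needs to see the token.
            sb.append('(').append(id).append(')');
        } else {
            // Fallback: print the raw identifier bytes as hex.
            for (byte b : identifier) {
                sb.append(String.format("%02x ", b));
            }
        }
        return sb.toString().trim();
    }
}
```

The dependency then only points one way: the token knows how to ask for an identifier, and the identifier only needs a sensible {{toString()}} of its own.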

{{TokenIdentifierFactory.createIdentifier}} is using {{ReflectionUtils.newInstance(Configuration)}}, whose main purpose is to invoke {{setConf(conf)}} and/or MR's {{configure(conf)}}, neither of which is applicable in this case.  How about directly using {{class.newInstance()}}?
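
A minimal sketch of the direct instantiation, with no {{Configuration}} involved.  {{IdentifierInstantiator}} is a made-up helper name.  It goes through {{getDeclaredConstructor().newInstance()}}, which is the replacement for {{Class#newInstance()}} on newer JDKs (deprecated since Java 9); on the JDKs in use here, {{class.newInstance()}} would do the same job.

```java
// Hypothetical helper: instantiate an identifier class via its no-arg constructor.
final class IdentifierInstantiator {
    private IdentifierInstantiator() {
    }

    static <T> T create(Class<T> cls) {
        try {
            // No Configuration is passed and no setConf()/configure() is invoked.
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "No usable no-arg constructor on " + cls.getName(), e);
        }
    }
}
```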

I think there are pitfalls with using static class initializers for the factory registration.  The identifier class has to be initialized (sorry if I'm stating the obvious: not just imported, but something referenced from it) before a token of that kind can be decoded.  In general this probably means a token has to be created before other tokens can be decoded.  Perhaps the static blocks should become a static class method that's invoked by the secret manager, since we know the secret manager is instantiated before any token manipulation.  However, that limits token identifier decoding to tokens owned by the daemon, which would leave a client out of luck.  Maybe you could get fancy with a class loader.
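
To illustrate the pitfall, a small sketch with made-up names ({{KindRegistry}}, {{LazyIdentifier}}, {{RegistrationDemo}}): the kind is unknown to the registry until something forces the identifier class to initialize, here via {{Class.forName}}.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry mapping a token kind to its identifier class.
final class KindRegistry {
    static final Map<String, Class<?>> KINDS = new ConcurrentHashMap<>();

    private KindRegistry() {
    }

    static Class<?> lookup(String kind) {
        return KINDS.get(kind);
    }
}

// Identifier that registers itself from a static initializer.
class LazyIdentifier {
    static {
        KindRegistry.KINDS.put("LAZY_KIND", LazyIdentifier.class);
    }
}

final class RegistrationDemo {
    private RegistrationDemo() {
    }

    // False as long as nothing has initialized LazyIdentifier yet.
    static boolean knownBeforeInit() {
        return KindRegistry.lookup("LAZY_KIND") != null;
    }

    // A class literal alone does not run static initializers; Class.forName does.
    static boolean knownAfterInit() throws ClassNotFoundException {
        Class.forName(LazyIdentifier.class.getName());
        return KindRegistry.lookup("LAZY_KIND") != null;
    }
}
```

A client that only ever receives tokens would be stuck in the "before" state unless something does the equivalent of that {{Class.forName}} on its behalf; a discovery mechanism such as {{java.util.ServiceLoader}} might be one way to do that, though I haven't thought it through.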
                
> MapReduce should not have a compile-time dependency on HDFS
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-4148
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4148
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: MAPREDUCE-4148.patch, MAPREDUCE-4148.patch
>
>
> MapReduce depends on HDFS's DelegationTokenIdentifier (for printing token debug information). We should remove this dependency and MapReduce's compile-time dependency on HDFS.
