Posted to mapreduce-issues@hadoop.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2013/09/04 06:04:51 UTC

[jira] [Commented] (MAPREDUCE-5329) APPLICATION_INIT is never sent to AuxServices other than the builtin ShuffleHandler

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13757448#comment-13757448 ] 

Siddharth Seth commented on MAPREDUCE-5329:
-------------------------------------------

bq. I think the code for that should be similar to what I wrote above in my 24/Jun/13 comment.
This JIRA has been open a long time!
Yep, it does look like the change will be something along those lines. I don't particularly like the fact that the ShuffleHandler is being used to serialize data for additional aux services. Unless the jobToken is required, how about sending an empty byte buffer to ensure the additional service is initialized?
Considering how the ShuffleHandler is tied into TaskAttemptImpl as well as ContainerLauncher for the port information, I think this should just be left as it is, i.e. the ShuffleHandler will always be used. To configure an additional ShuffleProvider, add a separate config as in your previous patch. That's really an 'advanced' config, with restrictions on how it can be used in the MR case: any service data returned by the plugin is not used, and service data sent to the plugin is either the jobToken or an empty ByteBuffer.
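
To make the suggestion concrete, here is a rough, self-contained sketch of how the service data map could be populated. It is not the actual TaskAttemptImpl code: the class name, the method, the boolean flag and the service ID string are all made up for illustration, and the list of additional service IDs is assumed to come from the separate 'advanced' config mentioned above.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: every name here is a stand-in, not a real Hadoop class or config key.
class ServiceDataSketch {
    // Placeholder for ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID; the real value lives in Hadoop.
    static final String BUILTIN_SHUFFLE_SERVICE_ID = "builtin-shuffle";

    // Builds the per-service payloads that would accompany container launch.
    static Map<String, ByteBuffer> buildServiceData(
            ByteBuffer serializedJobToken,
            List<String> additionalServiceIds,
            boolean additionalServicesNeedJobToken) {
        Map<String, ByteBuffer> serviceData = new LinkedHashMap<>();

        // The built-in ShuffleHandler is always registered and always gets the job token.
        serviceData.put(BUILTIN_SHUFFLE_SERVICE_ID, serializedJobToken.duplicate());

        // Additional services get either the job token or an empty buffer,
        // which is enough to trigger their initialization.
        for (String serviceId : additionalServiceIds) {
            ByteBuffer payload = additionalServicesNeedJobToken
                    ? serializedJobToken.duplicate()
                    : ByteBuffer.allocate(0);
            serviceData.put(serviceId, payload);
        }
        return serviceData;
    }
}
```

Using duplicate() gives each service its own position and limit over the same token bytes, so one consumer reading its buffer does not affect another.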
                
> APPLICATION_INIT is never sent to AuxServices other than the builtin ShuffleHandler
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5329
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5329
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.4-alpha
>            Reporter: Avner BenHanoch
>
> APPLICATION_INIT is never sent to AuxServices other than the built-in ShuffleHandler.  This means that 3rd party ShuffleProvider(s) will not be able to function, because APPLICATION_INIT enables the AuxiliaryService to map jobId->userId. This mapping is needed to properly find the MOFs (map output files) of a job in response to reducers' requests.
> NOTE: The built-in ShuffleHandler does get APPLICATION_INIT events due to a hard-coded expression in the Hadoop code. The current TaskAttemptImpl.java code explicitly calls serviceData.put(ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID, ...) and ignores any additional AuxiliaryService. As a result, only the built-in ShuffleHandler will get APPLICATION_INIT events.  Any 3rd party AuxiliaryService will never get APPLICATION_INIT events.
> I think this can be solved in one of two ways:
> 1. Change TaskAttemptImpl.java to loop over all auxiliary services and register each of them by calling serviceData.put(…) in a loop.
> 2. Change AuxServices.java similarly to the fix in MAPREDUCE-2668, "APPLICATION_STOP is never sent to AuxServices".  This means that when the 'handle' method gets an APPLICATION_INIT event, it will demultiplex it to all aux services regardless of the value in event.getServiceID().
> I prefer the 2nd solution.  I welcome any ideas.  I can provide the needed patch for whichever option people like.
> See [Pluggable Shuffle in Hadoop documentation|http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html]
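
For reference, the difference between dispatching by service ID and the demultiplexing described in option 2 can be sketched roughly as below. This is a simplified stand-in, not the real AuxServices.java: the AuxService interface, the register method and both handler methods are invented for illustration, and application IDs are plain strings here.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: a toy dispatcher, not the NodeManager's AuxServices class.
class AuxDispatchSketch {
    // Hypothetical, minimal view of an auxiliary service.
    interface AuxService {
        void initApp(String user, String appId, ByteBuffer data);
    }

    private final Map<String, AuxService> services = new LinkedHashMap<>();

    void register(String serviceId, AuxService service) {
        services.put(serviceId, service);
    }

    // Dispatch by service ID: only the service named in the event is initialized.
    void handleInitTargeted(String serviceId, String user, String appId, ByteBuffer data) {
        AuxService service = services.get(serviceId);
        if (service != null) {
            service.initApp(user, appId, data.duplicate());
        }
    }

    // Option 2: ignore the service ID and fan the event out to every registered service.
    void handleInitBroadcast(String user, String appId, ByteBuffer data) {
        for (AuxService service : services.values()) {
            service.initApp(user, appId, data.duplicate());
        }
    }
}
```

With the broadcast variant every registered service learns the jobId->userId mapping, but each one also receives a payload that was serialized for a single service, which is the concern raised in the comment above.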
