Posted to mapreduce-issues@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2013/09/11 17:07:52 UTC

[jira] [Commented] (MAPREDUCE-5332) Support token-preserving restart of history server

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764384#comment-13764384 ] 

Daryn Sharp commented on MAPREDUCE-5332:
----------------------------------------

+HistoryServerFileSystemStateStore+
# Suggest: It may be clearer to rename the {{TOKEN_FOO_PREFIX}} constants to be {{TOKEN_FOO_DIR_PREFIX}} or {{TOKEN_*_FILE_PREFIX}}.
# Suggest: I'd consider not having the hardcoded {{ROOT_STATE_DIR_NAME}} added to the user's path configured by {{MR_HS_FS_STATE_STORE_URI}}.  Is there an advantage to not using exactly what the user specified?  Up to you.
# Question: In {{startStateStorage}}, why 2 mkdirs instead of 1?  A mkdir via {{createDir}} is only going to check the permissions of the leaf dir which seems dubious.  If any of the parent dirs are owned by another user with open permissions, the directory created by the JHS can be deleted and recreated with open permissions.  Point is I'm not sure the extra checks add value, but I suppose they don't hurt.  Up to you. 
# Bug: Unlike {{HistoryServerMemStateStore}}, there appear to be no checks for things being added twice - although arguably those checks all belong in ADTSM.  Token check is there, but not a secret check.  I think the state stores should behave consistently.
# Bug: In {{getBucketPath}}, I think you want to mod (%) the seq number instead of dividing?  Otherwise it creates a janitorial job for someone to clean up empty directories.
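
To illustrate the {{getBucketPath}} point, here is a minimal sketch of the difference between dividing and taking the remainder. The class name, the {{NUM_BUCKETS}} value and the {{tb_}} directory prefix are assumptions made for illustration and are not taken from the patch.

```java
// Hypothetical sketch: names, bucket count and prefix are illustrative assumptions.
final class TokenBucketSketch {
    static final int NUM_BUCKETS = 1000; // assumed bucket count

    // Division: the bucket index keeps growing with the sequence number,
    // so old buckets are never reused once their tokens are removed.
    static String bucketByDivision(int seq) {
        return String.format("tb_%d", seq / NUM_BUCKETS);
    }

    // Modulo: the sequence number always maps into a fixed set of
    // NUM_BUCKETS directories, which are reused as numbers increase.
    static String bucketByModulo(int seq) {
        return String.format("tb_%03d", Math.floorMod(seq, NUM_BUCKETS));
    }
}
```

With the modulo form the number of bucket directories is bounded, so nothing has to clean up abandoned empty directories later.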

+HistoryServerStateStore+
# Suggest: For clarity, perhaps rename to {{HistoryServerStateStore*Service*}}.  I kept getting it confused with {{HistoryServerState}}.
# Suggest: I'd consider removing the dtsm's recover method.  Perhaps {{loadState}} can take the dtsm as an argument and directly populate it instead of populating an intermediary {{HistoryServerState}} object before populating the dtsm.
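
A rough sketch of the {{loadState}} idea, where the store feeds recovered entries straight into the secret manager with no intermediary state object. All class and method names below are hypothetical stand-ins, not the real dtsm or state store APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the delegation token secret manager ("dtsm"); not Hadoop's class.
class SketchSecretManager {
    final Map<Integer, String> masterKeys = new LinkedHashMap<>();
    final Map<String, Long> tokenRenewDates = new LinkedHashMap<>();

    void recoverMasterKey(int keyId, String keyMaterial) {
        masterKeys.put(keyId, keyMaterial);
    }

    void recoverToken(String tokenId, long renewDate) {
        tokenRenewDates.put(tokenId, renewDate);
    }
}

// Stand-in for a state store whose persisted entries are held in memory here.
class SketchLoadingStore {
    final Map<Integer, String> storedKeys = new LinkedHashMap<>();
    final Map<String, Long> storedTokens = new LinkedHashMap<>();

    // Populate the secret manager directly instead of returning
    // an intermediate state object for the caller to replay.
    void loadState(SketchSecretManager mgr) {
        storedKeys.forEach(mgr::recoverMasterKey);
        storedTokens.forEach(mgr::recoverToken);
    }
}
```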

+JobHistoryServer+
# Bug: Is the {{stateStore}} going to be started twice?  Once in {{startService}} if {{recoveryEnabled}}, again by {{super#startService}} when it iterates the composite services?
# Bug: Should the state store service be started after being recovered?  Not before?
# Suggest: Perhaps the {{stateStore}} should be conditionally created & registered in {{serviceInit}}, with all of the state loading done in {{stateStore#start}}, which the composite service invokes.  Then there's no need for additional logic in {{JHS#serviceStart}}.  Just an idea.
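
The shape I have in mind is roughly the following. It uses simplified stand-in classes rather than Hadoop's real service classes, so every name here is an assumption. The store is registered only when recovery is enabled and is started exactly once, by the parent's child-service loop.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT Hadoop's service classes.
interface SketchService {
    void start();
}

class SketchStartableStore implements SketchService {
    int startCount = 0;
    boolean stateLoaded = false;

    @Override
    public void start() {
        startCount++;
        stateLoaded = true; // state loading happens as part of the store's own start
    }
}

class SketchHistoryServer {
    private final List<SketchService> children = new ArrayList<>();
    final SketchStartableStore stateStore; // null when recovery is disabled

    // Plays the role of init: create and register the store only if needed.
    SketchHistoryServer(boolean recoveryEnabled) {
        stateStore = recoveryEnabled ? new SketchStartableStore() : null;
        if (stateStore != null) {
            children.add(stateStore);
        }
    }

    // Plays the role of start: no special-casing of the store,
    // it is started once along with the other registered children.
    void start() {
        for (SketchService child : children) {
            child.start();
        }
    }
}
```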

+JobHistoryStateStoreFactory+
# Bug? It seems a bit odd that if recovery is enabled but there's no class defined, a {{HistoryServerNullStateStore}} is created.  It appears {{JHS#serviceStart}} will then fail when it calls {{loadState}} and an {{UnsupportedOperationException}} is thrown.  The null store seems to have no real value other than deferring an error from {{JHS#serviceInit}} to {{JHS#serviceStart}}?
# Suggest: It feels like {{JobHistoryServer}} should only create & register a state store if required - which ties in with the prior comment in JHS.  {{serviceInit}} only asks the factory for a state store if recovery is enabled.  The factory throws if no class is defined.
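
As a sketch of the fail-fast factory behavior, assuming a plain map in place of the real configuration object and a made-up config key (neither is from the patch):

```java
import java.util.Map;

// Hypothetical sketch; the key name and the Map-based "configuration" are assumptions.
final class SketchStateStoreFactory {
    static final String STORE_CLASS_KEY = "sketch.state.store.class"; // made-up key

    // Only called when recovery is enabled, so a missing class is an error
    // reported at init time rather than deferred to a null store at start time.
    static Object createStore(Map<String, String> conf) {
        String className = conf.get(STORE_CLASS_KEY);
        if (className == null || className.trim().isEmpty()) {
            throw new IllegalStateException(
                "Recovery is enabled but " + STORE_CLASS_KEY + " is not set");
        }
        try {
            return Class.forName(className.trim()).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

The caller would ask the factory for a store only if recovery is enabled, which removes the need for a null store altogether.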
                
> Support token-preserving restart of history server
> --------------------------------------------------
>
>                 Key: MAPREDUCE-5332
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5332
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: jobhistoryserver
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: MAPREDUCE-5332-2.patch, MAPREDUCE-5332-3.patch, MAPREDUCE-5332.patch
>
>
> To better support rolling upgrades through a cluster, the history server needs the ability to restart without losing track of delegation tokens.
