Posted to yarn-issues@hadoop.apache.org by "Jonathan Hung (JIRA)" <ji...@apache.org> on 2017/02/16 01:53:41 UTC

[jira] [Comment Edited] (YARN-5946) Create YarnConfigurationStore interface and InMemoryConfigurationStore class

    [ https://issues.apache.org/jira/browse/YARN-5946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868965#comment-15868965 ] 

Jonathan Hung edited comment on YARN-5946 at 2/16/17 1:53 AM:
--------------------------------------------------------------

Thanks [~leftnoteasy] for the comments.
bq. It is actually means "last confirmed" transaction id, correct? I found in the step 5 it get increased even if update failed.
It is the txnid below which no logs need to be replayed on recovery. A log falls below it either because it was persisted to the store after a successful refresh, or because the mutation was deemed invalid after a failed refresh (which is why it is incremented even if the update fails). So perhaps confirmMutation(long id) should be confirmMutation(long id, boolean isValid).
bq. So I suggest to persist a transaction-id in addition to "last good" configuration to table-1.
Sure. I think this is implementation dependent, but in general we can have a configuration entry with key="transaction.id" or something similar.
bq. Who will generate "id" for each logItem?
I think the YarnConfigurationStore should maintain the current id and generate new ones, which are returned from logMutation calls. So when MCM receives a mutation, it will log it and get back a newly incremented id. MCM will then try to refresh, and call confirmMutation(id, true) or confirmMutation(id, false) depending on whether the refresh succeeded.

The YarnConfigurationStore can keep an in-memory map of id to LogMutation, so that it can quickly persist the LogMutation into table1 when confirmMutation(id, true) is called.
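
A minimal sketch of the log, refresh, confirm sequence described above, as seen from the MCM side. Everything here is an assumption for illustration: MutationLog and MutationFlowSketch are hypothetical names, a mutation is reduced to a Map<String, String> of updates, and the scheduler refresh is stood in for by a Predicate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical subset of the store API discussed in this comment. */
interface MutationLog {
  /** Logs the mutation and returns the id the store assigned to it. */
  long logMutation(Map<String, String> updates);

  /** Marks the mutation as handled; isValid says whether the refresh succeeded. */
  void confirmMutation(long id, boolean isValid);
}

/** Sketch of the MCM side: log first, then try to refresh, then confirm. */
final class MutationFlowSketch {
  private final MutationLog store;
  // Stand-in for the scheduler refresh; returns false if the conf is rejected.
  private final Predicate<Map<String, String>> refresh;
  // The in-memory configuration kept by MCM.
  private Map<String, String> conf;

  MutationFlowSketch(MutationLog store,
                     Predicate<Map<String, String>> refresh,
                     Map<String, String> initialConf) {
    this.store = store;
    this.refresh = refresh;
    this.conf = new HashMap<>(initialConf);
  }

  /** Applies one mutation and reports whether the refresh accepted it. */
  boolean mutate(Map<String, String> updates) {
    long id = store.logMutation(updates);       // 1. log before applying
    Map<String, String> candidate = new HashMap<>(conf);
    candidate.putAll(updates);
    boolean ok = refresh.test(candidate);       // 2. try to refresh
    if (ok) {
      conf = candidate;                         // keep the new conf only on success
    }
    store.confirmMutation(id, ok);              // 3. confirm in both cases
    return ok;
  }

  /** Returns a copy of the current in-memory configuration. */
  Map<String, String> current() {
    return new HashMap<>(conf);
  }
}
```

Note that confirmMutation is called whether or not the refresh succeeds, which matches the point that the txnid advances even for invalid mutations.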
bq. YarnConfigurationStore#retrieve, does it mean get from table-1 or get from table-1/2/3 (which described by your "for the failover case ..." in your previous comment)? I would prefer the latter one.
On failover, MCM would call retrieve (which returns a "conf") and getPendingMutations, apply each pending mutation to "conf" one by one, and call confirmMutation(pendingMutation.id, true/false) depending on whether the refresh succeeds. So YarnConfigurationStore#retrieve on its own returns the contents of table1, which may not have all logs applied, but MCM reconstructs the up-to-date configuration from getPendingMutations. So I am not sure retrieveLatestConf (the third API in the previous comment) is necessary.

Since MCM keeps an in-memory configuration, YarnConfigurationStore#retrieve and getPendingMutations should only be called once, on failover.
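
The failover replay described above could look roughly like the following. This is a sketch under stated assumptions, not the patch: RecoverableStore, PendingMutation and FailoverRecoverySketch are hypothetical names, the configuration is modeled as a plain Map<String, String>, and the refresh is a Predicate that returns false when the scheduler would reject the configuration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical pending log entry: the id assigned by the store plus its updates. */
final class PendingMutation {
  final long id;
  final Map<String, String> updates;

  PendingMutation(long id, Map<String, String> updates) {
    this.id = id;
    this.updates = new HashMap<>(updates);
  }
}

/** Hypothetical subset of the store API needed for recovery. */
interface RecoverableStore {
  /** Returns the last good configuration (table1). */
  Map<String, String> retrieve();

  /** Returns a snapshot of the unconfirmed mutations in id order (table2). */
  List<PendingMutation> getPendingMutations();

  void confirmMutation(long id, boolean isValid);
}

final class FailoverRecoverySketch {
  private FailoverRecoverySketch() {}

  /** Rebuilds the in-memory conf by replaying each pending mutation once. */
  static Map<String, String> recover(RecoverableStore store,
                                     Predicate<Map<String, String>> refresh) {
    Map<String, String> conf = new HashMap<>(store.retrieve());
    for (PendingMutation pending : store.getPendingMutations()) {
      Map<String, String> candidate = new HashMap<>(conf);
      candidate.putAll(pending.updates);
      boolean ok = refresh.test(candidate);
      if (ok) {
        conf = candidate;                        // keep only mutations that refresh cleanly
      }
      store.confirmMutation(pending.id, ok);     // confirmed either way, so never replayed again
    }
    return conf;
  }
}
```

The sketch assumes getPendingMutations returns a copy, so that confirming entries while iterating does not modify the list being walked.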
So my proposal is:
{noformat}
1) initialize(Configuration conf, Map<String, String> schedConf)
2) retrieve(), which returns the conf stored in table1
3) long logMutation(LogMutation), which saves the new mutation in table2 and returns its id
4) confirmMutation(long id, boolean isValid), which increments the txnid stored in table1 and persists the logged mutation if isValid==true
5) List<LogMutation> getPendingMutations(), which returns the unconfirmed mutations
{noformat}
I think we can add getConfirmedConfHistory in a later patch.
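
To make the five calls listed above concrete, here is a rough in-memory sketch. It is illustrative only and makes several assumptions: the type names carry a Sketch suffix because they are not the classes in the patch, the Hadoop Configuration argument to initialize is dropped so the example stays self-contained, LogMutation is reduced to an id plus a map of updates, and the txnid is modeled as "highest confirmed id".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical mutation record; the real LogMutation may carry more fields. */
final class LogMutation {
  private long id;                               // assigned by the store in logMutation
  private final Map<String, String> updates;

  LogMutation(Map<String, String> updates) {
    this.updates = new HashMap<>(updates);
  }

  long getId() { return id; }
  void setId(long id) { this.id = id; }
  Map<String, String> getUpdates() { return updates; }
}

/** The five proposed calls; signatures beyond those quoted in the comment are guesses. */
interface YarnConfigurationStoreSketch {
  void initialize(Map<String, String> schedConf);
  Map<String, String> retrieve();
  long logMutation(LogMutation mutation);
  void confirmMutation(long id, boolean isValid);
  List<LogMutation> getPendingMutations();
}

final class InMemoryStoreSketch implements YarnConfigurationStoreSketch {
  // "table1": last good configuration.
  private final Map<String, String> table1 = new HashMap<>();
  // "table2": logged but unconfirmed mutations, ordered by id.
  private final TreeMap<Long, LogMutation> table2 = new TreeMap<>();
  private long lastAssignedId = 0;
  // Assumption: tracked as the highest confirmed id.
  private long confirmedTxnId = 0;

  @Override
  public synchronized void initialize(Map<String, String> schedConf) {
    table1.clear();
    table1.putAll(schedConf);
    table2.clear();
  }

  @Override
  public synchronized Map<String, String> retrieve() {
    return new HashMap<>(table1);
  }

  @Override
  public synchronized long logMutation(LogMutation mutation) {
    long id = ++lastAssignedId;
    mutation.setId(id);
    table2.put(id, mutation);
    return id;
  }

  @Override
  public synchronized void confirmMutation(long id, boolean isValid) {
    LogMutation mutation = table2.remove(id);
    if (mutation == null) {
      throw new IllegalArgumentException("No pending mutation with id " + id);
    }
    if (isValid) {
      table1.putAll(mutation.getUpdates());      // persist only valid mutations
    }
    confirmedTxnId = Math.max(confirmedTxnId, id); // advances even when invalid
  }

  @Override
  public synchronized List<LogMutation> getPendingMutations() {
    return new ArrayList<>(table2.values());
  }

  synchronized long getConfirmedTxnId() {
    return confirmedTxnId;
  }
}
```

A persistent implementation would presumably write table1 and the txnid together, so that a crash between the two cannot cause a confirmed mutation to be replayed.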

If no concerns with this approach, will upload patch. Thanks!


was (Author: jhung):
Thanks [~leftnoteasy] for the comments.
bq. It is actually means "last confirmed" transaction id, correct? I found in the step 5 it get increased even if update failed.
It is the txnid for which all logs with a lesser txnid do not need to be replayed on recovery. Either this means the log has been persisted to the store in case of successful refresh, or the mutation has been deemed invalid in case of failure to refresh (which is why it is incremented even if update fails). So in this case perhaps confirmMutation(long id) should be confirmMutation(long id, boolean isValid). 
bq. So I suggest to persist a transaction-id in addition to "last good" configuration to table-1.
Sure, I think this is implementation dependent, in general though we can have a configuration entry with key="transaction.id" or something similar.
bq. Who will generate "id" for each logItem?
I think the YarnConfigurationStore should maintain the current id and generate new ones, which are returned upon logMutation calls. So when MCM receives a mutation, it will log it, which will then return an incremented id "id", then MCM will try to refresh, and will call confirmMutation("id", true/false).

Here the YarnConfigurationStore can store a map of "id" to LogMutation in memory, so it can quickly store the LogMutation into table1 if confirmMutation(id, true) is called.
bq. YarnConfigurationStore#retrieve, does it mean get from table-1 or get from table-1/2/3 (which described by your "for the failover case ..." in your previous comment)? I would prefer the latter one.
On failover MCM would call retrieve (which returns a "conf"), and getPendingMutations, apply each pendingMutation one by one to "conf", and confirmMutation(pendingMutation.id, true/false) if refresh is successful/unsuccessful. So YarnConfigurationStore#retrieve on its own returns from table1 which may not have all logs applied, but MCM will reconstruct the updated configuration from getPendingMutations. So not sure if retrieveLatestConf is necessary (the third API in previous comment).

Since MCM stores an in memory configuration, YarnConfigurationStore#retrieve and getPendingMutations should be only called once, on failover.
So my proposal is: {noformat}1) initialize(Configuration conf, Map<String, String> schedConf);
2) retrieve which returns conf stored in table1
3) logMutation to save the new mutation in table2
4) confirmMutation(long id, boolean isValid) to increment txnid stored in table1, and persist the logged mutation if isValid==true
5) List<LogMutation> getPendingMutations(void) for getting unconfirmed mutations{noformat}
I think we can add getConfirmedConfHistory in a later patch.

If no concerns with this approach, will upload patch. Thanks!

> Create YarnConfigurationStore interface and InMemoryConfigurationStore class
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5946
>                 URL: https://issues.apache.org/jira/browse/YARN-5946
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>         Attachments: YARN-5946.001.patch, YARN-5946-YARN-5734.002.patch
>
>
> This class provides the interface to persist YARN configurations in a backing store.


