Posted to issues@aurora.apache.org by "Igor Morozov (JIRA)" <ji...@apache.org> on 2016/07/29 20:44:20 UTC

[jira] [Commented] (AURORA-1721) Support user initiated rollback

    [ https://issues.apache.org/jira/browse/AURORA-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399975#comment-15399975 ] 

Igor Morozov commented on AURORA-1721:
--------------------------------------

Let's move the discussion here, as you suggested. The only reason we need rollback from a terminal state is to support global, multi-datacenter workflow updates: when the upgrade fails in any datacenter, we need to roll everything back to the initial state.
We can do that in Aurora using a slightly different approach:
1. Limit the scope of a rollback to job updates in the ROLLING_FORWARD and ROLL_FORWARD_AWAITING_PULSE states.
2. Extend the semantics of pulsed updates to support another mode of operation: automatic pausing between state transitions in a job update:

/** Job update thresholds and limits. */
struct JobUpdateSettings {
....
  10: optional i32 blockIfNoPulsesBetweenStateTransitions
}

So if an update is started in this extended pulsed mode, it will be automatically paused before reaching the terminal state ROLLED_FORWARD. The transitions to the remaining terminal states (ABORTED, FAILED, ROLLED_BACK, ERROR) will stay as they are now.
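
To make the two points above concrete, here is a minimal Python sketch of the intended behavior. It is not Aurora scheduler code. The status names come from this thread; the function names, the choice of ROLL_FORWARD_AWAITING_PULSE as the state in which the update is held, and the reading of blockIfNoPulsesBetweenStateTransitions as a millisecond timeout are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    ROLLING_FORWARD = "ROLLING_FORWARD"
    ROLL_FORWARD_AWAITING_PULSE = "ROLL_FORWARD_AWAITING_PULSE"
    ROLLED_FORWARD = "ROLLED_FORWARD"
    ROLLED_BACK = "ROLLED_BACK"
    ABORTED = "ABORTED"
    FAILED = "FAILED"
    ERROR = "ERROR"


# Point 1: rollback is only allowed from these two non-terminal states.
ROLLBACK_ALLOWED = frozenset(
    {Status.ROLLING_FORWARD, Status.ROLL_FORWARD_AWAITING_PULSE}
)


def can_rollback(status: Status) -> bool:
    """Hypothetical check a rollback request would run first."""
    return status in ROLLBACK_ALLOWED


def next_status(
    proposed: Status,
    block_ms: Optional[int],
    ms_since_last_pulse: Optional[int],
) -> Status:
    """Point 2: hold the update instead of letting it reach ROLLED_FORWARD.

    block_ms stands in for blockIfNoPulsesBetweenStateTransitions
    (assumed to be milliseconds; None means the mode is off).
    ms_since_last_pulse is None when no pulse has been received yet.
    """
    # Other terminal states (ABORTED, FAILED, ROLLED_BACK, ERROR) are unchanged.
    if proposed is not Status.ROLLED_FORWARD or block_ms is None:
        return proposed
    pulse_is_fresh = (
        ms_since_last_pulse is not None and ms_since_last_pulse <= block_ms
    )
    # Assumption: "paused" means waiting in ROLL_FORWARD_AWAITING_PULSE,
    # which keeps the update eligible for rollback under point 1.
    return proposed if pulse_is_fresh else Status.ROLL_FORWARD_AWAITING_PULSE
```

With this shape, a multi-datacenter coordinator would pulse each update to let it finish only once every datacenter has succeeded; until then each update stays in a state from which it can still be rolled back.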

I'm open to other suggestions for solving our use case in Aurora.

> Support user initiated rollback 
> --------------------------------
>
>                 Key: AURORA-1721
>                 URL: https://issues.apache.org/jira/browse/AURORA-1721
>             Project: Aurora
>          Issue Type: Task
>          Components: Scheduler
>            Reporter: Igor Morozov
>            Assignee: Igor Morozov
>              Labels: Uber
>             Fix For: 0.16.0
>
>
> The proposal to support user initiated rollback:
> 1. Create a new thrift API:
>   /** Rollback job update. */
>   Response rollbackJobUpdate(
>       /** The update to rollback. */
>       1: JobUpdateKey key,
>       /** A user-specified message to include with the induced job update state change. */
>       3: string message)
> 2. Implement the new API in the scheduler so that it simply undoes the latest JobUpdate, effectively trying to apply initialState to the job. If that is impossible for some reason, the rollback will fail with an appropriate error message.
> 3. Support a new aurora client command 'rollback'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)