Posted to issues@aurora.apache.org by "Maxim Khutornenko (JIRA)" <ji...@apache.org> on 2014/05/13 01:33:19 UTC

[jira] [Commented] (AURORA-413) aurora update fails if update results in a pending job

    [ https://issues.apache.org/jira/browse/AURORA-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995813#comment-13995813 ] 

Maxim Khutornenko commented on AURORA-413:
------------------------------------------

A failed update is rolled back in order to minimize the entropy of having multiple job configurations active at the same time. Rollback is enabled by default, but you can turn it off by setting UpdateConfig.rollback_on_failure to False in the Aurora job file, which effectively achieves what you are suggesting.
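
For illustration, a job file with rollback disabled might look roughly like the fragment below. This is a sketch, not a tested configuration: the file name, cluster, role, environment, job name, command line, and resource sizes are all placeholders, and only the update_config line relates to this issue. The fragment is meant to be read by the aurora client, which supplies Process, Task, Resources, Job, UpdateConfig, and MB, so it does not run as a standalone Python script.

```python
# hello_world.aurora (hypothetical job file; every name and value is a placeholder)
hello = Process(
  name = 'hello',
  cmdline = 'while true; do echo hello; sleep 10; done')

task = Task(
  name = 'hello',
  processes = [hello],
  resources = Resources(cpu = 0.1, ram = 16*MB, disk = 16*MB))

jobs = [
  Job(
    cluster = 'example',
    role = 'www-data',
    environment = 'devel',
    name = 'hello_world',
    instances = 4,
    task = task,
    # Rollback is on by default; False leaves already-updated
    # instances in place when the update fails.
    update_config = UpdateConfig(rollback_on_failure = False))
]
```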

> aurora update fails if update results in a pending job
> ------------------------------------------------------
>
>                 Key: AURORA-413
>                 URL: https://issues.apache.org/jira/browse/AURORA-413
>             Project: Aurora
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 0.5.0
>            Reporter: Anindya Sinha
>
> Assume I have a job running on the cluster with 2 instances. Let us say both tasks are in RUNNING state.
> At this point, I update my job to bump the number of instances from 2 to 4 and run aurora update. As expected, it leaves the 2 running instances intact and attempts to start instances 3 and 4. Assume instance 3 starts up fine and is in RUNNING state. Suppose that, for some reason, the cluster is in such a state that instance 4 cannot be scheduled immediately (i.e., it is in PENDING state). In that case, the update fails if the task does not move to RUNNING state within 30-45 seconds of the launch attempt. Since the update fails, it kills instance 3 as well.
> aurora update should not fail if a task is in PENDING state, since that task may get scheduled when some other job finishes or fails. Also, if I were to do an aurora create with 4 instances instead, it would keep 3 of them in RUNNING state while the 4th instance stays in PENDING state. So the behavior differs depending on whether we do "aurora create" or "aurora update", which ideally should not be the case.
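
The scenario in the report, and the effect of the rollback setting on it, can be modeled with a toy sketch. This is not Aurora's updater code: finish_update and its arguments are hypothetical names, the watch window is reduced to "whatever state each new instance reached", and the sketch assumes that with rollback disabled a failed update simply leaves instances as they are.

```python
def finish_update(old_ids, new_states, rollback_on_failure=True):
    """Toy model of how an update that adds instances ends.

    old_ids:     instances that were already RUNNING before the update.
    new_states:  state each added instance reached within the watch window.
    Returns (succeeded, resulting instance states).
    """
    old = {i: "RUNNING" for i in old_ids}
    # The update counts as successful only if every added instance is RUNNING.
    succeeded = all(state == "RUNNING" for state in new_states.values())
    if succeeded or not rollback_on_failure:
        # Added instances are left as they are (assumption for the no-rollback case).
        return succeeded, {**old, **new_states}
    # Rollback: the added instances are killed, restoring the old configuration.
    return False, old


# The situation from the report: instance 3 starts, instance 4 stays PENDING.
report = {3: "RUNNING", 4: "PENDING"}
print(finish_update([1, 2], report))
print(finish_update([1, 2], report, rollback_on_failure=False))
```

In this model the update is reported as failed in both cases; the setting only decides whether instance 3 is killed (the default) or kept alongside the pending instance 4.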



--
This message was sent by Atlassian JIRA
(v6.2#6252)