Posted to issues@cloudstack.apache.org by "Koushik Das (JIRA)" <ji...@apache.org> on 2014/02/17 06:43:20 UTC

[jira] [Created] (CLOUDSTACK-6124) During MS maintenance unfinished work items are not cleaned up resulting in them getting repeated for every subsequent maintenance

Koushik Das created CLOUDSTACK-6124:
---------------------------------------

             Summary: During MS maintenance unfinished work items are not cleaned up resulting in them getting repeated for every subsequent maintenance
                 Key: CLOUDSTACK-6124
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-6124
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.3.0
            Reporter: Koushik Das
            Assignee: Koushik Das
             Fix For: 4.4.0


During management server (MS) shutdown, all of its pending work items (op_it_work.step != 'Done') are picked up by another MS in the cluster. For each pending work item, the new MS checks whether the VM is running and, if it is not, tries to start it (using the same mechanism that is used for HA of VMs). If the investigators find that the VM is still alive, no action is needed. This completes the processing of all pending work items.

There appears to be a bug in the code: op_it_work.step is not marked as 'Done' in the above case, so the work items remain pending indefinitely. As a result, every time the MS owning these work items is shut down, the work items are picked up by another MS and the steps described above are repeated.
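
The takeover flow and the missing update can be illustrated with a minimal Java sketch. This is not the CloudStack code: WorkItem, TakeoverSketch, takeOver and the isVmAlive predicate are hypothetical names introduced only for illustration, and the sketch assumes the intended behaviour is to mark the step as 'Done' once the VM is found alive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory stand-in for a row in op_it_work (not the real schema).
class WorkItem {
    final long vmId;
    String step;   // e.g. "Release" or "Done"
    String owner;  // management server that currently owns the item

    WorkItem(long vmId, String step, String owner) {
        this.vmId = vmId;
        this.step = step;
        this.owner = owner;
    }

    boolean isPending() {
        return !"Done".equals(step);
    }
}

class TakeoverSketch {
    // Moves pending items from one MS to another and returns the ids of VMs
    // that are not alive and therefore need to be restarted.
    static List<Long> takeOver(List<WorkItem> items, String fromMs, String toMs,
                               Predicate<Long> isVmAlive) {
        List<Long> toRestart = new ArrayList<>();
        for (WorkItem item : items) {
            if (!fromMs.equals(item.owner) || !item.isPending()) {
                continue; // only pending items of the MS being shut down
            }
            item.owner = toMs;
            if (isVmAlive.test(item.vmId)) {
                // Assumed fix: close the item so it is not picked up again.
                item.step = "Done";
            } else {
                // Stays pending here; the restart path would close it later.
                toRestart.add(item.vmId);
            }
        }
        return toRestart;
    }
}
```

Without the assignment to item.step in the alive branch, the item would still be pending after the takeover and would be examined again at the next MS shutdown.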

Scenario in which a pending work item may be created: if a VM deployment fails, the type and step are set to 'Starting' and 'Release' respectively. Ideally, if the operation ends gracefully, the step is updated to 'Done'. But if there is an abrupt termination, the step may remain 'Release' for some work items. The step is then never updated to 'Done' for these items, and they are retried every time a new MS takes over.
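
The repetition can be shown with a small Java sketch that counts how often a single work item is re-examined across successive MS takeovers. StaleItemDemo and countReexaminations are hypothetical names, and the sketch assumes the VM is found alive at every check.

```java
class StaleItemDemo {
    // Counts how many of the given takeovers re-examine one work item.
    // markDoneWhenAlive models whether the step is updated to "Done"
    // after the VM is found alive (assumed fix) or left unchanged (bug).
    static int countReexaminations(String initialStep, int takeovers,
                                   boolean markDoneWhenAlive) {
        String step = initialStep;
        int examined = 0;
        for (int i = 0; i < takeovers; i++) {
            if ("Done".equals(step)) {
                break; // a finished item is no longer pending
            }
            examined++;
            if (markDoneWhenAlive) {
                step = "Done";
            }
        }
        return examined;
    }
}
```

In this model, an item left in 'Release' is examined at every takeover when the step is never updated, and only once when the step is set to 'Done' after the first check.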





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)