Posted to dev@cloudstack.apache.org by Pranav Saxena <pr...@citrix.com> on 2012/07/05 17:05:21 UTC

Review Request: CS-13376:Vm is stuck in Stopping state when MS is rebooted after the stop command was issued, but answer wasn't received from the backend yet

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5767/
-----------------------------------------------------------

Review request for cloudstack, Abhinandan Prateek and Koushik Das.


Description
-------

1) Have just one Xen host.
2) Have user VMs running on the backend.
3) Issue a Stop command and allow it to reach the backend, but stop the MS before the answer comes back.
4) Start the management server.

The VM is stuck in the Stopping state after the management server restarts.


This addresses bug CS-13376.


Diffs
-----

  server/src/com/cloud/vm/VirtualMachineManagerImpl.java 561e5b2 

Diff: https://reviews.apache.org/r/5767/diff/


Testing
-------

Debugged the code by adding System.exit statements to track the response from the backend, and tested the change on my local CloudStack setup.


Thanks,

Pranav Saxena


RE: Review Request: CS-13376:Vm is stuck in Stopping state when MS is rebooted after the stop command was issued, but answer wasn't received from the backend yet

Posted by Anthony Xu <Xu...@citrix.com>.
Thanks for providing the patch.

But this patch may introduce a race condition in clustered management servers.

Consider the situation below.

1. There are two management servers, M1 and M2.
2. M1 stops a VM. The VM is stopped on the backend, but the StopCommand hasn't returned yet, so the VM is still in the Stopping state.
3. M2 is restarted and sets the VM to the Stopped state before the StopCommand returns in M1.
4. In M1, the StopCommand returns and M1 tries to set the VM to the Stopped state. This fails because M1 sees that someone else has changed the VM state, so the UI reports a stop VM failure.
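
To make the sequence above concrete, here is a minimal sketch of it in Java. It assumes the VM state is updated with an optimistic compare-and-set, which is only a stand-in for however the state transition is really guarded. All names (StopRaceSketch, VmRecord, transition, replay) are hypothetical and are not taken from VirtualMachineManagerImpl.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical names throughout; this is an illustration, not CloudStack's code.
class StopRaceSketch {
    enum State { RUNNING, STOPPING, STOPPED }

    // Stand-in for the VM row shared by both management servers.
    static final class VmRecord {
        private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);

        // Optimistic transition: succeeds only if the state is still what the caller expects.
        boolean transition(State expected, State next) {
            return state.compareAndSet(expected, next);
        }

        State state() {
            return state.get();
        }
    }

    // Replays steps 2 to 4 and returns whether M1's final update succeeded.
    static boolean replay(VmRecord vm) {
        vm.transition(State.RUNNING, State.STOPPING);        // step 2: M1 sends the StopCommand
        vm.transition(State.STOPPING, State.STOPPED);        // step 3: M2 restarts and cleans up
        return vm.transition(State.STOPPING, State.STOPPED); // step 4: M1's answer arrives
    }

    public static void main(String[] args) {
        VmRecord vm = new VmRecord();
        boolean m1Succeeded = replay(vm);
        System.out.println("final state: " + vm.state());
        System.out.println("M1 update succeeded: " + m1Succeeded);
    }
}
```

In this sketch the VM ends up in STOPPED, yet M1's own update returns false, which corresponds to the stop failure that the UI would report in step 4.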


I think the approach below might be safer.

In CloudStack, there is an async job queue that records all async jobs, such as StopCommand.
Normally, only the MS that owns a job can mark the VM as Stopped.
Each job should have a timeout; if the job times out, another MS can mark the VM as Stopped.
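
The ownership rule could look roughly like the sketch below. It is only an illustration under assumed names: AsyncJob, ownerMsId, startedAt, timeout and mayMarkStopped are hypothetical and do not reflect the real async job schema or API.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical names throughout; this is an illustration, not CloudStack's code.
class JobOwnershipSketch {
    // Stand-in for one entry in the async job queue.
    static final class AsyncJob {
        final String ownerMsId;
        final Instant startedAt;
        final Duration timeout;

        AsyncJob(String ownerMsId, Instant startedAt, Duration timeout) {
            this.ownerMsId = ownerMsId;
            this.startedAt = startedAt;
            this.timeout = timeout;
        }

        boolean isTimedOut(Instant now) {
            return now.isAfter(startedAt.plus(timeout));
        }
    }

    // The owner may always finish its own job; any other MS only after the timeout.
    static boolean mayMarkStopped(AsyncJob job, String msId, Instant now) {
        return job.ownerMsId.equals(msId) || job.isTimedOut(now);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2012-07-05T15:05:00Z");
        AsyncJob job = new AsyncJob("M1", start, Duration.ofMinutes(10));
        System.out.println(mayMarkStopped(job, "M1", start.plusSeconds(60)));  // owner, before timeout
        System.out.println(mayMarkStopped(job, "M2", start.plusSeconds(60)));  // other MS, before timeout
        System.out.println(mayMarkStopped(job, "M2", start.plusSeconds(660))); // other MS, after timeout
    }
}
```

With a rule like this, a restarted M2 would leave the VM alone while M1's job is still live, and would only clean it up once the job has timed out. The 10-minute timeout is an arbitrary value chosen for the example.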

Thanks,
Anthony




> -----Original Message-----
> From: Abhinandan Prateek [mailto:noreply@reviews.apache.org] On Behalf
> Of Abhinandan Prateek
> Sent: Friday, July 06, 2012 2:32 AM
> To: Koushik Das; Abhinandan Prateek
> Cc: cloudstack; Pranav Saxena
> Subject: Re: Review Request: CS-13376:Vm is stuck in Stopping state
> when MS is rebooted after the stop command was issued, but answer
> wasn't received from the backend yet
> 
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5767/#review8913
> -----------------------------------------------------------
> 
> Ship it!
> 
> 
> Ship It!
> 
> - Abhinandan Prateek
> 
> 


Re: Review Request: CS-13376:Vm is stuck in Stopping state when MS is rebooted after the stop command was issued, but answer wasn't received from the backend yet

Posted by Abhinandan Prateek <ap...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5767/#review8913
-----------------------------------------------------------

Ship it!


Ship It!

- Abhinandan Prateek




Re: Review Request: CS-13376:Vm is stuck in Stopping state when MS is rebooted after the stop command was issued, but answer wasn't received from the backend yet

Posted by Abhinandan Prateek <ap...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5767/#review8912
-----------------------------------------------------------

Ship it!


ok

- Abhinandan Prateek

