Posted to users@cloudstack.apache.org by Nik Martin <ni...@nfinausa.com> on 2013/01/17 06:08:41 UTC

Get system VM out of hung migrating state

Ok, the saga of the hung console proxy VM continues, and I really need 
to get this cleaned up.  After a hard reboot and having to force the 
hypervisor online via the database, my console proxy VM is still hung in 
the Migrating state, so CS essentially ignores it:


2013-01-16 23:03:38,763 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-492:null) Seq 3-1375993858: Executing request
2013-01-16 23:03:39,155 WARN  [xen.resource.CitrixResourceBase] 
(DirectAgent-492:null) The VM is now missing marking it as Stopped v-11-VM
2013-01-16 23:03:39,155 DEBUG [agent.manager.DirectAgentAttache] 
(DirectAgent-492:null) Seq 3-1375993858: Response Received:
2013-01-16 23:03:39,155 DEBUG [agent.transport.Request] 
(DirectAgent-492:null) Seq 3-1375993858: Processing:  { Ans: , MgmtId: 
130577622632, via: 3, Ver: v1, Flags: 10, 
[{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{"v-11-VM":{"t":"4bafdf37-d0e7-4652-90dd-afa6f9099598","u":"Stopped"}},"_isExecuted":false,"result":true,"wait":0}}] 
}
2013-01-16 23:03:39,160 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
(DirectAgent-492:null) VM v-11-VM: cs state = Migrating and realState = 
Stopped
2013-01-16 23:03:39,160 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
(DirectAgent-492:null) VM v-11-VM: cs state = Migrating and realState = 
Stopped
2013-01-16 23:03:39,160 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
(DirectAgent-492:null) Skipping vm in migrating state: 
VM[ConsoleProxy|v-11-VM]
2013-01-16 23:03:44,793 DEBUG [agent.manager.AgentManagerImpl] 
(AgentManager-Handler-2:null) Ping from 13


How do I get this VM back online, and do any developers need anything to 
try to determine what is wrong?  The status of system VMs and hosts is 
VERY fragile in CS, and there seems to be little error recovery in 
place.  CS should not just ignore a VM in a migrating state.  I have a 
migratewait setting of 3600 seconds, but this VM has been in this state 
for more than 24 hours, across multiple management server restarts.
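
For anyone who wants to check whether other VMs are stuck the same way, here is a minimal sketch that scans management-server log text for the "cs state = ... and realState = ..." lines shown in the excerpt above and reports the VMs where the two disagree. The function name and the sample string are made up for illustration; the regular expression is based only on the log format quoted in this message, so it may need adjusting for other versions.

```python
import re

# Matches the VirtualMachineManagerImpl sync line quoted in the log excerpt.
# \s+ between tokens tolerates the line wrapping that mail clients add.
STATE_LINE = re.compile(
    r"VM\s+(?P<name>\S+):\s+cs\s+state\s+=\s+(?P<cs>\w+)"
    r"\s+and\s+realState\s+=\s+(?P<real>\w+)"
)


def find_state_mismatches(log_text):
    """Return {vm_name: (cs_state, real_state)} for VMs whose states differ."""
    mismatches = {}
    for match in STATE_LINE.finditer(log_text):
        if match.group("cs") != match.group("real"):
            mismatches[match.group("name")] = (
                match.group("cs"),
                match.group("real"),
            )
    return mismatches


# Sample text copied from the log excerpt (wrapped the way it was mailed).
sample = """2013-01-16 23:03:39,160 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(DirectAgent-492:null) VM v-11-VM: cs state = Migrating and realState =
Stopped"""

print(find_state_mismatches(sample))
```

This only reads the log; it does not change anything in CS.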
-- 

Regards,

Nik

Nik Martin
nfina Technologies, Inc.
+1.251.243.0043 x1003
http://nfinausa.com
Relentless Reliability

RE: Get system VM out of hung migrating state

Posted by Tamas Monos <ta...@veber.co.uk>.
Hi,

I am not sure if I will be able to help you at all, but will try.
Are you using 3.0.2? Can you not just kill the Console Proxy VM from the CS interface with 'Destroy'? 3.0.2 will automatically re-create it.
If you are using something later than 3.0.2, do not attempt this, as later versions (4.0.x) have a tendency not to rebuild the system VM.
If it does not allow you because of the VM state, you could edit the database, mark the system VM as Destroyed, and put a date next to it. Then restart the management server.
Of course, MAKE a BACKUP before you do that.
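
A rough sketch of what that database edit could look like, written against an in-memory SQLite table so it can be run safely without touching a real installation. Everything specific here is an assumption to check against your own schema before trying anything similar: the table name `vm_instance`, the `state` and `removed` columns, and the helper name `mark_destroyed` are illustrative, and a real CloudStack database is MySQL, where the Python drivers use `%s` placeholders instead of `?`.

```python
import sqlite3
from datetime import datetime, timezone


def mark_destroyed(conn, vm_name, expected_state="Migrating"):
    """Mark one VM row as Destroyed and stamp the time it was removed.

    Only a row that is still in expected_state and has no removal date is
    touched, so a VM that has already recovered is left alone.
    Returns the number of rows changed.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "UPDATE vm_instance SET state = 'Destroyed', removed = ? "
        "WHERE name = ? AND state = ? AND removed IS NULL",
        (now, vm_name, expected_state),
    )
    conn.commit()
    return cur.rowcount


# Stand-in table (assumed layout, NOT the real CloudStack schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vm_instance ("
    "id INTEGER PRIMARY KEY, name TEXT, state TEXT, removed TEXT)"
)
conn.execute(
    "INSERT INTO vm_instance (name, state) VALUES ('v-11-VM', 'Migrating')"
)

changed = mark_destroyed(conn, "v-11-VM")
row = conn.execute(
    "SELECT state, removed FROM vm_instance WHERE name = 'v-11-VM'"
).fetchone()
print(changed, row[0])
```

The WHERE clause is deliberately narrow (name, current state, and no existing removal date) so that running it twice, or against the wrong VM, changes nothing.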

I had to do a similar thing when my test environment self-destructed :/

I've been running 3.0.2 with VMware non-stop for nearly a year now, and my system VMs and CS are both fine.
I was always struggling with Xen/KVM clusters because CS wants to manage the cluster itself and does not really do a great job of it, so I went the VMware way, where CS takes a 'hands-off' approach and lets vCenter do the clustering job.
I know this does not help you right now, just a thing to consider.

Regards

Tamas Monos                                               DDI         +44(0)2034687012
Chief Technical                                             Office    +44(0)2034687000
Veber: The Hosting Specialists               Fax         +44(0)871 522 7057
http://www.veber.co.uk

Follow us on Twitter: www.twitter.com/veberhost
Follow us on Facebook: www.facebook.com/veberhost
