Posted to users@cloudstack.apache.org by Nik Martin <ni...@nfinausa.com> on 2013/01/16 21:44:34 UTC

Migrating VMs from Host in Alert state

I have to physically reboot a XenServer host that has gotten disconnected
from CS somehow.  I have restarted the api toolstack, but CS won't
reconnect.  I can ssh to the server fine, and xenapi is running fine.  Is
there any way to live migrate VMs to another host via the xe CLI without
totally breaking CloudStack?  I'm trying everything I can not to have to
reboot a host full of VMs because CS won't reconnect to the hypervisor.

Regards,

Nik

Nik Martin
+1.251.243.0043 x1003
Relentless Reliability
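
For reference, the xe CLI does have a live-migration command. The sketch below only builds the command lines for listing the guests on a host and live migrating them; it does not run anything. It assumes the standard `xe vm-list` and `xe vm-migrate` syntax. The function names and the UUID placeholders are hypothetical. Nothing in this thread confirms how CloudStack reacts when VMs are moved outside of it, so this illustrates the xe side only.

```python
import shlex


def list_resident_vms_cmd(source_host_uuid):
    """Build the xe command that lists running guest VM UUIDs on a host."""
    return [
        "xe", "vm-list",
        f"resident-on={source_host_uuid}",
        "is-control-domain=false",  # leave out dom0
        "power-state=running",
        "params=uuid",
        "--minimal",  # one comma-separated line of UUIDs
    ]


def parse_minimal_uuids(output):
    """Split the comma-separated output of an xe --minimal call."""
    return [item.strip() for item in output.split(",") if item.strip()]


def live_migrate_cmd(vm_uuid, dest_host_uuid):
    """Build the xe command that live migrates one VM to another host."""
    return [
        "xe", "vm-migrate",
        f"uuid={vm_uuid}",
        f"host-uuid={dest_host_uuid}",
        "live=true",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute real UUIDs from `xe host-list`.
    print(shlex.join(list_resident_vms_cmd("SOURCE-HOST-UUID")))
    for vm in parse_minimal_uuids("VM-UUID-1,VM-UUID-2\n"):
        print(shlex.join(live_migrate_cmd(vm, "DEST-HOST-UUID")))
```

The commands are printed rather than executed so they can be reviewed first and run one VM at a time on a pool member.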


Re: Migrating VMs from Host in Alert state

Posted by Nik Martin <ni...@nfinausa.com>.
On 01/16/2013 11:52 PM, Nik Martin wrote:
> On 01/16/2013 11:45 PM, Nik Martin wrote:
>> On 01/16/2013 11:27 PM, Koushik Das wrote:

If a VM ever gets stuck in a "Migrating" state, CS never attempts to
clean it up, even after the migrate interval passes.

>>> Can you check from the MS logs if CS is trying to reconnect and
>>> failing with exception or reconnect is not even attempted?
>> It is trying to reconnect and failing and the error is that the XAPI is
>> not available, but it is working fine. I have since forced a reboot,
>> which killed all my VMs since I could not migrate them off.
>>


-- 

Regards,

Nik

Nik Martin
nfina Technologies, Inc.
+1.251.243.0043 x1003
http://nfinausa.com
Relentless Reliability

Re: Migrating VMs from Host in Alert state

Posted by Nik Martin <ni...@nfinausa.com>.
On 01/16/2013 11:45 PM, Nik Martin wrote:
> On 01/16/2013 11:27 PM, Koushik Das wrote:
>> Can you check from the MS logs if CS is trying to reconnect and
>> failing with exception or reconnect is not even attempted?
> It is trying to reconnect and failing and the error is that the XAPI is
> not available, but it is working fine. I have since forced a reboot,
> which killed all my VMs since I could not migrate them off.
>

Here are the errors prior to the reboot:

2013-01-15 18:15:16,012 WARN  [cloud.vm.VirtualMachineManagerImpl]
(DirectAgent-232:null) Cleanup failed due to Exception:
com.cloud.utils.exception.CloudRuntimeException
Message: Unable to reset master of slave 172.16.5.3 to 172.16.5.4 due to
org.apache.xmlrpc.XmlRpcException: Failed to create input stream: Error
writing to server
Stack: com.cloud.utils.exception.CloudRuntimeException: Unable to reset
master of slave 172.16.5.3 to 172.16.5.4 due to
org.apache.xmlrpc.XmlRpcException: Failed to create input stream: Error
writing to server
         at com.cloud.hypervisor.xen.resource.XenServerConnectionPool.PoolEmergencyResetMaster(XenServerConnectionPool.java:439)
         at com.cloud.hypervisor.xen.resource.XenServerConnectionPool.connect(XenServerConnectionPool.java:657)
         at com.cloud.hypervisor.xen.resource.CitrixResourceBase.getConnection(CitrixResourceBase.java:4872)
         at com.cloud.hypervisor.xen.resource.XenServer56Resource.execute(XenServer56Resource.java:167)
         at com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:67)
         at com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:187)
         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
         at java.lang.Thread.run(Thread.java:679)

2013-01-15 18:15:16,012 DEBUG [agent.transport.Request] 
(RouterMonitor-1:null) Seq 1-1375404851: Received:  { Ans: , MgmtId: 
130577622632, via: 1, Ver: v1, Flags: 10, { NetworkUsageAnswer } }
2013-01-15 18:15:16,012 WARN  [cloud.vm.VirtualMachineManagerImpl] 
(RouterMonitor-1:null) Cleanup failed due to Exception: 
com.cloud.utils.exception.CloudRuntimeException
Message: Unable to reset master of slave 172.16.5.3 to 172.16.5.4 due to 
org.apache.xmlrpc.XmlRpcException: Failed to create input stream: Error 
writing to server
Stack: com.cloud.utils.exception.CloudRuntimeException: Unable to reset 
master of slave 172.16.5.3 to 172.16.5.4 due to org.apache.xmlrpc.XmlRpcExc:


This is after the xapi restart, with the host showing an error in
maintenance state in CS.


-- 

Regards,

Nik

Nik Martin
nfina Technologies, Inc.
+1.251.243.0043 x1003
http://nfinausa.com
Relentless Reliability

Re: Migrating VMs from Host in Alert state

Posted by Nik Martin <ni...@nfinausa.com>.
On 01/16/2013 11:27 PM, Koushik Das wrote:
> Can you check from the MS logs if CS is trying to reconnect and failing with exception or reconnect is not even attempted?
It is trying to reconnect and failing and the error is that the XAPI is 
not available, but it is working fine. I have since forced a reboot, 
which killed all my VMs since I could not migrate them off.



-- 

Regards,

Nik

Nik Martin
nfina Technologies, Inc.
+1.251.243.0043 x1003
http://nfinausa.com
Relentless Reliability

RE: Migrating VMs from Host in Alert state

Posted by Koushik Das <ko...@citrix.com>.
Can you check from the MS logs if CS is trying to reconnect and failing with exception or reconnect is not even attempted?

-Koushik
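
One way to answer that from the management server log is to count the reconnect failures it records. The sketch below is a minimal example, assuming the log contains `Message: Unable to reset master of slave ... to ...` lines at the start of a line, as in the excerpts posted elsewhere in this thread. The function name is hypothetical, and the log file path is passed in by the user because it is not given in the thread.

```python
import re
import sys
from collections import Counter

# Matches the "Message:" line of the failure seen in this thread.
# Only "Message:" lines are counted, so the matching "Stack:" line
# for the same failure is not counted a second time.
RESET_MASTER = re.compile(
    r"^Message: Unable to reset master of slave (\S+) to (\S+) due to"
)


def summarize_reconnect_failures(lines):
    """Count failed master resets per (slave, master) address pair."""
    counts = Counter()
    for line in lines:
        match = RESET_MASTER.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py <path-to-management-server-log>
    with open(sys.argv[1], errors="replace") as handle:
        for (slave, master), n in summarize_reconnect_failures(handle).items():
            print(f"{n} failed reset(s): slave {slave} -> master {master}")
```

A non-zero count suggests CS is attempting to reconnect and failing with an exception. No matches is less conclusive: it may mean no reconnect was attempted, or that the failure is logged in a different form.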
