Posted to issues@cloudstack.apache.org by "Koushik Das (JIRA)" <ji...@apache.org> on 2013/12/19 08:11:07 UTC

[jira] [Resolved] (CLOUDSTACK-5470) XenServer - Host continues to remain in "Up" state when powered off due to exception - "Unable to reset master of slave"

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-5470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koushik Das resolved CLOUDSTACK-5470.
-------------------------------------

    Resolution: Cannot Reproduce

Snapshots have no impact on the status of the host.
This also looks like an XS setup issue, similar to CLOUDSTACK-2428, where the hosts in a cluster are not time-synced.

Please reopen if you still see this issue after fixing the time sync problem.
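
To illustrate the time sync check: a minimal Python sketch, assuming clock readings have already been collected from each host at roughly the same moment (for example by running `date` on each one; how they are collected is outside this sketch). The function names, the sample readings, and the one-second tolerance are illustrative assumptions, not CloudStack or XenServer settings.

```python
from datetime import datetime, timedelta


def max_clock_skew(host_times):
    """Return the largest difference between any two reported host clocks."""
    times = list(host_times.values())
    return max(times) - min(times)


def hosts_in_sync(host_times, tolerance=timedelta(seconds=1)):
    """True if all host clocks agree within the (assumed) tolerance."""
    return max_clock_skew(host_times) <= tolerance


# Hypothetical readings for the two hosts named in the logs.
readings = {
    "10.223.59.66": datetime(2013, 12, 4, 18, 13, 15),
    "10.223.59.67": datetime(2013, 12, 4, 18, 15, 40),
}

skew = max_clock_skew(readings)
print("skew:", skew, "in sync:", hosts_in_sync(readings))
```

If the reported skew exceeds the tolerance, fix time synchronization on the hosts before retesting.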

> XenServer - Host continues to remain in "Up" state when powered off due to exception - "Unable to reset master of slave"
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5470
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5470
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Assignee: Koushik Das
>             Fix For: 4.3.0
>
>
> Setup:
> Advanced zone setup with 2 XenServer 6.2 hosts.
> There were about 5 VMs on each host.
> Hourly snapshots were scheduled for the ROOT volumes of all the VMs.
> Shut down the master host (I powered off the host machine using IPMI).
> The host continues to show as being in the "Up" state.
> Both hosts show their status as "Up" in the cloud platform, and the following exception is seen in the logs:
> 2013-12-04 18:13:15,510 WARN [c.c.a.m.DirectAgentAttache] (DirectAgent-59:ctx-0bb6ad1e) Seq 1-2071592963: Exception Caught while executing command
> com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: Failed to read server's response: connect timed out
> at com.cloud.hypervisor.xen.resource.XenServerConnectionPool.PoolEmergencyResetMaster(XenServerConnectionPool.java:443)
> at com.cloud.hypervisor.xen.resource.XenServerConnectionPool.connect(XenServerConnectionPool.java:661)
> at com.cloud.hypervisor.xen.resource.CitrixResourceBase.getConnection(CitrixResourceBase.java:5985)
> at com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:8248)
> at com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:587)
> at com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:59)
> at com.cloud.hypervisor.xen.resource.XenServer610Resource.executeRequest(XenServer610Resource.java:106)
> at com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:216)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> 2013-12-04 18:13:15,511 DEBUG [c.c.a.m.DirectAgentAttache] (DirectAgent-59:ctx-0bb6ad1e) Seq 1-2071592963: Response Received:
> 2013-12-04 18:13:15,511 DEBUG [c.c.a.t.Request] (DirectAgent-59:ctx-0bb6ad1e) Seq 1-2071592963: Processing: { Ans: , MgmtId: 112516401760401, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: Failed to read server's response: connect timed out","wait":0}}] }
> Also, all the snapshots that were sent to the host that is currently down fail with the following exception:
> 2013-12-04 18:10:24,856 DEBUG [o.a.c.s.s.SnapshotServiceImpl] (Job-Executor-27:ctx-6cb3f72a ctx-dedf771a) create snapshot TestVM-1_ROOT-3_20131204231014 failed: com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: Failed to read server's response: connect timed out
> 2013-12-04 18:10:24,867 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy] (Job-Executor-27:ctx-6cb3f72a ctx-dedf771a) Failed to take snapshot: com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: Failed to read server's response: connect timed out
> 2013-12-04 18:10:24,872 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor-27:ctx-6cb3f72a ctx-dedf771a) Failed to create snapshot
> com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Unable to reset master of slave 10.223.59.66 to 10.223.59.67 due to org.apache.xmlrpc.XmlRpcException: Failed to read server's response: connect timed out
> at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:281)
> at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
> at sun.reflect.GeneratedMethodAccessor334.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> at $Proxy160.takeSnapshot(Unknown Source)
> at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1342)
> at com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1402)
> at sun.reflect.GeneratedMethodAccessor333.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> at $Proxy232.takeSnapshot(Unknown Source)
> at org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:181)
> at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:161)
> at com.cloud.api.ApiAsyncJobDispatcher.runJobInContext(ApiAsyncJobDispatcher.java:109)
> at com.cloud.api.ApiAsyncJobDispatcher$1.run(ApiAsyncJobDispatcher.java:66)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:63)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:520)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> 2013-12-04 18:10:24,883 DEBUG [o.a.c.s.v.VolumeServiceImpl] (Job-Executor-27:ctx-6cb3f72a ctx-dedf771a) Take snapshot: 3 failed
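
The same failure message recurs throughout the logs above. As a rough aid for scanning a management server log for it, here is a minimal Python sketch that pulls the slave address, the intended master address, and the underlying cause out of one such message. The regular expression and the function name are assumptions made for illustration; they are based only on the message text quoted in this report.

```python
import re

# Pattern derived from the message text in this report (an assumption,
# not a format guaranteed by CloudStack).
RESET_FAILURE = re.compile(
    r"Unable to reset master of slave (?P<slave>\d+\.\d+\.\d+\.\d+) "
    r"to (?P<master>\d+\.\d+\.\d+\.\d+) due to (?P<cause>.+)"
)


def parse_reset_failure(line):
    """Return a dict with slave, master and cause, or None if no match."""
    match = RESET_FAILURE.search(line)
    return match.groupdict() if match else None


line = (
    "com.cloud.utils.exception.CloudRuntimeException: Unable to reset master "
    "of slave 10.223.59.66 to 10.223.59.67 due to "
    "org.apache.xmlrpc.XmlRpcException: Failed to read server's response: "
    "connect timed out"
)

info = parse_reset_failure(line)
print(info["slave"], "->", info["master"], "|", info["cause"])
```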



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)