You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@cloudstack.apache.org by "Likitha Shetty (JIRA)" <ji...@apache.org> on 2013/12/27 10:48:51 UTC

[jira] [Comment Edited] (CLOUDSTACK-5504) Vmware-Primary store unavailable for 10 mts - All snapshot tasks reported failure because of timing out after 20 minutes.But the snapshot process continues to succeed in Vmcenter after NFS was brought up.

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850187#comment-13850187 ] 

Likitha Shetty edited comment on CLOUDSTACK-5504 at 12/27/13 9:48 AM:
----------------------------------------------------------------------

Sangeetha, if this issue is VMware related then it could be related to bug - [https://issues.apache.org/jira/browse/CLOUDSTACK-5519] where the task that failed on CloudStack after 20 minutes continued to execute in vCenter server.


was (Author: likithas):
Sangeetha, can you confirm if this issue is VMware specific because in the logs i see only XenServer related failures.

2013-12-13 17:13:44,440 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Failed to create snapshot 
com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 422576170 to Host 5 timed out after 43200 
        at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:275) 
        at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:135) 
        at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:294) 

Also, if this issue is VMware related then it could be related to bug - [https://issues.apache.org/jira/browse/CLOUDSTACK-5519] where the task that failed on CloudStack after 20 minutes continued to execute in vCenter server.

> Vmware-Primary store unavailable for 10 mts - All snapshot tasks reported failure because of timing out after 20 minutes.But the snapshot process continues to succeed in Vmcenter after NFS was brought up.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5504
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5504
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Assignee: Likitha Shetty
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: primarydown.rar
>
>
> Setup:
> Advanced zone set up with 2 5.1 ESXI host.
> 1. Deploy few Vms in each of the hosts  , so we start with 11 Vms.
> 2. Create snapshot for ROOT volumes.
> 3. When snapshot is still in progress , Make the primary storage unavailable for 10 mts.
> 4. Bring up the primary store after more than 10 mts.
> When the primary store was brought up , I see the snapshots that were in progress actually continue to download to secondary and succeed . 
> one of the snapshots that succeeded and fully available in secondary store:
> root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]# ls -ltr
> total 446808
> -rw-r--r--. 1 root root      6454 Dec 13 21:09 1c545037-1d1c-4927-918a-2f3975e1076b.ovf
> -rw-r--r--. 1 root root 457069056 Dec 13 21:09 1c545037-1d1c-4927-918a-2f3975e1076b-disk0.vmdk
> [root@Rack3Host8 1c545037-1d1c-4927-918a-2f3975e1076b]#
> But all the 11 snapshot tasks from Cloud Stack side report failure after about 20 minutes and then snapshots are put in "CreatedOnPrimary" state.
> Next scheduled hourly snapshot is attempted and succeeds.
> |        22 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        21 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        20 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        19 | CreatedOnPrimary | 2013-12-13 21:52:15 | NULL                |
> |        18 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        17 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        16 | CreatedOnPrimary | 2013-12-13 21:52:16 | NULL                |
> |        14 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        25 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        24 | CreatedOnPrimary | 2013-12-13 21:52:17 | NULL                |
> |        23 | CreatedOnPrimary | 2013-12-13 21:52:18 | NULL                |
> |        22 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        21 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        20 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        19 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        18 | BackedUp         | 2013-12-13 22:42:15 | NULL                |
> |        17 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        16 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        14 | BackedUp         | 2013-12-13 22:42:16 | NULL                |
> |        25 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> |        24 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> |        23 | BackedUp         | 2013-12-13 22:42:17 | NULL                |
> +-----------+------------------+---------------------+---------------------+
> 86 rows in set (0.00 sec)
> 2013-12-13 16:52:17,720 DEBUG [c.c.a.t.Request] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Sending  { Cmd , MgmtId: 95307354844397, via: 5(s-10-VM), Ver: v1, Flags: 100011, [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait":21600}}] }
> 2013-12-13 17:13:44,250 DEBUG [c.c.a.m.AgentAttache] (AgentConnectTaskPool-3:ctx-1a3a5155) Seq 5-422576
> 170: Sending disconnect to class com.cloud.agent.manager.SynchronousListener
> 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Waiting some more time because this is the current command
> 2013-12-13 17:13:44,252 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Waiting some more time because this is the current command
> 2013-12-13 17:13:44,252 INFO  [c.c.u.e.CSExceptionErrorCode] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Could not find exception: com.cloud.exception.OperationTimedoutException in error code list for exceptions
> 2013-12-13 17:13:44,256 WARN  [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Timed out on Seq 5-422576170:  { Cmd , MgmtId: 95307354844397, via: 5(s-10-VM), Ver: v1, Flags: 100011, [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"ffa0b125-d1d9-4524-bd9e-03178914845b","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"0476f8f4-61b4-40d2-8e04-001a119fc294","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/4/18","volume":{"uuid":"15189035-3592-41ac-b2bc-a39d247e7d2f","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"a0c555cc-695c-3343-bfa0-3413a91dbfed","id":1,"poolType":"NetworkFilesystem","host":"10.223.57.195","path":"/export/home/vmware/primary","port":2049,"url":"NetworkFilesystem://10.223.57.195//export/home/vmware/primary/?ROLE=Primary&STOREUUID=a0c555cc-695c-3343-bfa0-3413a91dbfed"}},"name":"ROOT-18","size":2147483648,"path":"ROOT-18","volumeId":18,"vmName":"i-4-18-VM","accountId":4,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[a0c555cc695c3343bfa03413a91dbfed] i-4-18-VM/ROOT-18.vmdk\"]}","format":"OVA","id":18,"deviceId":0,"hypervisorType":"VMware"},"parentSnapshotPath":"snapshots/4/18/730b8477-d29f-45e8-b2a2-fb26e2ac220b/730b8477-d29f-45e8-b2a2-fb26e2ac220b","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://10.223.57.194/export/home/vmware/secondary","_role":"Image"}},"vmName":"i-4-18-VM","name":"TestVM-tiny-host-0ps-0-2_ROOT-18_20131213215216","hypervisorType":"VMware","id":69,"quiescevm":false}},"executeInSequence":false,"wait":21600}}] }
> 2013-12-13 17:13:44,294 DEBUG [c.c.a.m.AgentAttache] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Seq 5-422576170: Cancelling.
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.RemoteHostEndPoint] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 422576170 to Host 5 timed out after 43200
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) copy snasphot failed: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 422576170 to Host 5 timed out after 43200
> 2013-12-13 17:13:44,305 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) copy failed
> 2013-12-13 17:13:44,440 DEBUG [c.c.s.s.SnapshotManagerImpl] (Job-Executor-5:ctx-1d3bd6cc ctx-837a59a5) Failed to create snapshot
> com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:2, com.cloud.exception.OperationTimedoutException: Commands 422576170 to Host 5 timed out after 43200
>         at org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:275)
>         at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:135)
>         at org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:294)
>         at com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:951)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>         at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>         at $Proxy161.takeSnapshot(Unknown Source)
>         at org.apache.cloudstack.storage.volume.VolumeServiceImpl.takeSnapshot(VolumeServiceImpl.java:1341)
>         at com.cloud.storage.VolumeApiServiceImpl.takeSnapshot(VolumeApiServiceImpl.java:1486)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>         at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>         at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)