Posted to issues@cloudstack.apache.org by "Sangeetha Hariharan (JIRA)" <ji...@apache.org> on 2013/12/13 22:28:07 UTC

[jira] [Updated] (CLOUDSTACK-5499) VMware - When NFS was down for about 12 hours and then brought back up again, snapshots are not being attempted for some of the volumes which have snapshots that are in "CreatedOnPrimary" state.

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangeetha Hariharan updated CLOUDSTACK-5499:
--------------------------------------------

    Attachment: nfs12down.rar

> VMware - When NFS was down for about 12 hours and then brought back up again, snapshots are not being attempted for some of the volumes which have snapshots that are in "CreatedOnPrimary" state.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5499
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5499
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: 4.3.0
>         Environment: Build from 4.3
>            Reporter: Sangeetha Hariharan
>            Priority: Critical
>             Fix For: 4.3.0
>
>         Attachments: nfs12down.rar
>
>
> VMware - When NFS was down for about 12 hours and then brought back up again, snapshots are not being attempted for some of the volumes which have snapshots that are in "CreatedOnPrimary" state.
> Setup:
> Advanced Zone with 2 ESXi 5.1 hosts.
> Steps to reproduce the problem:
> 1. Deploy 5 VMs on each of the hosts, so we start with 11 VMs.
> 2. Start concurrent snapshots for the ROOT volumes of all the VMs.
> 3. Shut down the secondary storage server while the snapshots are in progress.
> 4. Bring the secondary storage server back up after 12 hours.
> The following issues are seen in this run:
> 1. Snapshots that are in progress report failures only after 12 hours, even though backup.snapshot.wait is set to 12 hours.
> 2. New snapshot requests that were executed while the NFS server was down do not report failure immediately. In my case, such requests eventually succeeded when the NFS server was brought up. Is this the expected behavior? Should we not expect them to fail right away, instead of holding on to such active sessions?
> 3. Some of the snapshot failures resulted in snapshots that are in "CreatedOnPrimary" state. For such volumes, snapshots are not being attempted at all, even though the NFS server was brought back up.
> Volumes in this state are 16, 17, 18, and 22.
> There are instances where I have seen snapshots being scheduled and succeeding even when the previous state was "CreatedOnPrimary". Why are we able to schedule snapshots in such cases, and not in others?
> mysql> select volume_id,status,created from snapshots where volume_id=18;
> +-----------+------------------+---------------------+
> | volume_id | status           | created             |
> +-----------+------------------+---------------------+
> |        18 | Destroyed        | 2013-12-12 23:24:14 |
> |        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
> |        18 | BackedUp         | 2013-12-13 01:53:38 |
> |        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
> +-----------+------------------+---------------------+
> mysql> select volume_id,status,created from snapshots;
> +-----------+------------------+---------------------+
> | volume_id | status           | created             |
> +-----------+------------------+---------------------+
> |        22 | Destroyed        | 2013-12-12 23:24:13 |
> |        21 | Destroyed        | 2013-12-12 23:24:13 |
> |        20 | Destroyed        | 2013-12-12 23:24:14 |
> |        19 | Destroyed        | 2013-12-12 23:24:14 |
> |        18 | Destroyed        | 2013-12-12 23:24:14 |
> |        17 | Destroyed        | 2013-12-12 23:24:14 |
> |        16 | Destroyed        | 2013-12-12 23:24:14 |
> |        14 | Destroyed        | 2013-12-12 23:24:15 |
> |        25 | Destroyed        | 2013-12-12 23:24:15 |
> |        24 | Destroyed        | 2013-12-12 23:24:15 |
> |        23 | Destroyed        | 2013-12-12 23:24:15 |
> |        22 | CreatedOnPrimary | 2013-12-12 23:53:38 |
> |        21 | Destroyed        | 2013-12-12 23:53:38 |
> |        20 | Destroyed        | 2013-12-12 23:53:38 |
> |        19 | Destroyed        | 2013-12-12 23:53:39 |
> |        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
> |        17 | CreatedOnPrimary | 2013-12-12 23:53:40 |
> |        16 | CreatedOnPrimary | 2013-12-12 23:53:40 |
> |        14 | Destroyed        | 2013-12-12 23:53:40 |
> |        25 | Destroyed        | 2013-12-12 23:53:41 |
> |        24 | Destroyed        | 2013-12-12 23:53:41 |
> |        23 | Destroyed        | 2013-12-12 23:53:42 |
> |        21 | Destroyed        | 2013-12-13 00:53:37 |
> |        19 | Destroyed        | 2013-12-13 00:53:38 |
> |        22 | BackedUp         | 2013-12-13 01:53:37 |
> |        21 | Destroyed        | 2013-12-13 01:53:38 |
> |        20 | Destroyed        | 2013-12-13 01:53:38 |
> |        19 | Destroyed        | 2013-12-13 01:53:38 |
> |        18 | BackedUp         | 2013-12-13 01:53:38 |
> |        17 | BackedUp         | 2013-12-13 01:53:38 |
> |        16 | BackedUp         | 2013-12-13 01:53:39 |
> |        14 | Destroyed        | 2013-12-13 01:53:39 |
> |        25 | Destroyed        | 2013-12-13 01:53:39 |
> |        24 | Destroyed        | 2013-12-13 01:53:39 |
> |        23 | Destroyed        | 2013-12-13 01:53:40 |
> |        22 | CreatedOnPrimary | 2013-12-13 03:53:37 |
> |        21 | Destroyed        | 2013-12-13 03:53:38 |
> |        20 | Destroyed        | 2013-12-13 03:53:38 |
> |        19 | Destroyed        | 2013-12-13 03:53:38 |
> |        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
> |        17 | CreatedOnPrimary | 2013-12-13 03:53:38 |
> |        16 | CreatedOnPrimary | 2013-12-13 03:53:39 |
> |        14 | Destroyed        | 2013-12-13 03:53:39 |
> |        24 | Destroyed        | 2013-12-13 08:53:37 |
> |        25 | Destroyed        | 2013-12-13 09:53:37 |
> |        23 | Destroyed        | 2013-12-13 10:53:37 |
> |        21 | Destroyed        | 2013-12-13 16:53:37 |
> |        20 | Destroyed        | 2013-12-13 16:53:38 |
> |        19 | Destroyed        | 2013-12-13 16:53:38 |
> |        14 | Destroyed        | 2013-12-13 16:53:38 |
> |        21 | BackedUp         | 2013-12-13 18:53:37 |
> |        20 | BackedUp         | 2013-12-13 18:53:38 |
> |        19 | BackedUp         | 2013-12-13 18:53:38 |
> |        14 | BackedUp         | 2013-12-13 18:53:38 |
> |        25 | BackedUp         | 2013-12-13 18:53:38 |
> |        24 | BackedUp         | 2013-12-13 18:53:38 |
> |        23 | BackedUp         | 2013-12-13 18:53:39 |
> |        21 | BackedUp         | 2013-12-13 19:53:37 |
> |        20 | BackedUp         | 2013-12-13 19:53:38 |
> |        19 | BackedUp         | 2013-12-13 19:53:38 |
> |        14 | BackedUp         | 2013-12-13 19:53:38 |
> |        25 | BackedUp         | 2013-12-13 19:53:38 |
> |        24 | BackedUp         | 2013-12-13 19:53:39 |
> |        23 | BackedUp         | 2013-12-13 19:53:39 |
> +-----------+------------------+---------------------+
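
The list of affected volumes can be cross-checked against the query output above by selecting each volume's most recent snapshot row and keeping only those whose status is "CreatedOnPrimary". The following is a minimal sketch, not part of the original report. It assumes an in-memory SQLite database as a stand-in for the management server's MySQL database, a three-column table containing only the columns shown in the reporter's queries (the real snapshots table has more), and a subset of the rows above (the last two rows for volumes 14, 16, 17, 18, 21, and 22). The names ROWS, LATEST_STUCK_SQL, and stuck_volumes are illustrative.

```python
import sqlite3

# Subset of rows copied from the table above: the last two snapshot rows
# for six of the eleven volumes (assumption: this subset is enough to
# illustrate the check; the full table could be loaded the same way).
ROWS = [
    (22, "BackedUp", "2013-12-13 01:53:37"),
    (18, "BackedUp", "2013-12-13 01:53:38"),
    (17, "BackedUp", "2013-12-13 01:53:38"),
    (16, "BackedUp", "2013-12-13 01:53:39"),
    (22, "CreatedOnPrimary", "2013-12-13 03:53:37"),
    (18, "CreatedOnPrimary", "2013-12-13 03:53:38"),
    (17, "CreatedOnPrimary", "2013-12-13 03:53:38"),
    (16, "CreatedOnPrimary", "2013-12-13 03:53:39"),
    (21, "BackedUp", "2013-12-13 18:53:37"),
    (14, "BackedUp", "2013-12-13 18:53:38"),
    (21, "BackedUp", "2013-12-13 19:53:37"),
    (14, "BackedUp", "2013-12-13 19:53:38"),
]

# For each volume, find the newest "created" timestamp, then keep the
# volume only if that newest row is still in CreatedOnPrimary state.
LATEST_STUCK_SQL = """
SELECT s.volume_id
FROM snapshots AS s
JOIN (
    SELECT volume_id, MAX(created) AS latest
    FROM snapshots
    GROUP BY volume_id
) AS m ON m.volume_id = s.volume_id AND m.latest = s.created
WHERE s.status = 'CreatedOnPrimary'
ORDER BY s.volume_id
"""


def stuck_volumes(rows):
    """Return volume ids whose most recent snapshot is CreatedOnPrimary."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical reduced schema: only the columns used in the report.
        conn.execute(
            "CREATE TABLE snapshots (volume_id INTEGER, status TEXT, created TEXT)"
        )
        conn.executemany("INSERT INTO snapshots VALUES (?, ?, ?)", rows)
        return [row[0] for row in conn.execute(LATEST_STUCK_SQL)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(stuck_volumes(ROWS))
```

With the rows listed, this prints [16, 17, 18, 22], matching the volumes named in the report. The same SELECT statement should also run unchanged against MySQL, though that has not been verified against a CloudStack database here.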



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)