Posted to issues@cloudstack.apache.org by "Sangeetha Hariharan (JIRA)" <ji...@apache.org> on 2013/12/13 22:22:07 UTC

[jira] [Created] (CLOUDSTACK-5499) VMware - When NFS was down for about 12 hours and then brought back up again, snapshots are not being attempted for some of the volumes that have snapshots in the "CreatedOnPrimary" state.

Sangeetha Hariharan created CLOUDSTACK-5499:
-----------------------------------------------

             Summary: VMware - When NFS was down for about 12 hours and then brought back up again, snapshots are not being attempted for some of the volumes that have snapshots in the "CreatedOnPrimary" state.
                 Key: CLOUDSTACK-5499
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5499
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.3.0
         Environment: Build from 4.3
            Reporter: Sangeetha Hariharan
            Priority: Critical
             Fix For: 4.3.0


VMware - When NFS was down for about 12 hours and then brought back up again, snapshots are not being attempted for some of the volumes that have snapshots in the "CreatedOnPrimary" state.

Setup:
Advanced zone with two ESXi 5.1 hosts.

Steps to reproduce the problem:

1. Deploy 5 VMs on each of the hosts, so we start with 11 VMs.
2. Start concurrent snapshots for the ROOT volumes of all the VMs.
3. Shut down the secondary storage server while the snapshots are in progress.
4. Bring the secondary storage server back up after 12 hours.

The following issues are seen in this run:

1. Snapshots that are in progress report failures only after 12 hours, even though backup.snapshot.wait is set to 12 hours.

2. New snapshot requests that were issued while the NFS server was down do not report failure immediately. In my case, such requests eventually succeeded once the NFS server was brought back up. Is this the expected behavior? Should these requests not fail right away, instead of holding on to such active sessions?

3. Some of the snapshot failures resulted in snapshots that are in the "CreatedOnPrimary" state. For such volumes, snapshots are not being attempted at all, even though the NFS server was brought back up.

Volumes in this state are 16, 17, 18 and 22.

There are instances where I have seen snapshots being scheduled and succeeding even when the previous state was "CreatedOnPrimary". Why were we able to schedule snapshots in such cases, but not in others?

mysql> select volume_id,status,created from snapshots where volume_id=18;
+-----------+------------------+---------------------+
| volume_id | status           | created             |
+-----------+------------------+---------------------+
|        18 | Destroyed        | 2013-12-12 23:24:14 |
|        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
|        18 | BackedUp         | 2013-12-13 01:53:38 |
|        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
+-----------+------------------+---------------------+



mysql> select volume_id,status,created from snapshots;
+-----------+------------------+---------------------+
| volume_id | status           | created             |
+-----------+------------------+---------------------+
|        22 | Destroyed        | 2013-12-12 23:24:13 |
|        21 | Destroyed        | 2013-12-12 23:24:13 |
|        20 | Destroyed        | 2013-12-12 23:24:14 |
|        19 | Destroyed        | 2013-12-12 23:24:14 |
|        18 | Destroyed        | 2013-12-12 23:24:14 |
|        17 | Destroyed        | 2013-12-12 23:24:14 |
|        16 | Destroyed        | 2013-12-12 23:24:14 |
|        14 | Destroyed        | 2013-12-12 23:24:15 |
|        25 | Destroyed        | 2013-12-12 23:24:15 |
|        24 | Destroyed        | 2013-12-12 23:24:15 |
|        23 | Destroyed        | 2013-12-12 23:24:15 |
|        22 | CreatedOnPrimary | 2013-12-12 23:53:38 |
|        21 | Destroyed        | 2013-12-12 23:53:38 |
|        20 | Destroyed        | 2013-12-12 23:53:38 |
|        19 | Destroyed        | 2013-12-12 23:53:39 |
|        18 | CreatedOnPrimary | 2013-12-12 23:53:39 |
|        17 | CreatedOnPrimary | 2013-12-12 23:53:40 |
|        16 | CreatedOnPrimary | 2013-12-12 23:53:40 |
|        14 | Destroyed        | 2013-12-12 23:53:40 |
|        25 | Destroyed        | 2013-12-12 23:53:41 |
|        24 | Destroyed        | 2013-12-12 23:53:41 |
|        23 | Destroyed        | 2013-12-12 23:53:42 |
|        21 | Destroyed        | 2013-12-13 00:53:37 |
|        19 | Destroyed        | 2013-12-13 00:53:38 |
|        22 | BackedUp         | 2013-12-13 01:53:37 |
|        21 | Destroyed        | 2013-12-13 01:53:38 |
|        20 | Destroyed        | 2013-12-13 01:53:38 |
|        19 | Destroyed        | 2013-12-13 01:53:38 |
|        18 | BackedUp         | 2013-12-13 01:53:38 |
|        17 | BackedUp         | 2013-12-13 01:53:38 |
|        16 | BackedUp         | 2013-12-13 01:53:39 |
|        14 | Destroyed        | 2013-12-13 01:53:39 |
|        25 | Destroyed        | 2013-12-13 01:53:39 |
|        24 | Destroyed        | 2013-12-13 01:53:39 |
|        23 | Destroyed        | 2013-12-13 01:53:40 |
|        22 | CreatedOnPrimary | 2013-12-13 03:53:37 |
|        21 | Destroyed        | 2013-12-13 03:53:38 |
|        20 | Destroyed        | 2013-12-13 03:53:38 |
|        19 | Destroyed        | 2013-12-13 03:53:38 |
|        18 | CreatedOnPrimary | 2013-12-13 03:53:38 |
|        17 | CreatedOnPrimary | 2013-12-13 03:53:38 |
|        16 | CreatedOnPrimary | 2013-12-13 03:53:39 |
|        14 | Destroyed        | 2013-12-13 03:53:39 |
|        24 | Destroyed        | 2013-12-13 08:53:37 |
|        25 | Destroyed        | 2013-12-13 09:53:37 |
|        23 | Destroyed        | 2013-12-13 10:53:37 |
|        21 | Destroyed        | 2013-12-13 16:53:37 |
|        20 | Destroyed        | 2013-12-13 16:53:38 |
|        19 | Destroyed        | 2013-12-13 16:53:38 |
|        14 | Destroyed        | 2013-12-13 16:53:38 |
|        21 | BackedUp         | 2013-12-13 18:53:37 |
|        20 | BackedUp         | 2013-12-13 18:53:38 |
|        19 | BackedUp         | 2013-12-13 18:53:38 |
|        14 | BackedUp         | 2013-12-13 18:53:38 |
|        25 | BackedUp         | 2013-12-13 18:53:38 |
|        24 | BackedUp         | 2013-12-13 18:53:38 |
|        23 | BackedUp         | 2013-12-13 18:53:39 |
|        21 | BackedUp         | 2013-12-13 19:53:37 |
|        20 | BackedUp         | 2013-12-13 19:53:38 |
|        19 | BackedUp         | 2013-12-13 19:53:38 |
|        14 | BackedUp         | 2013-12-13 19:53:38 |
|        25 | BackedUp         | 2013-12-13 19:53:38 |
|        24 | BackedUp         | 2013-12-13 19:53:39 |
|        23 | BackedUp         | 2013-12-13 19:53:39 |
+-----------+------------------+---------------------+
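For anyone triaging this, the affected volumes can be picked out of the output above by looking for volumes whose most recent snapshot row is still in "CreatedOnPrimary". The following is only a minimal sketch: it uses an in-memory SQLite database as a stand-in for the MySQL database, a table with just the three columns queried above, and a handful of rows copied from the output. The helper name stuck_volumes is hypothetical; the SQL itself uses only a plain GROUP BY and join, so it should also run from the mysql prompt against the snapshots table.

```python
import sqlite3

# A few rows copied from the query output above: (volume_id, status, created).
ROWS = [
    (18, "Destroyed", "2013-12-12 23:24:14"),
    (18, "CreatedOnPrimary", "2013-12-12 23:53:39"),
    (18, "BackedUp", "2013-12-13 01:53:38"),
    (18, "CreatedOnPrimary", "2013-12-13 03:53:38"),
    (22, "BackedUp", "2013-12-13 01:53:37"),
    (22, "CreatedOnPrimary", "2013-12-13 03:53:37"),
    (21, "Destroyed", "2013-12-13 16:53:37"),
    (21, "BackedUp", "2013-12-13 19:53:37"),
]

# Volumes whose latest snapshot row (by created time) is CreatedOnPrimary.
LATEST_STUCK_SQL = """
SELECT s.volume_id
FROM snapshots s
JOIN (SELECT volume_id, MAX(created) AS latest
      FROM snapshots
      GROUP BY volume_id) m
  ON s.volume_id = m.volume_id AND s.created = m.latest
WHERE s.status = 'CreatedOnPrimary'
ORDER BY s.volume_id
"""


def stuck_volumes(rows):
    """Hypothetical helper: load rows into a scratch table and run the query."""
    conn = sqlite3.connect(":memory:")
    # Simplified table: only the three columns shown in this report.
    conn.execute(
        "CREATE TABLE snapshots (volume_id INTEGER, status TEXT, created TEXT)"
    )
    conn.executemany("INSERT INTO snapshots VALUES (?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(LATEST_STUCK_SQL)]
    conn.close()
    return result


print(stuck_volumes(ROWS))
```

With the sample rows above, volumes 18 and 22 are reported and volume 21 is not, since its latest snapshot is "BackedUp".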




--
This message was sent by Atlassian JIRA
(v6.1.4#6159)