Posted to issues@cloudstack.apache.org by "France (JIRA)" <ji...@apache.org> on 2014/02/07 18:13:21 UTC

[jira] [Created] (CLOUDSTACK-6060) Excessive use of LVM snapshots on XenServer, that leads to snapshot failure and unnecessary disk usage.

France created CLOUDSTACK-6060:
----------------------------------

             Summary: Excessive use of LVM snapshots on XenServer, that leads to snapshot failure and unnecessary disk usage.
                 Key: CLOUDSTACK-6060
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-6060
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server, XenServer
    Affects Versions: 4.1.1
         Environment: CS 4.1.1, XS S602E027
            Reporter: France


When a user created multiple snapshots in the CS GUI (in my case 3 daily, 2 weekly and 2 monthly), snapshot creation soon failed because the maximum number of LVM snapshots on XenServer was reached.

From SMlog on XenServer:
[9294] 2014-02-07 15:16:58.326838	***** vdi_snapshot: EXCEPTION SR.SROSError, The snapshot chain is too long
  File "/opt/xensource/sm/SRCommand.py", line 94, in run
    return self._run_locked(sr)
  File "/opt/xensource/sm/SRCommand.py", line 131, in _run_locked
    return self._run(sr, target)
  File "/opt/xensource/sm/SRCommand.py", line 170, in _run
    return target.snapshot(self.params['sr_uuid'], self.vdi_uuid)
  File "/opt/xensource/sm/LVHDSR.py", line 1440, in snapshot
    return self._snapshot(snapType)
  File "/opt/xensource/sm/LVHDSR.py", line 1509, in _snapshot
    raise xs_errors.XenError('SnapshotChainTooLong')
  File "/opt/xensource/sm/xs_errors.py", line 49, in __init__
    raise SR.SROSError(errorcode, errormessage)

From CS:
WARN  [xen.resource.CitrixResourceBase] (DirectAgent-150:) ManageSnapshotCommand operation: create Failed for snapshotId: 489, reason: SR_BACKEND_FAILURE_109The snapshot chain is too long
SR_BACKEND_FAILURE_109The snapshot chain is too long
	at com.xensource.xenapi.Types.checkResponse(Types.java:1936)
	at com.xensource.xenapi.Connection.dispatch(Connection.java:368)
	at com.cloud.hypervisor.xen.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:909)
	at com.xensource.xenapi.VDI.miamiSnapshot(VDI.java:1217)
	at com.xensource.xenapi.VDI.snapshot(VDI.java:1192)
	at com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:6293)
	at com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:487)
	at com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:73)
	at com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:186)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:701)

Here is the snapshot list for the VM:
[root@x1 ~]# xe vdi-list is-a-snapshot=true  | grep XZY
          name-label ( RW): XZY_ROOT-385_20140125020342
          name-label ( RW): XZY_ROOT-385_20140121020342
          name-label ( RW): XZY_ROOT-385_20140121020342
          name-label ( RW): XZY_ROOT-385_20140124020342
          name-label ( RW): XZY_ROOT-385_20140122020342
          name-label ( RW): XZY_ROOT-385_20140125020342
          name-label ( RW): XZY_ROOT-385_20140123020342
          name-label ( RW): XZY_ROOT-385_20140122020342
          name-label ( RW): XZY_ROOT-385_20140125020342
          name-label ( RW): XZY_ROOT-385_20140124020342
          name-label ( RW): XZY_ROOT-385_20140120020341
          name-label ( RW): XZY_ROOT-385_20140123020342
          name-label ( RW): XZY_ROOT-385_20140124020342
          name-label ( RW): XZY_ROOT-385_20140121020342
          name-label ( RW): XZY_ROOT-385_20140122020342
          name-label ( RW): XZY_ROOT-385_20140120020341
          name-label ( RW): XZY_ROOT-385_20140122020342
          name-label ( RW): XZY_ROOT-385_20140120020341
          name-label ( RW): XZY_ROOT-385_20140123020342
          name-label ( RW): XZY_ROOT-385_20140123020342
          name-label ( RW): XZY_ROOT-385_20140122020342
          name-label ( RW): XZY_ROOT-385_20140120020341
          name-label ( RW): XZY_ROOT-385_20140123020342
          name-label ( RW): XZY_ROOT-385_20140124020342
          name-label ( RW): XZY_ROOT-385_20140121020342
          name-label ( RW): XZY_ROOT-385_20140124020342
          name-label ( RW): XZY_ROOT-385_20140121020342
          name-label ( RW): XZY_ROOT-385_20140120020341
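
The duplicates in the listing above are easier to see when grouped by name-label. The following is only a rough sketch, assuming the `xe vdi-list` text output has been captured into a string; the helper name `count_snapshots` and the shortened sample input are made up for illustration and are not part of CloudStack or XenServer.

```python
from collections import Counter


def count_snapshots(xe_output: str, prefix: str) -> Counter:
    """Count snapshot VDIs per name-label in captured `xe vdi-list` text output.

    Hypothetical helper: it only parses text and does not talk to XenServer.
    """
    counts = Counter()
    for line in xe_output.splitlines():
        # Lines of interest look like:
        #   name-label ( RW): XZY_ROOT-385_20140125020342
        key, sep, value = line.partition(":")
        if sep and key.strip().startswith("name-label"):
            label = value.strip()
            if label.startswith(prefix):
                counts[label] += 1
    return counts


# Shortened sample taken from the kind of output shown above.
sample = """\
          name-label ( RW): XZY_ROOT-385_20140125020342
          name-label ( RW): XZY_ROOT-385_20140121020342
          name-label ( RW): XZY_ROOT-385_20140121020342
          name-label ( RW): Template routing-1
"""

counts = count_snapshots(sample, "XZY")
for label, n in sorted(counts.items()):
    print(n, label)
print("total:", sum(counts.values()))
```

Run against the full listing, the per-label counts would show how many LVM snapshots exist for each CloudStack snapshot timestamp.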


I also see lots of other LVM snapshots:
xe vdi-list is-a-snapshot=true  | grep name-label
         name-label ( RW): XYZ_ROOT-385_20140125020342
         name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
         name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
         name-label ( RW): XYZ_ROOT-385_20140121020342
         name-label ( RW): XYZ_ROOT-385_20140121020342
         name-label ( RW): XBBBC_20140206040441
         name-label ( RW): XYZ_ROOT-385_20140124020342
         name-label ( RW): OCWWW_20140112000011
         name-label ( RW): XYZ_ROOT-385_20140122020342
         name-label ( RW): Template routing-1
         name-label ( RW): Template aa0bcd7c-4b03-4778-a038-da80fdfb7a43
         name-label ( RW): OCWWWXXXXX_ROOT-330_20140201130342
         name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
         name-label ( RW): XYZ_ROOT-385_20140125020342
         name-label ( RW): XYZ_ROOT-385_20140123020342
         name-label ( RW): XYZ_ROOT-385_20140122020342
         name-label ( RW): Template 58e13a51-affa-4fe2-a66b-19e89091290d
         name-label ( RW): ABCCCDDDD_ROOT-334_20140201160342
         name-label ( RW): ANON_ROOT-324_20131121124532
         name-label ( RW): Template fc0262f2-7609-498b-a1ac-ed71e1ebe7f9
         name-label ( RW): XYZ_ROOT-385_20140125020342
         name-label ( RW): Template d768db3f-6d42-48f9-bdfb-7dceccef9f3e
         name-label ( RW): XYZ_ROOT-385_20140124020342
         name-label ( RW): Template fc0262f2-7609-498b-a1ac-ed71e1ebe7f9
         name-label ( RW): XYZ_ROOT-385_20140120020341
         name-label ( RW): detached_hrosci_20130513190437
         name-label ( RW): XYZ_ROOT-385_20140123020342
         name-label ( RW): XYZ_ROOT-385_20140124020342
         name-label ( RW): Template routing-1
         name-label ( RW): NGGQQQ_ROOT-423_20140202030342
         name-label ( RW): XYZ_ROOT-385_20140121020342
         name-label ( RW): SOME_work_ROOT-295_20130322150148
         name-label ( RW): XYZ_ROOT-385_20140122020342
         name-label ( RW): XYZ_ROOT-385_20140120020341
         name-label ( RW): Template 57d7c73c-ca06-4225-8a9f-7cc5776c5610
         name-label ( RW): XYZ_ROOT-385_20140122020342
         name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
         name-label ( RW): Template 90d42566-b956-4d9d-9685-91e19b693f86
         name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
         name-label ( RW): XYZ_ROOT-385_20140120020341
         name-label ( RW): DDGGWW_ROOT-234_20140118030341
         name-label ( RW): Template 8f99eaf7-6d33-4097-8d19-9cafd681f124
         name-label ( RW): Template afa9d0a0-8242-443b-ac53-b0b1c760559c
         name-label ( RW): XYZ_ROOT-385_20140123020342
         name-label ( RW): NNGGNN_ROOT-351_20140205020342
         name-label ( RW): OOIIOO_ROOT-313_20131203084833
         name-label ( RW): Template routing-1
         name-label ( RW): Template 61d4df5b-bccf-4457-ad08-0ae57ea16a7e
         name-label ( RW): XYZ_ROOT-385_20140123020342
         name-label ( RW): TTGGTT_ROOT-233_20121213090209
         name-label ( RW): XYZ_ROOT-385_20140122020342
         name-label ( RW): XYZ_ROOT-385_20140120020341
         name-label ( RW): XYZ_ROOT-385_20140123020342
         name-label ( RW): Template 1feab759-b573-4227-8b12-8e9846ee4bd6
         name-label ( RW): Template 4e444f15-ac78-4b53-9899-c406478f99b2
         name-label ( RW): Template 690fa285-3317-45d6-a563-35ddd5af493e
         name-label ( RW): XYZ_ROOT-385_20140124020342
         name-label ( RW): XYZ_ROOT-385_20140121020342
         name-label ( RW): ABCCCDDDD_DATA-334_20140207020441
         name-label ( RW): Template 4e444f15-ac78-4b53-9899-c406478f99b2
         name-label ( RW): XYZ_ROOT-385_20140124020342
         name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
         name-label ( RW): Template c1e4d036-bd10-4b33-803a-9e34f2c755fe
         name-label ( RW): XYZ_ROOT-385_20140121020342
         name-label ( RW): XYZ_ROOT-385_20140120020341


Is there a reason why the LVM snapshot is not destroyed after the actual backup has been made to NFS storage?

If it's really necessary to keep the snapshots, we must limit their number to below 32, or whatever the limit in XenServer is.

It's also crazy to keep snapshots if each one uses as much storage as the original VM on the production iSCSI cluster. So for one 30 GB VM with 28 LVM snapshots, I get 840 GB of usage?
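
The 840 GB figure is a simple worst-case multiplication. A minimal sketch of that estimate follows, assuming (as the question above does, without it being confirmed here) that every snapshot is allocated at the full size of the original VDI; the function name is hypothetical.

```python
def worst_case_usage_gb(vdi_size_gb: int, snapshot_count: int) -> int:
    """Upper-bound estimate of space taken by snapshots on the SR.

    Assumption: each snapshot occupies as much space as the original VDI.
    Actual usage on XenServer may differ.
    """
    return vdi_size_gb * snapshot_count


# Figures from this report: one 30 GB VM with 28 LVM snapshots.
print(worst_case_usage_gb(30, 28))  # 840
```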


