Posted to issues@cloudstack.apache.org by "John Skinner (JIRA)" <ji...@apache.org> on 2014/05/21 17:58:38 UTC

[jira] [Commented] (CLOUDSTACK-6060) Excessive use of LVM snapshots on XenServer, that leads to snapshot failure and unnecessary disk usage.

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004815#comment-14004815 ] 

John Skinner commented on CLOUDSTACK-6060:
------------------------------------------

I am experiencing the same issue in the same environment (CS 4.1.1, XS 6.02 patched up). The customer is also keeping 3 daily, 2 weekly, and 2 monthly snapshots. Given that we already keep the snapshot deltas on secondary storage, is there any need to keep the snapshot active on XenServer at all? Shouldn't we remove it once the snapshot has been copied to secondary storage?

> Excessive use of LVM snapshots on XenServer, that leads to snapshot failure and unnecessary disk usage.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-6060
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-6060
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server, XenServer
>    Affects Versions: 4.1.1
>         Environment: CS 4.1.1, XS S602E027
>            Reporter: France
>
> When a user created multiple snapshots in the CS GUI (in my case 3 daily, 2 weekly and 2 monthly), snapshot creation soon failed because the maximum number of LVM snapshots on XenServer was reached.
> From SMlog on XenServer:
> [9294] 2014-02-07 15:16:58.326838	***** vdi_snapshot: EXCEPTION SR.SROSError, The snapshot chain is too long
>   File "/opt/xensource/sm/SRCommand.py", line 94, in run
>     return self._run_locked(sr)
>   File "/opt/xensource/sm/SRCommand.py", line 131, in _run_locked
>     return self._run(sr, target)
>   File "/opt/xensource/sm/SRCommand.py", line 170, in _run
>     return target.snapshot(self.params['sr_uuid'], self.vdi_uuid)
>   File "/opt/xensource/sm/LVHDSR.py", line 1440, in snapshot
>     return self._snapshot(snapType)
>   File "/opt/xensource/sm/LVHDSR.py", line 1509, in _snapshot
>     raise xs_errors.XenError('SnapshotChainTooLong')
>   File "/opt/xensource/sm/xs_errors.py", line 49, in __init__
>     raise SR.SROSError(errorcode, errormessage)
> From CS:
> WARN  [xen.resource.CitrixResourceBase] (DirectAgent-150:) ManageSnapshotCommand operation: create Failed for snapshotId: 489, reason: SR_BACKEND_FAILURE_109The snapshot chain is too long
> SR_BACKEND_FAILURE_109The snapshot chain is too long
> 	at com.xensource.xenapi.Types.checkResponse(Types.java:1936)
> 	at com.xensource.xenapi.Connection.dispatch(Connection.java:368)
> 	at com.cloud.hypervisor.xen.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:909)
> 	at com.xensource.xenapi.VDI.miamiSnapshot(VDI.java:1217)
> 	at com.xensource.xenapi.VDI.snapshot(VDI.java:1192)
> 	at com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:6293)
> 	at com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:487)
> 	at com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:73)
> 	at com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:186)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:701)
> Here is the snapshot list for the VM:
> [root@x1 ~]# xe vdi-list is-a-snapshot=true  | grep XZY
>           name-label ( RW): XZY_ROOT-385_20140125020342
>           name-label ( RW): XZY_ROOT-385_20140121020342
>           name-label ( RW): XZY_ROOT-385_20140121020342
>           name-label ( RW): XZY_ROOT-385_20140124020342
>           name-label ( RW): XZY_ROOT-385_20140122020342
>           name-label ( RW): XZY_ROOT-385_20140125020342
>           name-label ( RW): XZY_ROOT-385_20140123020342
>           name-label ( RW): XZY_ROOT-385_20140122020342
>           name-label ( RW): XZY_ROOT-385_20140125020342
>           name-label ( RW): XZY_ROOT-385_20140124020342
>           name-label ( RW): XZY_ROOT-385_20140120020341
>           name-label ( RW): XZY_ROOT-385_20140123020342
>           name-label ( RW): XZY_ROOT-385_20140124020342
>           name-label ( RW): XZY_ROOT-385_20140121020342
>           name-label ( RW): XZY_ROOT-385_20140122020342
>           name-label ( RW): XZY_ROOT-385_20140120020341
>           name-label ( RW): XZY_ROOT-385_20140122020342
>           name-label ( RW): XZY_ROOT-385_20140120020341
>           name-label ( RW): XZY_ROOT-385_20140123020342
>           name-label ( RW): XZY_ROOT-385_20140123020342
>           name-label ( RW): XZY_ROOT-385_20140122020342
>           name-label ( RW): XZY_ROOT-385_20140120020341
>           name-label ( RW): XZY_ROOT-385_20140123020342
>           name-label ( RW): XZY_ROOT-385_20140124020342
>           name-label ( RW): XZY_ROOT-385_20140121020342
>           name-label ( RW): XZY_ROOT-385_20140124020342
>           name-label ( RW): XZY_ROOT-385_20140121020342
>           name-label ( RW): XZY_ROOT-385_20140120020341
> I see lots of other LVM snapshots:
> xe vdi-list is-a-snapshot=true  | grep name-label
>          name-label ( RW): XYZ_ROOT-385_20140125020342
>          name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
>          name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
>          name-label ( RW): XYZ_ROOT-385_20140121020342
>          name-label ( RW): XYZ_ROOT-385_20140121020342
>          name-label ( RW): XBBBC_20140206040441
>          name-label ( RW): XYZ_ROOT-385_20140124020342
>          name-label ( RW): OCWWW_20140112000011
>          name-label ( RW): XYZ_ROOT-385_20140122020342
>          name-label ( RW): Template routing-1
>          name-label ( RW): Template aa0bcd7c-4b03-4778-a038-da80fdfb7a43
>          name-label ( RW): OCWWWXXXXX_ROOT-330_20140201130342
>          name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
>          name-label ( RW): XYZ_ROOT-385_20140125020342
>          name-label ( RW): XYZ_ROOT-385_20140123020342
>          name-label ( RW): XYZ_ROOT-385_20140122020342
>          name-label ( RW): Template 58e13a51-affa-4fe2-a66b-19e89091290d
>          name-label ( RW): ABCCCDDDD_ROOT-334_20140201160342
>          name-label ( RW): ANON_ROOT-324_20131121124532
>          name-label ( RW): Template fc0262f2-7609-498b-a1ac-ed71e1ebe7f9
>          name-label ( RW): XYZ_ROOT-385_20140125020342
>          name-label ( RW): Template d768db3f-6d42-48f9-bdfb-7dceccef9f3e
>          name-label ( RW): XYZ_ROOT-385_20140124020342
>          name-label ( RW): Template fc0262f2-7609-498b-a1ac-ed71e1ebe7f9
>          name-label ( RW): XYZ_ROOT-385_20140120020341
>          name-label ( RW): detached_hrosci_20130513190437
>          name-label ( RW): XYZ_ROOT-385_20140123020342
>          name-label ( RW): XYZ_ROOT-385_20140124020342
>          name-label ( RW): Template routing-1
>          name-label ( RW): NGGQQQ_ROOT-423_20140202030342
>          name-label ( RW): XYZ_ROOT-385_20140121020342
>          name-label ( RW): SOME_work_ROOT-295_20130322150148
>          name-label ( RW): XYZ_ROOT-385_20140122020342
>          name-label ( RW): XYZ_ROOT-385_20140120020341
>          name-label ( RW): Template 57d7c73c-ca06-4225-8a9f-7cc5776c5610
>          name-label ( RW): XYZ_ROOT-385_20140122020342
>          name-label ( RW): Template e80af9a4-e087-4220-977b-868fa4ec75b6
>          name-label ( RW): Template 90d42566-b956-4d9d-9685-91e19b693f86
>          name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
>          name-label ( RW): XYZ_ROOT-385_20140120020341
>          name-label ( RW): DDGGWW_ROOT-234_20140118030341
>          name-label ( RW): Template 8f99eaf7-6d33-4097-8d19-9cafd681f124
>          name-label ( RW): Template afa9d0a0-8242-443b-ac53-b0b1c760559c
>          name-label ( RW): XYZ_ROOT-385_20140123020342
>          name-label ( RW): NNGGNN_ROOT-351_20140205020342
>          name-label ( RW): OOIIOO_ROOT-313_20131203084833
>          name-label ( RW): Template routing-1
>          name-label ( RW): Template 61d4df5b-bccf-4457-ad08-0ae57ea16a7e
>          name-label ( RW): XYZ_ROOT-385_20140123020342
>          name-label ( RW): TTGGTT_ROOT-233_20121213090209
>          name-label ( RW): XYZ_ROOT-385_20140122020342
>          name-label ( RW): XYZ_ROOT-385_20140120020341
>          name-label ( RW): XYZ_ROOT-385_20140123020342
>          name-label ( RW): Template 1feab759-b573-4227-8b12-8e9846ee4bd6
>          name-label ( RW): Template 4e444f15-ac78-4b53-9899-c406478f99b2
>          name-label ( RW): Template 690fa285-3317-45d6-a563-35ddd5af493e
>          name-label ( RW): XYZ_ROOT-385_20140124020342
>          name-label ( RW): XYZ_ROOT-385_20140121020342
>          name-label ( RW): ABCCCDDDD_DATA-334_20140207020441
>          name-label ( RW): Template 4e444f15-ac78-4b53-9899-c406478f99b2
>          name-label ( RW): XYZ_ROOT-385_20140124020342
>          name-label ( RW): Template c2b3a07f-d16f-4abb-9162-55e4130a417c
>          name-label ( RW): Template c1e4d036-bd10-4b33-803a-9e34f2c755fe
>          name-label ( RW): XYZ_ROOT-385_20140121020342
>          name-label ( RW): XYZ_ROOT-385_20140120020341
> Is there a reason why the LVM snapshot is not destroyed after the actual backup is made to NFS storage?
> If it's really necessary to keep snapshots, we must limit their number to below 32, or whatever the limit in XenServer is.
> It's also crazy to keep snapshots if each one uses the same amount of storage as the original VM on the production iSCSI cluster. So for one 30 GB VM with 28 LVM snapshots I get 840 GB of usage?
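
For anyone who wants to check their own pool, here is a rough sketch (not part of CloudStack or XenServer) that tallies snapshots per volume from the text output of `xe vdi-list is-a-snapshot=true` and reproduces the worst-case arithmetic from the description. The function names are hypothetical, the grouping assumes labels end in a 14-digit timestamp as in the listing above, and the usage figure rests on the reporter's assumption that each LVM snapshot occupies as much space as the original volume.

```python
import re
from collections import Counter

# Matches lines such as "name-label ( RW): XZY_ROOT-385_20140125020342"
# as printed by `xe vdi-list is-a-snapshot=true`.
LABEL_RE = re.compile(r"name-label \( RW\):\s*(.+)")

# Assumption: snapshot labels end in "_YYYYMMDDhhmmss" (14 digits).
TIMESTAMP_RE = re.compile(r"_\d{14}$")


def count_snapshots_per_volume(xe_output: str) -> Counter:
    """Count snapshot VDIs per volume, ignoring the timestamp suffix."""
    counts = Counter()
    for line in xe_output.splitlines():
        match = LABEL_RE.search(line)
        if match is None:
            continue  # not a name-label line
        label = match.group(1).strip()
        volume = TIMESTAMP_RE.sub("", label)
        counts[volume] += 1
    return counts


def worst_case_usage_gb(volume_size_gb: int, snapshot_count: int) -> int:
    """Upper bound on space used, assuming every snapshot is full-size."""
    return volume_size_gb * snapshot_count
```

Feeding it the first listing above should report 28 snapshots for XZY_ROOT-385, and `worst_case_usage_gb(30, 28)` gives the 840 GB quoted in the description. Actual usage depends on how the SR allocates snapshot space, so treat the number as an upper bound only.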



--
This message was sent by Atlassian JIRA
(v6.2#6252)