Posted to users@cloudstack.apache.org by li jerry <di...@hotmail.com> on 2019/09/06 16:52:34 UTC

4.13 rbd snapshot delete failed

Hello All

When I tested ACS 4.13 KVM + Ceph snapshots, I found that snapshots could be created and rolled back (using the API alone), but deletion could not be completed.



After executing the deletion API, the snapshot disappears from the Snapshots list, but the snapshot on the Ceph RBD image is not deleted (it still shows up in rbd snap list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c).



Is there any way we can completely delete the snapshot?

-Jerry
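
Until the CloudStack side removes the snapshot on the primary storage itself,
the leftover RBD snapshot can be cleaned up directly on Ceph, either with the
rbd CLI (rbd snap rm <pool>/<image>@<snap>, or rbd snap purge <pool>/<image> to
drop all of them) or programmatically through rados-java, the library used by
CloudStack's KVM plugin. Below is a minimal sketch of the latter; the pool
"rbd" and the image UUID are taken from the report above, while the monitor
address and cephx key are placeholders.

    import java.util.List;

    import com.ceph.rados.IoCTX;
    import com.ceph.rados.Rados;
    import com.ceph.rbd.Rbd;
    import com.ceph.rbd.RbdImage;
    import com.ceph.rbd.jna.RbdSnapInfo;

    // Sketch: remove every snapshot of one RBD image via rados-java.
    public class PurgeRbdSnapshots {
        public static void main(String[] args) throws Exception {
            Rados rados = new Rados("admin");                      // cephx user
            rados.confSet("mon_host", "ceph-mon.example.org");     // placeholder
            rados.confSet("key", "AQ...placeholder-key...==");     // placeholder
            rados.connect();

            IoCTX io = rados.ioCtxCreate("rbd");                   // pool from the report
            try {
                Rbd rbd = new Rbd(io);
                RbdImage image = rbd.open("ac510428-5d09-4e86-9d34-9dfab3715b7c");
                try {
                    List<RbdSnapInfo> snaps = image.snapList();
                    for (RbdSnapInfo snap : snaps) {
                        // Protected snapshots (e.g. parents of clones) must be
                        // unprotected before they can be removed.
                        if (image.snapIsProtected(snap.name)) {
                            image.snapUnprotect(snap.name);
                        }
                        image.snapRemove(snap.name);
                        System.out.println("Removed RBD snapshot " + snap.name);
                    }
                } finally {
                    rbd.close(image);
                }
            } finally {
                rados.ioCtxDestroy(io);
            }
        }
    }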


Re: 4.13 rbd snapshot delete failed

Posted by Andrija Panic <an...@gmail.com>.
Thx Gabriel - I've commented on the PR - needs some more love - but we're
almost there!

On Thu, 3 Oct 2019 at 20:46, Gabriel Beims Bräscher <ga...@gmail.com>
wrote:

> Hello folks,
>
> Just pinging that I have created PR
> https://github.com/apache/cloudstack/pull/3615 addressing the snapshot
> deletion issue #3586 (https://github.com/apache/cloudstack/issues/3586).
> Please, feel free to test and review.
>
> Regards,
> Gabriel.
>
> On Mon, 9 Sep 2019 at 12:08, Gabriel Beims Bräscher <
> gabrascher@gmail.com> wrote:
>
> > Thanks for the feedback Andrija and Andrei.
> >
> > I have opened issue #3590 for the snapshot rollback issue raised by
> > Andrija.
> > I will be investigating both issues:
> > - RBD snapshot Revert #3590 (
> > https://github.com/apache/cloudstack/issues/3590)
> > - RBD snapshot deletion #3586 (
> > https://github.com/apache/cloudstack/issues/3586)
> >
> > Cheers,
> > Gabriel
> >
> > On Mon, 9 Sep 2019 at 09:41, Andrei Mikhailovsky <
> andrei@arhont.com>
> > wrote:
> >
> >> A quick feedback from my side. I've never had a properly working delete
> >> snapshot with ceph. Every week or so I have to manually delete all ceph
> >> snapshots. However, the NFS secondary storage snapshots are deleted just
> >> fine. I've been using CloudStack for 5+ years and it was always the
> case. I
> >> am currently running 4.11.2 with ceph 13.2.6-1xenial.
> >>
> >> Andrei
> >>
> >> ----- Original Message -----
> >> > From: "Andrija Panic" <an...@gmail.com>
> >> > To: "Gabriel Beims Bräscher" <ga...@gmail.com>
> >> > Cc: "users" <us...@cloudstack.apache.org>, "dev" <
> >> dev@cloudstack.apache.org>
> >> > Sent: Sunday, 8 September, 2019 19:17:59
> >> > Subject: Re: 4.13 rbd snapshot delete failed
> >>
> >> > Thx Gabriel for extensive feedback.
> >> > Actually my ex company added the code to really delete an RBD snap back in
> >> > 2016 or so; it was part of 4.9 if I'm not mistaken. So I expect the code is
> >> > there, but probably some exception is happening, or there is a regression...
> >> >
> >> > Cheers
> >> >
> >> > On Sun, Sep 8, 2019, 09:31 Gabriel Beims Bräscher <
> gabrascher@gmail.com
> >> >
> >> > wrote:
> >> >
> >> >> Thanks for the feedback, Andrija. It looks like delete was not
> totally
> >> >> supported then (am I missing something?). I will take a look into
> this
> >> and
> >> >> open a PR adding proper support for rbd snapshot deletion if
> >> necessary.
> >> >>
> >> >> Regarding the rollback, I have tested it several times and it worked;
> >> >> however, I see a weak point on the Ceph rollback implementation.
> >> >>
> >> >> It looks like Li Jerry was able to execute the rollback without any
> >> >> problem. Li, could you please post here the log output: "Attempting to
> >> >> rollback RBD snapshot [name:%s], [pool:%s], [volumeid:%s],
> >> >> [snapshotid:%s]"? Andrija will not be able to see that log, as the
> >> >> exception happens prior to it; the only way for you to check those values
> >> >> is via remote debugging. If you are able to post those values, it would
> >> >> also help in sorting out what is wrong.
> >> >>
> >> >> I am checking the code base, running a few tests, and evaluating the
> >> log
> >> >> that you (Andrija) sent. What I can say for now is that it looks like the
> >> >> parameter "snapshotRelPath = snapshot.getPath()" [1] is a critical piece of
> >> >> code that can definitely break the rollback execution flow. My tests had
> >> >> pointed to a pattern, but now I see other possibilities. I will probably
> >> >> add a few parameters to the rollback/revert command instead of using the
> >> >> path, or review the path life-cycle and the different execution flows, in
> >> >> order to make it safer to use.
> >> >> [1]
> >> >>
> >>
> https://github.com/apache/cloudstack/blob/50fc045f366bd9769eba85c4bc3ecdc0b7035c11/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper
> >> >>
> >> >> A few details on the test environments and Ceph/RBD version:
> >> >> CloudStack, KVM, and Ceph nodes are running with Ubuntu 18.04
> >> >> Ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic
> >> >> (stable)
> >> >> RADOS Block Device (RBD) has snapshot rollback support since Ceph v10.0.2
> [
> >> >> https://github.com/ceph/ceph/pull/6878]
> >> >> Rados-java [https://github.com/ceph/rados-java] supports snapshot
> >> >> rollback since 0.5.0; rados-java 0.5.0 is the version used by
> >> CloudStack
> >> >> 4.13.0.0
> >> >>
> >> >> I will be updating here soon.
> >> >>
> >> >> On Sun, 8 Sep 2019 at 12:28, Wido den Hollander <
> wido@widodh.nl>
> >> >> wrote:
> >> >>
> >> >>>
> >> >>>
> >> >>> On 9/8/19 5:26 AM, Andrija Panic wrote:
> >> >>> > Maaany releases ago, deleting a Ceph volume snap was also only deleting
> >> >>> > it in the DB, so the RBD performance became terrible with many tens of
> >> >>> > (i.e. hourly) snapshots. I'll try to verify this on 4.13 myself, but Wido
> >> >>> > and the guys will know better...
> >> >>>
> >> >>> I pinged Gabriel and he's looking into it. He'll get back to it.
> >> >>>
> >> >>> Wido
> >> >>>
> >> >>> >
> >> >>> > I
> >> >>> >
> >> >>> > On Sat, Sep 7, 2019, 08:34 li jerry <di...@hotmail.com> wrote:
> >> >>> >
> >> >>> >> I found it had nothing to do with  storage.cleanup.delay and
> >> >>> >> storage.cleanup.interval.
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >> The reason is that when DeleteSnapshotCmd is executed, because the RBD
> >> >>> >> snapshot has no copy on secondary storage, it only changes the database
> >> >>> >> information and never goes to the primary storage to delete the
> >> >>> >> snapshot.
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >> Log===========================
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
> >> >>> >> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===
> >> >>> 192.168.254.3
> >> >>> >> -- GET
> >> >>> >>
> >> >>>
> >>
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
> >> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
> CIDRs
> >> from
> >> >>> >> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]'
> is
> >> >>> allowed
> >> >>> >> to perform API calls: 0.0.0.0/0,::/0
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
> >> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
> >> Retrieved
> >> >>> >> cmdEventType from job info: SNAPSHOT.DELETE
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> >> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add
> >> >>> job-1378
> >> >>> >> into job monitoring
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> >> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
> submit
> >> >>> async
> >> >>> >> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
> >> >>> >> instanceType: Snapshot, instanceId: 13, cmd:
> >> >>> >>
> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
> >> >>> cmdInfo:
> >> >>> >>
> >> >>>
> >>
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> >> >>> >>
> >> >>>
> >>
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> >> >>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
> >> 0,
> >> >>> >> result: null, initMsid: 2200502468634, completeMsid: null,
> >> lastUpdated:
> >> >>> >> null, lastPolled: null, created: null, removed: null}
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> >> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097)
> >> Executing
> >> >>> >> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType:
> >> Snapshot,
> >> >>> >> instanceId: 13, cmd:
> >> >>> >>
> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
> >> >>> cmdInfo:
> >> >>> >>
> >> >>>
> >>
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> >> >>> >>
> >> >>>
> >>
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> >> >>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
> >> 0,
> >> >>> >> result: null, initMsid: 2200502468634, completeMsid: null,
> >> lastUpdated:
> >> >>> >> null, lastPolled: null, created: null, removed: null}
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
> >> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
> >> ===END===
> >> >>> >> 192.168.254.3 -- GET
> >> >>> >>
> >> >>>
> >>
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
> >> >>> >> (AgentManager-Handler-12:null) (logid:) Seq
> 1-8660140608456756853:
> >> >>> Routing
> >> >>> >> from 2199066247173
> >> >>> >>
> >> >>> >> 2019-09-07 23:27:00,305 DEBUG
> [o.a.c.s.s.XenserverSnapshotStrategy]
> >> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4)
> >> >>> (logid:1cee5097)
> >> >>> >> Can't find snapshot on backup storage, delete it in db
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >> -Jerry
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >> ________________________________
> >> >>> >> From: Andrija Panic <an...@gmail.com>
> >> >>> >> Sent: Saturday, September 7, 2019 1:07:19 AM
> >> >>> >> To: users <us...@cloudstack.apache.org>
> >> >>> >> Cc: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
> >> >>> >> Subject: Re: 4.13 rbd snapshot delete failed
> >> >>> >>
> >> >>> >> storage.cleanup.delay
> >> >>> >> storage.cleanup.interval
> >> >>> >>
> >> >>> >> put both to 60 (seconds) and wait for up to 2min - should be
> >> deleted
> >> >>> just
> >> >>> >> fine...
> >> >>> >>
> >> >>> >> cheers
> >> >>> >>
> >> >>> >> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com>
> wrote:
> >> >>> >>
> >> >>> >>> Hello All
> >> >>> >>>
> >> >>> >>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that
> snapshots
> >> >>> could
> >> >>> >>> be created and rolled back (using API alone), but deletion could
> >> not
> >> >>> be
> >> >>> >>> completed.
> >> >>> >>>
> >> >>> >>>
> >> >>> >>>
> >> >>> >>> After executing the deletion API, the snapshot will disappear
> >> from the
> >> >>> >>> list Snapshots, but the snapshot on CEPH RBD will not be deleted
> >> (rbd
> >> >>> >> snap
> >> >>> >>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
> >> >>> >>>
> >> >>> >>>
> >> >>> >>>
> >> >>> >>> Is there any way we can completely delete the snapshot?
> >> >>> >>>
> >> >>> >>> -Jerry
> >> >>> >>>
> >> >>> >>>
> >> >>> >>
> >> >>> >> --
> >> >>> >>
> >> >>> >> Andrija Panić
> >> >>> >>
> >> >>> >
> >> >>>
> >>
> >
>


-- 

Andrija Panić

Re: 4.13 rbd snapshot delete failed

Posted by Gabriel Beims Bräscher <ga...@gmail.com>.
Hello folks,

Just pinging that I have created PR
https://github.com/apache/cloudstack/pull/3615 addressing the snapshot
deletion issue #3586 (https://github.com/apache/cloudstack/issues/3586).
Please, feel free to test and review.

Regards,
Gabriel.
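
A quick way to test the PR is to delete a snapshot through the API and then
confirm on Ceph that it is really gone from the RBD image, either with the
"rbd snap list" command from the original report or with a small rados-java
check like the sketch below (monitor address and cephx key are placeholders;
pool and image UUID are the ones from this thread).

    import java.util.List;

    import com.ceph.rados.IoCTX;
    import com.ceph.rados.Rados;
    import com.ceph.rbd.Rbd;
    import com.ceph.rbd.RbdImage;
    import com.ceph.rbd.jna.RbdSnapInfo;

    // Sketch: list the snapshots still present on one RBD image, to verify
    // that a deleteSnapshot call actually reached the primary storage.
    public class ListRbdSnapshots {
        public static void main(String[] args) throws Exception {
            Rados rados = new Rados("admin");
            rados.confSet("mon_host", "ceph-mon.example.org");     // placeholder
            rados.confSet("key", "AQ...placeholder-key...==");     // placeholder
            rados.connect();
            IoCTX io = rados.ioCtxCreate("rbd");
            Rbd rbd = new Rbd(io);
            RbdImage image = rbd.open("ac510428-5d09-4e86-9d34-9dfab3715b7c");
            List<RbdSnapInfo> snaps = image.snapList();
            if (snaps.isEmpty()) {
                System.out.println("No snapshots left on the RBD image.");
            } else {
                for (RbdSnapInfo snap : snaps) {
                    System.out.println("Still present on RBD: " + snap.name);
                }
            }
            rbd.close(image);
            rados.ioCtxDestroy(io);
        }
    }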

On Mon, 9 Sep 2019 at 12:08, Gabriel Beims Bräscher <
gabrascher@gmail.com> wrote:

> Thanks for the feedback Andrija and Andrei.
>
> I have opened issue #3590 for the snapshot rollback issue raised by
> Andrija.
> I will be investigating both issues:
> - RBD snapshot Revert #3590 (
> https://github.com/apache/cloudstack/issues/3590)
> - RBD snapshot deletion #3586 (
> https://github.com/apache/cloudstack/issues/3586)
>
> Cheers,
> Gabriel
>
> On Mon, 9 Sep 2019 at 09:41, Andrei Mikhailovsky <an...@arhont.com>
> wrote:
>
>> A quick feedback from my side. I've never had a properly working delete
>> snapshot with ceph. Every week or so I have to manually delete all ceph
>> snapshots. However, the NFS secondary storage snapshots are deleted just
>> fine. I've been using CloudStack for 5+ years and it was always the case. I
>> am currently running 4.11.2 with ceph 13.2.6-1xenial.
>>
>> Andrei
>>
>> ----- Original Message -----
>> > From: "Andrija Panic" <an...@gmail.com>
>> > To: "Gabriel Beims Bräscher" <ga...@gmail.com>
>> > Cc: "users" <us...@cloudstack.apache.org>, "dev" <
>> dev@cloudstack.apache.org>
>> > Sent: Sunday, 8 September, 2019 19:17:59
>> > Subject: Re: 4.13 rbd snapshot delete failed
>>
>> > Thx Gabriel for extensive feedback.
>> > Actually my ex company added the code to really delete an RBD snap back in
>> > 2016 or so; it was part of 4.9 if I'm not mistaken. So I expect the code is
>> > there, but probably some exception is happening, or there is a regression...
>> >
>> > Cheers
>> >
>> > On Sun, Sep 8, 2019, 09:31 Gabriel Beims Bräscher <gabrascher@gmail.com
>> >
>> > wrote:
>> >
>> >> Thanks for the feedback, Andrija. It looks like delete was not totally
>> >> supported then (am I missing something?). I will take a look into this
>> and
>> >> open a PR adding proper support for rbd snapshot deletion if
>> necessary.
>> >>
>> >> Regarding the rollback, I have tested it several times and it worked;
>> >> however, I see a weak point on the Ceph rollback implementation.
>> >>
>> >> It looks like Li Jerry was able to execute the rollback without any
>> >> problem. Li, could you please post here the log output: "Attempting to
>> >> rollback RBD snapshot [name:%s], [pool:%s], [volumeid:%s],
>> >> [snapshotid:%s]"? Andrija will not be able to see that log, as the
>> >> exception happens prior to it; the only way for you to check those values
>> >> is via remote debugging. If you are able to post those values, it would
>> >> also help in sorting out what is wrong.
>> >>
>> >> I am checking the code base, running a few tests, and evaluating the
>> log
>> >> that you (Andrija) sent. What I can say for now is that it looks like the
>> >> parameter "snapshotRelPath = snapshot.getPath()" [1] is a critical piece of
>> >> code that can definitely break the rollback execution flow. My tests had
>> >> pointed to a pattern, but now I see other possibilities. I will probably
>> >> add a few parameters to the rollback/revert command instead of using the
>> >> path, or review the path life-cycle and the different execution flows, in
>> >> order to make it safer to use.
>> >> [1]
>> >>
>> https://github.com/apache/cloudstack/blob/50fc045f366bd9769eba85c4bc3ecdc0b7035c11/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper
>> >>
>> >> A few details on the test environments and Ceph/RBD version:
>> >> CloudStack, KVM, and Ceph nodes are running with Ubuntu 18.04
>> >> Ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic
>> >> (stable)
>> >> RADOS Block Device (RBD) has snapshot rollback support since Ceph v10.0.2 [
>> >> https://github.com/ceph/ceph/pull/6878]
>> >> Rados-java [https://github.com/ceph/rados-java] supports snapshot
>> >> rollback since 0.5.0; rados-java 0.5.0 is the version used by
>> CloudStack
>> >> 4.13.0.0
>> >>
>> >> I will be updating here soon.
>> >>
>> >> On Sun, 8 Sep 2019 at 12:28, Wido den Hollander <wi...@widodh.nl>
>> >> wrote:
>> >>
>> >>>
>> >>>
>> >>> On 9/8/19 5:26 AM, Andrija Panic wrote:
>> >>> > Maaany releases ago, deleting a Ceph volume snap was also only deleting
>> >>> > it in the DB, so the RBD performance became terrible with many tens of
>> >>> > (i.e. hourly) snapshots. I'll try to verify this on 4.13 myself, but Wido
>> >>> > and the guys will know better...
>> >>>
>> >>> I pinged Gabriel and he's looking into it. He'll get back to it.
>> >>>
>> >>> Wido
>> >>>
>> >>> >
>> >>> > I
>> >>> >
>> >>> > On Sat, Sep 7, 2019, 08:34 li jerry <di...@hotmail.com> wrote:
>> >>> >
>> >>> >> I found it had nothing to do with  storage.cleanup.delay and
>> >>> >> storage.cleanup.interval.
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> The reason is that when DeleteSnapshotCmd is executed, because the RBD
>> >>> >> snapshot has no copy on secondary storage, it only changes the database
>> >>> >> information and never goes to the primary storage to delete the
>> >>> >> snapshot.
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> Log===========================
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
>> >>> >> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===
>> >>> 192.168.254.3
>> >>> >> -- GET
>> >>> >>
>> >>>
>> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
>> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs
>> from
>> >>> >> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is
>> >>> allowed
>> >>> >> to perform API calls: 0.0.0.0/0,::/0
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
>> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
>> Retrieved
>> >>> >> cmdEventType from job info: SNAPSHOT.DELETE
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add
>> >>> job-1378
>> >>> >> into job monitoring
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit
>> >>> async
>> >>> >> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
>> >>> >> instanceType: Snapshot, instanceId: 13, cmd:
>> >>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
>> >>> cmdInfo:
>> >>> >>
>> >>>
>> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
>> >>> >>
>> >>>
>> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
>> >>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
>> 0,
>> >>> >> result: null, initMsid: 2200502468634, completeMsid: null,
>> lastUpdated:
>> >>> >> null, lastPolled: null, created: null, removed: null}
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097)
>> Executing
>> >>> >> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType:
>> Snapshot,
>> >>> >> instanceId: 13, cmd:
>> >>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
>> >>> cmdInfo:
>> >>> >>
>> >>>
>> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
>> >>> >>
>> >>>
>> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
>> >>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode:
>> 0,
>> >>> >> result: null, initMsid: 2200502468634, completeMsid: null,
>> lastUpdated:
>> >>> >> null, lastPolled: null, created: null, removed: null}
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
>> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
>> ===END===
>> >>> >> 192.168.254.3 -- GET
>> >>> >>
>> >>>
>> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> >>> >> (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853:
>> >>> Routing
>> >>> >> from 2199066247173
>> >>> >>
>> >>> >> 2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy]
>> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4)
>> >>> (logid:1cee5097)
>> >>> >> Can't find snapshot on backup storage, delete it in db
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> -Jerry
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> ________________________________
>> >>> >> From: Andrija Panic <an...@gmail.com>
>> >>> >> Sent: Saturday, September 7, 2019 1:07:19 AM
>> >>> >> To: users <us...@cloudstack.apache.org>
>> >>> >> Cc: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
>> >>> >> Subject: Re: 4.13 rbd snapshot delete failed
>> >>> >>
>> >>> >> storage.cleanup.delay
>> >>> >> storage.cleanup.interval
>> >>> >>
>> >>> >> put both to 60 (seconds) and wait for up to 2min - should be
>> deleted
>> >>> just
>> >>> >> fine...
>> >>> >>
>> >>> >> cheers
>> >>> >>
>> >>> >> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
>> >>> >>
>> >>> >>> Hello All
>> >>> >>>
>> >>> >>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots
>> >>> could
>> >>> >>> be created and rolled back (using API alone), but deletion could
>> not
>> >>> be
>> >>> >>> completed.
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>> After executing the deletion API, the snapshot will disappear
>> from the
>> >>> >>> list Snapshots, but the snapshot on CEPH RBD will not be deleted
>> (rbd
>> >>> >> snap
>> >>> >>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>> Is there any way we can completely delete the snapshot?
>> >>> >>>
>> >>> >>> -Jerry
>> >>> >>>
>> >>> >>>
>> >>> >>
>> >>> >> --
>> >>> >>
>> >>> >> Andrija Panić
>> >>> >>
>> >>> >
>> >>>
>>
>

Re: 4.13 rbd snapshot delete failed

Posted by Gabriel Beims Bräscher <ga...@gmail.com>.
Thanks for the feedback Andrija and Andrei.

I have opened issue #3590 for the snapshot rollback issue raised by
Andrija.
I will be investigating both issues:
- RBD snapshot Revert #3590 (
https://github.com/apache/cloudstack/issues/3590)
- RBD snapshot deletion #3586 (
https://github.com/apache/cloudstack/issues/3586)

Cheers,
Gabriel
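
For the revert side (#3590), the underlying capability is available: with
rados-java 0.5.0, the version shipped with CloudStack 4.13.0.0 as noted in the
quoted message below, an RBD snapshot can be rolled back from Java roughly as
in this sketch. The exact name of the rollback method on RbdImage is written
here as snapRollBack and should be treated as an assumption, as should the
connection details and the snapshot name.

    import com.ceph.rados.IoCTX;
    import com.ceph.rados.Rados;
    import com.ceph.rbd.Rbd;
    import com.ceph.rbd.RbdImage;

    // Sketch: roll an RBD image back to a named snapshot via rados-java.
    // snapRollBack is an assumed method name; adapt to the actual API.
    public class RollbackRbdSnapshot {
        public static void main(String[] args) throws Exception {
            Rados rados = new Rados("admin");
            rados.confSet("mon_host", "ceph-mon.example.org");     // placeholder
            rados.confSet("key", "AQ...placeholder-key...==");     // placeholder
            rados.connect();
            IoCTX io = rados.ioCtxCreate("rbd");
            Rbd rbd = new Rbd(io);
            RbdImage image = rbd.open("ac510428-5d09-4e86-9d34-9dfab3715b7c");
            // The volume should not be in active use while rolling back.
            image.snapRollBack("example-snapshot-name");            // assumed method name
            rbd.close(image);
            rados.ioCtxDestroy(io);
        }
    }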

On Mon, 9 Sep 2019 at 09:41, Andrei Mikhailovsky <an...@arhont.com>
wrote:

> A quick feedback from my side. I've never had a properly working delete
> snapshot with ceph. Every week or so I have to manually delete all ceph
> snapshots. However, the NFS secondary storage snapshots are deleted just
> fine. I've been using CloudStack for 5+ years and it was always the case. I
> am currently running 4.11.2 with ceph 13.2.6-1xenial.
>
> Andrei
>
> ----- Original Message -----
> > From: "Andrija Panic" <an...@gmail.com>
> > To: "Gabriel Beims Bräscher" <ga...@gmail.com>
> > Cc: "users" <us...@cloudstack.apache.org>, "dev" <
> dev@cloudstack.apache.org>
> > Sent: Sunday, 8 September, 2019 19:17:59
> > Subject: Re: 4.13 rbd snapshot delete failed
>
> > Thx Gabriel for extensive feedback.
> > Actually my ex company added the code to really delete an RBD snap back in
> > 2016 or so; it was part of 4.9 if I'm not mistaken. So I expect the code is
> > there, but probably some exception is happening, or there is a regression...
> >
> > Cheers
> >
> > On Sun, Sep 8, 2019, 09:31 Gabriel Beims Bräscher <ga...@gmail.com>
> > wrote:
> >
> >> Thanks for the feedback, Andrija. It looks like delete was not totally
> >> supported then (am I missing something?). I will take a look into this
> and
> >> open a PR adding proper support for rbd snapshot deletion if necessary.
> >>
> >> Regarding the rollback, I have tested it several times and it worked;
> >> however, I see a weak point on the Ceph rollback implementation.
> >>
> >> It looks like Li Jerry was able to execute the rollback without any
> >> problem. Li, could you please post here the log output: "Attempting to
> >> rollback RBD snapshot [name:%s], [pool:%s], [volumeid:%s],
> >> [snapshotid:%s]"? Andrija will not be able to see that log, as the
> >> exception happens prior to it; the only way for you to check those values
> >> is via remote debugging. If you are able to post those values, it would
> >> also help in sorting out what is wrong.
> >>
> >> I am checking the code base, running a few tests, and evaluating the log
> >> that you (Andrija) sent. What I can say for now is that it looks like the
> >> parameter "snapshotRelPath = snapshot.getPath()" [1] is a critical piece of
> >> code that can definitely break the rollback execution flow. My tests had
> >> pointed to a pattern, but now I see other possibilities. I will probably
> >> add a few parameters to the rollback/revert command instead of using the
> >> path, or review the path life-cycle and the different execution flows, in
> >> order to make it safer to use.
> >> [1]
> >>
> https://github.com/apache/cloudstack/blob/50fc045f366bd9769eba85c4bc3ecdc0b7035c11/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper
> >>
> >> A few details on the test environments and Ceph/RBD version:
> >> CloudStack, KVM, and Ceph nodes are running with Ubuntu 18.04
> >> Ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic
> >> (stable)
> >> RADOS Block Device (RBD) has snapshot rollback support since Ceph v10.0.2 [
> >> https://github.com/ceph/ceph/pull/6878]
> >> Rados-java [https://github.com/ceph/rados-java] supports snapshot
> >> rollback since 0.5.0; rados-java 0.5.0 is the version used by CloudStack
> >> 4.13.0.0
> >>
> >> I will be updating here soon.
> >>
> >> On Sun, 8 Sep 2019 at 12:28, Wido den Hollander <wi...@widodh.nl>
> >> wrote:
> >>
> >>>
> >>>
> >>> On 9/8/19 5:26 AM, Andrija Panic wrote:
> >>> > Maaany releases ago, deleting a Ceph volume snap was also only deleting
> >>> > it in the DB, so the RBD performance became terrible with many tens of
> >>> > (i.e. hourly) snapshots. I'll try to verify this on 4.13 myself, but Wido
> >>> > and the guys will know better...
> >>>
> >>> I pinged Gabriel and he's looking into it. He'll get back to it.
> >>>
> >>> Wido
> >>>
> >>> >
> >>> > I
> >>> >
> >>> > On Sat, Sep 7, 2019, 08:34 li jerry <di...@hotmail.com> wrote:
> >>> >
> >>> >> I found it had nothing to do with  storage.cleanup.delay and
> >>> >> storage.cleanup.interval.
> >>> >>
> >>> >>
> >>> >>
> >>> >> The reason is that when DeleteSnapshotCmd is executed, because the RBD
> >>> >> snapshot has no copy on secondary storage, it only changes the database
> >>> >> information and never goes to the primary storage to delete the
> >>> >> snapshot.
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> Log===========================
> >>> >>
> >>> >>
> >>> >>
> >>> >> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
> >>> >> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===
> >>> 192.168.254.3
> >>> >> -- GET
> >>> >>
> >>>
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
> >>> >>
> >>> >> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs
> from
> >>> >> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is
> >>> allowed
> >>> >> to perform API calls: 0.0.0.0/0,::/0
> >>> >>
> >>> >> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
> Retrieved
> >>> >> cmdEventType from job info: SNAPSHOT.DELETE
> >>> >>
> >>> >> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add
> >>> job-1378
> >>> >> into job monitoring
> >>> >>
> >>> >> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit
> >>> async
> >>> >> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
> >>> >> instanceType: Snapshot, instanceId: 13, cmd:
> >>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
> >>> cmdInfo:
> >>> >>
> >>>
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> >>> >>
> >>>
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> >>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> >>> >> result: null, initMsid: 2200502468634, completeMsid: null,
> lastUpdated:
> >>> >> null, lastPolled: null, created: null, removed: null}
> >>> >>
> >>> >> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097)
> Executing
> >>> >> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType:
> Snapshot,
> >>> >> instanceId: 13, cmd:
> >>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
> >>> cmdInfo:
> >>> >>
> >>>
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> >>> >>
> >>>
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> >>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> >>> >> result: null, initMsid: 2200502468634, completeMsid: null,
> lastUpdated:
> >>> >> null, lastPolled: null, created: null, removed: null}
> >>> >>
> >>> >> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
> ===END===
> >>> >> 192.168.254.3 -- GET
> >>> >>
> >>>
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
> >>> >>
> >>> >> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
> >>> >> (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853:
> >>> Routing
> >>> >> from 2199066247173
> >>> >>
> >>> >> 2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy]
> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4)
> >>> (logid:1cee5097)
> >>> >> Can't find snapshot on backup storage, delete it in db
> >>> >>
> >>> >>
> >>> >>
> >>> >> -Jerry
> >>> >>
> >>> >>
> >>> >>
> >>> >> ________________________________
> >>> >> From: Andrija Panic <an...@gmail.com>
> >>> >> Sent: Saturday, September 7, 2019 1:07:19 AM
> >>> >> To: users <us...@cloudstack.apache.org>
> >>> >> Cc: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
> >>> >> Subject: Re: 4.13 rbd snapshot delete failed
> >>> >>
> >>> >> storage.cleanup.delay
> >>> >> storage.cleanup.interval
> >>> >>
> >>> >> put both to 60 (seconds) and wait for up to 2min - should be deleted
> >>> just
> >>> >> fine...
> >>> >>
> >>> >> cheers
> >>> >>
> >>> >> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
> >>> >>
> >>> >>> Hello All
> >>> >>>
> >>> >>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots
> >>> could
> >>> >>> be created and rolled back (using API alone), but deletion could
> not
> >>> be
> >>> >>> completed.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> After executing the deletion API, the snapshot will disappear from
> the
> >>> >>> list Snapshots, but the snapshot on CEPH RBD will not be deleted
> (rbd
> >>> >> snap
> >>> >>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> Is there any way we can completely delete the snapshot?
> >>> >>>
> >>> >>> -Jerry
> >>> >>>
> >>> >>>
> >>> >>
> >>> >> --
> >>> >>
> >>> >> Andrija Panić
> >>> >>
> >>> >
> >>>
>

> >>> >>
> >>> >> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add
> >>> job-1378
> >>> >> into job monitoring
> >>> >>
> >>> >> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit
> >>> async
> >>> >> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
> >>> >> instanceType: Snapshot, instanceId: 13, cmd:
> >>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
> >>> cmdInfo:
> >>> >>
> >>>
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> >>> >>
> >>>
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> >>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> >>> >> result: null, initMsid: 2200502468634, completeMsid: null,
> lastUpdated:
> >>> >> null, lastPolled: null, created: null, removed: null}
> >>> >>
> >>> >> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097)
> Executing
> >>> >> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType:
> Snapshot,
> >>> >> instanceId: 13, cmd:
> >>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
> >>> cmdInfo:
> >>> >>
> >>>
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> >>> >>
> >>>
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> >>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> >>> >> result: null, initMsid: 2200502468634, completeMsid: null,
> lastUpdated:
> >>> >> null, lastPolled: null, created: null, removed: null}
> >>> >>
> >>> >> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
> >>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8)
> ===END===
> >>> >> 192.168.254.3 -- GET
> >>> >>
> >>>
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
> >>> >>
> >>> >> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
> >>> >> (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853:
> >>> Routing
> >>> >> from 2199066247173
> >>> >>
> >>> >> 2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy]
> >>> >> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4)
> >>> (logid:1cee5097)
> >>> >> Can't find snapshot on backup storage, delete it in db
> >>> >>
> >>> >>
> >>> >>
> >>> >> -Jerry
> >>> >>
> >>> >>
> >>> >>
> >>> >> ________________________________
> >>> >> 发件人: Andrija Panic <an...@gmail.com>
> >>> >> 发送时间: Saturday, September 7, 2019 1:07:19 AM
> >>> >> 收件人: users <us...@cloudstack.apache.org>
> >>> >> 抄送: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
> >>> >> 主题: Re: 4.13 rbd snapshot delete failed
> >>> >>
> >>> >> storage.cleanup.delay
> >>> >> storage.cleanup.interval
> >>> >>
> >>> >> put both to 60 (seconds) and wait for up to 2min - should be deleted
> >>> just
> >>> >> fine...
> >>> >>
> >>> >> cheers
> >>> >>
> >>> >> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
> >>> >>
> >>> >>> Hello All
> >>> >>>
> >>> >>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots
> >>> could
> >>> >>> be created and rolled back (using API alone), but deletion could
> not
> >>> be
> >>> >>> completed.
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> After executing the deletion API, the snapshot will disappear from
> the
> >>> >>> list Snapshots, but the snapshot on CEPH RBD will not be deleted
> (rbd
> >>> >> snap
> >>> >>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> Is there any way we can completely delete the snapshot?
> >>> >>>
> >>> >>> -Jerry
> >>> >>>
> >>> >>>
> >>> >>
> >>> >> --
> >>> >>
> >>> >> Andrija Panić
> >>> >>
> >>> >
> >>>
>

Re: 4.13 rbd snapshot delete failed

Posted by Andrei Mikhailovsky <an...@arhont.com.INVALID>.
Quick feedback from my side: I've never had snapshot deletion work properly with Ceph. Every week or so I have to manually delete all Ceph snapshots; the NFS secondary storage snapshots, however, are deleted just fine. I've been using CloudStack for 5+ years and it has always been the case. I am currently running 4.11.2 with Ceph 13.2.6-1xenial.
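
For what it's worth, this kind of weekly clean-up can also be scripted
against the cluster directly, using the same rados-java bindings that
CloudStack ships. A minimal sketch, with placeholder connection details and
the image name from this thread -- only unprotected snapshots are touched,
and this is an illustration rather than tested tooling:

    import java.util.List;

    import com.ceph.rados.IoCTX;
    import com.ceph.rados.Rados;
    import com.ceph.rbd.Rbd;
    import com.ceph.rbd.RbdImage;
    import com.ceph.rbd.jna.RbdSnapInfo;

    public class RbdSnapCleanup {
        public static void main(String[] args) throws Exception {
            Rados rados = new Rados("admin");              // cephx user (placeholder)
            rados.confSet("mon_host", "mon1.example.com"); // placeholder monitor
            rados.confSet("key", "<cephx-secret>");        // placeholder cephx key
            rados.connect();
            IoCTX io = rados.ioCtxCreate("rbd");           // pool name (placeholder)
            try {
                Rbd rbd = new Rbd(io);
                RbdImage image = rbd.open("ac510428-5d09-4e86-9d34-9dfab3715b7c");
                try {
                    List<RbdSnapInfo> snaps = image.snapList();
                    for (RbdSnapInfo snap : snaps) {
                        // Skip protected snapshots (e.g. template base images).
                        if (!image.snapIsProtected(snap.name)) {
                            image.snapRemove(snap.name);
                        }
                    }
                } finally {
                    rbd.close(image);
                }
            } finally {
                rados.ioCtxDestroy(io);
            }
        }
    }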

Andrei

----- Original Message -----
> From: "Andrija Panic" <an...@gmail.com>
> To: "Gabriel Beims Bräscher" <ga...@gmail.com>
> Cc: "users" <us...@cloudstack.apache.org>, "dev" <de...@cloudstack.apache.org>
> Sent: Sunday, 8 September, 2019 19:17:59
> Subject: Re: 4.13 rbd snapshot delete failed

> Thx Gabriel for extensive feedback.
> Actually my ex company added the code to really delete a RBD snap back in
> 2016 or so, was part of 4.9 if not mistaken. So I expect the code is there,
> but probably some exception is happening or regression...
> 
> Cheers
> 
> On Sun, Sep 8, 2019, 09:31 Gabriel Beims Bräscher <ga...@gmail.com>
> wrote:
> 
>> Thanks for the feedback, Andrija. It looks like delete was not totally
>> supported then (am I missing something?). I will take a look into this and
>> open a PR adding propper support for rbd snapshot deletion if necessary.
>>
>> Regarding the rollback, I have tested it several times and it worked;
>> however, I see a weak point on the Ceph rollback implementation.
>>
>> It looks like Li Jerry was able to execute the rollback without any
>> problem. Li, could you please post here  the log output: "Attempting to
>> rollback RBD snapshot [name:%s], [pool:%s], [volumeid:%s],
>> [snapshotid:%s]"? Andrija will not be able to see that log as the exception
>> happen prior to it, the only way of you checking those values is via remote
>> debugging. If you be able to post those values it would help as well on
>> sorting out what is wrong.
>>
>> I am checking the code base, running a few tests, and evaluating the log
>> that you (Andrija) sent. What I can say for now is that it looks that the
>> parameter "snapshotRelPath = snapshot.getPath()" [1] is a critical piece of
>> code that can definitely break the rollback execution flow. My tests had
>> pointed for a pattern but now I see other possibilities. I will probably
>> add a few parameters on the rollback/revert command instead of using the
>> path or review the path life-cycle and different execution flows in order
>> to keep it safer to be used.
>> [1]
>> https://github.com/apache/cloudstack/blob/50fc045f366bd9769eba85c4bc3ecdc0b7035c11/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper
>>
>> A few details on the test environments and Ceph/RBD version:
>> CloudStack, KVM, and Ceph nodes are running with Ubuntu 18.04
>> Ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic
>> (stable)
>> RADOS Block Devices has snapshot rollback support since Ceph v10.0.2 [
>> https://github.com/ceph/ceph/pull/6878]
>> Rados-java [https://github.com/ceph/rados-java] supports snapshot
>> rollback since 0.5.0; rados-java 0.5.0 is the version used by CloudStack
>> 4.13.0.0
>>
>> I will be updating here soon.
>>
>> Em dom, 8 de set de 2019 às 12:28, Wido den Hollander <wi...@widodh.nl>
>> escreveu:
>>
>>>
>>>
>>> On 9/8/19 5:26 AM, Andrija Panic wrote:
>>> > Maaany release ago, deleting Ceph volume snap, was also only deleting
>>> it in
>>> > DB, so the RBD performance become terrible with many tens of (i. e.
>>> Hourly)
>>> > snapshots. I'll try to verify this on 4.13 myself, but Wido and the guys
>>> > will know better...
>>>
>>> I pinged Gabriel and he's looking into it. He'll get back to it.
>>>
>>> Wido
>>>
>>> >
>>> > I
>>> >
>>> > On Sat, Sep 7, 2019, 08:34 li jerry <di...@hotmail.com> wrote:
>>> >
>>> >> I found it had nothing to do with  storage.cleanup.delay and
>>> >> storage.cleanup.interval.
>>> >>
>>> >>
>>> >>
>>> >> The reason is that when DeleteSnapshot Cmd is executed, because the RBD
>>> >> snapshot does not have Copy to secondary storage, it only changes the
>>> >> database information, and does not enter the main storage to delete the
>>> >> snapshot.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Log===========================
>>> >>
>>> >>
>>> >>
>>> >> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
>>> >> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===
>>> 192.168.254.3
>>> >> -- GET
>>> >>
>>> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>>> >>
>>> >> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
>>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs from
>>> >> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is
>>> allowed
>>> >> to perform API calls: 0.0.0.0/0,::/0
>>> >>
>>> >> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
>>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) Retrieved
>>> >> cmdEventType from job info: SNAPSHOT.DELETE
>>> >>
>>> >> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add
>>> job-1378
>>> >> into job monitoring
>>> >>
>>> >> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit
>>> async
>>> >> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
>>> >> instanceType: Snapshot, instanceId: 13, cmd:
>>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
>>> cmdInfo:
>>> >>
>>> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
>>> >>
>>> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
>>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>> >> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
>>> >> null, lastPolled: null, created: null, removed: null}
>>> >>
>>> >> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097) Executing
>>> >> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType: Snapshot,
>>> >> instanceId: 13, cmd:
>>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
>>> cmdInfo:
>>> >>
>>> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
>>> >>
>>> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
>>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>>> >> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
>>> >> null, lastPolled: null, created: null, removed: null}
>>> >>
>>> >> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
>>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) ===END===
>>> >> 192.168.254.3 -- GET
>>> >>
>>> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>>> >>
>>> >> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
>>> >> (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853:
>>> Routing
>>> >> from 2199066247173
>>> >>
>>> >> 2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy]
>>> >> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4)
>>> (logid:1cee5097)
>>> >> Can't find snapshot on backup storage, delete it in db
>>> >>
>>> >>
>>> >>
>>> >> -Jerry
>>> >>
>>> >>
>>> >>
>>> >> ________________________________
>>> >> 发件人: Andrija Panic <an...@gmail.com>
>>> >> 发送时间: Saturday, September 7, 2019 1:07:19 AM
>>> >> 收件人: users <us...@cloudstack.apache.org>
>>> >> 抄送: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
>>> >> 主题: Re: 4.13 rbd snapshot delete failed
>>> >>
>>> >> storage.cleanup.delay
>>> >> storage.cleanup.interval
>>> >>
>>> >> put both to 60 (seconds) and wait for up to 2min - should be deleted
>>> just
>>> >> fine...
>>> >>
>>> >> cheers
>>> >>
>>> >> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
>>> >>
>>> >>> Hello All
>>> >>>
>>> >>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots
>>> could
>>> >>> be created and rolled back (using API alone), but deletion could not
>>> be
>>> >>> completed.
>>> >>>
>>> >>>
>>> >>>
>>> >>> After executing the deletion API, the snapshot will disappear from the
>>> >>> list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd
>>> >> snap
>>> >>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
>>> >>>
>>> >>>
>>> >>>
>>> >>> Is there any way we can completely delete the snapshot?
>>> >>>
>>> >>> -Jerry
>>> >>>
>>> >>>
>>> >>
>>> >> --
>>> >>
>>> >> Andrija Panić
>>> >>
>>> >
>>>

Re: 4.13 rbd snapshot delete failed

Posted by Andrija Panic <an...@gmail.com>.
Thx Gabriel for the extensive feedback.
Actually, my ex-company added the code to really delete an RBD snap back in
2016 or so; it was part of 4.9 if I'm not mistaken. So I expect the code is
there, but probably some exception is happening or there has been a
regression...

Cheers

On Sun, Sep 8, 2019, 09:31 Gabriel Beims Bräscher <ga...@gmail.com>
wrote:

> Thanks for the feedback, Andrija. It looks like delete was not totally
> supported then (am I missing something?). I will take a look into this and
> open a PR adding propper support for rbd snapshot deletion if necessary.
>
> Regarding the rollback, I have tested it several times and it worked;
> however, I see a weak point on the Ceph rollback implementation.
>
> It looks like Li Jerry was able to execute the rollback without any
> problem. Li, could you please post here  the log output: "Attempting to
> rollback RBD snapshot [name:%s], [pool:%s], [volumeid:%s],
> [snapshotid:%s]"? Andrija will not be able to see that log as the exception
> happen prior to it, the only way of you checking those values is via remote
> debugging. If you be able to post those values it would help as well on
> sorting out what is wrong.
>
> I am checking the code base, running a few tests, and evaluating the log
> that you (Andrija) sent. What I can say for now is that it looks that the
> parameter "snapshotRelPath = snapshot.getPath()" [1] is a critical piece of
> code that can definitely break the rollback execution flow. My tests had
> pointed for a pattern but now I see other possibilities. I will probably
> add a few parameters on the rollback/revert command instead of using the
> path or review the path life-cycle and different execution flows in order
> to keep it safer to be used.
> [1]
> https://github.com/apache/cloudstack/blob/50fc045f366bd9769eba85c4bc3ecdc0b7035c11/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper
>
> A few details on the test environments and Ceph/RBD version:
> CloudStack, KVM, and Ceph nodes are running with Ubuntu 18.04
> Ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic
> (stable)
> RADOS Block Devices has snapshot rollback support since Ceph v10.0.2 [
> https://github.com/ceph/ceph/pull/6878]
> Rados-java [https://github.com/ceph/rados-java] supports snapshot
> rollback since 0.5.0; rados-java 0.5.0 is the version used by CloudStack
> 4.13.0.0
>
> I will be updating here soon.
>
> Em dom, 8 de set de 2019 às 12:28, Wido den Hollander <wi...@widodh.nl>
> escreveu:
>
>>
>>
>> On 9/8/19 5:26 AM, Andrija Panic wrote:
>> > Maaany release ago, deleting Ceph volume snap, was also only deleting
>> it in
>> > DB, so the RBD performance become terrible with many tens of (i. e.
>> Hourly)
>> > snapshots. I'll try to verify this on 4.13 myself, but Wido and the guys
>> > will know better...
>>
>> I pinged Gabriel and he's looking into it. He'll get back to it.
>>
>> Wido
>>
>> >
>> > I
>> >
>> > On Sat, Sep 7, 2019, 08:34 li jerry <di...@hotmail.com> wrote:
>> >
>> >> I found it had nothing to do with  storage.cleanup.delay and
>> >> storage.cleanup.interval.
>> >>
>> >>
>> >>
>> >> The reason is that when DeleteSnapshot Cmd is executed, because the RBD
>> >> snapshot does not have Copy to secondary storage, it only changes the
>> >> database information, and does not enter the main storage to delete the
>> >> snapshot.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> Log===========================
>> >>
>> >>
>> >>
>> >> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
>> >> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===
>> 192.168.254.3
>> >> -- GET
>> >>
>> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>> >>
>> >> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs from
>> >> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is
>> allowed
>> >> to perform API calls: 0.0.0.0/0,::/0
>> >>
>> >> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) Retrieved
>> >> cmdEventType from job info: SNAPSHOT.DELETE
>> >>
>> >> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add
>> job-1378
>> >> into job monitoring
>> >>
>> >> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit
>> async
>> >> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
>> >> instanceType: Snapshot, instanceId: 13, cmd:
>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
>> cmdInfo:
>> >>
>> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
>> >>
>> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>> >> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
>> >> null, lastPolled: null, created: null, removed: null}
>> >>
>> >> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097) Executing
>> >> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType: Snapshot,
>> >> instanceId: 13, cmd:
>> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
>> cmdInfo:
>> >>
>> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
>> >>
>> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
>> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>> >> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
>> >> null, lastPolled: null, created: null, removed: null}
>> >>
>> >> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) ===END===
>> >> 192.168.254.3 -- GET
>> >>
>> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>> >>
>> >> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> >> (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853:
>> Routing
>> >> from 2199066247173
>> >>
>> >> 2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy]
>> >> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4)
>> (logid:1cee5097)
>> >> Can't find snapshot on backup storage, delete it in db
>> >>
>> >>
>> >>
>> >> -Jerry
>> >>
>> >>
>> >>
>> >> ________________________________
>> >> 发件人: Andrija Panic <an...@gmail.com>
>> >> 发送时间: Saturday, September 7, 2019 1:07:19 AM
>> >> 收件人: users <us...@cloudstack.apache.org>
>> >> 抄送: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
>> >> 主题: Re: 4.13 rbd snapshot delete failed
>> >>
>> >> storage.cleanup.delay
>> >> storage.cleanup.interval
>> >>
>> >> put both to 60 (seconds) and wait for up to 2min - should be deleted
>> just
>> >> fine...
>> >>
>> >> cheers
>> >>
>> >> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
>> >>
>> >>> Hello All
>> >>>
>> >>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots
>> could
>> >>> be created and rolled back (using API alone), but deletion could not
>> be
>> >>> completed.
>> >>>
>> >>>
>> >>>
>> >>> After executing the deletion API, the snapshot will disappear from the
>> >>> list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd
>> >> snap
>> >>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
>> >>>
>> >>>
>> >>>
>> >>> Is there any way we can completely delete the snapshot?
>> >>>
>> >>> -Jerry
>> >>>
>> >>>
>> >>
>> >> --
>> >>
>> >> Andrija Panić
>> >>
>> >
>>
>

Re: 4.13 rbd snapshot delete failed

Posted by Gabriel Beims Bräscher <ga...@gmail.com>.
Thanks for the feedback, Andrija. It looks like deletion was not fully
supported then (am I missing something?). I will take a look into this and
open a PR adding proper support for RBD snapshot deletion if necessary.

Regarding the rollback, I have tested it several times and it worked;
however, I see a weak point in the Ceph rollback implementation.

It looks like Li Jerry was able to execute the rollback without any
problem. Li, could you please post the log output here: "Attempting to
rollback RBD snapshot [name:%s], [pool:%s], [volumeid:%s],
[snapshotid:%s]"? Andrija will not be able to see that log, as the exception
happens prior to it; the only way for you to check those values is via
remote debugging. If you are able to post those values, it would also help
in sorting out what is wrong.

I am checking the code base, running a few tests, and evaluating the log
that you (Andrija) sent. What I can say for now is that it looks like the
parameter "snapshotRelPath = snapshot.getPath()" [1] is a critical piece of
code that can definitely break the rollback execution flow. My tests had
pointed to a pattern, but now I see other possibilities. I will probably
add a few parameters to the rollback/revert command instead of using the
path, or review the path life-cycle and the different execution flows, in
order to make it safer to use.
[1]
https://github.com/apache/cloudstack/blob/50fc045f366bd9769eba85c4bc3ecdc0b7035c11/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper
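
To make that concrete, here is a minimal sketch of what an explicit revert
could look like on the agent side if the command carried the pool, image and
snapshot names instead of relying on snapshot.getPath(). This is only an
illustration, not the actual wrapper code: the connection handling is
simplified, and I am assuming the rollback call added in rados-java 0.5.0 is
exposed on RbdImage as snapRollBack (the exact method name may differ):

    import com.ceph.rados.IoCTX;
    import com.ceph.rados.Rados;
    import com.ceph.rbd.Rbd;
    import com.ceph.rbd.RbdImage;

    public class RbdSnapRevertSketch {
        // pool, image and snapshot would come from the revert command itself,
        // not from parsing the snapshot path on the management server.
        public static void revert(String monHost, String authUser, String authKey,
                String pool, String image, String snapshot) throws Exception {
            Rados rados = new Rados(authUser);
            rados.confSet("mon_host", monHost);
            rados.confSet("key", authKey);
            rados.connect();
            IoCTX io = rados.ioCtxCreate(pool);
            try {
                Rbd rbd = new Rbd(io);
                RbdImage rbdImage = rbd.open(image);
                try {
                    // Assumption: rados-java 0.5.0 maps rbd_snap_rollback to snapRollBack().
                    rbdImage.snapRollBack(snapshot);
                } finally {
                    rbd.close(rbdImage);
                }
            } finally {
                rados.ioCtxDestroy(io);
            }
        }
    }

If the path-based lookup stays, it would at least need to be validated
against the volume before the rollback is attempted.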

A few details on the test environment and the Ceph/RBD versions:
CloudStack, KVM, and Ceph nodes are running Ubuntu 18.04.
Ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic
(stable).
RADOS Block Device (RBD) has had snapshot rollback support since Ceph
v10.0.2 [https://github.com/ceph/ceph/pull/6878].
Rados-java [https://github.com/ceph/rados-java] has supported snapshot
rollback since 0.5.0; rados-java 0.5.0 is the version used by CloudStack
4.13.0.0.

I will be updating here soon.

On Sun, 8 Sep 2019 at 12:28, Wido den Hollander <wi...@widodh.nl>
wrote:

>
>
> On 9/8/19 5:26 AM, Andrija Panic wrote:
> > Maaany release ago, deleting Ceph volume snap, was also only deleting it
> in
> > DB, so the RBD performance become terrible with many tens of (i. e.
> Hourly)
> > snapshots. I'll try to verify this on 4.13 myself, but Wido and the guys
> > will know better...
>
> I pinged Gabriel and he's looking into it. He'll get back to it.
>
> Wido
>
> >
> > I
> >
> > On Sat, Sep 7, 2019, 08:34 li jerry <di...@hotmail.com> wrote:
> >
> >> I found it had nothing to do with  storage.cleanup.delay and
> >> storage.cleanup.interval.
> >>
> >>
> >>
> >> The reason is that when DeleteSnapshot Cmd is executed, because the RBD
> >> snapshot does not have Copy to secondary storage, it only changes the
> >> database information, and does not enter the main storage to delete the
> >> snapshot.
> >>
> >>
> >>
> >>
> >>
> >> Log===========================
> >>
> >>
> >>
> >> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
> >> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===
> 192.168.254.3
> >> -- GET
> >>
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
> >>
> >> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs from
> >> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is
> allowed
> >> to perform API calls: 0.0.0.0/0,::/0
> >>
> >> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) Retrieved
> >> cmdEventType from job info: SNAPSHOT.DELETE
> >>
> >> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add job-1378
> >> into job monitoring
> >>
> >> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit
> async
> >> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
> >> instanceType: Snapshot, instanceId: 13, cmd:
> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
> cmdInfo:
> >>
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> >>
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> >> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
> >> null, lastPolled: null, created: null, removed: null}
> >>
> >> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097) Executing
> >> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType: Snapshot,
> >> instanceId: 13, cmd:
> >> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd,
> cmdInfo:
> >>
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> >>
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> >> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> >> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
> >> null, lastPolled: null, created: null, removed: null}
> >>
> >> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) ===END===
> >> 192.168.254.3 -- GET
> >>
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
> >>
> >> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
> >> (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853:
> Routing
> >> from 2199066247173
> >>
> >> 2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy]
> >> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4) (logid:1cee5097)
> >> Can't find snapshot on backup storage, delete it in db
> >>
> >>
> >>
> >> -Jerry
> >>
> >>
> >>
> >> ________________________________
> >> 发件人: Andrija Panic <an...@gmail.com>
> >> 发送时间: Saturday, September 7, 2019 1:07:19 AM
> >> 收件人: users <us...@cloudstack.apache.org>
> >> 抄送: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
> >> 主题: Re: 4.13 rbd snapshot delete failed
> >>
> >> storage.cleanup.delay
> >> storage.cleanup.interval
> >>
> >> put both to 60 (seconds) and wait for up to 2min - should be deleted
> just
> >> fine...
> >>
> >> cheers
> >>
> >> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
> >>
> >>> Hello All
> >>>
> >>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots
> could
> >>> be created and rolled back (using API alone), but deletion could not be
> >>> completed.
> >>>
> >>>
> >>>
> >>> After executing the deletion API, the snapshot will disappear from the
> >>> list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd
> >> snap
> >>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
> >>>
> >>>
> >>>
> >>> Is there any way we can completely delete the snapshot?
> >>>
> >>> -Jerry
> >>>
> >>>
> >>
> >> --
> >>
> >> Andrija Panić
> >>
> >
>

Re: 4.13 rbd snapshot delete failed

Posted by Wido den Hollander <wi...@widodh.nl>.

On 9/8/19 5:26 AM, Andrija Panic wrote:
> Maaany release ago, deleting Ceph volume snap, was also only deleting it in
> DB, so the RBD performance become terrible with many tens of (i. e. Hourly)
> snapshots. I'll try to verify this on 4.13 myself, but Wido and the guys
> will know better...

I pinged Gabriel and he's looking into it. He'll get back to it.

Wido

> 
> I
> 
> On Sat, Sep 7, 2019, 08:34 li jerry <di...@hotmail.com> wrote:
> 
>> I found it had nothing to do with  storage.cleanup.delay and
>> storage.cleanup.interval.
>>
>>
>>
>> The reason is that when DeleteSnapshot Cmd is executed, because the RBD
>> snapshot does not have Copy to secondary storage, it only changes the
>> database information, and does not enter the main storage to delete the
>> snapshot.
>>
>>
>>
>>
>>
>> Log===========================
>>
>>
>>
>> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
>> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===  192.168.254.3
>> -- GET
>> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>>
>> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
>> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs from
>> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is allowed
>> to perform API calls: 0.0.0.0/0,::/0
>>
>> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
>> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) Retrieved
>> cmdEventType from job info: SNAPSHOT.DELETE
>>
>> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add job-1378
>> into job monitoring
>>
>> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit async
>> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
>> instanceType: Snapshot, instanceId: 13, cmd:
>> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd, cmdInfo:
>> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
>> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
>> null, lastPolled: null, created: null, removed: null}
>>
>> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097) Executing
>> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType: Snapshot,
>> instanceId: 13, cmd:
>> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd, cmdInfo:
>> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
>> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
>> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
>> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
>> null, lastPolled: null, created: null, removed: null}
>>
>> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
>> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) ===END===
>> 192.168.254.3 -- GET
>> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>>
>> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
>> (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853: Routing
>> from 2199066247173
>>
>> 2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy]
>> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4) (logid:1cee5097)
>> Can't find snapshot on backup storage, delete it in db
>>
>>
>>
>> -Jerry
>>
>>
>>
>> ________________________________
>> From: Andrija Panic <an...@gmail.com>
>> Sent: Saturday, September 7, 2019 1:07:19 AM
>> To: users <us...@cloudstack.apache.org>
>> Cc: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
>> Subject: Re: 4.13 rbd snapshot delete failed
>>
>> storage.cleanup.delay
>> storage.cleanup.interval
>>
>> put both to 60 (seconds) and wait for up to 2min - should be deleted just
>> fine...
>>
>> cheers
>>
>> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
>>
>>> Hello All
>>>
>>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots could
>>> be created and rolled back (using API alone), but deletion could not be
>>> completed.
>>>
>>>
>>>
>>> After executing the deletion API, the snapshot will disappear from the
>>> list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd
>> snap
>>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
>>>
>>>
>>>
>>> Is there any way we can completely delete the snapshot?
>>>
>>> -Jerry
>>>
>>>
>>
>> --
>>
>> Andrija Panić
>>
> 

Re: 4.13 rbd snapshot delete failed

Posted by Andrija Panic <an...@gmail.com>.
Many releases ago, deleting a Ceph volume snap was also only deleting it in
the DB, so RBD performance became terrible with many tens of (i.e. hourly)
snapshots. I'll try to verify this on 4.13 myself, but Wido and the guys
will know better...
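
(If you want to see how much has piled up on a pool, a quick-and-dirty loop
like the one below will show it - just a sketch, assuming the pool is named
"rbd".)

for img in $(rbd ls rbd); do
  echo "== ${img} =="
  rbd snap ls "rbd/${img}"
done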

I

On Sat, Sep 7, 2019, 08:34 li jerry <di...@hotmail.com> wrote:

> I found it had nothing to do with  storage.cleanup.delay and
> storage.cleanup.interval.
>
>
>
> The reason is that when DeleteSnapshot Cmd is executed, because the RBD
> snapshot does not have Copy to secondary storage, it only changes the
> database information, and does not enter the main storage to delete the
> snapshot.
>
>
>
>
>
> Log===========================
>
>
>
> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===  192.168.254.3
> -- GET
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>
> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs from
> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is allowed
> to perform API calls: 0.0.0.0/0,::/0
>
> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) Retrieved
> cmdEventType from job info: SNAPSHOT.DELETE
>
> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add job-1378
> into job monitoring
>
> 2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit async
> job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2,
> instanceType: Snapshot, instanceId: 13, cmd:
> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd, cmdInfo:
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
> null, lastPolled: null, created: null, removed: null}
>
> 2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097) Executing
> AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType: Snapshot,
> instanceId: 13, cmd:
> org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd, cmdInfo:
> {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface
> com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"},
> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated:
> null, lastPolled: null, created: null, removed: null}
>
> 2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet]
> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) ===END===
> 192.168.254.3 -- GET
> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>
> 2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache]
> (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853: Routing
> from 2199066247173
>
> 2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy]
> (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4) (logid:1cee5097)
> Can't find snapshot on backup storage, delete it in db
>
>
>
> -Jerry
>
>
>
> ________________________________
> From: Andrija Panic <an...@gmail.com>
> Sent: Saturday, September 7, 2019 1:07:19 AM
> To: users <us...@cloudstack.apache.org>
> Cc: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
> Subject: Re: 4.13 rbd snapshot delete failed
>
> storage.cleanup.delay
> storage.cleanup.interval
>
> put both to 60 (seconds) and wait for up to 2min - should be deleted just
> fine...
>
> cheers
>
> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
>
> > Hello All
> >
> > When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots could
> > be created and rolled back (using API alone), but deletion could not be
> > completed.
> >
> >
> >
> > After executing the deletion API, the snapshot will disappear from the
> > list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd
> snap
> > list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
> >
> >
> >
> > Is there any way we can completely delete the snapshot?
> >
> > -Jerry
> >
> >
>
> --
>
> Andrija Panić
>

Re: 4.13 rbd snapshot delete failed

Posted by li jerry <di...@hotmail.com>.
I found it had nothing to do with  storage.cleanup.delay and storage.cleanup.interval.



The reason is that when DeleteSnapshotCmd is executed, because the RBD snapshot has never been copied to secondary storage, it only updates the database information and never goes to the primary storage to delete the snapshot.
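
As a manual workaround, the leftover snapshots can be removed directly from Ceph with the rbd CLI - a rough sketch only, using the volume from my first mail and a placeholder snapshot name:

rbd snap ls rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c
# remove a single leftover snapshot (substitute the real snapshot name)
rbd snap rm rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c@<snapshot-name>
# or, if none of them are needed any more, drop them all at once
rbd snap purge rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c
# note: a protected snapshot has to be unprotected first (rbd snap unprotect ...)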





Log===========================



2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet] (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===  192.168.254.3 -- GET  command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480

2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer] (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs from which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is allowed to perform API calls: 0.0.0.0/0,::/0

2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer] (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) Retrieved cmdEventType from job info: SNAPSHOT.DELETE

2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add job-1378 into job monitoring

2019-09-07 23:27:00,219 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) submit async job-1378, details: AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType: Snapshot, instanceId: 13, cmd: org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd, cmdInfo: {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"}, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated: null, lastPolled: null, created: null, removed: null}

2019-09-07 23:27:00,220 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:1cee5097) Executing AsyncJobVO {id:1378, userId: 2, accountId: 2, instanceType: Snapshot, instanceId: 13, cmd: org.apache.cloudstack.api.command.user.snapshot.DeleteSnapshotCmd, cmdInfo: {"response":"json","ctxUserId":"2","httpmethod":"GET","ctxStartEventId":"1237","id":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","ctxDetails":"{\"interface com.cloud.storage.Snapshot\":\"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f\"}","ctxAccountId":"2","uuid":"0b50eb7e-4f42-4de7-96c2-1fae137c8c9f","cmdEventType":"SNAPSHOT.DELETE","_":"1567869534480"}, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 2200502468634, completeMsid: null, lastUpdated: null, lastPolled: null, created: null, removed: null}

2019-09-07 23:27:00,221 DEBUG [c.c.a.ApiServlet] (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) ===END===  192.168.254.3 -- GET  command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480

2019-09-07 23:27:00,305 DEBUG [c.c.a.m.ClusteredAgentAttache] (AgentManager-Handler-12:null) (logid:) Seq 1-8660140608456756853: Routing from 2199066247173

2019-09-07 23:27:00,305 DEBUG [o.a.c.s.s.XenserverSnapshotStrategy] (API-Job-Executor-2:ctx-f0843047 job-1378 ctx-f50e25a4) (logid:1cee5097) Can't find snapshot on backup storage, delete it in db



-Jerry



________________________________
From: Andrija Panic <an...@gmail.com>
Sent: Saturday, September 7, 2019 1:07:19 AM
To: users <us...@cloudstack.apache.org>
Cc: dev@cloudstack.apache.org <de...@cloudstack.apache.org>
Subject: Re: 4.13 rbd snapshot delete failed

storage.cleanup.delay
storage.cleanup.interval

put both to 60 (seconds) and wait for up to 2min - should be deleted just
fine...

cheers

On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:

> Hello All
>
> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots could
> be created and rolled back (using API alone), but deletion could not be
> completed.
>
>
>
> After executing the deletion API, the snapshot will disappear from the
> list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd snap
> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
>
>
>
> Is there any way we can completely delete the snapshot?
>
> -Jerry
>
>

--

Andrija Panić

Re: 4.13 rbd snapshot delete failed

Posted by Wido den Hollander <wi...@widodh.nl>.

On 9/6/19 11:34 PM, Andrija Panic wrote:
> One question though... for me (4.13, Nautilus 14.2, test env) - it fails to
> revert back to snapshot with below error
> 

Ok, that's weird.

Gabriel worked on this code recently; maybe he can take a look. I'll
ping him.

Wido

> Which CEPH and QEMU/libvirt/os versions are you using?
> 
> 
> Error:
> 2019-09-06 21:27:16,094 ERROR
> [resource.wrapper.LibvirtRevertSnapshotCommandWrapper]
> (agentRequest-Handler-3:null) (logid:9593f65a) Failed to connect to revert
> snapshot due to RBD exception:
> com.ceph.rbd.RbdException: Failed to open image 2
>         at com.ceph.rbd.Rbd.open(Rbd.java:243)
>         at com.ceph.rbd.Rbd.open(Rbd.java:226)
>         at
> com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRevertSnapshotCommandWrapper.execute(LibvirtRevertSnapshotCommandWrapper.java:92)
>         at
> com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRevertSnapshotCommandWrapper.execute(LibvirtRevertSnapshotCommandWrapper.java:49)
>         at
> com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
>         at
> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1476)
>         at com.cloud.agent.Agent.processRequest(Agent.java:640)
>         at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1053)
>         at com.cloud.utils.nio.Task.call(Task.java:83)
>         at com.cloud.utils.nio.Task.call(Task.java:29)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 
> On Fri, 6 Sep 2019 at 19:07, Andrija Panic <an...@gmail.com> wrote:
> 
>> storage.cleanup.delay
>> storage.cleanup.interval
>>
>> put both to 60 (seconds) and wait for up to 2min - should be deleted just
>> fine...
>>
>> cheers
>>
>> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
>>
>>> Hello All
>>>
>>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots could
>>> be created and rolled back (using API alone), but deletion could not be
>>> completed.
>>>
>>>
>>>
>>> After executing the deletion API, the snapshot will disappear from the
>>> list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd snap
>>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
>>>
>>>
>>>
>>> Is there any way we can completely delete the snapshot?
>>>
>>> -Jerry
>>>
>>>
>>
>> --
>>
>> Andrija Panić
>>
> 
> 

Re: 4.13 rbd snapshot delete failed

Posted by Andrija Panic <an...@gmail.com>.
One question though... for me (4.13, Nautilus 14.2, test env) - it fails to
revert back to the snapshot with the error below.

Which CEPH and QEMU/libvirt/os versions are you using?
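
(For reference, those can be gathered on the KVM host / a Ceph admin node
with something like the commands below.)

ceph versions        # cluster-wide daemon versions, Luminous and newer
virsh version
qemu-img --version
cat /etc/os-release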


Error:
2019-09-06 21:27:16,094 ERROR
[resource.wrapper.LibvirtRevertSnapshotCommandWrapper]
(agentRequest-Handler-3:null) (logid:9593f65a) Failed to connect to revert
snapshot due to RBD exception:
com.ceph.rbd.RbdException: Failed to open image 2
        at com.ceph.rbd.Rbd.open(Rbd.java:243)
        at com.ceph.rbd.Rbd.open(Rbd.java:226)
        at
com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRevertSnapshotCommandWrapper.execute(LibvirtRevertSnapshotCommandWrapper.java:92)
        at
com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRevertSnapshotCommandWrapper.execute(LibvirtRevertSnapshotCommandWrapper.java:49)
        at
com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
        at
com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1476)
        at com.cloud.agent.Agent.processRequest(Agent.java:640)
        at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1053)
        at com.cloud.utils.nio.Task.call(Task.java:83)
        at com.cloud.utils.nio.Task.call(Task.java:29)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
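
(Side note: it may also be worth double-checking from the Ceph side that the
image and snapshot the agent tries to open actually exist - something along
these lines, with the volume UUID from this thread used only as an example.)

rbd info rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c
rbd snap ls rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c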

On Fri, 6 Sep 2019 at 19:07, Andrija Panic <an...@gmail.com> wrote:

> storage.cleanup.delay
> storage.cleanup.interval
>
> put both to 60 (seconds) and wait for up to 2min - should be deleted just
> fine...
>
> cheers
>
> On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:
>
>> Hello All
>>
>> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots could
>> be created and rolled back (using API alone), but deletion could not be
>> completed.
>>
>>
>>
>> After executing the deletion API, the snapshot will disappear from the
>> list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd snap
>> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
>>
>>
>>
>> Is there any way we can completely delete the snapshot?
>>
>> -Jerry
>>
>>
>
> --
>
> Andrija Panić
>


-- 

Andrija Panić

Re: 4.13 rbd snapshot delete failed

Posted by Andrija Panic <an...@gmail.com>.
storage.cleanup.delay
storage.cleanup.interval

put both to 60 (seconds) and wait for up to 2min - should be deleted just
fine...
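
e.g. from the cloudmonkey prompt (just a sketch - adjust to your setup; the
management server may need a restart before the new interval is picked up):

update configuration name=storage.cleanup.delay value=60
update configuration name=storage.cleanup.interval value=60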

cheers

On Fri, 6 Sep 2019 at 18:52, li jerry <di...@hotmail.com> wrote:

> Hello All
>
> When I tested ACS 4.13 KVM + CEPH snapshot, I found that snapshots could
> be created and rolled back (using API alone), but deletion could not be
> completed.
>
>
>
> After executing the deletion API, the snapshot will disappear from the
> list Snapshots, but the snapshot on CEPH RBD will not be deleted (rbd snap
> list rbd/ac510428-5d09-4e86-9d34-9dfab3715b7c)
>
>
>
> Is there any way we can completely delete the snapshot?
>
> -Jerry
>
>

-- 

Andrija Panić
