Posted to dev@cloudstack.apache.org by Andrei Mikhailovsky <an...@arhont.com.INVALID> on 2019/06/13 12:57:23 UTC

Concurrent Volume Snapshots

Hello everyone 

I am having issues running snapshots on large volumes. The hypervisor is KVM and the storage backend is Ceph (RBD). The ACS version is 4.11.2. Here is my issue:

I've got several VMs with 3-6 volumes of 2TB each. I have a recurring schedule set up to take a snapshot of each volume once a month. Snapshotting a volume takes a long time (on the order of 20 hours). As a result, when the schedule kicks in, it only manages to snapshot the first volume, and the snapshots of the other volumes fail due to the async job timeout. From what I have discovered, ACS only takes a single volume snapshot at a time, and I can't find a setting to enable concurrent snapshotting, so it can't snapshot all of the VM's volumes at the same time. This is problematic for many reasons, but the main one is that when multiple volumes are recovered, the data on them will not be consistent with each other.

Is there a way around this? Perhaps there is a setting I can't find that disables this odd behaviour of the volume snapshots?

Cheers 

Andrei 

Re: Concurrent Volume Snapshots

Posted by Andrei Mikhailovsky <an...@arhont.com.INVALID>.
Thanks Rohit, I will try to investigate those options.

Andrei



----- Original Message -----
> From: "Rohit Yadav" <ro...@shapeblue.com>
> To: "dev" <de...@cloudstack.apache.org>
> Sent: Thursday, 13 June, 2019 14:02:21
> Subject: Re: Concurrent Volume Snapshots

> You can try to experiment with the following global settings:
> 
> wait
> backup.snapshot.wait
> copy.volume.wait
> vm.job.lock.timeout
> 
> 
> Regards,
> 
> Rohit Yadav
> 
> Software Architect, ShapeBlue
> 
> https://www.shapeblue.com
> 
> rohit.yadav@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London  WC2E 9DPUK
> @shapeblue

Re: Concurrent Volume Snapshots

Posted by Andrei Mikhailovsky <an...@arhont.com.INVALID>.
Thanks for the update Rohit, Suresh,

Will this feature be in the 4.13 release?

Andrei

----- Original Message -----
> From: "Suresh Kumar Anaparti" <su...@gmail.com>
> To: "dev" <de...@cloudstack.apache.org>
> Sent: Thursday, 13 June, 2019 19:10:11
> Subject: Re: Concurrent Volume Snapshots

> Currently, in CloudStack, only one job per VM can be active (in execution)
> at any given time. Concurrent volume snapshots of the same VM are all
> treated as work jobs for that VM, so they are queued. Once the active
> volume snapshot job finishes, the next one is picked up for execution.
> 
> The PR https://github.com/apache/cloudstack/pull/1897 supports multiple
> concurrent snapshots of the same VM on XenServer. It has not been tested
> on KVM. I'll rebase the code onto the latest master.
> 
> Regards,
> Suresh

Re: Concurrent Volume Snapshots

Posted by Suresh Kumar Anaparti <su...@gmail.com>.
Currently, in CloudStack, only one job per VM can be active (in execution)
at any given time. Concurrent volume snapshots of the same VM are all
treated as work jobs for that VM, so they are queued. Once the active
volume snapshot job finishes, the next one is picked up for execution.
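
To illustrate the queueing described above, here is a minimal Python sketch. It is not CloudStack code: the function name, the VM identifiers, and the 20-hour durations are made up for the example, and it assumes all jobs are submitted at the same moment and that jobs for different VMs never block each other.

```python
from collections import defaultdict


def schedule_vm_jobs(jobs):
    """Return (start, finish) hours for each job, in submission order.

    jobs: list of (vm_id, duration_hours) tuples, all submitted at hour 0.
    Assumption: at most one job per VM runs at a time, and jobs that
    belong to different VMs do not wait for each other.
    """
    vm_free_at = defaultdict(float)  # hour at which each VM's queue is idle
    timeline = []
    for vm_id, duration in jobs:
        start = vm_free_at[vm_id]      # wait for earlier jobs on the same VM
        finish = start + duration
        vm_free_at[vm_id] = finish     # the next job on this VM starts here
        timeline.append((start, finish))
    return timeline


# Three 20-hour volume snapshots on one VM, one on another (hypothetical IDs).
jobs = [("vm-a", 20), ("vm-a", 20), ("vm-a", 20), ("vm-b", 20)]
for (vm_id, _), (start, finish) in zip(jobs, schedule_vm_jobs(jobs)):
    print(f"{vm_id}: starts at {start:.0f}h, finishes at {finish:.0f}h")
```

Under these assumptions the three snapshots on the same VM run back to back, while the snapshot on the second VM starts immediately.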

The PR https://github.com/apache/cloudstack/pull/1897 supports multiple
concurrent snapshots of the same VM on XenServer. It has not been tested
on KVM. I'll rebase the code onto the latest master.

Regards,
Suresh


On Thu, Jun 13, 2019 at 7:41 PM Rohit Yadav <ro...@shapeblue.com>
wrote:

> I checked the list of outstanding PRs; it looks like this feature is not
> currently supported:
>
> https://github.com/apache/cloudstack/pull/1897
>
>
> Regards,
>
> Rohit Yadav
>
> Software Architect, ShapeBlue
>
> https://www.shapeblue.com

Re: Concurrent Volume Snapshots

Posted by Rohit Yadav <ro...@shapeblue.com>.
I checked the list of outstanding PRs; it looks like this feature is not currently supported:

https://github.com/apache/cloudstack/pull/1897


Regards,

Rohit Yadav

Software Architect, ShapeBlue

https://www.shapeblue.com

________________________________
From: Rohit Yadav <ro...@shapeblue.com>
Sent: Thursday, June 13, 2019 7:37:09 PM
To: dev
Subject: Re: Concurrent Volume Snapshots

Hi Andrei,


Try playing with concurrent.snapshots.threshold.perhost (an empty value is treated as 1).


Regards,

Rohit Yadav

Software Architect, ShapeBlue

https://www.shapeblue.com


Re: Concurrent Volume Snapshots

Posted by Rohit Yadav <ro...@shapeblue.com>.
Hi Andrei,


Try playing with concurrent.snapshots.threshold.perhost (an empty value is treated as 1).


Regards,

Rohit Yadav

Software Architect, ShapeBlue

https://www.shapeblue.com

________________________________
From: Andrei Mikhailovsky <an...@arhont.com.INVALID>
Sent: Thursday, June 13, 2019 6:54:07 PM
To: dev
Subject: Re: Concurrent Volume Snapshots

Hi Rohit,

I have updated some of those options to increase the timeout to 2 days, rather than the default of a few hours.

However, these options only control the timeout of the process.

Is there an option to allow simultaneous snapshotting of the volumes on a single VM? I would like all volumes of the VM to be copied over to the secondary storage at the same time, rather than one after another.

Cheers



Re: Concurrent Volume Snapshots

Posted by Andrei Mikhailovsky <an...@arhont.com.INVALID>.
Hi Rohit,

I have updated some of those options to increase the timeout to 2 days, rather than the default of a few hours.

However, these options only control the timeout of the process.

Is there an option to allow simultaneous snapshotting of the volumes on a single VM? I would like all volumes of the VM to be copied over to the secondary storage at the same time, rather than one after another.

Cheers
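
As a rough back-of-the-envelope check, the Python sketch below counts how many back-to-back snapshots fit into a 2-day timeout when each takes about 20 hours (the figure from the original post). It assumes all snapshot jobs are queued at the same moment and that the timeout is measured from that moment; the function name is hypothetical and the real timeout accounting in ACS may differ.

```python
def snapshots_within_timeout(volumes, hours_per_snapshot, timeout_hours):
    """Count how many serial snapshots finish before the timeout expires.

    Assumption: snapshots run one after another, all jobs are queued at
    hour 0, and the timeout clock starts at hour 0 as well.
    """
    fits = timeout_hours // hours_per_snapshot  # whole snapshots in the window
    return min(volumes, fits)


TIMEOUT_HOURS = 48        # the 2-day timeout mentioned above
HOURS_PER_SNAPSHOT = 20   # approximate duration from the original post

for volumes in (3, 6):
    done = snapshots_within_timeout(volumes, HOURS_PER_SNAPSHOT, TIMEOUT_HOURS)
    total = volumes * HOURS_PER_SNAPSHOT
    print(f"{volumes} volumes: {done} finish within {TIMEOUT_HOURS}h "
          f"(all of them would need {total}h)")
```

With these numbers only two snapshots fit in the window, so raising the timeout alone does not cover a VM with 3-6 such volumes.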


Re: Concurrent Volume Snapshots

Posted by Rohit Yadav <ro...@shapeblue.com>.
You can try to experiment with the following global settings:

wait
backup.snapshot.wait
copy.volume.wait
vm.job.lock.timeout
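
For reference, a global setting can be changed through the updateConfiguration API call. The Python sketch below only builds a signed request URL and prints it, without contacting a server. It follows the request signing steps described in the CloudStack API documentation as best understood here (sort the parameters, URL-encode the values, lowercase the string, HMAC-SHA1 with the secret key, Base64). The endpoint, the keys, and the 172800-second (2-day) value are placeholders, and the assumption that these settings take a value in seconds should be checked against each setting's description.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_update_configuration_url(endpoint, api_key, secret_key, name, value):
    """Build a signed updateConfiguration URL (sketch; sends nothing)."""
    params = {
        "command": "updateConfiguration",
        "name": name,
        "value": str(value),
        "response": "json",
        "apiKey": api_key,
    }
    # Sort by lowercased parameter name and URL-encode each value.
    pairs = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{key}={quote(val, safe='')}" for key, val in pairs)
    # Sign the lowercased query string with HMAC-SHA1, then Base64-encode.
    digest = hmac.new(secret_key.encode(), query.lower().encode(),
                      hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return f"{endpoint}?{query}&signature={signature}"


# Placeholder endpoint and credentials; replace with real ones before use.
ENDPOINT = "http://mgmt.example.invalid:8080/client/api"
TWO_DAYS_SECONDS = 2 * 24 * 60 * 60  # 172800, assuming the unit is seconds

for setting in ("wait", "backup.snapshot.wait",
                "copy.volume.wait", "vm.job.lock.timeout"):
    print(signed_update_configuration_url(
        ENDPOINT, "API_KEY", "SECRET_KEY", setting, TWO_DAYS_SECONDS))
```

Some global settings only take effect after the management server is restarted, so it is worth checking that for each one changed.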


Regards,

Rohit Yadav

Software Architect, ShapeBlue

https://www.shapeblue.com
