Posted to dev@cloudstack.apache.org by "SuichII, Christopher" <Ch...@netapp.com> on 2013/09/27 19:27:29 UTC

Scalable Backup and Recovery

I'd like to start a discussion around the direction of scalable backup and recovery in CloudStack. Currently, the only way to back up and recover vms is by setting up a schedule for, or manually snapshotting, individual vm disks, or by manually snapshotting vms. Unfortunately, I don't believe this is a very scalable solution. What if a user wants all of their vm disks to be backed up on the same schedule? What if a domain administrator wants all of the vms in their domain to be backed up on the same schedule or to manually back up every vm in their domain?

Here are some use cases I see for helping to scale things up:
-Scheduled and manual backup of 1 to all of a user's vms and vm disks
-Scheduled and manual backup of 1 to all of a domain's vms and vm disks (by a domain admin)
-Scheduled and manual backup of 1 to all vms and vm disks on primary storage (by a cloud admin) - this one is tougher to find a valid use case for
-Backup schedules attached to service offerings
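
To make the user / domain admin / cloud admin breakdown concrete, here is a minimal sketch of how one schedule might resolve to many disks. Every name in it (Scope, Disk, BackupSchedule, disks_in_scope) is hypothetical and chosen for illustration; none of it is an existing CloudStack API, and schedules attached to service offerings are left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Scope(Enum):
    """Who a backup schedule applies to (hypothetical, not a CloudStack type)."""
    USER = "user"      # one account's vms and vm disks
    DOMAIN = "domain"  # every account in a domain (domain admin)
    GLOBAL = "global"  # everything on primary storage (cloud admin)


@dataclass(frozen=True)
class Disk:
    name: str
    vm: str
    account: str
    domain: str


@dataclass(frozen=True)
class BackupSchedule:
    scope: Scope
    target: str    # account or domain name; ignored for GLOBAL
    interval: str  # free-form in this sketch, e.g. "daily"


def disks_in_scope(schedule: BackupSchedule, disks: List[Disk]) -> List[Disk]:
    """Return every disk a single schedule covers."""
    if schedule.scope is Scope.USER:
        return [d for d in disks if d.account == schedule.target]
    if schedule.scope is Scope.DOMAIN:
        return [d for d in disks if d.domain == schedule.target]
    return list(disks)


# Made-up inventory, only to exercise the function.
disks = [
    Disk("root-1", "vm-1", "alice", "eng"),
    Disk("data-1", "vm-1", "alice", "eng"),
    Disk("root-2", "vm-2", "bob", "eng"),
    Disk("root-3", "vm-3", "carol", "sales"),
]
```

With this shape, a domain admin would create one BackupSchedule(Scope.DOMAIN, "eng", "daily") rather than one schedule per disk, and disks created later in that domain would be picked up the next time the schedule is resolved.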

I know I previously started a discussion about backing up multiple vm disks at once, but I think these use cases, broken down by user type (user, domain admin and admin), should help clear things up and show the utility of being able to back up multiple objects at once.

Thanks!
Chris
-- 
Chris Suich
chris.suich@netapp.com
NetApp Software Engineer
Data Center Platforms – Cloud Solutions
Citrix, Cisco & Red Hat


Re: Scalable Backup and Recovery

Posted by Kelcey Jamison Damage <ke...@backbonetechnology.com>.
I agree this needs to be addressed. I find it quite frustrating to have a VM with 4-5 volumes and not be able to back up all 5 volumes as a set. Plus, the restore operation gets messy when you have to restore 5 volumes and re-attach them to the VM.

I can also clearly see the use for the user and domain sets you discuss below.

Thanks for starting this discussion.

-Kelcey

----- Original Message -----
From: "Christopher SuichII" <Ch...@netapp.com>
To: "<de...@cloudstack.apache.org>" <de...@cloudstack.apache.org>
Sent: Friday, September 27, 2013 10:27:29 AM
Subject: Scalable Backup and Recovery

I'd like to start a discussion around the direction of scalable backup and recovery in CloudStack. Currently, the only way to back up and recover vms is by setting up a schedule for, or manually snapshotting, individual vm disks, or by manually snapshotting vms. Unfortunately, I don't believe this is a very scalable solution. What if a user wants all of their vm disks to be backed up on the same schedule? What if a domain administrator wants all of the vms in their domain to be backed up on the same schedule or to manually back up every vm in their domain?

Here are some use cases I see for helping to scale things up:
-Scheduled and manual backup of 1 to all of a user's vms and vm disks
-Scheduled and manual backup of 1 to all of a domain's vms and vm disks (by a domain admin)
-Scheduled and manual backup of 1 to all vms and vm disks on primary storage (by a cloud admin) - this one is tougher to find a valid use case for
-Backup schedules attached to service offerings

I know I previously started a discussion about backing up multiple vm disks at once, but I think these use cases, broken down by user type (user, domain admin and admin), should help clear things up and show the utility of being able to back up multiple objects at once.

Thanks!
Chris
-- 
Chris Suich
chris.suich@netapp.com
NetApp Software Engineer
Data Center Platforms – Cloud Solutions
Citrix, Cisco & Red Hat


Re: Scalable Backup and Recovery

Posted by Darren Shepherd <da...@gmail.com>.
Based on your use cases, it sounds like what you're asking for is the
ability to define selection criteria for scheduled snapshots.  So as
long as your VM/Volume matches those criteria, it will be backed up at
some given time.  I think that would be useful because a user could
say something like "all volumes in network 'production' should be
backed up every night."

If such a thing were implemented, and the storage provider had some
capability to handle multiple snapshots at once, then it could be
passed all of the matching volumes at once and do something
intelligent with them.
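
A minimal sketch of that idea, assuming a hypothetical data model: a selection criterion is just a predicate over volumes, and matching volumes are grouped by storage pool so that each provider could be handed its whole batch in one call. VolumeRecord, in_network and plan_batches are illustrative names, not CloudStack or storage plugin APIs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class VolumeRecord:
    name: str
    network: str  # network of the vm the volume is attached to (assumed field)
    pool: str     # primary storage pool holding the volume (assumed field)


Criteria = Callable[[VolumeRecord], bool]


def in_network(network: str) -> Criteria:
    """Build a criterion matching volumes whose vm is on the given network."""
    def matches(volume: VolumeRecord) -> bool:
        return volume.network == network
    return matches


def plan_batches(volumes: Iterable[VolumeRecord],
                 criteria: Criteria) -> Dict[str, List[str]]:
    """Group the names of matching volumes by storage pool."""
    batches: Dict[str, List[str]] = defaultdict(list)
    for volume in volumes:
        if criteria(volume):
            batches[volume.pool].append(volume.name)
    return dict(batches)


# Made-up inventory, only to exercise the function.
volumes = [
    VolumeRecord("db-data", "production", "pool-a"),
    VolumeRecord("db-log", "production", "pool-a"),
    VolumeRecord("web-root", "production", "pool-b"),
    VolumeRecord("scratch", "staging", "pool-a"),
]

# "All volumes in network 'production' should be backed up every night."
nightly = plan_batches(volumes, in_network("production"))
```

A scheduler could evaluate the criterion at run time, so volumes created after the schedule was defined are included, and a provider without a multi-snapshot capability could simply loop over its batch one volume at a time.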

Now reading between the lines, it seems like you're looking for some
use case to exploit some functionality in NetApp.  I'll tell you what
I'd like to see implemented.  Having run NetApp in the past with CS, we
ran into this conundrum.  We would take snapshots on the filer, but
in reality they were pretty useless.  If you ever needed to roll back
to the snapshot, you're screwed.  Things have changed since the
snapshot: VMs were created or deleted, VM snapshots had occurred, etc.
So if you roll back, the metadata in CS is now completely out of sync.
What I think would be a great feature is to be able to do StoragePool
snapshots.  Doing the snapshot is simple; the really complex thing
is how to implement the "StoragePool revert to snapshot"
functionality.  If somebody could do that, that would be awesome.

Darren

On Sat, Sep 28, 2013 at 5:41 PM, SuichII, Christopher
<Ch...@netapp.com> wrote:
> Well, yes, in part. By scalable I mean that if CloudStack is expected to be able to manage such a large number of vms, it should be able to back up and recover those vms with minimal effort. Doing things one at a time does not necessarily scale well when you're talking about a cloud infrastructure.
>
>> Also certain hypervisors have various quirks which stand in the way of an
>> efficient solution.
>
> I absolutely agree. This is where the storage subsystem API comes in. Creating backups through some storage providers can be much faster, easier and more efficient than going through the hypervisor. As the storage subsystem API gains more traction and true backup and recovery becomes available, I think we'll begin to see people asking why things must be done one at a time. The use cases I listed below would help us get ahead of the curve and have the features I predict people will be asking for (and it sounds like Kelcey is asking for them now!).
>
> --
> Chris Suich
> chris.suich@netapp.com
> NetApp Software Engineer
> Data Center Platforms – Cloud Solutions
> Citrix, Cisco & Red Hat
>
> On Sep 27, 2013, at 6:38 PM, Chiradeep Vittal <Ch...@citrix.com> wrote:
>
>> Ah I see. You mean a "scalable user experience".
>>
>> The actual scalability of the snapshot process itself is limited by
>> available disk and network bandwidth.
>> Also certain hypervisors have various quirks which stand in the way of an
>> efficient solution.
>>
>> On 9/27/13 10:27 AM, "SuichII, Christopher" <Ch...@netapp.com> wrote:
>>
>>> I'd like to start a discussion around the direction of scalable backup
>>> and recovery in CloudStack. Currently, the only way to back up and
>>> recover vms is by setting up a schedule for, or manually snapshotting,
>>> individual vm disks, or by manually snapshotting vms. Unfortunately, I
>>> don't believe this is a very scalable solution. What if a user wants all
>>> of their vm disks to be backed up on the same schedule? What if a domain
>>> administrator wants all of the vms in their domain to be backed up on the
>>> same schedule or to manually back up every vm in their domain?
>>>
>>> Here are some use cases I see for helping to scale things up:
>>> -Scheduled and manual backup of 1 to all of a user's vms and vm disks
>>> -Scheduled and manual backup of 1 to all of a domain's vms and vm disks
>>> (by a domain admin)
>>> -Scheduled and manual backup of 1 to all vms and vm disks on primary
>>> storage (by a cloud admin) - this one is tougher to find a valid use case
>>> for
>>> -Backup schedules attached to service offerings
>>>
>>> I know I previously started a discussion about backing up multiple vm
>>> disks at once, but I think these use cases, broken down by user type
>>> (user, domain admin and admin), should help clear things up and show the
>>> utility of being able to back up multiple objects at once.
>>>
>>> Thanks!
>>> Chris
>>> --
>>> Chris Suich
>>> chris.suich@netapp.com
>>> NetApp Software Engineer
>>> Data Center Platforms – Cloud Solutions
>>> Citrix, Cisco & Red Hat
>>>
>>
>

Re: Scalable Backup and Recovery

Posted by "SuichII, Christopher" <Ch...@netapp.com>.
Well, yes, in part. By scalable I mean that if CloudStack is expected to be able to manage such a large number of vms, it should be able to back up and recover those vms with minimal effort. Doing things one at a time does not necessarily scale well when you're talking about a cloud infrastructure.

> Also certain hypervisors have various quirks which stand in the way of an
> efficient solution.

I absolutely agree. This is where the storage subsystem API comes in. Creating backups through some storage providers can be much faster, easier and more efficient than going through the hypervisor. As the storage subsystem API gains more traction and true backup and recovery becomes available, I think we'll begin to see people asking why things must be done one at a time. The use cases I listed below would help us get ahead of the curve and have the features I predict people will be asking for (and it sounds like Kelcey is asking for them now!).

-- 
Chris Suich
chris.suich@netapp.com
NetApp Software Engineer
Data Center Platforms – Cloud Solutions
Citrix, Cisco & Red Hat

On Sep 27, 2013, at 6:38 PM, Chiradeep Vittal <Ch...@citrix.com> wrote:

> Ah I see. You mean a "scalable user experience".
> 
> The actual scalability of the snapshot process itself is limited by
> available disk and network bandwidth.
> Also certain hypervisors have various quirks which stand in the way of an
> efficient solution.
> 
> On 9/27/13 10:27 AM, "SuichII, Christopher" <Ch...@netapp.com> wrote:
> 
>> I'd like to start a discussion around the direction of scalable backup
>> and recovery in CloudStack. Currently, the only way to back up and
>> recover vms is by setting up a schedule for, or manually snapshotting,
>> individual vm disks, or by manually snapshotting vms. Unfortunately, I
>> don't believe this is a very scalable solution. What if a user wants all
>> of their vm disks to be backed up on the same schedule? What if a domain
>> administrator wants all of the vms in their domain to be backed up on the
>> same schedule or to manually back up every vm in their domain?
>> 
>> Here are some use cases I see for helping to scale things up:
>> -Scheduled and manual backup of 1 to all of a user's vms and vm disks
>> -Scheduled and manual backup of 1 to all of a domain's vms and vm disks
>> (by a domain admin)
>> -Scheduled and manual backup of 1 to all vms and vm disks on primary
>> storage (by a cloud admin) - this one is tougher to find a valid use case
>> for
>> -Backup schedules attached to service offerings
>> 
>> I know I previously started a discussion about backing up multiple vm
>> disks at once, but I think these use cases, broken down by user type
>> (user, domain admin and admin), should help clear things up and show the
>> utility of being able to back up multiple objects at once.
>> 
>> Thanks!
>> Chris
>> -- 
>> Chris Suich
>> chris.suich@netapp.com
>> NetApp Software Engineer
>> Data Center Platforms – Cloud Solutions
>> Citrix, Cisco & Red Hat
>> 
> 


Re: Scalable Backup and Recovery

Posted by Chiradeep Vittal <Ch...@citrix.com>.
Ah I see. You mean a "scalable user experience".

The actual scalability of the snapshot process itself is limited by
available disk and network bandwidth.
Also certain hypervisors have various quirks which stand in the way of an
efficient solution.

On 9/27/13 10:27 AM, "SuichII, Christopher" <Ch...@netapp.com> wrote:

>I'd like to start a discussion around the direction of scalable backup
>and recovery in CloudStack. Currently, the only way to back up and
>recover vms is by setting up a schedule for, or manually snapshotting,
>individual vm disks, or by manually snapshotting vms. Unfortunately, I
>don't believe this is a very scalable solution. What if a user wants all
>of their vm disks to be backed up on the same schedule? What if a domain
>administrator wants all of the vms in their domain to be backed up on the
>same schedule or to manually back up every vm in their domain?
>
>Here are some use cases I see for helping to scale things up:
>-Scheduled and manual backup of 1 to all of a user's vms and vm disks
>-Scheduled and manual backup of 1 to all of a domain's vms and vm disks
>(by a domain admin)
>-Scheduled and manual backup of 1 to all vms and vm disks on primary
>storage (by a cloud admin) - this one is tougher to find a valid use case
>for
>-Backup schedules attached to service offerings
>
>I know I previously started a discussion about backing up multiple vm
>disks at once, but I think these use cases, broken down by user type
>(user, domain admin and admin), should help clear things up and show the
>utility of being able to back up multiple objects at once.
>
>Thanks!
>Chris
>-- 
>Chris Suich
>chris.suich@netapp.com
>NetApp Software Engineer
>Data Center Platforms – Cloud Solutions
>Citrix, Cisco & Red Hat
>