You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cloudstack.apache.org by Andrei Mikhailovsky <an...@arhont.com> on 2015/02/16 13:01:19 UTC

Your thoughts on using Primary Storage for keeping snapshots

Hello guys, 

I was hoping to have some feedback from the community on the subject of having an ability to keep snapshots on the primary storage where it is supported by the storage backend. 

The idea behind this functionality is to improve how snapshots are currently handled on KVM hypervisors with Ceph primary storage. At the moment, the snapshots are taken on the primary storage and being copied to the secondary storage. This method is very slow and inefficient even on small infrastructure. Even on medium deployments using snapshots in KVM becomes nearly impossible. If you have tens or hundreds concurrent snapshots taking place you will have a bunch of timeouts and errors, your network becomes clogged, etc. In addition, using these snapshots for creating new volumes or reverting back vms also slow and inefficient. As above, when you have tens or hundreds concurrent operations it will not succeed and you will have a majority of tasks with errors or timeouts. 

At the moment, taking a single snapshot of relatively small volumes (200GB or 500GB for instance) takes tens if not hundreds of minutes. Taking a snapshot of the same volume on ceph primary storage takes a few seconds at most! Similarly, converting a snapshot to a volume takes tens if not hundreds of minutes when secondary storage is involved; compared with seconds if done directly on the primary storage. 

I suggest that the CloudStack should have the ability to keep volume snapshots on the primary storage where this is supported by the storage. Perhaps having a per primary storage setting that enables this functionality. This will be beneficial for Ceph primary storage on KVM hypervisors and perhaps on XenServer when Ceph will be supported in a near future. 

This will greatly speed up the process of using snapshots on KVM and users will actually start using snapshotting rather than giving up with frustration. 

I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you are in agreement. 

Thanks for your input 

Andrei 





Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Ian Rae <ir...@cloudops.com>.
Totally agreed that there is high value in having both the ability to do
rapid, lightweight snapshots on primary storage as well as the ability to
transfer those snapshots to secondary storage for highly durable long-term
use, template creation etc... Glad to hear that others see a distinction
between these use cases, will ask the CloudOps team and Mike T to engage on
this.

On Monday, February 16, 2015, Andrei Mikhailovsky <an...@arhont.com> wrote:

> Hello guys,
>
> I was hoping to have some feedback from the community on the subject of
> having an ability to keep snapshots on the primary storage where it is
> supported by the storage backend.
>
> The idea behind this functionality is to improve how snapshots are
> currently handled on KVM hypervisors with Ceph primary storage. At the
> moment, the snapshots are taken on the primary storage and being copied to
> the secondary storage. This method is very slow and inefficient even on
> small infrastructure. Even on medium deployments using snapshots in KVM
> becomes nearly impossible. If you have tens or hundreds concurrent
> snapshots taking place you will have a bunch of timeouts and errors, your
> network becomes clogged, etc. In addition, using these snapshots for
> creating new volumes or reverting back vms also slow and inefficient. As
> above, when you have tens or hundreds concurrent operations it will not
> succeed and you will have a majority of tasks with errors or timeouts.
>
> At the moment, taking a single snapshot of relatively small volumes (200GB
> or 500GB for instance) takes tens if not hundreds of minutes. Taking a
> snapshot of the same volume on ceph primary storage takes a few seconds at
> most! Similarly, converting a snapshot to a volume takes tens if not
> hundreds of minutes when secondary storage is involved; compared with
> seconds if done directly on the primary storage.
>
> I suggest that the CloudStack should have the ability to keep volume
> snapshots on the primary storage where this is supported by the storage.
> Perhaps having a per primary storage setting that enables this
> functionality. This will be beneficial for Ceph primary storage on KVM
> hypervisors and perhaps on XenServer when Ceph will be supported in a near
> future.
>
> This will greatly speed up the process of using snapshots on KVM and users
> will actually start using snapshotting rather than giving up with
> frustration.
>
> I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you
> are in agreement.
>
> Thanks for your input
>
> Andrei
>
>
>
>
>

-- 
*Ian Rae*
PDG *| *CEO
t *514.944.4008*

*CloudOps* Votre partenaire infonuagique* | *Cloud Solutions Experts
w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|* Montreal *|*
 Quebec *|* H3J 1S6

<https://www.cloud.ca/>
<http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/>

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Andrei Mikhailovsky <an...@arhont.com>.
+1 for renaming the Snapshot into something more logical. 

However, for many ppl Backups kind of means the functionality on a more granular level (like ability to restore files, etc.) Not sure if Backup should be the right term for the current volume Snapshots. 

I agree, there should be ability to copy snapshots to the secondary storage. perhaps even both if one requires. If someone wants to have a backup copy of the snapshot on the secondary storage, they might choose to have this option. 

Andrei 
----- Original Message -----

> From: "Logan Barfield" <lb...@tqhosting.com>
> To: dev@cloudstack.apache.org
> Sent: Monday, 16 February, 2015 2:38:00 PM
> Subject: Re: Your thoughts on using Primary Storage for keeping
> snapshots

> I like this idea a lot for Ceph RBD. I do think there should still be
> support for copying snapshots to secondary storage as needed (for
> transfers between zones, etc.). I really think that this could be
> part of a larger move to clarify the naming conventions used for disk
> operations. Currently "Volume Snapshots" should probably really be
> called "Backups". So having "snapshot" functionality, and a "convert
> snapshot to backup/template" would be a good move.

> Thank You,

> Logan Barfield
> Tranquil Hosting

> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic
> <an...@gmail.com> wrote:
> > BIG +1
> >
> > My team should submit some patch to ACS for better KVM snapshots,
> > including
> > whole VM snapshot etc...but it's too early to give details...
> > best
> >
> > On 16 February 2015 at 13:01, Andrei Mikhailovsky
> > <an...@arhont.com> wrote:
> >
> >> Hello guys,
> >>
> >> I was hoping to have some feedback from the community on the
> >> subject of
> >> having an ability to keep snapshots on the primary storage where
> >> it is
> >> supported by the storage backend.
> >>
> >> The idea behind this functionality is to improve how snapshots are
> >> currently handled on KVM hypervisors with Ceph primary storage. At
> >> the
> >> moment, the snapshots are taken on the primary storage and being
> >> copied to
> >> the secondary storage. This method is very slow and inefficient
> >> even on
> >> small infrastructure. Even on medium deployments using snapshots
> >> in KVM
> >> becomes nearly impossible. If you have tens or hundreds concurrent
> >> snapshots taking place you will have a bunch of timeouts and
> >> errors, your
> >> network becomes clogged, etc. In addition, using these snapshots
> >> for
> >> creating new volumes or reverting back vms also slow and
> >> inefficient. As
> >> above, when you have tens or hundreds concurrent operations it
> >> will not
> >> succeed and you will have a majority of tasks with errors or
> >> timeouts.
> >>
> >> At the moment, taking a single snapshot of relatively small
> >> volumes (200GB
> >> or 500GB for instance) takes tens if not hundreds of minutes.
> >> Taking a
> >> snapshot of the same volume on ceph primary storage takes a few
> >> seconds at
> >> most! Similarly, converting a snapshot to a volume takes tens if
> >> not
> >> hundreds of minutes when secondary storage is involved; compared
> >> with
> >> seconds if done directly on the primary storage.
> >>
> >> I suggest that the CloudStack should have the ability to keep
> >> volume
> >> snapshots on the primary storage where this is supported by the
> >> storage.
> >> Perhaps having a per primary storage setting that enables this
> >> functionality. This will be beneficial for Ceph primary storage on
> >> KVM
> >> hypervisors and perhaps on XenServer when Ceph will be supported
> >> in a near
> >> future.
> >>
> >> This will greatly speed up the process of using snapshots on KVM
> >> and users
> >> will actually start using snapshotting rather than giving up with
> >> frustration.
> >>
> >> I have opened the ticket CLOUDSTACK-8256, so please cast your vote
> >> if you
> >> are in agreement.
> >>
> >> Thanks for your input
> >>
> >> Andrei
> >>
> >>
> >>
> >>
> >>
> >
> >
> > --
> >
> > Andrija Panić

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Rohit Yadav <ro...@shapeblue.com>.
Sounds like a good idea.

On Wednesday 18 February 2015 10:29 PM, Mike Tutkowski wrote:
> So, I spoke with Edison a couple months ago about my desire to take
> snapshots on SolidFire's SAN instead of a making a backup on secondary
> storage.
>
> As Edison explained, CloudStack is theoretically flexible enough to
> accommodate this if the developer implements the SnapshotStrategy interface
> and orchestrates what needs to be done.

May be we can refactor the core to not assume that snapshots be
available only on secondary storage and implement a SnapshotStrategy
class that user can choose via a global setting, the core reads which
strategy it is and based on that decides that to assume.

> I went ahead and implemented this approach successfully in XenServer
> (although, as I mentioned earlier, not in as ideal a fashion as I would
> have liked due to issues in XenServer). However, along the way, I found
> that there was code in CloudStack "core" that assumed snapshots would be
> stored on secondary storage.
>
> I went ahead and made this core logic more flexible in 4.6 and so 4.6 is
> the first release that I will be able to offer the ability for users to
> take CloudStack snapshots that are stored on my SAN instead of on secondary
> storage.
>
> I then implemented a custom DataMotionStrategy to use with my snapshots so
> that I could create templates for general use in CloudStack and so I could
> make CloudStack volumes from my snapshots.

I think the bigger idea I'm seeing here is to allow primary storage to
be used as a staging storage, where both (immutable, so once made no one
changes them) snapshots and even templates can be stored and migrated in
background to secondary storage. The implementation would be easiest for
NFS and local storage, not sure about others, and except for Xen (since
snapshots are differential/incremental) it should be easier for KVM,
VMWare etc using full (immutable) snapshots.

> Again...I know this thread is more about RBD, but I thought this might be
> of interest as an FYI.

Keeping snapshots on primary storage would be a huge performance bonus,
not just for RBD but other storage technologies as well.

I think it's a good idea, as I've seen snapshots taking too much time
because of network latency issues.

>
> On Wed, Feb 18, 2015 at 6:09 AM, Punith S <pu...@cloudbyte.com> wrote:
>
>> +1 with this feature.
>>
>> like Logan said the current snapshot is just a copy of VDI/vmdk/qcow2's to
>> the secondary storage
>> hence the current feature acts as a backup feature taking a long time.
>>
>> also the current cloudstack storage framework is not allowing the third
>> party storage vendors like cloudbyte and others to leverage their backend
>> storage snapshot feature where file systems support Copy on Write.
>>
>> for example : In cloudbyte elastistor which is based on zfs filesystem, it
>> allows to take multiple snapshots within seconds.
>>                       and a volume clones can be created from each snapshot.
>> hence if the VDI/vmdk/qcow2's are already residing in
>>                       the volume, the clones will just replicate the
>> existing virtual disks. hence there will be no overhead of copying a
>>                       snapshot to a new volume or primary storage over the
>> network.
>>
>> hence there should be an option provided to leverage a snapshot and
>> creating a volume out of it to the corresponding backend storage provider
>> in use.
>>
>> thanks
>>
>>
>> On Tue, Feb 17, 2015 at 3:11 AM, Mike Tutkowski <
>> mike.tutkowski@solidfire.com> wrote:
>>
>>> Whatever way you think makes the most sense.
>>>
>>> Either way, I'm working on this for XenServer and ESXi (eventually on
>> KVM,
>>> I expect) for managed storage (SolidFire is an example of managed
>> storage).
>>>
>>> On Mon, Feb 16, 2015 at 2:38 PM, Andrei Mikhailovsky <an...@arhont.com>
>>> wrote:
>>>
>>>> I am happy to see the discussion is taking its pace and a lot of people
>>>> tend to agree that we should address this area. I have done the ticket
>>> for
>>>> that, but I am not sure if this should be dealt in a more general way
>> as
>>>> suggested. Or perhaps having individual tickets for each hypervisor
>> would
>>>> achieve a faster response from the community?
>>>>
>>>> Andrei
>>>>
>>>> ----- Original Message -----
>>>>
>>>>> From: "Mike Tutkowski" <mi...@solidfire.com>
>>>>> To: dev@cloudstack.apache.org
>>>>> Sent: Monday, 16 February, 2015 9:17:26 PM
>>>>> Subject: Re: Your thoughts on using Primary Storage for keeping
>>>>> snapshots
>>>>
>>>>> Well...count me in on the general-purpose part (I'm already working
>>>>> on that
>>>>> and have much of it working).
>>>>
>>>>> If someone is interested in implementing the RBD part, he/she can
>>>>> sync with
>>>>> me and see if there is any overlapping work that I've already
>>>>> implementing
>>>>> from a general-purpose standpoint.
>>>>
>>>>> On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae <ir...@cloudops.com> wrote:
>>>>
>>>>>> Agree with Logan. As fans of Ceph as well as SolidFire, we are
>>>>>> interested
>>>>>> in seeing this particular use case (RBD/KVM) being well
>>>>>> implemented,
>>>>>> however the concept of volume snapshots residing only on primary
>>>>>> storage vs
>>>>>> being transferred to secondary storage is a more generally useful
>>>>>> one that
>>>>>> is worth solving with the same terminology and interfaces, even if
>>>>>> the
>>>>>> mechanisms may be specific to the storage type and hypervisor.
>>>>>>
>>>>>> It its not practical then its not practical, but seems like it
>>>>>> would be
>>>>>> worth trying.
>>>>>>
>>>>>> On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield
>>>>>> <lb...@tqhosting.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mike,
>>>>>>>
>>>>>>> I agree it is a general CloudStack issue that can be addressed
>>>>>>> across
>>>>>>> multiple primary storage options. It's a two stage issue since
>>>>>>> some
>>>>>>> changes will need to be implemented to support these features
>>>>>>> across
>>>>>>> the board, and others will need to be made to each storage
>>>>>>> option.
>>>>>>>
>>>>>>> It would be nice to see a single issue opened to cover this
>>>>>>> across all
>>>>>>> available storage options. Maybe have a community vote on what
>>>>>>> support they want to see, and not consider the feature complete
>>>>>>> until
>>>>>>> all of the desired options are implemented? That would slow down
>>>>>>> development for sure, but it would ensure that it was supported
>>>>>>> where
>>>>>>> it needs to be.
>>>>>>>
>>>>>>> Thank You,
>>>>>>>
>>>>>>> Logan Barfield
>>>>>>> Tranquil Hosting
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
>>>>>>> <mi...@solidfire.com> wrote:
>>>>>>>> For example, Punith from CloudByte sent out an e-mail yesterday
>>>>>>>> that
>>>>>> was
>>>>>>>> very similar to this thread, but he was wondering how to
>>>>>>>> implement
>>>>>> such a
>>>>>>>> concept on his company's SAN technology.
>>>>>>>>
>>>>>>>> On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
>>>>>>>> mike.tutkowski@solidfire.com> wrote:
>>>>>>>>
>>>>>>>>> Yeah, I think it's a similar concept, though.
>>>>>>>>>
>>>>>>>>> You would want to take snapshots on Ceph (or some other
>>>>>>>>> backend system
>>>>>>>>> that acts as primary storage) instead of copying data to
>>>>>>>>> secondary
>>>>>>> storage
>>>>>>>>> and calling it a snapshot.
>>>>>>>>>
>>>>>>>>> For Ceph or any other backend system like that, the idea is to
>>>>>>>>> speed
>>>>>> up
>>>>>>>>> snapshots by not requiring CPU cycles on the front end or
>>>>>>>>> network
>>>>>>> bandwidth
>>>>>>>>> to transfer the data.
>>>>>>>>>
>>>>>>>>> In that sense, this is a general-purpose CloudStack problem
>>>>>>>>> and it
>>>>>>> appears
>>>>>>>>> you are intending on discussing only the Ceph implementation
>>>>>>>>> here,
>>>>>>> which is
>>>>>>>>> fine.
>>>>>>>>>
>>>>>>>>> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <
>>>>>>> lbarfield@tqhosting.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Mike,
>>>>>>>>>>
>>>>>>>>>> I think the interest in this issue is primarily for Ceph RBD,
>>>>>>>>>> which
>>>>>>>>>> doesn't use iSCSI or SAN concepts in general. As well I
>>>>>>>>>> believe RBD
>>>>>>>>>> is only currently supported in KVM (and VMware?). QEMU has
>>>>>>>>>> native
>>>>>> RBD
>>>>>>>>>> support, so it attaches the devices directly to the VMs in
>>>>>>>>>> question.
>>>>>>>>>> It also natively supports snapshotting, which is what this
>>>>>>>>>> discussion
>>>>>>>>>> is about.
>>>>>>>>>>
>>>>>>>>>> Thank You,
>>>>>>>>>>
>>>>>>>>>> Logan Barfield
>>>>>>>>>> Tranquil Hosting
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
>>>>>>>>>> <mi...@solidfire.com> wrote:
>>>>>>>>>>> I should have also commented on KVM (since that was the
>>>>>>>>>>> hypervisor
>>>>>>>>>> called
>>>>>>>>>>> out in the initial e-mail).
>>>>>>>>>>>
>>>>>>>>>>> In my situation, most of my customers use XenServer and/or
>>>>>>>>>>> ESXi, so
>>>>>>> KVM
>>>>>>>>>> has
>>>>>>>>>>> received the fewest of my cycles with regards to those
>>>>>>>>>>> three
>>>>>>>>>> hypervisors.
>>>>>>>>>>>
>>>>>>>>>>> KVM, though, is actually the simplest hypervisor for which
>>>>>>>>>>> to
>>>>>>> implement
>>>>>>>>>>> these changes (since I am using the iSCSI adapter of the
>>>>>>>>>>> KVM agent
>>>>>>> and
>>>>>>>>>> it
>>>>>>>>>>> just essentially passes my LUN to the VM in question).
>>>>>>>>>>>
>>>>>>>>>>> For KVM, there is no clustered file system applied to my
>>>>>>>>>>> backend
>>>>>> LUN,
>>>>>>>>>> so I
>>>>>>>>>>> don't have to "worry" about that layer.
>>>>>>>>>>>
>>>>>>>>>>> I don't see any hurdles like *immutable* UUIDs of SRs and
>>>>>>>>>>> VDIs
>>>>>> (such
>>>>>>> is
>>>>>>>>>> the
>>>>>>>>>>> case with XenServer) or having to re-signature anything
>>>>>>>>>>> (such is
>>>>>> the
>>>>>>>>>> case
>>>>>>>>>>> with ESXi).
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
>>>>>>>>>>> mike.tutkowski@solidfire.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I have been working on this on and off for a while now (as
>>>>>>>>>>>> time
>>>>>>>>>> permits).
>>>>>>>>>>>>
>>>>>>>>>>>> Here is an e-mail I sent to a customer of ours that helps
>>>>>>>>>>>> describe
>>>>>>>>>> some of
>>>>>>>>>>>> the issues:
>>>>>>>>>>>>
>>>>>>>>>>>> *** Beginning of e-mail ***
>>>>>>>>>>>>
>>>>>>>>>>>> The main requests were around the following features:
>>>>>>>>>>>>
>>>>>>>>>>>> * The ability to leverage SolidFire snapshots.
>>>>>>>>>>>>
>>>>>>>>>>>> * The ability to create CloudStack templates from
>>>>>>>>>>>> SolidFire
>>>>>>> snapshots.
>>>>>>>>>>>>
>>>>>>>>>>>> I had these on my roadmap, but bumped the priority up and
>>>>>>>>>>>> began
>>>>>>> work on
>>>>>>>>>>>> them for the CS 4.6 release.
>>>>>>>>>>>>
>>>>>>>>>>>> During design, I realized there were issues with the way
>>>>>>>>>>>> XenServer
>>>>>>> is
>>>>>>>>>>>> architected that prevented me from directly using
>>>>>>>>>>>> SolidFire
>>>>>>> snapshots.
>>>>>>>>>>>>
>>>>>>>>>>>> I could definitely take a SolidFire snapshot of a
>>>>>>>>>>>> SolidFire
>>>>>> volume,
>>>>>>> but
>>>>>>>>>>>> this snapshot would not be usable from XenServer if the
>>>>>>>>>>>> original
>>>>>>>>>> volume was
>>>>>>>>>>>> still in use.
>>>>>>>>>>>>
>>>>>>>>>>>> Here is the gist of the problem:
>>>>>>>>>>>>
>>>>>>>>>>>> When XenServer leverages an iSCSI target such as a
>>>>>>>>>>>> SolidFire
>>>>>>> volume, it
>>>>>>>>>>>> applies a clustered files system to it, which they call a
>>>>>>>>>>>> storage
>>>>>>>>>>>> repository (SR). An SR has an *immutable* UUID associated
>>>>>>>>>>>> with it.
>>>>>>>>>>>>
>>>>>>>>>>>> The virtual volume (which a VM sees as a disk) is
>>>>>>>>>>>> represented by a
>>>>>>>>>> virtual
>>>>>>>>>>>> disk image (VDI) in the SR. A VDI also has an *immutable*
>>>>>>>>>>>> UUID
>>>>>>>>>> associated
>>>>>>>>>>>> with it.
>>>>>>>>>>>>
>>>>>>>>>>>> If I take a snapshot (or a clone) of the SolidFire volume
>>>>>>>>>>>> and then
>>>>>>>>>> later
>>>>>>>>>>>> try to use that snapshot from XenServer, XenServer
>>>>>>>>>>>> complains that
>>>>>>> the
>>>>>>>>>> SR on
>>>>>>>>>>>> the snapshot has a UUID that conflicts with an existing
>>>>>>>>>>>> UUID.
>>>>>>>>>>>>
>>>>>>>>>>>> In other words, it is not possible to use the original SR
>>>>>>>>>>>> and the
>>>>>>>>>> snapshot
>>>>>>>>>>>> of this SR from XenServer at the same time, which is
>>>>>>>>>>>> critical in a
>>>>>>>>>> cloud
>>>>>>>>>>>> environment (to enable creating templates from snapshots).
>>>>>>>>>>>>
>>>>>>>>>>>> The way I have proposed circumventing this issue is not
>>>>>>>>>>>> ideal, but
>>>>>>>>>>>> technically works (this code is checked into the CS 4.6
>>>>>>>>>>>> branch):
>>>>>>>>>>>>
>>>>>>>>>>>> When the time comes to take a CloudStack snapshot of a
>>>>>>>>>>>> CloudStack
>>>>>>>>>> volume
>>>>>>>>>>>> that is backed by SolidFire storage via the storage
>>>>>>>>>>>> plug-in, the
>>>>>>>>>> plug-in
>>>>>>>>>>>> will create a new SolidFire volume with characteristics
>>>>>>>>>>>> (size and
>>>>>>> IOPS)
>>>>>>>>>>>> equal to those of the original volume.
>>>>>>>>>>>>
>>>>>>>>>>>> We then have XenServer attach to this new SolidFire
>>>>>>>>>>>> volume,
>>>>>> create a
>>>>>>>>>> *new*
>>>>>>>>>>>> SR on it, and then copy the VDI from the source SR to the
>>>>>>> destination
>>>>>>>>>> SR
>>>>>>>>>>>> (the new SR).
>>>>>>>>>>>>
>>>>>>>>>>>> This leads to us having a copy of the VDI (a "snapshot" of
>>>>>>>>>>>> sorts),
>>>>>>> but
>>>>>>>>>> it
>>>>>>>>>>>> requires CPU cycles on the compute cluster as well as
>>>>>>>>>>>> network
>>>>>>>>>> bandwidth to
>>>>>>>>>>>> write to the SAN (thus it is slower and more resource
>>>>>>>>>>>> intensive
>>>>>>> than a
>>>>>>>>>>>> SolidFire snapshot).
>>>>>>>>>>>>
>>>>>>>>>>>> I spoke with Tim Mackey (who works on XenServer at Citrix)
>>>>>>> concerning
>>>>>>>>>> this
>>>>>>>>>>>> issue before and during the CloudStack Collaboration
>>>>>>>>>>>> Conference in
>>>>>>>>>> Budapest
>>>>>>>>>>>> in November. He agreed that this is a legitimate issue
>>>>>>>>>>>> with the
>>>>>> way
>>>>>>>>>>>> XenServer is designed and could not think of a way (other
>>>>>>>>>>>> than
>>>>>> what
>>>>>>> I
>>>>>>>>>> was
>>>>>>>>>>>> doing) to get around it in current versions of XenServer.
>>>>>>>>>>>>
>>>>>>>>>>>> One thought is to have a feature added to XenServer that
>>>>>>>>>>>> enables
>>>>>>> you to
>>>>>>>>>>>> change the UUID of an SR and of a VDI.
>>>>>>>>>>>>
>>>>>>>>>>>> If I could do that, then I could take a SolidFire snapshot
>>>>>>>>>>>> of the
>>>>>>>>>>>> SolidFire volume and issue commands to XenServer to have
>>>>>>>>>>>> it change
>>>>>>> the
>>>>>>>>>>>> UUIDs of the original SR and the original VDI. I could
>>>>>>>>>>>> then
>>>>>> recored
>>>>>>> the
>>>>>>>>>>>> necessary UUID info in the CS DB.
>>>>>>>>>>>>
>>>>>>>>>>>> *** End of e-mail ***
>>>>>>>>>>>>
>>>>>>>>>>>> I have since investigated this on ESXi.
>>>>>>>>>>>>
>>>>>>>>>>>> ESXi does have a way for us to "re-signature" a datastore,
>>>>>>>>>>>> so
>>>>>>> backend
>>>>>>>>>>>> snapshots can be taken and effectively used on this
>>>>>>>>>>>> hypervisor.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
>>>>>>>>>> lbarfield@tqhosting.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm just going to stick with the qemu-img option change
>>>>>>>>>>>>> for RBD
>>>>>> for
>>>>>>>>>>>>> now (which should cut snapshot time down drastically),
>>>>>>>>>>>>> and look
>>>>>>>>>>>>> forward to this in the future. I'd be happy to help get
>>>>>>>>>>>>> this
>>>>>>> moving,
>>>>>>>>>>>>> but I'm not enough of a developer to lead the charge.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As far as renaming goes, I agree that maybe backups isn't
>>>>>>>>>>>>> the
>>>>>> right
>>>>>>>>>>>>> word. That being said calling a full-sized copy of a
>>>>>>>>>>>>> volume a
>>>>>>>>>>>>> "snapshot" also isn't the right word. Maybe "image" would
>>>>>>>>>>>>> be
>>>>>>> better?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've also got my reservations about "accounts" vs "users"
>>>>>>>>>>>>> (I
>>>>>> think
>>>>>>>>>>>>> "departments" and "accounts or users" respectively is
>>>>>>>>>>>>> less
>>>>>>> confusing),
>>>>>>>>>>>>> but that's a different thread.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank You,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Logan Barfield
>>>>>>>>>>>>> Tranquil Hosting
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <
>>>>>>> wido@widodh.nl>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 16-02-15 15:38, Logan Barfield wrote:
>>>>>>>>>>>>>>> I like this idea a lot for Ceph RBD. I do think there
>>>>>>>>>>>>>>> should
>>>>>>>>>> still be
>>>>>>>>>>>>>>> support for copying snapshots to secondary storage as
>>>>>>>>>>>>>>> needed
>>>>>>> (for
>>>>>>>>>>>>>>> transfers between zones, etc.). I really think that
>>>>>>>>>>>>>>> this
>>>>>> could
>>>>>>> be
>>>>>>>>>>>>>>> part of a larger move to clarify the naming
>>>>>>>>>>>>>>> conventions used
>>>>>> for
>>>>>>>>>> disk
>>>>>>>>>>>>>>> operations. Currently "Volume Snapshots" should
>>>>>>>>>>>>>>> probably
>>>>>>> really be
>>>>>>>>>>>>>>> called "Backups". So having "snapshot" functionality,
>>>>>>>>>>>>>>> and a
>>>>>>>>>> "convert
>>>>>>>>>>>>>>> snapshot to backup/template" would be a good move.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I fully agree that this would be a very great addition.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I won't be able to work on this any time soon though.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Wido
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank You,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Logan Barfield
>>>>>>>>>>>>>>> Tranquil Hosting
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
>>>>>>>>>>>>> andrija.panic@gmail.com> wrote:
>>>>>>>>>>>>>>>> BIG +1
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My team should submit some patch to ACS for better
>>>>>>>>>>>>>>>> KVM
>>>>>>> snapshots,
>>>>>>>>>>>>> including
>>>>>>>>>>>>>>>> whole VM snapshot etc...but it's too early to give
>>>>>>>>>>>>>>>> details...
>>>>>>>>>>>>>>>> best
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
>>>>>>>>>> andrei@arhont.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello guys,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I was hoping to have some feedback from the
>>>>>>>>>>>>>>>>> community on the
>>>>>>>>>> subject
>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> having an ability to keep snapshots on the primary
>>>>>>>>>>>>>>>>> storage
>>>>>>> where
>>>>>>>>>> it
>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>> supported by the storage backend.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The idea behind this functionality is to improve how
>>>>>> snapshots
>>>>>>>>>> are
>>>>>>>>>>>>>>>>> currently handled on KVM hypervisors with Ceph
>>>>>>>>>>>>>>>>> primary
>>>>>>> storage.
>>>>>>>>>> At
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> moment, the snapshots are taken on the primary
>>>>>>>>>>>>>>>>> storage and
>>>>>>> being
>>>>>>>>>>>>> copied to
>>>>>>>>>>>>>>>>> the secondary storage. This method is very slow and
>>>>>>> inefficient
>>>>>>>>>> even
>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>> small infrastructure. Even on medium deployments
>>>>>>>>>>>>>>>>> using
>>>>>>> snapshots
>>>>>>>>>> in
>>>>>>>>>>>>> KVM
>>>>>>>>>>>>>>>>> becomes nearly impossible. If you have tens or
>>>>>>>>>>>>>>>>> hundreds
>>>>>>>>>> concurrent
>>>>>>>>>>>>>>>>> snapshots taking place you will have a bunch of
>>>>>>>>>>>>>>>>> timeouts and
>>>>>>>>>> errors,
>>>>>>>>>>>>> your
>>>>>>>>>>>>>>>>> network becomes clogged, etc. In addition, using
>>>>>>>>>>>>>>>>> these
>>>>>>> snapshots
>>>>>>>>>> for
>>>>>>>>>>>>>>>>> creating new volumes or reverting back vms also slow
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>> inefficient. As
>>>>>>>>>>>>>>>>> above, when you have tens or hundreds concurrent
>>>>>>>>>>>>>>>>> operations
>>>>>> it
>>>>>>>>>> will
>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>> succeed and you will have a majority of tasks with
>>>>>>>>>>>>>>>>> errors or
>>>>>>>>>>>>> timeouts.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> At the moment, taking a single snapshot of
>>>>>>>>>>>>>>>>> relatively small
>>>>>>>>>> volumes
>>>>>>>>>>>>> (200GB
>>>>>>>>>>>>>>>>> or 500GB for instance) takes tens if not hundreds of
>>>>>> minutes.
>>>>>>>>>> Taking
>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>> snapshot of the same volume on ceph primary storage
>>>>>>>>>>>>>>>>> takes a
>>>>>>> few
>>>>>>>>>>>>> seconds at
>>>>>>>>>>>>>>>>> most! Similarly, converting a snapshot to a volume
>>>>>>>>>>>>>>>>> takes
>>>>>> tens
>>>>>>> if
>>>>>>>>>> not
>>>>>>>>>>>>>>>>> hundreds of minutes when secondary storage is
>>>>>>>>>>>>>>>>> involved;
>>>>>>> compared
>>>>>>>>>> with
>>>>>>>>>>>>>>>>> seconds if done directly on the primary storage.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I suggest that the CloudStack should have the
>>>>>>>>>>>>>>>>> ability to
>>>>>> keep
>>>>>>>>>> volume
>>>>>>>>>>>>>>>>> snapshots on the primary storage where this is
>>>>>>>>>>>>>>>>> supported by
>>>>>>> the
>>>>>>>>>>>>> storage.
>>>>>>>>>>>>>>>>> Perhaps having a per primary storage setting that
>>>>>>>>>>>>>>>>> enables
>>>>>> this
>>>>>>>>>>>>>>>>> functionality. This will be beneficial for Ceph
>>>>>>>>>>>>>>>>> primary
>>>>>>> storage
>>>>>>>>>> on
>>>>>>>>>>>>> KVM
>>>>>>>>>>>>>>>>> hypervisors and perhaps on XenServer when Ceph will
>>>>>>>>>>>>>>>>> be
>>>>>>> supported
>>>>>>>>>> in
>>>>>>>>>>>>> a near
>>>>>>>>>>>>>>>>> future.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This will greatly speed up the process of using
>>>>>>>>>>>>>>>>> snapshots on
>>>>>>> KVM
>>>>>>>>>> and
>>>>>>>>>>>>> users
>>>>>>>>>>>>>>>>> will actually start using snapshotting rather than
>>>>>>>>>>>>>>>>> giving up
>>>>>>> with
>>>>>>>>>>>>>>>>> frustration.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have opened the ticket CLOUDSTACK-8256, so please
>>>>>>>>>>>>>>>>> cast
>>>>>> your
>>>>>>>>>> vote
>>>>>>>>>>>>> if you
>>>>>>>>>>>>>>>>> are in agreement.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for your input
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Andrei
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Andrija Panić
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> *Mike Tutkowski*
>>>>>>>>>>>> *Senior CloudStack Developer, SolidFire Inc.*
>>>>>>>>>>>> e: mike.tutkowski@solidfire.com
>>>>>>>>>>>> o: 303.746.7302
>>>>>>>>>>>> Advancing the way the world uses the cloud
>>>>>>>>>>>> <http://solidfire.com/solution/overview/?video=play>*™*
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> *Mike Tutkowski*
>>>>>>>>>>> *Senior CloudStack Developer, SolidFire Inc.*
>>>>>>>>>>> e: mike.tutkowski@solidfire.com
>>>>>>>>>>> o: 303.746.7302
>>>>>>>>>>> Advancing the way the world uses the cloud
>>>>>>>>>>> <http://solidfire.com/solution/overview/?video=play>*™*
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *Mike Tutkowski*
>>>>>>>>> *Senior CloudStack Developer, SolidFire Inc.*
>>>>>>>>> e: mike.tutkowski@solidfire.com
>>>>>>>>> o: 303.746.7302
>>>>>>>>> Advancing the way the world uses the cloud
>>>>>>>>> <http://solidfire.com/solution/overview/?video=play>*™*
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Mike Tutkowski*
>>>>>>>> *Senior CloudStack Developer, SolidFire Inc.*
>>>>>>>> e: mike.tutkowski@solidfire.com
>>>>>>>> o: 303.746.7302
>>>>>>>> Advancing the way the world uses the cloud
>>>>>>>> <http://solidfire.com/solution/overview/?video=play>*™*
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Ian Rae*
>>>>>> PDG *| *CEO
>>>>>> t *514.944.4008*
>>>>>>
>>>>>> *CloudOps* Votre partenaire infonuagique* | *Cloud Solutions
>>>>>> Experts
>>>>>> w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|*
>>>>>> Montreal *|*
>>>>>> Quebec *|* H3J 1S6
>>>>>>
>>>>>> <https://www.cloud.ca/>
>>>>>> <
>>>>>>
>>>>
>>>
>> http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/
>>>>>>>
>>>>>>
>>>>
>>>>> --
>>>>> *Mike Tutkowski*
>>>>> *Senior CloudStack Developer, SolidFire Inc.*
>>>>> e: mike.tutkowski@solidfire.com
>>>>> o: 303.746.7302
>>>>> Advancing the way the world uses the cloud
>>>>> <http://solidfire.com/solution/overview/?video=play>*™*
>>>>
>>>
>>>
>>>
>>> --
>>> *Mike Tutkowski*
>>> *Senior CloudStack Developer, SolidFire Inc.*
>>> e: mike.tutkowski@solidfire.com
>>> o: 303.746.7302
>>> Advancing the way the world uses the cloud
>>> <http://solidfire.com/solution/overview/?video=play>*™*
>>>
>>
>>
>>
>> --
>> regards,
>>
>> punith s
>> cloudbyte.com
>>
>
>
>

--
Regards,
Rohit Yadav
Software Architect, ShapeBlue
M. +91 8826230892 | rohit.yadav@shapeblue.com
Blog: bhaisaab.org | Twitter: @_bhaisaab
PS. If you see any footer below, I did not add it :)
Find out more about ShapeBlue and our range of CloudStack related services

IaaS Cloud Design & Build<http://shapeblue.com/iaas-cloud-design-and-build//>
CSForge – rapid IaaS deployment framework<http://shapeblue.com/csforge/>
CloudStack Consulting<http://shapeblue.com/cloudstack-consultancy/>
CloudStack Software Engineering<http://shapeblue.com/cloudstack-software-engineering/>
CloudStack Infrastructure Support<http://shapeblue.com/cloudstack-infrastructure-support/>
CloudStack Bootcamp Training Courses<http://shapeblue.com/cloudstack-training/>

This email and any attachments to it may be confidential and are intended solely for the use of the individual to whom it is addressed. Any views or opinions expressed are solely those of the author and do not necessarily represent those of Shape Blue Ltd or related companies. If you are not the intended recipient of this email, you must neither take any action based upon its contents, nor copy or show it to anyone. Please contact the sender if you believe you have received this email in error. Shape Blue Ltd is a company incorporated in England & Wales. ShapeBlue Services India LLP is a company incorporated in India and is operated under license from Shape Blue Ltd. Shape Blue Brasil Consultoria Ltda is a company incorporated in Brasil and is operated under license from Shape Blue Ltd. ShapeBlue SA Pty Ltd is a company registered by The Republic of South Africa and is traded under license from Shape Blue Ltd. ShapeBlue is a registered trademark.

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Mike Tutkowski <mi...@solidfire.com>.
So, I spoke with Edison a couple months ago about my desire to take
snapshots on SolidFire's SAN instead of a making a backup on secondary
storage.

As Edison explained, CloudStack is theoretically flexible enough to
accommodate this if the developer implements the SnapshotStrategy interface
and orchestrates what needs to be done.

I went ahead and implemented this approach successfully in XenServer
(although, as I mentioned earlier, not in as ideal a fashion as I would
have liked due to issues in XenServer). However, along the way, I found
that there was code in CloudStack "core" that assumed snapshots would be
stored on secondary storage.

I went ahead and made this core logic more flexible in 4.6 and so 4.6 is
the first release that I will be able to offer the ability for users to
take CloudStack snapshots that are stored on my SAN instead of on secondary
storage.

I then implemented a custom DataMotionStrategy to use with my snapshots so
that I could create templates for general use in CloudStack and so I could
make CloudStack volumes from my snapshots.

Again...I know this thread is more about RBD, but I thought this might be
of interest as an FYI.

On Wed, Feb 18, 2015 at 6:09 AM, Punith S <pu...@cloudbyte.com> wrote:

> +1 with this feature.
>
> like Logan said the current snapshot is just a copy of VDI/vmdk/qcow2's to
> the secondary storage
> hence the current feature acts as a backup feature taking a long time.
>
> also the current cloudstack storage framework is not allowing the third
> party storage vendors like cloudbyte and others to leverage their backend
> storage snapshot feature where file systems support Copy on Write.
>
> for example : In cloudbyte elastistor which is based on zfs filesystem, it
> allows to take multiple snapshots within seconds.
>                      and a volume clones can be created from each snapshot.
> hence if the VDI/vmdk/qcow2's are already residing in
>                      the volume, the clones will just replicate the
> existing virtual disks. hence there will be no overhead of copying a
>                      snapshot to a new volume or primary storage over the
> network.
>
> hence there should be an option provided to leverage a snapshot and
> creating a volume out of it to the corresponding backend storage provider
> in use.
>
> thanks
>
>
> On Tue, Feb 17, 2015 at 3:11 AM, Mike Tutkowski <
> mike.tutkowski@solidfire.com> wrote:
>
> > Whatever way you think makes the most sense.
> >
> > Either way, I'm working on this for XenServer and ESXi (eventually on
> KVM,
> > I expect) for managed storage (SolidFire is an example of managed
> storage).
> >
> > On Mon, Feb 16, 2015 at 2:38 PM, Andrei Mikhailovsky <an...@arhont.com>
> > wrote:
> >
> > > I am happy to see the discussion is taking its pace and a lot of people
> > > tend to agree that we should address this area. I have done the ticket
> > for
> > > that, but I am not sure if this should be dealt in a more general way
> as
> > > suggested. Or perhaps having individual tickets for each hypervisor
> would
> > > achieve a faster response from the community?
> > >
> > > Andrei
> > >
> > > ----- Original Message -----
> > >
> > > > From: "Mike Tutkowski" <mi...@solidfire.com>
> > > > To: dev@cloudstack.apache.org
> > > > Sent: Monday, 16 February, 2015 9:17:26 PM
> > > > Subject: Re: Your thoughts on using Primary Storage for keeping
> > > > snapshots
> > >
> > > > Well...count me in on the general-purpose part (I'm already working
> > > > on that
> > > > and have much of it working).
> > >
> > > > If someone is interested in implementing the RBD part, he/she can
> > > > sync with
> > > > me and see if there is any overlapping work that I've already
> > > > implementing
> > > > from a general-purpose standpoint.
> > >
> > > > On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae <ir...@cloudops.com> wrote:
> > >
> > > > > Agree with Logan. As fans of Ceph as well as SolidFire, we are
> > > > > interested
> > > > > in seeing this particular use case (RBD/KVM) being well
> > > > > implemented,
> > > > > however the concept of volume snapshots residing only on primary
> > > > > storage vs
> > > > > being transferred to secondary storage is a more generally useful
> > > > > one that
> > > > > is worth solving with the same terminology and interfaces, even if
> > > > > the
> > > > > mechanisms may be specific to the storage type and hypervisor.
> > > > >
> > > > > It its not practical then its not practical, but seems like it
> > > > > would be
> > > > > worth trying.
> > > > >
> > > > > On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield
> > > > > <lb...@tqhosting.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Mike,
> > > > > >
> > > > > > I agree it is a general CloudStack issue that can be addressed
> > > > > > across
> > > > > > multiple primary storage options. It's a two stage issue since
> > > > > > some
> > > > > > changes will need to be implemented to support these features
> > > > > > across
> > > > > > the board, and others will need to be made to each storage
> > > > > > option.
> > > > > >
> > > > > > It would be nice to see a single issue opened to cover this
> > > > > > across all
> > > > > > available storage options. Maybe have a community vote on what
> > > > > > support they want to see, and not consider the feature complete
> > > > > > until
> > > > > > all of the desired options are implemented? That would slow down
> > > > > > development for sure, but it would ensure that it was supported
> > > > > > where
> > > > > > it needs to be.
> > > > > >
> > > > > > Thank You,
> > > > > >
> > > > > > Logan Barfield
> > > > > > Tranquil Hosting
> > > > > >
> > > > > >
> > > > > > On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
> > > > > > <mi...@solidfire.com> wrote:
> > > > > > > For example, Punith from CloudByte sent out an e-mail yesterday
> > > > > > > that
> > > > > was
> > > > > > > very similar to this thread, but he was wondering how to
> > > > > > > implement
> > > > > such a
> > > > > > > concept on his company's SAN technology.
> > > > > > >
> > > > > > > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> > > > > > > mike.tutkowski@solidfire.com> wrote:
> > > > > > >
> > > > > > >> Yeah, I think it's a similar concept, though.
> > > > > > >>
> > > > > > >> You would want to take snapshots on Ceph (or some other
> > > > > > >> backend system
> > > > > > >> that acts as primary storage) instead of copying data to
> > > > > > >> secondary
> > > > > > storage
> > > > > > >> and calling it a snapshot.
> > > > > > >>
> > > > > > >> For Ceph or any other backend system like that, the idea is to
> > > > > > >> speed
> > > > > up
> > > > > > >> snapshots by not requiring CPU cycles on the front end or
> > > > > > >> network
> > > > > > bandwidth
> > > > > > >> to transfer the data.
> > > > > > >>
> > > > > > >> In that sense, this is a general-purpose CloudStack problem
> > > > > > >> and it
> > > > > > appears
> > > > > > >> you are intending on discussing only the Ceph implementation
> > > > > > >> here,
> > > > > > which is
> > > > > > >> fine.
> > > > > > >>
> > > > > > >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <
> > > > > > lbarfield@tqhosting.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >>> Hi Mike,
> > > > > > >>>
> > > > > > >>> I think the interest in this issue is primarily for Ceph RBD,
> > > > > > >>> which
> > > > > > >>> doesn't use iSCSI or SAN concepts in general. As well I
> > > > > > >>> believe RBD
> > > > > > >>> is only currently supported in KVM (and VMware?). QEMU has
> > > > > > >>> native
> > > > > RBD
> > > > > > >>> support, so it attaches the devices directly to the VMs in
> > > > > > >>> question.
> > > > > > >>> It also natively supports snapshotting, which is what this
> > > > > > >>> discussion
> > > > > > >>> is about.
> > > > > > >>>
> > > > > > >>> Thank You,
> > > > > > >>>
> > > > > > >>> Logan Barfield
> > > > > > >>> Tranquil Hosting
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> > > > > > >>> <mi...@solidfire.com> wrote:
> > > > > > >>> > I should have also commented on KVM (since that was the
> > > > > > >>> > hypervisor
> > > > > > >>> called
> > > > > > >>> > out in the initial e-mail).
> > > > > > >>> >
> > > > > > >>> > In my situation, most of my customers use XenServer and/or
> > > > > > >>> > ESXi, so
> > > > > > KVM
> > > > > > >>> has
> > > > > > >>> > received the fewest of my cycles with regards to those
> > > > > > >>> > three
> > > > > > >>> hypervisors.
> > > > > > >>> >
> > > > > > >>> > KVM, though, is actually the simplest hypervisor for which
> > > > > > >>> > to
> > > > > > implement
> > > > > > >>> > these changes (since I am using the iSCSI adapter of the
> > > > > > >>> > KVM agent
> > > > > > and
> > > > > > >>> it
> > > > > > >>> > just essentially passes my LUN to the VM in question).
> > > > > > >>> >
> > > > > > >>> > For KVM, there is no clustered file system applied to my
> > > > > > >>> > backend
> > > > > LUN,
> > > > > > >>> so I
> > > > > > >>> > don't have to "worry" about that layer.
> > > > > > >>> >
> > > > > > >>> > I don't see any hurdles like *immutable* UUIDs of SRs and
> > > > > > >>> > VDIs
> > > > > (such
> > > > > > is
> > > > > > >>> the
> > > > > > >>> > case with XenServer) or having to re-signature anything
> > > > > > >>> > (such is
> > > > > the
> > > > > > >>> case
> > > > > > >>> > with ESXi).
> > > > > > >>> >
> > > > > > >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> > > > > > >>> > mike.tutkowski@solidfire.com> wrote:
> > > > > > >>> >
> > > > > > >>> >> I have been working on this on and off for a while now (as
> > > > > > >>> >> time
> > > > > > >>> permits).
> > > > > > >>> >>
> > > > > > >>> >> Here is an e-mail I sent to a customer of ours that helps
> > > > > > >>> >> describe
> > > > > > >>> some of
> > > > > > >>> >> the issues:
> > > > > > >>> >>
> > > > > > >>> >> *** Beginning of e-mail ***
> > > > > > >>> >>
> > > > > > >>> >> The main requests were around the following features:
> > > > > > >>> >>
> > > > > > >>> >> * The ability to leverage SolidFire snapshots.
> > > > > > >>> >>
> > > > > > >>> >> * The ability to create CloudStack templates from
> > > > > > >>> >> SolidFire
> > > > > > snapshots.
> > > > > > >>> >>
> > > > > > >>> >> I had these on my roadmap, but bumped the priority up and
> > > > > > >>> >> began
> > > > > > work on
> > > > > > >>> >> them for the CS 4.6 release.
> > > > > > >>> >>
> > > > > > >>> >> During design, I realized there were issues with the way
> > > > > > >>> >> XenServer
> > > > > > is
> > > > > > >>> >> architected that prevented me from directly using
> > > > > > >>> >> SolidFire
> > > > > > snapshots.
> > > > > > >>> >>
> > > > > > >>> >> I could definitely take a SolidFire snapshot of a
> > > > > > >>> >> SolidFire
> > > > > volume,
> > > > > > but
> > > > > > >>> >> this snapshot would not be usable from XenServer if the
> > > > > > >>> >> original
> > > > > > >>> volume was
> > > > > > >>> >> still in use.
> > > > > > >>> >>
> > > > > > >>> >> Here is the gist of the problem:
> > > > > > >>> >>
> > > > > > >>> >> When XenServer leverages an iSCSI target such as a
> > > > > > >>> >> SolidFire
> > > > > > volume, it
> > > > > > >>> >> applies a clustered files system to it, which they call a
> > > > > > >>> >> storage
> > > > > > >>> >> repository (SR). An SR has an *immutable* UUID associated
> > > > > > >>> >> with it.
> > > > > > >>> >>
> > > > > > >>> >> The virtual volume (which a VM sees as a disk) is
> > > > > > >>> >> represented by a
> > > > > > >>> virtual
> > > > > > >>> >> disk image (VDI) in the SR. A VDI also has an *immutable*
> > > > > > >>> >> UUID
> > > > > > >>> associated
> > > > > > >>> >> with it.
> > > > > > >>> >>
> > > > > > >>> >> If I take a snapshot (or a clone) of the SolidFire volume
> > > > > > >>> >> and then
> > > > > > >>> later
> > > > > > >>> >> try to use that snapshot from XenServer, XenServer
> > > > > > >>> >> complains that
> > > > > > the
> > > > > > >>> SR on
> > > > > > >>> >> the snapshot has a UUID that conflicts with an existing
> > > > > > >>> >> UUID.
> > > > > > >>> >>
> > > > > > >>> >> In other words, it is not possible to use the original SR
> > > > > > >>> >> and the
> > > > > > >>> snapshot
> > > > > > >>> >> of this SR from XenServer at the same time, which is
> > > > > > >>> >> critical in a
> > > > > > >>> cloud
> > > > > > >>> >> environment (to enable creating templates from snapshots).
> > > > > > >>> >>
> > > > > > >>> >> The way I have proposed circumventing this issue is not
> > > > > > >>> >> ideal, but
> > > > > > >>> >> technically works (this code is checked into the CS 4.6
> > > > > > >>> >> branch):
> > > > > > >>> >>
> > > > > > >>> >> When the time comes to take a CloudStack snapshot of a
> > > > > > >>> >> CloudStack
> > > > > > >>> volume
> > > > > > >>> >> that is backed by SolidFire storage via the storage
> > > > > > >>> >> plug-in, the
> > > > > > >>> plug-in
> > > > > > >>> >> will create a new SolidFire volume with characteristics
> > > > > > >>> >> (size and
> > > > > > IOPS)
> > > > > > >>> >> equal to those of the original volume.
> > > > > > >>> >>
> > > > > > >>> >> We then have XenServer attach to this new SolidFire
> > > > > > >>> >> volume,
> > > > > create a
> > > > > > >>> *new*
> > > > > > >>> >> SR on it, and then copy the VDI from the source SR to the
> > > > > > destination
> > > > > > >>> SR
> > > > > > >>> >> (the new SR).
> > > > > > >>> >>
> > > > > > >>> >> This leads to us having a copy of the VDI (a "snapshot" of
> > > > > > >>> >> sorts),
> > > > > > but
> > > > > > >>> it
> > > > > > >>> >> requires CPU cycles on the compute cluster as well as
> > > > > > >>> >> network
> > > > > > >>> bandwidth to
> > > > > > >>> >> write to the SAN (thus it is slower and more resource
> > > > > > >>> >> intensive
> > > > > > than a
> > > > > > >>> >> SolidFire snapshot).
> > > > > > >>> >>
> > > > > > >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix)
> > > > > > concerning
> > > > > > >>> this
> > > > > > >>> >> issue before and during the CloudStack Collaboration
> > > > > > >>> >> Conference in
> > > > > > >>> Budapest
> > > > > > >>> >> in November. He agreed that this is a legitimate issue
> > > > > > >>> >> with the
> > > > > way
> > > > > > >>> >> XenServer is designed and could not think of a way (other
> > > > > > >>> >> than
> > > > > what
> > > > > > I
> > > > > > >>> was
> > > > > > >>> >> doing) to get around it in current versions of XenServer.
> > > > > > >>> >>
> > > > > > >>> >> One thought is to have a feature added to XenServer that
> > > > > > >>> >> enables
> > > > > > you to
> > > > > > >>> >> change the UUID of an SR and of a VDI.
> > > > > > >>> >>
> > > > > > >>> >> If I could do that, then I could take a SolidFire snapshot
> > > > > > >>> >> of the
> > > > > > >>> >> SolidFire volume and issue commands to XenServer to have
> > > > > > >>> >> it change
> > > > > > the
> > > > > > >>> >> UUIDs of the original SR and the original VDI. I could
> > > > > > >>> >> then
> > > > > recored
> > > > > > the
> > > > > > >>> >> necessary UUID info in the CS DB.
> > > > > > >>> >>
> > > > > > >>> >> *** End of e-mail ***
> > > > > > >>> >>
> > > > > > >>> >> I have since investigated this on ESXi.
> > > > > > >>> >>
> > > > > > >>> >> ESXi does have a way for us to "re-signature" a datastore,
> > > > > > >>> >> so
> > > > > > backend
> > > > > > >>> >> snapshots can be taken and effectively used on this
> > > > > > >>> >> hypervisor.
> > > > > > >>> >>
> > > > > > >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
> > > > > > >>> lbarfield@tqhosting.com>
> > > > > > >>> >> wrote:
> > > > > > >>> >>
> > > > > > >>> >>> I'm just going to stick with the qemu-img option change
> > > > > > >>> >>> for RBD
> > > > > for
> > > > > > >>> >>> now (which should cut snapshot time down drastically),
> > > > > > >>> >>> and look
> > > > > > >>> >>> forward to this in the future. I'd be happy to help get
> > > > > > >>> >>> this
> > > > > > moving,
> > > > > > >>> >>> but I'm not enough of a developer to lead the charge.
> > > > > > >>> >>>
> > > > > > >>> >>> As far as renaming goes, I agree that maybe backups isn't
> > > > > > >>> >>> the
> > > > > right
> > > > > > >>> >>> word. That being said calling a full-sized copy of a
> > > > > > >>> >>> volume a
> > > > > > >>> >>> "snapshot" also isn't the right word. Maybe "image" would
> > > > > > >>> >>> be
> > > > > > better?
> > > > > > >>> >>>
> > > > > > >>> >>> I've also got my reservations about "accounts" vs "users"
> > > > > > >>> >>> (I
> > > > > think
> > > > > > >>> >>> "departments" and "accounts or users" respectively is
> > > > > > >>> >>> less
> > > > > > confusing),
> > > > > > >>> >>> but that's a different thread.
> > > > > > >>> >>>
> > > > > > >>> >>> Thank You,
> > > > > > >>> >>>
> > > > > > >>> >>> Logan Barfield
> > > > > > >>> >>> Tranquil Hosting
> > > > > > >>> >>>
> > > > > > >>> >>>
> > > > > > >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <
> > > > > > wido@widodh.nl>
> > > > > > >>> >>> wrote:
> > > > > > >>> >>> >
> > > > > > >>> >>> >
> > > > > > >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> > > > > > >>> >>> >> I like this idea a lot for Ceph RBD. I do think there
> > > > > > >>> >>> >> should
> > > > > > >>> still be
> > > > > > >>> >>> >> support for copying snapshots to secondary storage as
> > > > > > >>> >>> >> needed
> > > > > > (for
> > > > > > >>> >>> >> transfers between zones, etc.). I really think that
> > > > > > >>> >>> >> this
> > > > > could
> > > > > > be
> > > > > > >>> >>> >> part of a larger move to clarify the naming
> > > > > > >>> >>> >> conventions used
> > > > > for
> > > > > > >>> disk
> > > > > > >>> >>> >> operations. Currently "Volume Snapshots" should
> > > > > > >>> >>> >> probably
> > > > > > really be
> > > > > > >>> >>> >> called "Backups". So having "snapshot" functionality,
> > > > > > >>> >>> >> and a
> > > > > > >>> "convert
> > > > > > >>> >>> >> snapshot to backup/template" would be a good move.
> > > > > > >>> >>> >>
> > > > > > >>> >>> >
> > > > > > >>> >>> > I fully agree that this would be a very great addition.
> > > > > > >>> >>> >
> > > > > > >>> >>> > I won't be able to work on this any time soon though.
> > > > > > >>> >>> >
> > > > > > >>> >>> > Wido
> > > > > > >>> >>> >
> > > > > > >>> >>> >> Thank You,
> > > > > > >>> >>> >>
> > > > > > >>> >>> >> Logan Barfield
> > > > > > >>> >>> >> Tranquil Hosting
> > > > > > >>> >>> >>
> > > > > > >>> >>> >>
> > > > > > >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
> > > > > > >>> >>> andrija.panic@gmail.com> wrote:
> > > > > > >>> >>> >>> BIG +1
> > > > > > >>> >>> >>>
> > > > > > >>> >>> >>> My team should submit some patch to ACS for better
> > > > > > >>> >>> >>> KVM
> > > > > > snapshots,
> > > > > > >>> >>> including
> > > > > > >>> >>> >>> whole VM snapshot etc...but it's too early to give
> > > > > > >>> >>> >>> details...
> > > > > > >>> >>> >>> best
> > > > > > >>> >>> >>>
> > > > > > >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
> > > > > > >>> andrei@arhont.com>
> > > > > > >>> >>> wrote:
> > > > > > >>> >>> >>>
> > > > > > >>> >>> >>>> Hello guys,
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>> I was hoping to have some feedback from the
> > > > > > >>> >>> >>>> community on the
> > > > > > >>> subject
> > > > > > >>> >>> of
> > > > > > >>> >>> >>>> having an ability to keep snapshots on the primary
> > > > > > >>> >>> >>>> storage
> > > > > > where
> > > > > > >>> it
> > > > > > >>> >>> is
> > > > > > >>> >>> >>>> supported by the storage backend.
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>> The idea behind this functionality is to improve how
> > > > > snapshots
> > > > > > >>> are
> > > > > > >>> >>> >>>> currently handled on KVM hypervisors with Ceph
> > > > > > >>> >>> >>>> primary
> > > > > > storage.
> > > > > > >>> At
> > > > > > >>> >>> the
> > > > > > >>> >>> >>>> moment, the snapshots are taken on the primary
> > > > > > >>> >>> >>>> storage and
> > > > > > being
> > > > > > >>> >>> copied to
> > > > > > >>> >>> >>>> the secondary storage. This method is very slow and
> > > > > > inefficient
> > > > > > >>> even
> > > > > > >>> >>> on
> > > > > > >>> >>> >>>> small infrastructure. Even on medium deployments
> > > > > > >>> >>> >>>> using
> > > > > > snapshots
> > > > > > >>> in
> > > > > > >>> >>> KVM
> > > > > > >>> >>> >>>> becomes nearly impossible. If you have tens or
> > > > > > >>> >>> >>>> hundreds
> > > > > > >>> concurrent
> > > > > > >>> >>> >>>> snapshots taking place you will have a bunch of
> > > > > > >>> >>> >>>> timeouts and
> > > > > > >>> errors,
> > > > > > >>> >>> your
> > > > > > >>> >>> >>>> network becomes clogged, etc. In addition, using
> > > > > > >>> >>> >>>> these
> > > > > > snapshots
> > > > > > >>> for
> > > > > > >>> >>> >>>> creating new volumes or reverting back vms also slow
> > > > > > >>> >>> >>>> and
> > > > > > >>> >>> inefficient. As
> > > > > > >>> >>> >>>> above, when you have tens or hundreds concurrent
> > > > > > >>> >>> >>>> operations
> > > > > it
> > > > > > >>> will
> > > > > > >>> >>> not
> > > > > > >>> >>> >>>> succeed and you will have a majority of tasks with
> > > > > > >>> >>> >>>> errors or
> > > > > > >>> >>> timeouts.
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>> At the moment, taking a single snapshot of
> > > > > > >>> >>> >>>> relatively small
> > > > > > >>> volumes
> > > > > > >>> >>> (200GB
> > > > > > >>> >>> >>>> or 500GB for instance) takes tens if not hundreds of
> > > > > minutes.
> > > > > > >>> Taking
> > > > > > >>> >>> a
> > > > > > >>> >>> >>>> snapshot of the same volume on ceph primary storage
> > > > > > >>> >>> >>>> takes a
> > > > > > few
> > > > > > >>> >>> seconds at
> > > > > > >>> >>> >>>> most! Similarly, converting a snapshot to a volume
> > > > > > >>> >>> >>>> takes
> > > > > tens
> > > > > > if
> > > > > > >>> not
> > > > > > >>> >>> >>>> hundreds of minutes when secondary storage is
> > > > > > >>> >>> >>>> involved;
> > > > > > compared
> > > > > > >>> with
> > > > > > >>> >>> >>>> seconds if done directly on the primary storage.
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>> I suggest that the CloudStack should have the
> > > > > > >>> >>> >>>> ability to
> > > > > keep
> > > > > > >>> volume
> > > > > > >>> >>> >>>> snapshots on the primary storage where this is
> > > > > > >>> >>> >>>> supported by
> > > > > > the
> > > > > > >>> >>> storage.
> > > > > > >>> >>> >>>> Perhaps having a per primary storage setting that
> > > > > > >>> >>> >>>> enables
> > > > > this
> > > > > > >>> >>> >>>> functionality. This will be beneficial for Ceph
> > > > > > >>> >>> >>>> primary
> > > > > > storage
> > > > > > >>> on
> > > > > > >>> >>> KVM
> > > > > > >>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will
> > > > > > >>> >>> >>>> be
> > > > > > supported
> > > > > > >>> in
> > > > > > >>> >>> a near
> > > > > > >>> >>> >>>> future.
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>> This will greatly speed up the process of using
> > > > > > >>> >>> >>>> snapshots on
> > > > > > KVM
> > > > > > >>> and
> > > > > > >>> >>> users
> > > > > > >>> >>> >>>> will actually start using snapshotting rather than
> > > > > > >>> >>> >>>> giving up
> > > > > > with
> > > > > > >>> >>> >>>> frustration.
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please
> > > > > > >>> >>> >>>> cast
> > > > > your
> > > > > > >>> vote
> > > > > > >>> >>> if you
> > > > > > >>> >>> >>>> are in agreement.
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>> Thanks for your input
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>> Andrei
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>>
> > > > > > >>> >>> >>>
> > > > > > >>> >>> >>>
> > > > > > >>> >>> >>> --
> > > > > > >>> >>> >>>
> > > > > > >>> >>> >>> Andrija Panić
> > > > > > >>> >>>
> > > > > > >>> >>
> > > > > > >>> >>
> > > > > > >>> >>
> > > > > > >>> >> --
> > > > > > >>> >> *Mike Tutkowski*
> > > > > > >>> >> *Senior CloudStack Developer, SolidFire Inc.*
> > > > > > >>> >> e: mike.tutkowski@solidfire.com
> > > > > > >>> >> o: 303.746.7302
> > > > > > >>> >> Advancing the way the world uses the cloud
> > > > > > >>> >> <http://solidfire.com/solution/overview/?video=play>*™*
> > > > > > >>> >>
> > > > > > >>> >
> > > > > > >>> >
> > > > > > >>> >
> > > > > > >>> > --
> > > > > > >>> > *Mike Tutkowski*
> > > > > > >>> > *Senior CloudStack Developer, SolidFire Inc.*
> > > > > > >>> > e: mike.tutkowski@solidfire.com
> > > > > > >>> > o: 303.746.7302
> > > > > > >>> > Advancing the way the world uses the cloud
> > > > > > >>> > <http://solidfire.com/solution/overview/?video=play>*™*
> > > > > > >>>
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> --
> > > > > > >> *Mike Tutkowski*
> > > > > > >> *Senior CloudStack Developer, SolidFire Inc.*
> > > > > > >> e: mike.tutkowski@solidfire.com
> > > > > > >> o: 303.746.7302
> > > > > > >> Advancing the way the world uses the cloud
> > > > > > >> <http://solidfire.com/solution/overview/?video=play>*™*
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > *Mike Tutkowski*
> > > > > > > *Senior CloudStack Developer, SolidFire Inc.*
> > > > > > > e: mike.tutkowski@solidfire.com
> > > > > > > o: 303.746.7302
> > > > > > > Advancing the way the world uses the cloud
> > > > > > > <http://solidfire.com/solution/overview/?video=play>*™*
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Ian Rae*
> > > > > PDG *| *CEO
> > > > > t *514.944.4008*
> > > > >
> > > > > *CloudOps* Votre partenaire infonuagique* | *Cloud Solutions
> > > > > Experts
> > > > > w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|*
> > > > > Montreal *|*
> > > > > Quebec *|* H3J 1S6
> > > > >
> > > > > <https://www.cloud.ca/>
> > > > > <
> > > > >
> > >
> >
> http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/
> > > > > >
> > > > >
> > >
> > > > --
> > > > *Mike Tutkowski*
> > > > *Senior CloudStack Developer, SolidFire Inc.*
> > > > e: mike.tutkowski@solidfire.com
> > > > o: 303.746.7302
> > > > Advancing the way the world uses the cloud
> > > > <http://solidfire.com/solution/overview/?video=play>*™*
> > >
> >
> >
> >
> > --
> > *Mike Tutkowski*
> > *Senior CloudStack Developer, SolidFire Inc.*
> > e: mike.tutkowski@solidfire.com
> > o: 303.746.7302
> > Advancing the way the world uses the cloud
> > <http://solidfire.com/solution/overview/?video=play>*™*
> >
>
>
>
> --
> regards,
>
> punith s
> cloudbyte.com
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Punith S <pu...@cloudbyte.com>.
+1 with this feature.

like Logan said the current snapshot is just a copy of VDI/vmdk/qcow2's to
the secondary storage
hence the current feature acts as a backup feature taking a long time.

also the current cloudstack storage framework is not allowing the third
party storage vendors like cloudbyte and others to leverage their backend
storage snapshot feature where file systems support Copy on Write.

for example : In cloudbyte elastistor which is based on zfs filesystem, it
allows to take multiple snapshots within seconds.
                     and a volume clones can be created from each snapshot.
hence if the VDI/vmdk/qcow2's are already residing in
                     the volume, the clones will just replicate the
existing virtual disks. hence there will be no overhead of copying a
                     snapshot to a new volume or primary storage over the
network.

hence there should be an option provided to leverage a snapshot and
creating a volume out of it to the corresponding backend storage provider
in use.

thanks


On Tue, Feb 17, 2015 at 3:11 AM, Mike Tutkowski <
mike.tutkowski@solidfire.com> wrote:

> Whatever way you think makes the most sense.
>
> Either way, I'm working on this for XenServer and ESXi (eventually on KVM,
> I expect) for managed storage (SolidFire is an example of managed storage).
>
> On Mon, Feb 16, 2015 at 2:38 PM, Andrei Mikhailovsky <an...@arhont.com>
> wrote:
>
> > I am happy to see the discussion is taking its pace and a lot of people
> > tend to agree that we should address this area. I have done the ticket
> for
> > that, but I am not sure if this should be dealt in a more general way as
> > suggested. Or perhaps having individual tickets for each hypervisor would
> > achieve a faster response from the community?
> >
> > Andrei
> >
> > ----- Original Message -----
> >
> > > From: "Mike Tutkowski" <mi...@solidfire.com>
> > > To: dev@cloudstack.apache.org
> > > Sent: Monday, 16 February, 2015 9:17:26 PM
> > > Subject: Re: Your thoughts on using Primary Storage for keeping
> > > snapshots
> >
> > > Well...count me in on the general-purpose part (I'm already working
> > > on that
> > > and have much of it working).
> >
> > > If someone is interested in implementing the RBD part, he/she can
> > > sync with
> > > me and see if there is any overlapping work that I've already
> > > implementing
> > > from a general-purpose standpoint.
> >
> > > On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae <ir...@cloudops.com> wrote:
> >
> > > > Agree with Logan. As fans of Ceph as well as SolidFire, we are
> > > > interested
> > > > in seeing this particular use case (RBD/KVM) being well
> > > > implemented,
> > > > however the concept of volume snapshots residing only on primary
> > > > storage vs
> > > > being transferred to secondary storage is a more generally useful
> > > > one that
> > > > is worth solving with the same terminology and interfaces, even if
> > > > the
> > > > mechanisms may be specific to the storage type and hypervisor.
> > > >
> > > > It its not practical then its not practical, but seems like it
> > > > would be
> > > > worth trying.
> > > >
> > > > On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield
> > > > <lb...@tqhosting.com>
> > > > wrote:
> > > >
> > > > > Hi Mike,
> > > > >
> > > > > I agree it is a general CloudStack issue that can be addressed
> > > > > across
> > > > > multiple primary storage options. It's a two stage issue since
> > > > > some
> > > > > changes will need to be implemented to support these features
> > > > > across
> > > > > the board, and others will need to be made to each storage
> > > > > option.
> > > > >
> > > > > It would be nice to see a single issue opened to cover this
> > > > > across all
> > > > > available storage options. Maybe have a community vote on what
> > > > > support they want to see, and not consider the feature complete
> > > > > until
> > > > > all of the desired options are implemented? That would slow down
> > > > > development for sure, but it would ensure that it was supported
> > > > > where
> > > > > it needs to be.
> > > > >
> > > > > Thank You,
> > > > >
> > > > > Logan Barfield
> > > > > Tranquil Hosting
> > > > >
> > > > >
> > > > > On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
> > > > > <mi...@solidfire.com> wrote:
> > > > > > For example, Punith from CloudByte sent out an e-mail yesterday
> > > > > > that
> > > > was
> > > > > > very similar to this thread, but he was wondering how to
> > > > > > implement
> > > > such a
> > > > > > concept on his company's SAN technology.
> > > > > >
> > > > > > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> > > > > > mike.tutkowski@solidfire.com> wrote:
> > > > > >
> > > > > >> Yeah, I think it's a similar concept, though.
> > > > > >>
> > > > > >> You would want to take snapshots on Ceph (or some other
> > > > > >> backend system
> > > > > >> that acts as primary storage) instead of copying data to
> > > > > >> secondary
> > > > > storage
> > > > > >> and calling it a snapshot.
> > > > > >>
> > > > > >> For Ceph or any other backend system like that, the idea is to
> > > > > >> speed
> > > > up
> > > > > >> snapshots by not requiring CPU cycles on the front end or
> > > > > >> network
> > > > > bandwidth
> > > > > >> to transfer the data.
> > > > > >>
> > > > > >> In that sense, this is a general-purpose CloudStack problem
> > > > > >> and it
> > > > > appears
> > > > > >> you are intending on discussing only the Ceph implementation
> > > > > >> here,
> > > > > which is
> > > > > >> fine.
> > > > > >>
> > > > > >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <
> > > > > lbarfield@tqhosting.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Hi Mike,
> > > > > >>>
> > > > > >>> I think the interest in this issue is primarily for Ceph RBD,
> > > > > >>> which
> > > > > >>> doesn't use iSCSI or SAN concepts in general. As well I
> > > > > >>> believe RBD
> > > > > >>> is only currently supported in KVM (and VMware?). QEMU has
> > > > > >>> native
> > > > RBD
> > > > > >>> support, so it attaches the devices directly to the VMs in
> > > > > >>> question.
> > > > > >>> It also natively supports snapshotting, which is what this
> > > > > >>> discussion
> > > > > >>> is about.
> > > > > >>>
> > > > > >>> Thank You,
> > > > > >>>
> > > > > >>> Logan Barfield
> > > > > >>> Tranquil Hosting
> > > > > >>>
> > > > > >>>
> > > > > >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> > > > > >>> <mi...@solidfire.com> wrote:
> > > > > >>> > I should have also commented on KVM (since that was the
> > > > > >>> > hypervisor
> > > > > >>> called
> > > > > >>> > out in the initial e-mail).
> > > > > >>> >
> > > > > >>> > In my situation, most of my customers use XenServer and/or
> > > > > >>> > ESXi, so
> > > > > KVM
> > > > > >>> has
> > > > > >>> > received the fewest of my cycles with regards to those
> > > > > >>> > three
> > > > > >>> hypervisors.
> > > > > >>> >
> > > > > >>> > KVM, though, is actually the simplest hypervisor for which
> > > > > >>> > to
> > > > > implement
> > > > > >>> > these changes (since I am using the iSCSI adapter of the
> > > > > >>> > KVM agent
> > > > > and
> > > > > >>> it
> > > > > >>> > just essentially passes my LUN to the VM in question).
> > > > > >>> >
> > > > > >>> > For KVM, there is no clustered file system applied to my
> > > > > >>> > backend
> > > > LUN,
> > > > > >>> so I
> > > > > >>> > don't have to "worry" about that layer.
> > > > > >>> >
> > > > > >>> > I don't see any hurdles like *immutable* UUIDs of SRs and
> > > > > >>> > VDIs
> > > > (such
> > > > > is
> > > > > >>> the
> > > > > >>> > case with XenServer) or having to re-signature anything
> > > > > >>> > (such is
> > > > the
> > > > > >>> case
> > > > > >>> > with ESXi).
> > > > > >>> >
> > > > > >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> > > > > >>> > mike.tutkowski@solidfire.com> wrote:
> > > > > >>> >
> > > > > >>> >> I have been working on this on and off for a while now (as
> > > > > >>> >> time
> > > > > >>> permits).
> > > > > >>> >>
> > > > > >>> >> Here is an e-mail I sent to a customer of ours that helps
> > > > > >>> >> describe
> > > > > >>> some of
> > > > > >>> >> the issues:
> > > > > >>> >>
> > > > > >>> >> *** Beginning of e-mail ***
> > > > > >>> >>
> > > > > >>> >> The main requests were around the following features:
> > > > > >>> >>
> > > > > >>> >> * The ability to leverage SolidFire snapshots.
> > > > > >>> >>
> > > > > >>> >> * The ability to create CloudStack templates from
> > > > > >>> >> SolidFire
> > > > > snapshots.
> > > > > >>> >>
> > > > > >>> >> I had these on my roadmap, but bumped the priority up and
> > > > > >>> >> began
> > > > > work on
> > > > > >>> >> them for the CS 4.6 release.
> > > > > >>> >>
> > > > > >>> >> During design, I realized there were issues with the way
> > > > > >>> >> XenServer
> > > > > is
> > > > > >>> >> architected that prevented me from directly using
> > > > > >>> >> SolidFire
> > > > > snapshots.
> > > > > >>> >>
> > > > > >>> >> I could definitely take a SolidFire snapshot of a
> > > > > >>> >> SolidFire
> > > > volume,
> > > > > but
> > > > > >>> >> this snapshot would not be usable from XenServer if the
> > > > > >>> >> original
> > > > > >>> volume was
> > > > > >>> >> still in use.
> > > > > >>> >>
> > > > > >>> >> Here is the gist of the problem:
> > > > > >>> >>
> > > > > >>> >> When XenServer leverages an iSCSI target such as a
> > > > > >>> >> SolidFire
> > > > > volume, it
> > > > > >>> >> applies a clustered files system to it, which they call a
> > > > > >>> >> storage
> > > > > >>> >> repository (SR). An SR has an *immutable* UUID associated
> > > > > >>> >> with it.
> > > > > >>> >>
> > > > > >>> >> The virtual volume (which a VM sees as a disk) is
> > > > > >>> >> represented by a
> > > > > >>> virtual
> > > > > >>> >> disk image (VDI) in the SR. A VDI also has an *immutable*
> > > > > >>> >> UUID
> > > > > >>> associated
> > > > > >>> >> with it.
> > > > > >>> >>
> > > > > >>> >> If I take a snapshot (or a clone) of the SolidFire volume
> > > > > >>> >> and then
> > > > > >>> later
> > > > > >>> >> try to use that snapshot from XenServer, XenServer
> > > > > >>> >> complains that
> > > > > the
> > > > > >>> SR on
> > > > > >>> >> the snapshot has a UUID that conflicts with an existing
> > > > > >>> >> UUID.
> > > > > >>> >>
> > > > > >>> >> In other words, it is not possible to use the original SR
> > > > > >>> >> and the
> > > > > >>> snapshot
> > > > > >>> >> of this SR from XenServer at the same time, which is
> > > > > >>> >> critical in a
> > > > > >>> cloud
> > > > > >>> >> environment (to enable creating templates from snapshots).
> > > > > >>> >>
> > > > > >>> >> The way I have proposed circumventing this issue is not
> > > > > >>> >> ideal, but
> > > > > >>> >> technically works (this code is checked into the CS 4.6
> > > > > >>> >> branch):
> > > > > >>> >>
> > > > > >>> >> When the time comes to take a CloudStack snapshot of a
> > > > > >>> >> CloudStack
> > > > > >>> volume
> > > > > >>> >> that is backed by SolidFire storage via the storage
> > > > > >>> >> plug-in, the
> > > > > >>> plug-in
> > > > > >>> >> will create a new SolidFire volume with characteristics
> > > > > >>> >> (size and
> > > > > IOPS)
> > > > > >>> >> equal to those of the original volume.
> > > > > >>> >>
> > > > > >>> >> We then have XenServer attach to this new SolidFire
> > > > > >>> >> volume,
> > > > create a
> > > > > >>> *new*
> > > > > >>> >> SR on it, and then copy the VDI from the source SR to the
> > > > > destination
> > > > > >>> SR
> > > > > >>> >> (the new SR).
> > > > > >>> >>
> > > > > >>> >> This leads to us having a copy of the VDI (a "snapshot" of
> > > > > >>> >> sorts),
> > > > > but
> > > > > >>> it
> > > > > >>> >> requires CPU cycles on the compute cluster as well as
> > > > > >>> >> network
> > > > > >>> bandwidth to
> > > > > >>> >> write to the SAN (thus it is slower and more resource
> > > > > >>> >> intensive
> > > > > than a
> > > > > >>> >> SolidFire snapshot).
> > > > > >>> >>
> > > > > >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix)
> > > > > concerning
> > > > > >>> this
> > > > > >>> >> issue before and during the CloudStack Collaboration
> > > > > >>> >> Conference in
> > > > > >>> Budapest
> > > > > >>> >> in November. He agreed that this is a legitimate issue
> > > > > >>> >> with the
> > > > way
> > > > > >>> >> XenServer is designed and could not think of a way (other
> > > > > >>> >> than
> > > > what
> > > > > I
> > > > > >>> was
> > > > > >>> >> doing) to get around it in current versions of XenServer.
> > > > > >>> >>
> > > > > >>> >> One thought is to have a feature added to XenServer that
> > > > > >>> >> enables
> > > > > you to
> > > > > >>> >> change the UUID of an SR and of a VDI.
> > > > > >>> >>
> > > > > >>> >> If I could do that, then I could take a SolidFire snapshot
> > > > > >>> >> of the
> > > > > >>> >> SolidFire volume and issue commands to XenServer to have
> > > > > >>> >> it change
> > > > > the
> > > > > >>> >> UUIDs of the original SR and the original VDI. I could
> > > > > >>> >> then
> > > > recored
> > > > > the
> > > > > >>> >> necessary UUID info in the CS DB.
> > > > > >>> >>
> > > > > >>> >> *** End of e-mail ***
> > > > > >>> >>
> > > > > >>> >> I have since investigated this on ESXi.
> > > > > >>> >>
> > > > > >>> >> ESXi does have a way for us to "re-signature" a datastore,
> > > > > >>> >> so
> > > > > backend
> > > > > >>> >> snapshots can be taken and effectively used on this
> > > > > >>> >> hypervisor.
> > > > > >>> >>
> > > > > >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
> > > > > >>> lbarfield@tqhosting.com>
> > > > > >>> >> wrote:
> > > > > >>> >>
> > > > > >>> >>> I'm just going to stick with the qemu-img option change
> > > > > >>> >>> for RBD
> > > > for
> > > > > >>> >>> now (which should cut snapshot time down drastically),
> > > > > >>> >>> and look
> > > > > >>> >>> forward to this in the future. I'd be happy to help get
> > > > > >>> >>> this
> > > > > moving,
> > > > > >>> >>> but I'm not enough of a developer to lead the charge.
> > > > > >>> >>>
> > > > > >>> >>> As far as renaming goes, I agree that maybe backups isn't
> > > > > >>> >>> the
> > > > right
> > > > > >>> >>> word. That being said calling a full-sized copy of a
> > > > > >>> >>> volume a
> > > > > >>> >>> "snapshot" also isn't the right word. Maybe "image" would
> > > > > >>> >>> be
> > > > > better?
> > > > > >>> >>>
> > > > > >>> >>> I've also got my reservations about "accounts" vs "users"
> > > > > >>> >>> (I
> > > > think
> > > > > >>> >>> "departments" and "accounts or users" respectively is
> > > > > >>> >>> less
> > > > > confusing),
> > > > > >>> >>> but that's a different thread.
> > > > > >>> >>>
> > > > > >>> >>> Thank You,
> > > > > >>> >>>
> > > > > >>> >>> Logan Barfield
> > > > > >>> >>> Tranquil Hosting
> > > > > >>> >>>
> > > > > >>> >>>
> > > > > >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <
> > > > > wido@widodh.nl>
> > > > > >>> >>> wrote:
> > > > > >>> >>> >
> > > > > >>> >>> >
> > > > > >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> > > > > >>> >>> >> I like this idea a lot for Ceph RBD. I do think there
> > > > > >>> >>> >> should
> > > > > >>> still be
> > > > > >>> >>> >> support for copying snapshots to secondary storage as
> > > > > >>> >>> >> needed
> > > > > (for
> > > > > >>> >>> >> transfers between zones, etc.). I really think that
> > > > > >>> >>> >> this
> > > > could
> > > > > be
> > > > > >>> >>> >> part of a larger move to clarify the naming
> > > > > >>> >>> >> conventions used
> > > > for
> > > > > >>> disk
> > > > > >>> >>> >> operations. Currently "Volume Snapshots" should
> > > > > >>> >>> >> probably
> > > > > really be
> > > > > >>> >>> >> called "Backups". So having "snapshot" functionality,
> > > > > >>> >>> >> and a
> > > > > >>> "convert
> > > > > >>> >>> >> snapshot to backup/template" would be a good move.
> > > > > >>> >>> >>
> > > > > >>> >>> >
> > > > > >>> >>> > I fully agree that this would be a very great addition.
> > > > > >>> >>> >
> > > > > >>> >>> > I won't be able to work on this any time soon though.
> > > > > >>> >>> >
> > > > > >>> >>> > Wido
> > > > > >>> >>> >
> > > > > >>> >>> >> Thank You,
> > > > > >>> >>> >>
> > > > > >>> >>> >> Logan Barfield
> > > > > >>> >>> >> Tranquil Hosting
> > > > > >>> >>> >>
> > > > > >>> >>> >>
> > > > > >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
> > > > > >>> >>> andrija.panic@gmail.com> wrote:
> > > > > >>> >>> >>> BIG +1
> > > > > >>> >>> >>>
> > > > > >>> >>> >>> My team should submit some patch to ACS for better
> > > > > >>> >>> >>> KVM
> > > > > snapshots,
> > > > > >>> >>> including
> > > > > >>> >>> >>> whole VM snapshot etc...but it's too early to give
> > > > > >>> >>> >>> details...
> > > > > >>> >>> >>> best
> > > > > >>> >>> >>>
> > > > > >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
> > > > > >>> andrei@arhont.com>
> > > > > >>> >>> wrote:
> > > > > >>> >>> >>>
> > > > > >>> >>> >>>> Hello guys,
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>> I was hoping to have some feedback from the
> > > > > >>> >>> >>>> community on the
> > > > > >>> subject
> > > > > >>> >>> of
> > > > > >>> >>> >>>> having an ability to keep snapshots on the primary
> > > > > >>> >>> >>>> storage
> > > > > where
> > > > > >>> it
> > > > > >>> >>> is
> > > > > >>> >>> >>>> supported by the storage backend.
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>> The idea behind this functionality is to improve how
> > > > snapshots
> > > > > >>> are
> > > > > >>> >>> >>>> currently handled on KVM hypervisors with Ceph
> > > > > >>> >>> >>>> primary
> > > > > storage.
> > > > > >>> At
> > > > > >>> >>> the
> > > > > >>> >>> >>>> moment, the snapshots are taken on the primary
> > > > > >>> >>> >>>> storage and
> > > > > being
> > > > > >>> >>> copied to
> > > > > >>> >>> >>>> the secondary storage. This method is very slow and
> > > > > inefficient
> > > > > >>> even
> > > > > >>> >>> on
> > > > > >>> >>> >>>> small infrastructure. Even on medium deployments
> > > > > >>> >>> >>>> using
> > > > > snapshots
> > > > > >>> in
> > > > > >>> >>> KVM
> > > > > >>> >>> >>>> becomes nearly impossible. If you have tens or
> > > > > >>> >>> >>>> hundreds
> > > > > >>> concurrent
> > > > > >>> >>> >>>> snapshots taking place you will have a bunch of
> > > > > >>> >>> >>>> timeouts and
> > > > > >>> errors,
> > > > > >>> >>> your
> > > > > >>> >>> >>>> network becomes clogged, etc. In addition, using
> > > > > >>> >>> >>>> these
> > > > > snapshots
> > > > > >>> for
> > > > > >>> >>> >>>> creating new volumes or reverting back vms also slow
> > > > > >>> >>> >>>> and
> > > > > >>> >>> inefficient. As
> > > > > >>> >>> >>>> above, when you have tens or hundreds concurrent
> > > > > >>> >>> >>>> operations
> > > > it
> > > > > >>> will
> > > > > >>> >>> not
> > > > > >>> >>> >>>> succeed and you will have a majority of tasks with
> > > > > >>> >>> >>>> errors or
> > > > > >>> >>> timeouts.
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>> At the moment, taking a single snapshot of
> > > > > >>> >>> >>>> relatively small
> > > > > >>> volumes
> > > > > >>> >>> (200GB
> > > > > >>> >>> >>>> or 500GB for instance) takes tens if not hundreds of
> > > > minutes.
> > > > > >>> Taking
> > > > > >>> >>> a
> > > > > >>> >>> >>>> snapshot of the same volume on ceph primary storage
> > > > > >>> >>> >>>> takes a
> > > > > few
> > > > > >>> >>> seconds at
> > > > > >>> >>> >>>> most! Similarly, converting a snapshot to a volume
> > > > > >>> >>> >>>> takes
> > > > tens
> > > > > if
> > > > > >>> not
> > > > > >>> >>> >>>> hundreds of minutes when secondary storage is
> > > > > >>> >>> >>>> involved;
> > > > > compared
> > > > > >>> with
> > > > > >>> >>> >>>> seconds if done directly on the primary storage.
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>> I suggest that the CloudStack should have the
> > > > > >>> >>> >>>> ability to
> > > > keep
> > > > > >>> volume
> > > > > >>> >>> >>>> snapshots on the primary storage where this is
> > > > > >>> >>> >>>> supported by
> > > > > the
> > > > > >>> >>> storage.
> > > > > >>> >>> >>>> Perhaps having a per primary storage setting that
> > > > > >>> >>> >>>> enables
> > > > this
> > > > > >>> >>> >>>> functionality. This will be beneficial for Ceph
> > > > > >>> >>> >>>> primary
> > > > > storage
> > > > > >>> on
> > > > > >>> >>> KVM
> > > > > >>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will
> > > > > >>> >>> >>>> be
> > > > > supported
> > > > > >>> in
> > > > > >>> >>> a near
> > > > > >>> >>> >>>> future.
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>> This will greatly speed up the process of using
> > > > > >>> >>> >>>> snapshots on
> > > > > KVM
> > > > > >>> and
> > > > > >>> >>> users
> > > > > >>> >>> >>>> will actually start using snapshotting rather than
> > > > > >>> >>> >>>> giving up
> > > > > with
> > > > > >>> >>> >>>> frustration.
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please
> > > > > >>> >>> >>>> cast
> > > > your
> > > > > >>> vote
> > > > > >>> >>> if you
> > > > > >>> >>> >>>> are in agreement.
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>> Thanks for your input
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>> Andrei
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>>
> > > > > >>> >>> >>>
> > > > > >>> >>> >>>
> > > > > >>> >>> >>> --
> > > > > >>> >>> >>>
> > > > > >>> >>> >>> Andrija Panić
> > > > > >>> >>>
> > > > > >>> >>
> > > > > >>> >>
> > > > > >>> >>
> > > > > >>> >> --
> > > > > >>> >> *Mike Tutkowski*
> > > > > >>> >> *Senior CloudStack Developer, SolidFire Inc.*
> > > > > >>> >> e: mike.tutkowski@solidfire.com
> > > > > >>> >> o: 303.746.7302
> > > > > >>> >> Advancing the way the world uses the cloud
> > > > > >>> >> <http://solidfire.com/solution/overview/?video=play>*™*
> > > > > >>> >>
> > > > > >>> >
> > > > > >>> >
> > > > > >>> >
> > > > > >>> > --
> > > > > >>> > *Mike Tutkowski*
> > > > > >>> > *Senior CloudStack Developer, SolidFire Inc.*
> > > > > >>> > e: mike.tutkowski@solidfire.com
> > > > > >>> > o: 303.746.7302
> > > > > >>> > Advancing the way the world uses the cloud
> > > > > >>> > <http://solidfire.com/solution/overview/?video=play>*™*
> > > > > >>>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> *Mike Tutkowski*
> > > > > >> *Senior CloudStack Developer, SolidFire Inc.*
> > > > > >> e: mike.tutkowski@solidfire.com
> > > > > >> o: 303.746.7302
> > > > > >> Advancing the way the world uses the cloud
> > > > > >> <http://solidfire.com/solution/overview/?video=play>*™*
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Mike Tutkowski*
> > > > > > *Senior CloudStack Developer, SolidFire Inc.*
> > > > > > e: mike.tutkowski@solidfire.com
> > > > > > o: 303.746.7302
> > > > > > Advancing the way the world uses the cloud
> > > > > > <http://solidfire.com/solution/overview/?video=play>*™*
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Ian Rae*
> > > > PDG *| *CEO
> > > > t *514.944.4008*
> > > >
> > > > *CloudOps* Votre partenaire infonuagique* | *Cloud Solutions
> > > > Experts
> > > > w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|*
> > > > Montreal *|*
> > > > Quebec *|* H3J 1S6
> > > >
> > > > <https://www.cloud.ca/>
> > > > <
> > > >
> >
> http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/
> > > > >
> > > >
> >
> > > --
> > > *Mike Tutkowski*
> > > *Senior CloudStack Developer, SolidFire Inc.*
> > > e: mike.tutkowski@solidfire.com
> > > o: 303.746.7302
> > > Advancing the way the world uses the cloud
> > > <http://solidfire.com/solution/overview/?video=play>*™*
> >
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkowski@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
regards,

punith s
cloudbyte.com

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Mike Tutkowski <mi...@solidfire.com>.
Whatever way you think makes the most sense.

Either way, I'm working on this for XenServer and ESXi (eventually on KVM,
I expect) for managed storage (SolidFire is an example of managed storage).

On Mon, Feb 16, 2015 at 2:38 PM, Andrei Mikhailovsky <an...@arhont.com>
wrote:

> I am happy to see the discussion is taking its pace and a lot of people
> tend to agree that we should address this area. I have done the ticket for
> that, but I am not sure if this should be dealt in a more general way as
> suggested. Or perhaps having individual tickets for each hypervisor would
> achieve a faster response from the community?
>
> Andrei
>
> ----- Original Message -----
>
> > From: "Mike Tutkowski" <mi...@solidfire.com>
> > To: dev@cloudstack.apache.org
> > Sent: Monday, 16 February, 2015 9:17:26 PM
> > Subject: Re: Your thoughts on using Primary Storage for keeping
> > snapshots
>
> > Well...count me in on the general-purpose part (I'm already working
> > on that
> > and have much of it working).
>
> > If someone is interested in implementing the RBD part, he/she can
> > sync with
> > me and see if there is any overlapping work that I've already
> > implementing
> > from a general-purpose standpoint.
>
> > On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae <ir...@cloudops.com> wrote:
>
> > > Agree with Logan. As fans of Ceph as well as SolidFire, we are
> > > interested
> > > in seeing this particular use case (RBD/KVM) being well
> > > implemented,
> > > however the concept of volume snapshots residing only on primary
> > > storage vs
> > > being transferred to secondary storage is a more generally useful
> > > one that
> > > is worth solving with the same terminology and interfaces, even if
> > > the
> > > mechanisms may be specific to the storage type and hypervisor.
> > >
> > > It its not practical then its not practical, but seems like it
> > > would be
> > > worth trying.
> > >
> > > On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield
> > > <lb...@tqhosting.com>
> > > wrote:
> > >
> > > > Hi Mike,
> > > >
> > > > I agree it is a general CloudStack issue that can be addressed
> > > > across
> > > > multiple primary storage options. It's a two stage issue since
> > > > some
> > > > changes will need to be implemented to support these features
> > > > across
> > > > the board, and others will need to be made to each storage
> > > > option.
> > > >
> > > > It would be nice to see a single issue opened to cover this
> > > > across all
> > > > available storage options. Maybe have a community vote on what
> > > > support they want to see, and not consider the feature complete
> > > > until
> > > > all of the desired options are implemented? That would slow down
> > > > development for sure, but it would ensure that it was supported
> > > > where
> > > > it needs to be.
> > > >
> > > > Thank You,
> > > >
> > > > Logan Barfield
> > > > Tranquil Hosting
> > > >
> > > >
> > > > On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
> > > > <mi...@solidfire.com> wrote:
> > > > > For example, Punith from CloudByte sent out an e-mail yesterday
> > > > > that
> > > was
> > > > > very similar to this thread, but he was wondering how to
> > > > > implement
> > > such a
> > > > > concept on his company's SAN technology.
> > > > >
> > > > > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> > > > > mike.tutkowski@solidfire.com> wrote:
> > > > >
> > > > >> Yeah, I think it's a similar concept, though.
> > > > >>
> > > > >> You would want to take snapshots on Ceph (or some other
> > > > >> backend system
> > > > >> that acts as primary storage) instead of copying data to
> > > > >> secondary
> > > > storage
> > > > >> and calling it a snapshot.
> > > > >>
> > > > >> For Ceph or any other backend system like that, the idea is to
> > > > >> speed
> > > up
> > > > >> snapshots by not requiring CPU cycles on the front end or
> > > > >> network
> > > > bandwidth
> > > > >> to transfer the data.
> > > > >>
> > > > >> In that sense, this is a general-purpose CloudStack problem
> > > > >> and it
> > > > appears
> > > > >> you are intending on discussing only the Ceph implementation
> > > > >> here,
> > > > which is
> > > > >> fine.
> > > > >>
> > > > >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <
> > > > lbarfield@tqhosting.com>
> > > > >> wrote:
> > > > >>
> > > > >>> Hi Mike,
> > > > >>>
> > > > >>> I think the interest in this issue is primarily for Ceph RBD,
> > > > >>> which
> > > > >>> doesn't use iSCSI or SAN concepts in general. As well I
> > > > >>> believe RBD
> > > > >>> is only currently supported in KVM (and VMware?). QEMU has
> > > > >>> native
> > > RBD
> > > > >>> support, so it attaches the devices directly to the VMs in
> > > > >>> question.
> > > > >>> It also natively supports snapshotting, which is what this
> > > > >>> discussion
> > > > >>> is about.
> > > > >>>
> > > > >>> Thank You,
> > > > >>>
> > > > >>> Logan Barfield
> > > > >>> Tranquil Hosting
> > > > >>>
> > > > >>>
> > > > >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> > > > >>> <mi...@solidfire.com> wrote:
> > > > >>> > I should have also commented on KVM (since that was the
> > > > >>> > hypervisor
> > > > >>> called
> > > > >>> > out in the initial e-mail).
> > > > >>> >
> > > > >>> > In my situation, most of my customers use XenServer and/or
> > > > >>> > ESXi, so
> > > > KVM
> > > > >>> has
> > > > >>> > received the fewest of my cycles with regards to those
> > > > >>> > three
> > > > >>> hypervisors.
> > > > >>> >
> > > > >>> > KVM, though, is actually the simplest hypervisor for which
> > > > >>> > to
> > > > implement
> > > > >>> > these changes (since I am using the iSCSI adapter of the
> > > > >>> > KVM agent
> > > > and
> > > > >>> it
> > > > >>> > just essentially passes my LUN to the VM in question).
> > > > >>> >
> > > > >>> > For KVM, there is no clustered file system applied to my
> > > > >>> > backend
> > > LUN,
> > > > >>> so I
> > > > >>> > don't have to "worry" about that layer.
> > > > >>> >
> > > > >>> > I don't see any hurdles like *immutable* UUIDs of SRs and
> > > > >>> > VDIs
> > > (such
> > > > is
> > > > >>> the
> > > > >>> > case with XenServer) or having to re-signature anything
> > > > >>> > (such is
> > > the
> > > > >>> case
> > > > >>> > with ESXi).
> > > > >>> >
> > > > >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> > > > >>> > mike.tutkowski@solidfire.com> wrote:
> > > > >>> >
> > > > >>> >> I have been working on this on and off for a while now (as
> > > > >>> >> time
> > > > >>> permits).
> > > > >>> >>
> > > > >>> >> Here is an e-mail I sent to a customer of ours that helps
> > > > >>> >> describe
> > > > >>> some of
> > > > >>> >> the issues:
> > > > >>> >>
> > > > >>> >> *** Beginning of e-mail ***
> > > > >>> >>
> > > > >>> >> The main requests were around the following features:
> > > > >>> >>
> > > > >>> >> * The ability to leverage SolidFire snapshots.
> > > > >>> >>
> > > > >>> >> * The ability to create CloudStack templates from
> > > > >>> >> SolidFire
> > > > snapshots.
> > > > >>> >>
> > > > >>> >> I had these on my roadmap, but bumped the priority up and
> > > > >>> >> began
> > > > work on
> > > > >>> >> them for the CS 4.6 release.
> > > > >>> >>
> > > > >>> >> During design, I realized there were issues with the way
> > > > >>> >> XenServer
> > > > is
> > > > >>> >> architected that prevented me from directly using
> > > > >>> >> SolidFire
> > > > snapshots.
> > > > >>> >>
> > > > >>> >> I could definitely take a SolidFire snapshot of a
> > > > >>> >> SolidFire
> > > volume,
> > > > but
> > > > >>> >> this snapshot would not be usable from XenServer if the
> > > > >>> >> original
> > > > >>> volume was
> > > > >>> >> still in use.
> > > > >>> >>
> > > > >>> >> Here is the gist of the problem:
> > > > >>> >>
> > > > >>> >> When XenServer leverages an iSCSI target such as a
> > > > >>> >> SolidFire
> > > > volume, it
> > > > >>> >> applies a clustered files system to it, which they call a
> > > > >>> >> storage
> > > > >>> >> repository (SR). An SR has an *immutable* UUID associated
> > > > >>> >> with it.
> > > > >>> >>
> > > > >>> >> The virtual volume (which a VM sees as a disk) is
> > > > >>> >> represented by a
> > > > >>> virtual
> > > > >>> >> disk image (VDI) in the SR. A VDI also has an *immutable*
> > > > >>> >> UUID
> > > > >>> associated
> > > > >>> >> with it.
> > > > >>> >>
> > > > >>> >> If I take a snapshot (or a clone) of the SolidFire volume
> > > > >>> >> and then
> > > > >>> later
> > > > >>> >> try to use that snapshot from XenServer, XenServer
> > > > >>> >> complains that
> > > > the
> > > > >>> SR on
> > > > >>> >> the snapshot has a UUID that conflicts with an existing
> > > > >>> >> UUID.
> > > > >>> >>
> > > > >>> >> In other words, it is not possible to use the original SR
> > > > >>> >> and the
> > > > >>> snapshot
> > > > >>> >> of this SR from XenServer at the same time, which is
> > > > >>> >> critical in a
> > > > >>> cloud
> > > > >>> >> environment (to enable creating templates from snapshots).
> > > > >>> >>
> > > > >>> >> The way I have proposed circumventing this issue is not
> > > > >>> >> ideal, but
> > > > >>> >> technically works (this code is checked into the CS 4.6
> > > > >>> >> branch):
> > > > >>> >>
> > > > >>> >> When the time comes to take a CloudStack snapshot of a
> > > > >>> >> CloudStack
> > > > >>> volume
> > > > >>> >> that is backed by SolidFire storage via the storage
> > > > >>> >> plug-in, the
> > > > >>> plug-in
> > > > >>> >> will create a new SolidFire volume with characteristics
> > > > >>> >> (size and
> > > > IOPS)
> > > > >>> >> equal to those of the original volume.
> > > > >>> >>
> > > > >>> >> We then have XenServer attach to this new SolidFire
> > > > >>> >> volume,
> > > create a
> > > > >>> *new*
> > > > >>> >> SR on it, and then copy the VDI from the source SR to the
> > > > destination
> > > > >>> SR
> > > > >>> >> (the new SR).
> > > > >>> >>
> > > > >>> >> This leads to us having a copy of the VDI (a "snapshot" of
> > > > >>> >> sorts),
> > > > but
> > > > >>> it
> > > > >>> >> requires CPU cycles on the compute cluster as well as
> > > > >>> >> network
> > > > >>> bandwidth to
> > > > >>> >> write to the SAN (thus it is slower and more resource
> > > > >>> >> intensive
> > > > than a
> > > > >>> >> SolidFire snapshot).
> > > > >>> >>
> > > > >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix)
> > > > concerning
> > > > >>> this
> > > > >>> >> issue before and during the CloudStack Collaboration
> > > > >>> >> Conference in
> > > > >>> Budapest
> > > > >>> >> in November. He agreed that this is a legitimate issue
> > > > >>> >> with the
> > > way
> > > > >>> >> XenServer is designed and could not think of a way (other
> > > > >>> >> than
> > > what
> > > > I
> > > > >>> was
> > > > >>> >> doing) to get around it in current versions of XenServer.
> > > > >>> >>
> > > > >>> >> One thought is to have a feature added to XenServer that
> > > > >>> >> enables
> > > > you to
> > > > >>> >> change the UUID of an SR and of a VDI.
> > > > >>> >>
> > > > >>> >> If I could do that, then I could take a SolidFire snapshot
> > > > >>> >> of the
> > > > >>> >> SolidFire volume and issue commands to XenServer to have
> > > > >>> >> it change
> > > > the
> > > > >>> >> UUIDs of the original SR and the original VDI. I could
> > > > >>> >> then
> > > recored
> > > > the
> > > > >>> >> necessary UUID info in the CS DB.
> > > > >>> >>
> > > > >>> >> *** End of e-mail ***
> > > > >>> >>
> > > > >>> >> I have since investigated this on ESXi.
> > > > >>> >>
> > > > >>> >> ESXi does have a way for us to "re-signature" a datastore,
> > > > >>> >> so
> > > > backend
> > > > >>> >> snapshots can be taken and effectively used on this
> > > > >>> >> hypervisor.
> > > > >>> >>
> > > > >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
> > > > >>> lbarfield@tqhosting.com>
> > > > >>> >> wrote:
> > > > >>> >>
> > > > >>> >>> I'm just going to stick with the qemu-img option change
> > > > >>> >>> for RBD
> > > for
> > > > >>> >>> now (which should cut snapshot time down drastically),
> > > > >>> >>> and look
> > > > >>> >>> forward to this in the future. I'd be happy to help get
> > > > >>> >>> this
> > > > moving,
> > > > >>> >>> but I'm not enough of a developer to lead the charge.
> > > > >>> >>>
> > > > >>> >>> As far as renaming goes, I agree that maybe backups isn't
> > > > >>> >>> the
> > > right
> > > > >>> >>> word. That being said calling a full-sized copy of a
> > > > >>> >>> volume a
> > > > >>> >>> "snapshot" also isn't the right word. Maybe "image" would
> > > > >>> >>> be
> > > > better?
> > > > >>> >>>
> > > > >>> >>> I've also got my reservations about "accounts" vs "users"
> > > > >>> >>> (I
> > > think
> > > > >>> >>> "departments" and "accounts or users" respectively is
> > > > >>> >>> less
> > > > confusing),
> > > > >>> >>> but that's a different thread.
> > > > >>> >>>
> > > > >>> >>> Thank You,
> > > > >>> >>>
> > > > >>> >>> Logan Barfield
> > > > >>> >>> Tranquil Hosting
> > > > >>> >>>
> > > > >>> >>>
> > > > >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <
> > > > wido@widodh.nl>
> > > > >>> >>> wrote:
> > > > >>> >>> >
> > > > >>> >>> >
> > > > >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> > > > >>> >>> >> I like this idea a lot for Ceph RBD. I do think there
> > > > >>> >>> >> should
> > > > >>> still be
> > > > >>> >>> >> support for copying snapshots to secondary storage as
> > > > >>> >>> >> needed
> > > > (for
> > > > >>> >>> >> transfers between zones, etc.). I really think that
> > > > >>> >>> >> this
> > > could
> > > > be
> > > > >>> >>> >> part of a larger move to clarify the naming
> > > > >>> >>> >> conventions used
> > > for
> > > > >>> disk
> > > > >>> >>> >> operations. Currently "Volume Snapshots" should
> > > > >>> >>> >> probably
> > > > really be
> > > > >>> >>> >> called "Backups". So having "snapshot" functionality,
> > > > >>> >>> >> and a
> > > > >>> "convert
> > > > >>> >>> >> snapshot to backup/template" would be a good move.
> > > > >>> >>> >>
> > > > >>> >>> >
> > > > >>> >>> > I fully agree that this would be a very great addition.
> > > > >>> >>> >
> > > > >>> >>> > I won't be able to work on this any time soon though.
> > > > >>> >>> >
> > > > >>> >>> > Wido
> > > > >>> >>> >
> > > > >>> >>> >> Thank You,
> > > > >>> >>> >>
> > > > >>> >>> >> Logan Barfield
> > > > >>> >>> >> Tranquil Hosting
> > > > >>> >>> >>
> > > > >>> >>> >>
> > > > >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
> > > > >>> >>> andrija.panic@gmail.com> wrote:
> > > > >>> >>> >>> BIG +1
> > > > >>> >>> >>>
> > > > >>> >>> >>> My team should submit some patch to ACS for better
> > > > >>> >>> >>> KVM
> > > > snapshots,
> > > > >>> >>> including
> > > > >>> >>> >>> whole VM snapshot etc...but it's too early to give
> > > > >>> >>> >>> details...
> > > > >>> >>> >>> best
> > > > >>> >>> >>>
> > > > >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
> > > > >>> andrei@arhont.com>
> > > > >>> >>> wrote:
> > > > >>> >>> >>>
> > > > >>> >>> >>>> Hello guys,
> > > > >>> >>> >>>>
> > > > >>> >>> >>>> I was hoping to have some feedback from the
> > > > >>> >>> >>>> community on the
> > > > >>> subject
> > > > >>> >>> of
> > > > >>> >>> >>>> having an ability to keep snapshots on the primary
> > > > >>> >>> >>>> storage
> > > > where
> > > > >>> it
> > > > >>> >>> is
> > > > >>> >>> >>>> supported by the storage backend.
> > > > >>> >>> >>>>
> > > > >>> >>> >>>> The idea behind this functionality is to improve how
> > > snapshots
> > > > >>> are
> > > > >>> >>> >>>> currently handled on KVM hypervisors with Ceph
> > > > >>> >>> >>>> primary
> > > > storage.
> > > > >>> At
> > > > >>> >>> the
> > > > >>> >>> >>>> moment, the snapshots are taken on the primary
> > > > >>> >>> >>>> storage and
> > > > being
> > > > >>> >>> copied to
> > > > >>> >>> >>>> the secondary storage. This method is very slow and
> > > > inefficient
> > > > >>> even
> > > > >>> >>> on
> > > > >>> >>> >>>> small infrastructure. Even on medium deployments
> > > > >>> >>> >>>> using
> > > > snapshots
> > > > >>> in
> > > > >>> >>> KVM
> > > > >>> >>> >>>> becomes nearly impossible. If you have tens or
> > > > >>> >>> >>>> hundreds
> > > > >>> concurrent
> > > > >>> >>> >>>> snapshots taking place you will have a bunch of
> > > > >>> >>> >>>> timeouts and
> > > > >>> errors,
> > > > >>> >>> your
> > > > >>> >>> >>>> network becomes clogged, etc. In addition, using
> > > > >>> >>> >>>> these
> > > > snapshots
> > > > >>> for
> > > > >>> >>> >>>> creating new volumes or reverting back vms also slow
> > > > >>> >>> >>>> and
> > > > >>> >>> inefficient. As
> > > > >>> >>> >>>> above, when you have tens or hundreds concurrent
> > > > >>> >>> >>>> operations
> > > it
> > > > >>> will
> > > > >>> >>> not
> > > > >>> >>> >>>> succeed and you will have a majority of tasks with
> > > > >>> >>> >>>> errors or
> > > > >>> >>> timeouts.
> > > > >>> >>> >>>>
> > > > >>> >>> >>>> At the moment, taking a single snapshot of
> > > > >>> >>> >>>> relatively small
> > > > >>> volumes
> > > > >>> >>> (200GB
> > > > >>> >>> >>>> or 500GB for instance) takes tens if not hundreds of
> > > minutes.
> > > > >>> Taking
> > > > >>> >>> a
> > > > >>> >>> >>>> snapshot of the same volume on ceph primary storage
> > > > >>> >>> >>>> takes a
> > > > few
> > > > >>> >>> seconds at
> > > > >>> >>> >>>> most! Similarly, converting a snapshot to a volume
> > > > >>> >>> >>>> takes
> > > tens
> > > > if
> > > > >>> not
> > > > >>> >>> >>>> hundreds of minutes when secondary storage is
> > > > >>> >>> >>>> involved;
> > > > compared
> > > > >>> with
> > > > >>> >>> >>>> seconds if done directly on the primary storage.
> > > > >>> >>> >>>>
> > > > >>> >>> >>>> I suggest that the CloudStack should have the
> > > > >>> >>> >>>> ability to
> > > keep
> > > > >>> volume
> > > > >>> >>> >>>> snapshots on the primary storage where this is
> > > > >>> >>> >>>> supported by
> > > > the
> > > > >>> >>> storage.
> > > > >>> >>> >>>> Perhaps having a per primary storage setting that
> > > > >>> >>> >>>> enables
> > > this
> > > > >>> >>> >>>> functionality. This will be beneficial for Ceph
> > > > >>> >>> >>>> primary
> > > > storage
> > > > >>> on
> > > > >>> >>> KVM
> > > > >>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will
> > > > >>> >>> >>>> be
> > > > supported
> > > > >>> in
> > > > >>> >>> a near
> > > > >>> >>> >>>> future.
> > > > >>> >>> >>>>
> > > > >>> >>> >>>> This will greatly speed up the process of using
> > > > >>> >>> >>>> snapshots on
> > > > KVM
> > > > >>> and
> > > > >>> >>> users
> > > > >>> >>> >>>> will actually start using snapshotting rather than
> > > > >>> >>> >>>> giving up
> > > > with
> > > > >>> >>> >>>> frustration.
> > > > >>> >>> >>>>
> > > > >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please
> > > > >>> >>> >>>> cast
> > > your
> > > > >>> vote
> > > > >>> >>> if you
> > > > >>> >>> >>>> are in agreement.
> > > > >>> >>> >>>>
> > > > >>> >>> >>>> Thanks for your input
> > > > >>> >>> >>>>
> > > > >>> >>> >>>> Andrei
> > > > >>> >>> >>>>
> > > > >>> >>> >>>>
> > > > >>> >>> >>>>
> > > > >>> >>> >>>>
> > > > >>> >>> >>>>
> > > > >>> >>> >>>
> > > > >>> >>> >>>
> > > > >>> >>> >>> --
> > > > >>> >>> >>>
> > > > >>> >>> >>> Andrija Panić
> > > > >>> >>>
> > > > >>> >>
> > > > >>> >>
> > > > >>> >>
> > > > >>> >> --
> > > > >>> >> *Mike Tutkowski*
> > > > >>> >> *Senior CloudStack Developer, SolidFire Inc.*
> > > > >>> >> e: mike.tutkowski@solidfire.com
> > > > >>> >> o: 303.746.7302
> > > > >>> >> Advancing the way the world uses the cloud
> > > > >>> >> <http://solidfire.com/solution/overview/?video=play>*™*
> > > > >>> >>
> > > > >>> >
> > > > >>> >
> > > > >>> >
> > > > >>> > --
> > > > >>> > *Mike Tutkowski*
> > > > >>> > *Senior CloudStack Developer, SolidFire Inc.*
> > > > >>> > e: mike.tutkowski@solidfire.com
> > > > >>> > o: 303.746.7302
> > > > >>> > Advancing the way the world uses the cloud
> > > > >>> > <http://solidfire.com/solution/overview/?video=play>*™*
> > > > >>>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> *Mike Tutkowski*
> > > > >> *Senior CloudStack Developer, SolidFire Inc.*
> > > > >> e: mike.tutkowski@solidfire.com
> > > > >> o: 303.746.7302
> > > > >> Advancing the way the world uses the cloud
> > > > >> <http://solidfire.com/solution/overview/?video=play>*™*
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Mike Tutkowski*
> > > > > *Senior CloudStack Developer, SolidFire Inc.*
> > > > > e: mike.tutkowski@solidfire.com
> > > > > o: 303.746.7302
> > > > > Advancing the way the world uses the cloud
> > > > > <http://solidfire.com/solution/overview/?video=play>*™*
> > > >
> > >
> > >
> > >
> > > --
> > > *Ian Rae*
> > > PDG *| *CEO
> > > t *514.944.4008*
> > >
> > > *CloudOps* Votre partenaire infonuagique* | *Cloud Solutions
> > > Experts
> > > w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|*
> > > Montreal *|*
> > > Quebec *|* H3J 1S6
> > >
> > > <https://www.cloud.ca/>
> > > <
> > >
> http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/
> > > >
> > >
>
> > --
> > *Mike Tutkowski*
> > *Senior CloudStack Developer, SolidFire Inc.*
> > e: mike.tutkowski@solidfire.com
> > o: 303.746.7302
> > Advancing the way the world uses the cloud
> > <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Andrei Mikhailovsky <an...@arhont.com>.
I am happy to see the discussion is taking its pace and a lot of people tend to agree that we should address this area. I have done the ticket for that, but I am not sure if this should be dealt in a more general way as suggested. Or perhaps having individual tickets for each hypervisor would achieve a faster response from the community? 

Andrei 

----- Original Message -----

> From: "Mike Tutkowski" <mi...@solidfire.com>
> To: dev@cloudstack.apache.org
> Sent: Monday, 16 February, 2015 9:17:26 PM
> Subject: Re: Your thoughts on using Primary Storage for keeping
> snapshots

> Well...count me in on the general-purpose part (I'm already working
> on that
> and have much of it working).

> If someone is interested in implementing the RBD part, he/she can
> sync with
> me and see if there is any overlapping work that I've already
> implementing
> from a general-purpose standpoint.

> On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae <ir...@cloudops.com> wrote:

> > Agree with Logan. As fans of Ceph as well as SolidFire, we are
> > interested
> > in seeing this particular use case (RBD/KVM) being well
> > implemented,
> > however the concept of volume snapshots residing only on primary
> > storage vs
> > being transferred to secondary storage is a more generally useful
> > one that
> > is worth solving with the same terminology and interfaces, even if
> > the
> > mechanisms may be specific to the storage type and hypervisor.
> >
> > It its not practical then its not practical, but seems like it
> > would be
> > worth trying.
> >
> > On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield
> > <lb...@tqhosting.com>
> > wrote:
> >
> > > Hi Mike,
> > >
> > > I agree it is a general CloudStack issue that can be addressed
> > > across
> > > multiple primary storage options. It's a two stage issue since
> > > some
> > > changes will need to be implemented to support these features
> > > across
> > > the board, and others will need to be made to each storage
> > > option.
> > >
> > > It would be nice to see a single issue opened to cover this
> > > across all
> > > available storage options. Maybe have a community vote on what
> > > support they want to see, and not consider the feature complete
> > > until
> > > all of the desired options are implemented? That would slow down
> > > development for sure, but it would ensure that it was supported
> > > where
> > > it needs to be.
> > >
> > > Thank You,
> > >
> > > Logan Barfield
> > > Tranquil Hosting
> > >
> > >
> > > On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
> > > <mi...@solidfire.com> wrote:
> > > > For example, Punith from CloudByte sent out an e-mail yesterday
> > > > that
> > was
> > > > very similar to this thread, but he was wondering how to
> > > > implement
> > such a
> > > > concept on his company's SAN technology.
> > > >
> > > > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> > > > mike.tutkowski@solidfire.com> wrote:
> > > >
> > > >> Yeah, I think it's a similar concept, though.
> > > >>
> > > >> You would want to take snapshots on Ceph (or some other
> > > >> backend system
> > > >> that acts as primary storage) instead of copying data to
> > > >> secondary
> > > storage
> > > >> and calling it a snapshot.
> > > >>
> > > >> For Ceph or any other backend system like that, the idea is to
> > > >> speed
> > up
> > > >> snapshots by not requiring CPU cycles on the front end or
> > > >> network
> > > bandwidth
> > > >> to transfer the data.
> > > >>
> > > >> In that sense, this is a general-purpose CloudStack problem
> > > >> and it
> > > appears
> > > >> you are intending on discussing only the Ceph implementation
> > > >> here,
> > > which is
> > > >> fine.
> > > >>
> > > >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <
> > > lbarfield@tqhosting.com>
> > > >> wrote:
> > > >>
> > > >>> Hi Mike,
> > > >>>
> > > >>> I think the interest in this issue is primarily for Ceph RBD,
> > > >>> which
> > > >>> doesn't use iSCSI or SAN concepts in general. As well I
> > > >>> believe RBD
> > > >>> is only currently supported in KVM (and VMware?). QEMU has
> > > >>> native
> > RBD
> > > >>> support, so it attaches the devices directly to the VMs in
> > > >>> question.
> > > >>> It also natively supports snapshotting, which is what this
> > > >>> discussion
> > > >>> is about.
> > > >>>
> > > >>> Thank You,
> > > >>>
> > > >>> Logan Barfield
> > > >>> Tranquil Hosting
> > > >>>
> > > >>>
> > > >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> > > >>> <mi...@solidfire.com> wrote:
> > > >>> > I should have also commented on KVM (since that was the
> > > >>> > hypervisor
> > > >>> called
> > > >>> > out in the initial e-mail).
> > > >>> >
> > > >>> > In my situation, most of my customers use XenServer and/or
> > > >>> > ESXi, so
> > > KVM
> > > >>> has
> > > >>> > received the fewest of my cycles with regards to those
> > > >>> > three
> > > >>> hypervisors.
> > > >>> >
> > > >>> > KVM, though, is actually the simplest hypervisor for which
> > > >>> > to
> > > implement
> > > >>> > these changes (since I am using the iSCSI adapter of the
> > > >>> > KVM agent
> > > and
> > > >>> it
> > > >>> > just essentially passes my LUN to the VM in question).
> > > >>> >
> > > >>> > For KVM, there is no clustered file system applied to my
> > > >>> > backend
> > LUN,
> > > >>> so I
> > > >>> > don't have to "worry" about that layer.
> > > >>> >
> > > >>> > I don't see any hurdles like *immutable* UUIDs of SRs and
> > > >>> > VDIs
> > (such
> > > is
> > > >>> the
> > > >>> > case with XenServer) or having to re-signature anything
> > > >>> > (such is
> > the
> > > >>> case
> > > >>> > with ESXi).
> > > >>> >
> > > >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> > > >>> > mike.tutkowski@solidfire.com> wrote:
> > > >>> >
> > > >>> >> I have been working on this on and off for a while now (as
> > > >>> >> time
> > > >>> permits).
> > > >>> >>
> > > >>> >> Here is an e-mail I sent to a customer of ours that helps
> > > >>> >> describe
> > > >>> some of
> > > >>> >> the issues:
> > > >>> >>
> > > >>> >> *** Beginning of e-mail ***
> > > >>> >>
> > > >>> >> The main requests were around the following features:
> > > >>> >>
> > > >>> >> * The ability to leverage SolidFire snapshots.
> > > >>> >>
> > > >>> >> * The ability to create CloudStack templates from
> > > >>> >> SolidFire
> > > snapshots.
> > > >>> >>
> > > >>> >> I had these on my roadmap, but bumped the priority up and
> > > >>> >> began
> > > work on
> > > >>> >> them for the CS 4.6 release.
> > > >>> >>
> > > >>> >> During design, I realized there were issues with the way
> > > >>> >> XenServer
> > > is
> > > >>> >> architected that prevented me from directly using
> > > >>> >> SolidFire
> > > snapshots.
> > > >>> >>
> > > >>> >> I could definitely take a SolidFire snapshot of a
> > > >>> >> SolidFire
> > volume,
> > > but
> > > >>> >> this snapshot would not be usable from XenServer if the
> > > >>> >> original
> > > >>> volume was
> > > >>> >> still in use.
> > > >>> >>
> > > >>> >> Here is the gist of the problem:
> > > >>> >>
> > > >>> >> When XenServer leverages an iSCSI target such as a
> > > >>> >> SolidFire
> > > volume, it
> > > >>> >> applies a clustered files system to it, which they call a
> > > >>> >> storage
> > > >>> >> repository (SR). An SR has an *immutable* UUID associated
> > > >>> >> with it.
> > > >>> >>
> > > >>> >> The virtual volume (which a VM sees as a disk) is
> > > >>> >> represented by a
> > > >>> virtual
> > > >>> >> disk image (VDI) in the SR. A VDI also has an *immutable*
> > > >>> >> UUID
> > > >>> associated
> > > >>> >> with it.
> > > >>> >>
> > > >>> >> If I take a snapshot (or a clone) of the SolidFire volume
> > > >>> >> and then
> > > >>> later
> > > >>> >> try to use that snapshot from XenServer, XenServer
> > > >>> >> complains that
> > > the
> > > >>> SR on
> > > >>> >> the snapshot has a UUID that conflicts with an existing
> > > >>> >> UUID.
> > > >>> >>
> > > >>> >> In other words, it is not possible to use the original SR
> > > >>> >> and the
> > > >>> snapshot
> > > >>> >> of this SR from XenServer at the same time, which is
> > > >>> >> critical in a
> > > >>> cloud
> > > >>> >> environment (to enable creating templates from snapshots).
> > > >>> >>
> > > >>> >> The way I have proposed circumventing this issue is not
> > > >>> >> ideal, but
> > > >>> >> technically works (this code is checked into the CS 4.6
> > > >>> >> branch):
> > > >>> >>
> > > >>> >> When the time comes to take a CloudStack snapshot of a
> > > >>> >> CloudStack
> > > >>> volume
> > > >>> >> that is backed by SolidFire storage via the storage
> > > >>> >> plug-in, the
> > > >>> plug-in
> > > >>> >> will create a new SolidFire volume with characteristics
> > > >>> >> (size and
> > > IOPS)
> > > >>> >> equal to those of the original volume.
> > > >>> >>
> > > >>> >> We then have XenServer attach to this new SolidFire
> > > >>> >> volume,
> > create a
> > > >>> *new*
> > > >>> >> SR on it, and then copy the VDI from the source SR to the
> > > destination
> > > >>> SR
> > > >>> >> (the new SR).
> > > >>> >>
> > > >>> >> This leads to us having a copy of the VDI (a "snapshot" of
> > > >>> >> sorts),
> > > but
> > > >>> it
> > > >>> >> requires CPU cycles on the compute cluster as well as
> > > >>> >> network
> > > >>> bandwidth to
> > > >>> >> write to the SAN (thus it is slower and more resource
> > > >>> >> intensive
> > > than a
> > > >>> >> SolidFire snapshot).
> > > >>> >>
> > > >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix)
> > > concerning
> > > >>> this
> > > >>> >> issue before and during the CloudStack Collaboration
> > > >>> >> Conference in
> > > >>> Budapest
> > > >>> >> in November. He agreed that this is a legitimate issue
> > > >>> >> with the
> > way
> > > >>> >> XenServer is designed and could not think of a way (other
> > > >>> >> than
> > what
> > > I
> > > >>> was
> > > >>> >> doing) to get around it in current versions of XenServer.
> > > >>> >>
> > > >>> >> One thought is to have a feature added to XenServer that
> > > >>> >> enables
> > > you to
> > > >>> >> change the UUID of an SR and of a VDI.
> > > >>> >>
> > > >>> >> If I could do that, then I could take a SolidFire snapshot
> > > >>> >> of the
> > > >>> >> SolidFire volume and issue commands to XenServer to have
> > > >>> >> it change
> > > the
> > > >>> >> UUIDs of the original SR and the original VDI. I could
> > > >>> >> then
> > recored
> > > the
> > > >>> >> necessary UUID info in the CS DB.
> > > >>> >>
> > > >>> >> *** End of e-mail ***
> > > >>> >>
> > > >>> >> I have since investigated this on ESXi.
> > > >>> >>
> > > >>> >> ESXi does have a way for us to "re-signature" a datastore,
> > > >>> >> so
> > > backend
> > > >>> >> snapshots can be taken and effectively used on this
> > > >>> >> hypervisor.
> > > >>> >>
> > > >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
> > > >>> lbarfield@tqhosting.com>
> > > >>> >> wrote:
> > > >>> >>
> > > >>> >>> I'm just going to stick with the qemu-img option change
> > > >>> >>> for RBD
> > for
> > > >>> >>> now (which should cut snapshot time down drastically),
> > > >>> >>> and look
> > > >>> >>> forward to this in the future. I'd be happy to help get
> > > >>> >>> this
> > > moving,
> > > >>> >>> but I'm not enough of a developer to lead the charge.
> > > >>> >>>
> > > >>> >>> As far as renaming goes, I agree that maybe backups isn't
> > > >>> >>> the
> > right
> > > >>> >>> word. That being said calling a full-sized copy of a
> > > >>> >>> volume a
> > > >>> >>> "snapshot" also isn't the right word. Maybe "image" would
> > > >>> >>> be
> > > better?
> > > >>> >>>
> > > >>> >>> I've also got my reservations about "accounts" vs "users"
> > > >>> >>> (I
> > think
> > > >>> >>> "departments" and "accounts or users" respectively is
> > > >>> >>> less
> > > confusing),
> > > >>> >>> but that's a different thread.
> > > >>> >>>
> > > >>> >>> Thank You,
> > > >>> >>>
> > > >>> >>> Logan Barfield
> > > >>> >>> Tranquil Hosting
> > > >>> >>>
> > > >>> >>>
> > > >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <
> > > wido@widodh.nl>
> > > >>> >>> wrote:
> > > >>> >>> >
> > > >>> >>> >
> > > >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> > > >>> >>> >> I like this idea a lot for Ceph RBD. I do think there
> > > >>> >>> >> should
> > > >>> still be
> > > >>> >>> >> support for copying snapshots to secondary storage as
> > > >>> >>> >> needed
> > > (for
> > > >>> >>> >> transfers between zones, etc.). I really think that
> > > >>> >>> >> this
> > could
> > > be
> > > >>> >>> >> part of a larger move to clarify the naming
> > > >>> >>> >> conventions used
> > for
> > > >>> disk
> > > >>> >>> >> operations. Currently "Volume Snapshots" should
> > > >>> >>> >> probably
> > > really be
> > > >>> >>> >> called "Backups". So having "snapshot" functionality,
> > > >>> >>> >> and a
> > > >>> "convert
> > > >>> >>> >> snapshot to backup/template" would be a good move.
> > > >>> >>> >>
> > > >>> >>> >
> > > >>> >>> > I fully agree that this would be a very great addition.
> > > >>> >>> >
> > > >>> >>> > I won't be able to work on this any time soon though.
> > > >>> >>> >
> > > >>> >>> > Wido
> > > >>> >>> >
> > > >>> >>> >> Thank You,
> > > >>> >>> >>
> > > >>> >>> >> Logan Barfield
> > > >>> >>> >> Tranquil Hosting
> > > >>> >>> >>
> > > >>> >>> >>
> > > >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
> > > >>> >>> andrija.panic@gmail.com> wrote:
> > > >>> >>> >>> BIG +1
> > > >>> >>> >>>
> > > >>> >>> >>> My team should submit some patch to ACS for better
> > > >>> >>> >>> KVM
> > > snapshots,
> > > >>> >>> including
> > > >>> >>> >>> whole VM snapshot etc...but it's too early to give
> > > >>> >>> >>> details...
> > > >>> >>> >>> best
> > > >>> >>> >>>
> > > >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
> > > >>> andrei@arhont.com>
> > > >>> >>> wrote:
> > > >>> >>> >>>
> > > >>> >>> >>>> Hello guys,
> > > >>> >>> >>>>
> > > >>> >>> >>>> I was hoping to have some feedback from the
> > > >>> >>> >>>> community on the
> > > >>> subject
> > > >>> >>> of
> > > >>> >>> >>>> having an ability to keep snapshots on the primary
> > > >>> >>> >>>> storage
> > > where
> > > >>> it
> > > >>> >>> is
> > > >>> >>> >>>> supported by the storage backend.
> > > >>> >>> >>>>
> > > >>> >>> >>>> The idea behind this functionality is to improve how
> > snapshots
> > > >>> are
> > > >>> >>> >>>> currently handled on KVM hypervisors with Ceph
> > > >>> >>> >>>> primary
> > > storage.
> > > >>> At
> > > >>> >>> the
> > > >>> >>> >>>> moment, the snapshots are taken on the primary
> > > >>> >>> >>>> storage and
> > > being
> > > >>> >>> copied to
> > > >>> >>> >>>> the secondary storage. This method is very slow and
> > > inefficient
> > > >>> even
> > > >>> >>> on
> > > >>> >>> >>>> small infrastructure. Even on medium deployments
> > > >>> >>> >>>> using
> > > snapshots
> > > >>> in
> > > >>> >>> KVM
> > > >>> >>> >>>> becomes nearly impossible. If you have tens or
> > > >>> >>> >>>> hundreds
> > > >>> concurrent
> > > >>> >>> >>>> snapshots taking place you will have a bunch of
> > > >>> >>> >>>> timeouts and
> > > >>> errors,
> > > >>> >>> your
> > > >>> >>> >>>> network becomes clogged, etc. In addition, using
> > > >>> >>> >>>> these
> > > snapshots
> > > >>> for
> > > >>> >>> >>>> creating new volumes or reverting back vms also slow
> > > >>> >>> >>>> and
> > > >>> >>> inefficient. As
> > > >>> >>> >>>> above, when you have tens or hundreds concurrent
> > > >>> >>> >>>> operations
> > it
> > > >>> will
> > > >>> >>> not
> > > >>> >>> >>>> succeed and you will have a majority of tasks with
> > > >>> >>> >>>> errors or
> > > >>> >>> timeouts.
> > > >>> >>> >>>>
> > > >>> >>> >>>> At the moment, taking a single snapshot of
> > > >>> >>> >>>> relatively small
> > > >>> volumes
> > > >>> >>> (200GB
> > > >>> >>> >>>> or 500GB for instance) takes tens if not hundreds of
> > minutes.
> > > >>> Taking
> > > >>> >>> a
> > > >>> >>> >>>> snapshot of the same volume on ceph primary storage
> > > >>> >>> >>>> takes a
> > > few
> > > >>> >>> seconds at
> > > >>> >>> >>>> most! Similarly, converting a snapshot to a volume
> > > >>> >>> >>>> takes
> > tens
> > > if
> > > >>> not
> > > >>> >>> >>>> hundreds of minutes when secondary storage is
> > > >>> >>> >>>> involved;
> > > compared
> > > >>> with
> > > >>> >>> >>>> seconds if done directly on the primary storage.
> > > >>> >>> >>>>
> > > >>> >>> >>>> I suggest that the CloudStack should have the
> > > >>> >>> >>>> ability to
> > keep
> > > >>> volume
> > > >>> >>> >>>> snapshots on the primary storage where this is
> > > >>> >>> >>>> supported by
> > > the
> > > >>> >>> storage.
> > > >>> >>> >>>> Perhaps having a per primary storage setting that
> > > >>> >>> >>>> enables
> > this
> > > >>> >>> >>>> functionality. This will be beneficial for Ceph
> > > >>> >>> >>>> primary
> > > storage
> > > >>> on
> > > >>> >>> KVM
> > > >>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will
> > > >>> >>> >>>> be
> > > supported
> > > >>> in
> > > >>> >>> a near
> > > >>> >>> >>>> future.
> > > >>> >>> >>>>
> > > >>> >>> >>>> This will greatly speed up the process of using
> > > >>> >>> >>>> snapshots on
> > > KVM
> > > >>> and
> > > >>> >>> users
> > > >>> >>> >>>> will actually start using snapshotting rather than
> > > >>> >>> >>>> giving up
> > > with
> > > >>> >>> >>>> frustration.
> > > >>> >>> >>>>
> > > >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please
> > > >>> >>> >>>> cast
> > your
> > > >>> vote
> > > >>> >>> if you
> > > >>> >>> >>>> are in agreement.
> > > >>> >>> >>>>
> > > >>> >>> >>>> Thanks for your input
> > > >>> >>> >>>>
> > > >>> >>> >>>> Andrei
> > > >>> >>> >>>>
> > > >>> >>> >>>>
> > > >>> >>> >>>>
> > > >>> >>> >>>>
> > > >>> >>> >>>>
> > > >>> >>> >>>
> > > >>> >>> >>>
> > > >>> >>> >>> --
> > > >>> >>> >>>
> > > >>> >>> >>> Andrija Panić
> > > >>> >>>
> > > >>> >>
> > > >>> >>
> > > >>> >>
> > > >>> >> --
> > > >>> >> *Mike Tutkowski*
> > > >>> >> *Senior CloudStack Developer, SolidFire Inc.*
> > > >>> >> e: mike.tutkowski@solidfire.com
> > > >>> >> o: 303.746.7302
> > > >>> >> Advancing the way the world uses the cloud
> > > >>> >> <http://solidfire.com/solution/overview/?video=play>*™*
> > > >>> >>
> > > >>> >
> > > >>> >
> > > >>> >
> > > >>> > --
> > > >>> > *Mike Tutkowski*
> > > >>> > *Senior CloudStack Developer, SolidFire Inc.*
> > > >>> > e: mike.tutkowski@solidfire.com
> > > >>> > o: 303.746.7302
> > > >>> > Advancing the way the world uses the cloud
> > > >>> > <http://solidfire.com/solution/overview/?video=play>*™*
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> *Mike Tutkowski*
> > > >> *Senior CloudStack Developer, SolidFire Inc.*
> > > >> e: mike.tutkowski@solidfire.com
> > > >> o: 303.746.7302
> > > >> Advancing the way the world uses the cloud
> > > >> <http://solidfire.com/solution/overview/?video=play>*™*
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > *Mike Tutkowski*
> > > > *Senior CloudStack Developer, SolidFire Inc.*
> > > > e: mike.tutkowski@solidfire.com
> > > > o: 303.746.7302
> > > > Advancing the way the world uses the cloud
> > > > <http://solidfire.com/solution/overview/?video=play>*™*
> > >
> >
> >
> >
> > --
> > *Ian Rae*
> > PDG *| *CEO
> > t *514.944.4008*
> >
> > *CloudOps* Votre partenaire infonuagique* | *Cloud Solutions
> > Experts
> > w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|*
> > Montreal *|*
> > Quebec *|* H3J 1S6
> >
> > <https://www.cloud.ca/>
> > <
> > http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/
> > >
> >

> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkowski@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Mike Tutkowski <mi...@solidfire.com>.
Hey Ian,

Since it looks like the intent of this particular thread is just to discuss
RBD and snapshots (which I don't think your business uses), you would be
more interested in the "Query on snapshot and cloning for managed storage"
thread as that one talks about this issue at a more general level.

Talk to you later!
Mike

On Mon, Feb 16, 2015 at 10:42 AM, Mike Tutkowski <
mike.tutkowski@solidfire.com> wrote:

> For example, Punith from CloudByte sent out an e-mail yesterday that was
> very similar to this thread, but he was wondering how to implement such a
> concept on his company's SAN technology.
>
> On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> mike.tutkowski@solidfire.com> wrote:
>
>> Yeah, I think it's a similar concept, though.
>>
>> You would want to take snapshots on Ceph (or some other backend system
>> that acts as primary storage) instead of copying data to secondary storage
>> and calling it a snapshot.
>>
>> For Ceph or any other backend system like that, the idea is to speed up
>> snapshots by not requiring CPU cycles on the front end or network bandwidth
>> to transfer the data.
>>
>> In that sense, this is a general-purpose CloudStack problem and it
>> appears you are intending on discussing only the Ceph implementation here,
>> which is fine.
>>
>> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lbarfield@tqhosting.com
>> > wrote:
>>
>>> Hi Mike,
>>>
>>> I think the interest in this issue is primarily for Ceph RBD, which
>>> doesn't use iSCSI or SAN concepts in general.  As well I believe RBD
>>> is only currently supported in KVM (and VMware?).  QEMU has native RBD
>>> support, so it attaches the devices directly to the VMs in question.
>>> It also natively supports snapshotting, which is what this discussion
>>> is about.
>>>
>>> Thank You,
>>>
>>> Logan Barfield
>>> Tranquil Hosting
>>>
>>>
>>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
>>> <mi...@solidfire.com> wrote:
>>> > I should have also commented on KVM (since that was the hypervisor
>>> called
>>> > out in the initial e-mail).
>>> >
>>> > In my situation, most of my customers use XenServer and/or ESXi, so
>>> KVM has
>>> > received the fewest of my cycles with regards to those three
>>> hypervisors.
>>> >
>>> > KVM, though, is actually the simplest hypervisor for which to implement
>>> > these changes (since I am using the iSCSI adapter of the KVM agent and
>>> it
>>> > just essentially passes my LUN to the VM in question).
>>> >
>>> > For KVM, there is no clustered file system applied to my backend LUN,
>>> so I
>>> > don't have to "worry" about that layer.
>>> >
>>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such
>>> is the
>>> > case with XenServer) or having to re-signature anything (such is the
>>> case
>>> > with ESXi).
>>> >
>>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
>>> > mike.tutkowski@solidfire.com> wrote:
>>> >
>>> >> I have been working on this on and off for a while now (as time
>>> permits).
>>> >>
>>> >> Here is an e-mail I sent to a customer of ours that helps describe
>>> some of
>>> >> the issues:
>>> >>
>>> >> *** Beginning of e-mail ***
>>> >>
>>> >> The main requests were around the following features:
>>> >>
>>> >> * The ability to leverage SolidFire snapshots.
>>> >>
>>> >> * The ability to create CloudStack templates from SolidFire snapshots.
>>> >>
>>> >> I had these on my roadmap, but bumped the priority up and began work
>>> on
>>> >> them for the CS 4.6 release.
>>> >>
>>> >> During design, I realized there were issues with the way XenServer is
>>> >> architected that prevented me from directly using SolidFire snapshots.
>>> >>
>>> >> I could definitely take a SolidFire snapshot of a SolidFire volume,
>>> but
>>> >> this snapshot would not be usable from XenServer if the original
>>> volume was
>>> >> still in use.
>>> >>
>>> >> Here is the gist of the problem:
>>> >>
>>> >> When XenServer leverages an iSCSI target such as a SolidFire volume,
>>> it
>>> >> applies a clustered files system to it, which they call a storage
>>> >> repository (SR). An SR has an *immutable* UUID associated with it.
>>> >>
>>> >> The virtual volume (which a VM sees as a disk) is represented by a
>>> virtual
>>> >> disk image (VDI) in the SR. A VDI also has an *immutable* UUID
>>> associated
>>> >> with it.
>>> >>
>>> >> If I take a snapshot (or a clone) of the SolidFire volume and then
>>> later
>>> >> try to use that snapshot from XenServer, XenServer complains that the
>>> SR on
>>> >> the snapshot has a UUID that conflicts with an existing UUID.
>>> >>
>>> >> In other words, it is not possible to use the original SR and the
>>> snapshot
>>> >> of this SR from XenServer at the same time, which is critical in a
>>> cloud
>>> >> environment (to enable creating templates from snapshots).
>>> >>
>>> >> The way I have proposed circumventing this issue is not ideal, but
>>> >> technically works (this code is checked into the CS 4.6 branch):
>>> >>
>>> >> When the time comes to take a CloudStack snapshot of a CloudStack
>>> volume
>>> >> that is backed by SolidFire storage via the storage plug-in, the
>>> plug-in
>>> >> will create a new SolidFire volume with characteristics (size and
>>> IOPS)
>>> >> equal to those of the original volume.
>>> >>
>>> >> We then have XenServer attach to this new SolidFire volume, create a
>>> *new*
>>> >> SR on it, and then copy the VDI from the source SR to the destination
>>> SR
>>> >> (the new SR).
>>> >>
>>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts),
>>> but it
>>> >> requires CPU cycles on the compute cluster as well as network
>>> bandwidth to
>>> >> write to the SAN (thus it is slower and more resource intensive than a
>>> >> SolidFire snapshot).
>>> >>
>>> >> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning
>>> this
>>> >> issue before and during the CloudStack Collaboration Conference in
>>> Budapest
>>> >> in November. He agreed that this is a legitimate issue with the way
>>> >> XenServer is designed and could not think of a way (other than what I
>>> was
>>> >> doing) to get around it in current versions of XenServer.
>>> >>
>>> >> One thought is to have a feature added to XenServer that enables you
>>> to
>>> >> change the UUID of an SR and of a VDI.
>>> >>
>>> >> If I could do that, then I could take a SolidFire snapshot of the
>>> >> SolidFire volume and issue commands to XenServer to have it change the
>>> >> UUIDs of the original SR and the original VDI. I could then recored
>>> the
>>> >> necessary UUID info in the CS DB.
>>> >>
>>> >> *** End of e-mail ***
>>> >>
>>> >> I have since investigated this on ESXi.
>>> >>
>>> >> ESXi does have a way for us to "re-signature" a datastore, so backend
>>> >> snapshots can be taken and effectively used on this hypervisor.
>>> >>
>>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
>>> lbarfield@tqhosting.com>
>>> >> wrote:
>>> >>
>>> >>> I'm just going to stick with the qemu-img option change for RBD for
>>> >>> now (which should cut snapshot time down drastically), and look
>>> >>> forward to this in the future.  I'd be happy to help get this moving,
>>> >>> but I'm not enough of a developer to lead the charge.
>>> >>>
>>> >>> As far as renaming goes, I agree that maybe backups isn't the right
>>> >>> word.  That being said calling a full-sized copy of a volume a
>>> >>> "snapshot" also isn't the right word.  Maybe "image" would be better?
>>> >>>
>>> >>> I've also got my reservations about "accounts" vs "users" (I think
>>> >>> "departments" and "accounts or users" respectively is less
>>> confusing),
>>> >>> but that's a different thread.
>>> >>>
>>> >>> Thank You,
>>> >>>
>>> >>> Logan Barfield
>>> >>> Tranquil Hosting
>>> >>>
>>> >>>
>>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wido@widodh.nl
>>> >
>>> >>> wrote:
>>> >>> >
>>> >>> >
>>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
>>> >>> >> I like this idea a lot for Ceph RBD.  I do think there should
>>> still be
>>> >>> >> support for copying snapshots to secondary storage as needed (for
>>> >>> >> transfers between zones, etc.).  I really think that this could be
>>> >>> >> part of a larger move to clarify the naming conventions used for
>>> disk
>>> >>> >> operations.  Currently "Volume Snapshots" should probably really
>>> be
>>> >>> >> called "Backups".  So having "snapshot" functionality, and a
>>> "convert
>>> >>> >> snapshot to backup/template" would be a good move.
>>> >>> >>
>>> >>> >
>>> >>> > I fully agree that this would be a very great addition.
>>> >>> >
>>> >>> > I won't be able to work on this any time soon though.
>>> >>> >
>>> >>> > Wido
>>> >>> >
>>> >>> >> Thank You,
>>> >>> >>
>>> >>> >> Logan Barfield
>>> >>> >> Tranquil Hosting
>>> >>> >>
>>> >>> >>
>>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
>>> >>> andrija.panic@gmail.com> wrote:
>>> >>> >>> BIG +1
>>> >>> >>>
>>> >>> >>> My team should submit some patch to ACS for better KVM snapshots,
>>> >>> including
>>> >>> >>> whole VM snapshot etc...but it's too early to give details...
>>> >>> >>> best
>>> >>> >>>
>>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
>>> andrei@arhont.com>
>>> >>> wrote:
>>> >>> >>>
>>> >>> >>>> Hello guys,
>>> >>> >>>>
>>> >>> >>>> I was hoping to have some feedback from the community on the
>>> subject
>>> >>> of
>>> >>> >>>> having an ability to keep snapshots on the primary storage
>>> where it
>>> >>> is
>>> >>> >>>> supported by the storage backend.
>>> >>> >>>>
>>> >>> >>>> The idea behind this functionality is to improve how snapshots
>>> are
>>> >>> >>>> currently handled on KVM hypervisors with Ceph primary storage.
>>> At
>>> >>> the
>>> >>> >>>> moment, the snapshots are taken on the primary storage and being
>>> >>> copied to
>>> >>> >>>> the secondary storage. This method is very slow and inefficient
>>> even
>>> >>> on
>>> >>> >>>> small infrastructure. Even on medium deployments using
>>> snapshots in
>>> >>> KVM
>>> >>> >>>> becomes nearly impossible. If you have tens or hundreds
>>> concurrent
>>> >>> >>>> snapshots taking place you will have a bunch of timeouts and
>>> errors,
>>> >>> your
>>> >>> >>>> network becomes clogged, etc. In addition, using these
>>> snapshots for
>>> >>> >>>> creating new volumes or reverting back vms also slow and
>>> >>> inefficient. As
>>> >>> >>>> above, when you have tens or hundreds concurrent operations it
>>> will
>>> >>> not
>>> >>> >>>> succeed and you will have a majority of tasks with errors or
>>> >>> timeouts.
>>> >>> >>>>
>>> >>> >>>> At the moment, taking a single snapshot of relatively small
>>> volumes
>>> >>> (200GB
>>> >>> >>>> or 500GB for instance) takes tens if not hundreds of minutes.
>>> Taking
>>> >>> a
>>> >>> >>>> snapshot of the same volume on ceph primary storage takes a few
>>> >>> seconds at
>>> >>> >>>> most! Similarly, converting a snapshot to a volume takes tens
>>> if not
>>> >>> >>>> hundreds of minutes when secondary storage is involved;
>>> compared with
>>> >>> >>>> seconds if done directly on the primary storage.
>>> >>> >>>>
>>> >>> >>>> I suggest that the CloudStack should have the ability to keep
>>> volume
>>> >>> >>>> snapshots on the primary storage where this is supported by the
>>> >>> storage.
>>> >>> >>>> Perhaps having a per primary storage setting that enables this
>>> >>> >>>> functionality. This will be beneficial for Ceph primary storage
>>> on
>>> >>> KVM
>>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will be
>>> supported in
>>> >>> a near
>>> >>> >>>> future.
>>> >>> >>>>
>>> >>> >>>> This will greatly speed up the process of using snapshots on
>>> KVM and
>>> >>> users
>>> >>> >>>> will actually start using snapshotting rather than giving up
>>> with
>>> >>> >>>> frustration.
>>> >>> >>>>
>>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your
>>> vote
>>> >>> if you
>>> >>> >>>> are in agreement.
>>> >>> >>>>
>>> >>> >>>> Thanks for your input
>>> >>> >>>>
>>> >>> >>>> Andrei
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>
>>> >>> >>>
>>> >>> >>> --
>>> >>> >>>
>>> >>> >>> Andrija Panić
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> *Mike Tutkowski*
>>> >> *Senior CloudStack Developer, SolidFire Inc.*
>>> >> e: mike.tutkowski@solidfire.com
>>> >> o: 303.746.7302
>>> >> Advancing the way the world uses the cloud
>>> >> <http://solidfire.com/solution/overview/?video=play>*™*
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > *Mike Tutkowski*
>>> > *Senior CloudStack Developer, SolidFire Inc.*
>>> > e: mike.tutkowski@solidfire.com
>>> > o: 303.746.7302
>>> > Advancing the way the world uses the cloud
>>> > <http://solidfire.com/solution/overview/?video=play>*™*
>>>
>>
>>
>>
>> --
>> *Mike Tutkowski*
>> *Senior CloudStack Developer, SolidFire Inc.*
>> e: mike.tutkowski@solidfire.com
>> o: 303.746.7302
>> Advancing the way the world uses the cloud
>> <http://solidfire.com/solution/overview/?video=play>*™*
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkowski@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Mike Tutkowski <mi...@solidfire.com>.
Well...count me in on the general-purpose part (I'm already working on that
and have much of it working).

If someone is interested in implementing the RBD part, he/she can sync with
me and see if there is any overlapping work that I've already implementing
from a general-purpose standpoint.

On Mon, Feb 16, 2015 at 1:39 PM, Ian Rae <ir...@cloudops.com> wrote:

> Agree with Logan. As fans of Ceph as well as SolidFire, we are interested
> in seeing this particular use case (RBD/KVM) being well implemented,
> however the concept of volume snapshots residing only on primary storage vs
> being transferred to secondary storage is a more generally useful one that
> is worth solving with the same terminology and interfaces, even if the
> mechanisms may be specific to the storage type and hypervisor.
>
> It its not practical then its not practical, but seems like it would be
> worth trying.
>
> On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield <lb...@tqhosting.com>
> wrote:
>
> > Hi Mike,
> >
> > I agree it is a general CloudStack issue that can be addressed across
> > multiple primary storage options.  It's a two stage issue since some
> > changes will need to be implemented to support these features across
> > the board, and others will need to be made to each storage option.
> >
> > It would be nice to see a single issue opened to cover this across all
> > available storage options.  Maybe have a community vote on what
> > support they want to see, and not consider the feature complete until
> > all of the desired options are implemented?  That would slow down
> > development for sure, but it would ensure that it was supported where
> > it needs to be.
> >
> > Thank You,
> >
> > Logan Barfield
> > Tranquil Hosting
> >
> >
> > On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
> > <mi...@solidfire.com> wrote:
> > > For example, Punith from CloudByte sent out an e-mail yesterday that
> was
> > > very similar to this thread, but he was wondering how to implement
> such a
> > > concept on his company's SAN technology.
> > >
> > > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> > > mike.tutkowski@solidfire.com> wrote:
> > >
> > >> Yeah, I think it's a similar concept, though.
> > >>
> > >> You would want to take snapshots on Ceph (or some other backend system
> > >> that acts as primary storage) instead of copying data to secondary
> > storage
> > >> and calling it a snapshot.
> > >>
> > >> For Ceph or any other backend system like that, the idea is to speed
> up
> > >> snapshots by not requiring CPU cycles on the front end or network
> > bandwidth
> > >> to transfer the data.
> > >>
> > >> In that sense, this is a general-purpose CloudStack problem and it
> > appears
> > >> you are intending on discussing only the Ceph implementation here,
> > which is
> > >> fine.
> > >>
> > >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <
> > lbarfield@tqhosting.com>
> > >> wrote:
> > >>
> > >>> Hi Mike,
> > >>>
> > >>> I think the interest in this issue is primarily for Ceph RBD, which
> > >>> doesn't use iSCSI or SAN concepts in general.  As well I believe RBD
> > >>> is only currently supported in KVM (and VMware?).  QEMU has native
> RBD
> > >>> support, so it attaches the devices directly to the VMs in question.
> > >>> It also natively supports snapshotting, which is what this discussion
> > >>> is about.
> > >>>
> > >>> Thank You,
> > >>>
> > >>> Logan Barfield
> > >>> Tranquil Hosting
> > >>>
> > >>>
> > >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> > >>> <mi...@solidfire.com> wrote:
> > >>> > I should have also commented on KVM (since that was the hypervisor
> > >>> called
> > >>> > out in the initial e-mail).
> > >>> >
> > >>> > In my situation, most of my customers use XenServer and/or ESXi, so
> > KVM
> > >>> has
> > >>> > received the fewest of my cycles with regards to those three
> > >>> hypervisors.
> > >>> >
> > >>> > KVM, though, is actually the simplest hypervisor for which to
> > implement
> > >>> > these changes (since I am using the iSCSI adapter of the KVM agent
> > and
> > >>> it
> > >>> > just essentially passes my LUN to the VM in question).
> > >>> >
> > >>> > For KVM, there is no clustered file system applied to my backend
> LUN,
> > >>> so I
> > >>> > don't have to "worry" about that layer.
> > >>> >
> > >>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs
> (such
> > is
> > >>> the
> > >>> > case with XenServer) or having to re-signature anything (such is
> the
> > >>> case
> > >>> > with ESXi).
> > >>> >
> > >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> > >>> > mike.tutkowski@solidfire.com> wrote:
> > >>> >
> > >>> >> I have been working on this on and off for a while now (as time
> > >>> permits).
> > >>> >>
> > >>> >> Here is an e-mail I sent to a customer of ours that helps describe
> > >>> some of
> > >>> >> the issues:
> > >>> >>
> > >>> >> *** Beginning of e-mail ***
> > >>> >>
> > >>> >> The main requests were around the following features:
> > >>> >>
> > >>> >> * The ability to leverage SolidFire snapshots.
> > >>> >>
> > >>> >> * The ability to create CloudStack templates from SolidFire
> > snapshots.
> > >>> >>
> > >>> >> I had these on my roadmap, but bumped the priority up and began
> > work on
> > >>> >> them for the CS 4.6 release.
> > >>> >>
> > >>> >> During design, I realized there were issues with the way XenServer
> > is
> > >>> >> architected that prevented me from directly using SolidFire
> > snapshots.
> > >>> >>
> > >>> >> I could definitely take a SolidFire snapshot of a SolidFire
> volume,
> > but
> > >>> >> this snapshot would not be usable from XenServer if the original
> > >>> volume was
> > >>> >> still in use.
> > >>> >>
> > >>> >> Here is the gist of the problem:
> > >>> >>
> > >>> >> When XenServer leverages an iSCSI target such as a SolidFire
> > volume, it
> > >>> >> applies a clustered files system to it, which they call a storage
> > >>> >> repository (SR). An SR has an *immutable* UUID associated with it.
> > >>> >>
> > >>> >> The virtual volume (which a VM sees as a disk) is represented by a
> > >>> virtual
> > >>> >> disk image (VDI) in the SR. A VDI also has an *immutable* UUID
> > >>> associated
> > >>> >> with it.
> > >>> >>
> > >>> >> If I take a snapshot (or a clone) of the SolidFire volume and then
> > >>> later
> > >>> >> try to use that snapshot from XenServer, XenServer complains that
> > the
> > >>> SR on
> > >>> >> the snapshot has a UUID that conflicts with an existing UUID.
> > >>> >>
> > >>> >> In other words, it is not possible to use the original SR and the
> > >>> snapshot
> > >>> >> of this SR from XenServer at the same time, which is critical in a
> > >>> cloud
> > >>> >> environment (to enable creating templates from snapshots).
> > >>> >>
> > >>> >> The way I have proposed circumventing this issue is not ideal, but
> > >>> >> technically works (this code is checked into the CS 4.6 branch):
> > >>> >>
> > >>> >> When the time comes to take a CloudStack snapshot of a CloudStack
> > >>> volume
> > >>> >> that is backed by SolidFire storage via the storage plug-in, the
> > >>> plug-in
> > >>> >> will create a new SolidFire volume with characteristics (size and
> > IOPS)
> > >>> >> equal to those of the original volume.
> > >>> >>
> > >>> >> We then have XenServer attach to this new SolidFire volume,
> create a
> > >>> *new*
> > >>> >> SR on it, and then copy the VDI from the source SR to the
> > destination
> > >>> SR
> > >>> >> (the new SR).
> > >>> >>
> > >>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts),
> > but
> > >>> it
> > >>> >> requires CPU cycles on the compute cluster as well as network
> > >>> bandwidth to
> > >>> >> write to the SAN (thus it is slower and more resource intensive
> > than a
> > >>> >> SolidFire snapshot).
> > >>> >>
> > >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix)
> > concerning
> > >>> this
> > >>> >> issue before and during the CloudStack Collaboration Conference in
> > >>> Budapest
> > >>> >> in November. He agreed that this is a legitimate issue with the
> way
> > >>> >> XenServer is designed and could not think of a way (other than
> what
> > I
> > >>> was
> > >>> >> doing) to get around it in current versions of XenServer.
> > >>> >>
> > >>> >> One thought is to have a feature added to XenServer that enables
> > you to
> > >>> >> change the UUID of an SR and of a VDI.
> > >>> >>
> > >>> >> If I could do that, then I could take a SolidFire snapshot of the
> > >>> >> SolidFire volume and issue commands to XenServer to have it change
> > the
> > >>> >> UUIDs of the original SR and the original VDI. I could then
> recored
> > the
> > >>> >> necessary UUID info in the CS DB.
> > >>> >>
> > >>> >> *** End of e-mail ***
> > >>> >>
> > >>> >> I have since investigated this on ESXi.
> > >>> >>
> > >>> >> ESXi does have a way for us to "re-signature" a datastore, so
> > backend
> > >>> >> snapshots can be taken and effectively used on this hypervisor.
> > >>> >>
> > >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
> > >>> lbarfield@tqhosting.com>
> > >>> >> wrote:
> > >>> >>
> > >>> >>> I'm just going to stick with the qemu-img option change for RBD
> for
> > >>> >>> now (which should cut snapshot time down drastically), and look
> > >>> >>> forward to this in the future.  I'd be happy to help get this
> > moving,
> > >>> >>> but I'm not enough of a developer to lead the charge.
> > >>> >>>
> > >>> >>> As far as renaming goes, I agree that maybe backups isn't the
> right
> > >>> >>> word.  That being said calling a full-sized copy of a volume a
> > >>> >>> "snapshot" also isn't the right word.  Maybe "image" would be
> > better?
> > >>> >>>
> > >>> >>> I've also got my reservations about "accounts" vs "users" (I
> think
> > >>> >>> "departments" and "accounts or users" respectively is less
> > confusing),
> > >>> >>> but that's a different thread.
> > >>> >>>
> > >>> >>> Thank You,
> > >>> >>>
> > >>> >>> Logan Barfield
> > >>> >>> Tranquil Hosting
> > >>> >>>
> > >>> >>>
> > >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <
> > wido@widodh.nl>
> > >>> >>> wrote:
> > >>> >>> >
> > >>> >>> >
> > >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> > >>> >>> >> I like this idea a lot for Ceph RBD.  I do think there should
> > >>> still be
> > >>> >>> >> support for copying snapshots to secondary storage as needed
> > (for
> > >>> >>> >> transfers between zones, etc.).  I really think that this
> could
> > be
> > >>> >>> >> part of a larger move to clarify the naming conventions used
> for
> > >>> disk
> > >>> >>> >> operations.  Currently "Volume Snapshots" should probably
> > really be
> > >>> >>> >> called "Backups".  So having "snapshot" functionality, and a
> > >>> "convert
> > >>> >>> >> snapshot to backup/template" would be a good move.
> > >>> >>> >>
> > >>> >>> >
> > >>> >>> > I fully agree that this would be a very great addition.
> > >>> >>> >
> > >>> >>> > I won't be able to work on this any time soon though.
> > >>> >>> >
> > >>> >>> > Wido
> > >>> >>> >
> > >>> >>> >> Thank You,
> > >>> >>> >>
> > >>> >>> >> Logan Barfield
> > >>> >>> >> Tranquil Hosting
> > >>> >>> >>
> > >>> >>> >>
> > >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
> > >>> >>> andrija.panic@gmail.com> wrote:
> > >>> >>> >>> BIG +1
> > >>> >>> >>>
> > >>> >>> >>> My team should submit some patch to ACS for better KVM
> > snapshots,
> > >>> >>> including
> > >>> >>> >>> whole VM snapshot etc...but it's too early to give details...
> > >>> >>> >>> best
> > >>> >>> >>>
> > >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
> > >>> andrei@arhont.com>
> > >>> >>> wrote:
> > >>> >>> >>>
> > >>> >>> >>>> Hello guys,
> > >>> >>> >>>>
> > >>> >>> >>>> I was hoping to have some feedback from the community on the
> > >>> subject
> > >>> >>> of
> > >>> >>> >>>> having an ability to keep snapshots on the primary storage
> > where
> > >>> it
> > >>> >>> is
> > >>> >>> >>>> supported by the storage backend.
> > >>> >>> >>>>
> > >>> >>> >>>> The idea behind this functionality is to improve how
> snapshots
> > >>> are
> > >>> >>> >>>> currently handled on KVM hypervisors with Ceph primary
> > storage.
> > >>> At
> > >>> >>> the
> > >>> >>> >>>> moment, the snapshots are taken on the primary storage and
> > being
> > >>> >>> copied to
> > >>> >>> >>>> the secondary storage. This method is very slow and
> > inefficient
> > >>> even
> > >>> >>> on
> > >>> >>> >>>> small infrastructure. Even on medium deployments using
> > snapshots
> > >>> in
> > >>> >>> KVM
> > >>> >>> >>>> becomes nearly impossible. If you have tens or hundreds
> > >>> concurrent
> > >>> >>> >>>> snapshots taking place you will have a bunch of timeouts and
> > >>> errors,
> > >>> >>> your
> > >>> >>> >>>> network becomes clogged, etc. In addition, using these
> > snapshots
> > >>> for
> > >>> >>> >>>> creating new volumes or reverting back vms also slow and
> > >>> >>> inefficient. As
> > >>> >>> >>>> above, when you have tens or hundreds concurrent operations
> it
> > >>> will
> > >>> >>> not
> > >>> >>> >>>> succeed and you will have a majority of tasks with errors or
> > >>> >>> timeouts.
> > >>> >>> >>>>
> > >>> >>> >>>> At the moment, taking a single snapshot of relatively small
> > >>> volumes
> > >>> >>> (200GB
> > >>> >>> >>>> or 500GB for instance) takes tens if not hundreds of
> minutes.
> > >>> Taking
> > >>> >>> a
> > >>> >>> >>>> snapshot of the same volume on ceph primary storage takes a
> > few
> > >>> >>> seconds at
> > >>> >>> >>>> most! Similarly, converting a snapshot to a volume takes
> tens
> > if
> > >>> not
> > >>> >>> >>>> hundreds of minutes when secondary storage is involved;
> > compared
> > >>> with
> > >>> >>> >>>> seconds if done directly on the primary storage.
> > >>> >>> >>>>
> > >>> >>> >>>> I suggest that the CloudStack should have the ability to
> keep
> > >>> volume
> > >>> >>> >>>> snapshots on the primary storage where this is supported by
> > the
> > >>> >>> storage.
> > >>> >>> >>>> Perhaps having a per primary storage setting that enables
> this
> > >>> >>> >>>> functionality. This will be beneficial for Ceph primary
> > storage
> > >>> on
> > >>> >>> KVM
> > >>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will be
> > supported
> > >>> in
> > >>> >>> a near
> > >>> >>> >>>> future.
> > >>> >>> >>>>
> > >>> >>> >>>> This will greatly speed up the process of using snapshots on
> > KVM
> > >>> and
> > >>> >>> users
> > >>> >>> >>>> will actually start using snapshotting rather than giving up
> > with
> > >>> >>> >>>> frustration.
> > >>> >>> >>>>
> > >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast
> your
> > >>> vote
> > >>> >>> if you
> > >>> >>> >>>> are in agreement.
> > >>> >>> >>>>
> > >>> >>> >>>> Thanks for your input
> > >>> >>> >>>>
> > >>> >>> >>>> Andrei
> > >>> >>> >>>>
> > >>> >>> >>>>
> > >>> >>> >>>>
> > >>> >>> >>>>
> > >>> >>> >>>>
> > >>> >>> >>>
> > >>> >>> >>>
> > >>> >>> >>> --
> > >>> >>> >>>
> > >>> >>> >>> Andrija Panić
> > >>> >>>
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >> --
> > >>> >> *Mike Tutkowski*
> > >>> >> *Senior CloudStack Developer, SolidFire Inc.*
> > >>> >> e: mike.tutkowski@solidfire.com
> > >>> >> o: 303.746.7302
> > >>> >> Advancing the way the world uses the cloud
> > >>> >> <http://solidfire.com/solution/overview/?video=play>*™*
> > >>> >>
> > >>> >
> > >>> >
> > >>> >
> > >>> > --
> > >>> > *Mike Tutkowski*
> > >>> > *Senior CloudStack Developer, SolidFire Inc.*
> > >>> > e: mike.tutkowski@solidfire.com
> > >>> > o: 303.746.7302
> > >>> > Advancing the way the world uses the cloud
> > >>> > <http://solidfire.com/solution/overview/?video=play>*™*
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> *Mike Tutkowski*
> > >> *Senior CloudStack Developer, SolidFire Inc.*
> > >> e: mike.tutkowski@solidfire.com
> > >> o: 303.746.7302
> > >> Advancing the way the world uses the cloud
> > >> <http://solidfire.com/solution/overview/?video=play>*™*
> > >>
> > >
> > >
> > >
> > > --
> > > *Mike Tutkowski*
> > > *Senior CloudStack Developer, SolidFire Inc.*
> > > e: mike.tutkowski@solidfire.com
> > > o: 303.746.7302
> > > Advancing the way the world uses the cloud
> > > <http://solidfire.com/solution/overview/?video=play>*™*
> >
>
>
>
> --
> *Ian Rae*
> PDG *| *CEO
> t *514.944.4008*
>
> *CloudOps* Votre partenaire infonuagique* | *Cloud Solutions Experts
> w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|* Montreal *|*
>  Quebec *|* H3J 1S6
>
> <https://www.cloud.ca/>
> <
> http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/
> >
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Ian Rae <ir...@cloudops.com>.
Agree with Logan. As fans of Ceph as well as SolidFire, we are interested
in seeing this particular use case (RBD/KVM) being well implemented,
however the concept of volume snapshots residing only on primary storage vs
being transferred to secondary storage is a more generally useful one that
is worth solving with the same terminology and interfaces, even if the
mechanisms may be specific to the storage type and hypervisor.

It its not practical then its not practical, but seems like it would be
worth trying.

On Mon, Feb 16, 2015 at 1:02 PM, Logan Barfield <lb...@tqhosting.com>
wrote:

> Hi Mike,
>
> I agree it is a general CloudStack issue that can be addressed across
> multiple primary storage options.  It's a two stage issue since some
> changes will need to be implemented to support these features across
> the board, and others will need to be made to each storage option.
>
> It would be nice to see a single issue opened to cover this across all
> available storage options.  Maybe have a community vote on what
> support they want to see, and not consider the feature complete until
> all of the desired options are implemented?  That would slow down
> development for sure, but it would ensure that it was supported where
> it needs to be.
>
> Thank You,
>
> Logan Barfield
> Tranquil Hosting
>
>
> On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
> <mi...@solidfire.com> wrote:
> > For example, Punith from CloudByte sent out an e-mail yesterday that was
> > very similar to this thread, but he was wondering how to implement such a
> > concept on his company's SAN technology.
> >
> > On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> > mike.tutkowski@solidfire.com> wrote:
> >
> >> Yeah, I think it's a similar concept, though.
> >>
> >> You would want to take snapshots on Ceph (or some other backend system
> >> that acts as primary storage) instead of copying data to secondary
> storage
> >> and calling it a snapshot.
> >>
> >> For Ceph or any other backend system like that, the idea is to speed up
> >> snapshots by not requiring CPU cycles on the front end or network
> bandwidth
> >> to transfer the data.
> >>
> >> In that sense, this is a general-purpose CloudStack problem and it
> appears
> >> you are intending on discussing only the Ceph implementation here,
> which is
> >> fine.
> >>
> >> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <
> lbarfield@tqhosting.com>
> >> wrote:
> >>
> >>> Hi Mike,
> >>>
> >>> I think the interest in this issue is primarily for Ceph RBD, which
> >>> doesn't use iSCSI or SAN concepts in general.  As well I believe RBD
> >>> is only currently supported in KVM (and VMware?).  QEMU has native RBD
> >>> support, so it attaches the devices directly to the VMs in question.
> >>> It also natively supports snapshotting, which is what this discussion
> >>> is about.
> >>>
> >>> Thank You,
> >>>
> >>> Logan Barfield
> >>> Tranquil Hosting
> >>>
> >>>
> >>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> >>> <mi...@solidfire.com> wrote:
> >>> > I should have also commented on KVM (since that was the hypervisor
> >>> called
> >>> > out in the initial e-mail).
> >>> >
> >>> > In my situation, most of my customers use XenServer and/or ESXi, so
> KVM
> >>> has
> >>> > received the fewest of my cycles with regards to those three
> >>> hypervisors.
> >>> >
> >>> > KVM, though, is actually the simplest hypervisor for which to
> implement
> >>> > these changes (since I am using the iSCSI adapter of the KVM agent
> and
> >>> it
> >>> > just essentially passes my LUN to the VM in question).
> >>> >
> >>> > For KVM, there is no clustered file system applied to my backend LUN,
> >>> so I
> >>> > don't have to "worry" about that layer.
> >>> >
> >>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such
> is
> >>> the
> >>> > case with XenServer) or having to re-signature anything (such is the
> >>> case
> >>> > with ESXi).
> >>> >
> >>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> >>> > mike.tutkowski@solidfire.com> wrote:
> >>> >
> >>> >> I have been working on this on and off for a while now (as time
> >>> permits).
> >>> >>
> >>> >> Here is an e-mail I sent to a customer of ours that helps describe
> >>> some of
> >>> >> the issues:
> >>> >>
> >>> >> *** Beginning of e-mail ***
> >>> >>
> >>> >> The main requests were around the following features:
> >>> >>
> >>> >> * The ability to leverage SolidFire snapshots.
> >>> >>
> >>> >> * The ability to create CloudStack templates from SolidFire
> snapshots.
> >>> >>
> >>> >> I had these on my roadmap, but bumped the priority up and began
> work on
> >>> >> them for the CS 4.6 release.
> >>> >>
> >>> >> During design, I realized there were issues with the way XenServer
> is
> >>> >> architected that prevented me from directly using SolidFire
> snapshots.
> >>> >>
> >>> >> I could definitely take a SolidFire snapshot of a SolidFire volume,
> but
> >>> >> this snapshot would not be usable from XenServer if the original
> >>> volume was
> >>> >> still in use.
> >>> >>
> >>> >> Here is the gist of the problem:
> >>> >>
> >>> >> When XenServer leverages an iSCSI target such as a SolidFire
> volume, it
> >>> >> applies a clustered files system to it, which they call a storage
> >>> >> repository (SR). An SR has an *immutable* UUID associated with it.
> >>> >>
> >>> >> The virtual volume (which a VM sees as a disk) is represented by a
> >>> virtual
> >>> >> disk image (VDI) in the SR. A VDI also has an *immutable* UUID
> >>> associated
> >>> >> with it.
> >>> >>
> >>> >> If I take a snapshot (or a clone) of the SolidFire volume and then
> >>> later
> >>> >> try to use that snapshot from XenServer, XenServer complains that
> the
> >>> SR on
> >>> >> the snapshot has a UUID that conflicts with an existing UUID.
> >>> >>
> >>> >> In other words, it is not possible to use the original SR and the
> >>> snapshot
> >>> >> of this SR from XenServer at the same time, which is critical in a
> >>> cloud
> >>> >> environment (to enable creating templates from snapshots).
> >>> >>
> >>> >> The way I have proposed circumventing this issue is not ideal, but
> >>> >> technically works (this code is checked into the CS 4.6 branch):
> >>> >>
> >>> >> When the time comes to take a CloudStack snapshot of a CloudStack
> >>> volume
> >>> >> that is backed by SolidFire storage via the storage plug-in, the
> >>> plug-in
> >>> >> will create a new SolidFire volume with characteristics (size and
> IOPS)
> >>> >> equal to those of the original volume.
> >>> >>
> >>> >> We then have XenServer attach to this new SolidFire volume, create a
> >>> *new*
> >>> >> SR on it, and then copy the VDI from the source SR to the
> destination
> >>> SR
> >>> >> (the new SR).
> >>> >>
> >>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts),
> but
> >>> it
> >>> >> requires CPU cycles on the compute cluster as well as network
> >>> bandwidth to
> >>> >> write to the SAN (thus it is slower and more resource intensive
> than a
> >>> >> SolidFire snapshot).
> >>> >>
> >>> >> I spoke with Tim Mackey (who works on XenServer at Citrix)
> concerning
> >>> this
> >>> >> issue before and during the CloudStack Collaboration Conference in
> >>> Budapest
> >>> >> in November. He agreed that this is a legitimate issue with the way
> >>> >> XenServer is designed and could not think of a way (other than what
> I
> >>> was
> >>> >> doing) to get around it in current versions of XenServer.
> >>> >>
> >>> >> One thought is to have a feature added to XenServer that enables
> you to
> >>> >> change the UUID of an SR and of a VDI.
> >>> >>
> >>> >> If I could do that, then I could take a SolidFire snapshot of the
> >>> >> SolidFire volume and issue commands to XenServer to have it change
> the
> >>> >> UUIDs of the original SR and the original VDI. I could then recored
> the
> >>> >> necessary UUID info in the CS DB.
> >>> >>
> >>> >> *** End of e-mail ***
> >>> >>
> >>> >> I have since investigated this on ESXi.
> >>> >>
> >>> >> ESXi does have a way for us to "re-signature" a datastore, so
> backend
> >>> >> snapshots can be taken and effectively used on this hypervisor.
> >>> >>
> >>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
> >>> lbarfield@tqhosting.com>
> >>> >> wrote:
> >>> >>
> >>> >>> I'm just going to stick with the qemu-img option change for RBD for
> >>> >>> now (which should cut snapshot time down drastically), and look
> >>> >>> forward to this in the future.  I'd be happy to help get this
> moving,
> >>> >>> but I'm not enough of a developer to lead the charge.
> >>> >>>
> >>> >>> As far as renaming goes, I agree that maybe backups isn't the right
> >>> >>> word.  That being said calling a full-sized copy of a volume a
> >>> >>> "snapshot" also isn't the right word.  Maybe "image" would be
> better?
> >>> >>>
> >>> >>> I've also got my reservations about "accounts" vs "users" (I think
> >>> >>> "departments" and "accounts or users" respectively is less
> confusing),
> >>> >>> but that's a different thread.
> >>> >>>
> >>> >>> Thank You,
> >>> >>>
> >>> >>> Logan Barfield
> >>> >>> Tranquil Hosting
> >>> >>>
> >>> >>>
> >>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <
> wido@widodh.nl>
> >>> >>> wrote:
> >>> >>> >
> >>> >>> >
> >>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> >>> >>> >> I like this idea a lot for Ceph RBD.  I do think there should
> >>> still be
> >>> >>> >> support for copying snapshots to secondary storage as needed
> (for
> >>> >>> >> transfers between zones, etc.).  I really think that this could
> be
> >>> >>> >> part of a larger move to clarify the naming conventions used for
> >>> disk
> >>> >>> >> operations.  Currently "Volume Snapshots" should probably
> really be
> >>> >>> >> called "Backups".  So having "snapshot" functionality, and a
> >>> "convert
> >>> >>> >> snapshot to backup/template" would be a good move.
> >>> >>> >>
> >>> >>> >
> >>> >>> > I fully agree that this would be a very great addition.
> >>> >>> >
> >>> >>> > I won't be able to work on this any time soon though.
> >>> >>> >
> >>> >>> > Wido
> >>> >>> >
> >>> >>> >> Thank You,
> >>> >>> >>
> >>> >>> >> Logan Barfield
> >>> >>> >> Tranquil Hosting
> >>> >>> >>
> >>> >>> >>
> >>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
> >>> >>> andrija.panic@gmail.com> wrote:
> >>> >>> >>> BIG +1
> >>> >>> >>>
> >>> >>> >>> My team should submit some patch to ACS for better KVM
> snapshots,
> >>> >>> including
> >>> >>> >>> whole VM snapshot etc...but it's too early to give details...
> >>> >>> >>> best
> >>> >>> >>>
> >>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
> >>> andrei@arhont.com>
> >>> >>> wrote:
> >>> >>> >>>
> >>> >>> >>>> Hello guys,
> >>> >>> >>>>
> >>> >>> >>>> I was hoping to have some feedback from the community on the
> >>> subject
> >>> >>> of
> >>> >>> >>>> having an ability to keep snapshots on the primary storage
> where
> >>> it
> >>> >>> is
> >>> >>> >>>> supported by the storage backend.
> >>> >>> >>>>
> >>> >>> >>>> The idea behind this functionality is to improve how snapshots
> >>> are
> >>> >>> >>>> currently handled on KVM hypervisors with Ceph primary
> storage.
> >>> At
> >>> >>> the
> >>> >>> >>>> moment, the snapshots are taken on the primary storage and
> being
> >>> >>> copied to
> >>> >>> >>>> the secondary storage. This method is very slow and
> inefficient
> >>> even
> >>> >>> on
> >>> >>> >>>> small infrastructure. Even on medium deployments using
> snapshots
> >>> in
> >>> >>> KVM
> >>> >>> >>>> becomes nearly impossible. If you have tens or hundreds
> >>> concurrent
> >>> >>> >>>> snapshots taking place you will have a bunch of timeouts and
> >>> errors,
> >>> >>> your
> >>> >>> >>>> network becomes clogged, etc. In addition, using these
> snapshots
> >>> for
> >>> >>> >>>> creating new volumes or reverting back vms also slow and
> >>> >>> inefficient. As
> >>> >>> >>>> above, when you have tens or hundreds concurrent operations it
> >>> will
> >>> >>> not
> >>> >>> >>>> succeed and you will have a majority of tasks with errors or
> >>> >>> timeouts.
> >>> >>> >>>>
> >>> >>> >>>> At the moment, taking a single snapshot of relatively small
> >>> volumes
> >>> >>> (200GB
> >>> >>> >>>> or 500GB for instance) takes tens if not hundreds of minutes.
> >>> Taking
> >>> >>> a
> >>> >>> >>>> snapshot of the same volume on ceph primary storage takes a
> few
> >>> >>> seconds at
> >>> >>> >>>> most! Similarly, converting a snapshot to a volume takes tens
> if
> >>> not
> >>> >>> >>>> hundreds of minutes when secondary storage is involved;
> compared
> >>> with
> >>> >>> >>>> seconds if done directly on the primary storage.
> >>> >>> >>>>
> >>> >>> >>>> I suggest that the CloudStack should have the ability to keep
> >>> volume
> >>> >>> >>>> snapshots on the primary storage where this is supported by
> the
> >>> >>> storage.
> >>> >>> >>>> Perhaps having a per primary storage setting that enables this
> >>> >>> >>>> functionality. This will be beneficial for Ceph primary
> storage
> >>> on
> >>> >>> KVM
> >>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will be
> supported
> >>> in
> >>> >>> a near
> >>> >>> >>>> future.
> >>> >>> >>>>
> >>> >>> >>>> This will greatly speed up the process of using snapshots on
> KVM
> >>> and
> >>> >>> users
> >>> >>> >>>> will actually start using snapshotting rather than giving up
> with
> >>> >>> >>>> frustration.
> >>> >>> >>>>
> >>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your
> >>> vote
> >>> >>> if you
> >>> >>> >>>> are in agreement.
> >>> >>> >>>>
> >>> >>> >>>> Thanks for your input
> >>> >>> >>>>
> >>> >>> >>>> Andrei
> >>> >>> >>>>
> >>> >>> >>>>
> >>> >>> >>>>
> >>> >>> >>>>
> >>> >>> >>>>
> >>> >>> >>>
> >>> >>> >>>
> >>> >>> >>> --
> >>> >>> >>>
> >>> >>> >>> Andrija Panić
> >>> >>>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> *Mike Tutkowski*
> >>> >> *Senior CloudStack Developer, SolidFire Inc.*
> >>> >> e: mike.tutkowski@solidfire.com
> >>> >> o: 303.746.7302
> >>> >> Advancing the way the world uses the cloud
> >>> >> <http://solidfire.com/solution/overview/?video=play>*™*
> >>> >>
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > *Mike Tutkowski*
> >>> > *Senior CloudStack Developer, SolidFire Inc.*
> >>> > e: mike.tutkowski@solidfire.com
> >>> > o: 303.746.7302
> >>> > Advancing the way the world uses the cloud
> >>> > <http://solidfire.com/solution/overview/?video=play>*™*
> >>>
> >>
> >>
> >>
> >> --
> >> *Mike Tutkowski*
> >> *Senior CloudStack Developer, SolidFire Inc.*
> >> e: mike.tutkowski@solidfire.com
> >> o: 303.746.7302
> >> Advancing the way the world uses the cloud
> >> <http://solidfire.com/solution/overview/?video=play>*™*
> >>
> >
> >
> >
> > --
> > *Mike Tutkowski*
> > *Senior CloudStack Developer, SolidFire Inc.*
> > e: mike.tutkowski@solidfire.com
> > o: 303.746.7302
> > Advancing the way the world uses the cloud
> > <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
*Ian Rae*
PDG *| *CEO
t *514.944.4008*

*CloudOps* Votre partenaire infonuagique* | *Cloud Solutions Experts
w cloudops.com <http://www.cloudops.com/> *|* 420 rue Guy *|* Montreal *|*
 Quebec *|* H3J 1S6

<https://www.cloud.ca/>
<http://www.cloudops.com/2014/11/cloudops-tops-deloittes-technology-fast-50/>

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Logan Barfield <lb...@tqhosting.com>.
Hi Mike,

I agree it is a general CloudStack issue that can be addressed across
multiple primary storage options.  It's a two stage issue since some
changes will need to be implemented to support these features across
the board, and others will need to be made to each storage option.

It would be nice to see a single issue opened to cover this across all
available storage options.  Maybe have a community vote on what
support they want to see, and not consider the feature complete until
all of the desired options are implemented?  That would slow down
development for sure, but it would ensure that it was supported where
it needs to be.

Thank You,

Logan Barfield
Tranquil Hosting


On Mon, Feb 16, 2015 at 12:42 PM, Mike Tutkowski
<mi...@solidfire.com> wrote:
> For example, Punith from CloudByte sent out an e-mail yesterday that was
> very similar to this thread, but he was wondering how to implement such a
> concept on his company's SAN technology.
>
> On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
> mike.tutkowski@solidfire.com> wrote:
>
>> Yeah, I think it's a similar concept, though.
>>
>> You would want to take snapshots on Ceph (or some other backend system
>> that acts as primary storage) instead of copying data to secondary storage
>> and calling it a snapshot.
>>
>> For Ceph or any other backend system like that, the idea is to speed up
>> snapshots by not requiring CPU cycles on the front end or network bandwidth
>> to transfer the data.
>>
>> In that sense, this is a general-purpose CloudStack problem and it appears
>> you are intending on discussing only the Ceph implementation here, which is
>> fine.
>>
>> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lb...@tqhosting.com>
>> wrote:
>>
>>> Hi Mike,
>>>
>>> I think the interest in this issue is primarily for Ceph RBD, which
>>> doesn't use iSCSI or SAN concepts in general.  As well I believe RBD
>>> is only currently supported in KVM (and VMware?).  QEMU has native RBD
>>> support, so it attaches the devices directly to the VMs in question.
>>> It also natively supports snapshotting, which is what this discussion
>>> is about.
>>>
>>> Thank You,
>>>
>>> Logan Barfield
>>> Tranquil Hosting
>>>
>>>
>>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
>>> <mi...@solidfire.com> wrote:
>>> > I should have also commented on KVM (since that was the hypervisor
>>> called
>>> > out in the initial e-mail).
>>> >
>>> > In my situation, most of my customers use XenServer and/or ESXi, so KVM
>>> has
>>> > received the fewest of my cycles with regards to those three
>>> hypervisors.
>>> >
>>> > KVM, though, is actually the simplest hypervisor for which to implement
>>> > these changes (since I am using the iSCSI adapter of the KVM agent and
>>> it
>>> > just essentially passes my LUN to the VM in question).
>>> >
>>> > For KVM, there is no clustered file system applied to my backend LUN,
>>> so I
>>> > don't have to "worry" about that layer.
>>> >
>>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such is
>>> the
>>> > case with XenServer) or having to re-signature anything (such is the
>>> case
>>> > with ESXi).
>>> >
>>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
>>> > mike.tutkowski@solidfire.com> wrote:
>>> >
>>> >> I have been working on this on and off for a while now (as time
>>> permits).
>>> >>
>>> >> Here is an e-mail I sent to a customer of ours that helps describe
>>> some of
>>> >> the issues:
>>> >>
>>> >> *** Beginning of e-mail ***
>>> >>
>>> >> The main requests were around the following features:
>>> >>
>>> >> * The ability to leverage SolidFire snapshots.
>>> >>
>>> >> * The ability to create CloudStack templates from SolidFire snapshots.
>>> >>
>>> >> I had these on my roadmap, but bumped the priority up and began work on
>>> >> them for the CS 4.6 release.
>>> >>
>>> >> During design, I realized there were issues with the way XenServer is
>>> >> architected that prevented me from directly using SolidFire snapshots.
>>> >>
>>> >> I could definitely take a SolidFire snapshot of a SolidFire volume, but
>>> >> this snapshot would not be usable from XenServer if the original
>>> volume was
>>> >> still in use.
>>> >>
>>> >> Here is the gist of the problem:
>>> >>
>>> >> When XenServer leverages an iSCSI target such as a SolidFire volume, it
>>> >> applies a clustered files system to it, which they call a storage
>>> >> repository (SR). An SR has an *immutable* UUID associated with it.
>>> >>
>>> >> The virtual volume (which a VM sees as a disk) is represented by a
>>> virtual
>>> >> disk image (VDI) in the SR. A VDI also has an *immutable* UUID
>>> associated
>>> >> with it.
>>> >>
>>> >> If I take a snapshot (or a clone) of the SolidFire volume and then
>>> later
>>> >> try to use that snapshot from XenServer, XenServer complains that the
>>> SR on
>>> >> the snapshot has a UUID that conflicts with an existing UUID.
>>> >>
>>> >> In other words, it is not possible to use the original SR and the
>>> snapshot
>>> >> of this SR from XenServer at the same time, which is critical in a
>>> cloud
>>> >> environment (to enable creating templates from snapshots).
>>> >>
>>> >> The way I have proposed circumventing this issue is not ideal, but
>>> >> technically works (this code is checked into the CS 4.6 branch):
>>> >>
>>> >> When the time comes to take a CloudStack snapshot of a CloudStack
>>> volume
>>> >> that is backed by SolidFire storage via the storage plug-in, the
>>> plug-in
>>> >> will create a new SolidFire volume with characteristics (size and IOPS)
>>> >> equal to those of the original volume.
>>> >>
>>> >> We then have XenServer attach to this new SolidFire volume, create a
>>> *new*
>>> >> SR on it, and then copy the VDI from the source SR to the destination
>>> SR
>>> >> (the new SR).
>>> >>
>>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts), but
>>> it
>>> >> requires CPU cycles on the compute cluster as well as network
>>> bandwidth to
>>> >> write to the SAN (thus it is slower and more resource intensive than a
>>> >> SolidFire snapshot).
>>> >>
>>> >> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning
>>> this
>>> >> issue before and during the CloudStack Collaboration Conference in
>>> Budapest
>>> >> in November. He agreed that this is a legitimate issue with the way
>>> >> XenServer is designed and could not think of a way (other than what I
>>> was
>>> >> doing) to get around it in current versions of XenServer.
>>> >>
>>> >> One thought is to have a feature added to XenServer that enables you to
>>> >> change the UUID of an SR and of a VDI.
>>> >>
>>> >> If I could do that, then I could take a SolidFire snapshot of the
>>> >> SolidFire volume and issue commands to XenServer to have it change the
>>> >> UUIDs of the original SR and the original VDI. I could then recored the
>>> >> necessary UUID info in the CS DB.
>>> >>
>>> >> *** End of e-mail ***
>>> >>
>>> >> I have since investigated this on ESXi.
>>> >>
>>> >> ESXi does have a way for us to "re-signature" a datastore, so backend
>>> >> snapshots can be taken and effectively used on this hypervisor.
>>> >>
>>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
>>> lbarfield@tqhosting.com>
>>> >> wrote:
>>> >>
>>> >>> I'm just going to stick with the qemu-img option change for RBD for
>>> >>> now (which should cut snapshot time down drastically), and look
>>> >>> forward to this in the future.  I'd be happy to help get this moving,
>>> >>> but I'm not enough of a developer to lead the charge.
>>> >>>
>>> >>> As far as renaming goes, I agree that maybe backups isn't the right
>>> >>> word.  That being said calling a full-sized copy of a volume a
>>> >>> "snapshot" also isn't the right word.  Maybe "image" would be better?
>>> >>>
>>> >>> I've also got my reservations about "accounts" vs "users" (I think
>>> >>> "departments" and "accounts or users" respectively is less confusing),
>>> >>> but that's a different thread.
>>> >>>
>>> >>> Thank You,
>>> >>>
>>> >>> Logan Barfield
>>> >>> Tranquil Hosting
>>> >>>
>>> >>>
>>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wi...@widodh.nl>
>>> >>> wrote:
>>> >>> >
>>> >>> >
>>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
>>> >>> >> I like this idea a lot for Ceph RBD.  I do think there should
>>> still be
>>> >>> >> support for copying snapshots to secondary storage as needed (for
>>> >>> >> transfers between zones, etc.).  I really think that this could be
>>> >>> >> part of a larger move to clarify the naming conventions used for
>>> disk
>>> >>> >> operations.  Currently "Volume Snapshots" should probably really be
>>> >>> >> called "Backups".  So having "snapshot" functionality, and a
>>> "convert
>>> >>> >> snapshot to backup/template" would be a good move.
>>> >>> >>
>>> >>> >
>>> >>> > I fully agree that this would be a very great addition.
>>> >>> >
>>> >>> > I won't be able to work on this any time soon though.
>>> >>> >
>>> >>> > Wido
>>> >>> >
>>> >>> >> Thank You,
>>> >>> >>
>>> >>> >> Logan Barfield
>>> >>> >> Tranquil Hosting
>>> >>> >>
>>> >>> >>
>>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
>>> >>> andrija.panic@gmail.com> wrote:
>>> >>> >>> BIG +1
>>> >>> >>>
>>> >>> >>> My team should submit some patch to ACS for better KVM snapshots,
>>> >>> including
>>> >>> >>> whole VM snapshot etc...but it's too early to give details...
>>> >>> >>> best
>>> >>> >>>
>>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
>>> andrei@arhont.com>
>>> >>> wrote:
>>> >>> >>>
>>> >>> >>>> Hello guys,
>>> >>> >>>>
>>> >>> >>>> I was hoping to have some feedback from the community on the
>>> subject
>>> >>> of
>>> >>> >>>> having an ability to keep snapshots on the primary storage where
>>> it
>>> >>> is
>>> >>> >>>> supported by the storage backend.
>>> >>> >>>>
>>> >>> >>>> The idea behind this functionality is to improve how snapshots
>>> are
>>> >>> >>>> currently handled on KVM hypervisors with Ceph primary storage.
>>> At
>>> >>> the
>>> >>> >>>> moment, the snapshots are taken on the primary storage and being
>>> >>> copied to
>>> >>> >>>> the secondary storage. This method is very slow and inefficient
>>> even
>>> >>> on
>>> >>> >>>> small infrastructure. Even on medium deployments using snapshots
>>> in
>>> >>> KVM
>>> >>> >>>> becomes nearly impossible. If you have tens or hundreds
>>> concurrent
>>> >>> >>>> snapshots taking place you will have a bunch of timeouts and
>>> errors,
>>> >>> your
>>> >>> >>>> network becomes clogged, etc. In addition, using these snapshots
>>> for
>>> >>> >>>> creating new volumes or reverting back vms also slow and
>>> >>> inefficient. As
>>> >>> >>>> above, when you have tens or hundreds concurrent operations it
>>> will
>>> >>> not
>>> >>> >>>> succeed and you will have a majority of tasks with errors or
>>> >>> timeouts.
>>> >>> >>>>
>>> >>> >>>> At the moment, taking a single snapshot of relatively small
>>> volumes
>>> >>> (200GB
>>> >>> >>>> or 500GB for instance) takes tens if not hundreds of minutes.
>>> Taking
>>> >>> a
>>> >>> >>>> snapshot of the same volume on ceph primary storage takes a few
>>> >>> seconds at
>>> >>> >>>> most! Similarly, converting a snapshot to a volume takes tens if
>>> not
>>> >>> >>>> hundreds of minutes when secondary storage is involved; compared
>>> with
>>> >>> >>>> seconds if done directly on the primary storage.
>>> >>> >>>>
>>> >>> >>>> I suggest that the CloudStack should have the ability to keep
>>> volume
>>> >>> >>>> snapshots on the primary storage where this is supported by the
>>> >>> storage.
>>> >>> >>>> Perhaps having a per primary storage setting that enables this
>>> >>> >>>> functionality. This will be beneficial for Ceph primary storage
>>> on
>>> >>> KVM
>>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will be supported
>>> in
>>> >>> a near
>>> >>> >>>> future.
>>> >>> >>>>
>>> >>> >>>> This will greatly speed up the process of using snapshots on KVM
>>> and
>>> >>> users
>>> >>> >>>> will actually start using snapshotting rather than giving up with
>>> >>> >>>> frustration.
>>> >>> >>>>
>>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your
>>> vote
>>> >>> if you
>>> >>> >>>> are in agreement.
>>> >>> >>>>
>>> >>> >>>> Thanks for your input
>>> >>> >>>>
>>> >>> >>>> Andrei
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>>
>>> >>> >>>
>>> >>> >>>
>>> >>> >>> --
>>> >>> >>>
>>> >>> >>> Andrija Panić
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> *Mike Tutkowski*
>>> >> *Senior CloudStack Developer, SolidFire Inc.*
>>> >> e: mike.tutkowski@solidfire.com
>>> >> o: 303.746.7302
>>> >> Advancing the way the world uses the cloud
>>> >> <http://solidfire.com/solution/overview/?video=play>*™*
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > *Mike Tutkowski*
>>> > *Senior CloudStack Developer, SolidFire Inc.*
>>> > e: mike.tutkowski@solidfire.com
>>> > o: 303.746.7302
>>> > Advancing the way the world uses the cloud
>>> > <http://solidfire.com/solution/overview/?video=play>*™*
>>>
>>
>>
>>
>> --
>> *Mike Tutkowski*
>> *Senior CloudStack Developer, SolidFire Inc.*
>> e: mike.tutkowski@solidfire.com
>> o: 303.746.7302
>> Advancing the way the world uses the cloud
>> <http://solidfire.com/solution/overview/?video=play>*™*
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkowski@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Mike Tutkowski <mi...@solidfire.com>.
For example, Punith from CloudByte sent out an e-mail yesterday that was
very similar to this thread, but he was wondering how to implement such a
concept on his company's SAN technology.

On Mon, Feb 16, 2015 at 10:40 AM, Mike Tutkowski <
mike.tutkowski@solidfire.com> wrote:

> Yeah, I think it's a similar concept, though.
>
> You would want to take snapshots on Ceph (or some other backend system
> that acts as primary storage) instead of copying data to secondary storage
> and calling it a snapshot.
>
> For Ceph or any other backend system like that, the idea is to speed up
> snapshots by not requiring CPU cycles on the front end or network bandwidth
> to transfer the data.
>
> In that sense, this is a general-purpose CloudStack problem and it appears
> you are intending on discussing only the Ceph implementation here, which is
> fine.
>
> On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lb...@tqhosting.com>
> wrote:
>
>> Hi Mike,
>>
>> I think the interest in this issue is primarily for Ceph RBD, which
>> doesn't use iSCSI or SAN concepts in general.  As well I believe RBD
>> is only currently supported in KVM (and VMware?).  QEMU has native RBD
>> support, so it attaches the devices directly to the VMs in question.
>> It also natively supports snapshotting, which is what this discussion
>> is about.
>>
>> Thank You,
>>
>> Logan Barfield
>> Tranquil Hosting
>>
>>
>> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
>> <mi...@solidfire.com> wrote:
>> > I should have also commented on KVM (since that was the hypervisor
>> called
>> > out in the initial e-mail).
>> >
>> > In my situation, most of my customers use XenServer and/or ESXi, so KVM
>> has
>> > received the fewest of my cycles with regards to those three
>> hypervisors.
>> >
>> > KVM, though, is actually the simplest hypervisor for which to implement
>> > these changes (since I am using the iSCSI adapter of the KVM agent and
>> it
>> > just essentially passes my LUN to the VM in question).
>> >
>> > For KVM, there is no clustered file system applied to my backend LUN,
>> so I
>> > don't have to "worry" about that layer.
>> >
>> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such is
>> the
>> > case with XenServer) or having to re-signature anything (such is the
>> case
>> > with ESXi).
>> >
>> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
>> > mike.tutkowski@solidfire.com> wrote:
>> >
>> >> I have been working on this on and off for a while now (as time
>> permits).
>> >>
>> >> Here is an e-mail I sent to a customer of ours that helps describe
>> some of
>> >> the issues:
>> >>
>> >> *** Beginning of e-mail ***
>> >>
>> >> The main requests were around the following features:
>> >>
>> >> * The ability to leverage SolidFire snapshots.
>> >>
>> >> * The ability to create CloudStack templates from SolidFire snapshots.
>> >>
>> >> I had these on my roadmap, but bumped the priority up and began work on
>> >> them for the CS 4.6 release.
>> >>
>> >> During design, I realized there were issues with the way XenServer is
>> >> architected that prevented me from directly using SolidFire snapshots.
>> >>
>> >> I could definitely take a SolidFire snapshot of a SolidFire volume, but
>> >> this snapshot would not be usable from XenServer if the original
>> volume was
>> >> still in use.
>> >>
>> >> Here is the gist of the problem:
>> >>
>> >> When XenServer leverages an iSCSI target such as a SolidFire volume, it
>> >> applies a clustered files system to it, which they call a storage
>> >> repository (SR). An SR has an *immutable* UUID associated with it.
>> >>
>> >> The virtual volume (which a VM sees as a disk) is represented by a
>> virtual
>> >> disk image (VDI) in the SR. A VDI also has an *immutable* UUID
>> associated
>> >> with it.
>> >>
>> >> If I take a snapshot (or a clone) of the SolidFire volume and then
>> later
>> >> try to use that snapshot from XenServer, XenServer complains that the
>> SR on
>> >> the snapshot has a UUID that conflicts with an existing UUID.
>> >>
>> >> In other words, it is not possible to use the original SR and the
>> snapshot
>> >> of this SR from XenServer at the same time, which is critical in a
>> cloud
>> >> environment (to enable creating templates from snapshots).
>> >>
>> >> The way I have proposed circumventing this issue is not ideal, but
>> >> technically works (this code is checked into the CS 4.6 branch):
>> >>
>> >> When the time comes to take a CloudStack snapshot of a CloudStack
>> volume
>> >> that is backed by SolidFire storage via the storage plug-in, the
>> plug-in
>> >> will create a new SolidFire volume with characteristics (size and IOPS)
>> >> equal to those of the original volume.
>> >>
>> >> We then have XenServer attach to this new SolidFire volume, create a
>> *new*
>> >> SR on it, and then copy the VDI from the source SR to the destination
>> SR
>> >> (the new SR).
>> >>
>> >> This leads to us having a copy of the VDI (a "snapshot" of sorts), but
>> it
>> >> requires CPU cycles on the compute cluster as well as network
>> bandwidth to
>> >> write to the SAN (thus it is slower and more resource intensive than a
>> >> SolidFire snapshot).
>> >>
>> >> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning
>> this
>> >> issue before and during the CloudStack Collaboration Conference in
>> Budapest
>> >> in November. He agreed that this is a legitimate issue with the way
>> >> XenServer is designed and could not think of a way (other than what I
>> was
>> >> doing) to get around it in current versions of XenServer.
>> >>
>> >> One thought is to have a feature added to XenServer that enables you to
>> >> change the UUID of an SR and of a VDI.
>> >>
>> >> If I could do that, then I could take a SolidFire snapshot of the
>> >> SolidFire volume and issue commands to XenServer to have it change the
>> >> UUIDs of the original SR and the original VDI. I could then recored the
>> >> necessary UUID info in the CS DB.
>> >>
>> >> *** End of e-mail ***
>> >>
>> >> I have since investigated this on ESXi.
>> >>
>> >> ESXi does have a way for us to "re-signature" a datastore, so backend
>> >> snapshots can be taken and effectively used on this hypervisor.
>> >>
>> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
>> lbarfield@tqhosting.com>
>> >> wrote:
>> >>
>> >>> I'm just going to stick with the qemu-img option change for RBD for
>> >>> now (which should cut snapshot time down drastically), and look
>> >>> forward to this in the future.  I'd be happy to help get this moving,
>> >>> but I'm not enough of a developer to lead the charge.
>> >>>
>> >>> As far as renaming goes, I agree that maybe backups isn't the right
>> >>> word.  That being said calling a full-sized copy of a volume a
>> >>> "snapshot" also isn't the right word.  Maybe "image" would be better?
>> >>>
>> >>> I've also got my reservations about "accounts" vs "users" (I think
>> >>> "departments" and "accounts or users" respectively is less confusing),
>> >>> but that's a different thread.
>> >>>
>> >>> Thank You,
>> >>>
>> >>> Logan Barfield
>> >>> Tranquil Hosting
>> >>>
>> >>>
>> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wi...@widodh.nl>
>> >>> wrote:
>> >>> >
>> >>> >
>> >>> > On 16-02-15 15:38, Logan Barfield wrote:
>> >>> >> I like this idea a lot for Ceph RBD.  I do think there should
>> still be
>> >>> >> support for copying snapshots to secondary storage as needed (for
>> >>> >> transfers between zones, etc.).  I really think that this could be
>> >>> >> part of a larger move to clarify the naming conventions used for
>> disk
>> >>> >> operations.  Currently "Volume Snapshots" should probably really be
>> >>> >> called "Backups".  So having "snapshot" functionality, and a
>> "convert
>> >>> >> snapshot to backup/template" would be a good move.
>> >>> >>
>> >>> >
>> >>> > I fully agree that this would be a very great addition.
>> >>> >
>> >>> > I won't be able to work on this any time soon though.
>> >>> >
>> >>> > Wido
>> >>> >
>> >>> >> Thank You,
>> >>> >>
>> >>> >> Logan Barfield
>> >>> >> Tranquil Hosting
>> >>> >>
>> >>> >>
>> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
>> >>> andrija.panic@gmail.com> wrote:
>> >>> >>> BIG +1
>> >>> >>>
>> >>> >>> My team should submit some patch to ACS for better KVM snapshots,
>> >>> including
>> >>> >>> whole VM snapshot etc...but it's too early to give details...
>> >>> >>> best
>> >>> >>>
>> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
>> andrei@arhont.com>
>> >>> wrote:
>> >>> >>>
>> >>> >>>> Hello guys,
>> >>> >>>>
>> >>> >>>> I was hoping to have some feedback from the community on the
>> subject
>> >>> of
>> >>> >>>> having an ability to keep snapshots on the primary storage where
>> it
>> >>> is
>> >>> >>>> supported by the storage backend.
>> >>> >>>>
>> >>> >>>> The idea behind this functionality is to improve how snapshots
>> are
>> >>> >>>> currently handled on KVM hypervisors with Ceph primary storage.
>> At
>> >>> the
>> >>> >>>> moment, the snapshots are taken on the primary storage and being
>> >>> copied to
>> >>> >>>> the secondary storage. This method is very slow and inefficient
>> even
>> >>> on
>> >>> >>>> small infrastructure. Even on medium deployments using snapshots
>> in
>> >>> KVM
>> >>> >>>> becomes nearly impossible. If you have tens or hundreds
>> concurrent
>> >>> >>>> snapshots taking place you will have a bunch of timeouts and
>> errors,
>> >>> your
>> >>> >>>> network becomes clogged, etc. In addition, using these snapshots
>> for
>> >>> >>>> creating new volumes or reverting back vms also slow and
>> >>> inefficient. As
>> >>> >>>> above, when you have tens or hundreds concurrent operations it
>> will
>> >>> not
>> >>> >>>> succeed and you will have a majority of tasks with errors or
>> >>> timeouts.
>> >>> >>>>
>> >>> >>>> At the moment, taking a single snapshot of relatively small
>> volumes
>> >>> (200GB
>> >>> >>>> or 500GB for instance) takes tens if not hundreds of minutes.
>> Taking
>> >>> a
>> >>> >>>> snapshot of the same volume on ceph primary storage takes a few
>> >>> seconds at
>> >>> >>>> most! Similarly, converting a snapshot to a volume takes tens if
>> not
>> >>> >>>> hundreds of minutes when secondary storage is involved; compared
>> with
>> >>> >>>> seconds if done directly on the primary storage.
>> >>> >>>>
>> >>> >>>> I suggest that the CloudStack should have the ability to keep
>> volume
>> >>> >>>> snapshots on the primary storage where this is supported by the
>> >>> storage.
>> >>> >>>> Perhaps having a per primary storage setting that enables this
>> >>> >>>> functionality. This will be beneficial for Ceph primary storage
>> on
>> >>> KVM
>> >>> >>>> hypervisors and perhaps on XenServer when Ceph will be supported
>> in
>> >>> a near
>> >>> >>>> future.
>> >>> >>>>
>> >>> >>>> This will greatly speed up the process of using snapshots on KVM
>> and
>> >>> users
>> >>> >>>> will actually start using snapshotting rather than giving up with
>> >>> >>>> frustration.
>> >>> >>>>
>> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your
>> vote
>> >>> if you
>> >>> >>>> are in agreement.
>> >>> >>>>
>> >>> >>>> Thanks for your input
>> >>> >>>>
>> >>> >>>> Andrei
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>
>> >>> >>>
>> >>> >>> --
>> >>> >>>
>> >>> >>> Andrija Panić
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> *Mike Tutkowski*
>> >> *Senior CloudStack Developer, SolidFire Inc.*
>> >> e: mike.tutkowski@solidfire.com
>> >> o: 303.746.7302
>> >> Advancing the way the world uses the cloud
>> >> <http://solidfire.com/solution/overview/?video=play>*™*
>> >>
>> >
>> >
>> >
>> > --
>> > *Mike Tutkowski*
>> > *Senior CloudStack Developer, SolidFire Inc.*
>> > e: mike.tutkowski@solidfire.com
>> > o: 303.746.7302
>> > Advancing the way the world uses the cloud
>> > <http://solidfire.com/solution/overview/?video=play>*™*
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkowski@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Mike Tutkowski <mi...@solidfire.com>.
Yeah, I think it's a similar concept, though.

You would want to take snapshots on Ceph (or some other backend system that
acts as primary storage) instead of copying data to secondary storage and
calling it a snapshot.

For Ceph or any other backend system like that, the idea is to speed up
snapshots by not requiring CPU cycles on the front end or network bandwidth
to transfer the data.

In that sense, this is a general-purpose CloudStack problem and it appears
you are intending on discussing only the Ceph implementation here, which is
fine.

On Mon, Feb 16, 2015 at 10:34 AM, Logan Barfield <lb...@tqhosting.com>
wrote:

> Hi Mike,
>
> I think the interest in this issue is primarily for Ceph RBD, which
> doesn't use iSCSI or SAN concepts in general.  As well I believe RBD
> is only currently supported in KVM (and VMware?).  QEMU has native RBD
> support, so it attaches the devices directly to the VMs in question.
> It also natively supports snapshotting, which is what this discussion
> is about.
>
> Thank You,
>
> Logan Barfield
> Tranquil Hosting
>
>
> On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
> <mi...@solidfire.com> wrote:
> > I should have also commented on KVM (since that was the hypervisor called
> > out in the initial e-mail).
> >
> > In my situation, most of my customers use XenServer and/or ESXi, so KVM
> has
> > received the fewest of my cycles with regards to those three hypervisors.
> >
> > KVM, though, is actually the simplest hypervisor for which to implement
> > these changes (since I am using the iSCSI adapter of the KVM agent and it
> > just essentially passes my LUN to the VM in question).
> >
> > For KVM, there is no clustered file system applied to my backend LUN, so
> I
> > don't have to "worry" about that layer.
> >
> > I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such is
> the
> > case with XenServer) or having to re-signature anything (such is the case
> > with ESXi).
> >
> > On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> > mike.tutkowski@solidfire.com> wrote:
> >
> >> I have been working on this on and off for a while now (as time
> permits).
> >>
> >> Here is an e-mail I sent to a customer of ours that helps describe some
> of
> >> the issues:
> >>
> >> *** Beginning of e-mail ***
> >>
> >> The main requests were around the following features:
> >>
> >> * The ability to leverage SolidFire snapshots.
> >>
> >> * The ability to create CloudStack templates from SolidFire snapshots.
> >>
> >> I had these on my roadmap, but bumped the priority up and began work on
> >> them for the CS 4.6 release.
> >>
> >> During design, I realized there were issues with the way XenServer is
> >> architected that prevented me from directly using SolidFire snapshots.
> >>
> >> I could definitely take a SolidFire snapshot of a SolidFire volume, but
> >> this snapshot would not be usable from XenServer if the original volume
> was
> >> still in use.
> >>
> >> Here is the gist of the problem:
> >>
> >> When XenServer leverages an iSCSI target such as a SolidFire volume, it
> >> applies a clustered files system to it, which they call a storage
> >> repository (SR). An SR has an *immutable* UUID associated with it.
> >>
> >> The virtual volume (which a VM sees as a disk) is represented by a
> virtual
> >> disk image (VDI) in the SR. A VDI also has an *immutable* UUID
> associated
> >> with it.
> >>
> >> If I take a snapshot (or a clone) of the SolidFire volume and then later
> >> try to use that snapshot from XenServer, XenServer complains that the
> SR on
> >> the snapshot has a UUID that conflicts with an existing UUID.
> >>
> >> In other words, it is not possible to use the original SR and the
> snapshot
> >> of this SR from XenServer at the same time, which is critical in a cloud
> >> environment (to enable creating templates from snapshots).
> >>
> >> The way I have proposed circumventing this issue is not ideal, but
> >> technically works (this code is checked into the CS 4.6 branch):
> >>
> >> When the time comes to take a CloudStack snapshot of a CloudStack volume
> >> that is backed by SolidFire storage via the storage plug-in, the plug-in
> >> will create a new SolidFire volume with characteristics (size and IOPS)
> >> equal to those of the original volume.
> >>
> >> We then have XenServer attach to this new SolidFire volume, create a
> *new*
> >> SR on it, and then copy the VDI from the source SR to the destination SR
> >> (the new SR).
> >>
> >> This leads to us having a copy of the VDI (a "snapshot" of sorts), but
> it
> >> requires CPU cycles on the compute cluster as well as network bandwidth
> to
> >> write to the SAN (thus it is slower and more resource intensive than a
> >> SolidFire snapshot).
> >>
> >> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning
> this
> >> issue before and during the CloudStack Collaboration Conference in
> Budapest
> >> in November. He agreed that this is a legitimate issue with the way
> >> XenServer is designed and could not think of a way (other than what I
> was
> >> doing) to get around it in current versions of XenServer.
> >>
> >> One thought is to have a feature added to XenServer that enables you to
> >> change the UUID of an SR and of a VDI.
> >>
> >> If I could do that, then I could take a SolidFire snapshot of the
> >> SolidFire volume and issue commands to XenServer to have it change the
> >> UUIDs of the original SR and the original VDI. I could then recored the
> >> necessary UUID info in the CS DB.
> >>
> >> *** End of e-mail ***
> >>
> >> I have since investigated this on ESXi.
> >>
> >> ESXi does have a way for us to "re-signature" a datastore, so backend
> >> snapshots can be taken and effectively used on this hypervisor.
> >>
> >> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <
> lbarfield@tqhosting.com>
> >> wrote:
> >>
> >>> I'm just going to stick with the qemu-img option change for RBD for
> >>> now (which should cut snapshot time down drastically), and look
> >>> forward to this in the future.  I'd be happy to help get this moving,
> >>> but I'm not enough of a developer to lead the charge.
> >>>
> >>> As far as renaming goes, I agree that maybe backups isn't the right
> >>> word.  That being said calling a full-sized copy of a volume a
> >>> "snapshot" also isn't the right word.  Maybe "image" would be better?
> >>>
> >>> I've also got my reservations about "accounts" vs "users" (I think
> >>> "departments" and "accounts or users" respectively is less confusing),
> >>> but that's a different thread.
> >>>
> >>> Thank You,
> >>>
> >>> Logan Barfield
> >>> Tranquil Hosting
> >>>
> >>>
> >>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wi...@widodh.nl>
> >>> wrote:
> >>> >
> >>> >
> >>> > On 16-02-15 15:38, Logan Barfield wrote:
> >>> >> I like this idea a lot for Ceph RBD.  I do think there should still
> be
> >>> >> support for copying snapshots to secondary storage as needed (for
> >>> >> transfers between zones, etc.).  I really think that this could be
> >>> >> part of a larger move to clarify the naming conventions used for
> disk
> >>> >> operations.  Currently "Volume Snapshots" should probably really be
> >>> >> called "Backups".  So having "snapshot" functionality, and a
> "convert
> >>> >> snapshot to backup/template" would be a good move.
> >>> >>
> >>> >
> >>> > I fully agree that this would be a very great addition.
> >>> >
> >>> > I won't be able to work on this any time soon though.
> >>> >
> >>> > Wido
> >>> >
> >>> >> Thank You,
> >>> >>
> >>> >> Logan Barfield
> >>> >> Tranquil Hosting
> >>> >>
> >>> >>
> >>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
> >>> andrija.panic@gmail.com> wrote:
> >>> >>> BIG +1
> >>> >>>
> >>> >>> My team should submit some patch to ACS for better KVM snapshots,
> >>> including
> >>> >>> whole VM snapshot etc...but it's too early to give details...
> >>> >>> best
> >>> >>>
> >>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <
> andrei@arhont.com>
> >>> wrote:
> >>> >>>
> >>> >>>> Hello guys,
> >>> >>>>
> >>> >>>> I was hoping to have some feedback from the community on the
> subject
> >>> of
> >>> >>>> having an ability to keep snapshots on the primary storage where
> it
> >>> is
> >>> >>>> supported by the storage backend.
> >>> >>>>
> >>> >>>> The idea behind this functionality is to improve how snapshots are
> >>> >>>> currently handled on KVM hypervisors with Ceph primary storage. At
> >>> the
> >>> >>>> moment, the snapshots are taken on the primary storage and being
> >>> copied to
> >>> >>>> the secondary storage. This method is very slow and inefficient
> even
> >>> on
> >>> >>>> small infrastructure. Even on medium deployments using snapshots
> in
> >>> KVM
> >>> >>>> becomes nearly impossible. If you have tens or hundreds concurrent
> >>> >>>> snapshots taking place you will have a bunch of timeouts and
> errors,
> >>> your
> >>> >>>> network becomes clogged, etc. In addition, using these snapshots
> for
> >>> >>>> creating new volumes or reverting back vms also slow and
> >>> inefficient. As
> >>> >>>> above, when you have tens or hundreds concurrent operations it
> will
> >>> not
> >>> >>>> succeed and you will have a majority of tasks with errors or
> >>> timeouts.
> >>> >>>>
> >>> >>>> At the moment, taking a single snapshot of relatively small
> volumes
> >>> (200GB
> >>> >>>> or 500GB for instance) takes tens if not hundreds of minutes.
> Taking
> >>> a
> >>> >>>> snapshot of the same volume on ceph primary storage takes a few
> >>> seconds at
> >>> >>>> most! Similarly, converting a snapshot to a volume takes tens if
> not
> >>> >>>> hundreds of minutes when secondary storage is involved; compared
> with
> >>> >>>> seconds if done directly on the primary storage.
> >>> >>>>
> >>> >>>> I suggest that the CloudStack should have the ability to keep
> volume
> >>> >>>> snapshots on the primary storage where this is supported by the
> >>> storage.
> >>> >>>> Perhaps having a per primary storage setting that enables this
> >>> >>>> functionality. This will be beneficial for Ceph primary storage on
> >>> KVM
> >>> >>>> hypervisors and perhaps on XenServer when Ceph will be supported
> in
> >>> a near
> >>> >>>> future.
> >>> >>>>
> >>> >>>> This will greatly speed up the process of using snapshots on KVM
> and
> >>> users
> >>> >>>> will actually start using snapshotting rather than giving up with
> >>> >>>> frustration.
> >>> >>>>
> >>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote
> >>> if you
> >>> >>>> are in agreement.
> >>> >>>>
> >>> >>>> Thanks for your input
> >>> >>>>
> >>> >>>> Andrei
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>
> >>> >>>
> >>> >>> --
> >>> >>>
> >>> >>> Andrija Panić
> >>>
> >>
> >>
> >>
> >> --
> >> *Mike Tutkowski*
> >> *Senior CloudStack Developer, SolidFire Inc.*
> >> e: mike.tutkowski@solidfire.com
> >> o: 303.746.7302
> >> Advancing the way the world uses the cloud
> >> <http://solidfire.com/solution/overview/?video=play>*™*
> >>
> >
> >
> >
> > --
> > *Mike Tutkowski*
> > *Senior CloudStack Developer, SolidFire Inc.*
> > e: mike.tutkowski@solidfire.com
> > o: 303.746.7302
> > Advancing the way the world uses the cloud
> > <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Logan Barfield <lb...@tqhosting.com>.
Hi Mike,

I think the interest in this issue is primarily for Ceph RBD, which
doesn't use iSCSI or SAN concepts in general.  As well I believe RBD
is only currently supported in KVM (and VMware?).  QEMU has native RBD
support, so it attaches the devices directly to the VMs in question.
It also natively supports snapshotting, which is what this discussion
is about.

Thank You,

Logan Barfield
Tranquil Hosting


On Mon, Feb 16, 2015 at 11:46 AM, Mike Tutkowski
<mi...@solidfire.com> wrote:
> I should have also commented on KVM (since that was the hypervisor called
> out in the initial e-mail).
>
> In my situation, most of my customers use XenServer and/or ESXi, so KVM has
> received the fewest of my cycles with regards to those three hypervisors.
>
> KVM, though, is actually the simplest hypervisor for which to implement
> these changes (since I am using the iSCSI adapter of the KVM agent and it
> just essentially passes my LUN to the VM in question).
>
> For KVM, there is no clustered file system applied to my backend LUN, so I
> don't have to "worry" about that layer.
>
> I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such is the
> case with XenServer) or having to re-signature anything (such is the case
> with ESXi).
>
> On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
> mike.tutkowski@solidfire.com> wrote:
>
>> I have been working on this on and off for a while now (as time permits).
>>
>> Here is an e-mail I sent to a customer of ours that helps describe some of
>> the issues:
>>
>> *** Beginning of e-mail ***
>>
>> The main requests were around the following features:
>>
>> * The ability to leverage SolidFire snapshots.
>>
>> * The ability to create CloudStack templates from SolidFire snapshots.
>>
>> I had these on my roadmap, but bumped the priority up and began work on
>> them for the CS 4.6 release.
>>
>> During design, I realized there were issues with the way XenServer is
>> architected that prevented me from directly using SolidFire snapshots.
>>
>> I could definitely take a SolidFire snapshot of a SolidFire volume, but
>> this snapshot would not be usable from XenServer if the original volume was
>> still in use.
>>
>> Here is the gist of the problem:
>>
>> When XenServer leverages an iSCSI target such as a SolidFire volume, it
>> applies a clustered files system to it, which they call a storage
>> repository (SR). An SR has an *immutable* UUID associated with it.
>>
>> The virtual volume (which a VM sees as a disk) is represented by a virtual
>> disk image (VDI) in the SR. A VDI also has an *immutable* UUID associated
>> with it.
>>
>> If I take a snapshot (or a clone) of the SolidFire volume and then later
>> try to use that snapshot from XenServer, XenServer complains that the SR on
>> the snapshot has a UUID that conflicts with an existing UUID.
>>
>> In other words, it is not possible to use the original SR and the snapshot
>> of this SR from XenServer at the same time, which is critical in a cloud
>> environment (to enable creating templates from snapshots).
>>
>> The way I have proposed circumventing this issue is not ideal, but
>> technically works (this code is checked into the CS 4.6 branch):
>>
>> When the time comes to take a CloudStack snapshot of a CloudStack volume
>> that is backed by SolidFire storage via the storage plug-in, the plug-in
>> will create a new SolidFire volume with characteristics (size and IOPS)
>> equal to those of the original volume.
>>
>> We then have XenServer attach to this new SolidFire volume, create a *new*
>> SR on it, and then copy the VDI from the source SR to the destination SR
>> (the new SR).
>>
>> This leads to us having a copy of the VDI (a "snapshot" of sorts), but it
>> requires CPU cycles on the compute cluster as well as network bandwidth to
>> write to the SAN (thus it is slower and more resource intensive than a
>> SolidFire snapshot).
>>
>> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning this
>> issue before and during the CloudStack Collaboration Conference in Budapest
>> in November. He agreed that this is a legitimate issue with the way
>> XenServer is designed and could not think of a way (other than what I was
>> doing) to get around it in current versions of XenServer.
>>
>> One thought is to have a feature added to XenServer that enables you to
>> change the UUID of an SR and of a VDI.
>>
>> If I could do that, then I could take a SolidFire snapshot of the
>> SolidFire volume and issue commands to XenServer to have it change the
>> UUIDs of the original SR and the original VDI. I could then recored the
>> necessary UUID info in the CS DB.
>>
>> *** End of e-mail ***
>>
>> I have since investigated this on ESXi.
>>
>> ESXi does have a way for us to "re-signature" a datastore, so backend
>> snapshots can be taken and effectively used on this hypervisor.
>>
>> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <lb...@tqhosting.com>
>> wrote:
>>
>>> I'm just going to stick with the qemu-img option change for RBD for
>>> now (which should cut snapshot time down drastically), and look
>>> forward to this in the future.  I'd be happy to help get this moving,
>>> but I'm not enough of a developer to lead the charge.
>>>
>>> As far as renaming goes, I agree that maybe backups isn't the right
>>> word.  That being said calling a full-sized copy of a volume a
>>> "snapshot" also isn't the right word.  Maybe "image" would be better?
>>>
>>> I've also got my reservations about "accounts" vs "users" (I think
>>> "departments" and "accounts or users" respectively is less confusing),
>>> but that's a different thread.
>>>
>>> Thank You,
>>>
>>> Logan Barfield
>>> Tranquil Hosting
>>>
>>>
>>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wi...@widodh.nl>
>>> wrote:
>>> >
>>> >
>>> > On 16-02-15 15:38, Logan Barfield wrote:
>>> >> I like this idea a lot for Ceph RBD.  I do think there should still be
>>> >> support for copying snapshots to secondary storage as needed (for
>>> >> transfers between zones, etc.).  I really think that this could be
>>> >> part of a larger move to clarify the naming conventions used for disk
>>> >> operations.  Currently "Volume Snapshots" should probably really be
>>> >> called "Backups".  So having "snapshot" functionality, and a "convert
>>> >> snapshot to backup/template" would be a good move.
>>> >>
>>> >
>>> > I fully agree that this would be a very great addition.
>>> >
>>> > I won't be able to work on this any time soon though.
>>> >
>>> > Wido
>>> >
>>> >> Thank You,
>>> >>
>>> >> Logan Barfield
>>> >> Tranquil Hosting
>>> >>
>>> >>
>>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
>>> andrija.panic@gmail.com> wrote:
>>> >>> BIG +1
>>> >>>
>>> >>> My team should submit some patch to ACS for better KVM snapshots,
>>> including
>>> >>> whole VM snapshot etc...but it's too early to give details...
>>> >>> best
>>> >>>
>>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <an...@arhont.com>
>>> wrote:
>>> >>>
>>> >>>> Hello guys,
>>> >>>>
>>> >>>> I was hoping to have some feedback from the community on the subject
>>> of
>>> >>>> having an ability to keep snapshots on the primary storage where it
>>> is
>>> >>>> supported by the storage backend.
>>> >>>>
>>> >>>> The idea behind this functionality is to improve how snapshots are
>>> >>>> currently handled on KVM hypervisors with Ceph primary storage. At
>>> the
>>> >>>> moment, the snapshots are taken on the primary storage and being
>>> copied to
>>> >>>> the secondary storage. This method is very slow and inefficient even
>>> on
>>> >>>> small infrastructure. Even on medium deployments using snapshots in
>>> KVM
>>> >>>> becomes nearly impossible. If you have tens or hundreds concurrent
>>> >>>> snapshots taking place you will have a bunch of timeouts and errors,
>>> your
>>> >>>> network becomes clogged, etc. In addition, using these snapshots for
>>> >>>> creating new volumes or reverting back vms also slow and
>>> inefficient. As
>>> >>>> above, when you have tens or hundreds concurrent operations it will
>>> not
>>> >>>> succeed and you will have a majority of tasks with errors or
>>> timeouts.
>>> >>>>
>>> >>>> At the moment, taking a single snapshot of relatively small volumes
>>> (200GB
>>> >>>> or 500GB for instance) takes tens if not hundreds of minutes. Taking
>>> a
>>> >>>> snapshot of the same volume on ceph primary storage takes a few
>>> seconds at
>>> >>>> most! Similarly, converting a snapshot to a volume takes tens if not
>>> >>>> hundreds of minutes when secondary storage is involved; compared with
>>> >>>> seconds if done directly on the primary storage.
>>> >>>>
>>> >>>> I suggest that the CloudStack should have the ability to keep volume
>>> >>>> snapshots on the primary storage where this is supported by the
>>> storage.
>>> >>>> Perhaps having a per primary storage setting that enables this
>>> >>>> functionality. This will be beneficial for Ceph primary storage on
>>> KVM
>>> >>>> hypervisors and perhaps on XenServer when Ceph will be supported in
>>> a near
>>> >>>> future.
>>> >>>>
>>> >>>> This will greatly speed up the process of using snapshots on KVM and
>>> users
>>> >>>> will actually start using snapshotting rather than giving up with
>>> >>>> frustration.
>>> >>>>
>>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote
>>> if you
>>> >>>> are in agreement.
>>> >>>>
>>> >>>> Thanks for your input
>>> >>>>
>>> >>>> Andrei
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>>
>>> >>> Andrija Panić
>>>
>>
>>
>>
>> --
>> *Mike Tutkowski*
>> *Senior CloudStack Developer, SolidFire Inc.*
>> e: mike.tutkowski@solidfire.com
>> o: 303.746.7302
>> Advancing the way the world uses the cloud
>> <http://solidfire.com/solution/overview/?video=play>*™*
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkowski@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Mike Tutkowski <mi...@solidfire.com>.
I should have also commented on KVM (since that was the hypervisor called
out in the initial e-mail).

In my situation, most of my customers use XenServer and/or ESXi, so KVM has
received the fewest of my cycles with regards to those three hypervisors.

KVM, though, is actually the simplest hypervisor for which to implement
these changes (since I am using the iSCSI adapter of the KVM agent and it
just essentially passes my LUN to the VM in question).

For KVM, there is no clustered file system applied to my backend LUN, so I
don't have to "worry" about that layer.

I don't see any hurdles like *immutable* UUIDs of SRs and VDIs (such is the
case with XenServer) or having to re-signature anything (such is the case
with ESXi).

On Mon, Feb 16, 2015 at 9:33 AM, Mike Tutkowski <
mike.tutkowski@solidfire.com> wrote:

> I have been working on this on and off for a while now (as time permits).
>
> Here is an e-mail I sent to a customer of ours that helps describe some of
> the issues:
>
> *** Beginning of e-mail ***
>
> The main requests were around the following features:
>
> * The ability to leverage SolidFire snapshots.
>
> * The ability to create CloudStack templates from SolidFire snapshots.
>
> I had these on my roadmap, but bumped the priority up and began work on
> them for the CS 4.6 release.
>
> During design, I realized there were issues with the way XenServer is
> architected that prevented me from directly using SolidFire snapshots.
>
> I could definitely take a SolidFire snapshot of a SolidFire volume, but
> this snapshot would not be usable from XenServer if the original volume was
> still in use.
>
> Here is the gist of the problem:
>
> When XenServer leverages an iSCSI target such as a SolidFire volume, it
> applies a clustered files system to it, which they call a storage
> repository (SR). An SR has an *immutable* UUID associated with it.
>
> The virtual volume (which a VM sees as a disk) is represented by a virtual
> disk image (VDI) in the SR. A VDI also has an *immutable* UUID associated
> with it.
>
> If I take a snapshot (or a clone) of the SolidFire volume and then later
> try to use that snapshot from XenServer, XenServer complains that the SR on
> the snapshot has a UUID that conflicts with an existing UUID.
>
> In other words, it is not possible to use the original SR and the snapshot
> of this SR from XenServer at the same time, which is critical in a cloud
> environment (to enable creating templates from snapshots).
>
> The way I have proposed circumventing this issue is not ideal, but
> technically works (this code is checked into the CS 4.6 branch):
>
> When the time comes to take a CloudStack snapshot of a CloudStack volume
> that is backed by SolidFire storage via the storage plug-in, the plug-in
> will create a new SolidFire volume with characteristics (size and IOPS)
> equal to those of the original volume.
>
> We then have XenServer attach to this new SolidFire volume, create a *new*
> SR on it, and then copy the VDI from the source SR to the destination SR
> (the new SR).
>
> This leads to us having a copy of the VDI (a "snapshot" of sorts), but it
> requires CPU cycles on the compute cluster as well as network bandwidth to
> write to the SAN (thus it is slower and more resource intensive than a
> SolidFire snapshot).
>
> I spoke with Tim Mackey (who works on XenServer at Citrix) concerning this
> issue before and during the CloudStack Collaboration Conference in Budapest
> in November. He agreed that this is a legitimate issue with the way
> XenServer is designed and could not think of a way (other than what I was
> doing) to get around it in current versions of XenServer.
>
> One thought is to have a feature added to XenServer that enables you to
> change the UUID of an SR and of a VDI.
>
> If I could do that, then I could take a SolidFire snapshot of the
> SolidFire volume and issue commands to XenServer to have it change the
> UUIDs of the original SR and the original VDI. I could then recored the
> necessary UUID info in the CS DB.
>
> *** End of e-mail ***
>
> I have since investigated this on ESXi.
>
> ESXi does have a way for us to "re-signature" a datastore, so backend
> snapshots can be taken and effectively used on this hypervisor.
>
> On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <lb...@tqhosting.com>
> wrote:
>
>> I'm just going to stick with the qemu-img option change for RBD for
>> now (which should cut snapshot time down drastically), and look
>> forward to this in the future.  I'd be happy to help get this moving,
>> but I'm not enough of a developer to lead the charge.
>>
>> As far as renaming goes, I agree that maybe backups isn't the right
>> word.  That being said calling a full-sized copy of a volume a
>> "snapshot" also isn't the right word.  Maybe "image" would be better?
>>
>> I've also got my reservations about "accounts" vs "users" (I think
>> "departments" and "accounts or users" respectively is less confusing),
>> but that's a different thread.
>>
>> Thank You,
>>
>> Logan Barfield
>> Tranquil Hosting
>>
>>
>> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wi...@widodh.nl>
>> wrote:
>> >
>> >
>> > On 16-02-15 15:38, Logan Barfield wrote:
>> >> I like this idea a lot for Ceph RBD.  I do think there should still be
>> >> support for copying snapshots to secondary storage as needed (for
>> >> transfers between zones, etc.).  I really think that this could be
>> >> part of a larger move to clarify the naming conventions used for disk
>> >> operations.  Currently "Volume Snapshots" should probably really be
>> >> called "Backups".  So having "snapshot" functionality, and a "convert
>> >> snapshot to backup/template" would be a good move.
>> >>
>> >
>> > I fully agree that this would be a very great addition.
>> >
>> > I won't be able to work on this any time soon though.
>> >
>> > Wido
>> >
>> >> Thank You,
>> >>
>> >> Logan Barfield
>> >> Tranquil Hosting
>> >>
>> >>
>> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <
>> andrija.panic@gmail.com> wrote:
>> >>> BIG +1
>> >>>
>> >>> My team should submit some patch to ACS for better KVM snapshots,
>> including
>> >>> whole VM snapshot etc...but it's too early to give details...
>> >>> best
>> >>>
>> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <an...@arhont.com>
>> wrote:
>> >>>
>> >>>> Hello guys,
>> >>>>
>> >>>> I was hoping to have some feedback from the community on the subject
>> of
>> >>>> having an ability to keep snapshots on the primary storage where it
>> is
>> >>>> supported by the storage backend.
>> >>>>
>> >>>> The idea behind this functionality is to improve how snapshots are
>> >>>> currently handled on KVM hypervisors with Ceph primary storage. At
>> the
>> >>>> moment, the snapshots are taken on the primary storage and being
>> copied to
>> >>>> the secondary storage. This method is very slow and inefficient even
>> on
>> >>>> small infrastructure. Even on medium deployments using snapshots in
>> KVM
>> >>>> becomes nearly impossible. If you have tens or hundreds concurrent
>> >>>> snapshots taking place you will have a bunch of timeouts and errors,
>> your
>> >>>> network becomes clogged, etc. In addition, using these snapshots for
>> >>>> creating new volumes or reverting back vms also slow and
>> inefficient. As
>> >>>> above, when you have tens or hundreds concurrent operations it will
>> not
>> >>>> succeed and you will have a majority of tasks with errors or
>> timeouts.
>> >>>>
>> >>>> At the moment, taking a single snapshot of relatively small volumes
>> (200GB
>> >>>> or 500GB for instance) takes tens if not hundreds of minutes. Taking
>> a
>> >>>> snapshot of the same volume on ceph primary storage takes a few
>> seconds at
>> >>>> most! Similarly, converting a snapshot to a volume takes tens if not
>> >>>> hundreds of minutes when secondary storage is involved; compared with
>> >>>> seconds if done directly on the primary storage.
>> >>>>
>> >>>> I suggest that the CloudStack should have the ability to keep volume
>> >>>> snapshots on the primary storage where this is supported by the
>> storage.
>> >>>> Perhaps having a per primary storage setting that enables this
>> >>>> functionality. This will be beneficial for Ceph primary storage on
>> KVM
>> >>>> hypervisors and perhaps on XenServer when Ceph will be supported in
>> a near
>> >>>> future.
>> >>>>
>> >>>> This will greatly speed up the process of using snapshots on KVM and
>> users
>> >>>> will actually start using snapshotting rather than giving up with
>> >>>> frustration.
>> >>>>
>> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote
>> if you
>> >>>> are in agreement.
>> >>>>
>> >>>> Thanks for your input
>> >>>>
>> >>>> Andrei
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>>
>> >>> Andrija Panić
>>
>
>
>
> --
> *Mike Tutkowski*
> *Senior CloudStack Developer, SolidFire Inc.*
> e: mike.tutkowski@solidfire.com
> o: 303.746.7302
> Advancing the way the world uses the cloud
> <http://solidfire.com/solution/overview/?video=play>*™*
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Mike Tutkowski <mi...@solidfire.com>.
I have been working on this on and off for a while now (as time permits).

Here is an e-mail I sent to a customer of ours that helps describe some of
the issues:

*** Beginning of e-mail ***

The main requests were around the following features:

* The ability to leverage SolidFire snapshots.

* The ability to create CloudStack templates from SolidFire snapshots.

I had these on my roadmap, but bumped the priority up and began work on
them for the CS 4.6 release.

During design, I realized there were issues with the way XenServer is
architected that prevented me from directly using SolidFire snapshots.

I could definitely take a SolidFire snapshot of a SolidFire volume, but
this snapshot would not be usable from XenServer if the original volume was
still in use.

Here is the gist of the problem:

When XenServer leverages an iSCSI target such as a SolidFire volume, it
applies a clustered files system to it, which they call a storage
repository (SR). An SR has an *immutable* UUID associated with it.

The virtual volume (which a VM sees as a disk) is represented by a virtual
disk image (VDI) in the SR. A VDI also has an *immutable* UUID associated
with it.

If I take a snapshot (or a clone) of the SolidFire volume and then later
try to use that snapshot from XenServer, XenServer complains that the SR on
the snapshot has a UUID that conflicts with an existing UUID.

In other words, it is not possible to use the original SR and the snapshot
of this SR from XenServer at the same time, which is critical in a cloud
environment (to enable creating templates from snapshots).

The way I have proposed circumventing this issue is not ideal, but
technically works (this code is checked into the CS 4.6 branch):

When the time comes to take a CloudStack snapshot of a CloudStack volume
that is backed by SolidFire storage via the storage plug-in, the plug-in
will create a new SolidFire volume with characteristics (size and IOPS)
equal to those of the original volume.

We then have XenServer attach to this new SolidFire volume, create a *new*
SR on it, and then copy the VDI from the source SR to the destination SR
(the new SR).

This leads to us having a copy of the VDI (a "snapshot" of sorts), but it
requires CPU cycles on the compute cluster as well as network bandwidth to
write to the SAN (thus it is slower and more resource intensive than a
SolidFire snapshot).

I spoke with Tim Mackey (who works on XenServer at Citrix) concerning this
issue before and during the CloudStack Collaboration Conference in Budapest
in November. He agreed that this is a legitimate issue with the way
XenServer is designed and could not think of a way (other than what I was
doing) to get around it in current versions of XenServer.

One thought is to have a feature added to XenServer that enables you to
change the UUID of an SR and of a VDI.

If I could do that, then I could take a SolidFire snapshot of the SolidFire
volume and issue commands to XenServer to have it change the UUIDs of the
original SR and the original VDI. I could then recored the necessary UUID
info in the CS DB.

*** End of e-mail ***

I have since investigated this on ESXi.

ESXi does have a way for us to "re-signature" a datastore, so backend
snapshots can be taken and effectively used on this hypervisor.

On Mon, Feb 16, 2015 at 8:19 AM, Logan Barfield <lb...@tqhosting.com>
wrote:

> I'm just going to stick with the qemu-img option change for RBD for
> now (which should cut snapshot time down drastically), and look
> forward to this in the future.  I'd be happy to help get this moving,
> but I'm not enough of a developer to lead the charge.
>
> As far as renaming goes, I agree that maybe backups isn't the right
> word.  That being said calling a full-sized copy of a volume a
> "snapshot" also isn't the right word.  Maybe "image" would be better?
>
> I've also got my reservations about "accounts" vs "users" (I think
> "departments" and "accounts or users" respectively is less confusing),
> but that's a different thread.
>
> Thank You,
>
> Logan Barfield
> Tranquil Hosting
>
>
> On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wi...@widodh.nl>
> wrote:
> >
> >
> > On 16-02-15 15:38, Logan Barfield wrote:
> >> I like this idea a lot for Ceph RBD.  I do think there should still be
> >> support for copying snapshots to secondary storage as needed (for
> >> transfers between zones, etc.).  I really think that this could be
> >> part of a larger move to clarify the naming conventions used for disk
> >> operations.  Currently "Volume Snapshots" should probably really be
> >> called "Backups".  So having "snapshot" functionality, and a "convert
> >> snapshot to backup/template" would be a good move.
> >>
> >
> > I fully agree that this would be a very great addition.
> >
> > I won't be able to work on this any time soon though.
> >
> > Wido
> >
> >> Thank You,
> >>
> >> Logan Barfield
> >> Tranquil Hosting
> >>
> >>
> >> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <an...@gmail.com>
> wrote:
> >>> BIG +1
> >>>
> >>> My team should submit some patch to ACS for better KVM snapshots,
> including
> >>> whole VM snapshot etc...but it's too early to give details...
> >>> best
> >>>
> >>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <an...@arhont.com>
> wrote:
> >>>
> >>>> Hello guys,
> >>>>
> >>>> I was hoping to have some feedback from the community on the subject
> of
> >>>> having an ability to keep snapshots on the primary storage where it is
> >>>> supported by the storage backend.
> >>>>
> >>>> The idea behind this functionality is to improve how snapshots are
> >>>> currently handled on KVM hypervisors with Ceph primary storage. At the
> >>>> moment, the snapshots are taken on the primary storage and being
> copied to
> >>>> the secondary storage. This method is very slow and inefficient even
> on
> >>>> small infrastructure. Even on medium deployments using snapshots in
> KVM
> >>>> becomes nearly impossible. If you have tens or hundreds concurrent
> >>>> snapshots taking place you will have a bunch of timeouts and errors,
> your
> >>>> network becomes clogged, etc. In addition, using these snapshots for
> >>>> creating new volumes or reverting back vms also slow and inefficient.
> As
> >>>> above, when you have tens or hundreds concurrent operations it will
> not
> >>>> succeed and you will have a majority of tasks with errors or timeouts.
> >>>>
> >>>> At the moment, taking a single snapshot of relatively small volumes
> (200GB
> >>>> or 500GB for instance) takes tens if not hundreds of minutes. Taking a
> >>>> snapshot of the same volume on ceph primary storage takes a few
> seconds at
> >>>> most! Similarly, converting a snapshot to a volume takes tens if not
> >>>> hundreds of minutes when secondary storage is involved; compared with
> >>>> seconds if done directly on the primary storage.
> >>>>
> >>>> I suggest that the CloudStack should have the ability to keep volume
> >>>> snapshots on the primary storage where this is supported by the
> storage.
> >>>> Perhaps having a per primary storage setting that enables this
> >>>> functionality. This will be beneficial for Ceph primary storage on KVM
> >>>> hypervisors and perhaps on XenServer when Ceph will be supported in a
> near
> >>>> future.
> >>>>
> >>>> This will greatly speed up the process of using snapshots on KVM and
> users
> >>>> will actually start using snapshotting rather than giving up with
> >>>> frustration.
> >>>>
> >>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote if
> you
> >>>> are in agreement.
> >>>>
> >>>> Thanks for your input
> >>>>
> >>>> Andrei
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>>
> >>> Andrija Panić
>



-- 
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkowski@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud
<http://solidfire.com/solution/overview/?video=play>*™*

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Logan Barfield <lb...@tqhosting.com>.
I'm just going to stick with the qemu-img option change for RBD for
now (which should cut snapshot time down drastically), and look
forward to this in the future.  I'd be happy to help get this moving,
but I'm not enough of a developer to lead the charge.

As far as renaming goes, I agree that maybe backups isn't the right
word.  That being said calling a full-sized copy of a volume a
"snapshot" also isn't the right word.  Maybe "image" would be better?

I've also got my reservations about "accounts" vs "users" (I think
"departments" and "accounts or users" respectively is less confusing),
but that's a different thread.

Thank You,

Logan Barfield
Tranquil Hosting


On Mon, Feb 16, 2015 at 10:04 AM, Wido den Hollander <wi...@widodh.nl> wrote:
>
>
> On 16-02-15 15:38, Logan Barfield wrote:
>> I like this idea a lot for Ceph RBD.  I do think there should still be
>> support for copying snapshots to secondary storage as needed (for
>> transfers between zones, etc.).  I really think that this could be
>> part of a larger move to clarify the naming conventions used for disk
>> operations.  Currently "Volume Snapshots" should probably really be
>> called "Backups".  So having "snapshot" functionality, and a "convert
>> snapshot to backup/template" would be a good move.
>>
>
> I fully agree that this would be a very great addition.
>
> I won't be able to work on this any time soon though.
>
> Wido
>
>> Thank You,
>>
>> Logan Barfield
>> Tranquil Hosting
>>
>>
>> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <an...@gmail.com> wrote:
>>> BIG +1
>>>
>>> My team should submit some patch to ACS for better KVM snapshots, including
>>> whole VM snapshot etc...but it's too early to give details...
>>> best
>>>
>>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <an...@arhont.com> wrote:
>>>
>>>> Hello guys,
>>>>
>>>> I was hoping to have some feedback from the community on the subject of
>>>> having an ability to keep snapshots on the primary storage where it is
>>>> supported by the storage backend.
>>>>
>>>> The idea behind this functionality is to improve how snapshots are
>>>> currently handled on KVM hypervisors with Ceph primary storage. At the
>>>> moment, the snapshots are taken on the primary storage and being copied to
>>>> the secondary storage. This method is very slow and inefficient even on
>>>> small infrastructure. Even on medium deployments using snapshots in KVM
>>>> becomes nearly impossible. If you have tens or hundreds concurrent
>>>> snapshots taking place you will have a bunch of timeouts and errors, your
>>>> network becomes clogged, etc. In addition, using these snapshots for
>>>> creating new volumes or reverting back vms also slow and inefficient. As
>>>> above, when you have tens or hundreds concurrent operations it will not
>>>> succeed and you will have a majority of tasks with errors or timeouts.
>>>>
>>>> At the moment, taking a single snapshot of relatively small volumes (200GB
>>>> or 500GB for instance) takes tens if not hundreds of minutes. Taking a
>>>> snapshot of the same volume on ceph primary storage takes a few seconds at
>>>> most! Similarly, converting a snapshot to a volume takes tens if not
>>>> hundreds of minutes when secondary storage is involved; compared with
>>>> seconds if done directly on the primary storage.
>>>>
>>>> I suggest that the CloudStack should have the ability to keep volume
>>>> snapshots on the primary storage where this is supported by the storage.
>>>> Perhaps having a per primary storage setting that enables this
>>>> functionality. This will be beneficial for Ceph primary storage on KVM
>>>> hypervisors and perhaps on XenServer when Ceph will be supported in a near
>>>> future.
>>>>
>>>> This will greatly speed up the process of using snapshots on KVM and users
>>>> will actually start using snapshotting rather than giving up with
>>>> frustration.
>>>>
>>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you
>>>> are in agreement.
>>>>
>>>> Thanks for your input
>>>>
>>>> Andrei
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Andrija Panić

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Wido den Hollander <wi...@widodh.nl>.

On 16-02-15 15:38, Logan Barfield wrote:
> I like this idea a lot for Ceph RBD.  I do think there should still be
> support for copying snapshots to secondary storage as needed (for
> transfers between zones, etc.).  I really think that this could be
> part of a larger move to clarify the naming conventions used for disk
> operations.  Currently "Volume Snapshots" should probably really be
> called "Backups".  So having "snapshot" functionality, and a "convert
> snapshot to backup/template" would be a good move.
> 

I fully agree that this would be a very great addition.

I won't be able to work on this any time soon though.

Wido

> Thank You,
> 
> Logan Barfield
> Tranquil Hosting
> 
> 
> On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <an...@gmail.com> wrote:
>> BIG +1
>>
>> My team should submit some patch to ACS for better KVM snapshots, including
>> whole VM snapshot etc...but it's too early to give details...
>> best
>>
>> On 16 February 2015 at 13:01, Andrei Mikhailovsky <an...@arhont.com> wrote:
>>
>>> Hello guys,
>>>
>>> I was hoping to have some feedback from the community on the subject of
>>> having an ability to keep snapshots on the primary storage where it is
>>> supported by the storage backend.
>>>
>>> The idea behind this functionality is to improve how snapshots are
>>> currently handled on KVM hypervisors with Ceph primary storage. At the
>>> moment, the snapshots are taken on the primary storage and being copied to
>>> the secondary storage. This method is very slow and inefficient even on
>>> small infrastructure. Even on medium deployments using snapshots in KVM
>>> becomes nearly impossible. If you have tens or hundreds concurrent
>>> snapshots taking place you will have a bunch of timeouts and errors, your
>>> network becomes clogged, etc. In addition, using these snapshots for
>>> creating new volumes or reverting back vms also slow and inefficient. As
>>> above, when you have tens or hundreds concurrent operations it will not
>>> succeed and you will have a majority of tasks with errors or timeouts.
>>>
>>> At the moment, taking a single snapshot of relatively small volumes (200GB
>>> or 500GB for instance) takes tens if not hundreds of minutes. Taking a
>>> snapshot of the same volume on ceph primary storage takes a few seconds at
>>> most! Similarly, converting a snapshot to a volume takes tens if not
>>> hundreds of minutes when secondary storage is involved; compared with
>>> seconds if done directly on the primary storage.
>>>
>>> I suggest that the CloudStack should have the ability to keep volume
>>> snapshots on the primary storage where this is supported by the storage.
>>> Perhaps having a per primary storage setting that enables this
>>> functionality. This will be beneficial for Ceph primary storage on KVM
>>> hypervisors and perhaps on XenServer when Ceph will be supported in a near
>>> future.
>>>
>>> This will greatly speed up the process of using snapshots on KVM and users
>>> will actually start using snapshotting rather than giving up with
>>> frustration.
>>>
>>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you
>>> are in agreement.
>>>
>>> Thanks for your input
>>>
>>> Andrei
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>>
>> Andrija Panić

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Logan Barfield <lb...@tqhosting.com>.
I like this idea a lot for Ceph RBD.  I do think there should still be
support for copying snapshots to secondary storage as needed (for
transfers between zones, etc.).  I really think that this could be
part of a larger move to clarify the naming conventions used for disk
operations.  Currently "Volume Snapshots" should probably really be
called "Backups".  So having "snapshot" functionality, and a "convert
snapshot to backup/template" would be a good move.

Thank You,

Logan Barfield
Tranquil Hosting


On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic <an...@gmail.com> wrote:
> BIG +1
>
> My team should submit some patch to ACS for better KVM snapshots, including
> whole VM snapshot etc...but it's too early to give details...
> best
>
> On 16 February 2015 at 13:01, Andrei Mikhailovsky <an...@arhont.com> wrote:
>
>> Hello guys,
>>
>> I was hoping to have some feedback from the community on the subject of
>> having an ability to keep snapshots on the primary storage where it is
>> supported by the storage backend.
>>
>> The idea behind this functionality is to improve how snapshots are
>> currently handled on KVM hypervisors with Ceph primary storage. At the
>> moment, the snapshots are taken on the primary storage and being copied to
>> the secondary storage. This method is very slow and inefficient even on
>> small infrastructure. Even on medium deployments using snapshots in KVM
>> becomes nearly impossible. If you have tens or hundreds concurrent
>> snapshots taking place you will have a bunch of timeouts and errors, your
>> network becomes clogged, etc. In addition, using these snapshots for
>> creating new volumes or reverting back vms also slow and inefficient. As
>> above, when you have tens or hundreds concurrent operations it will not
>> succeed and you will have a majority of tasks with errors or timeouts.
>>
>> At the moment, taking a single snapshot of relatively small volumes (200GB
>> or 500GB for instance) takes tens if not hundreds of minutes. Taking a
>> snapshot of the same volume on ceph primary storage takes a few seconds at
>> most! Similarly, converting a snapshot to a volume takes tens if not
>> hundreds of minutes when secondary storage is involved; compared with
>> seconds if done directly on the primary storage.
>>
>> I suggest that the CloudStack should have the ability to keep volume
>> snapshots on the primary storage where this is supported by the storage.
>> Perhaps having a per primary storage setting that enables this
>> functionality. This will be beneficial for Ceph primary storage on KVM
>> hypervisors and perhaps on XenServer when Ceph will be supported in a near
>> future.
>>
>> This will greatly speed up the process of using snapshots on KVM and users
>> will actually start using snapshotting rather than giving up with
>> frustration.
>>
>> I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you
>> are in agreement.
>>
>> Thanks for your input
>>
>> Andrei
>>
>>
>>
>>
>>
>
>
> --
>
> Andrija Panić

Re: Your thoughts on using Primary Storage for keeping snapshots

Posted by Andrija Panic <an...@gmail.com>.
BIG +1

My team should submit some patch to ACS for better KVM snapshots, including
whole VM snapshot etc...but it's too early to give details...
best

On 16 February 2015 at 13:01, Andrei Mikhailovsky <an...@arhont.com> wrote:

> Hello guys,
>
> I was hoping to have some feedback from the community on the subject of
> having an ability to keep snapshots on the primary storage where it is
> supported by the storage backend.
>
> The idea behind this functionality is to improve how snapshots are
> currently handled on KVM hypervisors with Ceph primary storage. At the
> moment, the snapshots are taken on the primary storage and being copied to
> the secondary storage. This method is very slow and inefficient even on
> small infrastructure. Even on medium deployments using snapshots in KVM
> becomes nearly impossible. If you have tens or hundreds concurrent
> snapshots taking place you will have a bunch of timeouts and errors, your
> network becomes clogged, etc. In addition, using these snapshots for
> creating new volumes or reverting back vms also slow and inefficient. As
> above, when you have tens or hundreds concurrent operations it will not
> succeed and you will have a majority of tasks with errors or timeouts.
>
> At the moment, taking a single snapshot of relatively small volumes (200GB
> or 500GB for instance) takes tens if not hundreds of minutes. Taking a
> snapshot of the same volume on ceph primary storage takes a few seconds at
> most! Similarly, converting a snapshot to a volume takes tens if not
> hundreds of minutes when secondary storage is involved; compared with
> seconds if done directly on the primary storage.
>
> I suggest that the CloudStack should have the ability to keep volume
> snapshots on the primary storage where this is supported by the storage.
> Perhaps having a per primary storage setting that enables this
> functionality. This will be beneficial for Ceph primary storage on KVM
> hypervisors and perhaps on XenServer when Ceph will be supported in a near
> future.
>
> This will greatly speed up the process of using snapshots on KVM and users
> will actually start using snapshotting rather than giving up with
> frustration.
>
> I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you
> are in agreement.
>
> Thanks for your input
>
> Andrei
>
>
>
>
>


-- 

Andrija Panić