Posted to dev@cloudstack.apache.org by Wido den Hollander <wi...@widodh.nl> on 2012/07/02 14:59:04 UTC

Re: First review of RBD support for primary storage

Hi,

On 29-06-12 17:59, Wido den Hollander wrote:
> Now, the RBD support for primary storage has limitations:
>
> - It only works with KVM.
>
> - You are NOT able to snapshot RBD volumes. This is because CloudStack
> wants to back up snapshots to secondary storage and uses 'qemu-img
> convert' for that, which doesn't work with RBD and would also be very
> inefficient.
>
> RBD supports native snapshots inside the Ceph cluster. RBD disks also
> have the potential to reach very large sizes; disks of 1TB won't be the
> exception, and copying them would stress your network heavily. I'm
> thinking about implementing "internal snapshots", but that is step #2.
> For now, no snapshots.
>
> - You are able to create a template from an RBD volume, but creating a
> new instance with RBD storage from a template is still hit-and-miss.
> Working on that one.
>

I just pushed a fix for creating instances from a template. That should 
work now!

Wido

RE: First review of RBD support for primary storage

Posted by Edison Su <Ed...@citrix.com>.

> -----Original Message-----
> From: Chiradeep Vittal [mailto:Chiradeep.Vittal@citrix.com]
> Sent: Thursday, July 05, 2012 3:54 PM
> To: CloudStack DeveloperList
> Subject: Re: First review of RBD support for primary storage
> 
> I took a first glance at this. Really pleased about this feature.
> EBS-like scalable primary storage is within reach!
> 
> A few comments:
>  1. I see quite a few blocks of code ( > 20 times?) that are like
>      if (pool.getType() == StoragePoolType.RBD)
>     I realize that there is existing code that does these kinds of
> checks as well. To me this can be solved simply by the "chain of
> responsibility" pattern: you hand over the operation to a configured
> chain of handlers. The first handler (usually) that says it can handle
> it, terminates the chain.

It's on my to-do list: refactor the storage code to make adding a new storage type to CloudStack much easier.

>  2. 'user_info' can actually be pushed into the 'storage_pool_details'
> table. Generally we avoid modifying existing tables if we can.
>  3. Copying a snapshot to secondary storage is desirable: to be
> consistent with other storage types, to be able to instantiate new
> volumes in other zones (when S3 support is available across the
> region). I'd like to understand the blockers here.


Re: First review of RBD support for primary storage

Posted by Wido den Hollander <wi...@widodh.nl>.
First: Thanks for reviewing!

On 07/06/2012 12:54 AM, Chiradeep Vittal wrote:
> I took a first glance at this. Really pleased about this feature. EBS-like
> scalable primary storage is within reach!
>
> A few comments:
>   1. I see quite a few blocks of code ( > 20 times?) that are like
>       if (pool.getType() == StoragePoolType.RBD)
>      I realize that there is existing code that does these kinds of checks
> as well. To me this can be solved simply by the "chain of responsibility"
> pattern: you hand over the operation to a configured chain of handlers.
> The first handler (usually) that says it can handle it, terminates the
> chain.

Yes, that would indeed be better. The current code is not very flexible: 
it assumes everything is a regular file or block device, which is not 
the case with RBD.

In the current code I saw no other way than checking for the storage 
pool type multiple times.
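To illustrate the pattern Chiradeep suggests, here is a minimal sketch (the adaptor and method names are invented for this example, not CloudStack's actual classes): each handler declares whether it can serve a given pool type, and the first one that can terminates the chain, replacing scattered `if (pool.getType() == StoragePoolType.RBD)` checks.

```java
import java.util.List;

enum StoragePoolType { Filesystem, NetworkFilesystem, RBD }

interface StorageAdaptor {
    boolean canHandle(StoragePoolType type);
    String createVolume(String name);  // simplified stand-in for a real storage operation
}

class RbdStorageAdaptor implements StorageAdaptor {
    public boolean canHandle(StoragePoolType type) { return type == StoragePoolType.RBD; }
    public String createVolume(String name) { return "rbd volume " + name; }
}

class FileStorageAdaptor implements StorageAdaptor {
    public boolean canHandle(StoragePoolType type) { return type != StoragePoolType.RBD; }
    public String createVolume(String name) { return "file volume " + name; }
}

class StorageChain {
    private final List<StorageAdaptor> handlers;
    StorageChain(List<StorageAdaptor> handlers) { this.handlers = handlers; }

    // The first handler that says it can handle the pool type terminates the chain.
    String createVolume(StoragePoolType type, String name) {
        for (StorageAdaptor h : handlers) {
            if (h.canHandle(type)) return h.createVolume(name);
        }
        throw new IllegalArgumentException("no handler for " + type);
    }
}

public class ChainExample {
    public static void main(String[] args) {
        StorageChain chain = new StorageChain(
                List.of(new RbdStorageAdaptor(), new FileStorageAdaptor()));
        System.out.println(chain.createVolume(StoragePoolType.RBD, "disk1"));
        System.out.println(chain.createVolume(StoragePoolType.Filesystem, "disk2"));
    }
}
```

A new storage type would then only need one new adaptor registered in the chain, instead of another type check at every call site.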

>   2. 'user_info' can actually be pushed into the 'storage_pool_details'
> table. Generally we avoid modifying existing tables if we can.

I get that, but user_info is something that comes from java.net.URI, 
just like host, port and name. So I figured that user_info was in the 
right place in storage_pool.
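To illustrate the point: java.net.URI exposes user_info alongside host, port and path. A small example with a made-up pool URL (the exact URL format CloudStack uses for RBD pools is not shown here):

```java
import java.net.URI;

public class PoolUriExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical RBD pool URL; "admin:secret" is the user_info component
        URI uri = new URI("rbd://admin:secret@stack02.ceph.widodh.nl:6789/cloudstack");
        System.out.println("user_info: " + uri.getUserInfo());            // admin:secret
        System.out.println("host:      " + uri.getHost());                // stack02.ceph.widodh.nl
        System.out.println("port:      " + uri.getPort());                // 6789
        System.out.println("pool:      " + uri.getPath().substring(1));   // cloudstack
    }
}
```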

>   3. Copying a snapshot to secondary storage is desirable: to be consistent
> with other storage types, to be able to instantiate new volumes in other
> zones (when S3 support is available across the region). I'd like to
> understand the blockers here.

You can't copy a snapshot out of an RBD image to another destination; 
this is not supported by qemu-img.

root@stack02:~# qemu-img convert -f raw -O qcow2 -s wido \
    rbd:rbd/cloudstack:mon_host=stack02.ceph.widodh.nl:auth_supported=none \
    /root/wido-snapshot.qcow2
qemu-img: Failed to load snapshot
root@stack02:~#

Here I'm trying to extract the snapshot "wido" out of the image 
"cloudstack" and copy it to a qcow2 image.

I prefer not to use the "rbd" CLI tool since that would bring another 
dependency into the picture.
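For comparison, the snapshot export that qemu-img refuses above is what the standalone rbd tool does natively; a hypothetical invocation, using the image and snapshot names from the transcript, would look like:

```shell
# Export snapshot "wido" of image "cloudstack" (pool "rbd") to a local raw file.
# The data still travels over the network, so the size concern below remains.
rbd export rbd/cloudstack@wido /root/wido-snapshot.raw
```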

This could probably be fixed inside qemu-img, but that would involve 
more patching to be done.

However, there is a warning here: RBD disks can become large, very 
large. In public clouds there should be a way for administrators to 
disable this feature, otherwise users could start snapshotting 5TB 
disks and copying them to the secondary storage.

That would eat CPU and network capacity.

Wido





Re: First review of RBD support for primary storage

Posted by Chiradeep Vittal <Ch...@citrix.com>.
I took a first glance at this. Really pleased about this feature. EBS-like
scalable primary storage is within reach!

A few comments:
 1. I see quite a few blocks of code ( > 20 times?) that are like
     if (pool.getType() == StoragePoolType.RBD)
    I realize that there is existing code that does these kinds of checks
as well. To me this can be solved simply by the "chain of responsibility"
pattern: you hand over the operation to a configured chain of handlers.
The first handler (usually) that says it can handle it, terminates the
chain. 
 2. 'user_info' can actually be pushed into the 'storage_pool_details'
table. Generally we avoid modifying existing tables if we can.
 3. Copying a snapshot to secondary storage is desirable: to be consistent
with other storage types, to be able to instantiate new volumes in other
zones (when S3 support is available across the region). I'd like to
understand the blockers here.

