Posted to dev@libcloud.apache.org by Paul Querna <pa...@querna.org> on 2010/04/06 05:04:41 UTC

[libcloud] Integrating Images for Providers and End Users

Hi Everyone,

Today at the Cloudkick office, I met with a dozen people involved in
the Cloud Hackers group <http://twitter.com/cloudhackers>.  Included
in the group were Sebastien and Solomon, who wrote dotCloud's
Cloudlets.

We had many discussions about cloud image formats and Cloudlets,
including a cool demo of Cloudlets in use.

What I would like to focus on in this mail is the design of how
Cloudlet images would interact with Providers, and how users would be
able to create their own images for use on multiple providers.

I believe the user-facing interface for Cloudlets on the command line
is already very useful for anyone managing images; it provides
versioning and an easy way to describe how the image should be
rendered.

The Cloudlet image format, while still in flux in some areas, has two
main parts: the manifest file and a filesystem.  The manifest is a
JSON file that specifies many attributes of the image, from how it is
booted, to which files are templates, to what the inputs to those
templates are.
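
To make that concrete, here is a rough sketch of what such a manifest
might contain, written as a Python dict so it can carry comments.
Every field name below is invented for illustration; none of this is
the actual Cloudlets schema:

    # Illustrative only: these fields are guesses, not the real
    # Cloudlets manifest schema.
    manifest = {
        "name": "base-ubuntu",
        "version": "1.0",
        # how the image boots
        "boot": {"kernel": "vmlinuz-2.6.32", "ramdisk": "initrd.img"},
        # the filesystem ships separately, as a tarball the manifest
        # points at
        "filesystem": "http://mirrors.cloudlets.com/base-ubuntu.tar.gz",
        # files to be rendered as templates, and the inputs they expect
        "templates": ["/etc/network/interfaces", "/etc/hostname"],
        "args": {"ip_address": "string", "hostname": "string"},
    }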

Fundamentally, I believe most providers do not want a complicated
system tied into booting nodes -- any design needs to let them
leverage their existing methods of distributing images, so directly
extending the create_node API in every provider to reference a URL, or
anything else down that path, would be difficult.

I believe the best way for a provider to integrate would be to add a
create_image call to their API, which takes a manifest file as the
payload.  The provider would then return an ID for this image to the
user.  Inside the manifest file would be a reference to a URL for a
tarball containing the filesystem.  (There is some discussion about
the details of what is included in the create_image call -- do you PUT
the whole tarball, reference it, encrypt it, sign it, etc.? -- but IMO
these are details to be worked out, not a fundamental change in
architecture.)
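
As a sketch of the wire interaction -- the /images path and the
response shape are both assumptions, since none of this is specified
yet -- the client side could be as simple as:

    import json
    import urllib2

    def create_image(endpoint, manifest):
        """POST a Cloudlets manifest to a provider, get back an image ID.

        The '/images' path and the 'image_id' response field are
        invented for this sketch; they are not part of any real
        provider API.
        """
        req = urllib2.Request(endpoint + '/images', json.dumps(manifest),
                              {'Content-Type': 'application/json'})
        return json.loads(urllib2.urlopen(req).read())['image_id']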

The key piece of software that would need to be constructed is a
Cloudlets rendering service.  It would securely take the manifest
file, an output path, an output provider, and an output type, and
produce a bootable image.  By making the outputs configurable, you
could make it write OVF to a SAN, or a Xen image to an HTTP URL --
whatever the hosting provider needs.  This must be pluggable, because
I believe it will vary greatly between deployments, much as our
libcloud drivers already vary from provider to provider.
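
One possible shape for that plug point -- all class and registry names
here are invented -- would be a small writer interface that each
deployment implements for the format/destination pairs it supports:

    # Sketch of the plug point; the real conversion work is elided.
    class OutputWriter(object):
        """Converts a rendered filesystem for one format/destination."""
        def write(self, fs_dir):
            raise NotImplementedError

    class OVFToSAN(OutputWriter):
        def __init__(self, san_path):
            self.san_path = san_path
        def write(self, fs_dir):
            # convert fs_dir to OVF and copy it onto the SAN (elided)
            return self.san_path + '/image.ovf'

    class XenToHTTP(OutputWriter):
        def __init__(self, base_url):
            self.base_url = base_url
        def write(self, fs_dir):
            # build a Xen image and publish it over HTTP (elided)
            return self.base_url + '/image.img'

    # each deployment registers only the writers it actually supports
    OUTPUT_WRITERS = {'ovf+san': OVFToSAN, 'xen+http': XenToHTTP}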

The rendering service would take care of downloading the tarball,
caching it, converting it to the output format, and saving it to the
output location.  At render time, the provider would be able to
pre-set much of the metadata used by the image, such as its IP address
configuration.  Once the rendering service outputs the image, it would
be the responsibility of the hosting provider to close the loop:
provide a way for the image to show up in their list_images results,
and of course for it to be used as the image parameter to create_node.
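
Stitched together with the writer sketch above, the pipeline might
look roughly like this; the cache location, the helper names, and the
use of string.Template for rendering are all illustrative:

    import os
    import tarfile
    import urllib2
    from string import Template

    CACHE_DIR = '/var/cache/cloudlets'  # illustrative location

    def fetch_cached(url):
        """Download the filesystem tarball once; reuse it afterwards."""
        if not os.path.isdir(CACHE_DIR):
            os.makedirs(CACHE_DIR)
        local = os.path.join(CACHE_DIR, os.path.basename(url))
        if not os.path.exists(local):
            open(local, 'wb').write(urllib2.urlopen(url).read())
        return local

    def render(manifest, writer, metadata):
        """Fetch, unpack, fill templates, hand off to an OutputWriter."""
        fs_dir = os.path.join(CACHE_DIR, manifest['name'])
        tarfile.open(fetch_cached(manifest['filesystem'])).extractall(fs_dir)
        # metadata is where the provider pre-sets things like the
        # node's IP address configuration
        for rel in manifest['templates']:
            path = os.path.join(fs_dir, rel.lstrip('/'))
            text = Template(open(path).read()).substitute(metadata)
            open(path, 'w').write(text)
        return writer.write(fs_dir)  # convert + store, per the plugin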

Things like a mirror network built around the images can easily be
done; the rendering service could know to rewrite mirrors.cloudlets.com
URLs to a local mirror at each provider, so if we produced a set of
golden base images, or even higher-level services, it would be easy to
let each provider mirror them for faster service to their users.
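
That rewrite is only a few lines inside the rendering service; the
local mirror URL here is a stand-in:

    from urlparse import urlparse

    LOCAL_MIRROR = 'http://mirror.internal.example-provider.net'

    def localize(url):
        """Point mirrors.cloudlets.com URLs at the provider's mirror."""
        parts = urlparse(url)
        if parts.netloc == 'mirrors.cloudlets.com':
            return LOCAL_MIRROR + parts.path
        return url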

Sebastien and Solomon are discussing how to implement a demo of the
Cloudlets rendering service, built on top of Amazon EC2.  You would
make a REST call to create_image, it would build an AMI, store it on
S3, and then return an AMI ID that you could use to boot the image.
As long as we built the outputs and formats in pluggable ways,
building it on top of Amazon would make it easier to verify that the
basic ideas all work, and it would serve as an example of how hosting
providers could implement it themselves.
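
Closing the loop from the user's side, and assuming the AMI ID came
back from a create_image call like the sketch above, booting it with
libcloud's existing API would look roughly like this (if I have the
import paths right in the current tree):

    from libcloud.types import Provider
    from libcloud.providers import get_driver
    from libcloud.base import NodeImage

    Driver = get_driver(Provider.EC2)
    conn = Driver('access key id', 'secret key')

    # 'ami-12345678' stands in for whatever create_image returned
    image = NodeImage(id='ami-12345678', name=None, driver=conn)
    node = conn.create_node(name='cloudlets-demo', image=image,
                            size=conn.list_sizes()[0])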

I would really like some feedback on the general idea from the
technical side of the hosting providers.  I am happy to explain more
of the idea, and hopefully will find time tomorrow to draw a picture
of it.  I don't know enough about how everyone is distributing images
internally today -- are you using SANs, something over HTTP, NFS, or
something else completely? -- but I think as long as we build it from
easily replaceable modules, we can make it work for everyone.

Thoughts?

Thanks,

Paul