Posted to dev@vcl.apache.org by Brian Bouterse <bm...@ncsu.edu> on 2009/03/30 13:50:12 UTC

ESX Provisioning Module Update (and a few concerns)

I wanted to let folks know the ESX provisioning module should be  
finished, or at least in a beta form.  I have closed the JIRA ticket  
(VCL-29) corresponding to the creation of this module.  Here is a  
review of the new functionality:

Manages multiple ESX or ESX 3i hypervisors
Deploys virtual machines onto these hypervisors
Supports a virtual machine capture routine to allow for updating a VM
image, and the creation of "new" derivative images

One last improvement will be refactoring the way our module
gathers the private IP address of the VM that is being provisioned.
The change includes deprecating the "watching of the arp table" in
favor of monitoring the dhcpd.leases file.
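
A minimal sketch of the lease-file approach, in Python for illustration
only (this is not the module's actual code).  It assumes the usual ISC
dhcpd.leases layout of "lease <ip> { ... hardware ethernet <mac>; ... }"
blocks, and that later blocks for the same MAC supersede earlier ones.
The function name and the sample data are hypothetical.  It ignores
binding state and lease expiry, and would be confused by a "}" inside a
quoted option value.

```python
import re

# One lease block: "lease <ip> { ... }" (assumed ISC dhcpd.leases layout).
LEASE_RE = re.compile(r"lease\s+(\d{1,3}(?:\.\d{1,3}){3})\s*\{(.*?)\}", re.DOTALL)
MAC_RE = re.compile(r"hardware\s+ethernet\s+([0-9A-Fa-f:]{17})\s*;")


def find_ip_for_mac(leases_text, mac):
    """Return the IP of the last lease block recorded for `mac`, or None."""
    wanted = mac.lower()
    found = None
    for ip, body in LEASE_RE.findall(leases_text):
        match = MAC_RE.search(body)
        if match and match.group(1).lower() == wanted:
            found = ip  # the file is append-only, so later blocks win
    return found


# Hypothetical sample: the same VM NIC was leased two addresses in turn.
SAMPLE = """
lease 10.0.0.21 {
  starts 1 2009/03/30 13:40:02;
  ends 1 2009/03/30 21:40:02;
  hardware ethernet 00:50:56:00:00:0a;
}
lease 10.0.0.35 {
  starts 1 2009/03/30 13:45:10;
  ends 1 2009/03/30 21:45:10;
  hardware ethernet 00:50:56:00:00:0a;
}
"""

print(find_ip_for_mac(SAMPLE, "00:50:56:00:00:0A"))  # prints 10.0.0.35
```

Compared with watching the arp table, this only needs read access to a
file on the DHCP server and does not depend on the VM having sent
traffic that the management node happened to see.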

While VCL now supports ESX/ESX 3i netboot based virtual machines, the  
VCL architecture presents a lot of real challenges for making this a  
scalable solution, and I'd like to identify and discuss a few points/ 
concerns here.

VCL requires an entry in the computers table for each VM, and this  
entry needs to be tied to a vmhost.  By hard selecting the virtual  
machine entries in the computer table (a VM "slot") up front, the  
decision about where to place the next virtual machine isn't handled  
effectively.  Each VM "slot" gets statically assigned physical  
characteristics at creation time.  This very quickly creates a  
situation where there is space in the datacenter for a particular VM  
on one hypervisor or another, but VCL can't figure it out because the  
large RAM slots got used first for other images, and now the image in  
question can't find a slot to meet its meta-data requirements.  VCL
will incorrectly report that there is not space in the infrastructure,  
when there really is.  This is bad.
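
To make the failure mode concrete, here is a small Python sketch.  The
Slot class, the host name, the RAM figures and the "first free slot that
is big enough" rule are all assumptions made for illustration, not a
description of VCL's actual scheduler.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    host: str
    ram_mb: int
    in_use: bool = False


def reserve_slot(slots, needed_ram_mb):
    """Take the first free slot that is large enough (no best-fit logic)."""
    for slot in slots:
        if not slot.in_use and slot.ram_mb >= needed_ram_mb:
            slot.in_use = True
            return slot
    return None


# One hypothetical host with 4096 MB, pre-carved into fixed slots.
slots = [Slot("esx1", 2048), Slot("esx1", 1024), Slot("esx1", 1024)]

small = reserve_slot(slots, 512)    # lands in the 2048 MB slot
large = reserve_slot(slots, 2048)   # no free slot is big enough any more

ram_actually_used = 512
ram_free_on_host = 4096 - ram_actually_used

print(small.ram_mb, large, ram_free_on_host)  # prints 2048 None 3584
```

The second request is refused even though the host still has 3584 MB
free, because the only 2048 MB slot was spent on a 512 MB image.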

Also, it makes the setup much more difficult since an average, modern  
blade can run 20 - 30 VMs, and if you manage an entire blade center  
you're manually creating 350 computer table entries.  Each entry  
requires multiple, manual updates to the database.  This is not a  
tractable solution, and needs to be addressed.

As a possible solution (or part of one), one major moving part is the
placement decision for a particular reservation (that is, which
hypervisor the VM will be reserved on).  Placement today in VCL is
decided in the front-end without asking the hypervisors whether they
have the capacity to accept the next VM (or cluster of VMs) in
question.  One possible way around this is to create an ESX placement
controller module which determines where to place things.  This module
would be part of the VCL backend (although it could be called from the
frontend).  The ESX provisioning module already authors each VMX file
on the fly based on meta-data from the database, so it is dynamic
enough to support this.
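
A rough sketch of what the core of such a placement controller could
look like, in Python for illustration.  The function name, the dict of
free RAM per host and the "most free RAM wins" rule are assumptions.
How each hypervisor would report its free capacity is deliberately left
open here.

```python
def choose_host(free_ram_by_host, needed_ram_mb):
    """Pick the host with the most free RAM that can still fit the VM.

    free_ram_by_host: dict of host name -> free RAM in MB, as reported
    by each hypervisor at request time (the reporting path is assumed).
    Returns a host name, or None when no host has room.
    """
    candidates = [
        (free, host)
        for host, free in free_ram_by_host.items()
        if free >= needed_ram_mb
    ]
    if not candidates:
        return None
    # max() on (free, host) tuples: most free RAM wins, name breaks ties.
    return max(candidates)[1]


# Hypothetical capacity reports from three hypervisors.
reported = {"esx1": 1536, "esx2": 6144, "esx3": 3072}
print(choose_host(reported, 2048))  # prints esx2
print(choose_host(reported, 8192))  # prints None
```

Because the decision is made from live capacity rather than from
pre-assigned slots, a request is only refused when no hypervisor can
really hold the VM.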

Best,
Brian


Brian Bouterse
Secure Open Systems Initiative
919.698.8796





Re: ESX Provisioning Module Update (and a few concerns)

Posted by Josh Thompson <jo...@ncsu.edu>.

On Monday March 30, 2009, Brian Bouterse wrote:
> While VCL now supports ESX/ESX 3i netboot based virtual machines, the
> VCL architecture presents a lot of real challenges for making this a
> scalable solution, and I'd like to identify and discuss a few points/
> concerns here.
>
> VCL requires an entry in the computers table for each VM, and this
> entry needs to be tied to a vmhost.  By hard selecting the virtual
> machine entries in the computer table (a VM "slot") up front, the
> decision about where to place the next virtual machine isn't handled
> effectively.  Each VM "slot" gets statically assigned physical
> characteristics at creation time.  This very quickly creates a
> situation where there is space in the datacenter for a particular VM
> on one hypervisor or another, but VCL can't figure it out because the
> large RAM slots got used first for other images, and now the image in
> question can't find a slot to meet its meta-data requirements.  VCL
> will incorrectly report that there is not space in the infrastructure,
> when there really is.  This is bad.
>
> Also, it makes the setup much more difficult since an average, modern
> blade can run 20 - 30 VMs, and if you manage an entire blade center
> you're manually creating 350 computer table entries.  Each entry
> requires multiple, manual updates to the database.  This is not a
> tractable solution, and needs to be addressed.

To summarize what you've stated: the scheduler in the web frontend is not
hypervisor aware.  The reason goes way back to a time when Aaron was not
backlogged in development (on the backend) and I was (on the frontend).
Aaron came up with a way to add support for VMware as a backend-only
implementation (since I didn't have time to add it to the frontend).  A few
months later, we designed a really good solution for dynamically managing
hypervisors that would handle all of the virtual machine creation in the
computer table, automatically adding/removing host servers and reassigning
VMs to hosts as needed.  Unfortunately, the tide turned: Aaron became
swamped, and I had more development time.  I was able to implement the
frontend part of this (though it never made it into production code), but
Aaron never had time to implement the backend part.  Since we had something
working that met our needs, other things became a priority.

Aaron looked just yesterday, and he still has the notes on what we designed
(back in 2007).  I still have a copy of what I coded around here somewhere.
We'll see about getting those notes posted online and possibly creating a
JIRA issue so it can be placed in our release schedule.

> As a possible solution (or part of one), one major moving part is the
> placement decision for a particular reservation (tantamount to which
> hypervisor this VM will be reserved on).  Placement today in VCL is
> decided in the front-end without asking the hypervisors what their
> capabilities are of accepting the next VM (or cluster of VMs) in
> question.  One possible way around this is to create an ESX placement
> controller module which determines where to place things.  This module
> would be part of the VCL backend (although it could be called from the
> frontend).  The ESX provisioning module authors each VMX file on the
> fly based on meta-data from the database, so it is already dynamic
> enough.

This concept is tricky.  Currently, there is no mechanism for frontend 
initiated communication directly with the backend.  I agree with your point 
that certain provisioning engines are aware of their own capacity, but can't 
easily keep that capacity in the database such that it is easily accessible 
by the frontend.  However, it's difficult to make a scalable solution where
there could be many provisioning servers that would have to be quickly polled
by the frontend before reporting to a user whether or not their request can
be fulfilled.  I think for the near term, the solution mentioned above will
meet most of our needs, and we can address the issue of the frontend having
to get capacity information directly from the backend further down the
road.

Josh
--
-------------------------------
Josh Thompson
Systems Programmer
Virtual Computing Lab (VCL)
North Carolina State University

Josh_Thompson@ncsu.edu
919-515-5323

my GPG/PGP key can be found at pgp.mit.edu

Re: ESX Provisioning Module Update (and a few concerns)

Posted by John Bass <jc...@gmail.com>.
Hi All,

Instead of a 'slot', can there be a notion of available resources for a
computer (where computer in this case is a hypervisor)? This would allow VCL
to 'pack' VMs onto a hypervisor based on the worst-case resource parameters
for a VM image.

This also leads to the notion of a placement module outside the front end
web stuff, so that the decision-making algorithm for placement of VMs (or
bare-metal images) is more extensible/configurable.
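
As a rough illustration of both ideas, here is a Python sketch.  The
Hypervisor class, the two policy functions and all of the figures are
hypothetical.  Each hypervisor carries a running count of free
resources instead of fixed slots, and the placement rule is passed in
as a function so it can be swapped without touching the rest.

```python
from dataclasses import dataclass


@dataclass
class Hypervisor:
    name: str
    free_ram_mb: int
    free_cpus: int


def first_fit(hosts, ram_mb, cpus):
    """Policy: first host with enough RAM and CPUs."""
    for h in hosts:
        if h.free_ram_mb >= ram_mb and h.free_cpus >= cpus:
            return h
    return None


def tightest_fit(hosts, ram_mb, cpus):
    """Policy: fitting host that currently has the least free RAM."""
    fitting = [h for h in hosts if h.free_ram_mb >= ram_mb and h.free_cpus >= cpus]
    return min(fitting, key=lambda h: h.free_ram_mb, default=None)


def place(hosts, ram_mb, cpus, policy=first_fit):
    """Reserve the image's worst-case resources on the host `policy` picks."""
    host = policy(hosts, ram_mb, cpus)
    if host is not None:
        host.free_ram_mb -= ram_mb
        host.free_cpus -= cpus
    return host


hosts = [Hypervisor("esx1", 8192, 8), Hypervisor("esx2", 4096, 8)]
a = place(hosts, 2048, 1, policy=tightest_fit)  # esx2 has the least free RAM
b = place(hosts, 4096, 2, policy=tightest_fit)  # esx2 is down to 2048, so esx1
print(a.name, b.name, hosts[0].free_ram_mb, hosts[1].free_ram_mb)
# prints esx2 esx1 4096 2048
```

Packing small images onto the fullest host that still fits them leaves
the larger host free for the larger image, which is the case the fixed
slots handle badly.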

Any thoughts?

John Bass
john_bass@ncsu.edu
www.cnl.ncsu.edu
(919) 515-0154

