Posted to dev@vcl.apache.org by Aaron Coburn <ac...@amherst.edu> on 2013/01/10 22:35:50 UTC

Datastore I/O latency

Hi,

The VCL currently allows VM Host profiles to identify a single "VM Working Directory Path", which can be either a dedicated or shared disk. All running VMs on a given host will use this datastore path for I/O operations. 

In our system, we are using a clustered VM host (VMware vCenter) infrastructure connected to a shared network (SAN) storage array. In the VM host profile, the "Virtual Disk Path" and "VM Working Directory Path" use separate datastores, each of which uses a distinct storage processor in the SAN. Even so, all running VMs in the cluster use the same datastore for their working directories.

We have recently been encountering significant I/O congestion on the host bus adaptors that connect to the VM working directory, resulting in very high I/O latency.

I would like to make it possible for a VM host profile to identify multiple paths for the "Working Directory", with the path for a given VM chosen by round robin. This would have implications for the database and the existing provisioning module code, and it would also involve some front-end GUI work.

Do any of you have thoughts on this?

Thanks,

Aaron Coburn
Systems Administrator and Programmer
Academic Technology Services, Amherst College
acoburn@amherst.edu







Re: Datastore I/O latency

Posted by Aaron Coburn <ac...@amherst.edu>.
Andy, 
Thanks for the ideas; my comments are below.

On Jan 22, 2013, at 2:09 PM, Andy Kurth <an...@ncsu.edu> wrote:

> Regarding the original multiple paths idea, being able to define multiple
> VM profiles for a single VM host is desirable.  Rather than add additional
> complexity to the VM profile table, I would prefer to be able to define
> multiple vmhost entries for the same physical VM host computer.  One
> benefit of this approach is that it doesn't require any changes to the vcld
> code.
> 
> I have configured this manually in some situations in our environment by
> manually adding entries to the vmhost table with the same vmhost.computerid
> value but different vmhost.vmprofileid values.  This was done to be able to
> define a different datastore for some special purpose VMs.  With the new
> vmprofile.resourcepath and folder path options, being able to define
> multiple profiles for the same host would be more valuable.

If a VM host is assigned multiple profiles, then, presumably, any find commands (e.g. 'SearchDatastore*') would be applied to each of the profiles iteratively, stopping at the first match, while any write commands (e.g. 'http_put_file') would be applied in round-robin fashion -- perhaps just using a modulo operator.
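
As a rough illustration of the two behaviours described above (this is not VCL code), the sketch below searches a list of profile paths in order and stops at the first match, and picks a write target with a modulo operator. The function names, the `exists` callback, the integer `vm_id`, and the datastore paths are all hypothetical. Note that taking an id modulo the number of profiles only approximates round robin when ids are handed out sequentially.

```python
from typing import Callable, Optional, Sequence


def find_in_profiles(profiles: Sequence[str],
                     exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first profile path for which exists(path) is true."""
    for path in profiles:
        if exists(path):
            return path
    return None


def pick_write_profile(profiles: Sequence[str], vm_id: int) -> str:
    """Choose a write target by taking vm_id modulo the number of profiles."""
    if not profiles:
        raise ValueError("at least one profile path is required")
    return profiles[vm_id % len(profiles)]


# Hypothetical working-directory paths, one per profile.
paths = ["[datastore1] work", "[datastore2] work", "[datastore3] work"]

# Consecutive ids wrap around the list of paths.
targets = [pick_write_profile(paths, vm_id) for vm_id in range(4)]
```

With three paths, ids 0 through 3 map to the first, second, third, and then the first path again.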

> Additional comments below...
> 
> On Thu, Jan 17, 2013 at 3:11 PM, Aaron Coburn <ac...@amherst.edu> wrote:
> 
>> 
>> We have made some changes to the backend SAN Datastore that significantly
>> reduces the I/O congestion, so I don't presently have a need to add
>> multiple datastores to a given vmhost configuration. If that becomes
>> necessary, I will write some code to implement that.
>> 
>> On our system, in addition to moving the workspace datastore to faster
>> disks, I made another change, with which we are currently experimenting. I
>> would be curious to know what you think. I would be happy to submit the
>> relevant changes if there is interest.
>> 
>> In looking into the source of this I/O spike, it appeared to be
>> originating from the concurrent creation of the users' default profile in
>> Windows 7. Specifically, at the beginning of a block allocation in which a
>> large number of reservations begin simultaneously -- and therefore each
>> environment copies the default profile at about the same time, this puts a
>> huge I/O load on the backend datastore. The CPU and Memory usage of the
>> server cluster remained very low at the time, but the I/O congestion
>> rendered the VCL almost unusable.
>> 
> 
> I have seen similar problems before, especially when Adobe/Macromedia apps
> are installed.  They can add 100's of MBs to the AppData directory in the
> user profile.  If the apps are run when setting up the default user
> profile, this data is copied to the new user account's profile before the
> desktop appears.  If not, the data is copied when the user first launches
> the app.

Yes, and a block allocation for such an environment will only compound the problem.


>> What I have done is add an additional field to the image table, where the
>> name of a default user account can be stored. Then, when such an image is
>> created, that user account can be pre-created similar to the steps
>> currently outlined for configuring the default profile [1]. When the image
>> is captured, the account is disabled but not deleted.
>> 
>> When a user makes a reservation for the image, this default user account
>> is enabled and a password is set. Then, the user is provided with login
>> credentials for this account. When the user connects to the image, the
>> profile is already setup, and hence the login time is significantly
>> reduced. If the account does not exist on the image, it is simply created
>> anew.
>> 
>> I realize that this would not work for systems that integrate the Windows
>> OS-based login with a campus AD backend, and it may be potentially
>> confusing for users who are asked to login to a VCL environment using some
>> arbitrary username (i.e. vcl). Neither of these are issues in our setup,
>> because images are not tied to campus services and users can auto-connect
>> to their reservations (without typing the vcl-generated password), but
>> others may find this unappealing. This also isn't relevant for linux-based
>> images.
> 
> 
> I really like the idea of pre-staging the user account.  This would speed
> things up for end users.
> 
> It would also make it much easier for people to configure the default
> profile.  There are problems with the steps explained in [1].  The profile
> directory path of the configuration account often gets saved in several
> places in the registry such as C:\Users\Profile.  This causes various
> problems.
> 
> How about pre-staging the user account as you have described and renaming
> it when an image is reserved via:
> wmic useraccount where name='oldname' rename newname
> 
> If the user account is renamed, do you still see the need to use different
> account names for different images or could the same name be used globally?

This is a good idea, and I like it much more than my approach. I will experiment with it to see how it performs.
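
For experimenting with the rename, a minimal sketch that wraps the wmic command quoted above might look like the following. The function names and the restriction on quote characters are assumptions made for illustration; only the wmic syntax itself comes from the thread.

```python
import subprocess
from typing import List


def build_rename_command(old_name: str, new_name: str) -> List[str]:
    """Build the argument list for the wmic rename call quoted above."""
    for name in (old_name, new_name):
        # Reject empty names and quote characters, which would break the
        # where-clause; this check is an assumption, not a wmic rule.
        if not name or "'" in name or '"' in name:
            raise ValueError(f"unsupported account name: {name!r}")
    return ["wmic", "useraccount", "where", f"name='{old_name}'",
            "rename", new_name]


def rename_account(old_name: str, new_name: str) -> None:
    """Run the rename on the local Windows machine; raises on failure."""
    subprocess.run(build_rename_command(old_name, new_name), check=True)
```

Keeping the command construction separate from the call that runs it makes the first part easy to check on a non-Windows machine.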


> Also out of curiosity, how are you doing the auto-connect?  Could it be
> added as an optional feature?

For this, I am using a custom protocol handler, i.e.

  rdp://username@password:hostname?parameters

Linking to that URI will then pass the data to an external application (if the OS is configured to accept it).
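
To make the URI layout concrete, here is a small sketch of how a handler might split such a URI into its parts. It follows the layout exactly as written above (username, then "@", then password, then ":", then hostname). Treating the parameters as key=value pairs joined by "&" is an assumption, as are the type and function names, and no percent-decoding is applied to the username or password.

```python
from typing import Dict, NamedTuple
from urllib.parse import parse_qsl


class RdpTarget(NamedTuple):
    username: str
    password: str
    hostname: str
    parameters: Dict[str, str]


def parse_rdp_uri(uri: str) -> RdpTarget:
    """Split rdp://username@password:hostname?parameters into its parts."""
    prefix = "rdp://"
    if not uri.startswith(prefix):
        raise ValueError("not an rdp:// URI")
    body, _, query = uri[len(prefix):].partition("?")
    # Everything before the first "@" is the username.
    username, at, rest = body.partition("@")
    # The last ":" separates the password from the hostname.
    password, colon, hostname = rest.rpartition(":")
    if not (at and colon and username and hostname):
        raise ValueError("expected rdp://username@password:hostname")
    return RdpTarget(username, password, hostname, dict(parse_qsl(query)))
```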

For OS X, CoRD [2] already does this. No additional steps are necessary. The downside of CoRD, however, is that it doesn't forward audio, which is important in certain cases.

For Windows, you will need an entry in the local registry that registers the rdp scheme and passes the data to an application. Unfortunately, the built-in RDP client for Windows doesn't accept passwords on the command line, so I wrote a desktop app (a wrapper around the Terminal Services library) that does [3]. The installer also adds the registry entry. We have been using this application for about a year, and the only issue that has come up (recently) is that it doesn't support virtual channels for those who are using JAWS on a VCL image; this is something I plan to address at some point.

For Linux, I wrote a Perl handler [4], which a user can install locally and then enable by running some commands (e.g. for GNOME):

  gconftool-2 -s /desktop/gnome/url-handlers/rdp/command '/path/to/rdp-handler.pl %s' --type String
  gconftool-2 -s /desktop/gnome/url-handlers/rdp/enabled --type Boolean true

Then, for the UI, I made a number of customizations to the theme, so that when a user clicks the "Connect" button (once a reservation is ready), there is a prominent "Connect!" link (and a drop-down menu for setting the screen size). When a user clicks on this link, some JavaScript grabs the value from the screen size menu and adds it, along with any other connection information, to an rdp:// URI.
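
The link-building step can be sketched as follows. This is written in Python rather than the JavaScript used on the page, and the "WIDTHxHEIGHT" form of the menu value and the `width` and `height` parameter names are assumptions; the thread does not say which parameters the handlers accept.

```python
from urllib.parse import urlencode


def build_rdp_uri(username: str, password: str, hostname: str,
                  screen_size: str) -> str:
    """Build an rdp:// URI in the username@password:hostname layout."""
    # Assumes the menu value looks like "1280x800"; anything else raises
    # ValueError from the unpacking or from int().
    width, height = (int(part) for part in screen_size.lower().split("x"))
    query = urlencode({"width": width, "height": height})
    return f"rdp://{username}@{password}:{hostname}?{query}"


uri = build_rdp_uri("vcl", "s3cret", "vm1.example.edu", "1280x800")
```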

Let me know if you'd like more details.

Ideally, all of this would be part of a connect-method in the database and could be enabled as an optional feature.

Aaron C


[2] http://cord.sourceforge.net
[3] https://vcl.fivecolleges.edu/downloads/
[4] https://vcl.fivecolleges.edu/rdp-handler.pl


>> 
>> [1] http://vcl.apache.org/docs/configure-the-default-profile.html
>> 
>> 
>> 
>> 
>> 
>> On Jan 11, 2013, at 4:35 PM, Josh Thompson <jo...@ncsu.edu> wrote:
>> 
>>> If this works best for the backend, I'm good with making it happen on the
>>> frontend.
>>> 
>>> Josh


Re: Datastore I/O latency

Posted by Andy Kurth <an...@ncsu.edu>.
Regarding the original multiple paths idea, being able to define multiple
VM profiles for a single VM host is desirable.  Rather than add additional
complexity to the VM profile table, I would prefer to be able to define
multiple vmhost entries for the same physical VM host computer.  One
benefit of this approach is that it doesn't require any changes to the vcld
code.

I have set this up in some situations in our environment by manually
adding entries to the vmhost table with the same vmhost.computerid
value but different vmhost.vmprofileid values.  This was done to be able to
define a different datastore for some special purpose VMs.  With the new
vmprofile.resourcepath and folder path options, being able to define
multiple profiles for the same host would be more valuable.
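
The layout described above can be pictured with a small stand-in table. The sketch uses an in-memory SQLite database purely for illustration. The real VCL schema has more columns and runs on a different database engine, and the id values here are made up; only the vmhost.computerid and vmhost.vmprofileid column names come from the message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the vmhost table (hypothetical schema).
conn.execute(
    "CREATE TABLE vmhost ("
    " id INTEGER PRIMARY KEY,"
    " computerid INTEGER NOT NULL,"
    " vmprofileid INTEGER NOT NULL)"
)

# Two entries for the same physical host (computerid 10), each pointing
# at a different VM profile and therefore a different datastore.
conn.executemany(
    "INSERT INTO vmhost (computerid, vmprofileid) VALUES (?, ?)",
    [(10, 1), (10, 2)],
)

# List every profile attached to that host.
profile_ids = [row[0] for row in conn.execute(
    "SELECT vmprofileid FROM vmhost WHERE computerid = ? ORDER BY vmprofileid",
    (10,),
)]
```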

Additional comments below...

On Thu, Jan 17, 2013 at 3:11 PM, Aaron Coburn <ac...@amherst.edu> wrote:

>
> We have made some changes to the backend SAN Datastore that significantly
> reduces the I/O congestion, so I don't presently have a need to add
> multiple datastores to a given vmhost configuration. If that becomes
> necessary, I will write some code to implement that.
>
> On our system, in addition to moving the workspace datastore to faster
> disks, I made another change, with which we are currently experimenting. I
> would be curious to know what you think. I would be happy to submit the
> relevant changes if there is interest.
>
> In looking into the source of this I/O spike, it appeared to be
> originating from the concurrent creation of the users' default profile in
> Windows 7. Specifically, at the beginning of a block allocation in which a
> large number of reservations begin simultaneously -- and therefore each
> environment copies the default profile at about the same time, this puts a
> huge I/O load on the backend datastore. The CPU and Memory usage of the
> server cluster remained very low at the time, but the I/O congestion
> rendered the VCL almost unusable.
>

I have seen similar problems before, especially when Adobe/Macromedia apps
are installed.  They can add hundreds of MB to the AppData directory in the
user profile.  If the apps are run when setting up the default user
profile, this data is copied to the new user account's profile before the
desktop appears.  If not, the data is copied when the user first launches
the app.


>
> What I have done is add an additional field to the image table, where the
> name of a default user account can be stored. Then, when such an image is
> created, that user account can be pre-created similar to the steps
> currently outlined for configuring the default profile [1]. When the image
> is captured, the account is disabled but not deleted.
>
> When a user makes a reservation for the image, this default user account
> is enabled and a password is set. Then, the user is provided with login
> credentials for this account. When the user connects to the image, the
> profile is already setup, and hence the login time is significantly
> reduced. If the account does not exist on the image, it is simply created
> anew.
>
> I realize that this would not work for systems that integrate the Windows
> OS-based login with a campus AD backend, and it may be potentially
> confusing for users who are asked to login to a VCL environment using some
> arbitrary username (i.e. vcl). Neither of these are issues in our setup,
> because images are not tied to campus services and users can auto-connect
> to their reservations (without typing the vcl-generated password), but
> others may find this unappealing. This also isn't relevant for linux-based
> images.


I really like the idea of pre-staging the user account.  This would speed
things up for end users.

It would also make it much easier for people to configure the default
profile.  There are problems with the steps explained in [1].  The profile
directory path of the configuration account often gets saved in several
places in the registry such as C:\Users\Profile.  This causes various
problems.

How about pre-staging the user account as you have described and renaming
it when an image is reserved via:
wmic useraccount where name='oldname' rename newname

If the user account is renamed, do you still see the need to use different
account names for different images or could the same name be used globally?

Also, out of curiosity, how are you doing the auto-connect?  Could it be
added as an optional feature?

Thanks,
Andy


> Aaron
>
> [1] http://vcl.apache.org/docs/configure-the-default-profile.html
>
>
>
>
>
> On Jan 11, 2013, at 4:35 PM, Josh Thompson <jo...@ncsu.edu> wrote:
>
> > If this works best for the backend, I'm good with making it happen on the
> > frontend.
> >
> > Josh

Re: Datastore I/O latency

Posted by Aaron Coburn <ac...@amherst.edu>.
We have made some changes to the backend SAN datastore that significantly reduce the I/O congestion, so I don't presently have a need to add multiple datastores to a given vmhost configuration. If that becomes necessary, I will write some code to implement it.

On our system, in addition to moving the workspace datastore to faster disks, I made another change, with which we are currently experimenting. I would be curious to know what you think. I would be happy to submit the relevant changes if there is interest.

In looking into the source of this I/O spike, it appeared to originate from the concurrent creation of the users' default profile in Windows 7. Specifically, at the beginning of a block allocation, a large number of reservations begin simultaneously, so each environment copies the default profile at about the same time, which puts a huge I/O load on the backend datastore. The CPU and memory usage of the server cluster remained very low at the time, but the I/O congestion rendered the VCL almost unusable.

What I have done is add a field to the image table, where the name of a default user account can be stored. Then, when such an image is created, that user account can be pre-created by following steps similar to those currently outlined for configuring the default profile [1]. When the image is captured, the account is disabled but not deleted.

When a user makes a reservation for the image, this default user account is enabled and a password is set. Then, the user is provided with login credentials for this account. When the user connects to the image, the profile is already set up, and hence the login time is significantly reduced. If the account does not exist on the image, it is simply created anew.
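
The account handling described in the last two paragraphs could be sketched with the standard Windows `net user` command, as below. Whether the actual changes use `net user` is an assumption, and the function names are hypothetical. The sketch only builds the argument lists and does not run anything.

```python
from typing import List


def capture_commands(username: str) -> List[List[str]]:
    """At image capture: disable the pre-staged account but keep it."""
    return [["net", "user", username, "/active:no"]]


def reservation_commands(username: str, password: str,
                         account_exists: bool) -> List[List[str]]:
    """At reservation time: enable the account and set its password,
    or create it anew if the image does not contain it."""
    if account_exists:
        return [["net", "user", username, password, "/active:yes"]]
    return [["net", "user", username, password, "/add"]]
```

Each returned list could be passed to something like `subprocess.run(cmd, check=True)` on the Windows guest.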

I realize that this would not work for systems that integrate the Windows OS-based login with a campus AD backend, and it may be confusing for users who are asked to log in to a VCL environment using some arbitrary username (e.g. vcl). Neither of these is an issue in our setup, because images are not tied to campus services and users can auto-connect to their reservations (without typing the vcl-generated password), but others may find this unappealing. This also isn't relevant for Linux-based images.

Aaron

[1] http://vcl.apache.org/docs/configure-the-default-profile.html





On Jan 11, 2013, at 4:35 PM, Josh Thompson <jo...@ncsu.edu> wrote:

> If this works best for the backend, I'm good with making it happen on the
> frontend.
>
> Josh


Re: Datastore I/O latency

Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

If this works best for the backend, I'm good with making it happen on the 
frontend.

Josh

- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu

All electronic mail messages in connection with State business which
are sent to or received by this account are subject to the NC Public
Records Law and may be disclosed to third parties.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)

iEYEARECAAYFAlDwhb8ACgkQV/LQcNdtPQOS2ACeMSw188W0zNWv9dSuHdoY/JE7
OhEAni0ga32Oawpwwcs4wljl21BadzzI
=cyuN
-----END PGP SIGNATURE-----