Posted to dev@vcl.apache.org by Sunil Venkatesh <su...@umbc.edu> on 2011/05/19 20:31:02 UTC

[VCL 2.2.1] [Power7] Problem with image reservation

Hi,

We are currently in the process of configuring VCL 2.2.1 to work on a 
Power 7 blade. Our current setup is:

1. A web-server that hosts the Database and the Web Code. The same 
server acts as the Management node. xCAT is configured as the 
provisioning module on this node.
2. Power7 is our compute node.
3. I used the "vcld --setup" command to create/capture a base 
image of RHEL 5 that is running on the Power7 blade (by specifying the 
IP address of the Power7 blade when prompted for an address).

The creation process failed, as Xianqing Yu had mentioned to us earlier. 
Before it failed, however, it created the appropriate entries in the 
image, imagerevision and resource tables. I was able to "Undelete" the image 
from the web page and see it under "New Reservations".

I am facing problems similar to those Mike Waldron had with 
reservations. Even after making the memory adjustment, I wasn't able to make 
a reservation. The time table shows all green (available); however, when 
I choose any entry from the list, it takes me directly to the "New 
Reservation" page without any status/feedback, and I don't see any 
reservations created when I check under "Current Reservations". I am 
just assuming the groupings of Images and Computers are correct. Is 
there any way I could verify this? Also, if there is any reference on 
how the grouping needs to be done, please let me know.

Please do correct me if there is anything wrong with the system setup.

Regards,
Sunil Venkatesh
Research Assistant,
MC2 Lab, UMBC.

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
Sunil,

I can respond to #1.

I think you just have a permissions or mapping problem, but it may be 
something deeper.  Please double check the following things through the web 
interface:

-make sure the image is in an image group
-make sure the image group is mapped to a computer group containing computers 
that are in the available state, have platformid set to the same thing it is 
set to for the image, and have RAM, procnumber, procspeed, and network greater 
than or equal to those settings for the image
-make sure your user account has imageCheckOut at a node where the image group 
and computer group are available
-make sure the computer group is mapped to a management node group that 
contains a management node that is actively checking in
-make sure the computers in the computer group have a schedule assigned that 
is available during the time you have selected

If all of that is correct, you'll need to look at isAvailable in utils.php to 
see where it is returning something < 1.
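
As a rough illustration of the computer-side checks in the list above (state, 
platformid, and the RAM/procnumber/procspeed/network minimums), here is a 
minimal Python sketch. It is not the isAvailable code from utils.php; the 
Specs class, the blocking_reasons function, and the field names are 
hypothetical and only mirror the settings named in the checklist.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Specs:
    """Hypothetical subset of the settings named in the checklist."""
    platformid: int
    ram: int
    procnumber: int
    procspeed: int
    network: int


def blocking_reasons(image: Specs, computer: Specs, computer_state: str) -> List[str]:
    """Return the checklist items that would rule this computer out for the image."""
    reasons = []
    # The computer must be in the available state.
    if computer_state != "available":
        reasons.append("state is %r, not 'available'" % computer_state)
    # platformid must match exactly.
    if computer.platformid != image.platformid:
        reasons.append("platformid %d != image platformid %d"
                       % (computer.platformid, image.platformid))
    # Each resource must be greater than or equal to the image's setting.
    for field in ("ram", "procnumber", "procspeed", "network"):
        have, need = getattr(computer, field), getattr(image, field)
        if have < need:
            reasons.append("%s too low: %d < %d" % (field, have, need))
    return reasons
```

An empty list means none of these particular checks block the reservation; 
group mappings, privileges, and schedules still have to be verified separately.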

Josh

On Tue August 2 2011 2:17:59 pm Sunil Venkatesh wrote:
> Hi Josh,
> 
> So, I was able to get the VCL to capture & restore the images on to the
> Power7 blade without any errors. The VCL web portal shows the Power7 node
> being loaded with the captured image. I was looking for some clarification
> with a couple of things.
> 
> 1. The web portal shows "Selection currently not available" when I intend
> to make a reservation on the Power7 blade that I have been working with
> all this while.
> 
> 2. Once the image is loaded onto the blade, I am able to login from the
> management node without a password. However, when I am using rcons to
> login as root on the Power7 blade, it does not accept the root password I
> had used during the OS installation. Does the root password get reset to a
> default one? I was checking vcl/lib/VCL/Module/OS/Linux.pm if that is the
> case.
> 
> Thanks & Regards,
> Sunil
-- 
-------------------------------
Josh Thompson
Systems Programmer
Virtual Computing Lab (VCL)
North Carolina State University

Josh_Thompson@ncsu.edu
919-515-5323

my GPG/PGP key can be found at www.keyserver.net

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sunil,

If you change the password as root (or Administrator on Windows), you don't 
need the old password to set a new one.

Josh

On Thursday August 04, 2011, Sunil Venkatesh wrote:
> Aaron,
> 
> I wanted to try the ssh method to change the root password, but I stopped
> short since it will ask for the old password before changing to a new one.
> However, I am planning to create and make use of user accounts instead of
> looking for the root password.
> 
> Thank you for the input.
> 
> Regards,
> Sunil
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)

iEYEARECAAYFAk46zvEACgkQV/LQcNdtPQMnbwCeKas+RR+36HDGEBdnVQUH/Ts8
NG8An2f92p1k9Rsi/FNrz2PrikHUJGP3
=AafE
-----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Aaron Peeler <aa...@ncsu.edu>.
Interesting that it's prompting. Which version of Linux is this?
The code is using:
echo $passwd | /usr/bin/passwd -f $account --stdin

It sounds like this worked in the VCL code, so it should also work at the command line.
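
For anyone wanting to try the same thing by hand from the management node, here is a small Python sketch that only builds the ssh invocation and does not run it. The function name is hypothetical. The ssh options (-i /etc/vcl/vcl.key, -l root) are copied from the vcld log excerpt elsewhere in this thread, and the remote command is the one quoted above. Note that passing a password through echo can expose it in the process list, so treat this as an illustration rather than a recommendation.

```python
import shlex
from typing import List


def build_set_password_command(node: str, account: str, new_password: str,
                               key_path: str = "/etc/vcl/vcl.key") -> List[str]:
    """Build (but do not execute) an ssh argv that pipes a password to passwd --stdin."""
    # Quote the values so the remote shell treats each as a single word.
    remote = "echo {} | /usr/bin/passwd -f {} --stdin".format(
        shlex.quote(new_password), shlex.quote(account))
    return ["/usr/bin/ssh", "-i", key_path, "-l", "root", node, remote]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(build_set_password_command("power01", "root", "example-password"))
```

The resulting list could be handed to subprocess.run once it has been reviewed.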

Aaron

On Thu, Aug 4, 2011 at 12:49 PM, Sunil Venkatesh <su...@umbc.edu> wrote:
> Aaron,
>
> I wanted to try the ssh method to change the root password, but I stopped short since it will ask for the old password before changing to a new one. However, I am planning to create and make use of user accounts instead of looking for the root password.
>
> Thank you for the input.
>
> Regards,
> Sunil



-- 
Aaron Peeler
Program Manager
Virtual Computing Lab
NC State University

All electronic mail messages in connection with State business which
are sent to or received by this account are subject to the NC Public
Records Law and may be disclosed to third parties.

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Aaron,

I wanted to try the ssh method to change the root password, but I stopped short since it will ask for the old password before changing to a new one. However, I am planning to create and use user accounts instead of looking for the root password.

Thank you for the input.

Regards,
Sunil

On Aug 4, 2011, at 12:27 PM, Aaron Peeler wrote:

> Sunil,
> 
> On #2. The VCL load process is to randomize the root (and
> administrator for windows) password. It should be in the vcld.log
> file, but an easier option might be to ssh into the node and set the
> root password to something you know when using xcat's rcons to look at
> the console.
> 
> Aaron


Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Aaron Peeler <fa...@ncsu.edu>.
Sunil,

On #2: the VCL load process randomizes the root (and Administrator,
for Windows) password. It should be in the vcld.log file, but an
easier option might be to ssh into the node and set the root password
to something you know, for use when looking at the console with
xCAT's rcons.
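
When digging through vcld.log, it can help to split entries into fields first. The sketch below is a minimal Python parser whose field layout (timestamp|pid|request:reservation|state|file:subroutine(line)|message) is inferred only from the log excerpt quoted later in this thread, not from any VCL documentation. It assumes one entry per line with the mail wrapping undone, and the LogEntry and parse_vcld_line names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    pid: int
    request_id: int
    reservation_id: int
    state: str
    source_file: str
    subroutine: str
    line: int
    message: str


# Matches the caller field, e.g. "xCAT.pm:does_image_exist(2434)".
_CALLER = re.compile(r"^(?P<file>[^:]+):(?P<sub>\w+)\((?P<line>\d+)\)$")
# Matches the "request:reservation" field, e.g. "30:30".
_IDS = re.compile(r"^(\d+):(\d+)$")


def parse_vcld_line(text: str) -> Optional[LogEntry]:
    """Parse one primary log line; return None for continuation or unknown lines."""
    parts = text.rstrip("\n").split("|", 5)
    if len(parts) != 6:
        return None
    timestamp, pid, ids, state, caller, message = parts
    caller_match = _CALLER.match(caller)
    ids_match = _IDS.match(ids)
    if not (caller_match and ids_match and pid.isdigit()):
        return None
    return LogEntry(
        timestamp=timestamp,
        pid=int(pid),
        request_id=int(ids_match.group(1)),
        reservation_id=int(ids_match.group(2)),
        state=state,
        source_file=caller_match.group("file"),
        subroutine=caller_match.group("sub"),
        line=int(caller_match.group("line")),
        message=message,
    )
```

Filtering the parsed entries by subroutine or message text is then a one-line list comprehension.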

Aaron


On Tue, Aug 2, 2011 at 2:17 PM, Sunil Venkatesh <su...@umbc.edu> wrote:
> Hi Josh,
>
> So, I was able to get the VCL to capture & restore the images on to the Power7 blade without any errors. The VCL web portal shows the Power7 node being loaded with the captured image. I was looking for some clarification with a couple of things.
>
> 1. The web portal shows "Selection currently not available" when I intend to make a reservation on the Power7 blade that I have been working with all this while.
>
> 2. Once the image is loaded onto the blade, I am able to login from the management node without a password. However, when I am using rcons to login as root on the Power7 blade, it does not accept the root password I had used during the OS installation. Does the root password get reset to a default one? I was checking vcl/lib/VCL/Module/OS/Linux.pm if that is the case.
>
> Thanks & Regards,
> Sunil



-- 
Aaron Peeler
Program Manager
Virtual Computing Lab
NC State University

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Hi Josh,

So, I was able to get VCL to capture & restore the images onto the Power7 blade without any errors. The VCL web portal shows the Power7 node being loaded with the captured image. I was looking for some clarification on a couple of things.

1. The web portal shows "Selection currently not available" when I try to make a reservation on the Power7 blade that I have been working with all this while.

2. Once the image is loaded onto the blade, I am able to log in from the management node without a password. However, when I use rcons to log in as root on the Power7 blade, it does not accept the root password I had used during the OS installation. Does the root password get reset to a default one? I was checking vcl/lib/VCL/Module/OS/Linux.pm to see if that is the case.

Thanks & Regards,
Sunil

On Jul 7, 2011, at 2:03 PM, Josh Thompson wrote:

> This and your next question are both deeper into the backend code than I've 
> worked with.  Andy or Aaron may be able to answer your questions further.
> 
> Josh


Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Josh,

Could you provide me with the links to the resources of the VCL workshop? If there is a way to witness the workshop while it is in progress, that would help too. 

Regards,
Sunil


On Jul 7, 2011, at 2:03 PM, Josh Thompson wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Sunil,
> 
> On Thursday July 07, 2011, Sunil Venkatesh wrote:
>> Thanks Josh. My professor was asking about the details of VCL workshop
>> in NC. Are you aware of these details?
> 
> The workshop is hosted by NCSU.  It takes people from an introduction to VCL 
> to actually installing and managing it.  It is already full, but I think 
> recordings of the sessions may be available when it is over.
> 
>> 
>> Please bear with my comments inline.
> 
> Responses also inline.
> 
>> On 7/7/11 11:13 AM, Josh Thompson wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>> 
>>> On Tuesday July 05, 2011, Sunil Venkatesh wrote:
>>>> Hi Josh,
>>>> 
>>>> I was able to get the following things done in respect to getting VCL to
>>>> work on POWER.
>>>> 
>>>> 1. Made modifications in the xcat tables to get the capture process
>>>> working with statelite images instead of stateless images. Particularly
>>>> the noderes & bootparams tables.
>>>> 
>>>> 2. Used partimage to capture the images (did NOT set usepartimageng to
>>>> 1).
>>>> 
>>>> -rw-r--r-- 1 root root    0 Jul  5 16:38 compute.img.capturedone
>>>> -rw-r--r-- 1 root root    0 Jul  5 15:58 compute.img.capturefailed
>>>> -rw------- 1 root root 6.5M Jul  5 16:07 compute-parta2.gz
>>>> -rw------- 1 root root 679M Jul  5 16:10 compute-parta3.gz
>>>> -rw------- 1 root root  23M Jul  5 16:38 compute-parta6.gz
>>>> -rw-r--r-- 1 root root  512 Jul  5 16:07 compute-sda.mbr
>>>> -rw-r--r-- 1 root root  363 Jul  5 16:07 compute-sda.sfdisk
>>>> 
>>>> 
>>>> 2 partitions including the boot partition present on the blade were
>>>> captured under /install/image/ppc64/. Initially, RHEL 5 was installed on
>>>> a 600 GB partition due to which the capture process failed. The image of
>>>> the partition was generated once the partition size was reduced to 6GB.
>>>> Is it necessary for me to use partimage-ng instead of partimage itself?
>>> 
>>> Are you asking if you need to use partimage-ng for partitions that are
>>> 600GB? If so, I don't really know.  We've never dealt with partitions
>>> that large.
>> 
>> Here, I am just asking whether images captured using partimage are
>> recognized by VCL, or whether I am required to use partimage-ng. From your
>> earlier emails to Prem, I noticed that the only difference between
>> partimage & partimage-ng (after setting usepartimageng to 1) is that the
>> former generates images with .gz and the latter generates .img. Am I
>> right here? Also, I was able to get the 600GB partition captured; since
>> the partition was empty, it resulted in a ~17MB image file.
> 
> VCL can deploy images captured with both partimage and partimage-ng.  At NCSU, 
> we were going to switch to partimage-ng, which is why I added in support for 
> it, but then we realized we'd have to upgrade all of our management nodes to 
> xCAT2 at the same time or some of them wouldn't be able to deploy newly 
> captured images that were captured with partimage-ng (the support for xCAT1.x 
> can't deploy using partimage-ng).  So, we just stuck with partimage.  The 
> captured file format between the two is different.
> 
>>>> When proceeding further with "vcld --setup", the script was not able to
>>>> find the images that were created using partimage. The options
>>>> provided in the script do not allow selecting an architecture
>>>> other than x86/x86_64.
>>> 
>>> You'll need to modify the vcld image.pm module.  Look in
>>> /usr/local/vcl/lib/VCL.  In image.pm, look for the function
>>> 'setup_capture_base_image'; then, find 'my @architecture_choices' and add
>>> 'ppc' as another option.
>> 
>> As a matter of fact, I tried this step. But the
>> _get_image_repository_path function in
>> /usr/local/vcl/lib/VCL/Module/Provisioning/xCAT.pm does not recognize
>> the architecture when I choose ppc/ppc64 in the menu. On line 2922 in
>> the same file, image_architecture is set to undefined. I think the list
>> of supported architectures is stored in some mysql table, but I haven't
>> checked this. I was trying to get VCL to recognize the images
>> as x86/x86_64 by setting up soft links in the search paths of VCL.
> 
> This and your next question are both deeper into the backend code than I've 
> worked with.  Andy or Aaron may be able to answer your questions further.
> 
> Josh
> 
>>>> Also, in the error log vcld is looking for
>>>> 
>>>> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
>>>> 
>>>> and cannot find the template file. Should the template file that needs
>>>> to be accessed in this case be createimage.ppc64.tmpl?
>>> 
>>> This is actually a check to make sure the image doesn't already exist
>>> before trying to capture it.  So, it is good that it doesn't find it.
>> 
>> If possible, could you please provide me with details of the steps that
>> take place here? If there is any documentation available regarding
>> this, that would work too. You said the check makes sure the image
>> doesn't already exist before capture. How does VCL capture the images?
>> Does it make use of the images that are already generated using
>> partimage? If so, in which places does it look for the images?
>> 
>> Sorry for asking too many questions. I could trace the scripts to check
>> the flow, but, that would take a lot of time. You have been really
>> patient with all my queries, appreciate that.
>> 
>> Thanks
>> Sunil
>> 
>>> It sounds like you're almost there.  Great work!
>>> 
>>> Josh
>>> 
>>>> I have attached a log at the end of the mail. I am not sure where I have
>>>> gone wrong with the VCL configuration.
>>>> 
>>>> -Sunil
>>>> 
>>>> -----
>>>> 
>>>> rh5image-power010701bi34-v0 image creation failed
>>>> ------------------------------------------------------------------------
>>>> time: 2011-07-05 11:03:25
>>>> caller: image.pm:reservation_failed(385)
>>>> ( 0) image.pm, reservation_failed (line: 385)
>>>> (-1) image.pm, process (line: 167)
>>>> (-2) vcld, make_new_child (line: 568)
>>>> (-3) vcld, main (line: 346)
>>>> ------------------------------------------------------------------------
>>>> management node: web1.bluegrit.cs.umbc.edu
>>>> reservation PID: 9866
>>>> parent vcld PID: 19110
>>>> 
>>>> request ID: 30
>>>> reservation ID: 30
>>>> request state/laststate: image/image
>>>> request start time: 2011-07-05 11:03:20
>>>> request end time: 2011-07-05 12:03:20
>>>> for imaging: no
>>>> log ID: none
>>>> 
>>>> computer: power01.bluegrit.cs.umbc.edu
>>>> computer id: 2
>>>> computer type: blade
>>>> computer eth0 MAC address:<undefined>
>>>> computer eth1 MAC address:<undefined>
>>>> computer private IP address: 172.20.106.1
>>>> computer public IP address: 172.20.106.1
>>>> computer in block allocation: no
>>>> provisioning module: VCL::Module::Provisioning::xCAT2
>>>> 
>>>> image: rh5image-power010701bi34-v0
>>>> image display name: power010701bi
>>>> image ID: 34
>>>> image revision ID: 34
>>>> image size: 1450 MB
>>>> use Sysprep: yes
>>>> root access: yes
>>>> image owner ID: 1
>>>> image owner affiliation: Local
>>>> image revision date created: 2011-07-05 11:03:25
>>>> image revision production: yes
>>>> OS module: VCL::Module::OS::Linux
>>>> 
>>>> user: admin
>>>> user name: vcl admin
>>>> user ID: 1
>>>> user affiliation: Local
>>>> ------------------------------------------------------------------------
>>>> RECENT LOG ENTRIES FOR THIS PROCESS:
>>>> 2011-07-05
>>>> 
>>>> 11:03:25|9866|30:30|image|Module.pm:create_os_object(304)|
> VCL::Module::OS:
>>>> :Linux OS object created for rh5image-power010701bi34-v0, address:
>>>> :88fb070
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(110)|XCATROOT
>>>> environment variable is not set, using /opt/xcat 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:initialize(128)|xCAT root path found:
>>>> /opt/xcat 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:initialize(130)|xCAT module
>>>> initialized 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment
>>>> variable is not set, using /opt/xcat 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found:
>>>> /opt/xcat 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module
>>>> initialized 2011-07-05
>>>> 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(420)|VCL:
>>>> :M odule::Provisioning::xCAT2 module loaded 2011-07-05
>>>> 11:03:25|9866|30:30|image|Module.pm:create_mn_os_object(335)|management
>>>> node OS object has already been created, address: 88f23b0, returning 1
>>>> 2011-07-05
>>>> 11:03:25|9866|30:30|image|Module.pm:new(200)|VCL::Module::Provisioning::
>>>> xC AT2 object created for computer power01, address: 88fb0e0 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment
>>>> variable is not set, using /opt/xcat 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found:
>>>> /opt/xcat 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module
>>>> initialized 2011-07-05
>>>> 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(426)|VCL:
>>>> :M odule::Provisioning::xCAT2 provisioner object created for power01,
>>>> address: 88fb0e0 2011-07-05
>>>> 11:03:25|9866|30:30|image|State.pm:initialize(126)|returning 1
>>>> 2011-07-05
>>>> 11:03:25|9866|30:30|image|vcld:make_new_child(565)|VCL::image object
>>>> created and initialized 2011-07-05
>>>> 11:03:25|9866|30:30|image|utils.pm:mail(1268)|SUCCESS -- Sending mail
>>>> To:shrusun@gmail.com, VCL IMAGE Creation Started:
>>>> rh5image-power010701bi34-v0 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2434)|image OS
>>>> install type: partimage 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2910)|manag
>>>> em ent node identifier argument was not specified
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|
>>> 
>>> xCAT.pm:_get_image_repository_path(2932)|attempting to determine
>>> repository
>>> 
>>> path for image on web1.bluegrit.cs.umbc.edu:
>>>> |9866|30:30|image| image id: 34
>>>> |9866|30:30|image| OS name: rh5image
>>>> |9866|30:30|image| OS type: linux
>>>> |9866|30:30|image| OS install type: partimage
>>>> |9866|30:30|image| OS source path: image
>>>> |9866|30:30|image| architecture: x86_64
>>>> 
>>>> 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2996)|did
>>>> not find any images under /tftpboot/xcat//linux_image/x86_64 on
>>>> web1.bluegrit.cs.umbc.edu 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(3006)|retur
>>>> ni ng repository path for web1.bluegrit.cs.umbc.edu:
>>>> /tftpboot/xcat//image/x86_64 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|image
>>>> repository path: /tftpboot/xcat//image/x86_64
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_command(9010)|executed
>>> 
>>> command: du -c /tftpboot/xcat//image/x86_64/*rh5image-power010701bi34-v0*
>>> 2>&1
>>> 
>>> | grep total 2>&1, pid: 9877, exit status: 0, output:
>>>> |9866|30:30|image| 0 total
>>>> 
>>>> 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2506)|image does NOT
>>>> exist: rh5image-power010701bi34-v0 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2084)|manage
>>>> me nt node identifier argument was not specified
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|
>>> 
>>> xCAT2.pm:_get_image_template_path(2115)|attempting to determine template
>>> path
>>> 
>>> for image:
>>>> |9866|30:30|image| image name: rh5image-power010701bi34-v0
>>>> |9866|30:30|image| OS install type: partimage
>>>> |9866|30:30|image| OS source path: image
>>>> |9866|30:30|image| xCAT 2.x OS source path: image
>>>> 
>>>> 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2123)|return
>>>> in g: /opt/xcat/share/xcat/install/image 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2518)|template
>>>> repository path for rh5image-power010701bi34-v0:
>>>> /opt/xcat/share/xcat/install/image 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2530)|template file
>>>> does not exist:
>>>> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
>>>> 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2570)|image
>>>> rh5image-power010701bi34-v0 does NOT exist on this management node
>>>> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(145)|image
>>>> rh5image-power010701bi34-v0 does not exist in the repository 2011-07-05
>>>> 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data
>>>> structure updated:
>>>> $self->request_data->{reservation}{30}{image}{lastupdate}
>>>> 
>>>> |9866|30:30|image| image_lastupdate = 2011-07-05 11:03:25
>>>> 
>>>> 2011-07-05
>>>> 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data
>>>> structure updated:
>>>> $self->request_data->{reservation}{30}{imagerevision}{datecreated}
>>>> 
>>>> |9866|30:30|image| imagerevision_date_created = 2011-07-05 11:03:25
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(161)|calling
>>>> provisioning module's capture() subroutine 2011-07-05
>>>> 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|image=rh5image-power0107
>>>> 01 bi34-v0, computer=power01
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|
>>> 
>>> executing SSH command on power01:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
>>>> |StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root
>>>> |currentimage.txt; chmod 777 currentimage.txt' 2>&1
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|
>>> 
>>> run_ssh_command output:
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied
>>>> |(publickey,gssapi-with-mic,password).
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH
>>> 
>>> command executed on power01, command:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
>>>> |StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root
>>>> |currentimage.txt; chmod 777 currentimage.txt' 2>&1 9866|30:30|image|
>>>> |returning (255, "Permission denied, please try ...")
>>>> 
>>>> 2011-07-05
>>>> 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5685)|updated
>>>> ownership and permissions on currentimage.txt
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|
>>> 
>>> executing SSH command on power01:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
>>>> |StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e
>>>> |"rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nim
>>>> |ag erevision_id=34\r\nimagerevision_datecreated=2011-07-05
>>>> |11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc
>>>> |.e du">   currentimage.txt&&   cat currentimage.txt' 2>&1
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|
>>> 
>>> run_ssh_command output:
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied
>>>> |(publickey,gssapi-with-mic,password).
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH
>>> 
>>> command executed on power01, command:
>>>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
>>>> |StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e
>>>> |"rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nim
>>>> |ag erevision_id=34\r\nimagerevision_datecreated=2011-07-05
>>>> |11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc
>>>> |.e du">   currentimage.txt&&   cat currentimage.txt' 2>&1
>>>> |9866|30:30|image| returning (255, "Permission denied, please try ...")
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05
>>>> |11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5699)|failed
>>>> |to create currentimage.txt file on power01: 9866|30:30|image|
>>>> |Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied, please try again.
>>>> |9866|30:30|image| Permission denied
>>>> |(publickey,gssapi-with-mic,password). 9866|30:30|image| ( 0) utils.pm,
>>>> |write_currentimage_txt (line: 5699) 9866|30:30|image| (-1) xCAT2.pm,
>>>> |capture (line: 779)
>>>> |9866|30:30|image| (-2) image.pm, process (line: 162)
>>>> |9866|30:30|image| (-3) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-4) vcld, main (line: 346)
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05
>>>> |11:03:25|9866|30:30|image|xCAT2.pm:capture(783)|unable to update
>>>> |currentimage.txt on power01
>>>> |9866|30:30|image| ( 0) xCAT2.pm, capture (line: 783)
>>>> |9866|30:30|image| (-1) image.pm, process (line: 162)
>>>> |9866|30:30|image| (-2) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-3) vcld, main (line: 346)
>>>> |9866|30:30|image| ---- WARNING ----
>>>> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(166)|
>>>> |rh5image-power010701bi34-v0 image failed to be captured by provisioning
>>>> |module
>>>> |9866|30:30|image| ( 0) image.pm, process (line: 166)
>>>> |9866|30:30|image| (-1) vcld, make_new_child (line: 568)
>>>> |9866|30:30|image| (-2) vcld, main (line: 346)
>>>> 
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1581)|
>>>> attempting to retrieve private IP address for computer: power01
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1585)|
>>>> retrieved contents of /etc/hosts on this management node, contains 158 lines
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1645)|
>>>> returning IP address from /etc/hosts file: 172.20.106.1
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:is_inblockrequest(6163)|
>>>> zero rows were returned from database select
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_image_affiliation_name(2035)|
>>>> image owner id: 1
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|
>>>> database requested (information_schema) does not match handle stored in
>>>> $ENV{dbh} (vcl:)
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|
>>>> database handle stored in $ENV{dbh}
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1352)|
>>>> attempting to retrieve and store data for user: user.id = '1'
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|
>>>> database requested (vcl) does not match handle stored in $ENV{dbh}
>>>> (information_schema:)
>>>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|
>>>> database handle stored in $ENV{dbh}
>>>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1415)|
>>>> data has been retrieved for user: admin (id: 1)
>>>> 
>>>> On 6/24/11 10:13 AM, Josh Thompson wrote:
>>>>> Sunil,
>>>>> 
>>>>> "nodeset <nodename> image" sets up all the xCAT stuff so that the next
>>>>> time the node is booted, it will boot the stateless/statelite image and
>>>>> capture an image of the node.
>>>>> 
>>>>> Can you double check that you have 'os' in the nodetype table set to
>>>>> image for the node you are using?  If you look in the partimageng.pm
>>>>> xCAT module, you see toward the top where it registers the
>>>>> "handled_commands".  The "mk" gets stripped off.  So, that module is
>>>>> registering "install" and "image" for os type = "image".  As long as
>>>>> you have os in the nodetype table set to image, it should be using
>>>>> that module.
>>>>> 
>>>>> You will need to make sure you have all of the required files in
>>>>> locations using 'ppc64' as the arch.
>>>>> 
>>>>> Josh
>>>>> 
>>>>> On Wednesday June 22, 2011, Sunil Venkatesh wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> Update !
>>>>>> 
>>>>>> I was able to fix the problem that I was facing with the scripts by
>>>>>> disabling the firewall. But I still have a problem with the command:
>>>>>> 
>>>>>> nodeset <nodename> image
>>>>>> 
>>>>>> Unless this error is fixed, I don't think partimage will work. Am I
>>>>>> right here?
>>>>>> 
>>>>>> Thanks,
>>>>>> Sunil
>>>>>> 
>>>>>> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <su...@umbc.edu> wrote:
>>>>>>> Josh,
>>>>>>> 
>>>>>>> I have reached a point where I am able to boot the ppc using the
>>>>>>> statelite images created using genimage. But, I was wondering how
>>>>>>> significant the following command is.
>>>>>>> 
>>>>>>> nodeset <nodename> image
>>>>>>> 
>>>>>>> I got the same error that Prem had mentioned.
>>>>>>> 
>>>>>>> 
>>>>>>> power01: Error: Unable to identify plugin for this command, check
>>>>>>> relevant tables: nodetype.os
>>>>>>> Error: Some nodes failed to set up image resources, aborting
>>>>>>> 
>>>>>>> I tried changing the 'os' field to 'image' under nodetype, that
>>>>>>> doesn't seem to help. I get the same error even after the change.
>>>>>>> 'arch' in my case is set to 'ppc64'.
>>>>>>> 
>>>>>>> 
>>>>>>> Also, I think the partimage plugin needs to be changed to support the
>>>>>>> ppc architecture, from what you had mentioned in the other thread.
>>>>>>> 
>>>>>>> I am not sure what the command 'nodeset <nodename> image' does, but
>>>>>>> I am able to boot the statelite images by making changes to the
>>>>>>> yaboot configuration files. The ppc blade currently uses LVM, which
>>>>>>> needs to be replaced with ext2/ext3 from what I read in the other
>>>>>>> thread; am I right? Also, just out of curiosity, I left the statelite
>>>>>>> image to boot with my current settings. I can see the xcat script
>>>>>>> throwing an error:
>>>>>>> 
>>>>>>> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No
>>>>>>> such file or directory
>>>>>>> /tmp/mypostscript: line 16: updateflag.awk: command not found
>>>>>>> 
>>>>>>> Both getpostscript.awk and updateflag.awk are missing from the rootimg
>>>>>>> created by genimage. Is there any place I could find these scripts?
>>>>>>> 
>>>>>>> Also, please correct me if there is anything wrong with the procedure
>>>>>>> I am following.
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks in advance.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Sunil
>>>>>>> 
>>>>>>> On 6/13/11 4:13 PM, Josh Thompson wrote:
>>>>>>>> Sunil,
>>>>>>>> 
>>>>>>>> From what I remember, I didn't have to do much to the rootimg.gz
>>>>>>>> image to make it work.  I created the files I supply before xCAT
>>>>>>>> started using "statelite" instead of "stateless".  I think statelite
>>>>>>>> uses NFS to mount the image, and stateless uses an image file
>>>>>>>> downloaded to the node and run out of RAM.  Since generating a
>>>>>>>> statelite image is pretty straightforward use of xCAT, you may want
>>>>>>>> to ask on the xcat-user email list for help with it.
>>>>>>>> 
>>>>>>>> Unless you can have the admins of the other dhcp server on your
>>>>>>>> network exclude the MAC addresses of your blades, you'll need to
>>>>>>>> create a separate private network to control your VCL stuff, either
>>>>>>>> physically or with VLANs.
>>>>>>>> 
>>>>>>>> If they can exclude the MACs, you can set up the dhcp server on your
>>>>>>>> management node to only answer to requests from your blades.
>>>>>>>> 
>>>>>>>> Josh
>>>>>>>> 
>>>>>>>> On Monday June 13, 2011, Sunil Venkatesh wrote:
>>>>>>>>> Josh,
>>>>>>>>> 
>>>>>>>>> Again, thank you for your valuable inputs. I have got to the point
>>>>>>>>> where I can get the compute node to boot using the stateless
>>>>>>>>> images. I had to manually configure the netboot since we already
>>>>>>>>> had a DHCP server which is not the same as our management node.
>>>>>>>>> Since our setup is not in an isolated environment, I could not let
>>>>>>>>> xcat handle the dhcp & netboot configuration (it messed up our
>>>>>>>>> network configuration when I let xcat handle it; we had 2 dhcp
>>>>>>>>> servers running at that point). Are you aware of any way to let
>>>>>>>>> xcat handle such scenarios?
>>>>>>>>> 
>>>>>>>>> Although I am able to get the compute node to boot with the kernel
>>>>>>>>> image & initrd, and NFS mount the rootimg that was generated
>>>>>>>>> using 'genimage', I am getting the following error on the compute
>>>>>>>>> node's console:
>>>>>>>>> 
>>>>>>>>>       FATAL error: could not get the entries from litefile
>>>>>>>>>       table...
>>>>>>>>> 
>>>>>>>>> After going through the init scripts, I found that the 'xCATCmd'
>>>>>>>>> binary is not present in the rootimg. I am currently checking the
>>>>>>>>> xcat packages for its availability. If you know the procedure to
>>>>>>>>> get it onto the compute node, please let me know.
>>>>>>>>> 
>>>>>>>>> Appreciate your support.
>>>>>>>>> 
>>>>>>>>> Thanking you,
>>>>>>>>> Sunil
>>>>>>>>> 
>>>>>>>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
>>>>>>>>>> Sunil,
>>>>>>>>>> 
>>>>>>>>>> I don't recall seeing any documentation on those parts.  I had to
>>>>>>>>>> poke around looking at parts of xCAT to see how it worked.  It's
>>>>>>>>>> been a few years since I did that; so, I don't remember much about
>>>>>>>>>> the process. My recommendation would be to start looking at things
>>>>>>>>>> in the rootimg.gz image.  Looking at it now, I see that
>>>>>>>>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It looks
>>>>>>>>>> like it downloads all of the postscripts from the management node
>>>>>>>>>> and then runs getpostscript.awk, which issues a command to xcatd to
>>>>>>>>>> get the primary postscript for that machine.  I've forgotten how
>>>>>>>>>> xcatd then builds the primary postscript. I do remember that in
>>>>>>>>>> the partimageng.pm module, I had it add the partimageng
>>>>>>>>>> postscript.
>>>>>>>>>> 
>>>>>>>>>> So, you'll really have to start digging through how the xcat
>>>>>>>>>> postscript system works.
>>>>>>>>>> 
>>>>>>>>>> Josh
>>>>>>>>>> 
>>>>>>>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>> Josh,
>>>>>>>>>>> 
>>>>>>>>>>> Is there any place I could find some details on
>>>>>>>>>>> 
>>>>>>>>>>> "... /Once the compute node is booted with the stateless
>>>>>>>>>>> image, it uses NFS to mount some things from the management node,
>>>>>>>>>>> and then runs some xcat postscripts,/.... "
>>>>>>>>>>> 
>>>>>>>>>>> I have the stateless images ready with partimage compiled for
>>>>>>>>>>> PPC. For the compute node (power 7) to boot using the stateless
>>>>>>>>>>> images, I need to configure yaboot instead of pxeboot (which is
>>>>>>>>>>> specific to x86). I wanted to know where in the startup files
>>>>>>>>>>> the execution of partimage and the NFS mount are configured. Is
>>>>>>>>>>> it configured by the "genimage" command itself? Considering the
>>>>>>>>>>> way in which the nodes are configured in the network, it would
>>>>>>>>>>> not be a good idea to let xcat take care of configuring details
>>>>>>>>>>> like DHCPD for netboot. So, I need to make changes to the
>>>>>>>>>>> configuration files manually, which is why this query came up.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks in advance.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Sunil
>>>>>>>>>>> 
>>>>>>>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
>>>>>>>>>>>> Sunil,
>>>>>>>>>>>> 
>>>>>>>>>>>> The "stateless" image I refer to is what is actually booted on
>>>>>>>>>>>> the compute node containing the image to be captured.  It's
>>>>>>>>>>>> called stateless because it is loaded completely in RAM and
>>>>>>>>>>>> does not maintain any state when a reboot occurs.
>>>>>>>>>>>> 
>>>>>>>>>>>> The partimage binary is part of this stateless image and
>>>>>>>>>>>> actually runs on the compute node.  It does not run on the
>>>>>>>>>>>> management node. The management node does not have block level
>>>>>>>>>>>> access to the disk on the compute node to be able to capture
>>>>>>>>>>>> the image from the disk.
>>>>>>>>>>>> 
>>>>>>>>>>>> I'll try to describe the process a little better.  The
>>>>>>>>>>>> management node issues a reboot command to the compute node.
>>>>>>>>>>>> The compute node uses PXE to load and boot a kernel (vmlinuz),
>>>>>>>>>>>> initial RAM disk (initrd.img), and a root filesystem
>>>>>>>>>>>> (rootimg.gz) from the management node.  All three of these
>>>>>>>>>>>> together make up the stateless image.  Once the compute node is
>>>>>>>>>>>> booted with the stateless image, it uses NFS to mount some
>>>>>>>>>>>> things from the management node, and then runs some xcat
>>>>>>>>>>>> postscripts, one of which is the partimageng postscript.  This
>>>>>>>>>>>> postscript determines what partitions are on the compute node
>>>>>>>>>>>> and, depending on how the postscript is configured, uses
>>>>>>>>>>>> partimage or partimageng to capture an image of the compute
>>>>>>>>>>>> node disk that is then saved to the management node.  When it
>>>>>>>>>>>> is finished capturing the image, it notifies xcat on the
>>>>>>>>>>>> management node and then reboots.  xcat reconfigures itself to
>>>>>>>>>>>> tell the compute node to boot off of disk at next boot.  When
>>>>>>>>>>>> the compute node comes up, it uses PXE to ask the management
>>>>>>>>>>>> node how to boot.  The management node tells it to boot off of
>>>>>>>>>>>> disk.
>>>>>>>>>>>> 
>>>>>>>>>>>> I hope that clarifies how the system works.  If any of it is
>>>>>>>>>>>> unclear, please ask for further clarification.
>>>>>>>>>>>> 
>>>>>>>>>>>> Josh
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>>>> Josh,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I had one more clarification.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The partimage binaries run on the management node to capture a
>>>>>>>>>>>>> (stateless) image from the compute node, right? In that case, is
>>>>>>>>>>>>> there a need for these binaries to go into the rootimg.gz?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> My assumption is, partimage runs on the management node (an
>>>>>>>>>>>>> intel blade in our case) to capture a stateless image from a
>>>>>>>>>>>>> compute node (a power 7 blade) and stores these images under "
>>>>>>>>>>>>> /install " of the management node. Please correct me if I am
>>>>>>>>>>>>> wrong here.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Sunil
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
>>>>>>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>>>>>>>>> Hash: SHA1
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I used the steps that were mentioned under
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need
>>>>>>>>>>>>>>> to change references to x86 & x86_64 (as directories) to
>>>>>>>>>>>>>>> reflect the ppc architecture, as the web page says "The
>>>>>>>>>>>>>>> architecture for the node must always be set to x86 for
>>>>>>>>>>>>>>> this..". I have with me the vmlinuz (kernel image) and initrd
>>>>>>>>>>>>>>> for the capture process. The 2 nodeset commands
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> By this, do you mean you have vmlinuz and initrd for your
>>>>>>>>>>>>>> power blades, not the ones linked to off of the page you
>>>>>>>>>>>>>> listed above? If you do, that's a good start.  However,
>>>>>>>>>>>>>> you'll also need rootimg.gz. rootimg.gz is the root
>>>>>>>>>>>>>> filesystem for the stateless image.  It also contains the
>>>>>>>>>>>>>> partimage and partimageng binaries. Assuming partimage or
>>>>>>>>>>>>>> partimageng can actually capture partitions from power
>>>>>>>>>>>>>> systems, you'll need to compile at least one of them to run
>>>>>>>>>>>>>> on power.  For the rootimg.gz image I provided, I compiled
>>>>>>>>>>>>>> them statically so that I didn't have to worry about
>>>>>>>>>>>>>> including any library dependencies in rootimg.gz.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It would be a good idea to research how to use xcat's genimage
>>>>>>>>>>>>>> command to generate stateless images to learn how to do this.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> If there's any part of the above that you don't fully
>>>>>>>>>>>>>> understand, please ask me to clarify it.  Until you have a
>>>>>>>>>>>>>> stateless image that you can deploy to your power blades,
>>>>>>>>>>>>>> there's no point in trying to debug any VCL specific items.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Josh
>>>>>>>>>>>>>> - --
>>>>>>>>>>>>>> - -------------------------------
>>>>>>>>>>>>>> Josh Thompson
>>>>>>>>>>>>>> VCL Developer
>>>>>>>>>>>>>> North Carolina State University
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
>>>>>>>>>>>>>> -----BEGIN PGP SIGNATURE-----
>>>>>>>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
>>>>>>>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
>>>>>>>>>>>>>> =exBV
>>>>>>>>>>>>>> -----END PGP SIGNATURE-----
>>> 
>>> - --
>>> - -------------------------------
>>> Josh Thompson
>>> VCL Developer
>>> North Carolina State University
>>> 
>>> my GPG/PGP key can be found at pgp.mit.edu
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>> 
>>> iEYEARECAAYFAk4VzQoACgkQV/LQcNdtPQM8YQCePg3O5vp5AXEhiO+5aIRIUO/S
>>> 6IgAn1Xt4ytGnmxpfJVteCScFi0dRz15
>>> =Yls1
>>> -----END PGP SIGNATURE-----
> - -- 
> - -------------------------------
> Josh Thompson
> VCL Developer
> North Carolina State University
> 
> my GPG/PGP key can be found at pgp.mit.edu
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
> 
> iEYEARECAAYFAk4V9PEACgkQV/LQcNdtPQM6ZgCfaPLJh9MuEVLqRYdHNLqC8BzQ
> JOsAn35U1e4V+xuxFPajb2rVVcg4gril
> =CWDr
> -----END PGP SIGNATURE-----


Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sunil,

On Thursday July 07, 2011, Sunil Venkatesh wrote:
> Thanks Josh. My professor was asking about the details of VCL workshop
> in NC. Are you aware of these details?

The workshop is hosted by NCSU.  It takes people from an introduction to VCL 
to actually installing and managing it.  It is already full, but I think 
recordings of the sessions may be available when it is over.
 
> 
> Please bear with my comments inline.

Responses also inline.

> On 7/7/11 11:13 AM, Josh Thompson wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> > 
> > On Tuesday July 05, 2011, Sunil Venkatesh wrote:
> >> Hi Josh,
> >> 
> >> I was able to get the following things done in respect to getting VCL to
> >> work on POWER.
> >> 
> >> 1. Made modifications in the xcat tables to get the capture process
> >> working with statelite images instead of stateless images, particularly
> >> in the noderes & bootparams tables.
> >> 
> >> 2. Used partimage to capture the images (did NOT set usepartimageng to
> >> 1).
> >> 
> >> -rw-r--r-- 1 root root    0 Jul  5 16:38 compute.img.capturedone
> >> -rw-r--r-- 1 root root    0 Jul  5 15:58 compute.img.capturefailed
> >> -rw------- 1 root root 6.5M Jul  5 16:07 compute-parta2.gz
> >> -rw------- 1 root root 679M Jul  5 16:10 compute-parta3.gz
> >> -rw------- 1 root root  23M Jul  5 16:38 compute-parta6.gz
> >> -rw-r--r-- 1 root root  512 Jul  5 16:07 compute-sda.mbr
> >> -rw-r--r-- 1 root root  363 Jul  5 16:07 compute-sda.sfdisk
> >> 
> >> 
> >> 2 partitions including the boot partition present on the blade were
> >> captured under /install/image/ppc64/. Initially, RHEL 5 was installed on
> >> a 600 GB partition due to which the capture process failed. The image of
> >> the partition was generated once the partition size was reduced to 6GB.
> >> Is it necessary for me to use partimage-ng instead of partimage itself?
> > 
> > Are you asking if you need to use partimage-ng for partitions that are
> > 600GB? If so, I don't really know.  We've never dealt with partitions
> > that large.
> 
> Here, I am just asking whether images captured using partimage are
> recognized by VCL or whether I am required to use partimage-ng. From your
> earlier emails to Prem, I gathered that the only difference between
> partimage & partimage-ng (after setting usepartimageng to 1) is that the
> former generates images with .gz and the latter generates .img. Am I
> right here? Also, I was able to get the 600GB partition captured; since
> the partition was empty, it resulted in a ~17MB image file.

VCL can deploy images captured with both partimage and partimage-ng.  At NCSU, 
we were going to switch to partimage-ng, which is why I added in support for 
it, but then we realized we'd have to upgrade all of our management nodes to 
xCAT2 at the same time or some of them wouldn't be able to deploy newly 
captured images that were captured with partimage-ng (the support for xCAT1.x 
can't deploy using partimage-ng).  So, we just stuck with partimage.  The 
captured file format between the two is different.
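
As a rough aid for telling the pieces of a capture directory apart, here
is a minimal sketch in Python. It is not part of VCL or xCAT. The suffix
roles are inferred only from the directory listing quoted in this thread
(the .mbr and .sfdisk roles are presumed from their names and sizes), and
the ".img" suffix for partimage-ng is Sunil's understanding, which I have
not confirmed. The ROLES table and classify() are hypothetical names.

```python
from collections import defaultdict

# Suffix-to-role mapping inferred from the listing quoted in this thread.
# Order matters: "compute.img.capturedone" must match the marker suffix
# before the ".img" suffix is considered.
ROLES = [
    (".capturedone", "status marker"),
    (".capturefailed", "status marker"),
    (".mbr", "boot record (presumed)"),
    (".sfdisk", "partition table dump (presumed)"),
    (".gz", "partimage partition image"),
    (".img", "partimage-ng image (unconfirmed assumption)"),
]


def classify(filenames):
    """Group file names by the first matching suffix role."""
    groups = defaultdict(list)
    for name in filenames:
        role = next((r for suffix, r in ROLES if name.endswith(suffix)),
                    "unknown")
        groups[role].append(name)
    return dict(groups)


if __name__ == "__main__":
    listing = [
        "compute.img.capturedone",
        "compute.img.capturefailed",
        "compute-parta2.gz",
        "compute-parta3.gz",
        "compute-parta6.gz",
        "compute-sda.mbr",
        "compute-sda.sfdisk",
    ]
    for role, names in classify(listing).items():
        print(role, names)
```

A directory that holds only .gz partition files would, under these
assumptions, have been captured with plain partimage.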

> >> When proceeding further with "vcld --setup", the script was not able to
> >> find the images that were created using partimage. The options that are
> >> provided in the script does not allow for selecting an architecture
> >> other than x86/x86_64.
> > 
> > You'll need to modify the vcld image.pm module.  Look in
> > /usr/local/vcl/lib/VCL.  In image.pm, look for the function
> > 'setup_capture_base_image'; then, find 'my @architecture_choices' and add
> > 'ppc' as another option.
> 
> As a matter of fact, I tried this step. But, the
> _get_image_repository_path function in
> /usr/local/vcl/lib/VCL/Module/Provisioning/xCAT.pm does not recognize
> the architecture when I choose ppc/ppc64 in the menu. On line 2922 in
> the same file, image_architecture is set to undefined. I think the list
> of supported architectures is stored in some mysql table. I haven't
> checked this yet; I was trying to get VCL to recognize the images
> as x86/x86_64 by setting up soft links in the search paths of VCL.

This and your next question are both deeper into the backend code than I've 
worked with.  Andy or Aaron may be able to answer your questions further.
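
In the meantime, the log you attached can be used to follow the flow
without reading the Perl. A minimal sketch in Python, not part of VCL:
it assumes each entry has the layout
timestamp|pid|request:reservation|state|file:sub(line)|message, which is
only what the quoted log appears to show. The field names, parse_entry()
and call_trace() are my own hypothetical labels.

```python
import re
from typing import NamedTuple, Optional

# Layout inferred from the log quoted in this thread, not from VCL docs:
# timestamp|pid|request:reservation|state|file:sub(line)|message
ENTRY = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\|"
    r"(?P<pid>\d+)\|(?P<request>\d+):(?P<reservation>\d+)\|"
    r"(?P<state>\w+)\|(?P<file>[\w.]+):(?P<sub>\w+)\((?P<line>\d+)\)\|"
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    time: str
    pid: int
    state: str
    file: str
    sub: str
    line: int
    message: str


def parse_entry(text: str) -> Optional[LogEntry]:
    """Parse one unwrapped log entry; return None for continuation lines."""
    m = ENTRY.match(text.strip())
    if m is None:
        return None
    return LogEntry(m["time"], int(m["pid"]), m["state"], m["file"],
                    m["sub"], int(m["line"]), m["message"])


def call_trace(lines):
    """List 'file:sub' for every parsable entry, in order of appearance."""
    return [f"{e.file}:{e.sub}" for e in map(parse_entry, lines) if e]


if __name__ == "__main__":
    sample = [
        "2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|"
        "image repository path: /tftpboot/xcat//image/x86_64",
        "|9866|30:30|image| 0 total",
        "2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|"
        "image=rh5image-power010701bi34-v0, computer=power01",
    ]
    print(call_trace(sample))
```

The entries need to be unwrapped to one per line first, since the mail
client has folded them. The does_image_exist entries in your log show
the repository path that was searched.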

Josh

> >> Also, in the error log vcld is looking for
> >> 
> >> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
> >> 
> >> and cannot find the template file. Should the template file that needs
> >> to be accessed in this case be createimage.ppc64.tmpl?
> > 
> > This is actually a check to make sure the image doesn't already exist
> > before trying to capture it.  So, it is good that it doesn't find it.
> 
> If possible, could you please provide me with the details of the steps
> that take place here? If there is any documentation available regarding
> this, that would work too. You said the image must not already exist
> before it is captured; how does VCL capture the images? Does it make
> use of the images that are already generated using partimage? If so, in
> what places does it look for the images?
> 
> Sorry for asking too many questions. I could trace the scripts to check
> the flow, but, that would take a lot of time. You have been really
> patient with all my queries, appreciate that.
> 
> Thanks
> Sunil
> 
> > It sounds like you're almost there.  Great work!
> > 
> > Josh
> > 
> >> I have attached a log at the end of the mail. I am not sure where I have
> >> gone wrong with the VCL configuration.
> >> 
> >> -Sunil
> >> 
> >> -----
> >> 
> >> rh5image-power010701bi34-v0 image creation failed
> >> ------------------------------------------------------------------------
> >> time: 2011-07-05 11:03:25
> >> caller: image.pm:reservation_failed(385)
> >> ( 0) image.pm, reservation_failed (line: 385)
> >> (-1) image.pm, process (line: 167)
> >> (-2) vcld, make_new_child (line: 568)
> >> (-3) vcld, main (line: 346)
> >> ------------------------------------------------------------------------
> >> management node: web1.bluegrit.cs.umbc.edu
> >> reservation PID: 9866
> >> parent vcld PID: 19110
> >> 
> >> request ID: 30
> >> reservation ID: 30
> >> request state/laststate: image/image
> >> request start time: 2011-07-05 11:03:20
> >> request end time: 2011-07-05 12:03:20
> >> for imaging: no
> >> log ID: none
> >> 
> >> computer: power01.bluegrit.cs.umbc.edu
> >> computer id: 2
> >> computer type: blade
> >> computer eth0 MAC address:<undefined>
> >> computer eth1 MAC address:<undefined>
> >> computer private IP address: 172.20.106.1
> >> computer public IP address: 172.20.106.1
> >> computer in block allocation: no
> >> provisioning module: VCL::Module::Provisioning::xCAT2
> >> 
> >> image: rh5image-power010701bi34-v0
> >> image display name: power010701bi
> >> image ID: 34
> >> image revision ID: 34
> >> image size: 1450 MB
> >> use Sysprep: yes
> >> root access: yes
> >> image owner ID: 1
> >> image owner affiliation: Local
> >> image revision date created: 2011-07-05 11:03:25
> >> image revision production: yes
> >> OS module: VCL::Module::OS::Linux
> >> 
> >> user: admin
> >> user name: vcl admin
> >> user ID: 1
> >> user affiliation: Local
> >> ------------------------------------------------------------------------
> >> RECENT LOG ENTRIES FOR THIS PROCESS:
> >> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_os_object(304)|
> >> VCL::Module::OS::Linux OS object created for
> >> rh5image-power010701bi34-v0, address: 88fb070
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(110)|XCATROOT
> >> environment variable is not set, using /opt/xcat
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(128)|xCAT root
> >> path found: /opt/xcat
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(130)|xCAT module
> >> initialized
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT
> >> environment variable is not set, using /opt/xcat
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root
> >> path found: /opt/xcat
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module
> >> initialized
> >> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(420)|
> >> VCL::Module::Provisioning::xCAT2 module loaded
> >> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_mn_os_object(335)|
> >> management node OS object has already been created, address: 88f23b0,
> >> returning 1
> >> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:new(200)|
> >> VCL::Module::Provisioning::xCAT2 object created for computer power01,
> >> address: 88fb0e0
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT
> >> environment variable is not set, using /opt/xcat
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root
> >> path found: /opt/xcat
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module
> >> initialized
> >> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(426)|
> >> VCL::Module::Provisioning::xCAT2 provisioner object created for power01,
> >> address: 88fb0e0
> >> 2011-07-05 11:03:25|9866|30:30|image|State.pm:initialize(126)|returning 1
> >> 2011-07-05 11:03:25|9866|30:30|image|vcld:make_new_child(565)|VCL::image
> >> object created and initialized
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:mail(1268)|SUCCESS --
> >> Sending mail To:shrusun@gmail.com, VCL IMAGE Creation Started:
> >> rh5image-power010701bi34-v0
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2434)|image
> >> OS install type: partimage
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2910)|
> >> management node identifier argument was not specified
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|
> >> xCAT.pm:_get_image_repository_path(2932)|attempting to determine
> >> repository path for image on web1.bluegrit.cs.umbc.edu:
> >> |9866|30:30|image| image id: 34
> >> |9866|30:30|image| OS name: rh5image
> >> |9866|30:30|image| OS type: linux
> >> |9866|30:30|image| OS install type: partimage
> >> |9866|30:30|image| OS source path: image
> >> |9866|30:30|image| architecture: x86_64
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2996)|
> >> did not find any images under /tftpboot/xcat//linux_image/x86_64 on
> >> web1.bluegrit.cs.umbc.edu
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(3006)|
> >> returning repository path for web1.bluegrit.cs.umbc.edu:
> >> /tftpboot/xcat//image/x86_64
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|image
> >> repository path: /tftpboot/xcat//image/x86_64
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_command(9010)|executed
> >> command: du -c /tftpboot/xcat//image/x86_64/*rh5image-power010701bi34-v0*
> >> 2>&1 | grep total 2>&1, pid: 9877, exit status: 0, output:
> >> |9866|30:30|image| 0 total
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2506)|image
> >> does NOT exist: rh5image-power010701bi34-v0
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2084)|
> >> management node identifier argument was not specified
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2115)|
> >> attempting to determine template path for image:
> >> |9866|30:30|image| image name: rh5image-power010701bi34-v0
> >> |9866|30:30|image| OS install type: partimage
> >> |9866|30:30|image| OS source path: image
> >> |9866|30:30|image| xCAT 2.x OS source path: image
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2123)|
> >> returning: /opt/xcat/share/xcat/install/image
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2518)|
> >> template repository path for rh5image-power010701bi34-v0:
> >> /opt/xcat/share/xcat/install/image
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2530)|
> >> template file does not exist:
> >> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2570)|image
> >> rh5image-power010701bi34-v0 does NOT exist on this management node
> >> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(145)|image
> >> rh5image-power010701bi34-v0 does not exist in the repository
> >> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|
> >> data structure updated:
> >> $self->request_data->{reservation}{30}{image}{lastupdate}
> >> 
> >> |9866|30:30|image| image_lastupdate = 2011-07-05 11:03:25
> >> 
> >> 2011-07-05
> >> 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data
> >> structure updated:
> >> $self->request_data->{reservation}{30}{imagerevision}{datecreated}
> >> 
> >> |9866|30:30|image| imagerevision_date_created = 2011-07-05 11:03:25
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(161)|calling
> >> provisioning module's capture() subroutine
> >> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|
> >> image=rh5image-power010701bi34-v0, computer=power01
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|
> >> executing SSH command on power01:
> >> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
> >> |StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root
> >> |currentimage.txt; chmod 777 currentimage.txt' 2>&1
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|
> >> run_ssh_command output:
> >> |9866|30:30|image| Permission denied, please try again.
> >> |9866|30:30|image| Permission denied, please try again.
> >> |9866|30:30|image| Permission denied
> >> |(publickey,gssapi-with-mic,password).
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH
> >> command executed on power01, command:
> >> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
> >> |StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root
> >> |currentimage.txt; chmod 777 currentimage.txt' 2>&1
> >> |9866|30:30|image| returning (255, "Permission denied, please try ...")
> >> 
> >> 2011-07-05
> >> 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5685)|updated
> >> ownership and permissions on currentimage.txt
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|
> >> executing SSH command on power01:
> >> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
> >> |StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e
> >> |"rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\n
> >> |imagerevision_id=34\r\nimagerevision_datecreated=2011-07-05
> >> |11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu"
> >> |> currentimage.txt && cat currentimage.txt' 2>&1
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|
> >> run_ssh_command output:
> >> |9866|30:30|image| Permission denied, please try again.
> >> |9866|30:30|image| Permission denied, please try again.
> >> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH
> >> command executed on power01, command:
> >> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
> >> |StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e
> >> |"rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\n
> >> |imagerevision_id=34\r\nimagerevision_datecreated=2011-07-05
> >> |11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu"
> >> |> currentimage.txt && cat currentimage.txt' 2>&1
> >> |9866|30:30|image| returning (255, "Permission denied, please try ...")
> >> |9866|30:30|image| ---- WARNING ----
> >> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|
> >> |utils.pm:write_currentimage_txt(5699)|failed to create
> >> |currentimage.txt file on power01:
> >> |9866|30:30|image| Permission denied, please try again.
> >> |9866|30:30|image| Permission denied, please try again.
> >> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
> >> |9866|30:30|image| ( 0) utils.pm, write_currentimage_txt (line: 5699)
> >> |9866|30:30|image| (-1) xCAT2.pm, capture (line: 779)
> >> |9866|30:30|image| (-2) image.pm, process (line: 162)
> >> |9866|30:30|image| (-3) vcld, make_new_child (line: 568)
> >> |9866|30:30|image| (-4) vcld, main (line: 346)
> >> |9866|30:30|image| ---- WARNING ----
> >> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|
> >> |xCAT2.pm:capture(783)|unable to update currentimage.txt on power01
> >> |9866|30:30|image| ( 0) xCAT2.pm, capture (line: 783)
> >> |9866|30:30|image| (-1) image.pm, process (line: 162)
> >> |9866|30:30|image| (-2) vcld, make_new_child (line: 568)
> >> |9866|30:30|image| (-3) vcld, main (line: 346)
> >> |9866|30:30|image| ---- WARNING ----
> >> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|
> >> |image.pm:process(166)|rh5image-power010701bi34-v0 image failed to be
> >> |captured by provisioning module
> >> |9866|30:30|image| ( 0) image.pm, process (line: 166)
> >> |9866|30:30|image| (-1) vcld, make_new_child (line: 568)
> >> |9866|30:30|image| (-2) vcld, main (line: 346)
> >> 
> >> 2011-07-05 11:03:25|9866|30:30|image|
> >> DataStructure.pm:get_computer_private_ip_address(1581)|attempting to
> >> retrieve private IP address for computer: power01
> >> 2011-07-05 11:03:25|9866|30:30|image|
> >> DataStructure.pm:get_computer_private_ip_address(1585)|retrieved
> >> contents of /etc/hosts on this management node, contains 158 lines
> >> 2011-07-05 11:03:25|9866|30:30|image|
> >> DataStructure.pm:get_computer_private_ip_address(1645)|returning IP
> >> address from /etc/hosts file: 172.20.106.1
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:is_inblockrequest(6163)|
> >> zero rows were returned from database select
> >> 2011-07-05 11:03:25|9866|30:30|image|
> >> DataStructure.pm:get_image_affiliation_name(2035)|image owner id: 1
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database
> >> requested (information_schema) does not match handle stored in
> >> $ENV{dbh} (vcl:)
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database
> >> handle stored in $ENV{dbh}
> >> 2011-07-05 11:03:25|9866|30:30|image|
> >> DataStructure.pm:retrieve_user_data(1352)|attempting to retrieve and
> >> store data for user: user.id = '1'
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database
> >> requested (vcl) does not match handle stored in $ENV{dbh}
> >> (information_schema:)
> >> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database
> >> handle stored in $ENV{dbh}
> >> 2011-07-05 11:03:25|9866|30:30|image|
> >> DataStructure.pm:retrieve_user_data(1415)|data has been retrieved for
> >> user: admin (id: 1)
> >> 
> >> On 6/24/11 10:13 AM, Josh Thompson wrote:
> >>> Sunil,
> >>> 
> >>> "nodeset <nodename> image" sets up all the xCAT stuff so that the next
> >>> time the node is booted, it will boot the stateless/statelite image and
> >>> capture an image of the node.
> >>> 
> >>> Can you double check that you have 'os' in the nodetype table set to
> >>> image for the node you are using?  If you look in the partimageng.pm
> >>> xCAT module, you see toward the top where it registers the
> >>> "handled_commands".  The "mk" gets stripped off.  So, that module is
> >>> registering "install" and "image" for os type = "image".  As long as
> >>> you have os in the nodetype table set to image, it should be using
> >>> that module.
> >>> 
> >>> You will need to make sure you have all of the required files in
> >>> locations using 'ppc64' as the arch.
> >>> 
> >>> Josh
> >>> 
> >>> On Wednesday June 22, 2011, Sunil Venkatesh wrote:
> >>>> Hi,
> >>>> 
> >>>> Update !
> >>>> 
> >>>> I was able to fix the problem that I was facing with the scripts by
> >>>> disabling the firewall. But I still have a problem with the command:
> >>>> 
> >>>> nodeset <nodename> image
> >>>> 
> >>>> Unless this error is fixed, I don't think partimage will work. Am I
> >>>> right here?
> >>>> 
> >>>> Thanks,
> >>>> Sunil
> >>>> 
> >>>> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <su...@umbc.edu> wrote:
> >>>>> Josh,
> >>>>> 
> >>>>> I have reached a point where I am able to boot the ppc using the
> >>>>> statelite images created using genimage. But, I was wondering how
> >>>>> significant the following command is.
> >>>>> 
> >>>>> nodeset <nodename> image
> >>>>> 
> >>>>> I got the same error that Prem had mentioned.
> >>>>> 
> >>>>> 
> >>>>> power01: Error: Unable to identify plugin for this command, check
> >>>>> relevant tables: nodetype.os
> >>>>> Error: Some nodes failed to set up image resources, aborting
> >>>>> 
> >>>>> I tried changing the 'os' field to 'image' under nodetype, that
> >>>>> doesn't seem to help. I get the same error even after the change.
> >>>>> 'arch' in my case is set to 'ppc64'.
> >>>>> 
> >>>>> 
> >>>>> Also, I think partimage plugin needs to be changed to support the ppc
> >>>>> architecture, from what you had mentioned in the other thread.
> >>>>> 
> >>>>> I am not sure what the command 'nodeset <nodename> image' does, but
> >>>>> I am able to boot the statelite images by making changes to the
> >>>>> yaboot configuration files. The ppc blade currently uses LVM, which
> >>>>> needs to be replaced with ext2/ext3 from what I read in the other
> >>>>> thread, am I right? Also, just out of curiosity, I left the statelite
> >>>>> image to boot with my current settings. I can see the xcat script
> >>>>> throwing an error:
> >>>>> 
> >>>>> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No
> >>>>> such file or directory
> >>>>> /tmp/mypostscript: line 16: updateflag.awk: command not found
> >>>>> 
> >>>>> Both getpostscript.awk and updateflag.awk are missing from the
> >>>>> rootimg created by genimage. Is there any place I could find these
> >>>>> scripts?
> >>>>> 
> >>>>> Also, please correct me if there is anything wrong with the procedure
> >>>>> I am following.
> >>>>> 
> >>>>> 
> >>>>> Thanks in advance.
> >>>>> 
> >>>>> Regards,
> >>>>> Sunil
> >>>>> 
> >>>>> On 6/13/11 4:13 PM, Josh Thompson wrote:
> >>>>>> Sunil,
> >>>>>> 
> >>>>>> From what I remember, I didn't have to do much to the rootimg.gz
> >>>>>> image to make it work.  I created the files I supply before xCAT
> >>>>>> started using "statelite" instead of "stateless".  I think statelite
> >>>>>> uses NFS to mount the image, and stateless uses an image file
> >>>>>> downloaded to the node and run out of RAM.  Since generating a
> >>>>>> statelite image is pretty straightforward use of xCAT, you may want
> >>>>>> to ask on the xcat-user email list for help with it.
> >>>>>> 
> >>>>>> Unless you can have the admins of the other dhcp server on your
> >>>>>> network exclude the MAC addresses of your blades, you'll need to
> >>>>>> create a separate private network to control your VCL stuff, either
> >>>>>> physically or with VLANs.
> >>>>>> 
> >>>>>> If they can exclude the MACs, you can set up the dhcp server on your
> >>>>>> management node to only answer to requests from your blades.
> >>>>>> 
> >>>>>> Josh
> >>>>>> 
> >>>>>> On Monday June 13, 2011, Sunil Venkatesh wrote:
> >>>>>>> Josh,
> >>>>>>> 
> >>>>>>> Again, Thank you for your valuable inputs. I have got to the point
> >>>>>>> where I can get the compute node to boot using the stateless
> >>>>>>> images. I had to manually configure the netboot since we already
> >>>>>>> had a DHCP server which is not the same as our Management node.
> >>>>>>> Since our setup is not in an isolated environment, I could not let
> >>>>>>> xcat handle the dhcp & netboot configuration (it messed up our
> >>>>>>> network configuration when I let xcat handle it; we had 2 dhcp
> >>>>>>> servers running at that point). Are you aware of any way to let
> >>>>>>> xcat handle such scenarios?
> >>>>>>> 
> >>>>>>> Although I am able to get the compute node to boot with the kernel
> >>>>>>> image & initrd, and NFS mount the rootimg that was generated
> >>>>>>> using 'genimage', I am getting the following error on the compute
> >>>>>>> node's console:
> >>>>>>> 
> >>>>>>>        FATAL error: could not get the entries from litefile table...
> >>>>>>> 
> >>>>>>> After going through the init scripts, I found that the 'xCATCmd'
> >>>>>>> binary is not present in the rootimg. I am currently checking the
> >>>>>>> xcat packages for its availability. If you know the procedure to
> >>>>>>> get it onto the compute node, please let me know.
> >>>>>>> 
> >>>>>>> Appreciate your support.
> >>>>>>> 
> >>>>>>> Thanking you,
> >>>>>>> Sunil
> >>>>>>> 
> >>>>>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
> >>>>>>>> Sunil,
> >>>>>>>> 
> >>>>>>>> I don't recall seeing any documentation on those parts.  I had to
> >>>>>>>> poke around looking at parts of xCAT to see how it worked.  It's
> >>>>>>>> been a few years since I did that; so, I don't remember much about
> >>>>>>>> the process. My recommendation would be to start looking at things
> >>>>>>>> in the rootimg.gz image.  Looking at it now, I see that
> >>>>>>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It looks
> >>>>>>>> like it downloads all of the postscripts from the management node
> >>>>>>>> and then runs getpostscript.awk, which issues a command to xcatd to
> >>>>>>>> get the primary postscript for that machine.  I've forgotten how
> >>>>>>>> xcatd then builds the primary postscript. I do remember that in
> >>>>>>>> the partimageng.pm module, I had it add the partimageng
> >>>>>>>> postscript.
> >>>>>>>> 
> >>>>>>>> So, you'll really have to start digging through how the xcat
> >>>>>>>> postscript system works.
> >>>>>>>> 
> >>>>>>>> Josh
> >>>>>>>> 
> >>>>>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
> >>>>>>>>> Josh,
> >>>>>>>>> 
> >>>>>>>>> Is there any place I could find some details on
> >>>>>>>>> 
> >>>>>>>>> "... /Once the compute node is booted with the stateless
> >>>>>>>>> image, it uses NFS to mount some things from the management node,
> >>>>>>>>> and then runs some xcat postscripts,/.... "
> >>>>>>>>> 
> >>>>>>>>> I have the stateless images ready with partimage compiled for
> >>>>>>>>> PPC. For the compute node (power 7) to boot using the stateless
> >>>>>>>>> images, I need to configure yaboot instead of pxeboot (which is
> >>>>>>>>> specific to x86). I wanted to know where in the startup files
> >>>>>>>>> the execution of partimage and the NFS mount is configured. Is
> >>>>>>>>> it configured by the "genimage" command itself? Considering the
> >>>>>>>>> way in which the nodes are configured in the network, it would
> >>>>>>>>> not be a good idea to let xcat take care of configuring the
> >>>>>>>>> details like DHCPD for netboot. So, I need to make changes to
> >>>>>>>>> the configuration files manually, which is why this query came
> >>>>>>>>> up.
> >>>>>>>>> 
> >>>>>>>>> Thanks in advance.
> >>>>>>>>> 
> >>>>>>>>> Regards,
> >>>>>>>>> Sunil
> >>>>>>>>> 
> >>>>>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
> >>>>>>>>>> Sunil,
> >>>>>>>>>> 
> >>>>>>>>>> The "stateless" image I refer to is what is actually booted on
> >>>>>>>>>> the compute node containing the image to be captured.  It's
> >>>>>>>>>> called stateless because it is loaded completely in RAM and
> >>>>>>>>>> does not maintain any state when a reboot occurs.
> >>>>>>>>>> 
> >>>>>>>>>> The partimage binary is part of this stateless image and
> >>>>>>>>>> actually runs on the compute node.  It does not run on the
> >>>>>>>>>> management node. The management node does not have block level
> >>>>>>>>>> access to the disk on the compute node to be able to capture
> >>>>>>>>>> the image from the disk.
> >>>>>>>>>> 
> >>>>>>>>>> I'll try to describe the process a little better.  The
> >>>>>>>>>> management node issues a reboot command to the compute node.
> >>>>>>>>>> The compute node uses PXE to load and boot a kernel (vmlinuz),
> >>>>>>>>>> initial RAM disk (initrd.img), and a root filesystem
> >>>>>>>>>> (rootimg.gz) from the management node.  All three of these
> >>>>>>>>>> together make up the stateless image.  Once the compute node is
> >>>>>>>>>> booted with the stateless image, it uses NFS to mount some
> >>>>>>>>>> things from the management node, and then runs some xcat
> >>>>>>>>>> postscripts, one of which is the partimageng postscript.  This
> >>>>>>>>>> postscript determines what partitions are on the compute node
> >>>>>>>>>> and, depending on how the postscript is configured, uses
> >>>>>>>>>> partimage or partimageng to capture an image of the compute
> >>>>>>>>>> node disk that is then saved to the management node.  When it
> >>>>>>>>>> is finished capturing the image, it notifies xcat on the
> >>>>>>>>>> management node and then reboots.  xcat reconfigures itself to
> >>>>>>>>>> tell the compute node to boot off of disk at next boot.  When
> >>>>>>>>>> the compute node comes up, it uses PXE to ask the management
> >>>>>>>>>> node how to boot.  The management node tells it to boot off of
> >>>>>>>>>> disk.
> >>>>>>>>>> 
> >>>>>>>>>> I hope that clarifies how the system works.  If any of it is
> >>>>>>>>>> unclear, please ask for further clarification.
> >>>>>>>>>> 
> >>>>>>>>>> Josh
> >>>>>>>>>> 
> >>>>>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
> >>>>>>>>>>> Josh,
> >>>>>>>>>>> 
> >>>>>>>>>>> I had one more clarification.
> >>>>>>>>>>> 
> >>>>>>>>>>> partimage binaries run in the management node to capture an
> >>>>>>>>>>> (stateless) image from the compute node right? In that case, is
> >>>>>>>>>>> there a need for these binaries to go into the rootimg.gz??
> >>>>>>>>>>> 
> >>>>>>>>>>> My assumption is, partimage runs on the management node (an
> >>>>>>>>>>> intel blade in our case) to capture a stateless image from a
> >>>>>>>>>>> compute node (a power 7 blade) and stores these images under "
> >>>>>>>>>>> /install " of the management node. Please correct me if I am
> >>>>>>>>>>> wrong here.
> >>>>>>>>>>> 
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Sunil
> >>>>>>>>>>> 
> >>>>>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
> >>>>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
> >>>>>>>>>>>> Hash: SHA1
> >>>>>>>>>>>> 
> >>>>>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> I used the steps that were mentioned under
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need
> >>>>>>>>>>>>> to change references to x86 & x86_64 (as directories) to
> >>>>>>>>>>>>> reflect the ppc architecture, as the web page says "The
> >>>>>>>>>>>>> architecture for the node must always be set to x86 for
> >>>>>>>>>>>>> this..". I have with me the vmlinuz (kernel image) and initrd
> >>>>>>>>>>>>> for the capture process. The 2 nodeset commands
> >>>>>>>>>>>> 
> >>>>>>>>>>>> By this, do you mean you have vmlinuz and initrd for your
> >>>>>>>>>>>> power blades, not the ones linked to off of the page you
> >>>>>>>>>>>> listed above? If you do, that's a good start.  However,
> >>>>>>>>>>>> you'll also need rootimg.gz. rootimg.gz is the root
> >>>>>>>>>>>> filesystem for the stateless image.  It also contains the
> >>>>>>>>>>>> partimage and partimageng binaries. Assuming partimage or
> >>>>>>>>>>>> partimageng can actually capture partitions from power
> >>>>>>>>>>>> systems, you'll need to compile at least one of them to run
> >>>>>>>>>>>> on power.  For the rootimg.gz image I provided, I compiled
> >>>>>>>>>>>> them statically so that I didn't have to worry about
> >>>>>>>>>>>> including any library dependencies in rootimg.gz.
> >>>>>>>>>>>> 
> >>>>>>>>>>>> It would be a good idea to research how to use xcat's genimage
> >>>>>>>>>>>> command to generate stateless images to learn how to do this.
> >>>>>>>>>>>> 
> >>>>>>>>>>>> If there's any part of the above that you don't fully
> >>>>>>>>>>>> understand, please ask me to clarify it.  Until you have a
> >>>>>>>>>>>> stateless image that you can deploy to your power blades,
> >>>>>>>>>>>> there's no point in trying to debug any VCL specific items.
> >>>>>>>>>>>> 
> >>>>>>>>>>>> Josh
> >>>>>>>>>>>> - --
> >>>>>>>>>>>> - -------------------------------
> >>>>>>>>>>>> Josh Thompson
> >>>>>>>>>>>> VCL Developer
> >>>>>>>>>>>> North Carolina State University
> >>>>>>>>>>>> 
> >>>>>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
> >>>>>>>>>>>> -----BEGIN PGP SIGNATURE-----
> >>>>>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
> >>>>>>>>>>>> 
> >>>>>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/**LQcNdtPQNnVgCbB9ZFJn0+C45RC/**
> >>>>>>>>>>>> g75RqGZY/j
> >>>>>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4O**Ma
> >>>>>>>>>>>> =exBV
> >>>>>>>>>>>> -----END PGP SIGNATURE-----
> > 
> > - --
> > - -------------------------------
> > Josh Thompson
> > VCL Developer
> > North Carolina State University
> > 
> > my GPG/PGP key can be found at pgp.mit.edu
> > -----BEGIN PGP SIGNATURE-----
> > Version: GnuPG v2.0.17 (GNU/Linux)
> > 
> > iEYEARECAAYFAk4VzQoACgkQV/LQcNdtPQM8YQCePg3O5vp5AXEhiO+5aIRIUO/S
> > 6IgAn1Xt4ytGnmxpfJVteCScFi0dRz15
> > =Yls1
> > -----END PGP SIGNATURE-----
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)

iEYEARECAAYFAk4V9PEACgkQV/LQcNdtPQM6ZgCfaPLJh9MuEVLqRYdHNLqC8BzQ
JOsAn35U1e4V+xuxFPajb2rVVcg4gril
=CWDr
-----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Thanks, Josh. My professor was asking about the details of the VCL
workshop in NC. Are you aware of them?


Please bear with me; my comments are inline.

On 7/7/11 11:13 AM, Josh Thompson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Tuesday July 05, 2011, Sunil Venkatesh wrote:
>> Hi Josh,
>>
>> I was able to get the following things done with respect to getting VCL
>> to work on POWER.
>>
>> 1. Made modifications in the xcat tables to get the capture process
>> working with statelite images instead of stateless images, particularly
>> the noderes & bootparams tables.
>>
>> 2. Used partimage to capture the images (did NOT set usepartimageng to 1).
>>
>> -rw-r--r-- 1 root root    0 Jul  5 16:38 compute.img.capturedone
>> -rw-r--r-- 1 root root    0 Jul  5 15:58 compute.img.capturefailed
>> -rw------- 1 root root 6.5M Jul  5 16:07 compute-parta2.gz
>> -rw------- 1 root root 679M Jul  5 16:10 compute-parta3.gz
>> -rw------- 1 root root  23M Jul  5 16:38 compute-parta6.gz
>> -rw-r--r-- 1 root root  512 Jul  5 16:07 compute-sda.mbr
>> -rw-r--r-- 1 root root  363 Jul  5 16:07 compute-sda.sfdisk
>>
>>
>> 2 partitions including the boot partition present on the blade were
>> captured under /install/image/ppc64/. Initially, RHEL 5 was installed on
>> a 600 GB partition due to which the capture process failed. The image of
>> the partition was generated once the partition size was reduced to 6GB.
>> Is it necessary for me to use partimage-ng instead of partimage itself?
> Are you asking if you need to use partimage-ng for partitions that are 600GB?
> If so, I don't really know.  We've never dealt with partitions that large.
Here, I am just asking whether images captured using partimage are
recognized by VCL, or whether I am required to use partimage-ng. From
your earlier emails to Prem, I gathered that the only difference between
partimage and partimage-ng (after setting usepartimageng to 1) is that
the former generates images ending in .gz and the latter generates .img.
Am I right here? Also, I was able to get the 600GB partition captured;
since the partition was empty, it resulted in a ~17MB image file.
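
For reference, a capture directory like the one listed above can be inspected with a few lines of shell. This is only a sketch under assumptions: the function names `capture_status` and `count_partition_images` are made up for illustration, the meaning of the `.capturedone` and `.capturefailed` marker files is inferred from their names rather than taken from the xCAT or VCL sources, and the `.img` suffix for partimage-ng output is the unconfirmed claim from the paragraph above.

```shell
#!/bin/sh
# Hypothetical helper: report the state of a capture directory.
# Assumption: <name>.img.capturedone marks a finished capture and
# <name>.img.capturefailed marks a failed attempt. A stale "failed"
# marker from an earlier run may coexist with "done", so "done" wins.
capture_status() {
    dir=$1
    name=$2
    if [ -e "$dir/$name.img.capturedone" ]; then
        echo "done"
    elif [ -e "$dir/$name.img.capturefailed" ]; then
        echo "failed"
    else
        echo "unknown"
    fi
}

# Hypothetical helper: count per-partition archives for an image.
# partimage output ends in .gz in the listing above; partimage-ng
# output is assumed to end in .img.
count_partition_images() {
    dir=$1
    name=$2
    count=0
    for f in "$dir/$name"-part*.gz "$dir/$name"-part*.img; do
        # An unmatched glob stays literal and fails the -f test.
        [ -f "$f" ] && count=$((count + 1))
    done
    echo "$count"
}
```

With the files from the listing, `capture_status /install/image/ppc64 compute` would report the capture as done and `count_partition_images /install/image/ppc64 compute` would count the three `compute-parta*.gz` archives.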
>
>> When proceeding further with "vcld --setup", the script was not able to
>> find the images that were created using partimage. The options that are
>> provided in the script do not allow for selecting an architecture
>> other than x86/x86_64.
> You'll need to modify the vcld image.pm module.  Look in
> /usr/local/vcl/lib/VCL.  In image.pm, look for the function
> 'setup_capture_base_image'; then, find 'my @architecture_choices' and add
> 'ppc' as another option.

As a matter of fact, I tried this step. But the
_get_image_repository_path function in
/usr/local/vcl/lib/VCL/Module/Provisioning/xCAT.pm does not recognize
the architecture when I choose ppc/ppc64 in the menu. On line 2922 in
the same file, image_architecture is set to undefined. I think the list
of supported architectures is stored in some MySQL table. I haven't
checked this yet; I was trying to get VCL to recognize the images
as x86/x86_64 by setting up soft links in the search paths of VCL.
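
The soft-link workaround mentioned above could look roughly like the following. It is a sketch under assumptions: the example paths are the ones that appear in this thread (captured files under /install/image/ppc64, vcld searching /tftpboot/xcat/image/x86_64 according to the log), the function name `link_images` is hypothetical, and nothing here has been verified against VCL 2.2.1.

```shell
#!/bin/sh
# Hypothetical helper: expose every file in a source directory under a
# second directory through symbolic links, without copying image data.
# The source path should be absolute so the links resolve from anywhere.
link_images() {
    src=$1   # e.g. /install/image/ppc64 (where partimage wrote the files)
    dst=$2   # e.g. /tftpboot/xcat/image/x86_64 (where vcld looked, per the log)
    mkdir -p "$dst" || return 1
    for f in "$src"/*; do
        [ -e "$f" ] || continue   # empty directory: the glob did not match
        ln -sfn "$f" "$dst/$(basename "$f")" || return 1
    done
}
```

Links keep a single copy of the large partition archives, which matters when one of them is several hundred megabytes; whether vcld follows them correctly is untested here.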
>> Also, in the error log vcld is looking for
>>
>> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
>>
>> and cannot find the template file. Should the template file that needs
>> to be accessed in this case be createimage.ppc64.tmpl?
> This is actually a check to make sure the image doesn't already exist before
> trying to capture it.  So, it is good that it doesn't find it.
If possible, could you please give me the details of the steps that
take place here? If there is any documentation available on this,
that would work too. You said "image doesn't already exist before
trying to capture it": how does VCL capture the images? Does it make
use of the images that are already generated using partimage? If so,
where does it look for the images?
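
Going only by the log attached to this message, the existence check looks in two places: a repository directory (/tftpboot/xcat//image/x86_64 in the log) for files matching the image name, and a template directory (/opt/xcat/share/xcat/install/image) for a .tmpl file named after the image. A minimal shell sketch of those two tests follows. The function names are hypothetical, and how vcld combines the two results is not shown in the log, so the sketch keeps them separate.

```shell
#!/bin/sh
# Hypothetical helpers mirroring the two checks visible in the vcld log.

# True if any file matching *<image name>* exists in the repository
# directory. The log runs "du -c <repo>/*<name>*" and reads the total
# for the same purpose.
repo_has_image() {
    repo=$1   # e.g. /tftpboot/xcat/image/x86_64
    name=$2   # e.g. rh5image-power010701bi34-v0
    for f in "$repo"/*"$name"*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# True if <template dir>/<image name>.tmpl exists.
template_exists() {
    tmpl_dir=$1   # e.g. /opt/xcat/share/xcat/install/image
    name=$2
    [ -e "$tmpl_dir/$name.tmpl" ]
}
```

In the attached log both tests come back negative for rh5image-power010701bi34-v0, which is the expected state before a first capture.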

Sorry for asking so many questions. I could trace the scripts to check
the flow, but that would take a lot of time. You have been really
patient with all my queries; I appreciate that.

Thanks
Sunil
> It sounds like you're almost there.  Great work!
>
> Josh
>
>> I have attached a log at the end of the mail. I am not sure where I have
>> gone wrong with the VCL configuration.
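
One thing the log below does show is that every SSH command from the management node to power01 ends in "Permission denied (publickey,gssapi-with-mic,password)". A quick way to test key-based root login outside of vcld is sketched here. The function name `check_key_login` is hypothetical; the key path, user, and port are the ones that appear in the logged ssh command, and `BatchMode=yes` is added so that ssh fails instead of prompting for a password.

```shell
#!/bin/sh
# Hypothetical helper: test non-interactive, key-based root login using
# the same key, user and port that appear in the vcld log.
check_key_login() {
    host=$1
    key=${2:-/etc/vcl/vcl.key}
    # BatchMode=yes disables password prompts, so a missing or rejected
    # key shows up as a non-zero exit status.
    if ssh -i "$key" -o BatchMode=yes -o StrictHostKeyChecking=no \
           -l root -p 22 "$host" true >/dev/null 2>&1; then
        echo "key login to $host works"
    else
        echo "key login to $host failed"
        return 1
    fi
}
```

Running `check_key_login power01` on the management node would show whether the key in /etc/vcl/vcl.key is accepted by the compute node.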
>>
>> -Sunil
>>
>> -----
>>
>> rh5image-power010701bi34-v0 image creation failed
>> ------------------------------------------------------------------------
>> time: 2011-07-05 11:03:25
>> caller: image.pm:reservation_failed(385)
>> ( 0) image.pm, reservation_failed (line: 385)
>> (-1) image.pm, process (line: 167)
>> (-2) vcld, make_new_child (line: 568)
>> (-3) vcld, main (line: 346)
>> ------------------------------------------------------------------------
>> management node: web1.bluegrit.cs.umbc.edu
>> reservation PID: 9866
>> parent vcld PID: 19110
>>
>> request ID: 30
>> reservation ID: 30
>> request state/laststate: image/image
>> request start time: 2011-07-05 11:03:20
>> request end time: 2011-07-05 12:03:20
>> for imaging: no
>> log ID: none
>>
>> computer: power01.bluegrit.cs.umbc.edu
>> computer id: 2
>> computer type: blade
>> computer eth0 MAC address: <undefined>
>> computer eth1 MAC address: <undefined>
>> computer private IP address: 172.20.106.1
>> computer public IP address: 172.20.106.1
>> computer in block allocation: no
>> provisioning module: VCL::Module::Provisioning::xCAT2
>>
>> image: rh5image-power010701bi34-v0
>> image display name: power010701bi
>> image ID: 34
>> image revision ID: 34
>> image size: 1450 MB
>> use Sysprep: yes
>> root access: yes
>> image owner ID: 1
>> image owner affiliation: Local
>> image revision date created: 2011-07-05 11:03:25
>> image revision production: yes
>> OS module: VCL::Module::OS::Linux
>>
>> user: admin
>> user name: vcl admin
>> user ID: 1
>> user affiliation: Local
>> ------------------------------------------------------------------------
>> RECENT LOG ENTRIES FOR THIS PROCESS:
>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_os_object(304)|
>> VCL::Module::OS::Linux OS object created for rh5image-power010701bi34-v0,
>> address: 88fb070
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(110)|XCATROOT
>> environment variable is not set, using /opt/xcat
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(128)|xCAT root
>> path found: /opt/xcat
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(130)|xCAT module
>> initialized
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT
>> environment variable is not set, using /opt/xcat
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root
>> path found: /opt/xcat
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module
>> initialized
>> 2011-07-05 11:03:25|9866|30:30|image|
>> Module.pm:create_provisioning_object(420)|
>> VCL::Module::Provisioning::xCAT2 module loaded
>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_mn_os_object(335)|
>> management node OS object has already been created, address: 88f23b0,
>> returning 1
>> 2011-07-05 11:03:25|9866|30:30|image|Module.pm:new(200)|
>> VCL::Module::Provisioning::xCAT2 object created for computer power01,
>> address: 88fb0e0
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT
>> environment variable is not set, using /opt/xcat
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root
>> path found: /opt/xcat
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module
>> initialized
>> 2011-07-05 11:03:25|9866|30:30|image|
>> Module.pm:create_provisioning_object(426)|
>> VCL::Module::Provisioning::xCAT2 provisioner object created for power01,
>> address: 88fb0e0
>> 2011-07-05 11:03:25|9866|30:30|image|State.pm:initialize(126)|returning 1
>> 2011-07-05 11:03:25|9866|30:30|image|vcld:make_new_child(565)|VCL::image
>> object created and initialized
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:mail(1268)|SUCCESS --
>> Sending mail To:shrusun@gmail.com, VCL IMAGE Creation Started:
>> rh5image-power010701bi34-v0
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2434)|image
>> OS install type: partimage
>> 2011-07-05 11:03:25|9866|30:30|image|
>> xCAT.pm:_get_image_repository_path(2910)|management node identifier
>> argument was not specified
>>
>> 2011-07-05 11:03:25|9866|30:30|image|
>> xCAT.pm:_get_image_repository_path(2932)|attempting to determine
>> repository path for image on web1.bluegrit.cs.umbc.edu:
>> |9866|30:30|image| image id: 34
>> |9866|30:30|image| OS name: rh5image
>> |9866|30:30|image| OS type: linux
>> |9866|30:30|image| OS install type: partimage
>> |9866|30:30|image| OS source path: image
>> |9866|30:30|image| architecture: x86_64
>>
>> 2011-07-05 11:03:25|9866|30:30|image|
>> xCAT.pm:_get_image_repository_path(2996)|did not find any images under
>> /tftpboot/xcat//linux_image/x86_64 on web1.bluegrit.cs.umbc.edu
>> 2011-07-05 11:03:25|9866|30:30|image|
>> xCAT.pm:_get_image_repository_path(3006)|returning repository path for
>> web1.bluegrit.cs.umbc.edu: /tftpboot/xcat//image/x86_64
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|image
>> repository path: /tftpboot/xcat//image/x86_64
>>
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_command(9010)|executed
>> command: du -c /tftpboot/xcat//image/x86_64/*rh5image-power010701bi34-v0*
>> 2>&1 | grep total 2>&1, pid: 9877, exit status: 0, output:
>> |9866|30:30|image| 0 total
>>
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2506)|image
>> does NOT exist: rh5image-power010701bi34-v0
>> 2011-07-05 11:03:25|9866|30:30|image|
>> xCAT2.pm:_get_image_template_path(2084)|management node identifier
>> argument was not specified
>>
>> 2011-07-05 11:03:25|9866|30:30|image|
>> xCAT2.pm:_get_image_template_path(2115)|attempting to determine template
>> path for image:
>> |9866|30:30|image| image name: rh5image-power010701bi34-v0
>> |9866|30:30|image| OS install type: partimage
>> |9866|30:30|image| OS source path: image
>> |9866|30:30|image| xCAT 2.x OS source path: image
>>
>> 2011-07-05 11:03:25|9866|30:30|image|
>> xCAT2.pm:_get_image_template_path(2123)|returning:
>> /opt/xcat/share/xcat/install/image
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2518)|
>> template repository path for rh5image-power010701bi34-v0:
>> /opt/xcat/share/xcat/install/image
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2530)|
>> template file does not exist:
>> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2570)|image
>> rh5image-power010701bi34-v0 does NOT exist on this management node
>> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(145)|image
>> rh5image-power010701bi34-v0 does not exist in the repository
>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|
>> data structure updated:
>> $self->request_data->{reservation}{30}{image}{lastupdate}
>>
>> |9866|30:30|image| image_lastupdate = 2011-07-05 11:03:25
>>
>> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data
>> structure updated:
>> $self->request_data->{reservation}{30}{imagerevision}{datecreated}
>>
>> |9866|30:30|image| imagerevision_date_created = 2011-07-05 11:03:25
>>
>> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(161)|calling
>> provisioning module's capture() subroutine
>> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|
>> image=rh5image-power010701bi34-v0, computer=power01
>>
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|
>> executing SSH command on power01:
>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
>> |StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root
>> |currentimage.txt; chmod 777 currentimage.txt' 2>&1
>>
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|
>> run_ssh_command output:
>> |9866|30:30|image| Permission denied, please try again.
>> |9866|30:30|image| Permission denied, please try again.
>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>>
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH
>> command executed on power01, command:
>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
>> |StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root
>> |currentimage.txt; chmod 777 currentimage.txt' 2>&1
>> |9866|30:30|image| returning (255, "Permission denied, please try ...")
>>
>> 2011-07-05
>> 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5685)|updated
>> ownership and permissions on currentimage.txt
>>
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|
>> executing SSH command on power01:
>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
>> |StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e
>> |"rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimag
>> |erevision_id=34\r\nimagerevision_datecreated=2011-07-05
>> |11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.e
>> |du">   currentimage.txt&&   cat currentimage.txt' 2>&1
>>
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|
>> run_ssh_command output:
>> |9866|30:30|image| Permission denied, please try again.
>> |9866|30:30|image| Permission denied, please try again.
>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>>
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH
>> command executed on power01, command:
>> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o
>> |StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e
>> |"rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimag
>> |erevision_id=34\r\nimagerevision_datecreated=2011-07-05
>> |11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.e
>> |du">   currentimage.txt&&   cat currentimage.txt' 2>&1 9866|30:30|image|
>> |returning (255, "Permission denied, please try ...") 9866|30:30|image|
>> |---- WARNING ----
>> |9866|30:30|image| 2011-07-05
>> |11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5699)|failed to
>> |create currentimage.txt file on power01: 9866|30:30|image| Permission
>> |denied, please try again.
>> |9866|30:30|image| Permission denied, please try again.
>> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
>> |9866|30:30|image| ( 0) utils.pm, write_currentimage_txt (line: 5699)
>> |9866|30:30|image| (-1) xCAT2.pm, capture (line: 779)
>> |9866|30:30|image| (-2) image.pm, process (line: 162)
>> |9866|30:30|image| (-3) vcld, make_new_child (line: 568)
>> |9866|30:30|image| (-4) vcld, main (line: 346)
>> |9866|30:30|image| ---- WARNING ----
>> |9866|30:30|image| 2011-07-05
>> |11:03:25|9866|30:30|image|xCAT2.pm:capture(783)|unable to update
>> |currentimage.txt on power01 9866|30:30|image| ( 0) xCAT2.pm, capture
>> |(line: 783)
>> |9866|30:30|image| (-1) image.pm, process (line: 162)
>> |9866|30:30|image| (-2) vcld, make_new_child (line: 568)
>> |9866|30:30|image| (-3) vcld, main (line: 346)
>> |9866|30:30|image| ---- WARNING ----
>> |9866|30:30|image| 2011-07-05
>> |11:03:25|9866|30:30|image|image.pm:process(166)|rh5image-power010701bi34-
>> |v0 image failed to be captured by provisioning module 9866|30:30|image| (
>> |0) image.pm, process (line: 166)
>> |9866|30:30|image| (-1) vcld, make_new_child (line: 568)
>> |9866|30:30|image| (-2) vcld, main (line: 346)
>>
>> 2011-07-05
>> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
>> (1581)|attempting to retrieve private IP address for computer: power01
>> 2011-07-05
>> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
>> (1585)|retrieved contents of /etc/hosts on this management node, contains
>> 158 lines 2011-07-05
>> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
>> (1645)|returning IP address from /etc/hosts file: 172.20.106.1 2011-07-05
>> 11:03:25|9866|30:30|image|utils.pm:is_inblockrequest(6163)|zero rows were
>> returned from database select 2011-07-05
>> 11:03:25|9866|30:30|image|DataStructure.pm:get_image_affiliation_name(2035
>> )|image owner id: 1 2011-07-05
>> 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested
>> (information_schema) does not match handle stored in $ENV{dbh} (vcl:)
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database
>> handle stored in $ENV{dbh} 2011-07-05
>> 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1352)|attemp
>> ting to retrieve and store data for user: user.id = '1' 2011-07-05
>> 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested
>> (vcl) does not match handle stored in $ENV{dbh} (information_schema:)
>> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database
>> handle stored in $ENV{dbh} 2011-07-05
>> 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1415)|data
>> has been retrieved for user: admin (id: 1)
>>
>> On 6/24/11 10:13 AM, Josh Thompson wrote:
>>> Sunil,
>>>
>>> "nodeset <nodename> image" sets up all the xCAT stuff so that the next
>>> time the node is booted, it will boot the stateless/statelite image and
>>> capture an image of the node.
>>>
>>> Can you double check that you have 'os' in the nodetype table set to
>>> image for the node you are using?  If you look in the partimageng.pm
>>> xCAT module, you see toward the top where it registers the
>>> "handled_commands".  The "mk" gets stripped off.  So, that module is
>>> registering "install" and "image" for os type = "image".  As long as you
>>> have os in the nodetype table set to image, it should be using that
>>> module.
>>>
>>> You will need to make sure you have all of the required files in
>>> locations using 'ppc64' as the arch.
>>>
>>> Josh
>>>
>>> On Wednesday June 22, 2011, Sunil Venkatesh wrote:
>>>> Hi,
>>>>
>>>> Update !
>>>>
>>>> I was able to fix the problem that I was facing with the scripts by
>>>> disabling the firewall. But I still have a problem with the command:
>>>>
>>>> nodeset <nodename> image
>>>>
>>>> Unless this error is fixed, I don't think partimage will work. Am I
>>>> right here?
>>>>
>>>> Thanks,
>>>> Sunil
>>>>
>>>> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <su...@umbc.edu> wrote:
>>>>> Josh,
>>>>>
>>>>> I have reached a point where I am able to boot the ppc using the
>>>>> statelite images created using genimage. But, I was wondering how
>>>>> significant the following command is.
>>>>>
>>>>> nodeset <nodename> image
>>>>>
>>>>> I got the same error that Prem had mentioned.
>>>>>
>>>>>
>>>>> power01: Error: Unable to identify plugin for this command, check
>>>>> relevant tables: nodetype.os
>>>>> Error: Some nodes failed to set up image resources, aborting
>>>>>
>>>>> I tried changing the 'os' field to 'image' under nodetype, that doesn't
>>>>> seem to help. I get the same error even after the change. 'arch' in my
>>>>> case is set to 'ppc64'.
>>>>>
>>>>>
>>>>> Also, I think partimage plugin needs to be changed to support the ppc
>>>>> architecture, from what you had mentioned in the other thread.
>>>>>
>>>>> I am not sure what the command 'nodeset <nodename> image' does, but I
>>>>> am able to boot the statelite images by making changes to the yaboot
>>>>> configuration files. The ppc blade currently uses LVM, which needs to
>>>>> be replaced with ext2/ext3 from what I read in the other thread; am
>>>>> I right? Also, just out of curiosity, I left the statelite image to
>>>>> boot with my current settings. I can see the xcat script throwing an
>>>>> error:
>>>>>
>>>>> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No such
>>>>> file or directory
>>>>> /tmp/mypostscript: line 16: updateflag.awk: command not found
>>>>>
>>>>> Both getpostscript.awk & updateflag.awk are missing from the rootimg
>>>>> created by genimage. Is there any place I could find these scripts?
>>>>>
>>>>> Also, please correct me if there is anything wrong with the procedure I
>>>>> am following.
>>>>>
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Regards,
>>>>> Sunil
>>>>>
>>>>> On 6/13/11 4:13 PM, Josh Thompson wrote:
>>>>>> Sunil,
>>>>>>
>>>>>> From what I remember, I didn't have to do much to the rootimg.gz image
>>>>>> to make it work.  I created the files I supply before xCAT started
>>>>>> using "statelite" instead of "stateless".  I think statelite uses NFS
>>>>>> to mount the image, and stateless uses an image file downloaded to the
>>>>>> node and run out of RAM.  Since generating a statelite image is pretty
>>>>>> straightforward use of xCAT, you may want to ask on the xcat-user email
>>>>>> list for help with it.
>>>>>>
>>>>>> Unless you can have the admins of the other dhcp server on your
>>>>>> network exclude the MAC addresses of your blades, you'll need to
>>>>>> create a separate private network to control your VCL stuff, either
>>>>>> physically or with VLANs.
>>>>>>
>>>>>> If they can exclude the MACs, you can set up the dhcp server on your
>>>>>> management node to only answer to requests from your blades.
>>>>>>
>>>>>> Josh
>>>>>>
>>>>>> On Monday June 13, 2011, Sunil Venkatesh wrote:
>>>>>>> Josh,
>>>>>>>
>>>>>>> Again, Thank you for your valuable inputs. I have got to the point
>>>>>>> where I can get the compute node to boot using the stateless images.
>>>>>>> I had to manually configure the netboot since we already had a DHCP
>>>>>>> server which is not the same as our Management node. Since our setup
>>>>>>> is not in an isolated environment, I could not let xcat handle the
>>>>>>> dhcp & netboot configuration (it messed up our network
>>>>>>> configuration when I let xcat handle it; we had 2 dhcp servers
>>>>>>> running at that point). Are you aware of any way to let xcat handle
>>>>>>> such scenarios?
>>>>>>>
>>>>>>> Although I am able to get the compute node to boot with the kernel
>>>>>>> image & initrd, and NFS mount the rootimg that was generated using
>>>>>>> 'genimage', I am getting the following error on the compute node's
>>>>>>> console -
>>>>>>>
>>>>>>>        FATAL error: could not get the entries from litefile table...
>>>>>>>
>>>>>>> After going through the init scripts, I found that the 'xCATCmd'
>>>>>>> binary is not present in the rootimg. I am currently checking the xcat
>>>>>>> packages for its availability. If you know the procedure to get it
>>>>>>> onto the compute node, please let me know.
>>>>>>>
>>>>>>> Appreciate your support.
>>>>>>>
>>>>>>> Thanking you,
>>>>>>> Sunil
>>>>>>>
>>>>>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
>>>>>>>> Sunil,
>>>>>>>>
>>>>>>>> I don't recall seeing any documentation on those parts.  I had to
>>>>>>>> poke around looking at parts of xCAT to see how it worked.  It's
>>>>>>>> been a few years since I did that; so, I don't remember much about
>>>>>>>> the process. My recommendation would be to start looking at things
>>>>>>>> in the rootimg.gz image.  Looking at it now, I see that
>>>>>>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It looks
>>>>>>>> like it downloads all of the postscripts from the management node
>>>>>>>> and then runs getpostscript.awk, which issues a command to xcatd to
>>>>>>>> get the primary postscript for that machine.  I've forgotten how
>>>>>>>> xcatd then builds the primary postscript. I do remember that in the
>>>>>>>> partimageng.pm module, I had it add the partimageng postscript.
>>>>>>>>
>>>>>>>> So, you'll really have to start digging through how the xcat
>>>>>>>> postscript system works.
>>>>>>>>
>>>>>>>> Josh
>>>>>>>>
>>>>>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
>>>>>>>>> Josh,
>>>>>>>>>
>>>>>>>>> Is there any place I could find some details on
>>>>>>>>>
>>>>>>>>> "... /Once the compute node is booted with the stateless
>>>>>>>>> image, it uses NFS to mount some things from the management node,
>>>>>>>>> and then runs some xcat postscripts,/.... "
>>>>>>>>>
>>>>>>>>> I have the stateless images ready with partimage compiled for PPC.
>>>>>>>>> For the compute node (power 7) to boot using the stateless images,
>>>>>>>>> I need to configure yaboot instead of pxeboot (which is specific to
>>>>>>>>> x86). I wanted to know where in the startup files the execution of
>>>>>>>>> partimage and the NFS mount is configured. Is it configured by the
>>>>>>>>> "genimage" command itself? Considering the way in which the nodes
>>>>>>>>> are configured in the network, it would not be a good idea to let
>>>>>>>>> xcat take care of configuring details like DHCPD for netboot. So, I
>>>>>>>>> need to make changes to the configuration files manually, which is
>>>>>>>>> why this query came up.
>>>>>>>>>
>>>>>>>>> Thanks in advance.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Sunil
>>>>>>>>>
>>>>>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
>>>>>>>>>> Sunil,
>>>>>>>>>>
>>>>>>>>>> The "stateless" image I refer to is what is actually booted on the
>>>>>>>>>> compute node containing the image to be captured.  It's called
>>>>>>>>>> stateless because it is loaded completely in RAM and does not
>>>>>>>>>> maintain any state when a reboot occurs.
>>>>>>>>>>
>>>>>>>>>> The partimage binary is part of this stateless image and actually
>>>>>>>>>> runs on the compute node.  It does not run on the management node.
>>>>>>>>>> The management node does not have block level access to the disk
>>>>>>>>>> on the compute node to be able to capture the image from the
>>>>>>>>>> disk.
>>>>>>>>>>
>>>>>>>>>> I'll try to describe the process a little better.  The management
>>>>>>>>>> node issues a reboot command to the compute node.  The compute
>>>>>>>>>> node uses PXE to load and boot a kernel (vmlinuz), initial RAM
>>>>>>>>>> disk (initrd.img), and a root filesystem (rootimg.gz) from the
>>>>>>>>>> management node.  All three of these together make up the
>>>>>>>>>> stateless image.  Once the compute node is booted with the
>>>>>>>>>> stateless image, it uses NFS to mount some things from the
>>>>>>>>>> management node, and then runs some xcat postscripts, one of which
>>>>>>>>>> is the partimageng postscript.  This postscript determines what
>>>>>>>>>> partitions are on the compute node and, depending on how the
>>>>>>>>>> postscript is configured, uses partimage or partimageng to capture
>>>>>>>>>> an image of the compute node disk that is then saved to the
>>>>>>>>>> management node.  When it is finished capturing the image, it
>>>>>>>>>> notifies xcat on the management node and then reboots.  xcat
>>>>>>>>>> reconfigures itself to tell the compute node to boot off of disk at
>>>>>>>>>> next boot.  When the compute node comes up, it uses PXE to ask the
>>>>>>>>>> management node how to boot.  The management node tells it to boot
>>>>>>>>>> off of disk.
>>>>>>>>>>
>>>>>>>>>> I hope that clarifies how the system works.  If any of it is
>>>>>>>>>> unclear, please ask for further clarification.
>>>>>>>>>>
>>>>>>>>>> Josh
>>>>>>>>>>
>>>>>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>> Josh,
>>>>>>>>>>>
>>>>>>>>>>> I had one more clarification.
>>>>>>>>>>>
>>>>>>>>>>> The partimage binaries run on the management node to capture a
>>>>>>>>>>> (stateless) image from the compute node, right? In that case, is
>>>>>>>>>>> there a need for these binaries to go into the rootimg.gz?
>>>>>>>>>>>
>>>>>>>>>>> My assumption is that partimage runs on the management node (an
>>>>>>>>>>> Intel blade in our case) to capture a stateless image from a
>>>>>>>>>>> compute node (a Power 7 blade) and stores these images under
>>>>>>>>>>> "/install" on the management node. Please correct me if I am wrong.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Sunil
>>>>>>>>>>>
>>>>>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
>>>>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>>>>>>> Hash: SHA1
>>>>>>>>>>>>
>>>>>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I used the steps that were mentioned under
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>>>>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need
>>>>>>>>>>>>> to change references to x86 & x86_64 (as directories) to
>>>>>>>>>>>>> reflect the
>>>>>>>>>>>>> ppc architecture, as the web page says "The architecture for
>>>>>>>>>>>>> the node must always be set to x86 for this..". I have with me
>>>>>>>>>>>>> the vmlinuz (kernel image) and initrd for the capture process.
>>>>>>>>>>>>> The 2 nodeset commands
>>>>>>>>>>>> By this, do you mean you have vmlinuz and initrd for your power
>>>>>>>>>>>> blades, not the ones linked to off of the page you listed above?
>>>>>>>>>>>> If you do, that's a good start.  However, you'll also need
>>>>>>>>>>>> rootimg.gz. rootimg.gz is the root filesystem for the stateless
>>>>>>>>>>>> image.  It also contains the partimage and partimageng binaries.
>>>>>>>>>>>> Assuming partimage or partimageng can actually capture
>>>>>>>>>>>> partitions from power systems, you'll need to compile at least
>>>>>>>>>>>> one of them to run on power.  For the rootimg.gz image I
>>>>>>>>>>>> provided, I compiled them statically so that I didn't have to
>>>>>>>>>>>> worry about including any library dependencies in rootimg.gz.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be a good idea to research how to use xcat's genimage
>>>>>>>>>>>> command to generate stateless images to learn how to do this.
>>>>>>>>>>>>
>>>>>>>>>>>> If there's any part of the above that you don't fully
>>>>>>>>>>>> understand, please ask me to clarify it.  Until you have a
>>>>>>>>>>>> stateless image that you can deploy to your power blades,
>>>>>>>>>>>> there's no point in trying to debug any VCL specific items.
>>>>>>>>>>>>
>>>>>>>>>>>> Josh
>>>>>>>>>>>> - --
>>>>>>>>>>>> - -------------------------------
>>>>>>>>>>>> Josh Thompson
>>>>>>>>>>>> VCL Developer
>>>>>>>>>>>> North Carolina State University
>>>>>>>>>>>>
>>>>>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
>>>>>>>>>>>> -----BEGIN PGP SIGNATURE-----
>>>>>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>>>>>>>>>>
>>>>>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
>>>>>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
>>>>>>>>>>>> =exBV
>>>>>>>>>>>> -----END PGP SIGNATURE-----
> - -- 
> - -------------------------------
> Josh Thompson
> VCL Developer
> North Carolina State University
>
> my GPG/PGP key can be found at pgp.mit.edu
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
>
> iEYEARECAAYFAk4VzQoACgkQV/LQcNdtPQM8YQCePg3O5vp5AXEhiO+5aIRIUO/S
> 6IgAn1Xt4ytGnmxpfJVteCScFi0dRz15
> =Yls1
> -----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tuesday July 05, 2011, Sunil Venkatesh wrote:
> Hi Josh,
> 
> I was able to get the following things done with respect to getting VCL to
> work on POWER.
> 
> 1. Made modifications in the xcat tables to get the capture process
> working with statelite images instead of stateless images, particularly
> the noderes & bootparams tables.
> 
> 2. Used partimage to capture the images (did NOT set usepartimageng to 1).
> 
> -rw-r--r-- 1 root root    0 Jul  5 16:38 compute.img.capturedone
> -rw-r--r-- 1 root root    0 Jul  5 15:58 compute.img.capturefailed
> -rw------- 1 root root 6.5M Jul  5 16:07 compute-parta2.gz
> -rw------- 1 root root 679M Jul  5 16:10 compute-parta3.gz
> -rw------- 1 root root  23M Jul  5 16:38 compute-parta6.gz
> -rw-r--r-- 1 root root  512 Jul  5 16:07 compute-sda.mbr
> -rw-r--r-- 1 root root  363 Jul  5 16:07 compute-sda.sfdisk
> 
> 
> 2 partitions including the boot partition present on the blade were
> captured under /install/image/ppc64/. Initially, RHEL 5 was installed on
> a 600 GB partition, which caused the capture process to fail. The image of
> the partition was generated once the partition size was reduced to 6 GB.
> Is it necessary for me to use partimage-ng instead of partimage itself?

Are you asking if you need to use partimage-ng for partitions that are 600GB?  
If so, I don't really know.  We've never dealt with partitions that large.
 
> When proceeding further with "vcld --setup", the script was not able to
> find the images that were created using partimage. The options that are
> provided in the script do not allow for selecting an architecture
> other than x86/x86_64.

You'll need to modify the vcld image.pm module.  Look in 
/usr/local/vcl/lib/VCL.  In image.pm, look for the function 
'setup_capture_base_image'; then, find 'my @architecture_choices' and add 
'ppc' as another option.
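
To locate the spot described above, something like the following may help. It is only a sketch: the path and the names setup_capture_base_image and @architecture_choices come from this message, while the helper name find_arch_choices and the VCL_IMAGE_PM variable are made up for illustration. The exact contents of the array in a given copy of image.pm may differ, so the edit itself is left to be done by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of VCL): print the numbered lines in a Perl
# module where the function and the choices array mentioned above appear.
find_arch_choices() {
    module_path="$1"
    grep -n -e 'sub setup_capture_base_image' \
            -e 'my @architecture_choices' "$module_path"
}

# Default location taken from the message above; skipped if the file is absent.
VCL_IMAGE_PM="${VCL_IMAGE_PM:-/usr/local/vcl/lib/VCL/image.pm}"
if [ -f "$VCL_IMAGE_PM" ]; then
    find_arch_choices "$VCL_IMAGE_PM"
fi
```

The reported line numbers show where to open the file in an editor. Keeping a backup copy of image.pm before changing it is probably wise.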

> Also, in the error log vcld is looking for
> 
> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
> 
> and cannot find the template file. Should the template file that needs
> to be accessed in this case be createimage.ppc64.tmpl?

This is actually a check to make sure the image doesn't already exist before 
trying to capture it.  So, it is good that it doesn't find it.
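
For reference, the two lookups that appear in the quoted log can be repeated by hand. The sketch below only mirrors what the log shows (a glob for the image name under the repository path, and a .tmpl file under the template path); the function name check_image is hypothetical and the paths are copied from the log, so adjust them for your management node. How vcld combines the two results internally is not shown in the log, so the sketch just reports each one separately. Note that the log shows the lookup using the x86_64 repository path, while the captured files were reported under /install/image/ppc64/.

```shell
#!/bin/sh
# Hypothetical helper: report whether any file matching the image name exists
# in the repository directory, and whether a template file exists for it.
check_image() {
    repo_dir="$1"; tmpl_dir="$2"; image_name="$3"

    found_files=no
    for f in "$repo_dir"/*"$image_name"*; do
        # An unmatched glob stays literal, so test that the path really exists.
        if [ -e "$f" ]; then found_files=yes; break; fi
    done

    found_tmpl=no
    if [ -f "$tmpl_dir/$image_name.tmpl" ]; then found_tmpl=yes; fi

    echo "image files: $found_files, template: $found_tmpl"
}

# Paths and image name as they appear in the quoted log.
check_image /tftpboot/xcat/image/x86_64 \
            /opt/xcat/share/xcat/install/image \
            rh5image-power010701bi34-v0
```

Before a first capture, both answers are expected to be "no".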

It sounds like you're almost there.  Great work!

Josh

> I have attached a log at the end of the mail. I am not sure where I have
> gone wrong with the VCL configuration.
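
One thing that may be worth checking, going only by the quoted log below: every SSH command to power01 returns "Permission denied (publickey,gssapi-with-mic,password)", which suggests the management node could not log in as root with /etc/vcl/vcl.key at that point. A minimal sketch for testing this by hand follows. The function name check_key_login is hypothetical; the key path, user, port and host are the ones shown in the log, and the extra ssh options (BatchMode, IdentitiesOnly, ConnectTimeout) are added here so the test never falls back to a password prompt or another key.

```shell
#!/bin/sh
# Hypothetical helper: attempt a non-interactive root login using only the
# given key, and report whether key-based authentication succeeded.
check_key_login() {
    key="$1"; host="$2"
    if ssh -i "$key" -o BatchMode=yes -o IdentitiesOnly=yes \
           -o ConnectTimeout=5 -o StrictHostKeyChecking=no \
           -l root -p 22 "$host" true 2>/dev/null; then
        echo "key login OK"
    else
        echo "key login FAILED"
    fi
}

# Example invocation, using the values from the log (run on the management node):
#   check_key_login /etc/vcl/vcl.key power01
```

If this reports a failure, the public half of the key is probably missing from root's authorized_keys on the compute node, or root logins are restricted there; that is an assumption to verify, not something the log confirms.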
> 
> -Sunil
> 
> -----
> 
> rh5image-power010701bi34-v0 image creation failed
> ------------------------------------------------------------------------
> time: 2011-07-05 11:03:25
> caller: image.pm:reservation_failed(385)
> ( 0) image.pm, reservation_failed (line: 385)
> (-1) image.pm, process (line: 167)
> (-2) vcld, make_new_child (line: 568)
> (-3) vcld, main (line: 346)
> ------------------------------------------------------------------------
> management node: web1.bluegrit.cs.umbc.edu
> reservation PID: 9866
> parent vcld PID: 19110
> 
> request ID: 30
> reservation ID: 30
> request state/laststate: image/image
> request start time: 2011-07-05 11:03:20
> request end time: 2011-07-05 12:03:20
> for imaging: no
> log ID: none
> 
> computer: power01.bluegrit.cs.umbc.edu
> computer id: 2
> computer type: blade
> computer eth0 MAC address:<undefined>
> computer eth1 MAC address:<undefined>
> computer private IP address: 172.20.106.1
> computer public IP address: 172.20.106.1
> computer in block allocation: no
> provisioning module: VCL::Module::Provisioning::xCAT2
> 
> image: rh5image-power010701bi34-v0
> image display name: power010701bi
> image ID: 34
> image revision ID: 34
> image size: 1450 MB
> use Sysprep: yes
> root access: yes
> image owner ID: 1
> image owner affiliation: Local
> image revision date created: 2011-07-05 11:03:25
> image revision production: yes
> OS module: VCL::Module::OS::Linux
> 
> user: admin
> user name: vcl admin
> user ID: 1
> user affiliation: Local
> ------------------------------------------------------------------------
> RECENT LOG ENTRIES FOR THIS PROCESS:
> 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:create_os_object(304)|VCL::Module::OS:
> :Linux OS object created for rh5image-power010701bi34-v0, address: 88fb070
> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(110)|XCATROOT
> environment variable is not set, using /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:initialize(128)|xCAT root path found:
> /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:initialize(130)|xCAT module initialized
> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT
> environment variable is not set, using /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found:
> /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
> 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(420)|VCL::M
> odule::Provisioning::xCAT2 module loaded 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:create_mn_os_object(335)|management
> node OS object has already been created, address: 88f23b0, returning 1
> 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:new(200)|VCL::Module::Provisioning::xC
> AT2 object created for computer power01, address: 88fb0e0 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment
> variable is not set, using /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found:
> /opt/xcat 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
> 2011-07-05
> 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(426)|VCL::M
> odule::Provisioning::xCAT2 provisioner object created for power01, address:
> 88fb0e0 2011-07-05
> 11:03:25|9866|30:30|image|State.pm:initialize(126)|returning 1 2011-07-05
> 11:03:25|9866|30:30|image|vcld:make_new_child(565)|VCL::image object
> created and initialized 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:mail(1268)|SUCCESS -- Sending mail
> To:shrusun@gmail.com, VCL IMAGE Creation Started:
> rh5image-power010701bi34-v0 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2434)|image OS install
> type: partimage 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2910)|managem
> ent node identifier argument was not specified
> 
> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2932)|attempting
> to determine repository path for image on web1.bluegrit.cs.umbc.edu:
> |9866|30:30|image| image id: 34
> |9866|30:30|image| OS name: rh5image
> |9866|30:30|image| OS type: linux
> |9866|30:30|image| OS install type: partimage
> |9866|30:30|image| OS source path: image
> |9866|30:30|image| architecture: x86_64
> 
> 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2996)|did not
> find any images under /tftpboot/xcat//linux_image/x86_64 on
> web1.bluegrit.cs.umbc.edu 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(3006)|returni
> ng repository path for web1.bluegrit.cs.umbc.edu:
> /tftpboot/xcat//image/x86_64 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|image repository
> path: /tftpboot/xcat//image/x86_64
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_command(9010)|executed command:
> du -c /tftpboot/xcat//image/x86_64/*rh5image-power010701bi34-v0* 2>&1 | grep total 2>&1,
> pid: 9877, exit status: 0, output:
> |9866|30:30|image| 0 total
> 
> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2506)|image
> does NOT exist: rh5image-power010701bi34-v0 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2084)|manageme
> nt node identifier argument was not specified
> 
> 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2115)|attempting
> to determine template path for image:
> |9866|30:30|image| image name: rh5image-power010701bi34-v0
> |9866|30:30|image| OS install type: partimage
> |9866|30:30|image| OS source path: image
> |9866|30:30|image| xCAT 2.x OS source path: image
> 
> 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2123)|returnin
> g: /opt/xcat/share/xcat/install/image 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2518)|template
> repository path for rh5image-power010701bi34-v0:
> /opt/xcat/share/xcat/install/image 2011-07-05
> 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2530)|template file
> does not exist:
> /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
> 2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2570)|image
> rh5image-power010701bi34-v0 does NOT exist on this management node
> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(145)|image
> rh5image-power010701bi34-v0 does not exist in the repository 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data structure
> updated: $self->request_data->{reservation}{30}{image}{lastupdate}
> 
> |9866|30:30|image| image_lastupdate = 2011-07-05 11:03:25
> 
> 2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data
> structure updated:
> $self->request_data->{reservation}{30}{imagerevision}{datecreated}
> 
> |9866|30:30|image| imagerevision_date_created = 2011-07-05 11:03:25
> 
> 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(161)|calling
> provisioning module's capture() subroutine 2011-07-05
> 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|image=rh5image-power010701
> bi34-v0, computer=power01
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing SSH command on power01:
> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root currentimage.txt; chmod 777 currentimage.txt' 2>&1
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH command executed on power01, command:
> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root currentimage.txt; chmod 777 currentimage.txt' 2>&1
> |9866|30:30|image| returning (255, "Permission denied, please try ...")
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5685)|updated ownership and permissions on currentimage.txt
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing SSH command on power01:
> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e "rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimagerevision_id=34\r\nimagerevision_datecreated=2011-07-05 11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu" > currentimage.txt && cat currentimage.txt' 2>&1
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
> 
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH command executed on power01, command:
> |9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e "rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimagerevision_id=34\r\nimagerevision_datecreated=2011-07-05 11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu" > currentimage.txt && cat currentimage.txt' 2>&1
> |9866|30:30|image| returning (255, "Permission denied, please try ...")
> 
> |9866|30:30|image| ---- WARNING ----
> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5699)|failed to create currentimage.txt file on power01:
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied, please try again.
> |9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
> |9866|30:30|image| ( 0) utils.pm, write_currentimage_txt (line: 5699)
> |9866|30:30|image| (-1) xCAT2.pm, capture (line: 779)
> |9866|30:30|image| (-2) image.pm, process (line: 162)
> |9866|30:30|image| (-3) vcld, make_new_child (line: 568)
> |9866|30:30|image| (-4) vcld, main (line: 346)
> 
> |9866|30:30|image| ---- WARNING ----
> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(783)|unable to update currentimage.txt on power01
> |9866|30:30|image| ( 0) xCAT2.pm, capture (line: 783)
> |9866|30:30|image| (-1) image.pm, process (line: 162)
> |9866|30:30|image| (-2) vcld, make_new_child (line: 568)
> |9866|30:30|image| (-3) vcld, main (line: 346)
> 
> |9866|30:30|image| ---- WARNING ----
> |9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(166)|rh5image-power010701bi34-v0 image failed to be captured by provisioning module
> |9866|30:30|image| ( 0) image.pm, process (line: 166)
> |9866|30:30|image| (-1) vcld, make_new_child (line: 568)
> |9866|30:30|image| (-2) vcld, main (line: 346)
> 
> 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
> (1581)|attempting to retrieve private IP address for computer: power01
> 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
> (1585)|retrieved contents of /etc/hosts on this management node, contains
> 158 lines 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address
> (1645)|returning IP address from /etc/hosts file: 172.20.106.1 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:is_inblockrequest(6163)|zero rows were
> returned from database select 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:get_image_affiliation_name(2035
> )|image owner id: 1 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested
> (information_schema) does not match handle stored in $ENV{dbh} (vcl:)
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database
> handle stored in $ENV{dbh} 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1352)|attemp
> ting to retrieve and store data for user: user.id = '1' 2011-07-05
> 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested
> (vcl) does not match handle stored in $ENV{dbh} (information_schema:)
> 2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database
> handle stored in $ENV{dbh} 2011-07-05
> 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1415)|data
> has been retrieved for user: admin (id: 1)
> 
> On 6/24/11 10:13 AM, Josh Thompson wrote:
> > Sunil,
> > 
> > "nodeset <nodename> image" sets up all the xCAT stuff so that the next
> > time the node is booted, it will boot the stateless/statelite image and
> > capture an image of the node.
> > 
> > Can you double check that you have 'os' in the nodetype table set to
> > image for the node you are using?  If you look in the partimageng.pm
> > xCAT module, you see toward the top where it registers the
> > "handled_commands".  The "mk" gets stripped off.  So, that module is
> > registering "install" and "image" for os type = "image".  As long as you
> > have os in the nodetype table set to image, it should be using that
> > module.
> > 
> > You will need to make sure you have all of the required files in
> > locations using 'ppc64' as the arch.
> > 
> > Josh
> > 
> > On Wednesday June 22, 2011, Sunil Venkatesh wrote:
> >> Hi,
> >> 
> >> Update !
> >> 
> >> I was able to fix the problem that I was facing with the scripts by
> >> disabling the firewall. But I still have a problem with the command:
> >> 
> >> nodeset <nodename> image
> >> 
> >> Unless this error is fixed, I don't think partimage will work. Am I
> >> right here?
> >> 
> >> Thanks,
> >> Sunil
> >> 
> >> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <su...@umbc.edu> wrote:
> >>> Josh,
> >>> 
> >>> I have reached a point where I am able to boot the ppc using the
> >>> statelite images created using genimage. But, I was wondering how
> >>> significant the following command is.
> >>> 
> >>> nodeset <nodename> image
> >>> 
> >>> I got the same error that Prem had mentioned.
> >>> 
> >>> 
> >>> power01: Error: Unable to identify plugin for this command, check
> >>> relevant tables: nodetype.os
> >>> Error: Some nodes failed to set up image resources, aborting
> >>> 
> >>> I tried changing the 'os' field to 'image' under nodetype, that doesn't
> >>> seem to help. I get the same error even after the change. 'arch' in my
> >>> case is set to 'ppc64'.
> >>> 
> >>> 
> >>> Also, I think partimage plugin needs to be changed to support the ppc
> >>> architecture, from what you had mentioned in the other thread.
> >>> 
> >>> I am not sure what the command 'nodeset <nodename> image' does, but I
> >>> am able to boot the statelite images by making changes to the yaboot
> >>> configuration files. The ppc blade currently uses LVM, that needs to
> >>> be replaced with ext2/ext3 from what I read from the other thread, am
> >>> I right? Also, just out of curiosity I left the statelite image to
> >>> boot with my current setting. I can see the xcat script throwing an
> >>> error-
> >>> 
> >>> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No such
> >>> file or directory
> >>> /tmp/mypostscript: line 16: updateflag.awk: command not found
> >>> 
> >>> Both getpostscript.awk and updateflag.awk are missing from the rootimg
> >>> created by genimage. Is there any place I could find these scripts?
> >>> 
> >>> Also, please correct me if there is anything wrong with the procedure I
> >>> am following.
> >>> 
> >>> 
> >>> Thanks in advance.
> >>> 
> >>> Regards,
> >>> Sunil
> >>> 
> >>> On 6/13/11 4:13 PM, Josh Thompson wrote:
> >>>> Sunil,
> >>>> 
> >>>> From what I remember, I didn't have to do much to the rootimg.gz
> >>>> image to make it work.  I created the files I supply before xCAT
> >>>> started using "statelite" instead of "stateless".  I think statelite
> >>>> uses NFS to mount the image, and stateless uses an image file
> >>>> downloaded to the node and run out of RAM.  Since generating a
> >>>> statelite image is pretty straightforward use of xCAT, you may want
> >>>> to ask on the xcat-user email list for help with it.
> >>>> 
> >>>> Unless you can have the admins of the other dhcp server on your
> >>>> network exclude the MAC addresses of your blades, you'll need to
> >>>> create a separate private network to control your VCL stuff, either
> >>>> physically or with VLANs.
> >>>> 
> >>>> If they can exclude the MACs, you can set up the dhcp server on your
> >>>> management node to only answer to requests from your blades.
> >>>> 
> >>>> Josh
> >>>> 
> >>>> On Monday June 13, 2011, Sunil Venkatesh wrote:
> >>>>> Josh,
> >>>>> 
> >>>>> Again, Thank you for your valuable inputs. I have got to the point
> >>>>> where I can get the compute node to boot using the stateless images.
> >>>>> I had to manually configure the netboot since we already had a DHCP
> >>>>> server which is not the same as our Management node. Since our setup
> >>>>> is not in an isolated environment, I could not let xcat handle the
> >>>>> dhcp & netboot configuration (it messed up our network
> >>>>> configuration when I let xcat handle it; we had 2 dhcp servers
> >>>>> running at that point). Are you aware of any way to let xcat handle
> >>>>> such scenarios?
> >>>>> 
> >>>>> Although I am able to get the compute node to boot with the kernel
> >>>>> image & initrd, and NFS mount the rootimg that was generated using
> >>>>> 'genimage', I am getting the following error on the compute node's
> >>>>> console -
> >>>>> 
> >>>>>       FATAL error: could not get the entries from litefile table...
> >>>>> 
> >>>>> after going thru the init-scripts, I found out 'xCATCmd' binary is
> >>>>> not present in the rootimg. I am currently checking the xcat
> >>>>> packages for its availability. If you know the procedure to get it
> >>>>> onto the compute node, please let me know the same.
> >>>>> 
> >>>>> Appreciate your support.
> >>>>> 
> >>>>> Thanking you,
> >>>>> Sunil
> >>>>> 
> >>>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
> >>>>>> Sunil,
> >>>>>> 
> >>>>>> I don't recall seeing any documentation on those parts.  I had to
> >>>>>> poke around looking at parts of xCAT to see how it worked.  It's
> >>>>>> been a few years since I did that; so, I don't remember much about
> >>>>>> the process. My recommendation would be to start looking at things
> >>>>>> in the rootimg.gz image.  Looking at it now, I see that
> >>>>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It looks
> >>>>>> like it downloads all of the postscripts from the management node
> >>>>>> and then runs getpostscript.awk, which issues a command to xcatd to
> >>>>>> get the primary postscript for that machine.  I've forgotten how
> >>>>>> xcatd then builds the primary postscript. I do remember that in the
> >>>>>> partimageng.pm module, I had it add the partimageng postscript.
> >>>>>> 
> >>>>>> So, you'll really have to start digging through how the xcat
> >>>>>> postscript system works.
> >>>>>> 
> >>>>>> Josh
> >>>>>> 
> >>>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
> >>>>>>> Josh,
> >>>>>>> 
> >>>>>>> Is there any place I could find some details on
> >>>>>>> 
> >>>>>>> "... /Once the compute node is booted with the stateless
> >>>>>>> image, it uses NFS to mount some things from the management node,
> >>>>>>> and then runs some xcat postscripts,/.... "
> >>>>>>> 
> >>>>>>> I have the stateless images ready with partimage compiled for PPC.
> >>>>>>> For the compute node (power 7) to boot using the stateless images,
> >>>>>>> I need to configure yaboot instead of pxeboot (which is specific
> >>>>>>> to x86). I wanted to know where in the startup files the execution
> >>>>>>> of partimage and the NFS mount is configured. Is it configured by
> >>>>>>> the "genimage" command itself? Considering the way in which the
> >>>>>>> nodes are configured in the network, it would not be a good idea
> >>>>>>> to let xcat take care of configuring details like DHCPD for
> >>>>>>> netboot. So I need to make changes to the configuration files
> >>>>>>> manually, which is why this query came up.
> >>>>>>> 
> >>>>>>> Thanks in advance.
> >>>>>>> 
> >>>>>>> Regards,
> >>>>>>> Sunil
> >>>>>>> 
> >>>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
> >>>>>>>> Sunil,
> >>>>>>>> 
> >>>>>>>> The "stateless" image I refer to is what is actually booted on the
> >>>>>>>> compute node containing the image to be captured.  It's called
> >>>>>>>> stateless because it is loaded completely in RAM and does not
> >>>>>>>> maintain any state when a reboot occurs.
> >>>>>>>> 
> >>>>>>>> The partimage binary is part of this stateless image and actually
> >>>>>>>> runs on the compute node.  It does not run on the management node.
> >>>>>>>> The management node does not have block level access to the disk
> >>>>>>>> on the compute node to be able to capture the image from the
> >>>>>>>> disk.
> >>>>>>>> 
> >>>>>>>> I'll try to describe the process a little better.  The management
> >>>>>>>> node issues a reboot command to the compute node.  The compute
> >>>>>>>> node uses PXE to load and boot a kernel (vmlinuz), initial RAM
> >>>>>>>> disk (initrd.img), and a root filesystem (rootimg.gz) from the
> >>>>>>>> management node.  All three of these together make up the
> >>>>>>>> stateless image.  Once the compute node is booted with the
> >>>>>>>> stateless image, it uses NFS to mount some things from the
> >>>>>>>> management node, and then runs some xcat postscripts, one of
> >>>>>>>> which is the partimageng postscript.  This postscript determines
> >>>>>>>> what partitions are on the compute node and, depending on how the
> >>>>>>>> postscript is configured, uses partimage or partimageng to
> >>>>>>>> capture an image of the compute node disk that is then saved to
> >>>>>>>> the management node.  When it is finished capturing the image, it
> >>>>>>>> notifies xcat on the management node and then reboots.  xcat
> >>>>>>>> reconfigures itself to tell the compute node to boot off of disk
> >>>>>>>> at next boot.  When the compute node comes up, it uses PXE to ask
> >>>>>>>> the management node how to boot.  The management node tells it to
> >>>>>>>> boot off of disk.
> >>>>>>>> 
> >>>>>>>> I hope that clarifies how the system works.  If any of it is
> >>>>>>>> unclear, please ask for further clarification.
> >>>>>>>> 
> >>>>>>>> Josh
> >>>>>>>> 
> >>>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
> >>>>>>>>> Josh,
> >>>>>>>>> 
> >>>>>>>>> I had one more clarification.
> >>>>>>>>> 
> >>>>>>>>> partimage binaries run in the management node to capture an
> >>>>>>>>> (stateless) image from the compute node right? In that case, is
> >>>>>>>>> there a need for these binaries to go into the rootimg.gz??
> >>>>>>>>> 
> >>>>>>>>> My assumption is, partimage runs on the management node (an intel
> >>>>>>>>> blade in our case) to capture a stateless image from a compute
> >>>>>>>>> node (a power 7 blade) and stores these images under " /install
> >>>>>>>>> " of the management node. Please correct me if I am wrong here.
> >>>>>>>>> 
> >>>>>>>>> Regards,
> >>>>>>>>> Sunil
> >>>>>>>>> 
> >>>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
> >>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
> >>>>>>>>>> Hash: SHA1
> >>>>>>>>>> 
> >>>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>> 
> >>>>>>>>>>> I used the steps that were mentioned under
> >>>>>>>>>>> 
> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
> >>>>>>>>>>> 
> >>>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need
> >>>>>>>>>>> to change references to x86 & x86_64 (as directories) to
> >>>>>>>>>>> reflect the
> >>>>>>>>>>> ppc architecture, as the web page says "The architecture for
> >>>>>>>>>>> the node must always be set to x86 for this..". I have with me
> >>>>>>>>>>> the vmlinuz (kernel image) and initrd for the capture process.
> >>>>>>>>>>> The 2 nodeset commands
> >>>>>>>>>> 
> >>>>>>>>>> By this, do you mean you have vmlinuz and initrd for your power
> >>>>>>>>>> blades, not the ones linked to off of the page you listed above?
> >>>>>>>>>> If you do, that's a good start.  However, you'll also need
> >>>>>>>>>> rootimg.gz. rootimg.gz is the root filesystem for the stateless
> >>>>>>>>>> image.  It also contains the partimage and partimageng binaries.
> >>>>>>>>>> Assuming partimage or partimageng can actually capture
> >>>>>>>>>> partitions from power systems, you'll need to compile at least
> >>>>>>>>>> one of them to run on power.  For the rootimg.gz image I
> >>>>>>>>>> provided, I compiled them statically so that I didn't have to
> >>>>>>>>>> worry about including any library dependencies in rootimg.gz.
> >>>>>>>>>> 
> >>>>>>>>>> It would be a good idea to research how to use xcat's genimage
> >>>>>>>>>> command to generate stateless images to learn how to do this.
> >>>>>>>>>> 
> >>>>>>>>>> If there's any part of the above that you don't fully
> >>>>>>>>>> understand, please ask me to clarify it.  Until you have a
> >>>>>>>>>> stateless image that you can deploy to your power blades,
> >>>>>>>>>> there's no point in trying to debug any VCL specific items.
> >>>>>>>>>> 
> >>>>>>>>>> Josh
> >>>>>>>>>> - --
> >>>>>>>>>> - -------------------------------
> >>>>>>>>>> Josh Thompson
> >>>>>>>>>> VCL Developer
> >>>>>>>>>> North Carolina State University
> >>>>>>>>>> 
> >>>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
> >>>>>>>>>> -----BEGIN PGP SIGNATURE-----
> >>>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
> >>>>>>>>>> 
> >>>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
> >>>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
> >>>>>>>>>> =exBV
> >>>>>>>>>> -----END PGP SIGNATURE-----
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)

iEYEARECAAYFAk4VzQoACgkQV/LQcNdtPQM8YQCePg3O5vp5AXEhiO+5aIRIUO/S
6IgAn1Xt4ytGnmxpfJVteCScFi0dRz15
=Yls1
-----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Hi Josh,

I was able to get the following done toward getting VCL to work on
POWER.

1. Modified the xCAT tables, particularly noderes and bootparams, to get
the capture process working with statelite images instead of stateless
images.

2. Used partimage to capture the images (did NOT set usepartimageng to 1).

-rw-r--r-- 1 root root    0 Jul  5 16:38 compute.img.capturedone
-rw-r--r-- 1 root root    0 Jul  5 15:58 compute.img.capturefailed
-rw------- 1 root root 6.5M Jul  5 16:07 compute-parta2.gz
-rw------- 1 root root 679M Jul  5 16:10 compute-parta3.gz
-rw------- 1 root root  23M Jul  5 16:38 compute-parta6.gz
-rw-r--r-- 1 root root  512 Jul  5 16:07 compute-sda.mbr
-rw-r--r-- 1 root root  363 Jul  5 16:07 compute-sda.sfdisk


Two partitions, including the boot partition, present on the blade were
captured under /install/image/ppc64/. Initially, RHEL 5 was installed on
a 600 GB partition, which caused the capture process to fail. The
partition image was generated once the partition size was reduced to
6 GB. Is it necessary for me to use partimage-ng instead of partimage?
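
Since the listing above contains both a capturedone and a capturefailed
marker, a small check like the following could tell which one reflects
the latest attempt. This is only a rough sketch: capture_status is a
made-up helper, and treating the most recently modified marker as the
current state is my assumption, not documented xCAT or partimage
behaviour.

```python
import os


def capture_status(directory, image="compute"):
    """Summarise partimage capture output found in `directory`.

    Returns (state, archives). `state` is "done", "failed" or "unknown",
    chosen by whichever marker file was modified most recently
    (an assumption, see the note above the code).
    """
    markers = {
        "done": os.path.join(directory, image + ".img.capturedone"),
        "failed": os.path.join(directory, image + ".img.capturefailed"),
    }
    # Keep only the markers that exist, with their modification times.
    present = {
        state: os.path.getmtime(path)
        for state, path in markers.items()
        if os.path.exists(path)
    }
    state = max(present, key=present.get) if present else "unknown"
    # Partition archives follow the "<image>-part*.gz" naming in the listing.
    archives = sorted(
        name
        for name in os.listdir(directory)
        if name.startswith(image + "-part") and name.endswith(".gz")
    )
    return state, archives
```

Called as capture_status("/install/image/ppc64"), it would report "done"
for the listing above, because compute.img.capturedone (16:38) is newer
than compute.img.capturefailed (15:58).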

When I proceeded further with "vcld --setup", the script was not able to
find the images that were created using partimage. The options provided
in the script do not allow selecting an architecture other than
x86/x86_64. Also, according to the error log, vcld is looking for

/opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl

and cannot find the template file. Should the template file in this
case be createimage.ppc64.tmpl?
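
To make the mismatch easier to see: the log at the end of this mail
reports the image architecture as x86_64 and searches under
/tftpboot/xcat//image/x86_64, while the files were captured under
/install/image/ppc64/. The sketch below just rebuilds the two paths the
log mentions for a given architecture. expected_locations is a
hypothetical helper and the default roots are copied from the log; it
does not reproduce vcld's actual lookup logic.

```python
import posixpath


def expected_locations(image, arch,
                       repo_root="/tftpboot/xcat/image",
                       tmpl_root="/opt/xcat/share/xcat/install/image"):
    """Return the repository directory and template file that the log
    suggests are checked for `image` on architecture `arch`.

    Both roots are assumptions taken from the log output, not from the
    VCL source.
    """
    return {
        "repository": posixpath.join(repo_root, arch),
        "template": posixpath.join(tmpl_root, image + ".tmpl"),
    }


# Compare what is searched (x86_64) with where the capture landed (ppc64).
searched = expected_locations("rh5image-power010701bi34-v0", "x86_64")
captured_dir = "/install/image/ppc64"
repository_matches = searched["repository"] == captured_dir
```

With the values from the log, repository_matches is False, which is
consistent with vcld reporting that the image does not exist.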

I have attached a log at the end of the mail. I am not sure where I have 
gone wrong with the VCL configuration.

-Sunil

-----

rh5image-power010701bi34-v0 image creation failed
------------------------------------------------------------------------
time: 2011-07-05 11:03:25
caller: image.pm:reservation_failed(385)
( 0) image.pm, reservation_failed (line: 385)
(-1) image.pm, process (line: 167)
(-2) vcld, make_new_child (line: 568)
(-3) vcld, main (line: 346)
------------------------------------------------------------------------
management node: web1.bluegrit.cs.umbc.edu
reservation PID: 9866
parent vcld PID: 19110

request ID: 30
reservation ID: 30
request state/laststate: image/image
request start time: 2011-07-05 11:03:20
request end time: 2011-07-05 12:03:20
for imaging: no
log ID: none

computer: power01.bluegrit.cs.umbc.edu
computer id: 2
computer type: blade
computer eth0 MAC address:<undefined>
computer eth1 MAC address:<undefined>
computer private IP address: 172.20.106.1
computer public IP address: 172.20.106.1
computer in block allocation: no
provisioning module: VCL::Module::Provisioning::xCAT2

image: rh5image-power010701bi34-v0
image display name: power010701bi
image ID: 34
image revision ID: 34
image size: 1450 MB
use Sysprep: yes
root access: yes
image owner ID: 1
image owner affiliation: Local
image revision date created: 2011-07-05 11:03:25
image revision production: yes
OS module: VCL::Module::OS::Linux

user: admin
user name: vcl admin
user ID: 1
user affiliation: Local
------------------------------------------------------------------------
RECENT LOG ENTRIES FOR THIS PROCESS:
2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_os_object(304)|VCL::Module::OS::Linux OS object created for rh5image-power010701bi34-v0, address: 88fb070
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(128)|xCAT root path found: /opt/xcat
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:initialize(130)|xCAT module initialized
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found: /opt/xcat
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(420)|VCL::Module::Provisioning::xCAT2 module loaded
2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_mn_os_object(335)|management node OS object has already been created, address: 88f23b0, returning 1
2011-07-05 11:03:25|9866|30:30|image|Module.pm:new(200)|VCL::Module::Provisioning::xCAT2 object created for computer power01, address: 88fb0e0
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(110)|XCATROOT environment variable is not set, using /opt/xcat
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(128)|xCAT root path found: /opt/xcat
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:initialize(130)|xCAT module initialized
2011-07-05 11:03:25|9866|30:30|image|Module.pm:create_provisioning_object(426)|VCL::Module::Provisioning::xCAT2 provisioner object created for power01, address: 88fb0e0
2011-07-05 11:03:25|9866|30:30|image|State.pm:initialize(126)|returning 1
2011-07-05 11:03:25|9866|30:30|image|vcld:make_new_child(565)|VCL::image object created and initialized
2011-07-05 11:03:25|9866|30:30|image|utils.pm:mail(1268)|SUCCESS -- Sending mail To:shrusun@gmail.com, VCL IMAGE Creation Started: rh5image-power010701bi34-v0
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2434)|image OS install type: partimage
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2910)|management node identifier argument was not specified
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2932)|attempting to determine repository path for image on web1.bluegrit.cs.umbc.edu:
|9866|30:30|image| image id: 34
|9866|30:30|image| OS name: rh5image
|9866|30:30|image| OS type: linux
|9866|30:30|image| OS install type: partimage
|9866|30:30|image| OS source path: image
|9866|30:30|image| architecture: x86_64
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(2996)|did not find any images under /tftpboot/xcat//linux_image/x86_64 on web1.bluegrit.cs.umbc.edu
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:_get_image_repository_path(3006)|returning repository path for web1.bluegrit.cs.umbc.edu: /tftpboot/xcat//image/x86_64
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2444)|image repository path: /tftpboot/xcat//image/x86_64
2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_command(9010)|executed command: du -c /tftpboot/xcat//image/x86_64/*rh5image-power010701bi34-v0* 2>&1 | grep total 2>&1, pid: 9877, exit status: 0, output:
|9866|30:30|image| 0 total
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2506)|image does NOT exist: rh5image-power010701bi34-v0
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2084)|management node identifier argument was not specified
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2115)|attempting to determine template path for image:
|9866|30:30|image| image name: rh5image-power010701bi34-v0
|9866|30:30|image| OS install type: partimage
|9866|30:30|image| OS source path: image
|9866|30:30|image| xCAT 2.x OS source path: image
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:_get_image_template_path(2123)|returning: /opt/xcat/share/xcat/install/image
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2518)|template repository path for rh5image-power010701bi34-v0: /opt/xcat/share/xcat/install/image
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2530)|template file does not exist: /opt/xcat/share/xcat/install/image/rh5image-power010701bi34-v0.tmpl
2011-07-05 11:03:25|9866|30:30|image|xCAT.pm:does_image_exist(2570)|image rh5image-power010701bi34-v0 does NOT exist on this management node
2011-07-05 11:03:25|9866|30:30|image|image.pm:process(145)|image rh5image-power010701bi34-v0 does not exist in the repository
2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{30}{image}{lastupdate}
|9866|30:30|image| image_lastupdate = 2011-07-05 11:03:25
2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:_automethod(834)|data structure updated: $self->request_data->{reservation}{30}{imagerevision}{datecreated}
|9866|30:30|image| imagerevision_date_created = 2011-07-05 11:03:25
2011-07-05 11:03:25|9866|30:30|image|image.pm:process(161)|calling provisioning module's capture() subroutine
2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(776)|image=rh5image-power010701bi34-v0, computer=power01
2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing SSH command on power01:
|9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root currentimage.txt; chmod 777 currentimage.txt' 2>&1
2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
|9866|30:30|image| Permission denied, please try again.
|9866|30:30|image| Permission denied, please try again.
|9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH command executed on power01, command:
|9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'chown root currentimage.txt; chmod 777 currentimage.txt' 2>&1
|9866|30:30|image| returning (255, "Permission denied, please try ...")
2011-07-05 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5685)|updated ownership and permissions on currentimage.txt
2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5380)|executing SSH command on power01:
|9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e "rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimagerevision_id=34\r\nimagerevision_datecreated=2011-07-05 11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu">  currentimage.txt&&  cat currentimage.txt' 2>&1
2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5464)|run_ssh_command output:
|9866|30:30|image| Permission denied, please try again.
|9866|30:30|image| Permission denied, please try again.
|9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
2011-07-05 11:03:25|9866|30:30|image|utils.pm:run_ssh_command(5474)|SSH command executed on power01, command:
|9866|30:30|image| /usr/bin/ssh -i /etc/vcl/vcl.key  -o StrictHostKeyChecking=no -l root -p 22 -x power01 'echo -e "rh5image-power010701bi34-v0\r\nid=34\r\nprettyname=power010701bi\r\nimagerevision_id=34\r\nimagerevision_datecreated=2011-07-05 11:03:25\r\ncomputer_id=2\r\ncomputer_hostname=power01.bluegrit.cs.umbc.edu">  currentimage.txt&&  cat currentimage.txt' 2>&1
|9866|30:30|image| returning (255, "Permission denied, please try ...")
|9866|30:30|image| ---- WARNING ----
|9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|utils.pm:write_currentimage_txt(5699)|failed to create currentimage.txt file on power01:
|9866|30:30|image| Permission denied, please try again.
|9866|30:30|image| Permission denied, please try again.
|9866|30:30|image| Permission denied (publickey,gssapi-with-mic,password).
|9866|30:30|image| ( 0) utils.pm, write_currentimage_txt (line: 5699)
|9866|30:30|image| (-1) xCAT2.pm, capture (line: 779)
|9866|30:30|image| (-2) image.pm, process (line: 162)
|9866|30:30|image| (-3) vcld, make_new_child (line: 568)
|9866|30:30|image| (-4) vcld, main (line: 346)
|9866|30:30|image| ---- WARNING ----
|9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|xCAT2.pm:capture(783)|unable to update currentimage.txt on power01
|9866|30:30|image| ( 0) xCAT2.pm, capture (line: 783)
|9866|30:30|image| (-1) image.pm, process (line: 162)
|9866|30:30|image| (-2) vcld, make_new_child (line: 568)
|9866|30:30|image| (-3) vcld, main (line: 346)
|9866|30:30|image| ---- WARNING ----
|9866|30:30|image| 2011-07-05 11:03:25|9866|30:30|image|image.pm:process(166)|rh5image-power010701bi34-v0 image failed to be captured by provisioning module
|9866|30:30|image| ( 0) image.pm, process (line: 166)
|9866|30:30|image| (-1) vcld, make_new_child (line: 568)
|9866|30:30|image| (-2) vcld, main (line: 346)
2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1581)|attempting to retrieve private IP address for computer: power01
2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1585)|retrieved contents of /etc/hosts on this management node, contains 158 lines
2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_computer_private_ip_address(1645)|returning IP address from /etc/hosts file: 172.20.106.1
2011-07-05 11:03:25|9866|30:30|image|utils.pm:is_inblockrequest(6163)|zero rows were returned from database select
2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:get_image_affiliation_name(2035)|image owner id: 1
2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested (information_schema) does not match handle stored in $ENV{dbh} (vcl:)
2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database handle stored in $ENV{dbh}
2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1352)|attempting to retrieve and store data for user: user.id = '1'
2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2709)|database requested (vcl) does not match handle stored in $ENV{dbh} (information_schema:)
2011-07-05 11:03:25|9866|30:30|image|utils.pm:getnewdbh(2760)|database handle stored in $ENV{dbh}
2011-07-05 11:03:25|9866|30:30|image|DataStructure.pm:retrieve_user_data(1415)|data has been retrieved for user: admin (id: 1)
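
The log above is long, so here is a rough way to pull out the entries
whose output contains an SSH "Permission denied" message. It is a sketch
only: the field layout (timestamp|pid|request:reservation|state|
file:sub(line)|message, with continuation lines starting with "|") is
inferred from the lines above, and parse_entries and ssh_denials are
made-up names.

```python
import re

# timestamp|pid|request:reservation|state|file:sub(line)|message
ENTRY = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\|(?P<pid>\d+)\|"
    r"(?P<request>\d+):(?P<reservation>\d+)\|(?P<state>[^|]+)\|"
    r"(?P<module>[^|:]+):(?P<sub>\w+)\((?P<line>\d+)\)\|(?P<message>.*)$"
)


def parse_entries(lines):
    """Yield one dict per timestamped entry. Continuation lines (those
    starting with "|") are collected in the entry's "extra" list."""
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY.match(line)
        if match:
            if current is not None:
                yield current
            current = match.groupdict()
            current["extra"] = []
        elif current is not None and line.startswith("|"):
            # Drop the "|pid|request:reservation|state|" prefix.
            current["extra"].append(line.split("|", 4)[-1].strip())
    if current is not None:
        yield current


def ssh_denials(lines):
    """Return (module, sub, line) for entries whose continuation output
    mentions an SSH permission failure."""
    return [
        (entry["module"], entry["sub"], int(entry["line"]))
        for entry in parse_entries(lines)
        if any("Permission denied" in extra for extra in entry["extra"])
    ]
```

Feeding it the log file, for example ssh_denials(open(path)) with path
pointing at a saved copy, lists the run_ssh_command calls that failed to
log in to power01 with /etc/vcl/vcl.key.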




On 6/24/11 10:13 AM, Josh Thompson wrote:
> Sunil,
>
> "nodeset <nodename> image" sets up all the xCAT stuff so that the next time
> the node is booted, it will boot the stateless/statelite image and capture an
> image of the node.
>
> Can you double check that you have 'os' in the nodetype table set to image for
> the node you are using?  If you look in the partimageng.pm xCAT module, you
> see toward the top where it registers the "handled_commands".  The "mk" gets
> stripped off.  So, that module is registering "install" and "image" for os
> type = "image".  As long as you have os in the nodetype table set to image, it
> should be using that module.
>
> You will need to make sure you have all of the required files in locations
> using 'ppc64' as the arch.
>
> Josh
>
> On Wednesday June 22, 2011, Sunil Venkatesh wrote:
>> Hi,
>>
>> Update !
>>
>> I was able to fix the problem that I was facing with the scripts by
>> disabling the firewall. But I still have a problem with the command:
>>
>> nodeset <nodename> image
>>
>> Unless this error is fixed, I don't think partimage will work. Am I right
>> here?
>>
>> Thanks,
>> Sunil
>>
>> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <su...@umbc.edu> wrote:
>>> Josh,
>>>
>>> I have reached a point where I am able to boot the ppc using the
>>> statelite images created using genimage. But, I was wondering how
>>> significant the following command is.
>>>
>>> nodeset <nodename> image
>>>
>>> I got the same error that Prem had mentioned.
>>>
>>>
>>> power01: Error: Unable to identify plugin for this command, check
>>> relevant tables: nodetype.os
>>> Error: Some nodes failed to set up image resources, aborting
>>>
>>> I tried changing the 'os' field to 'image' under nodetype, that doesn't
>>> seem to help. I get the same error even after the change. 'arch' in my
>>> case is set to 'ppc64'.
>>>
>>>
>>> Also, I think partimage plugin needs to be changed to support the ppc
>>> architecture, from what you had mentioned in the other thread.
>>>
>>> I am not sure what the command 'nodeset <nodename> image' does, but I am
>>> able to boot the statelite images by making changes to the yaboot
>>> configuration files. The ppc blade currently uses LVM, that needs to be
>>> replaced with ext2/ext3 from what I read from the other thread, am I
>>> right? Also, just out of curiosity I left the statelite image to boot
>>> with my current setting. I can see the xcat script throwing an error-
>>>
>>> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No such
>>> file or directory
>>> /tmp/mypostscript: line 16: updateflag.awk: command not found
>>>
>>> both getpostscript.awk & updateflag.awk are not found in the rootimg
>>> created by genimage. Is there any place I could find these scripts?
>>>
>>> Also, please correct me if there is anything wrong with the procedure I
>>> am following.
>>>
>>>
>>> Thanks in advance.
>>>
>>> Regards,
>>> Sunil
>>>
>>> On 6/13/11 4:13 PM, Josh Thompson wrote:
>>>> Sunil,
>>>>
>>>> From what I remember, I didn't have to do much to the rootimg.gz image
>>>> to make it work.  I created the files I supply before xCAT started
>>>> using "statelite" instead of "stateless".  I think statelite uses NFS
>>>> to mount the image, and stateless uses an image file downloaded to the
>>>> node and run out of RAM.  Since generating a statelite image is pretty
>>>> straightforward use of xCAT, you may want to ask on the xcat-user email
>>>> list for help with it.
>>>>
>>>> Unless you can have the admins of the other dhcp server on your network
>>>> exclude the MAC addresses of your blades, you'll need to create a
>>>> separate private network to control your VCL stuff, either physically
>>>> or with VLANs.
>>>>
>>>> If they can exclude the MACs, you can set up the dhcp server on your
>>>> management node to only answer to requests from your blades.
>>>>
>>>> Josh
>>>>
>>>> On Monday June 13, 2011, Sunil Venkatesh wrote:
>>>>> Josh,
>>>>>
>>>>> Again, Thank you for your valuable inputs. I have got to the point
>>>>> where I can get the compute node to boot using the stateless images. I
>>>>> had to manually configure the netboot since we already had a DHCP
>>>>> server which is not the same as our Management node. Since our setup
>>>>> is not in an isolated environment, I could not let xcat handle the
>>>>> dhcp & netboot configuration (it messed up our network configuration
>>>>> when I let xcat handle it; we had 2 dhcp servers running at that
>>>>> point). Are you aware of any way to let xcat handle such scenarios?
>>>>>
>>>>> Although I am able to get the compute node to boot with the kernel
>>>>> image & initrd, and NFS mount the rootimg that was generated using
>>>>> 'genimage', I am getting the following error on the compute node's
>>>>> console -
>>>>>
>>>>>       FATAL error: could not get the entries from litefile table...
>>>>>
>>>>> after going thru the init-scripts, I found out 'xCATCmd' binary is not
>>>>> present in the rootimg. I am currently checking the xcat packages for
>>>>> its availability. If you know the procedure to get it onto the compute
>>>>> node, please let me know the same.
>>>>>
>>>>> Appreciate your support.
>>>>>
>>>>> Thanking you,
>>>>> Sunil
>>>>>
>>>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
>>>>>> Sunil,
>>>>>>
>>>>>> I don't recall seeing any documentation on those parts.  I had to poke
>>>>>> around looking at parts of xCAT to see how it worked.  It's been a few
>>>>>> years since I did that; so, I don't remember much about the process.
>>>>>> My recommendation would be to start looking at things in the
>>>>>> rootimg.gz image.  Looking at it now, I see that
>>>>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It looks
>>>>>> like it downloads all of the postscripts from the management node and
>>>>>> then runs getpostscript.awk, which issues a command to xcatd to get the
>>>>>> primary postscript for that machine.  I've forgotten how xcatd then
>>>>>> builds the primary postscript. I do remember that in the
>>>>>> partimageng.pm module, I had it add the partimageng postscript.
>>>>>>
>>>>>> So, you'll really have to start digging through how the xcat
>>>>>> postscript system works.
>>>>>>
>>>>>> Josh
>>>>>>
>>>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
>>>>>>> Josh,
>>>>>>>
>>>>>>> Is there any place I could find some details on
>>>>>>>
>>>>>>> "... /Once the compute node is booted with the stateless
>>>>>>> image, it uses NFS to mount some things from the management node, and
>>>>>>> then runs some xcat postscripts,/.... "
>>>>>>>
>>>>>>> I have the stateless images ready with partimage compiled for PPC.
>>>>>>> For the compute node (power 7) to boot using the stateless images, I
>>>>>>> need to configure yaboot instead of pxeboot (which is specific to
>>>>>>> x86). I wanted to know where in the startup files the execution of
>>>>>>> partimage and the NFS mount is configured. Is it configured by the
>>>>>>> "genimage" command itself? Considering the way in which the nodes are
>>>>>>> configured in the network, it would not be a good idea to let xcat
>>>>>>> take care of configuring details like DHCPD for netboot. So, I need
>>>>>>> to make changes to the configuration files manually, which is why
>>>>>>> this query came up.
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sunil
>>>>>>>
>>>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
>>>>>>>> Sunil,
>>>>>>>>
>>>>>>>> The "stateless" image I refer to is what is actually booted on the
>>>>>>>> compute node containing the image to be captured.  It's called
>>>>>>>> stateless because it is loaded completely in RAM and does not
>>>>>>>> maintain any state when a reboot occurs.
>>>>>>>>
>>>>>>>> The partimage binary is part of this stateless image and actually
>>>>>>>> runs on the compute node.  It does not run on the management node.
>>>>>>>> The management node does not have block level access to the disk on
>>>>>>>> the compute node to be able to capture the image from the disk.
>>>>>>>>
>>>>>>>> I'll try to describe the process a little better.  The management
>>>>>>>> node issues a reboot command to the compute node.  The compute node
>>>>>>>> uses PXE to load and boot a kernel (vmlinuz), initial RAM disk
>>>>>>>> (initrd.img), and a root filesystem (rootimg.gz) from the management
>>>>>>>> node.  All three of these together make up the stateless image.  Once
>>>>>>>> the compute node is booted with the stateless image, it uses NFS to
>>>>>>>> mount some things from the management node, and then runs some xcat
>>>>>>>> postscripts, one of which is the partimageng postscript.  This
>>>>>>>> postscript determines what partitions are on the compute node and,
>>>>>>>> depending on how the postscript is configured, uses partimage or
>>>>>>>> partimageng to capture an image of the compute node disk that is then
>>>>>>>> saved to the management node.  When it is finished capturing the
>>>>>>>> image, it notifies xcat on the management node and then reboots.  xcat
>>>>>>>> reconfigures itself to tell the compute node to boot off of disk at
>>>>>>>> next boot.  When the compute node comes up, it uses PXE to ask the
>>>>>>>> management node how to boot.  The management node tells it to boot off
>>>>>>>> of disk.
>>>>>>>>
>>>>>>>> I hope that clarifies how the system works.  If any of it is
>>>>>>>> unclear, please ask for further clarification.
>>>>>>>>
>>>>>>>> Josh
>>>>>>>>
>>>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
>>>>>>>>> Josh,
>>>>>>>>>
>>>>>>>>> I had one more clarification.
>>>>>>>>>
>>>>>>>>> partimage binaries run in the management node to capture an
>>>>>>>>> (stateless) image from the compute node right? In that case, is
>>>>>>>>> there a need for these binaries to go into the rootimg.gz??
>>>>>>>>>
>>>>>>>>> My assumption is, partimage runs on the management node (an intel
>>>>>>>>> blade in our case) to capture a stateless image from a compute node
>>>>>>>>> (a power 7 blade) and stores these images under " /install " of the
>>>>>>>>> management node. Please correct me if I am wrong here.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Sunil
>>>>>>>>>
>>>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
>>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>>>>> Hash: SHA1
>>>>>>>>>>
>>>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I used the steps that were mentioned under
>>>>>>>>>>>
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>>>>>>>>>>
>>>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need to
>>>>>>>>>>> change references to x86 & x86_64 (as directories) to reflect the
>>>>>>>>>>> ppc architecture, as the web page says "The architecture for the
>>>>>>>>>>> node must always be set to x86 for this..". I have with me the
>>>>>>>>>>> vmlinuz (kernel image) and initrd for the capture process. The 2
>>>>>>>>>>> nodeset commands
>>>>>>>>>> By this, do you mean you have vmlinuz and initrd for your power
>>>>>>>>>> blades, not the ones linked to off of the page you listed above?
>>>>>>>>>> If you do, that's a good start.  However, you'll also need
>>>>>>>>>> rootimg.gz. rootimg.gz is the root filesystem for the stateless
>>>>>>>>>> image.  It also contains the partimage and partimageng binaries.
>>>>>>>>>> Assuming partimage or partimageng can actually capture partitions
>>>>>>>>>> from power systems, you'll need to compile at least one of them
>>>>>>>>>> to run on power.  For the rootimg.gz image I provided, I compiled
>>>>>>>>>> them statically so that I didn't have to worry about including
>>>>>>>>>> any library dependencies in rootimg.gz.
>>>>>>>>>>
>>>>>>>>>> It would be a good idea to research how to use xcat's genimage
>>>>>>>>>> command to generate stateless images to learn how to do this.
>>>>>>>>>>
>>>>>>>>>> If there's any part of the above that you don't fully understand,
>>>>>>>>>> please ask me to clarify it.  Until you have a stateless image
>>>>>>>>>> that you can deploy to your power blades, there's no point in
>>>>>>>>>> trying to debug any VCL specific items.
>>>>>>>>>>
>>>>>>>>>> Josh
>>>>>>>>>> - --
>>>>>>>>>> - ------------------------------**-
>>>>>>>>>> Josh Thompson
>>>>>>>>>> VCL Developer
>>>>>>>>>> North Carolina State University
>>>>>>>>>>
>>>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
>>>>>>>>>> -----BEGIN PGP SIGNATURE-----
>>>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>>>>>>>>
>>>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/**LQcNdtPQNnVgCbB9ZFJn0+C45RC/**
>>>>>>>>>> g75RqGZY/j
>>>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4O**Ma
>>>>>>>>>> =exBV
>>>>>>>>>> -----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
Sunil,

"nodeset <nodename> image" sets up all the xCAT stuff so that the next time 
the node is booted, it will boot the stateless/statelite image and capture an 
image of the node.

Can you double check that you have 'os' in the nodetype table set to image for 
the node you are using?  If you look in the partimageng.pm xCAT module, you'll 
see toward the top where it registers the "handled_commands".  The "mk" prefix 
gets stripped off.  So, that module is registering "install" and "image" for os 
type = "image".  As long as you have os in the nodetype table set to image, it 
should be using that module.
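
As a rough illustration of the matching described above (a simplified model, 
not xCAT's actual Perl code), the Python sketch below looks up a plugin for a 
nodeset subcommand.  The registry layout, the "table:column=pattern" rule 
strings, and the find_plugin helper are assumptions made for this example.

```python
import re

# Hypothetical registration table modeled on the description above: each
# plugin maps a "mk"-prefixed command to a "table:column=pattern" rule.
HANDLED_COMMANDS = {
    "partimageng": {
        "mkinstall": "nodetype:os=image",
        "mkimage": "nodetype:os=image",
    },
}


def find_plugin(subcommand, node_attrs, registry=HANDLED_COMMANDS):
    """Return the name of the plugin whose rule matches the node, or None."""
    for plugin, commands in registry.items():
        # "nodeset <node> image" corresponds to the registered "mkimage".
        rule = commands.get("mk" + subcommand)
        if rule is None:
            continue
        table_column, pattern = rule.split("=", 1)
        table, column = table_column.split(":", 1)
        value = node_attrs.get(table, {}).get(column, "")
        if re.fullmatch(pattern, value):
            return plugin
    return None


node = {"nodetype": {"os": "image", "arch": "ppc64"}}
print(find_plugin("image", node))                             # partimageng
print(find_plugin("image", {"nodetype": {"os": "rhels5"}}))   # None
```

In this model, a node whose nodetype.os is anything other than "image" finds 
no plugin, which is the situation the "Unable to identify plugin" error 
points at.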

You will need to make sure you have all of the required files in locations 
using 'ppc64' as the arch.
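
A quick way to check for the files is sketched below in Python.  The three 
file names come from earlier in this thread; the <image_root>/<arch>/ 
directory layout and the missing_capture_files helper are assumptions for 
illustration only, so adjust them to wherever your setup keeps these files.

```python
import tempfile
from pathlib import Path

# File names mentioned in this thread; the directory layout is an assumption.
REQUIRED_FILES = ("vmlinuz", "initrd.img", "rootimg.gz")


def missing_capture_files(image_root, arch="ppc64"):
    """Return the required file names that are absent under image_root/arch."""
    arch_dir = Path(image_root) / arch
    return [name for name in REQUIRED_FILES if not (arch_dir / name).is_file()]


# Demo against a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "ppc64").mkdir()
    (Path(root) / "ppc64" / "vmlinuz").touch()
    print(missing_capture_files(root))  # ['initrd.img', 'rootimg.gz']
```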

Josh

On Wednesday June 22, 2011, Sunil Venkatesh wrote:
> Hi,
> 
> Update !
> 
> I was able to fix the problem that I was facing with the scripts by
> disabling the firewall. But, I still have a problem with the command-
> 
> nodeset <nodename> image
> 
> Unless this error is fixed, I don't think partimage will work. Am I right
> here?
> 
> Thanks,
> Sunil
> 
> On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <su...@umbc.edu> wrote:
> > Josh,
> > 
> > I have reached a point where I am able to boot the ppc using the
> > statelite images created using genimage. But, I was wondering how
> > significant the following command is.
> > 
> > nodeset <nodename> image
> > 
> > I got the same error that Prem had mentioned.
> > 
> > 
> > power01: Error: Unable to identify plugin for this command, check
> > relevant tables: nodetype.os
> > Error: Some nodes failed to set up image resources, aborting
> > 
> > I tried changing the 'os' field to 'image' under nodetype, that doesn't
> > seem to help. I get the same error even after the change. 'arch' in my
> > case is set to 'ppc64'.
> > 
> > 
> > Also, I think partimage plugin needs to be changed to support the ppc
> > architecture, from what you had mentioned in the other thread.
> > 
> > I am not sure what the command 'nodeset <nodename> image' does, but, I am
> > able to boot the statelite images by making changes to the yaboot
> > configuration files. The ppc blade currently uses LVM, that needs to be
> > replaced with ext2/ext3 from what I read from the other thread, am I
> > right? Also, just out of curiosity I left the statelite image to boot
> > with my current setting. I can see the xcat script throwing an error-
> > 
> > /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No such
> > file or directory
> > /tmp/mypostscript: line 16: updateflag.awk: command not found
> > 
> > both getpostscript.awk & updateflag.awk are not found in the rootimg
> > created by genimage. Is there any place I could find these scripts?
> > 
> > Also, please correct me if there is anything wrong with the procedure I
> > am following.
> > 
> > 
> > Thanks in advance.
> > 
> > Regards,
> > Sunil
> > 
> > On 6/13/11 4:13 PM, Josh Thompson wrote:
> >> Sunil,
> >> 
> >> From what I remember, I didn't have to do much to the rootimg.gz image
> >> to make it work.  I created the files I supply before xCAT started
> >> using "statelite" instead of "stateless".  I think statelite uses NFS
> >> to mount the image, and stateless uses an image file downloaded to the
> >> node and run out of RAM.  Since generating a statelite image is pretty
> >> straightforward use of xCAT, you may want to ask on the xcat-user email
> >> list for help with it.
> >> 
> >> Unless you can have the admins of the other dhcp server on your network
> >> exclude the MAC addresses of your blades, you'll need to create a
> >> separate private network to control your VCL stuff, either physically
> >> or with VLANs.
> >> 
> >> If they can exclude the MACs, you can set up the dhcp server on your
> >> management node to only answer to requests from your blades.
> >> 
> >> Josh
> >> 
> >> On Monday June 13, 2011, Sunil Venkatesh wrote:
> >>> Josh,
> >>> 
> >>> Again, Thank you for your valuable inputs. I have got to the point
> >>> where I can get the compute node to boot using the stateless images. I
> >>> had to manually configure the netboot since we already had a DHCP
> >>> server which is not the same as our Management node. Since our setup
> >>> is not in an isolated environment, I could not let xcat handle the
> >>> dhcp & netboot configuration (it messed up our network configuration
> >>> when I let xcat handle it; we had 2 dhcp servers running at that
> >>> point). Are you aware of any way to let xcat handle such scenarios?
> >>> 
> >>> Although I am able to get the compute node to boot with the kernel
> >>> image & initrd, and NFS mount the rootimg that was generated using
> >>> 'genimage', I am getting the following error on the compute node's
> >>> console -
> >>> 
> >>>      FATAL error: could not get the entries from litefile table...
> >>> 
> >>> after going thru the init-scripts, I found out 'xCATCmd' binary is not
> >>> present in the rootimg. I am currently checking the xcat packages for
> >>> its availability. If you know the procedure to get it onto the compute
> >>> node, please let me know the same.
> >>> 
> >>> Appreciate your support.
> >>> 
> >>> Thanking you,
> >>> Sunil
> >>> 
> >>> On 6/8/11 9:02 AM, Josh Thompson wrote:
> >>>> Sunil,
> >>>> 
> >>>> I don't recall seeing any documentation on those parts.  I had to poke
> >>>> around looking at parts of xCAT to see how it worked.  It's been a few
> >>>> years since I did that; so, I don't remember much about the process. 
> >>>> My recommendation would be to start looking at things in the
> >>>> rootimg.gz image.  Looking at it now, I see that
> >>>> /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It looks
> >>>> like it downloads all of the postscripts from the management node and
> >>>> then runs getpostscript.awk, which issues a command to xcatd to get the
> >>>> primary postscript for that machine.  I've forgotten how xcatd then
> >>>> builds the primary postscript. I do remember that in the
> >>>> partimageng.pm module, I had it add the partimageng postscript.
> >>>> 
> >>>> So, you'll really have to start digging through how the xcat
> >>>> postscript system works.
> >>>> 
> >>>> Josh
> >>>> 
> >>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
> >>>>> Josh,
> >>>>> 
> >>>>> Is there any place I could find some details on
> >>>>> 
> >>>>> "... /Once the compute node is booted with the stateless
> >>>>> image, it uses NFS to mount some things from the management node, and
> >>>>> then runs some xcat postscripts,/.... "
> >>>>> 
> >>>>> I have the stateless images ready with partimage compiled for PPC.
> >>>>> For the compute node (power 7) to boot using the stateless images, I
> >>>>> need to configure yaboot instead of pxeboot (which is specific to
> >>>>> x86). I wanted to know where in the startup files the execution of
> >>>>> partimage and the NFS mount is configured. Is it configured by the
> >>>>> "genimage" command itself? Considering the way in which the nodes are
> >>>>> configured in the network, it would not be a good idea to let xcat
> >>>>> take care of configuring details like DHCPD for netboot. So, I need
> >>>>> to make changes to the configuration files manually, which is why
> >>>>> this query came up.
> >>>>> 
> >>>>> Thanks in advance.
> >>>>> 
> >>>>> Regards,
> >>>>> Sunil
> >>>>> 
> >>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
> >>>>>> Sunil,
> >>>>>> 
> >>>>>> The "stateless" image I refer to is what is actually booted on the
> >>>>>> compute node containing the image to be captured.  It's called
> >>>>>> stateless because it is loaded completely in RAM and does not
> >>>>>> maintain any state when a reboot occurs.
> >>>>>> 
> >>>>>> The partimage binary is part of this stateless image and actually
> >>>>>> runs on the compute node.  It does not run on the management node. 
> >>>>>> The management node does not have block level access to the disk on
> >>>>>> the compute node to be able to capture the image from the disk.
> >>>>>> 
> >>>>>> I'll try to describe the process a little better.  The management
> >>>>>> node issues a reboot command to the compute node.  The compute node
> >>>>>> uses PXE to load and boot a kernel (vmlinuz), initial RAM disk
> >>>>>> (initrd.img), and a root filesystem (rootimg.gz) from the management
> >>>>>> node.  All three of these together make up the stateless image.  Once
> >>>>>> the compute node is booted with the stateless image, it uses NFS to
> >>>>>> mount some things from the management node, and then runs some xcat
> >>>>>> postscripts, one of which is the partimageng postscript.  This
> >>>>>> postscript determines what partitions are on the compute node and,
> >>>>>> depending on how the postscript is configured, uses partimage or
> >>>>>> partimageng to capture an image of the compute node disk that is then
> >>>>>> saved to the management node.  When it is finished capturing the
> >>>>>> image, it notifies xcat on the management node and then reboots.  xcat
> >>>>>> reconfigures itself to tell the compute node to boot off of disk at
> >>>>>> next boot.  When the compute node comes up, it uses PXE to ask the
> >>>>>> management node how to boot.  The management node tells it to boot off
> >>>>>> of disk.
> >>>>>> 
> >>>>>> I hope that clarifies how the system works.  If any of it is
> >>>>>> unclear, please ask for further clarification.
> >>>>>> 
> >>>>>> Josh
> >>>>>> 
> >>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
> >>>>>>> Josh,
> >>>>>>> 
> >>>>>>> I had one more clarification.
> >>>>>>> 
> >>>>>>> partimage binaries run in the management node to capture an
> >>>>>>> (stateless) image from the compute node right? In that case, is
> >>>>>>> there a need for these binaries to go into the rootimg.gz??
> >>>>>>> 
> >>>>>>> My assumption is, partimage runs on the management node (an intel
> >>>>>>> blade in our case) to capture a stateless image from a compute node
> >>>>>>> (a power 7 blade) and stores these images under " /install " of the
> >>>>>>> management node. Please correct me if I am wrong here.
> >>>>>>> 
> >>>>>>> Regards,
> >>>>>>> Sunil
> >>>>>>> 
> >>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
> >>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
> >>>>>>>> Hash: SHA1
> >>>>>>>> 
> >>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
> >>>>>>>>> Hi,
> >>>>>>>>> 
> >>>>>>>>> I used the steps that were mentioned under
> >>>>>>>>> 
> >>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
> >>>>>>>>> 
> >>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need to
> >>>>>>>>> change references to x86 & x86_64 (as directories) to reflect the
> >>>>>>>>> ppc architecture, as the web page says "The architecture for the
> >>>>>>>>> node must always be set to x86 for this..". I have with me the
> >>>>>>>>> vmlinuz (kernel image) and initrd for the capture process. The 2
> >>>>>>>>> nodeset commands
> >>>>>>>> 
> >>>>>>>> By this, do you mean you have vmlinuz and initrd for your power
> >>>>>>>> blades, not the ones linked to off of the page you listed above? 
> >>>>>>>> If you do, that's a good start.  However, you'll also need
> >>>>>>>> rootimg.gz. rootimg.gz is the root filesystem for the stateless
> >>>>>>>> image.  It also contains the partimage and partimageng binaries. 
> >>>>>>>> Assuming partimage or partimageng can actually capture partitions
> >>>>>>>> from power systems, you'll need to compile at least one of them
> >>>>>>>> to run on power.  For the rootimg.gz image I provided, I compiled
> >>>>>>>> them statically so that I didn't have to worry about including
> >>>>>>>> any library dependencies in rootimg.gz.
> >>>>>>>> 
> >>>>>>>> It would be a good idea to research how to use xcat's genimage
> >>>>>>>> command to generate stateless images to learn how to do this.
> >>>>>>>> 
> >>>>>>>> If there's any part of the above that you don't fully understand,
> >>>>>>>> please ask me to clarify it.  Until you have a stateless image
> >>>>>>>> that you can deploy to your power blades, there's no point in
> >>>>>>>> trying to debug any VCL specific items.
> >>>>>>>> 
> >>>>>>>> Josh
> >>>>>>>> - --
> >>>>>>>> - ------------------------------**-
> >>>>>>>> Josh Thompson
> >>>>>>>> VCL Developer
> >>>>>>>> North Carolina State University
> >>>>>>>> 
> >>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
> >>>>>>>> -----BEGIN PGP SIGNATURE-----
> >>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
> >>>>>>>> 
> >>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/**LQcNdtPQNnVgCbB9ZFJn0+C45RC/**
> >>>>>>>> g75RqGZY/j
> >>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4O**Ma
> >>>>>>>> =exBV
> >>>>>>>> -----END PGP SIGNATURE-----
-- 
-------------------------------
Josh Thompson
Systems Programmer
Advanced Computing | VCL Developer
North Carolina State University

Josh_Thompson@ncsu.edu
919-515-5323

my GPG/PGP key can be found at pgp.mit.edu


Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Hi,

Update!

I was able to fix the problem that I was facing with the scripts by
disabling the firewall. But, I still have a problem with the command:

nodeset <nodename> image

Unless this error is fixed, I don't think partimage will work. Am I right
here?

Thanks,
Sunil



On Tue, Jun 21, 2011 at 3:13 PM, Sunil Venkatesh <su...@umbc.edu> wrote:

> Josh,
>
> I have reached a point where I am able to boot the ppc using the statelite
> images created using genimage. But, I was wondering how significant the
> following command is.
>
> nodeset <nodename> image
>
> I got the same error that Prem had mentioned.
>
>
> power01: Error: Unable to identify plugin for this command, check relevant
> tables: nodetype.os
> Error: Some nodes failed to set up image resources, aborting
>
> I tried changing the 'os' field to 'image' under nodetype, that doesn't
> seem to help. I get the same error even after the change. 'arch' in my case
> is set to 'ppc64'.
>
>
> Also, I think partimage plugin needs to be changed to support the ppc
> architecture, from what you had mentioned in the other thread.
>
> I am not sure what the command 'nodeset <nodename> image' does, but, I am
> able to boot the statelite images by making changes to the yaboot
> configuration files. The ppc blade currently uses LVM, that needs to be
> replaced with ext2/ext3 from what I read from the other thread, am I right?
> Also, just out of curiosity I left the statelite image to boot with my
> current setting. I can see the xcat script throwing an error-
>
> /opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No such
> file or directory
> /tmp/mypostscript: line 16: updateflag.awk: command not found
>
> both getpostscript.awk & updateflag.awk are not found in the rootimg
> created by genimage. Is there any place I could find these scripts?
>
> Also, please correct me if there is anything wrong with the procedure I am
> following.
>
>
> Thanks in advance.
>
> Regards,
> Sunil
>
> On 6/13/11 4:13 PM, Josh Thompson wrote:
>
>> Sunil,
>>
>> From what I remember, I didn't have to do much to the rootimg.gz image
>> to make it work.  I created the files I supply before xCAT started
>> using "statelite" instead of "stateless".  I think statelite uses NFS
>> to mount the image, and stateless uses an image file downloaded to the
>> node and run out of RAM.  Since generating a statelite image is pretty
>> straightforward use of xCAT, you may want to ask on the xcat-user email
>> list for help with it.
>>
>> Unless you can have the admins of the other dhcp server on your network
>> exclude the MAC addresses of your blades, you'll need to create a separate
>> private network to control your VCL stuff, either physically or with
>> VLANs.
>>
>> If they can exclude the MACs, you can set up the dhcp server on your
>> management node to only answer to requests from your blades.
>>
>> Josh
>>
>> On Monday June 13, 2011, Sunil Venkatesh wrote:
>>
>>> Josh,
>>>
>>> Again, Thank you for your valuable inputs. I have got to the point where
>>> I can get the compute node to boot using the stateless images. I had to
>>> manually configure the netboot since we already had a DHCP server which
>>> is not the same as our Management node. Since our setup is not in an
>>> isolated environment, I could not let xcat handle the dhcp & netboot
>>> configuration (it messed up our network configuration when I let xcat
>>> handle it; we had 2 dhcp servers running at that point). Are you aware of
>>> any way to let xcat handle such scenarios?
>>>
>>> Although I am able to get the compute node to boot with the kernel image
>>> & initrd, and NFS mount the rootimg that was generated using 'genimage',
>>> I am getting the following error on the compute node's console -
>>>
>>>      FATAL error: could not get the entries from litefile table...
>>>
>>> after going thru the init-scripts, I found out 'xCATCmd' binary is not
>>> present in the rootimg. I am currently checking the xcat packages for
>>> its availability. If you know the procedure to get it onto the compute
>>> node, please let me know the same.
>>>
>>> Appreciate your support.
>>>
>>> Thanking you,
>>> Sunil
>>>
>>> On 6/8/11 9:02 AM, Josh Thompson wrote:
>>>
>>>> Sunil,
>>>>
>>>> I don't recall seeing any documentation on those parts.  I had to poke
>>>> around looking at parts of xCAT to see how it worked.  It's been a few
>>>> years since I did that; so, I don't remember much about the process.  My
>>>> recommendation would be to start looking at things in the rootimg.gz
>>>> image.  Looking at it now, I see that /opt/xcat/xcatdsklspost gets run
>>>> when rootimg.gz boots.  It looks like it downloads all of the
>>>> postscripts from the management node and then runs getpostscript.awk,
>>>> which issues a command to xcatd to get the primary postscript for that
>>>> machine.  I've forgotten how xcatd then builds the primary postscript.
>>>> I do remember that in the partimageng.pm module, I had it add the
>>>> partimageng postscript.
>>>>
>>>> So, you'll really have to start digging through how the xcat postscript
>>>> system works.
>>>>
>>>> Josh
>>>>
>>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
>>>>
>>>>> Josh,
>>>>>
>>>>> Is there any place I could find some details on
>>>>>
>>>>> "... /Once the compute node is booted with the stateless
>>>>> image, it uses NFS to mount some things from the management node, and
>>>>> then runs some xcat postscripts,/.... "
>>>>>
>>>>> I have the stateless images ready with partimage compiled for PPC.
>>>>> For the compute node (power 7) to boot using the stateless images, I
>>>>> need to configure yaboot instead of pxeboot (which is specific to
>>>>> x86). I wanted to know where in the startup files the execution of
>>>>> partimage and the NFS mount is configured. Is it configured by the
>>>>> "genimage" command itself? Considering the way in which the nodes are
>>>>> configured in the network, it would not be a good idea to let xcat
>>>>> take care of configuring details like DHCPD for netboot. So, I need
>>>>> to make changes to the configuration files manually, which is why
>>>>> this query came up.
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Regards,
>>>>> Sunil
>>>>>
>>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
>>>>>
>>>>>> Sunil,
>>>>>>
>>>>>> The "stateless" image I refer to is what is actually booted on the
>>>>>> compute node containing the image to be captured.  It's called
>>>>>> stateless because it is loaded completely in RAM and does not maintain
>>>>>> any state when a reboot occurs.
>>>>>>
>>>>>> The partimage binary is part of this stateless image and actually runs
>>>>>> on the compute node.  It does not run on the management node.  The
>>>>>> management node does not have block level access to the disk on the
>>>>>> compute node to be able to capture the image from the disk.
>>>>>>
>>>>>> I'll try to describe the process a little better.  The management node
>>>>>> issues a reboot command to the compute node.  The compute node uses
>>>>>> PXE to load and boot a kernel (vmlinuz), initial RAM disk (initrd.img),
>>>>>> and a root filesystem (rootimg.gz) from the management node.  All three
>>>>>> of these together make up the stateless image.  Once the compute node
>>>>>> is booted with the stateless image, it uses NFS to mount some things
>>>>>> from the management node, and then runs some xcat postscripts, one of
>>>>>> which is the partimageng postscript.  This postscript determines what
>>>>>> partitions are on the compute node and, depending on how the postscript
>>>>>> is configured, uses partimage or partimageng to capture an image of the
>>>>>> compute node disk that is then saved to the management node.  When it
>>>>>> is finished capturing the image, it notifies xcat on the management
>>>>>> node and then reboots.  xcat reconfigures itself to tell the compute
>>>>>> node to boot off of disk at next boot.  When the compute node comes up,
>>>>>> it uses PXE to ask the management node how to boot.  The management
>>>>>> node tells it to boot off of disk.
>>>>>>
>>>>>> I hope that clarifies how the system works.  If any of it is unclear,
>>>>>> please ask for further clarification.
>>>>>>
>>>>>> Josh
>>>>>>
>>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
>>>>>>
>>>>>>> Josh,
>>>>>>>
>>>>>>> I had one more clarification.
>>>>>>>
>>>>>>> partimage binaries run in the management node to capture an
>>>>>>> (stateless) image from the compute node right? In that case, is there
>>>>>>> a need for these binaries to go into the rootimg.gz??
>>>>>>>
>>>>>>> My assumption is, partimage runs on the management node (an intel
>>>>>>> blade in our case) to capture a stateless image from a compute node
>>>>>>> (a power 7 blade) and stores these images under " /install " of the
>>>>>>> management node. Please correct me if I am wrong here.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sunil
>>>>>>>
>>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
>>>>>>>
>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>>> Hash: SHA1
>>>>>>>>
>>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I used the steps that were mentioned under
>>>>>>>>>
>>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>>>>>>>>
>>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need to
>>>>>>>>> change references to x86 & x86_64 (as directories) to reflect the
>>>>>>>>> ppc architecture, as the web page says "The architecture for the
>>>>>>>>> node must always be set to x86 for this..". I have with me the
>>>>>>>>> vmlinuz (kernel image) and initrd for the capture process. The 2
>>>>>>>>> nodeset commands
>>>>>>>>>
>>>>>>>> By this, do you mean you have vmlinuz and initrd for your power
>>>>>>>> blades, not the ones linked to off of the page you listed above?  If
>>>>>>>> you do, that's a good start.  However, you'll also need rootimg.gz.
>>>>>>>> rootimg.gz is the root filesystem for the stateless image.  It also
>>>>>>>> contains the partimage and partimageng binaries.  Assuming partimage
>>>>>>>> or partimageng can actually capture partitions from power systems,
>>>>>>>> you'll need to compile at least one of them to run on power.  For
>>>>>>>> the rootimg.gz image I provided, I compiled them statically so that
>>>>>>>> I didn't have to worry about including any library dependencies in
>>>>>>>> rootimg.gz.
>>>>>>>>
>>>>>>>> It would be a good idea to research how to use xcat's genimage
>>>>>>>> command to generate stateless images to learn how to do this.
>>>>>>>>
>>>>>>>> If there's any part of the above that you don't fully understand,
>>>>>>>> please ask me to clarify it.  Until you have a stateless image that
>>>>>>>> you can deploy to your power blades, there's no point in trying to
>>>>>>>> debug any VCL specific items.
>>>>>>>>
>>>>>>>> Josh
>>>>>>>> - --
>>>>>>>> - -------------------------------
>>>>>>>> Josh Thompson
>>>>>>>> VCL Developer
>>>>>>>> North Carolina State University
>>>>>>>>
>>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
>>>>>>>> -----BEGIN PGP SIGNATURE-----
>>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>>>>>>
>>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
>>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
>>>>>>>> =exBV
>>>>>>>> -----END PGP SIGNATURE-----
>>>>>>>>
>>>>>>>

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Josh,

I have reached a point where I am able to boot the ppc using the 
statelite images created using genimage. But, I was wondering how 
significant the following command is.

nodeset <nodename> image

I got the same error that Prem had mentioned.

power01: Error: Unable to identify plugin for this command, check 
relevant tables: nodetype.os
Error: Some nodes failed to set up image resources, aborting

I tried changing the 'os' field to 'image' in the nodetype table, but that 
doesn't seem to help: I get the same error even after the change. 'arch' in 
my case is set to 'ppc64'.

Also, I think partimage plugin needs to be changed to support the ppc 
architecture, from what you had mentioned in the other thread.

I am not sure what the command 'nodeset <nodename> image' does, but I 
am able to boot the statelite images by making changes to the yaboot 
configuration files. The ppc blade currently uses LVM; from what I read 
in the other thread, that needs to be replaced with ext2/ext3. Am I 
right? Also, just out of curiosity, I left the statelite image to boot 
with my current settings. I can see the xcat script throwing an error:

/opt/xcat/xcatdsklspost: line 229: /xcatpost/getpostscript.awk: No such 
file or directory
/tmp/mypostscript: line 16: updateflag.awk: command not found

Both getpostscript.awk and updateflag.awk are missing from the rootimg 
created by genimage. Is there any place I could find these scripts?
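
One quick way to see which of these helper scripts are present is to
search a directory tree for them by name. The shell function below is
only a sketch: report_missing is a hypothetical helper, and
/install/postscripts in the usage comment is an assumption about where
the management node keeps its postscripts, so adjust it to the local
install.

```shell
#!/bin/sh
# report_missing DIR FILE...
# For each FILE, print "found: FILE" if a regular file with that name
# exists anywhere under DIR, otherwise print "missing: FILE".
report_missing() {
    dir=$1
    shift
    for name in "$@"; do
        if [ -n "$(find "$dir" -type f -name "$name" 2>/dev/null | head -n 1)" ]; then
            echo "found: $name"
        else
            echo "missing: $name"
        fi
    done
}

# Example usage (the directory is an assumption, not a confirmed path):
# report_missing /install/postscripts getpostscript.awk updateflag.awk
```

Running the same check against the unpacked rootimg and against the
management node's postscript directory shows on which side the files
are absent.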

Also, please correct me if there is anything wrong with the procedure I 
am following.

Thanks in advance.

Regards,
Sunil

On 6/13/11 4:13 PM, Josh Thompson wrote:
> Sunil,
>
> From what I remember, I didn't have to do much to the rootimg.gz image to make
> it work.  I created the files I supply before xCAT started using "statelite"
> instead of "stateless".  I think statelite uses NFS to mount the image, and
> stateless uses an image file downloaded to the node and run out of RAM.  Since
> generating a statelite image is pretty straightforward use of xCAT, you may
> want to ask on the xcat-user email list for help with it.
>
> Unless you can have the admins of the other dhcp server on your network
> exclude the MAC addresses of your blades, you'll need to create a separate
> private network to control your VCL stuff, either physically or with VLANs.
>
> If they can exclude the MACs, you can set up the dhcp server on your
> management node to only answer to requests from your blades.
>
> Josh
>
> On Monday June 13, 2011, Sunil Venkatesh wrote:
>> Josh,
>>
>> Again, Thank you for your valuable inputs. I have got to the point where
>> I can get the compute node to boot using the stateless images. I had to
>> manually configure the netboot since we already had a DHCP server which
>> is not the same as our Management node. Since our setup is not in an
>> isolated environment, I could not let xcat handle the dhcp & netboot
>> configuration (it messed up out network configuration when i let xcat
>> handle it,we had 2 dhcp servers running at that point). Are you aware of
>> any way to let xcat handle such scenarios?
>>
>> Although I am able to get the compute node to boot with the kernel image
>> & initrd, and NFS mount the rootimg that was generated using 'genimage',
>> I am getting the following error on the compute node's console -
>>
>>       FATAL error: could not get the entries from litefile table...
>>
>> after going thru the init-scripts, I found out 'xCATCmd' binary is not
>> present in the rootimg. I am currently checking the xcat packages for
>> its availability. If you know the procedure to get it onto the compute
>> node, please let me know the same.
>>
>> Appreciate your support.
>>
>> Thanking you,
>> Sunil
>>
>> On 6/8/11 9:02 AM, Josh Thompson wrote:
>>> Sunil,
>>>
>>> I don't recall seeing any documentation on those parts.  I had to poke
>>> around looking at parts of xCAT to see how it worked.  It's been a few
>>> years since I did that; so, I don't remember much about the process.  My
>>> recommendation would be to start looking at things in the rootimg.gz
>>> image.  Looking at it now, I see that /opt/xcat/xcatdsklspost gets run
>>> when rootimg.gz boots.  It looks like it downloads all of the
>>> postscripts from the management node and then run getpostscript.awk
>>> which issues a command to xcatd to get the primary postscript for that
>>> machine.  I've forgotten how xcatd then builds the primary postscript.
>>> I do remember that in the partimageng.pm module, I had it add the
>>> partimageng postscript.
>>>
>>> So, you'll really have to start digging through how the xcat postscript
>>> system works.
>>>
>>> Josh
>>>
>>> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
>>>> Josh,
>>>>
>>>> Is there any place I could find some details on
>>>>
>>>> "... /Once the compute node is booted with the stateless
>>>> image, it uses NFS to mount some things from the management node, and
>>>> then runs some xcat postscripts,/.... "
>>>>
>>>> I have the stateless images ready with partimage compiled for PPC. For
>>>> the compute node (power 7) to boot using the stateless images, i need to
>>>> configure the yaboot instead of pxeboot (which is specific to x86). I
>>>> wanted to know where in the startup files the execution of partimage and
>>>> NFS mount is configured. Is it configured by the "genimage" command
>>>> itself? Considering the way in which the nodes are configured in the
>>>> network, it would not be a good idea to let xcat take care of
>>>> configuring the details like DHCPD for netboot. So, I need to make
>>>> changes to the configuration files manually, which is why this query
>>>> came up.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards,
>>>> Sunil
>>>>
>>>> On 6/1/11 1:39 PM, Josh Thompson wrote:
>>>>> Sunil,
>>>>>
>>>>> The "stateless" image I refer to is what is actually booted on the
>>>>> compute node containing the image to be captured.  It's called
>>>>> stateless because it is loaded completely in RAM and does not maintain
>>>>> any state when a reboot occurs.
>>>>>
>>>>> The partimage binary is part of this stateless image and actually runs
>>>>> on the compute node.  It does not run on the management node.  The
>>>>> management node does not have block level access to the disk on the
>>>>> compute node to be able to capture the image from the disk.
>>>>>
>>>>> I'll try to describe the process a little better.  The management node
>>>>> issues a reboot command to the compute node.  The compute node uses PXE
>>>>> to load and boot a kernel (vmlinuz), initial RAM disk (initrd.img), and
>>>>> a root filesystem (rootimg.gz) from the management node.  All three of
>>>>> these together make up the stateless image.  Once the compute node is
>>>>> booted with the stateless image, it uses NFS to mount some things from
>>>>> the management node, and then runs some xcat postscripts, one of which
>>>>> is the partimageng postscript.  This postscript determines what
>>>>> partitions are on the compute node and, depending on how the postscript
>>>>> is configured, uses partimage or partimageng to capture an image of the
>>>>> compute node disk that is then saved to the management node. When it is
>>>>> finished capturing the image, it notifies xcat on the management node
>>>>> and then reboots.  xcat reconfigures itself to tell the compute node to
>>>>> boot off of disk at next boot.  When the compute node comes up, it uses
>>>>> PXE to ask the management node how to boot.  The management node tells
>>>>> it to boot off of disk.
>>>>>
>>>>> I hope that clarifies how the system works.  If any of it is unclear,
>>>>> please ask for further clarification.
>>>>>
>>>>> Josh
>>>>>
>>>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
>>>>>> Josh,
>>>>>>
>>>>>> I had one more clarification.
>>>>>>
>>>>>> partimage binaries run in the management node to capture an
>>>>>> (stateless) image from the compute node right? In that case, is there
>>>>>> a need for these binaries to go into the rootimg.gz??
>>>>>>
>>>>>> My assumption is, partimage runs on the management node (an intel
>>>>>> blade in our case) to capture a stateless image from a compute node
>>>>>> (a power 7 blade) and stores these images under " /install " of the
>>>>>> management node. Please correct me if I am wrong here.
>>>>>>
>>>>>> Regards,
>>>>>> Sunil
>>>>>>
>>>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>> Hash: SHA1
>>>>>>>
>>>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I used the steps that were mentioned under
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>>>>>>>
>>>>>>>> to enable partimage support for xcat. I wasn't sure if I need to
>>>>>>>> change references to x86 & x86_64 (as directories) to reflect the
>>>>>>>> ppc architecture, as the web page says "The architecture for the
>>>>>>>> node must always be set to x86 for this..". I have with me the
>>>>>>>> vmlinuz (kernel image) and initrd for the capture process. The 2
>>>>>>>> nodeset commands
>>>>>>> By this, do you mean you have vmlinuz and initrd for your power
>>>>>>> blades, not the ones linked to off of the page you listed above?  If
>>>>>>> you do, that's a good start.  However, you'll also need rootimg.gz.
>>>>>>> rootimg.gz is the root filesystem for the stateless image.  It also
>>>>>>> contains the partimage and partimageng binaries.  Assuming partimage
>>>>>>> or partimageng can actually capture partitions from power systems,
>>>>>>> you'll need to compile at least one of them to run on power.  For
>>>>>>> the rootimg.gz image I provided, I compiled them statically so that
>>>>>>> I didn't have to worry about including any library dependencies in
>>>>>>> rootimg.gz.
>>>>>>>
>>>>>>> It would be a good idea to research how to use xcat's genimage
>>>>>>> command to generate stateless images to learn how to do this.
>>>>>>>
>>>>>>> If there's any part of the above that you don't fully understand,
>>>>>>> please ask me to clarify it.  Until you have a stateless image that
>>>>>>> you can deploy to your power blades, there's no point in trying to
>>>>>>> debug any VCL specific items.
>>>>>>>
>>>>>>> Josh
>>>>>>> - --
>>>>>>> - -------------------------------
>>>>>>> Josh Thompson
>>>>>>> VCL Developer
>>>>>>> North Carolina State University
>>>>>>>
>>>>>>> my GPG/PGP key can be found at pgp.mit.edu
>>>>>>> -----BEGIN PGP SIGNATURE-----
>>>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>>>>>
>>>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
>>>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
>>>>>>> =exBV
>>>>>>> -----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
Sunil,

From what I remember, I didn't have to do much to the rootimg.gz image to make 
it work.  I created the files I supply before xCAT started using "statelite" 
instead of "stateless".  I think statelite uses NFS to mount the image, and  
stateless uses an image file downloaded to the node and run out of RAM.  Since 
generating a statelite image is pretty straightforward use of xCAT, you may 
want to ask on the xcat-user email list for help with it.

Unless you can have the admins of the other dhcp server on your network 
exclude the MAC addresses of your blades, you'll need to create a separate 
private network to control your VCL stuff, either physically or with VLANs.

If they can exclude the MACs, you can set up the dhcp server on your 
management node to only answer to requests from your blades.
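
As a rough illustration of that second option, the dhcp server on the
management node can be given one host entry per blade. The sketch below
assumes ISC dhcpd; emit_host is a hypothetical helper, and the node
name, MAC address and IP address in the usage comment are placeholders.

```shell
#!/bin/sh
# emit_host NAME MAC IP
# Print one ISC dhcpd "host" declaration binding a MAC address to a
# fixed IP address. The output is meant to be pasted into dhcpd.conf.
emit_host() {
    printf 'host %s {\n  hardware ethernet %s;\n  fixed-address %s;\n}\n' \
        "$1" "$2" "$3"
}

# Example usage (all three values are placeholders):
# emit_host power01 00:11:22:33:44:55 10.0.0.11
```

Pairing these entries with a "deny unknown-clients;" statement in the
relevant subnet or pool is the usual ISC dhcpd way to keep the server
from answering machines that have no host entry.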

Josh

On Monday June 13, 2011, Sunil Venkatesh wrote:
> Josh,
> 
> Again, Thank you for your valuable inputs. I have got to the point where
> I can get the compute node to boot using the stateless images. I had to
> manually configure the netboot since we already had a DHCP server which
> is not the same as our Management node. Since our setup is not in an
> isolated environment, I could not let xcat handle the dhcp & netboot
> configuration (it messed up out network configuration when i let xcat
> handle it,we had 2 dhcp servers running at that point). Are you aware of
> any way to let xcat handle such scenarios?
> 
> Although I am able to get the compute node to boot with the kernel image
> & initrd, and NFS mount the rootimg that was generated using 'genimage',
> I am getting the following error on the compute node's console -
> 
>      FATAL error: could not get the entries from litefile table...
> 
> after going thru the init-scripts, I found out 'xCATCmd' binary is not
> present in the rootimg. I am currently checking the xcat packages for
> its availability. If you know the procedure to get it onto the compute
> node, please let me know the same.
> 
> Appreciate your support.
> 
> Thanking you,
> Sunil
> 
> On 6/8/11 9:02 AM, Josh Thompson wrote:
> > Sunil,
> > 
> > I don't recall seeing any documentation on those parts.  I had to poke
> > around looking at parts of xCAT to see how it worked.  It's been a few
> > years since I did that; so, I don't remember much about the process.  My
> > recommendation would be to start looking at things in the rootimg.gz
> > image.  Looking at it now, I see that /opt/xcat/xcatdsklspost gets run
> > when rootimg.gz boots.  It looks like it downloads all of the
> > postscripts from the management node and then run getpostscript.awk
> > which issues a command to xcatd to get the primary postscript for that
> > machine.  I've forgotten how xcatd then builds the primary postscript. 
> > I do remember that in the partimageng.pm module, I had it add the
> > partimageng postscript.
> > 
> > So, you'll really have to start digging through how the xcat postscript
> > system works.
> > 
> > Josh
> > 
> > On Tuesday June 07, 2011, Sunil Venkatesh wrote:
> >> Josh,
> >> 
> >> Is there any place I could find some details on
> >> 
> >> "... /Once the compute node is booted with the stateless
> >> image, it uses NFS to mount some things from the management node, and
> >> then runs some xcat postscripts,/.... "
> >> 
> >> I have the stateless images ready with partimage compiled for PPC. For
> >> the compute node (power 7) to boot using the stateless images, i need to
> >> configure the yaboot instead of pxeboot (which is specific to x86). I
> >> wanted to know where in the startup files the execution of partimage and
> >> NFS mount is configured. Is it configured by the "genimage" command
> >> itself? Considering the way in which the nodes are configured in the
> >> network, it would not be a good idea to let xcat take care of
> >> configuring the details like DHCPD for netboot. So, I need to make
> >> changes to the configuration files manually, which is why this query
> >> came up.
> >> 
> >> Thanks in advance.
> >> 
> >> Regards,
> >> Sunil
> >> 
> >> On 6/1/11 1:39 PM, Josh Thompson wrote:
> >>> Sunil,
> >>> 
> >>> The "stateless" image I refer to is what is actually booted on the
> >>> compute node containing the image to be captured.  It's called
> >>> stateless because it is loaded completely in RAM and does not maintain
> >>> any state when a reboot occurs.
> >>> 
> >>> The partimage binary is part of this stateless image and actually runs
> >>> on the compute node.  It does not run on the management node.  The
> >>> management node does not have block level access to the disk on the
> >>> compute node to be able to capture the image from the disk.
> >>> 
> >>> I'll try to describe the process a little better.  The management node
> >>> issues a reboot command to the compute node.  The compute node uses PXE
> >>> to load and boot a kernel (vmlinuz), initial RAM disk (initrd.img), and
> >>> a root filesystem (rootimg.gz) from the management node.  All three of
> >>> these together make up the stateless image.  Once the compute node is
> >>> booted with the stateless image, it uses NFS to mount some things from
> >>> the management node, and then runs some xcat postscripts, one of which
> >>> is the partimageng postscript.  This postscript determines what
> >>> partitions are on the compute node and, depending on how the postscript
> >>> is configured, uses partimage or partimageng to capture an image of the
> >>> compute node disk that is then saved to the management node. When it is
> >>> finished capturing the image, it notifies xcat on the management node
> >>> and then reboots.  xcat reconfigures itself to tell the compute node to
> >>> boot off of disk at next boot.  When the compute node comes up, it uses
> >>> PXE to ask the management node how to boot.  The management node tells
> >>> it to boot off of disk.
> >>> 
> >>> I hope that clarifies how the system works.  If any of it is unclear,
> >>> please ask for further clarification.
> >>> 
> >>> Josh
> >>> 
> >>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
> >>>> Josh,
> >>>> 
> >>>> I had one more clarification.
> >>>> 
> >>>> partimage binaries run in the management node to capture an
> >>>> (stateless) image from the compute node right? In that case, is there
> >>>> a need for these binaries to go into the rootimg.gz??
> >>>> 
> >>>> My assumption is, partimage runs on the management node (an intel
> >>>> blade in our case) to capture a stateless image from a compute node
> >>>> (a power 7 blade) and stores these images under " /install " of the
> >>>> management node. Please correct me if I am wrong here.
> >>>> 
> >>>> Regards,
> >>>> Sunil
> >>>> 
> >>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
> >>>>> -----BEGIN PGP SIGNED MESSAGE-----
> >>>>> Hash: SHA1
> >>>>> 
> >>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
> >>>>>> Hi,
> >>>>>> 
> >>>>>> I used the steps that were mentioned under
> >>>>>> 
> >>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
> >>>>>> 
> >>>>>> to enable partimage support for xcat. I wasn't sure if I need to
> >>>>>> change references to x86 & x86_64 (as directories) to reflect the
> >>>>>> ppc architecture, as the web page says "The architecture for the
> >>>>>> node must always be set to x86 for this..". I have with me the
> >>>>>> vmlinuz (kernel image) and initrd for the capture process. The 2
> >>>>>> nodeset commands
> >>>>> 
> >>>>> By this, do you mean you have vmlinuz and initrd for your power
> >>>>> blades, not the ones linked to off of the page you listed above?  If
> >>>>> you do, that's a good start.  However, you'll also need rootimg.gz. 
> >>>>> rootimg.gz is the root filesystem for the stateless image.  It also
> >>>>> contains the partimage and partimageng binaries.  Assuming partimage
> >>>>> or partimageng can actually capture partitions from power systems,
> >>>>> you'll need to compile at least one of them to run on power.  For
> >>>>> the rootimg.gz image I provided, I compiled them statically so that
> >>>>> I didn't have to worry about including any library dependencies in
> >>>>> rootimg.gz.
> >>>>> 
> >>>>> It would be a good idea to research how to use xcat's genimage
> >>>>> command to generate stateless images to learn how to do this.
> >>>>> 
> >>>>> If there's any part of the above that you don't fully understand,
> >>>>> please ask me to clarify it.  Until you have a stateless image that
> >>>>> you can deploy to your power blades, there's no point in trying to
> >>>>> debug any VCL specific items.
> >>>>> 
> >>>>> Josh
> >>>>> - --
> >>>>> - -------------------------------
> >>>>> Josh Thompson
> >>>>> VCL Developer
> >>>>> North Carolina State University
> >>>>> 
> >>>>> my GPG/PGP key can be found at pgp.mit.edu
> >>>>> -----BEGIN PGP SIGNATURE-----
> >>>>> Version: GnuPG v2.0.17 (GNU/Linux)
> >>>>> 
> >>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
> >>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
> >>>>> =exBV
> >>>>> -----END PGP SIGNATURE-----
-- 
-------------------------------
Josh Thompson
Systems Programmer
Advanced Computing | VCL Developer
North Carolina State University

Josh_Thompson@ncsu.edu
919-515-5323

my GPG/PGP key can be found at pgp.mit.edu

All electronic mail messages in connection with State business which
are sent to or received by this account are subject to the NC Public
Records Law and may be disclosed to third parties.

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Josh,

Again, thank you for your valuable input. I have got to the point where 
I can get the compute node to boot using the stateless images. I had to 
configure the netboot manually since we already had a DHCP server, which 
is not the same machine as our management node. Since our setup is not in 
an isolated environment, I could not let xcat handle the dhcp & netboot 
configuration (it messed up our network configuration when I let xcat 
handle it; we had 2 dhcp servers running at that point). Are you aware of 
any way to let xcat handle such scenarios?

Although I am able to get the compute node to boot with the kernel image 
& initrd, and NFS mount the rootimg that was generated using 'genimage', 
I am getting the following error on the compute node's console -

     FATAL error: could not get the entries from litefile table...

After going through the init scripts, I found that the 'xCATCmd' binary is 
not present in the rootimg. I am currently checking the xcat packages for 
its availability. If you know the procedure to get it onto the compute 
node, please let me know.

Appreciate your support.

Thanking you,
Sunil

On 6/8/11 9:02 AM, Josh Thompson wrote:
> Sunil,
>
> I don't recall seeing any documentation on those parts.  I had to poke around
> looking at parts of xCAT to see how it worked.  It's been a few years since I
> did that; so, I don't remember much about the process.  My recommendation
> would be to start looking at things in the rootimg.gz image.  Looking at it
> now, I see that /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It
> looks like it downloads all of the postscripts from the management node and
> then run getpostscript.awk which issues a command to xcatd to get the primary
> postscript for that machine.  I've forgotten how xcatd then builds the primary
> postscript.  I do remember that in the partimageng.pm module, I had it add the
> partimageng postscript.
>
> So, you'll really have to start digging through how the xcat postscript system
> works.
>
> Josh
>
> On Tuesday June 07, 2011, Sunil Venkatesh wrote:
>> Josh,
>>
>> Is there any place I could find some details on
>>
>> "... /Once the compute node is booted with the stateless
>> image, it uses NFS to mount some things from the management node, and then
>> runs some xcat postscripts,/.... "
>>
>> I have the stateless images ready with partimage compiled for PPC. For
>> the compute node (power 7) to boot using the stateless images, i need to
>> configure the yaboot instead of pxeboot (which is specific to x86). I
>> wanted to know where in the startup files the execution of partimage and
>> NFS mount is configured. Is it configured by the "genimage" command
>> itself? Considering the way in which the nodes are configured in the
>> network, it would not be a good idea to let xcat take care of
>> configuring the details like DHCPD for netboot. So, I need to make
>> changes to the configuration files manually, which is why this query
>> came up.
>>
>> Thanks in advance.
>>
>> Regards,
>> Sunil
>>
>> On 6/1/11 1:39 PM, Josh Thompson wrote:
>>> Sunil,
>>>
>>> The "stateless" image I refer to is what is actually booted on the
>>> compute node containing the image to be captured.  It's called stateless
>>> because it is loaded completely in RAM and does not maintain any state
>>> when a reboot occurs.
>>>
>>> The partimage binary is part of this stateless image and actually runs on
>>> the compute node.  It does not run on the management node.  The
>>> management node does not have block level access to the disk on the
>>> compute node to be able to capture the image from the disk.
>>>
>>> I'll try to describe the process a little better.  The management node
>>> issues a reboot command to the compute node.  The compute node uses PXE
>>> to load and boot a kernel (vmlinuz), initial RAM disk (initrd.img), and
>>> a root filesystem (rootimg.gz) from the management node.  All three of
>>> these together make up the stateless image.  Once the compute node is
>>> booted with the stateless image, it uses NFS to mount some things from
>>> the management node, and then runs some xcat postscripts, one of which
>>> is the partimageng postscript.  This postscript determines what
>>> partitions are on the compute node and, depending on how the postscript
>>> is configured, uses partimage or partimageng to capture an image of the
>>> compute node disk that is then saved to the management node. When it is
>>> finished capturing the image, it notifies xcat on the management node
>>> and then reboots.  xcat reconfigures itself to tell the compute node to
>>> boot off of disk at next boot.  When the compute node comes up, it uses
>>> PXE to ask the management node how to boot.  The management node tells
>>> it to boot off of disk.
>>>
>>> I hope that clarifies how the system works.  If any of it is unclear,
>>> please ask for further clarification.
>>>
>>> Josh
>>>
>>> On Wednesday June 01, 2011, Sunil Venkatesh wrote:
>>>> Josh,
>>>>
>>>> I had one more clarification.
>>>>
>>>> partimage binaries run in the management node to capture an (stateless)
>>>> image from the compute node right? In that case, is there a need for
>>>> these binaries to go into the rootimg.gz??
>>>>
>>>> My assumption is, partimage runs on the management node (an intel blade
>>>> in our case) to capture a stateless image from a compute node (a power 7
>>>> blade) and stores these images under " /install " of the management
>>>> node. Please correct me if I am wrong here.
>>>>
>>>> Regards,
>>>> Sunil
>>>>
>>>> On 6/1/11 9:58 AM, Josh Thompson wrote:
>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>> Hash: SHA1
>>>>>
>>>>> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I used the steps that were mentioned under
>>>>>>
>>>>>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>>>>>
>>>>>> to enable partimage support for xcat. I wasn't sure if I need to
>>>>>> change references to x86 & x86_64 (as directories) to reflect the
>>>>>> ppc architecture, as the web page says "The architecture for the node
>>>>>> must always be set to x86 for this..". I have with me the vmlinuz
>>>>>> (kernel image) and initrd for the capture process. The 2 nodeset
>>>>>> commands
>>>>> By this, do you mean you have vmlinuz and initrd for your power blades,
>>>>> not the ones linked to off of the page you listed above?  If you do,
>>>>> that's a good start.  However, you'll also need rootimg.gz.  rootimg.gz
>>>>> is the root filesystem for the stateless image.  It also contains the
>>>>> partimage and partimageng binaries.  Assuming partimage or partimageng
>>>>> can actually capture partitions from power systems, you'll need to
>>>>> compile at least one of them to run on power.  For the rootimg.gz image
>>>>> I provided, I compiled them statically so that I didn't have to worry
>>>>> about including any library dependencies in rootimg.gz.
>>>>>
>>>>> It would be a good idea to research how to use xcat's genimage command
>>>>> to generate stateless images to learn how to do this.
>>>>>
>>>>> If there's any part of the above that you don't fully understand,
>>>>> please ask me to clarify it.  Until you have a stateless image that
>>>>> you can deploy to your power blades, there's no point in trying to
>>>>> debug any VCL specific items.
>>>>>
>>>>> Josh
>>>>> - --
>>>>> - -------------------------------
>>>>> Josh Thompson
>>>>> VCL Developer
>>>>> North Carolina State University
>>>>>
>>>>> my GPG/PGP key can be found at pgp.mit.edu
>>>>> -----BEGIN PGP SIGNATURE-----
>>>>> Version: GnuPG v2.0.17 (GNU/Linux)
>>>>>
>>>>> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
>>>>> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
>>>>> =exBV
>>>>> -----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
Sunil,

I don't recall seeing any documentation on those parts.  I had to poke around 
looking at parts of xCAT to see how it worked.  It's been a few years since I 
did that; so, I don't remember much about the process.  My recommendation 
would be to start looking at things in the rootimg.gz image.  Looking at it 
now, I see that /opt/xcat/xcatdsklspost gets run when rootimg.gz boots.  It 
looks like it downloads all of the postscripts from the management node and 
then runs getpostscript.awk, which issues a command to xcatd to get the primary 
postscript for that machine.  I've forgotten how xcatd then builds the primary 
postscript.  I do remember that in the partimageng.pm module, I had it add the 
partimageng postscript.

So, you'll really have to start digging through how the xcat postscript system 
works.
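
One way to start looking inside rootimg.gz is to list its contents and search for the postscript entry points. The sketch below assumes the image is a gzip-compressed cpio archive in "newc" format, which is worth confirming on your own image first. The function names are illustrative and are not part of xCAT or VCL.

```python
import gzip

NEWC_HEADER_LEN = 110  # 6-byte magic followed by 13 fields of 8 hex digits


def list_newc_entries(stream):
    """Return the file names stored in a cpio "newc" archive stream."""
    names = []
    while True:
        header = stream.read(NEWC_HEADER_LEN)
        if len(header) < NEWC_HEADER_LEN or header[:6] not in (b"070701", b"070702"):
            break  # end of data, or not a newc header
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        filesize, namesize = fields[6], fields[11]
        name = stream.read(namesize).rstrip(b"\0").decode("utf-8", "replace")
        # Header plus name is padded to a 4-byte boundary.
        stream.read(-(NEWC_HEADER_LEN + namesize) % 4)
        if name == "TRAILER!!!":
            break  # marker entry that ends the archive
        names.append(name)
        # Skip the file data and its padding (read into memory; fine for a sketch).
        stream.read(filesize + (-filesize % 4))
    return names


def find_in_rootimg(path, needle):
    """List archive members whose path contains `needle` (hypothetical helper)."""
    with gzip.open(path, "rb") as stream:
        return [name for name in list_newc_entries(stream) if needle in name]
```

For example, find_in_rootimg("rootimg.gz", "xcatdsklspost") would show whether and where that script is present in the image, under the format assumption above.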

Josh

On Tuesday June 07, 2011, Sunil Venkatesh wrote:
> Josh,
> 
> Is there any place I could find some details on
> 
> "... /Once the compute node is booted with the stateless
> image, it uses NFS to mount some things from the management node, and then
> runs some xcat postscripts,/.... "
> 
> I have the stateless images ready, with partimage compiled for PPC. For
> the compute node (Power 7) to boot using the stateless images, I need to
> configure yaboot instead of pxeboot (which is specific to x86). I
> wanted to know where in the startup files the execution of partimage and
> the NFS mount are configured. Is that configured by the "genimage" command
> itself? Given the way the nodes are configured on our
> network, it would not be a good idea to let xcat take care of
> configuring details like DHCPD for netboot. So, I need to make
> changes to the configuration files manually, which is why this query
> came up.
> 
> Thanks in advance.
> 
> Regards,
> Sunil
-- 
-------------------------------
Josh Thompson
Systems Programmer
Advanced Computing | VCL Developer
North Carolina State University

Josh_Thompson@ncsu.edu
919-515-5323

my GPG/PGP key can be found at pgp.mit.edu

All electronic mail messages in connection with State business which
are sent to or received by this account are subject to the NC Public
Records Law and may be disclosed to third parties.

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Josh,

Thank you for that detailed clarification. Appreciate your support.

Regards,
Sunil

On 6/1/11 1:39 PM, Josh Thompson wrote:
> Sunil,
>
> The "stateless" image I refer to is what is actually booted on the compute
> node containing the image to be captured.  It's called stateless because it is
> loaded completely in RAM and does not maintain any state when a reboot occurs.
>
> The partimage binary is part of this stateless image and actually runs on the
> compute node.  It does not run on the management node.  The management node
> does not have block level access to the disk on the compute node to be able to
> capture the image from the disk.
>
> I'll try to describe the process a little better.  The management node issues
> a reboot command to the compute node.  The compute node uses PXE to load and
> boot a kernel (vmlinuz), initial RAM disk (initrd.img), and a root filesystem
> (rootimg.gz) from the management node.  All three of these together make up
> the stateless image.  Once the compute node is booted with the stateless
> image, it uses NFS to mount some things from the management node, and then
> runs some xcat postscripts, one of which is the partimageng postscript.  This
> postscript determines what partitions are on the compute node and, depending
> on how the postscript is configured, uses partimage or partimageng to capture
> an image of the compute node disk that is then saved to the management node.
> When it is finished capturing the image, it notifies xcat on the management
> node and then reboots.  xcat reconfigures itself to tell the compute node to
> boot off of disk at next boot.  When the compute node comes up, it uses PXE to
> ask the management node how to boot.  The management node tells it to boot off
> of disk.
>
> I hope that clarifies how the system works.  If any of it is unclear, please
> ask for further clarification.
>
> Josh

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Josh,

Is there any place I could find some details on

"... /Once the compute node is booted with the stateless
image, it uses NFS to mount some things from the management node, and then
runs some xcat postscripts,/.... "

I have the stateless images ready, with partimage compiled for PPC. For 
the compute node (Power 7) to boot using the stateless images, I need to 
configure yaboot instead of pxeboot (which is specific to x86). I 
wanted to know where in the startup files the execution of partimage and 
the NFS mount are configured. Is that configured by the "genimage" command 
itself? Given the way the nodes are configured on our 
network, it would not be a good idea to let xcat take care of 
configuring details like DHCPD for netboot. So, I need to make 
changes to the configuration files manually, which is why this query 
came up.

Thanks in advance.

Regards,
Sunil

On 6/1/11 1:39 PM, Josh Thompson wrote:
> Sunil,
>
> The "stateless" image I refer to is what is actually booted on the compute
> node containing the image to be captured.  It's called stateless because it is
> loaded completely in RAM and does not maintain any state when a reboot occurs.
>
> The partimage binary is part of this stateless image and actually runs on the
> compute node.  It does not run on the management node.  The management node
> does not have block level access to the disk on the compute node to be able to
> capture the image from the disk.
>
> I'll try to describe the process a little better.  The management node issues
> a reboot command to the compute node.  The compute node uses PXE to load and
> boot a kernel (vmlinuz), initial RAM disk (initrd.img), and a root filesystem
> (rootimg.gz) from the management node.  All three of these together make up
> the stateless image.  Once the compute node is booted with the stateless
> image, it uses NFS to mount some things from the management node, and then
> runs some xcat postscripts, one of which is the partimageng postscript.  This
> postscript determines what partitions are on the compute node and, depending
> on how the postscript is configured, uses partimage or partimageng to capture
> an image of the compute node disk that is then saved to the management node.
> When it is finished capturing the image, it notifies xcat on the management
> node and then reboots.  xcat reconfigures itself to tell the compute node to
> boot off of disk at next boot.  When the compute node comes up, it uses PXE to
> ask the management node how to boot.  The management node tells it to boot off
> of disk.
>
> I hope that clarifies how the system works.  If any of it is unclear, please
> ask for further clarification.
>
> Josh

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
Sunil,

The "stateless" image I refer to is what is actually booted on the compute 
node containing the image to be captured.  It's called stateless because it is 
loaded completely in RAM and does not maintain any state when a reboot occurs.

The partimage binary is part of this stateless image and actually runs on the 
compute node.  It does not run on the management node.  The management node 
does not have block level access to the disk on the compute node to be able to 
capture the image from the disk.

I'll try to describe the process a little better.  The management node issues 
a reboot command to the compute node.  The compute node uses PXE to load and 
boot a kernel (vmlinuz), initial RAM disk (initrd.img), and a root filesystem 
(rootimg.gz) from the management node.  All three of these together make up 
the stateless image.  Once the compute node is booted with the stateless 
image, it uses NFS to mount some things from the management node, and then 
runs some xcat postscripts, one of which is the partimageng postscript.  This 
postscript determines what partitions are on the compute node and, depending 
on how the postscript is configured, uses partimage or partimageng to capture 
an image of the compute node disk that is then saved to the management node.  
When it is finished capturing the image, it notifies xcat on the management 
node and then reboots.  xcat reconfigures itself to tell the compute node to 
boot off of disk at next boot.  When the compute node comes up, it uses PXE to 
ask the management node how to boot.  The management node tells it to boot off 
of disk.
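
The sequence described above can be written down as a small simulation. This is only a mental model of the steps in this message, not xCAT or VCL code; the function names and step labels are made up for illustration.

```python
def boot_target(capture_pending):
    """What the management node tells a PXE-booting compute node to load."""
    return "stateless image" if capture_pending else "local disk"


def simulate_capture():
    """Return the capture workflow as a list of (actor, action) pairs."""
    steps = []
    capture_pending = True  # xcat is set up to boot the node into the stateless image
    steps.append(("management node", "issue reboot command"))
    steps.append(("compute node", "PXE boot: " + boot_target(capture_pending)))
    steps.append(("compute node", "mount NFS exports from management node"))
    steps.append(("compute node", "run xcat postscripts, including partimageng"))
    steps.append(("compute node", "capture disk partitions and save image to management node"))
    steps.append(("compute node", "notify xcat that capture finished"))
    capture_pending = False  # xcat reconfigures itself for a disk boot
    steps.append(("compute node", "reboot"))
    steps.append(("compute node", "PXE boot: " + boot_target(capture_pending)))
    return steps


if __name__ == "__main__":
    for actor, action in simulate_capture():
        print(actor + ": " + action)
```

The point of the model is that every disk-touching step has the compute node as its actor; the management node only serves files, receives the image, and decides the next boot target.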

I hope that clarifies how the system works.  If any of it is unclear, please 
ask for further clarification.

Josh

On Wednesday June 01, 2011, Sunil Venkatesh wrote:
> Josh,
> 
> I need one more clarification.
> 
> The partimage binaries run on the management node to capture a (stateless)
> image from the compute node, right? In that case, is there a need for
> these binaries to go into the rootimg.gz?
> 
> My assumption is that partimage runs on the management node (an Intel blade
> in our case) to capture a stateless image from a compute node (a Power 7
> blade) and stores these images under "/install" on the management
> node. Please correct me if I am wrong here.
> 
> Regards,
> Sunil
-- 
-------------------------------
Josh Thompson
Systems Programmer
Advanced Computing | VCL Developer
North Carolina State University

Josh_Thompson@ncsu.edu
919-515-5323

my GPG/PGP key can be found at pgp.mit.edu


Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Josh,

I need one more clarification.

The partimage binaries run on the management node to capture a (stateless) 
image from the compute node, right? In that case, is there a need for 
these binaries to go into the rootimg.gz?

My assumption is that partimage runs on the management node (an Intel blade 
in our case) to capture a stateless image from a compute node (a Power 7 
blade) and stores these images under "/install" on the management 
node. Please correct me if I am wrong here.

Regards,
Sunil

On 6/1/11 9:58 AM, Josh Thompson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Tuesday May 31, 2011, Sunil Venkatesh wrote:
>> Hi,
>>
>> I used the steps that were mentioned under
>>
>> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>>
>> to enable partimage support for xcat. I wasn't sure if I need to change
>> references to x86 & x86_64 (as directories) to reflect the ppc
>> architecture, as the web page says "The architecture for the node must
>> always be set to x86 for this..". I have with me the vmlinuz (kernel
>> image) and initrd for the capture process. The 2 nodeset commands
> By this, do you mean you have vmlinuz and initrd for your power blades, not
> the ones linked to off of the page you listed above?  If you do, that's a good
> start.  However, you'll also need rootimg.gz.  rootimg.gz is the root
> filesystem for the stateless image.  It also contains the partimage and
> partimageng binaries.  Assuming partimage or partimageng can actually capture
> partitions from power systems, you'll need to compile at least one of them to
> run on power.  For the rootimg.gz image I provided, I compiled them statically
> so that I didn't have to worry about including any library dependencies in
> rootimg.gz.
>
> It would be a good idea to research how to use xcat's genimage command to
> generate stateless images to learn how to do this.
>
> If there's any part of the above that you don't fully understand, please ask
> me to clarify it.  Until you have a stateless image that you can deploy to
> your power blades, there's no point in trying to debug any VCL specific items.
>
> Josh
> - -- 
> - -------------------------------
> Josh Thompson
> VCL Developer
> North Carolina State University
>
> my GPG/PGP key can be found at pgp.mit.edu
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
>
> iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
> PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
> =exBV
> -----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tuesday May 31, 2011, Sunil Venkatesh wrote:
> Hi,
> 
> I used the steps that were mentioned under
> 
> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
> 
> to enable partimage support for xcat. I wasn't sure if I need to change
> references to x86 & x86_64 (as directories) to reflect the ppc
> architecture, as the web page says "The architecture for the node must
> always be set to x86 for this..". I have with me the vmlinuz (kernel
> image) and initrd for the capture process. The 2 nodeset commands

By this, do you mean you have vmlinuz and initrd for your power blades, not 
the ones linked to off of the page you listed above?  If you do, that's a good 
start.  However, you'll also need rootimg.gz.  rootimg.gz is the root 
filesystem for the stateless image.  It also contains the partimage and 
partimageng binaries.  Assuming partimage or partimageng can actually capture 
partitions from power systems, you'll need to compile at least one of them to 
run on power.  For the rootimg.gz image I provided, I compiled them statically 
so that I didn't have to worry about including any library dependencies in 
rootimg.gz.
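
A quick way to confirm that a compiled partimage or partimageng binary really targets power rather than x86 is to read the machine field of its ELF header. The sketch below is a minimal check under the assumption that the binaries are ELF files; the function names are illustrative, and the machine codes are the standard ELF e_machine values.

```python
import struct

# e_machine values from the ELF specification
ELF_MACHINES = {3: "x86", 20: "ppc", 21: "ppc64", 62: "x86_64"}


def elf_machine(header):
    """Return the target architecture named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown (%d)" % machine)


def binary_arch(path):
    """Read the ELF header of the file at `path` and report its architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Running binary_arch on the partimage binary before packing it into rootimg.gz should report ppc or ppc64 for a power build. This does not check whether the binary is statically linked.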

It would be a good idea to research how to use xcat's genimage command to 
generate stateless images to learn how to do this.

If there's any part of the above that you don't fully understand, please ask 
me to clarify it.  Until you have a stateless image that you can deploy to 
your power blades, there's no point in trying to debug any VCL specific items.

Josh
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)

iEYEARECAAYFAk3mRYsACgkQV/LQcNdtPQNnVgCbB9ZFJn0+C45RC/g75RqGZY/j
PZYAniP2Eam7nxgiDWUnp5sKPYPO4OMa
=exBV
-----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Sunil Venkatesh <su...@umbc.edu>.
Hi,

I used the steps that were mentioned under

https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29

to enable partimage support for xcat. I wasn't sure if I need to change 
references to x86 & x86_64 (as directories) to reflect the ppc 
architecture, as the web page says "The architecture for the node must 
always be set to x86 for this..". I have with me the vmlinuz (kernel 
image) and initrd for the capture process. The 2 nodeset commands

nodeset <node> image
nodeset <node> install

fail with the following error message: "Error: Invalid nodes and/or groups 
in noderange: power01".
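
This error most likely means that the name power01 is not defined as a node in xCAT, which can be checked with xCAT's nodels command. As a small aid when scanning command output or logs, the sketch below pulls the rejected names out of such messages. It assumes that multiple names, if any, are comma-separated; the function name is illustrative.

```python
import re

_INVALID = re.compile(r"Invalid nodes and/or groups in noderange:\s*(.+)")


def invalid_nodes(output):
    """Return node or group names that xCAT rejected in `output`."""
    names = []
    for line in output.splitlines():
        match = _INVALID.search(line)
        if match:
            # Assumption: several rejected names are separated by commas.
            names.extend(n.strip() for n in match.group(1).split(",") if n.strip())
    return names


if __name__ == "__main__":
    print(invalid_nodes("Error: Invalid nodes and/or groups in noderange: power01"))
```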

Also, when I start the capture process using "vcld --setup", there is 
no option to choose a ppc architecture (I am not sure whether I 
should modify the setup scripts to support the ppc architecture). I 
continued with the setup process by choosing x86_64 as the architecture 
(instead of ppc for our Power 7 blade). The operating system is RHEL 5. The 
script attempts to power down the blade and waits for 120 seconds before 
it quits with a failure. The log file is shown at the bottom of the mail.

The script fails saying the blade didn't turn off, but I can see in my 
console that the blade is off and that it does not respond to connection 
requests. There is one deviation from the documented configuration: 
the blade we are trying to use as a compute node has only 1 
ethernet port enabled, hence there are no separate private & public 
networks. Will that matter?
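
The log at the bottom of this mail suggests why the script reports a failure even though the blade is off: the power check is polled every 3 seconds for up to 120 seconds, and each poll gets an xCAT error instead of a power state. The sketch below imitates that behaviour. It is not VCL's code; the function names are illustrative, and the "node: off" output format for rpower stat is an assumption.

```python
import time


def power_is_off(rpower_output):
    """Interpret the output of an rpower stat call (format assumed: "node: off")."""
    text = rpower_output.strip().lower()
    if text.startswith("error"):
        return False  # state unknown, so "off" is never confirmed
    return text.endswith(": off")


def loop_until(check, timeout=120, interval=3, sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or `timeout` seconds have passed."""
    deadline = clock() + timeout
    attempts = 0
    while clock() < deadline:
        attempts += 1
        if check():
            return True, attempts
        sleep(interval)
    return False, attempts
```

With an error message as the only output, power_is_off never returns True, so the loop runs to its deadline no matter what state the blade is actually in.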

Any support with this would be really helpful.  Thanks in advance.

Regards,
Sunil Venkatesh


***************

RECENT LOG ENTRIES FOR THIS PROCESS:
|11662|23:23|image| (-2) Module.pm, code_loop_timeout (line: 761)
|11662|23:23|image| (-3) Provisioning.pm, wait_for_power_off (line: 324)
|11662|23:23|image| (-4) Linux.pm, pre_capture (line: 183)
|11662|23:23|image| (-5) xCAT2.pm, capture (line: 792)
2011-05-31 12:56:37|11662|23:23|image|Module.pm:code_loop_timeout(755)|attempt 36: code returned false, seconds elapsed/remaining: 114/6, sleeping for 3 seconds
2011-05-31 12:56:40|11662|23:23|image|Module.pm:code_loop_timeout(759)|attempt 37: waiting for power01 to power off
2011-05-31 12:56:40|11662|23:23|image|xCAT.pm:_rpower(1944)|attempting to execute rpower for computer: power01, mode: stat
2011-05-31 12:56:40|11662|23:23|image|utils.pm:run_command(9010)|executed command: /opt/xcat/bin/rpower power01 stat, pid: 11922, exit status: 1, output:
|11662|23:23|image| Error: Invalid nodes and/or groups in noderange: power01
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 12:56:40|11662|23:23|image|xCAT.pm:_rpower(1997)|unexpected output returned from rpower: Error: Invalid nodes and/or groups in noderange: power01
|11662|23:23|image| ( 0) xCAT.pm, _rpower (line: 1997)
|11662|23:23|image| (-1) xCAT.pm, power_status (line: 1675)
|11662|23:23|image| (-2) Provisioning.pm, __ANON__ (line: 324)
|11662|23:23|image| (-3) Module.pm, code_loop_timeout (line: 761)
|11662|23:23|image| (-4) Provisioning.pm, wait_for_power_off (line: 324)
|11662|23:23|image| (-5) Linux.pm, pre_capture (line: 183)
2011-05-31 12:56:40|11662|23:23|image|xCAT.pm:power_status(1676)|retrieved power status of power01: 0
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 12:56:40|11662|23:23|image|xCAT.pm:power_status(1679)|failed to determine power status, rpower subroutine returned 0
|11662|23:23|image| ( 0) xCAT.pm, power_status (line: 1679)
|11662|23:23|image| (-1) Provisioning.pm, __ANON__ (line: 324)
|11662|23:23|image| (-2) Module.pm, code_loop_timeout (line: 761)
|11662|23:23|image| (-3) Provisioning.pm, wait_for_power_off (line: 324)
|11662|23:23|image| (-4) Linux.pm, pre_capture (line: 183)
|11662|23:23|image| (-5) xCAT2.pm, capture (line: 792)
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 12:56:40|11662|23:23|image|vcld:warning_handler(610)|Use of uninitialized value in pattern match (m//) at /usr/local/vcl/bin/../lib/VCL/Module/Provisioning.pm line 324.
|11662|23:23|image| ( 0) vcld, warning_handler (line: 610)
|11662|23:23|image| (-1) Provisioning.pm, __ANON__ (line: 324)
|11662|23:23|image| (-2) Module.pm, code_loop_timeout (line: 761)
|11662|23:23|image| (-3) Provisioning.pm, wait_for_power_off (line: 324)
|11662|23:23|image| (-4) Linux.pm, pre_capture (line: 183)
|11662|23:23|image| (-5) xCAT2.pm, capture (line: 792)
2011-05-31 12:56:40|11662|23:23|image|Module.pm:code_loop_timeout(755)|attempt 37: code returned false, seconds elapsed/remaining: 117/3, sleeping for 3 seconds
2011-05-31 12:56:43|11662|23:23|image|Module.pm:code_loop_timeout(759)|attempt 38: waiting for power01 to power off
2011-05-31 12:56:43|11662|23:23|image|xCAT.pm:_rpower(1944)|attempting to execute rpower for computer: power01, mode: stat
2011-05-31 12:56:44|11662|23:23|image|utils.pm:run_command(9010)|executed command: /opt/xcat/bin/rpower power01 stat, pid: 11926, exit status: 1, output:
|11662|23:23|image| Error: Invalid nodes and/or groups in noderange: power01
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 12:56:44|11662|23:23|image|xCAT.pm:_rpower(1997)|unexpected output returned from rpower: Error: Invalid nodes and/or groups in noderange: power01
|11662|23:23|image| ( 0) xCAT.pm, _rpower (line: 1997)
|11662|23:23|image| (-1) xCAT.pm, power_status (line: 1675)
|11662|23:23|image| (-2) Provisioning.pm, __ANON__ (line: 324)
|11662|23:23|image| (-3) Module.pm, code_loop_timeout (line: 761)
|11662|23:23|image| (-4) Provisioning.pm, wait_for_power_off (line: 324)
|11662|23:23|image| (-5) Linux.pm, pre_capture (line: 183)
2011-05-31 12:56:44|11662|23:23|image|xCAT.pm:power_status(1676)|retrieved power status of power01: 0
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 12:56:44|11662|23:23|image|xCAT.pm:power_status(1679)|failed to determine power status, rpower subroutine returned 0
|11662|23:23|image| ( 0) xCAT.pm, power_status (line: 1679)
|11662|23:23|image| (-1) Provisioning.pm, __ANON__ (line: 324)
|11662|23:23|image| (-2) Module.pm, code_loop_timeout (line: 761)
|11662|23:23|image| (-3) Provisioning.pm, wait_for_power_off (line: 324)
|11662|23:23|image| (-4) Linux.pm, pre_capture (line: 183)
|11662|23:23|image| (-5) xCAT2.pm, capture (line: 792)
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 12:56:44|11662|23:23|image|vcld:warning_handler(610)|Use of uninitialized value in pattern match (m//) at /usr/local/vcl/bin/../lib/VCL/Module/Provisioning.pm line 324.
|11662|23:23|image| ( 0) vcld, warning_handler (line: 610)
|11662|23:23|image| (-1) Provisioning.pm, __ANON__ (line: 324)
|11662|23:23|image| (-2) Module.pm, code_loop_timeout (line: 761)
|11662|23:23|image| (-3) Provisioning.pm, wait_for_power_off (line: 324)
|11662|23:23|image| (-4) Linux.pm, pre_capture (line: 183)
|11662|23:23|image| (-5) xCAT2.pm, capture (line: 792)
2011-05-31 
12:56:44|11662|23:23|image|Module.pm:code_loop_timeout(767)|waiting for 
power01 to power off, code did not return true after waiting 120 seconds
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 
12:56:44|11662|23:23|image|Provisioning.pm:wait_for_power_off(328)|power01 
has not powered off after waiting 120 seconds, returning 0
|11662|23:23|image| ( 0) Provisioning.pm, wait_for_power_off (line: 328)
|11662|23:23|image| (-1) Linux.pm, pre_capture (line: 183)
|11662|23:23|image| (-2) xCAT2.pm, capture (line: 792)
|11662|23:23|image| (-3) image.pm, process (line: 162)
|11662|23:23|image| (-4) vcld, make_new_child (line: 568)
|11662|23:23|image| (-5) vcld, main (line: 346)
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 
12:56:44|11662|23:23|image|Linux.pm:pre_capture(190)|power01 never 
powered off
|11662|23:23|image| ( 0) Linux.pm, pre_capture (line: 190)
|11662|23:23|image| (-1) xCAT2.pm, capture (line: 792)
|11662|23:23|image| (-2) image.pm, process (line: 162)
|11662|23:23|image| (-3) vcld, make_new_child (line: 568)
|11662|23:23|image| (-4) vcld, main (line: 346)
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 
12:56:44|11662|23:23|image|xCAT2.pm:capture(793)|OS module pre_capture() 
failed
|11662|23:23|image| ( 0) xCAT2.pm, capture (line: 793)
|11662|23:23|image| (-1) image.pm, process (line: 162)
|11662|23:23|image| (-2) vcld, make_new_child (line: 568)
|11662|23:23|image| (-3) vcld, main (line: 346)
|11662|23:23|image| ---- WARNING ----
|11662|23:23|image| 2011-05-31 
12:56:44|11662|23:23|image|image.pm:process(166)|rh5image-power01bi053127-v0 
image failed to be captured by provisioning module
|11662|23:23|image| ( 0) image.pm, process (line: 166)
|11662|23:23|image| (-1) vcld, make_new_child (line: 568)
|11662|23:23|image| (-2) vcld, main (line: 346)
2011-05-31 
12:56:44|11662|23:23|image|DataStructure.pm:get_computer_private_ip_address(1557)|returning 
private IP address previously retrieved: 172.20.106.1
2011-05-31 
12:56:44|11662|23:23|image|utils.pm:is_inblockrequest(6163)|zero rows 
were returned from database select
2011-05-31 
12:56:44|11662|23:23|image|DataStructure.pm:get_image_affiliation_name(2035)|image 
owner id: 1
2011-05-31 12:56:44|11662|23:23|image|utils.pm:getnewdbh(2709)|database 
requested (information_schema) does not match handle stored in $ENV{dbh} 
(vcl:)
2011-05-31 12:56:44|11662|23:23|image|utils.pm:getnewdbh(2760)|database 
handle stored in $ENV{dbh}
2011-05-31 
12:56:44|11662|23:23|image|DataStructure.pm:retrieve_user_data(1352)|attempting 
to retrieve and store data for user: user.id = '1'
2011-05-31 12:56:44|11662|23:23|image|utils.pm:getnewdbh(2709)|database 
requested (vcl) does not match handle stored in $ENV{dbh} 
(information_schema:)
2011-05-31 12:56:44|11662|23:23|image|utils.pm:getnewdbh(2760)|database 
handle stored in $ENV{dbh}
2011-05-31 
12:56:44|11662|23:23|image|DataStructure.pm:retrieve_user_data(1415)|data has 
been retrieved for user: admin (id: 1)
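
The repeated rpower failure in the log above can be pulled out mechanically. What follows is only a minimal sketch, assuming the vcld log format seen in this excerpt and the xCAT error text exactly as it appears there; the function name undefined_nodes is hypothetical and is not part of VCL or xCAT. It lists every node name that rpower rejected as an invalid noderange, which usually suggests the name VCL passes (here power01) is not defined as a node in xCAT.

```python
import re

# Assumption: xCAT reports an unknown node with exactly this text,
# as seen in the log excerpt above.
NODERANGE_ERROR = re.compile(
    r"Error: Invalid nodes and/or groups in noderange:\s*(\S+)"
)


def undefined_nodes(log_text):
    """Return the sorted, de-duplicated node names that rpower rejected."""
    # The archived log wraps long lines, so collapse all whitespace
    # (including newlines) into single spaces before matching.
    flattened = " ".join(log_text.split())
    return sorted(set(NODERANGE_ERROR.findall(flattened)))


if __name__ == "__main__":
    sample = (
        "12:56:44|11662|23:23|image|xCAT.pm:_rpower(1997)|unexpected output\n"
        "returned from rpower: Error: Invalid nodes and/or groups in noderange:\n"
        "power01\n"
    )
    print(undefined_nodes(sample))
```

Any name this reports could then be compared against the output of nodels on the management node to see whether the VCL computer hostname matches an xCAT node definition.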


On 5/19/11 3:48 PM, Josh Thompson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Sunil,
>
> Let's back up a little bit.  The first thing to look at is why the image
> capture failed.  Unless you created your own stateless image for capturing
> and provisioning images, xCAT will be unable to capture an image from a
> Power blade.
>
> Did you use the steps here:
>
> https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29
>
> for modifying xCAT to be able to capture/deploy images?  If so, the stateless
> images linked from that page are for x86 hardware.  You will need to
> create your own stateless or statelite images for Power blades.
>
> Josh
>
> On Thursday May 19, 2011, Sunil Venkatesh wrote:
>> Hi,
>>
>> We are currently in the process of configuring VCL 2.2.1 to work on a
>> Power 7 blade. Our current setup is:
>>
>> 1. A web-server that hosts the Database and the Web Code. The same
>> server acts as the Management node. xCAT is configured as the
>> provisioning module on this node.
>> 2. Power7 is our compute node.
>> 3. I used the "vcld --setup" command to create/capture a base
>> image of RHEL 5 that is running on the Power7 blade (by specifying the
>> IP address of the Power7 blade when prompted for an address).
>>
>> The creation process failed, as Xianqing Yu had mentioned to us earlier.
>> However, before it failed, it created the appropriate entries in the
>> image, imagerevision and resource tables. I was able to "Undelete" the
>> image from the web page and see it under "New Reservations".
>>
>> I am facing problems similar to those Mike Waldron had with
>> reservations. Even after making the memory adjustment, I wasn't able to
>> make a reservation. The timetable shows all green (available); however,
>> when I choose any entry from the list, it takes me directly to the "New
>> Reservation" page without any status/feedback, and I don't see any
>> reservations created when I check under "Current Reservations". I am
>> just assuming the groupings of images and computers are correct. Is
>> there any way I could verify this? Also, if there is any reference on
>> how the grouping needs to be done, please let me know.
>>
>> Please do correct me if there is anything wrong with the system setup.
>>
>> Regards,
>> Sunil Venkatesh
>> Research Assistant,
>> MC2 Lab, UMBC.
> - -- 
> - -------------------------------
> Josh Thompson
> VCL Developer
> North Carolina State University
>
> my GPG/PGP key can be found at pgp.mit.edu
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
>
> iEYEARECAAYFAk3VdBQACgkQV/LQcNdtPQP5VQCfVoX4ykJSCpMHHJTocpwTHsVs
> teEAn2NCYnBXDq/gzjcwj2FNn9kdJsPC
> =COKC
> -----END PGP SIGNATURE-----

Re: [VCL 2.2.1] [Power7] Problem with image reservation

Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sunil,

Let's back up a little bit.  The first thing to look at is why the image 
capture failed.  Unless you created your own stateless image for capturing 
and provisioning images, xCAT will be unable to capture an image from a 
Power blade.

Did you use the steps here:

https://cwiki.apache.org/confluence/display/VCL/Adding+support+for+partimage+and+partimage-ng+to+xCAT+2.x+%28unofficial%29

for modifying xCAT to be able to capture/deploy images?  If so, the stateless 
images linked from that page are for x86 hardware.  You will need to 
create your own stateless or statelite images for Power blades.

Josh

On Thursday May 19, 2011, Sunil Venkatesh wrote:
> Hi,
> 
> We are currently in the process of configuring VCL 2.2.1 to work on a
> Power 7 blade. Our current setup is:
> 
> 1. A web-server that hosts the Database and the Web Code. The same
> server acts as the Management node. xCAT is configured as the
> provisioning module on this node.
> 2. Power7 is our compute node.
> 3. I used the "vcld --setup" command to create/capture a base
> image of RHEL 5 that is running on the Power7 blade (by specifying the
> IP address of the Power7 blade when prompted for an address).
> 
> The creation process failed, as Xianqing Yu had mentioned to us earlier.
> However, before it failed, it created the appropriate entries in the
> image, imagerevision and resource tables. I was able to "Undelete" the
> image from the web page and see it under "New Reservations".
> 
> I am facing problems similar to those Mike Waldron had with
> reservations. Even after making the memory adjustment, I wasn't able to
> make a reservation. The timetable shows all green (available); however,
> when I choose any entry from the list, it takes me directly to the "New
> Reservation" page without any status/feedback, and I don't see any
> reservations created when I check under "Current Reservations". I am
> just assuming the groupings of images and computers are correct. Is
> there any way I could verify this? Also, if there is any reference on
> how the grouping needs to be done, please let me know.
> 
> Please do correct me if there is anything wrong with the system setup.
> 
> Regards,
> Sunil Venkatesh
> Research Assistant,
> MC2 Lab, UMBC.
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)

iEYEARECAAYFAk3VdBQACgkQV/LQcNdtPQP5VQCfVoX4ykJSCpMHHJTocpwTHsVs
teEAn2NCYnBXDq/gzjcwj2FNn9kdJsPC
=COKC
-----END PGP SIGNATURE-----