Posted to user@vcl.apache.org by Josh Thompson <jo...@ncsu.edu> on 2011/04/01 15:05:32 UTC

Re: VCL2.2 + xCAT2.5 on bladecenter

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John,

The partimageng postscript mounts an image store via NFS at /install.  The NFS 
server and path are specified in the xCAT site table as IMAGELIBSERVER and 
IMAGELIBINSTALLDIR.  More info about this part is at the bottom of our wiki 
page explaining how to add partimage support to xCAT.
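To see what those two keys are currently set to, something like the helper below could be used on the management node. This is only a sketch: the function name site_value is made up for illustration, and it assumes the table is dumped in the quoted CSV form ("key","value",comments,disable) that tabdump normally prints.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): print the value of one site-table
# key from tabdump-style CSV read on stdin. Assumes each data line looks like
#   "KEY","VALUE",comments,disable
site_value() {
    awk -F'","' -v key="$1" '
        { k = $1; sub(/^"/, "", k) }          # strip the leading quote from the key
        k == key {
            v = $2; sub(/".*$/, "", v)         # drop the closing quote and the rest
            print v; found = 1
        }
        END { exit (found ? 0 : 1) }           # non-zero status if the key is absent
    '
}
```

On the management node this would be fed from the site table, for example: tabdump site | site_value IMAGELIBINSTALLDIR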

Do you have your image store exported read/write via NFS and available to the 
client nodes?

As a test, you could modify the partimageng script to output more debugging 
info.  You could modify the mount command on line 144 to be:

logger -t xcat "Attempting to mount image store: $IMAGELIBSERVER:$IMAGELIBINSTALLDIR"
if ! mount -o nfsvers=3,tcp,nolock,rw $IMAGELIBSERVER:$IMAGELIBINSTALLDIR /install; then
    echo "CRITICAL ERROR: Failed to mount image store at $IMAGELIBSERVER:$IMAGELIBINSTALLDIR; unable to save image"
    logger -t xcat "CRITICAL ERROR: Failed to mount image store at $IMAGELIBSERVER:$IMAGELIBINSTALLDIR; unable to save image"
    sleep 3
    exit 1
fi
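A successful mount does not by itself prove the export is writable from the node, so a write probe along these lines could be added after the mount. It is a sketch and not part of partimageng; the function name check_writable and the probe file name are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a file can be created and removed in the
# directory given as $1, non-zero otherwise.
check_writable() {
    dir="$1"
    probe="$dir/.write_test.$$"    # probe file name is arbitrary
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if touch "$probe" 2>/dev/null && rm -f "$probe"; then
        return 0
    fi
    echo "cannot write to $dir" >&2
    return 1
}
```

In the postscript it could be called right after the mount, for example: check_writable /install/image/x86 || logger -t xcat "image store is not writable"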

Josh

On Thursday March 31, 2011, John Ma wrote:
> Josh,
> 
> Thanks again for the help. I configured an anonymous ftp share of
> /install, and that got past the previous error. This is where I am now:
>   Mar 30 21:06:53 blade08 blade08 xcat: running partimage -z1 -f3 -odbc save /dev/sda1 /install/image/x86/centos5image-blade08mar2466-v0.gz
>   Mar 30 21:06:58 blade08 blade08 xcat: partimage exited with a non-zero status, failing
>   Mar 30 21:06:58 blade08 blade08 xcat: partimage-ng failed with exit code 1
>   Mar 30 21:06:58 blade08 blade08 init: rc3 main process (1166) killed by TERM signal
> 
> Blade08 then rebooted itself and looped again. partimage's save location
> /install/image/x86/.. doesn't seem right to me, but how do I configure it to
> use nfs?
> See the attached log file for more details (the clock on blade08 is off,
> or maybe set to UTC).
> 
> 
> Thanks,
> John
> 
> 
> 
> 
> 
> 
> From:   Josh Thompson <jo...@ncsu.edu>
> To:     vcl-user@incubator.apache.org
> Date:   03/31/2011 03:29 PM
> Subject:        Re: VCL2.2 + xCAT2.5 on bladecenter
> 
> 
> 
> John,
> 
> Sorry to take so long to get back to you.
> 
> I didn't even realize this until digging through xcatdsklspost, but your
> management node needs to be sharing out /install via ftp.  I'm assuming xcat
> sets this up because I don't remember setting that up manually.  The
> following line is from xcatdsklspost:
> 
> wget -l inf -N -r --waitretry=10 --random-wait --retry-connrefused -t 0 -T 60 ftp://$SIP/postscripts 2> /tmp/wget.log
> 
> $SIP is obtained earlier in the script from some dhcp information.
> 
> The next line is where your screenshot shows the first error:
> 
> mv $SIP/postscripts/* /xcatpost;
> 
> The wget command should try forever until it downloads everything under
> ftp://$SIP/postscripts.  The fact that you are getting past wget, but the
> move is failing for $SIP/postscripts/* makes me think you don't have anything
> under ftp://$SIP/postscripts.  Can you try using a normal ftp client to browse
> ftp://172.20.101.140/postscripts?  It may be that the ftp server is sharing
> out the wrong directory.
> 
> Josh
> 
> On Friday March 25, 2011, John Ma wrote:
> > Josh,
> > 
> > I made some progress, but I'm stuck again, this time at the reboot of the
> > machine being captured. The machine apparently cannot find postscripts.
> > Any idea how to fix it or what to try next?
> > 
> > I placed partimageng in /install/postscripts on our VCL (web, db, and mgt
> > code) server - Blade14 (172.20.101.140). The machine being captured is
> > blade08 (172.20.101.80).
> > 
> > Here is the screenshot:
> > 
> > Here is the pxe boot config file:
> > [root@blade14 ~]# cat /tftpboot/pxelinux.cfg/blade08
> > #image image-x86-centos5image-blade08mar2466-v0
> > DEFAULT xCAT
> > LABEL xCAT
> >  KERNEL xcat/image/x86/vmlinuz
> >  APPEND initrd=xcat/image/x86/initrd.img imgurl=http://blade14//install/image/x86/installer_files/rootimg.gz image=/install/image/x86/centos5image-blade08mar2466-v0.img blocks=512 action=save installnic=eth0 reboot noipv6
> >  IPAPPEND 2
> > 
> > [root@blade14 ~]#
> > 
> > Thanks,
> > John Ma
> > Marist College
> > 
> > 
> > 
> > 
> > From:   Josh Thompson <jo...@ncsu.edu>
> > To:     vcl-user@incubator.apache.org
> > Date:   03/22/2011 12:15 PM
> > Subject:        Re: VCL2.2 + xCAT2.5 on bladecenter
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)

iEYEARECAAYFAk2VzaAACgkQV/LQcNdtPQPPfwCfZF5WlXYUnvLAV6XXiPG4ENQe
k7MAnAgiDIaXKzr8Lr9dClRuVGp6peaK
=MuU1
-----END PGP SIGNATURE-----

Re: VCL2.2 + xCAT2.5 on bladecenter

Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John,

Do you have IMAGELIBINSTALLDIR set to /opt or to /opt/image/x86?  It needs to 
be /opt.
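The reason is that the postscript mounts $IMAGELIBSERVER:$IMAGELIBINSTALLDIR at /install and partimage then saves to /install/image/x86/..., so whatever IMAGELIBINSTALLDIR is set to becomes the prefix of the path on the NFS server. The snippet below only illustrates that mapping; server_path is a made-up name and nothing like it exists in partimageng.

```shell
#!/bin/sh
# Illustration only: show which server-side path a file under the node's
# /install mount point resolves to.
#   $1 = IMAGELIBINSTALLDIR as set in the site table
#   $2 = path as seen on the node being captured
server_path() {
    installdir="${1%/}"    # tolerate a trailing slash
    case "$2" in
        /install/*) printf '%s/%s\n' "$installdir" "${2#/install/}" ;;
        *) echo "path is not under /install: $2" >&2; return 1 ;;
    esac
}

# With /opt the image lands in /opt/image/x86/ on the server:
server_path /opt /install/image/x86/centos5image-blade08mar2466-v0.gz
# With /opt/image/x86 the directory part is doubled:
server_path /opt/image/x86 /install/image/x86/centos5image-blade08mar2466-v0.gz
```

If IMAGELIBINSTALLDIR were /opt/image/x86, partimage would be writing to /opt/image/x86/image/x86/ on the server, which may not exist there and would be one possible cause of the save failing.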

One thing to try is changing usepartimageng to 1 on line 147.  That will 
switch to using partimageng instead of partimage and tell us if it is 
something specific to partimage.

The next step will be to find the partimage logs, which is a little more 
complicated.

Feel free to hop on IRC to debug a little faster: #asfvcl on freenode.

Josh

On Friday April 01, 2011, John Ma wrote:
> Josh,
> 
> I added the debugging code, and it turns out NFS looks fine. I also manually
> verified the NFS share:
> 
>   [root@blade14 ~]# mkdir nfstest
>   [root@blade14 ~]# mount 172.20.0.1:/opt/image/x86 nfstest/
>   [root@blade14 ~]# cd nfstest/
>   [root@blade14 nfstest]# mkdir writetest
> 
> How can I debug the partimage save operation?
> 
> Thanks,
> John Ma
> Marist College
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)

iEYEARECAAYFAk2V22cACgkQV/LQcNdtPQM2cQCfeRhu3/EFLU2Rwu+MaN3L+cg/
TQ8An1giXD25WpoTf6b0/+yzwcGDQpsX
=IheQ
-----END PGP SIGNATURE-----

Re: VCL2.2 + xCAT2.5 on bladecenter

Posted by John Ma <Jo...@marist.edu>.
Josh,

I added the debugging code, and it turns out NFS looks fine. I also manually 
verified the NFS share:

  [root@blade14 ~]# mkdir nfstest
  [root@blade14 ~]# mount 172.20.0.1:/opt/image/x86 nfstest/
  [root@blade14 ~]# cd nfstest/
  [root@blade14 nfstest]# mkdir writetest

How can I debug the partimage save operation?

Thanks,
John Ma
Marist College



