Posted to user@vcl.apache.org by Josh Thompson <jo...@ncsu.edu> on 2011/04/01 15:05:32 UTC
Re: VCL2.2 + xCAT2.5 on bladecenter
John,
The partimageng postscript mounts an image store via NFS at /install. The NFS
server and path are specified in the xCAT site table as IMAGELIBSERVER and
IMAGELIBINSTALLDIR. More info about this part is at the bottom of our wiki
page explaining how to add partimage support to xCAT.
Do you have your image store exported read/write via NFS and available to the
client nodes?
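A quick way to answer that from a client node is to mount the export by hand and try a write. The helper below is only a sketch: check_writable is a made-up name, not part of xCAT or VCL, and it assumes you have already mounted the export somewhere.

```shell
#!/bin/sh
# Sketch: confirm that a directory (for example a hand-mounted image store)
# accepts writes. check_writable is a hypothetical helper name.
check_writable() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    probe="$dir/.writetest.$$"
    # Try to create a probe file, then clean it up again.
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "write OK: $dir"
        return 0
    fi
    echo "write FAILED: $dir" >&2
    return 1
}
```

For example, after mounting the export on a scratch directory with the same options the postscript uses, run check_writable on that directory. Keep in mind that if the export uses root_squash, a write as root on the client can fail even though the export is listed as rw.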
As a test, you could modify the partimageng script to output more debugging
info. You could modify the mount command on line 144 to be:
logger -t xcat "Attempting to mount image store: $IMAGELIBSERVER:$IMAGELIBINSTALLDIR"
if ! mount -o nfsvers=3,tcp,nolock,rw $IMAGELIBSERVER:$IMAGELIBINSTALLDIR /install; then
    echo "CRITICAL ERROR: Failed to mount image store at $IMAGELIBSERVER:$IMAGELIBINSTALLDIR; unable to save image"
    logger -t xcat "CRITICAL ERROR: Failed to mount image store at $IMAGELIBSERVER:$IMAGELIBINSTALLDIR; unable to save image"
    sleep 3
    exit 1
fi
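If you want the same kind of output around other commands in the script, the pattern can be wrapped in a small helper. This is a sketch only: run_logged is a hypothetical name, and it writes to stderr with echo so it can be tried anywhere. Inside the postscript you would presumably use logger -t xcat instead, as in the snippet above.

```shell
#!/bin/sh
# Sketch: run a command, report what was run, and report a non-zero exit
# status. run_logged is a hypothetical helper, not part of partimageng.
run_logged() {
    echo "debug: running: $*" >&2
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "debug: '$*' exited with status $status" >&2
    fi
    # Hand the original exit status back to the caller.
    return "$status"
}
```

Used as "run_logged mount ... || exit 1", it keeps the exit status of the wrapped command, so the surrounding error handling does not need to change.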
Josh
On Thursday March 31, 2011, John Ma wrote:
> Josh,
>
> Thanks again for the help. I configured an anonymous ftp share of
> /install, and it got past the previous error. Now I am stuck here:
>
> Mar 30 21:06:53 blade08 blade08 xcat: running partimage -z1 -f3 -odbc save /dev/sda1 /install/image/x86/centos5image-blade08mar2466-v0.gz
> Mar 30 21:06:58 blade08 blade08 xcat: partimage exited with a non-zero status, failing
> Mar 30 21:06:58 blade08 blade08 xcat: partimage-ng failed with exit code 1
> Mar 30 21:06:58 blade08 blade08 init: rc3 main process (1166) killed by TERM signal
>
> Blade08 then rebooted itself and looped again. partimage's save location
> /install/image/x86/.. doesn't seem right to me, but how do I configure it to
> use nfs?
> See the attached log file for more details (the clock on blade08 is off,
> or maybe set to UTC).
>
>
> Thanks,
> John
>
> From: Josh Thompson <jo...@ncsu.edu>
> To: vcl-user@incubator.apache.org
> Date: 03/31/2011 03:29 PM
> Subject: Re: VCL2.2 + xCAT2.5 on bladecenter
>
> John,
>
> Sorry to take so long to get back to you.
>
> I didn't even realize this until digging through xcatdsklspost, but your
> management node needs to be sharing out /install via ftp. I'm assuming xcat
> sets this up because I don't remember setting that up manually. The following
> line is from xcatdsklspost:
>
> wget -l inf -N -r --waitretry=10 --random-wait --retry-connrefused -t 0 -T 60 ftp://$SIP/postscripts 2> /tmp/wget.log
>
> $SIP is obtained earlier in the script from some dhcp information.
>
> The next line is where your screenshot shows the first error:
>
> mv $SIP/postscripts/* /xcatpost;
>
> The wget command should try forever until it downloads everything under
> ftp://$SIP/postscripts. The fact that you are getting past wget, but the move
> is failing for $SIP/postscripts/* makes me think you don't have anything under
> ftp://$SIP/postscripts. Can you try using a normal ftp client to browse
> ftp://172.20.101.140/postscripts? It may be that the ftp server is sharing
> out the wrong directory.
>
> Josh
>
> On Friday March 25, 2011, John Ma wrote:
> > Josh,
> >
> > I made some progress, but I am stuck again, this time at the reboot of the
> > machine being captured. The machine apparently cannot find postscripts.
> > Any idea about how to fix it or what to try next?
> >
> > I placed partimageng in /install/postscripts on our VCL (web, db, and mgt
> > code) server - Blade14 (172.20.101.140). The machine being captured is
> > blade08 (172.20.101.80).
> >
> > Here is the screenshot:
> >
> > Here is the pxe boot config file:
> > [root@blade14 ~]# cat /tftpboot/pxelinux.cfg/blade08
> > #image image-x86-centos5image-blade08mar2466-v0
> > DEFAULT xCAT
> > LABEL xCAT
> > KERNEL xcat/image/x86/vmlinuz
> > APPEND initrd=xcat/image/x86/initrd.img imgurl=http://blade14//install/image/x86/installer_files/rootimg.gz image=/install/image/x86/centos5image-blade08mar2466-v0.img blocks=512 action=save installnic=eth0 reboot noipv6
> > IPAPPEND 2
> > [root@blade14 ~]#
> >
> > Thanks,
> > John Ma
> > Marist College
> >
> >
> >
> >
> > From: Josh Thompson <jo...@ncsu.edu>
> > To: vcl-user@incubator.apache.org
> > Date: 03/22/2011 12:15 PM
> > Subject: Re: VCL2.2 + xCAT2.5 on bladecenter
--
-------------------------------
Josh Thompson
VCL Developer
North Carolina State University
my GPG/PGP key can be found at pgp.mit.edu
Re: VCL2.2 + xCAT2.5 on bladecenter
Posted by Josh Thompson <jo...@ncsu.edu>.
John,
Do you have IMAGELIBINSTALLDIR set to /opt or to /opt/image/x86? It needs to
be /opt.
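The reason is the path mapping. The postscript mounts $IMAGELIBSERVER:$IMAGELIBINSTALLDIR at /install, and your log shows partimage saving under /install/image/x86/. The sketch below just does that arithmetic so you can see where a file would land on the server. server_path_for is a hypothetical helper for illustration, not something in xCAT.

```shell
#!/bin/sh
# Sketch: given the exported directory (IMAGELIBINSTALLDIR) and a path as
# seen on the client under /install, print the matching path on the NFS
# server. server_path_for is a hypothetical helper name.
server_path_for() {
    export_dir=${1%/}     # strip one trailing slash, if any
    client_path=$2
    case $client_path in
        /install/*)
            printf '%s/%s\n' "$export_dir" "${client_path#/install/}"
            ;;
        *)
            echo "not under /install: $client_path" >&2
            return 1
            ;;
    esac
}

server_path_for /opt /install/image/x86/centos5image-blade08mar2466-v0.gz
# prints /opt/image/x86/centos5image-blade08mar2466-v0.gz

server_path_for /opt/image/x86 /install/image/x86/centos5image-blade08mar2466-v0.gz
# prints /opt/image/x86/image/x86/centos5image-blade08mar2466-v0.gz
```

With IMAGELIBINSTALLDIR set to /opt/image/x86, the save would target an image/x86 directory inside the export, which presumably does not exist, and that alone could make the save fail.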
One thing to try is changing usepartimageng to 1 on line 147. That will
switch to using partimageng instead of partimage and tell us if it is
something specific to partimage.
The next step will be to find the partimage logs, which is a little more
complicated.
Feel free to hop on IRC to debug a little faster: #asfvcl on freenode.
Josh
On Friday April 01, 2011, John Ma wrote:
> Josh,
>
> I added the debugging code, and it turns out NFS looks fine. I also manually
> verified the NFS share:
>
> [root@blade14 ~]# mkdir nfstest
> [root@blade14 ~]# mount 172.20.0.1:/opt/image/x86 nfstest/
> [root@blade14 ~]# cd nfstest/
> [root@blade14 nfstest]# mkdir writetest
>
> How can I debug the partimage save operation?
>
> Thanks,
> John Ma
> Marist College
--
-------------------------------
Josh Thompson
VCL Developer
North Carolina State University
my GPG/PGP key can be found at pgp.mit.edu
Re: VCL2.2 + xCAT2.5 on bladecenter
Posted by John Ma <Jo...@marist.edu>.
Josh,
I added the debugging code, and it turns out NFS looks fine. I also manually
verified the NFS share:
[root@blade14 ~]# mkdir nfstest
[root@blade14 ~]# mount 172.20.0.1:/opt/image/x86 nfstest/
[root@blade14 ~]# cd nfstest/
[root@blade14 nfstest]# mkdir writetest
How can I debug the partimage save operation?
Thanks,
John Ma
Marist College