You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@vcl.apache.org by Andrew Brown <br...@gmail.com> on 2009/02/06 21:12:16 UTC
Re: role of loadlog table - computerloadlog
So we have things mostly working except for one detail: When we check out a
computer for capture through "update image", all that goes fine. But when we
go and hit "create" to actually capture the image, vcld hangs with the log
messages that Brian posted, repeating.
If we go in and remove the database entry in the computerloadlog table, the
capture goes through; however, there seems to be a process that lingers and
keeps trying to ssh to the machine even when our capture is proceeding. It
gives us log entries like this:
2009-02-06 14:56:57|31467|93:93|reserved|utils.pm:check_connection(1828)|checking
for connection by administrator on test_vm, attempt 4
2009-02-06 14:56:57|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|executing
SSH command on test_vm: netstat -an
2009-02-06 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?:
0, signal: 0, core dump: 0, exit status: 0
2009-02-06 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH
command executed on test_vm: netstat -an, returning (0, output)
2009-02-06 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|executing
SSH command on test_vm: who
2009-02-06 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?:
0, signal: 0, core dump: 0, exit status: 0
2009-02-06 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6796)|run_ssh_command
output: admin pts/0 Feb 6 15:59 (152.14.244.35)
2009-02-06 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH
command executed on test_vm: who, returning (0, output)
(the pid being different from the one that is going into our capture
routine)
This may be unrelated, I don't know. But we need to fix it so we don't need
to manually delete the computerloadlog entry each time, as well as fix it so
this second process... goes away or something.
-Andrew
On Fri, Feb 6, 2009 at 1:23 PM, Aaron Peeler <aa...@ncsu.edu> wrote:
> Actually it's computerloadlog
>
> The computerloadlog table has three purposes.
> It also has two dependent tables computerloadflow and computerloadstate
>
>
> 1) Provides information to the user about what state user request is in,
> where it's at in loading, performing post install tasks, adding user acct,
> etc. This information is seen on the current reservations page by the user
> clicking on the pending link. This link opens a ajax based window and will
> provided the state the load is in. At some point during your load routine -
> you'll want to add these type of calls in the appropriate places.
>
> insertloadlog($reservation_id, $computer_id, "startload",
> "$computer_node_name $image_name");
>
> But this isn't required to get a provision module functional. It's a nice
> feature, so the user can be aware of what's going on. We'll put together
> information on how to best use the insertloadlog.
>
> 2) If a new|reload request is started and the computer state is reload or
> reloading. The forked request process will use that table to determine what
> to do. Either take over, fail, or let the current owning process complete.
>
> 3) It is also used in vcld to prevent duplicate or competing processes.
>
> It gets cleaned out on a per-reservation basis. When a reservations goes
> into the inuse state and when a reservation is removed from the db.
>
> In your case, since your just testing your module, just manually empty the
> computerloadlog table as needed.
>
> Aaron
>
>
>
> --On February 6, 2009 11:20:47 AM -0500 Brian Bouterse <bm...@ncsu.edu>
> wrote:
>
> What is the purpose of the loadlog table? Currently the ESX provisioning
>> module doesn't update any information in the loadlog table, and the
>> provisioning works fine. We're working on the capture() portions now,
>> and when we go through the web interface to have it "capture" our
>> reservation, vcld warns saying:
>>
>> | 25052|72:72|image| ---- WARNING ----
>> | 25052|72:72|image| 2009-02-06
>> | 11:19:56|25052|72:72|image|vcld:main(272)|reservation 72 is already
>> | being processed 25052|72:72|image| ( 0) utils.pm, notify (line: 683)
>> | 25052|72:72|image| (-1) vcld, main (line: 272)
>>
>> Is this normal for the "image update" process?
>>
>> Best,
>> Brian
>>
>> Brian Bouterse
>> Secure Open Systems Initiative
>> 919.698.8796
>>
>>
>>
>>
>>
>
>
Re: role of loadlog table - computerloadlog
Posted by Brian Bouterse <bm...@ncsu.edu>.
too bad apache doesn't let me attach the log. Here is the log for the
previous e-mail:
2009-02-06 14:52:27|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:52:27
2009-02-06 14:52:39|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:52:39
2009-02-06 14:52:51|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:52:51
2009-02-06 14:53:03|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:53:03
2009-02-06 14:53:15|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:53:15
2009-02-06 14:53:27|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:53:27
2009-02-06 14:53:39|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:53:39
2009-02-06 14:53:39|26991|93:93|new|utils.pm:check_time(1148)|
reservation start time was in the past 17 minutes (-9)
2009-02-06 14:53:39|26991|93:93|new|
utils.pm:reservation_being_processed(9787)|computerloadlog 'begin'
entry does NOT exist for reservation 93
2009-02-06 14:53:39|26991|93:93|new|vcld:main(276)|reservation 93 is
NOT already being processed
2009-02-06 14:53:39|26991|93:93|new|utils.pm:get_request_info(6034)|
non-NCSU user affiliation found: Local
2009-02-06 14:53:39|26991|93:93|new|
utils.pm:get_management_node_info(7210)|management node info retrieved
from database for dhcp2
2009-02-06 14:53:39|26991|93:93|new|vcld:main(281)|retieved request
information from database
2009-02-06 14:53:39|26991|93:93|new|vcld:main(318)|sleeping for 2
seconds before updating state to pending
2009-02-06 14:53:41|26991|93:93|new|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:53:41|26991|93:93|new|
utils.pm:update_request_state(2237)|request 93 state updated to:
pending, laststate to: new
2009-02-06 14:53:41|26991|93:93|new|vcld:main(327)|request state
updated to pending, laststate new
2009-02-06 14:53:41|26991|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, begin, beginning to process, state is new
2009-02-06 14:53:41|26991|93:93|new|vcld:main(340)|inserted 'begin'
entry into computerloadlog for reservation 93
2009-02-06 14:53:41|26991|93:93|new|vcld:make_new_child(511)|creating
new process
2009-02-06 14:53:41|26991|93:93|new|vcld:make_new_child(515)|loaded
VCL::new module
2009-02-06 14:53:41|31384|93:93|new|vcld:make_new_child(587)|vcld
environment variable set to 0 for this process
2009-02-06 14:53:41|31384|93:93|new|Module.pm:new(80)|constructor
called, class=VCL::new
2009-02-06 14:53:41|31384|93:93|new|Module.pm:new(102)|VCL::new object
created
2009-02-06 14:53:41|26991|93:93|new|vcld:make_new_child(571)|current
number of forked kids: 1
2009-02-06 14:53:41|31384|93:93|new|State.pm:initialize(86)|obtained a
database handle for this state process, stored as $ENV{dbh}
2009-02-06 14:53:41|31384|93:93|new|State.pm:check_image_os(378)|no
corrections need to be made, not an imaging request, returning 1
2009-02-06 14:53:41|31384|93:93|new|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:53:41|31384|93:93|new|utils.pm:rename_vcld_process(8326)|
reservation count: 1
2009-02-06 14:53:41|31384|93:93|new|utils.pm:rename_vcld_process(8345)|
PARENTIMAGE: 1
2009-02-06 14:53:41|31384|93:93|new|utils.pm:rename_vcld_process(8346)|
SUBIMAGE: 0
2009-02-06 14:53:41|31384|93:93|new|utils.pm:rename_vcld_process(8364)|
renamed process to 'vcld VCL::new 93:93 new imaging'
2009-02-06 14:53:41|31384|93:93|new|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:53:41|31384|93:93|new|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:53:41|31384|93:93|new|State.pm:initialize(122)|
attempting to load provisioning module: VCL::Module::Provisioning::esx
2009-02-06 14:53:42|31384|93:93|new|esx.pm:initialize(114)|vmware ESX
module initialized
2009-02-06 14:53:42|31384|93:93|new|State.pm:initialize(130)|
VCL::Module::Provisioning::esx module successfully loaded
2009-02-06 14:53:42|31384|93:93|new|Module.pm:new(80)|constructor
called, class=VCL::Module::Provisioning::esx
2009-02-06 14:53:42|31384|93:93|new|Module.pm:new(102)|
VCL::Module::Provisioning::esx object created
2009-02-06 14:53:42|31384|93:93|new|esx.pm:initialize(114)|vmware ESX
module initialized
2009-02-06 14:53:42|31384|93:93|new|State.pm:initialize(134)|
VCL::Module::Provisioning::esx provisioner object successfully created
2009-02-06 14:53:42|31384|93:93|new|State.pm:initialize(148)|
attempting to load OS module: VCL::Module::OS::Linux
2009-02-06 14:53:42|31384|93:93|new|Module.pm:new(80)|constructor
called, class=VCL::Module::OS::Linux
2009-02-06 14:53:42|31384|93:93|new|Module.pm:new(102)|
VCL::Module::OS::Linux object created
2009-02-06 14:53:42|31384|93:93|new|State.pm:initialize(156)|
VCL::Module::OS::Linux OS object successfully created
2009-02-06 14:53:42|31384|93:93|new|State.pm:initialize(170)|
attempting to load predictive loading module:
VCL::Module::Predictive::Level_0
2009-02-06 14:53:42|31384|93:93|new|Module.pm:new(80)|constructor
called, class=VCL::Module::Predictive::Level_0
2009-02-06 14:53:42|31384|93:93|new|Module.pm:new(102)|
VCL::Module::Predictive::Level_0 object created
2009-02-06 14:53:42|31384|93:93|new|State.pm:initialize(178)|
VCL::Module::Predictive::Level_0 predictive loading object
successfully created
2009-02-06 14:53:42|31384|93:93|new|State.pm:initialize(190)|returning 1
2009-02-06 14:53:42|31384|93:93|new|vcld:make_new_child(596)|VCL::new
object created and initialized
2009-02-06 14:53:42|31384|93:93|new|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:53:42|31384|93:93|new|
utils.pm:get_management_node_id(9353)|environment variable: 1
2009-02-06 14:53:42|31384|93:93|new|new.pm:process(135)|reservation is
parent = 1
2009-02-06 14:53:42|31384|93:93|new|new.pm:process(136)|preload only = 0
2009-02-06 14:53:42|31384|93:93|new|new.pm:process(137)|originating
request state = new
2009-02-06 14:53:42|31384|93:93|new|new.pm:process(138)|originating
request laststate = new
2009-02-06 14:53:42|31384|93:93|new|new.pm:process(139)|originating
computer state = available
2009-02-06 14:53:42|31384|93:93|new|new.pm:process(140)|originating
computer type = virtualmachine
2009-02-06 14:53:42|31384|93:93|new|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:53:42|31384|93:93|new|
utils.pm:get_management_node_id(9353)|environment variable: 1
2009-02-06 14:53:42|31384|93:93|new|
new.pm:computer_not_being_used(746)|test_vm is available, its state is
available
2009-02-06 14:53:42|31384|93:93|new|new.pm:process(170)|test_vm is not
being used
2009-02-06 14:53:42|31384|93:93|new|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:53:42|31384|93:93|new|
utils.pm:get_management_node_id(9353)|environment variable: 1
2009-02-06 14:53:42|31384|93:93|new|new.pm:reload_image(512)|calling
VCL::Module::Provisioning::esx->node_status()
2009-02-06 14:53:42|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, statuscheck, checking status of node
2009-02-06 14:53:42|31384|93:93|new|esx.pm:node_status(616)|Entering
node_status, checking status of test_vm
2009-02-06 14:53:42|31384|93:93|new|esx.pm:node_status(617)|
request_for_imaging: 1
2009-02-06 14:53:42|31384|93:93|new|esx.pm:node_status(618)|requeseted
image name: esx3-lamp-1.3.2-x86-v0
2009-02-06 14:53:42|31384|93:93|new|esx.pm:node_status(648)|checking
if test_vm is pingable
2009-02-06 14:53:42|31384|93:93|new|esx.pm:node_status(651)|test_vm is
pingable (1)
2009-02-06 14:53:42|31384|93:93|new|esx.pm:node_status(665)|Trying to
ssh...
2009-02-06 14:53:42|31384|93:93|new|utils.pm:run_ssh_command(6729)|
executing SSH command on test_vm: uname -s
2009-02-06 14:53:42|31384|93:93|new|utils.pm:run_ssh_command(6748)|$?:
0, signal: 0, core dump: 0, exit status: 0
2009-02-06 14:53:42|31384|93:93|new|utils.pm:run_ssh_command(6796)|
run_ssh_command output: Linux
2009-02-06 14:53:42|31384|93:93|new|utils.pm:run_ssh_command(6799)|SSH
command executed on test_vm: uname -s, returning (0, output)
2009-02-06 14:53:42|31384|93:93|new|esx.pm:node_status(674)|SSH good,
trying to query image name
2009-02-06 14:53:42|31384|93:93|new|utils.pm:run_ssh_command(6729)|
executing SSH command on test_vm: cat currentimage.txt
2009-02-06 14:53:43|31384|93:93|new|utils.pm:run_ssh_command(6748)|$?:
0, signal: 0, core dump: 0, exit status: 0
2009-02-06 14:53:43|31384|93:93|new|utils.pm:run_ssh_command(6796)|
run_ssh_command output: esx3-lamp-1.3.2-x86-v0
2009-02-06 14:53:43|31384|93:93|new|utils.pm:run_ssh_command(6799)|SSH
command executed on test_vm: cat currentimage.txt, returning (0, output)
2009-02-06 14:53:43|31384|93:93|new|esx.pm:node_status(681)|Image
name: esx3-lamp-1.3.2-x86-v0
2009-02-06 14:53:43|31384|93:93|new|esx.pm:node_status(687)|test_vm is
loaded with requestedimagename esx3-lamp-1.3.2-x86-v0
2009-02-06 14:53:43|31384|93:93|new|esx.pm:node_status(703)|status set
to READY
2009-02-06 14:53:43|31384|93:93|new|esx.pm:node_status(708)|
request_forimaging set, setting status to RELOAD
2009-02-06 14:53:43|31384|93:93|new|esx.pm:node_status(711)|returning
node status hash reference ($node_status->{status}=RELOAD)
2009-02-06 14:53:43|31384|93:93|new|new.pm:reload_image(527)|
node_status returned a hash reference
2009-02-06 14:53:43|31384|93:93|new|new.pm:reload_image(532)|
node_status hash reference contains key {status}=RELOAD
2009-02-06 14:53:43|31384|93:93|new|new.pm:reload_image(580)|node
status is RELOAD, test_vm will be reloaded
2009-02-06 14:53:43|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, loadimageblade, test_vm must be reloaded with
esx3-lamp-1.3.2-x86-v0
2009-02-06 14:53:43|31384|93:93|new|new.pm:reload_image(593)|calling
VCL::Module::Provisioning::esx->does_image_exist()
2009-02-06 14:53:43|31384|93:93|new|esx.pm:does_image_exist(736)|image
esx3-lamp-1.3.2-x86-v0 exists
2009-02-06 14:53:43|31384|93:93|new|new.pm:reload_image(596)|esx3-
lamp-1.3.2-x86-v0 exists on this management node
2009-02-06 14:53:43|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, doesimageexists, confirmed image exists
2009-02-06 14:53:43|31384|93:93|new|
utils.pm:update_computer_state(2279)|computer 2 state updated to:
reloading
2009-02-06 14:53:43|31384|93:93|new|new.pm:reload_image(628)|computer
test_vm state set to reloading
2009-02-06 14:53:43|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, info, computer state updated to reloading
2009-02-06 14:53:43|31384|93:93|new|new.pm:reload_image(638)|calling
VCL::Module::Provisioning::esx->load() subroutine
2009-02-06 14:53:43|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, info, calling VCL::Module::Provisioning::esx-
>load() subroutine
2009-02-06 14:53:43|31384|93:93|new|esx.pm:load(139)|
****************************************************
2009-02-06 14:53:43|31384|93:93|new|esx.pm:load(164)|Entered ESX
module, loading esx3-lamp-1.3.2-x86-v0 on test_vm (on
blade2-7.cnl.ncsu.edu) for reservation 93
2009-02-06 14:53:43|31384|93:93|new|esx.pm:load(174)|VM info command: /
usr/lib/vmware-viperl/apps/vm/vminfo.pl --server
'blade2-7.cnl.ncsu.edu' --vmname test_vm --username root --password
'anotherTestpassw0rd'
2009-02-06 14:53:46|31384|93:93|new|esx.pm:load(177)|VM info output:
|31384|93:93|new| Information of Virtual Machine test_vm
|31384|93:93|new| Name: test_vm
|31384|93:93|new| No. of CPU(s): 1
|31384|93:93|new| Memory Size: 256
|31384|93:93|new| Virtual Disks: 1
|31384|93:93|new| Template: 0
|31384|93:93|new| vmPathName: [VCL] inuse/test_vm/esx3-lamp-1.3.2-x86-
v0.vmx
|31384|93:93|new| Guest OS: Not Known
|31384|93:93|new| guestId: Not Known
|31384|93:93|new| Host name: Not Known
|31384|93:93|new| IP Address: Not Known
|31384|93:93|new| VMware Tools: VMware Tools has never been installed
or has not run in the virtual machine.
|31384|93:93|new| Cpu usage: 52 MHz
|31384|93:93|new| Host memory usage: 204 MB
|31384|93:93|new| Guest memory usage: 192 MB
|31384|93:93|new| Overall Status: The entity is OK
2009-02-06 14:53:46|31384|93:93|new|esx.pm:load(188)|Power off
command: /usr/lib/vmware-viperl/apps/vm/vmcontrol.pl --server
'blade2-7.cnl.ncsu.edu' --vmname test_vm --operation poweroff --
username root --password 'anotherTestpassw0rd'
2009-02-06 14:53:53|31384|93:93|new|esx.pm:load(191)|Powered off:
|31384|93:93|new| virtual machine 'test_vm' under host
blade2-7.cnl.ncsu.edu powered off
2009-02-06 14:53:53|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:53:53
2009-02-06 14:53:59|31384|93:93|new|esx.pm:load(206)|Un-Registered:
|31384|93:93|new| Unregister of VM 'test_vm' successfully completed
under host blade2-7.cnl.ncsu.edu
2009-02-06 14:53:59|31384|93:93|new|esx.pm:load(210)|Output from
remove command is:
2009-02-06 14:53:59|31384|93:93|new|esx.pm:load(225)|COPIED VMDK
SUCCESSFULLY
2009-02-06 14:53:59|31384|93:93|new|esx.pm:load(233)|SSHing to copy
vmdk-flat file
2009-02-06 14:54:05|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:54:05
2009-02-06 14:54:17|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:54:17
2009-02-06 14:54:29|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:54:29
2009-02-06 14:54:39|31384|93:93|new|esx.pm:load(256)|adapter= lsilogic
2009-02-06 14:54:41|31384|93:93|new|esx.pm:load(298)|wrote vmxarray
to /mnt/vcl/inuse/test_vm/esx3-lamp-1.3.2-x86-v0.vmx
2009-02-06 14:54:41|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:54:41
2009-02-06 14:54:46|31384|93:93|new|esx.pm:load(319)|Registered:
|31384|93:93|new| Register of VM 'test_vm' successfully completed
under host blade2-7.cnl.ncsu.edu
2009-02-06 14:54:46|31384|93:93|new|esx.pm:load(328)|Power on
command: /usr/lib/vmware-viperl/apps/vm/vmcontrol.pl --server
'blade2-7.cnl.ncsu.edu' --vmname test_vm --operation poweron --
username root --password 'anotherTestpassw0rd'
2009-02-06 14:54:53|31384|93:93|new|esx.pm:load(331)|Powered on:
|31384|93:93|new| virtual machine 'test_vm' under host
blade2-7.cnl.ncsu.edu powered on
2009-02-06 14:54:54|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:54:53
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(347)|deviceinfo-
>summary: VirtualPCNet32=HASH(0xaaa5444)->deviceinfo->summary
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(348)|virtualswitch0:
Management
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(347)|deviceinfo-
>summary: VirtualPCNet32=HASH(0x996b414)->deviceinfo->summary
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(348)|virtualswitch0:
Management
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(358)|Queried MAC
address is 00:0c:29:68:ad:3c
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(370)|test_vm now has
ip 10.0.4.14
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(386)|Found IP address
10.0.4.14
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(389)|Removing old
hosts entry
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(391)|
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(400)|Waiting for ssh
to come up on test_vm
2009-02-06 14:54:57|31384|93:93|new|esx.pm:load(417)|going to sleep 5
seconds, waiting for computer to start SSH. Try 1
2009-02-06 14:55:02|31384|93:93|new|esx.pm:load(417)|going to sleep 5
seconds, waiting for computer to start SSH. Try 2
2009-02-06 14:55:06|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:55:06
2009-02-06 14:55:08|31384|93:93|new|esx.pm:load(417)|going to sleep 5
seconds, waiting for computer to start SSH. Try 3
2009-02-06 14:55:13|31384|93:93|new|esx.pm:load(417)|going to sleep 5
seconds, waiting for computer to start SSH. Try 4
2009-02-06 14:55:18|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:55:18
2009-02-06 14:55:18|31384|93:93|new|esx.pm:load(417)|going to sleep 5
seconds, waiting for computer to start SSH. Try 5
2009-02-06 14:55:23|31384|93:93|new|utils.pm:run_ssh_command(6729)|
executing SSH command on test_vm: uname -s
2009-02-06 14:55:24|31384|93:93|new|utils.pm:run_ssh_command(6748)|$?:
0, signal: 0, core dump: 0, exit status: 0
2009-02-06 14:55:24|31384|93:93|new|utils.pm:run_ssh_command(6796)|
run_ssh_command output: Linux
2009-02-06 14:55:24|31384|93:93|new|utils.pm:run_ssh_command(6799)|SSH
command executed on test_vm: uname -s, returning (0, output)
2009-02-06 14:55:24|31384|93:93|new|esx.pm:load(405)|test_vm now has
active sshd running
2009-02-06 14:55:24|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, dynamicDHCPaddress, collecting dynamic IP address
for node
2009-02-06 14:55:24|31384|93:93|new|esx.pm:load(428)|Attempting to
query vmclient for its public IP...
2009-02-06 14:55:24|31384|93:93|new|utils.pm:getdynamicaddress(1638)|
PrivateIP address for test_vm collected 10.0.4.14
2009-02-06 14:55:24|31384|93:93|new|utils.pm:run_ssh_command(6729)|
executing SSH command on test_vm: /sbin/ifconfig |grep inet
2009-02-06 14:55:25|31384|93:93|new|utils.pm:run_ssh_command(6748)|$?:
0, signal: 0, core dump: 0, exit status: 0
2009-02-06 14:55:25|31384|93:93|new|utils.pm:run_ssh_command(6796)|
run_ssh_command output: inet addr:10.0.4.14 Bcast:10.0.255.255 Mask:
255.255.0.0
|31384|93:93|new| inet6 addr: fe80::20c:29ff:fe68:ad3c/64 Scope:Link
|31384|93:93|new| inet addr:152.14.17.82 Bcast:152.14.17.255 Mask:
255.255.255.0
|31384|93:93|new| inet6 addr: fe80::20c:29ff:fe68:ad46/64 Scope:Link
|31384|93:93|new| inet addr:127.0.0.1 Mask:255.0.0.0
|31384|93:93|new| inet6 addr: ::1/128 Scope:Host
2009-02-06 14:55:25|31384|93:93|new|utils.pm:run_ssh_command(6799)|SSH
command executed on test_vm: /sbin/ifconfig |grep inet, returning (0,
output)
2009-02-06 14:55:25|31384|93:93|new|esx.pm:load(432)|Got dynamic
address from vmclient, attempting to update database
2009-02-06 14:55:25|31384|93:93|new|
utils.pm:update_computer_address(7412)|computer 2 IP address
152.14.17.82 updated in database
2009-02-06 14:55:25|31384|93:93|new|esx.pm:load(433)|succesfully
updated IPaddress of node test_vm
2009-02-06 14:55:25|31384|93:93|new|new.pm:reload_image(641)|esx3-
lamp-1.3.2-x86-v0 was successfully reloaded on test_vm
2009-02-06 14:55:25|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, loadimagecomplete, esx3-lamp-1.3.2-x86-v0 was
successfully reloaded on test_vm
2009-02-06 14:55:25|31384|93:93|new|utils.pm:update_currentimage(7242)|
updating computer 2: image=10, imagerevision=0
2009-02-06 14:55:25|31384|93:93|new|utils.pm:update_currentimage(7272)|
updated currentimageid and imagerevision id for computer id 2
2009-02-06 14:55:25|31384|93:93|new|new.pm:reload_image(651)|updated
computer table for test_vm: currentimageid=10
2009-02-06 14:55:25|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, nodeready, test_vm was reloaded with esx3-
lamp-1.3.2-x86-v0
2009-02-06 14:55:25|31384|93:93|new|new.pm:reload_image(667)|returning 1
2009-02-06 14:55:25|31384|93:93|new|new.pm:process(266)|test_vm is
loaded with esx3-lamp-1.3.2-x86-v0
2009-02-06 14:55:25|31384|93:93|new|new.pm:process(350)|test_vm
preferred image set to esx3-lamp-1.3.2-x86-v0
2009-02-06 14:55:25|31384|93:93|new|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:55:25|31384|93:93|new|
utils.pm:get_management_node_id(9353)|environment variable: 1
2009-02-06 14:55:25|31384|93:93|new|new.pm:reserve_computer(969)|
user_standalone=1, image OS type=linux
2009-02-06 14:55:25|31384|93:93|new|utils.pm:getdynamicaddress(1638)|
PrivateIP address for test_vm collected 10.0.4.14
2009-02-06 14:55:25|31384|93:93|new|utils.pm:run_ssh_command(6729)|
executing SSH command on test_vm: /sbin/ifconfig |grep inet
2009-02-06 14:55:26|31384|93:93|new|utils.pm:run_ssh_command(6748)|$?:
0, signal: 0, core dump: 0, exit status: 0
2009-02-06 14:55:26|31384|93:93|new|utils.pm:run_ssh_command(6796)|
run_ssh_command output: inet addr:10.0.4.14 Bcast:10.0.255.255 Mask:
255.255.0.0
|31384|93:93|new| inet6 addr: fe80::20c:29ff:fe68:ad3c/64 Scope:Link
|31384|93:93|new| inet addr:152.14.17.82 Bcast:152.14.17.255 Mask:
255.255.255.0
|31384|93:93|new| inet6 addr: fe80::20c:29ff:fe68:ad46/64 Scope:Link
|31384|93:93|new| inet addr:127.0.0.1 Mask:255.0.0.0
|31384|93:93|new| inet6 addr: ::1/128 Scope:Host
2009-02-06 14:55:26|31384|93:93|new|utils.pm:run_ssh_command(6799)|SSH
command executed on test_vm: /sbin/ifconfig |grep inet, returning (0,
output)
2009-02-06 14:55:26|31384|93:93|new|
utils.pm:update_computer_address(7412)|computer 2 IP address
152.14.17.82 updated in database
2009-02-06 14:55:26|31384|93:93|new|new.pm:reserve_computer(991)|
dynamic address collect 152.14.17.82 -- updating computer table
2009-02-06 14:55:26|31384|93:93|new|DataStructure.pm:_automethod(628)|
data structure updated: computer_ip_address = 152.14.17.82
2009-02-06 14:55:26|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, info, node ready adding user account
2009-02-06 14:55:26|31384|93:93|new|
utils.pm:update_request_password(2318)|password updated for
reservation_id 93
2009-02-06 14:55:26|31384|93:93|new|new.pm:reserve_computer(1014)|
updated password entry reservation_id 93
2009-02-06 14:55:26|31384|93:93|new|new.pm:reserve_computer(1161)|
root@localhost
|31384|93:93|new| vcl,
|31384|93:93|new| The resources for your VCL image creation request
have been successfully reserved.
|31384|93:93|new| Reservation Information:
|31384|93:93|new| Image Name: ESX LAMP VM
|31384|93:93|new| IP Address: 152.14.17.82
|31384|93:93|new| Connection will not be allowed until you acknowledge
using the VCL web interface. You must acknowledge the reservation
within the next 15 minutes or the resources will be reclaimed for
other VCL users.
|31384|93:93|new| -Visit http://vcl.ncsu.edu
|31384|93:93|new| -Select "Current Reservations"
|31384|93:93|new| -Click the "Connect" button
|31384|93:93|new| Upon acknowledgement, all of the remaining
connection details will be displayed.
|31384|93:93|new| You have up to 8 hours to complete the new image.
Once you have completed preparing the new image:
|31384|93:93|new| -Visit http://vcl.ncsu.edu
|31384|93:93|new| -Select "Current Reservations"
|31384|93:93|new| -Click the "Create Image" button and follow the
instuctions
|31384|93:93|new| Thank You,
|31384|93:93|new| VCL Team
2009-02-06 14:55:26|31384|93:93|new|new.pm:reserve_computer(1164)|
IMTYPE_name= none calling notify_via
2009-02-06 14:55:26|31384|93:93|new|new.pm:reserve_computer(1219)|
updated sublog 38 for node test_vm IPaddress 152.14.17.82
2009-02-06 14:55:26|31384|93:93|new|new.pm:process(381)|test_vm with
esx3-lamp-1.3.2-x86-v0 successfully reserved
2009-02-06 14:55:26|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, reserved, test_vm successfully reserved with esx3-
lamp-1.3.2-x86-v0
2009-02-06 14:55:26|31384|93:93|new|new.pm:process(393)|inserted
computerloadlog entry, load state=reserved
2009-02-06 14:55:26|31384|93:93|new|
utils.pm:update_computer_state(2279)|computer 2 state updated to:
reserved
2009-02-06 14:55:26|31384|93:93|new|new.pm:process(417)|test_vm state
set to 'reserved'
2009-02-06 14:55:26|31384|93:93|new|
utils.pm:update_request_state(2237)|request 93 state updated to:
reserved, laststate to: new
2009-02-06 14:55:26|31384|93:93|new|new.pm:process(426)|request state
set to 'reserved'/'new'
2009-02-06 14:55:26|31384|93:93|new|utils.pm:insertloadlog(5379)|
inserted computer=2, info, test_vm successfully set to reserved with
esx3-lamp-1.3.2-x86-v0
2009-02-06 14:55:26|31384|93:93|new|new.pm:process(439)|inserted
computerloadlog entry: test_vm successfully set to reserved with esx3-
lamp-1.3.2-x86-v0
2009-02-06 14:55:26|31384|93:93|new|new.pm:process(446)|exiting
2009-02-06 14:55:26|31384|93:93|new|State.pm:DESTROY(455)|destructor
called, ref($self)=VCL::new
2009-02-06 14:55:26|31384|93:93|new|
utils.pm:delete_computerloadlog_reservation(8062)|deleted rows from
computerloadlog table where reservation id=93, loadstatename=begin
2009-02-06 14:55:26|31384|93:93|new|State.pm:DESTROY(458)|removed
computerloadlog rows with loadstate=begin for reservation
2009-02-06 14:55:26|31384|93:93|new|State.pm:DESTROY(466)|number of
database handles state process created: 1
2009-02-06 14:55:26|31384|93:93|new|State.pm:DESTROY(475)|process has
a database handle stored in $ENV{dbh}, attempting disconnect
2009-02-06 14:55:26|31384|93:93|new|State.pm:DESTROY(477)|$ENV{dbh}:
database disconnect successful
2009-02-06 14:55:26|26991|vcld:REAPER(685)|VCL state process exited,
pid=31384
2009-02-06 14:55:26|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:55:26
2009-02-06 14:55:26|26991|93:93|reserved|
utils.pm:reservation_being_processed(9787)|computerloadlog 'begin'
entry does NOT exist for reservation 93
2009-02-06 14:55:26|26991|93:93|reserved|vcld:main(276)|reservation 93
is NOT already being processed
2009-02-06 14:55:26|26991|93:93|reserved|
utils.pm:get_request_info(6034)|non-NCSU user affiliation found: Local
2009-02-06 14:55:26|26991|93:93|reserved|
utils.pm:get_management_node_info(7210)|management node info retrieved
from database for dhcp2
2009-02-06 14:55:26|26991|93:93|reserved|vcld:main(281)|retieved
request information from database
2009-02-06 14:55:26|26991|93:93|reserved|vcld:main(318)|sleeping for 2
seconds before updating state to pending
2009-02-06 14:55:28|26991|93:93|reserved|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:55:28|26991|93:93|reserved|
utils.pm:update_request_state(2237)|request 93 state updated to:
pending, laststate to: reserved
2009-02-06 14:55:28|26991|93:93|reserved|vcld:main(327)|request state
updated to pending, laststate reserved
2009-02-06 14:55:28|26991|93:93|reserved|utils.pm:insertloadlog(5379)|
inserted computer=2, begin, beginning to process, state is reserved
2009-02-06 14:55:28|26991|93:93|reserved|vcld:main(340)|inserted
'begin' entry into computerloadlog for reservation 93
2009-02-06 14:55:28|26991|93:93|reserved|vcld:make_new_child(511)|
creating new process
2009-02-06 14:55:28|26991|93:93|reserved|vcld:make_new_child(515)|
loaded VCL::reserved module
2009-02-06 14:55:28|31467|93:93|reserved|vcld:make_new_child(587)|vcld
environment variable set to 0 for this process
2009-02-06 14:55:28|31467|93:93|reserved|Module.pm:new(80)|constructor
called, class=VCL::reserved
2009-02-06 14:55:28|31467|93:93|reserved|Module.pm:new(102)|
VCL::reserved object created
2009-02-06 14:55:28|26991|93:93|reserved|vcld:make_new_child(571)|
current number of forked kids: 1
2009-02-06 14:55:28|31467|93:93|reserved|State.pm:initialize(86)|
obtained a database handle for this state process, stored as $ENV{dbh}
2009-02-06 14:55:28|31467|93:93|reserved|State.pm:check_image_os(378)|
no corrections need to be made, not an imaging request, returning 1
2009-02-06 14:55:28|31467|93:93|reserved|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:55:28|31467|93:93|reserved|
utils.pm:rename_vcld_process(8326)|reservation count: 1
2009-02-06 14:55:28|31467|93:93|reserved|
utils.pm:rename_vcld_process(8345)|PARENTIMAGE: 1
2009-02-06 14:55:28|31467|93:93|reserved|
utils.pm:rename_vcld_process(8346)|SUBIMAGE: 0
2009-02-06 14:55:28|31467|93:93|reserved|
utils.pm:rename_vcld_process(8364)|renamed process to 'vcld
VCL::reserved 93:93 reserved imaging'
2009-02-06 14:55:28|31467|93:93|reserved|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:55:28|31467|93:93|reserved|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:55:28|31467|93:93|reserved|State.pm:initialize(122)|
attempting to load provisioning module: VCL::Module::Provisioning::esx
2009-02-06 14:55:29|31467|93:93|reserved|esx.pm:initialize(114)|vmware
ESX module initialized
2009-02-06 14:55:29|31467|93:93|reserved|State.pm:initialize(130)|
VCL::Module::Provisioning::esx module successfully loaded
2009-02-06 14:55:29|31467|93:93|reserved|Module.pm:new(80)|constructor
called, class=VCL::Module::Provisioning::esx
2009-02-06 14:55:29|31467|93:93|reserved|Module.pm:new(102)|
VCL::Module::Provisioning::esx object created
2009-02-06 14:55:29|31467|93:93|reserved|esx.pm:initialize(114)|vmware
ESX module initialized
2009-02-06 14:55:29|31467|93:93|reserved|State.pm:initialize(134)|
VCL::Module::Provisioning::esx provisioner object successfully created
2009-02-06 14:55:29|31467|93:93|reserved|State.pm:initialize(148)|
attempting to load OS module: VCL::Module::OS::Linux
2009-02-06 14:55:29|31467|93:93|reserved|Module.pm:new(80)|constructor
called, class=VCL::Module::OS::Linux
2009-02-06 14:55:29|31467|93:93|reserved|Module.pm:new(102)|
VCL::Module::OS::Linux object created
2009-02-06 14:55:29|31467|93:93|reserved|State.pm:initialize(156)|
VCL::Module::OS::Linux OS object successfully created
2009-02-06 14:55:29|31467|93:93|reserved|State.pm:initialize(170)|
attempting to load predictive loading module:
VCL::Module::Predictive::Level_0
2009-02-06 14:55:29|31467|93:93|reserved|Module.pm:new(80)|constructor
called, class=VCL::Module::Predictive::Level_0
2009-02-06 14:55:29|31467|93:93|reserved|Module.pm:new(102)|
VCL::Module::Predictive::Level_0 object created
2009-02-06 14:55:29|31467|93:93|reserved|State.pm:initialize(178)|
VCL::Module::Predictive::Level_0 predictive loading object
successfully created
2009-02-06 14:55:29|31467|93:93|reserved|State.pm:initialize(190)|
returning 1
2009-02-06 14:55:29|31467|93:93|reserved|vcld:make_new_child(596)|
VCL::reserved object created and initialized
2009-02-06 14:55:29|31467|93:93|reserved|reserved.pm:process(121)|
updated log table, set loaded time to now for id:38
2009-02-06 14:55:29|31467|93:93|reserved|reserved.pm:process(133)|
imagemeta user group defined
2009-02-06 14:55:29|31467|93:93|reserved|
utils.pm:getusergroupmembers(4419)|no data returned for usergroupid
returning empty lists
2009-02-06 14:55:29|31467|93:93|reserved|reserved.pm:process(138)|
imagemeta user group membership count = 0
2009-02-06 14:55:29|31467|93:93|reserved|reserved.pm:process(158)|
computer info: id=2, type=virtualmachine, hostname=test_vm
2009-02-06 14:55:29|31467|93:93|reserved|reserved.pm:process(159)|user
info: uid=101, unity id=admin, standalone=1
2009-02-06 14:55:29|31467|93:93|reserved|reserved.pm:process(160)|
imagemeta checkuser set to: 1
2009-02-06 14:55:29|31467|93:93|reserved|reserved.pm:process(161)|
formimaging set to: 1
2009-02-06 14:55:29|31467|93:93|reserved|reserved.pm:process(172)|
begin checking for user acknowledgement
2009-02-06 14:55:29|31467|93:93|reserved|utils.pm:insertloadlog(5379)|
inserted computer=2, info, reserved: waiting for user acknowledgement
2009-02-06 14:55:40|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:55:40
2009-02-06 14:55:49|31467|93:93|reserved|reserved.pm:process(193)|user
acknowledged, remote IP: 152.14.244.35
2009-02-06 14:55:49|31467|93:93|reserved|reserved.pm:process(197)|
blade or virtual machine detected: virtualmachine
2009-02-06 14:55:49|31467|93:93|reserved|reserved.pm:process(251)|
Linux image detected: fc9image
2009-02-06 14:55:49|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: /usr/
sbin/useradd -u 101 -d /home/admin -m admin -g ncsu
2009-02-06 14:55:51|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:55:51|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: none
2009-02-06 14:55:51|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: /usr/
sbin/useradd -u 101 -d /home/admin -m admin -g ncsu, returning (0,
output)
2009-02-06 14:55:51|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: cat /
etc/passwd
2009-02-06 14:55:52|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:55:52|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: root:x:
0:0:root:/root:/bin/bash
|31467|93:93|reserved| daemon:x:2:2:daemon:/sbin:/sbin/nologin
|31467|93:93|reserved| lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
|31467|93:93|reserved| mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
|31467|93:93|reserved| uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
|31467|93:93|reserved| named:x:25:25:Named:/var/named:/sbin/nologin
|31467|93:93|reserved| postgres:x:26:26:postgres:/srv/pgsql:/bin/bash
|31467|93:93|reserved| mysql:x:27:27:mysql:/var/lib/mysql:/bin/bash
|31467|93:93|reserved| rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/
sbin/nologin
|31467|93:93|reserved| rpc:x:32:32:Portmapper RPC User:/:/sbin/nologin
|31467|93:93|reserved| ntp:x:38:38::/etc/ntp:/sbin/nologin
|31467|93:93|reserved| mailnull:x:47:47::/var/spool/mailqueue:/sbin/
nologin
|31467|93:93|reserved| apache:x:48:48:Apache:/var/www:/sbin/nologin
|31467|93:93|reserved| smmsp:x:51:51::/var/spool/mqueue:/sbin/nologin
|31467|93:93|reserved| emerge:x:70:70:Conary emerge build user:/var/
conary/emerge:/sbin/nologin
|31467|93:93|reserved| sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
|31467|93:93|reserved| raa-web:x:91:91:rPath Appliance Agent:/var/lib/
raa/:/sbin/nologin
|31467|93:93|reserved| nobody:x:99:99:Nobody:/:/sbin/nologin
|31467|93:93|reserved| web:x:500:48:Web User Account:/srv/www/html:/
usr/bin/scponly
|31467|93:93|reserved| admin:x:101:500::/home/admin:/bin/bash
2009-02-06 14:55:52|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: cat /
etc/passwd, returning (0, output)
2009-02-06 14:55:52|31467|93:93|reserved|utils.pm:add_user(3020)|added
user account admin to test_vm
2009-02-06 14:55:52|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: echo
"AllowUsers admin" >> /etc/ssh/external_sshd_config
2009-02-06 14:55:52|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:55:52
2009-02-06 14:55:53|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:55:53|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: none
2009-02-06 14:55:53|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: echo
"AllowUsers admin" >> /etc/ssh/external_sshd_config, returning (0,
output)
2009-02-06 14:55:53|31467|93:93|reserved|utils.pm:add_user(3023)|added
AllowUsers admin to external_sshd_config
2009-02-06 14:55:53|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: /etc/
init.d/ext_sshd restart
2009-02-06 14:55:54|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 32512, signal: 0, core dump: 0,
exit status: 127
2009-02-06 14:55:54|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: bash: /etc/
init.d/ext_sshd: No such file or directory
2009-02-06 14:55:54|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: /etc/
init.d/ext_sshd restart, returning (127, output)
2009-02-06 14:55:54|31467|93:93|reserved|utils.pm:add_user(3041)|
started ext_sshd on test_vm
2009-02-06 14:55:54|31467|93:93|reserved|reserved.pm:process(277)|user
admin added to test_vm
2009-02-06 14:55:54|31467|93:93|reserved|utils.pm:insertloadlog(5379)|
inserted computer=2, info, reserved: adding user and opening remote
access port for 152.14.244.35
2009-02-06 14:55:54|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: echo
BF8PKX | /usr/bin/passwd -f admin --stdin
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: Changing
password for user admin.
|31467|93:93|reserved| passwd: all authentication tokens updated
successfully.
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: echo
BF8PKX | /usr/bin/passwd -f admin --stdin, returning (0, output)
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:changelinuxpassword(4332)|successfully changed local password
account admin
2009-02-06 14:55:55|31467|93:93|reserved|reserved.pm:process(290)|
password changed on test_vm for standalone user admin
2009-02-06 14:55:55|31467|93:93|reserved|reserved.pm:process(405)|
checkuser flag is set to 1, checking user connection
2009-02-06 14:55:55|31467|93:93|reserved|reserved.pm:process(409)|
forimaging flag is set to 1, checking for connection by administrator
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:check_connection(1828)|checking for connection by
administrator on test_vm, attempt 1
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm:
netstat -an
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm:
netstat -an, returning (0, output)
2009-02-06 14:55:55|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: who
2009-02-06 14:55:56|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:55:56|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: none
2009-02-06 14:55:56|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: who,
returning (0, output)
2009-02-06 14:56:05|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:56:04
2009-02-06 14:56:16|31467|93:93|reserved|
utils.pm:check_connection(1828)|checking for connection by
administrator on test_vm, attempt 2
2009-02-06 14:56:16|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm:
netstat -an
2009-02-06 14:56:16|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:56:16|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm:
netstat -an, returning (0, output)
2009-02-06 14:56:16|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: who
2009-02-06 14:56:17|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:56:17
2009-02-06 14:56:17|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:56:17|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: admin pts/0 Feb
6 15:59 (152.14.244.35)
2009-02-06 14:56:17|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: who,
returning (0, output)
2009-02-06 14:56:29|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:56:29
2009-02-06 14:56:29|26991|93:93|image|
utils.pm:reservation_being_processed(9779)|computerloadlog 'begin'
entry exists for reservation
|26991|93:93|image| ---- WARNING ----
|26991|93:93|image| 2009-02-06 14:56:29|26991|93:93|image|
vcld:main(272)|reservation 93 is already being processed
|26991|93:93|image| ( 0) utils.pm, notify (line: 683)
|26991|93:93|image| (-1) vcld, main (line: 272)
2009-02-06 14:56:37|31467|93:93|reserved|
utils.pm:check_connection(1828)|checking for connection by
administrator on test_vm, attempt 3
2009-02-06 14:56:37|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm:
netstat -an
2009-02-06 14:56:37|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:56:37|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm:
netstat -an, returning (0, output)
2009-02-06 14:56:37|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: who
2009-02-06 14:56:37|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:56:37|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: admin pts/0 Feb
6 15:59 (152.14.244.35)
2009-02-06 14:56:37|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: who,
returning (0, output)
2009-02-06 14:56:41|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:56:41
2009-02-06 14:56:41|26991|93:93|image|
utils.pm:reservation_being_processed(9779)|computerloadlog 'begin'
entry exists for reservation
|26991|93:93|image| ---- WARNING ----
|26991|93:93|image| 2009-02-06 14:56:41|26991|93:93|image|
vcld:main(272)|reservation 93 is already being processed
|26991|93:93|image| ( 0) utils.pm, notify (line: 683)
|26991|93:93|image| (-1) vcld, main (line: 272)
2009-02-06 14:56:53|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:56:53
2009-02-06 14:56:53|26991|93:93|image|
utils.pm:reservation_being_processed(9787)|computerloadlog 'begin'
entry does NOT exist for reservation 93
2009-02-06 14:56:53|26991|93:93|image|vcld:main(276)|reservation 93 is
NOT already being processed
2009-02-06 14:56:53|26991|93:93|image|utils.pm:get_request_info(6034)|
non-NCSU user affiliation found: Local
2009-02-06 14:56:53|26991|93:93|image|
utils.pm:get_management_node_info(7210)|management node info retrieved
from database for dhcp2
2009-02-06 14:56:53|26991|93:93|image|vcld:main(281)|retieved request
information from database
2009-02-06 14:56:53|26991|93:93|image|vcld:main(318)|sleeping for 2
seconds before updating state to pending
2009-02-06 14:56:55|26991|93:93|image|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:56:55|26991|93:93|image|
utils.pm:update_request_state(2237)|request 93 state updated to:
pending, laststate to: image
2009-02-06 14:56:55|26991|93:93|image|vcld:main(327)|request state
updated to pending, laststate image
2009-02-06 14:56:55|26991|93:93|image|utils.pm:insertloadlog(5379)|
inserted computer=2, begin, beginning to process, state is image
2009-02-06 14:56:55|26991|93:93|image|vcld:main(340)|inserted 'begin'
entry into computerloadlog for reservation 93
2009-02-06 14:56:55|26991|93:93|image|vcld:make_new_child(511)|
creating new process
2009-02-06 14:56:55|26991|93:93|image|vcld:make_new_child(515)|loaded
VCL::image module
2009-02-06 14:56:55|31543|93:93|image|vcld:make_new_child(587)|vcld
environment variable set to 0 for this process
2009-02-06 14:56:55|31543|93:93|image|Module.pm:new(80)|constructor
called, class=VCL::image
2009-02-06 14:56:55|31543|93:93|image|Module.pm:new(102)|VCL::image
object created
2009-02-06 14:56:55|26991|93:93|image|vcld:make_new_child(571)|current
number of forked kids: 2
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(86)|obtained
a database handle for this state process, stored as $ENV{dbh}
2009-02-06 14:56:55|31543|93:93|image|State.pm:check_image_os(388)|no
corrections need to be made to image OS: fc9image
2009-02-06 14:56:55|31543|93:93|image|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:56:55|31543|93:93|image|
utils.pm:rename_vcld_process(8326)|reservation count: 1
2009-02-06 14:56:55|31543|93:93|image|
utils.pm:rename_vcld_process(8345)|PARENTIMAGE: 1
2009-02-06 14:56:55|31543|93:93|image|
utils.pm:rename_vcld_process(8346)|SUBIMAGE: 0
2009-02-06 14:56:55|31543|93:93|image|
utils.pm:rename_vcld_process(8364)|renamed process to 'vcld VCL::image
93:93 image imaging'
2009-02-06 14:56:55|31543|93:93|image|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:56:55|31543|93:93|image|
DataStructure.pm:is_parent_reservation(783)|returning true: parent
reservation ID for this request: 93
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(122)|
attempting to load provisioning module: VCL::Module::Provisioning::esx
2009-02-06 14:56:55|31543|93:93|image|esx.pm:initialize(114)|vmware
ESX module initialized
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(130)|
VCL::Module::Provisioning::esx module successfully loaded
2009-02-06 14:56:55|31543|93:93|image|Module.pm:new(80)|constructor
called, class=VCL::Module::Provisioning::esx
2009-02-06 14:56:55|31543|93:93|image|Module.pm:new(102)|
VCL::Module::Provisioning::esx object created
2009-02-06 14:56:55|31543|93:93|image|esx.pm:initialize(114)|vmware
ESX module initialized
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(134)|
VCL::Module::Provisioning::esx provisioner object successfully created
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(148)|
attempting to load OS module: VCL::Module::OS::Linux
2009-02-06 14:56:55|31543|93:93|image|Module.pm:new(80)|constructor
called, class=VCL::Module::OS::Linux
2009-02-06 14:56:55|31543|93:93|image|Module.pm:new(102)|
VCL::Module::OS::Linux object created
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(156)|
VCL::Module::OS::Linux OS object successfully created
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(170)|
attempting to load predictive loading module:
VCL::Module::Predictive::Level_0
2009-02-06 14:56:55|31543|93:93|image|Module.pm:new(80)|constructor
called, class=VCL::Module::Predictive::Level_0
2009-02-06 14:56:55|31543|93:93|image|Module.pm:new(102)|
VCL::Module::Predictive::Level_0 object created
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(178)|
VCL::Module::Predictive::Level_0 predictive loading object
successfully created
2009-02-06 14:56:55|31543|93:93|image|State.pm:initialize(190)|
returning 1
2009-02-06 14:56:55|31543|93:93|image|vcld:make_new_child(596)|
VCL::image object created and initialized
2009-02-06 14:56:55|31543|93:93|image|utils.pm:mail(1281)|SUCCESS --
Sending mail To: root@localhost, VCL IMAGE Creation Started: esx3-
lamp-1.3.2-x86-v7
|31543|93:93|image| ---- WARNING ----
|31543|93:93|image| 2009-02-06 14:56:55|31543|93:93|image|
esx.pm:does_image_exist(742)|image /mnt/vcl/golden/esx3-lamp-1.3.2-x86-
v7 does NOT exists
|31543|93:93|image| ( 0) utils.pm, notify (line: 683)
|31543|93:93|image| (-1) esx.pm, does_image_exist (line: 742)
|31543|93:93|image| (-2) image.pm, process (line: 136)
|31543|93:93|image| (-3) vcld, make_new_child (line: 599)
|31543|93:93|image| (-4) vcld, main (line: 347)
2009-02-06 14:56:55|31543|93:93|image|image.pm:process(146)|image esx3-
lamp-1.3.2-x86-v7 does not exist in the repository
2009-02-06 14:56:55|31543|93:93|image|
DataStructure.pm:_automethod(628)|data structure updated:
image_lastupdate = 2009-02-06 14:56:55
2009-02-06 14:56:55|31543|93:93|image|
DataStructure.pm:_automethod(628)|data structure updated:
imagerevision_date_created = 2009-02-06 14:56:55
2009-02-06 14:56:55|31543|93:93|image|esx.pm:capture(471)|
**********************************************************
2009-02-06 14:56:55|31543|93:93|image|esx.pm:capture(472)|Entering ESX
Capture routine
2009-02-06 14:56:55|31543|93:93|image|esx.pm:capture(513)|found
previous name= esx3-lamp-1.3.2-x86-v0
2009-02-06 14:56:55|31543|93:93|image|esx.pm:capture(515)|SSHing to
node to configure currentimage.txt
2009-02-06 14:56:55|31543|93:93|image|utils.pm:run_ssh_command(6729)|
executing SSH command on test_vm: echo esx3-lamp-1.3.2-x86-v7 > /root/
currentimage.txt
2009-02-06 14:56:56|31543|93:93|image|utils.pm:run_ssh_command(6748)|
$?: 0, signal: 0, core dump: 0, exit status: 0
2009-02-06 14:56:56|31543|93:93|image|utils.pm:run_ssh_command(6796)|
run_ssh_command output: none
2009-02-06 14:56:56|31543|93:93|image|utils.pm:run_ssh_command(6799)|
SSH command executed on test_vm: echo esx3-lamp-1.3.2-x86-v7 > /root/
currentimage.txt, returning (0, output)
2009-02-06 14:56:56|31543|93:93|image|esx.pm:capture(524)|Power off
command: /usr/lib/vmware-viperl/apps/vm/vmcontrol.pl --server
'blade2-7.cnl.ncsu.edu' --vmname test_vm --operation poweroff --
username root --password 'anotherTestpassw0rd'
2009-02-06 14:56:57|31467|93:93|reserved|
utils.pm:check_connection(1828)|checking for connection by
administrator on test_vm, attempt 4
2009-02-06 14:56:57|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm:
netstat -an
2009-02-06 14:56:58|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:56:58|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm:
netstat -an, returning (0, output)
2009-02-06 14:56:58|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: who
2009-02-06 14:56:58|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 0, signal: 0, core dump: 0, exit
status: 0
2009-02-06 14:56:58|31467|93:93|reserved|
utils.pm:run_ssh_command(6796)|run_ssh_command output: admin pts/0 Feb
6 15:59 (152.14.244.35)
2009-02-06 14:56:58|31467|93:93|reserved|
utils.pm:run_ssh_command(6799)|SSH command executed on test_vm: who,
returning (0, output)
2009-02-06 14:57:03|31543|93:93|image|esx.pm:capture(527)|Powered off:
|31543|93:93|image| virtual machine 'test_vm' under host
blade2-7.cnl.ncsu.edu powered off
2009-02-06 14:57:03|31543|93:93|image|esx.pm:capture(529)|Waiting 5
seconds for power off
2009-02-06 14:57:07|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:57:07
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(539)|Preparing to
copy vmdk from /mnt/vcl/inuse/test_vm/esx3-lamp-1.3.2-x86-v0.vmdk to /
mnt/vcl/golden/esx3-lamp-1.3.2-x86-v7/esx3-lamp-1.3.2-x86-v7.vmdk
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(545)|COPIED VMDK
SUCCESSFULLY
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(549)|Preparing to
copy vmx from /mnt/vcl/inuse/test_vm/esx3-lamp-1.3.2-x86-v0.vmx to /
mnt/vcl/golden/esx3-lamp-1.3.2-x86-v7/esx3-lamp-1.3.2-x86-v7.vmx
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(555)|COPIED VMX
SUCCESSFULLY
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(558)|Rewriting
VMDK and VMX files with new image name
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(560)|VMX sed:
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(562)|VMDK sed:
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(568)|Preparing to
ssh to 152.14.17.78 copy vmdk-flat from /mnt/export/inuse/test_vm/esx3-
lamp-1.3.2-x86-v0-flat.vmdk to /mnt/export/golden/esx3-lamp-1.3.2-x86-
v7/esx3-lamp-1.3.2-x86-v7-flat.vmdk
2009-02-06 14:57:08|31543|93:93|image|esx.pm:capture(570)|SSHing to
copy vmdk-flat file
2009-02-06 14:57:18|31467|93:93|reserved|
utils.pm:check_connection(1828)|checking for connection by
administrator on test_vm, attempt 5
2009-02-06 14:57:18|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm:
netstat -an
2009-02-06 14:57:19|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:57:19
2009-02-06 14:57:31|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:57:31
2009-02-06 14:57:43|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:57:43
2009-02-06 14:57:55|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:57:55
2009-02-06 14:58:06|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:06|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 1/3: failed to execute SSH
command on test_vm: netstat -an, exit status: 255, SSH exits with the
exit status of the remote command or with 255 if an error occurred,
output:
|31467|93:93|reserved| ssh output (netstat -a...): ssh: connect to
host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1903)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:06|31467|93:93|reserved|
utils.pm:run_ssh_command(6733)|attempt 2/3: executing SSH command on
test_vm: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -
x test_vm 'netstat -an' 2>&1
2009-02-06 14:58:07|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:58:07
2009-02-06 14:58:09|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:09|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 2/3: failed to execute SSH
command on test_vm: netstat -an, exit status: 255, SSH exits with the
exit status of the remote command or with 255 if an error occurred,
output:
|31467|93:93|reserved| ssh output (netstat -a...): ssh: connect to
host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1903)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:09|31467|93:93|reserved|
utils.pm:run_ssh_command(6733)|attempt 3/3: executing SSH command on
test_vm: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -
x test_vm 'netstat -an' 2>&1
2009-02-06 14:58:12|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:12|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 3/3: failed to execute SSH
command on test_vm: netstat -an, exit status: 255, SSH exits with the
exit status of the remote command or with 255 if an error occurred,
output:
|31467|93:93|reserved| ssh output (netstat -a...): ssh: connect to
host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1903)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:12|31467|93:93|reserved|
utils.pm:run_ssh_command(6807)|failed to run SSH command after 3
attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l
root -p 22 -x test_vm 'netstat -an' 2>&1, exit status: 255, output:
|31467|93:93|reserved| ssh output (netstat -a...): ssh: connect to
host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1903)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:12|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: who
2009-02-06 14:58:15|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:15|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 1/3: failed to execute SSH
command on test_vm: who, exit status: 255, SSH exits with the exit
status of the remote command or with 255 if an error occurred, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:15|31467|93:93|reserved|
utils.pm:run_ssh_command(6733)|attempt 2/3: executing SSH command on
test_vm: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -
x test_vm 'who' 2>&1
2009-02-06 14:58:18|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:18|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 2/3: failed to execute SSH
command on test_vm: who, exit status: 255, SSH exits with the exit
status of the remote command or with 255 if an error occurred, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:18|31467|93:93|reserved|
utils.pm:run_ssh_command(6733)|attempt 3/3: executing SSH command on
test_vm: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -
x test_vm 'who' 2>&1
2009-02-06 14:58:19|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:58:19
2009-02-06 14:58:21|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:21|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 3/3: failed to execute SSH
command on test_vm: who, exit status: 255, SSH exits with the exit
status of the remote command or with 255 if an error occurred, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:21|31467|93:93|reserved|
utils.pm:run_ssh_command(6807)|failed to run SSH command after 3
attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l
root -p 22 -x test_vm 'who' 2>&1, exit status: 255, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:31|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:58:31
2009-02-06 14:58:41|31467|93:93|reserved|
utils.pm:check_connection(1828)|checking for connection by
administrator on test_vm, attempt 6
2009-02-06 14:58:41|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm:
netstat -an
2009-02-06 14:58:43|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:58:43
2009-02-06 14:58:44|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:44|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 1/3: failed to execute SSH
command on test_vm: netstat -an, exit status: 255, SSH exits with the
exit status of the remote command or with 255 if an error occurred,
output:
|31467|93:93|reserved| ssh output (netstat -a...): ssh: connect to
host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1903)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:44|31467|93:93|reserved|
utils.pm:run_ssh_command(6733)|attempt 2/3: executing SSH command on
test_vm: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -
x test_vm 'netstat -an' 2>&1
2009-02-06 14:58:47|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:47|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 2/3: failed to execute SSH
command on test_vm: netstat -an, exit status: 255, SSH exits with the
exit status of the remote command or with 255 if an error occurred,
output:
|31467|93:93|reserved| ssh output (netstat -a...): ssh: connect to
host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1903)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:47|31467|93:93|reserved|
utils.pm:run_ssh_command(6733)|attempt 3/3: executing SSH command on
test_vm: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -
x test_vm 'netstat -an' 2>&1
2009-02-06 14:58:50|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:50|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 3/3: failed to execute SSH
command on test_vm: netstat -an, exit status: 255, SSH exits with the
exit status of the remote command or with 255 if an error occurred,
output:
|31467|93:93|reserved| ssh output (netstat -a...): ssh: connect to
host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1903)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:50|31467|93:93|reserved|
utils.pm:run_ssh_command(6807)|failed to run SSH command after 3
attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l
root -p 22 -x test_vm 'netstat -an' 2>&1, exit status: 255, output:
|31467|93:93|reserved| ssh output (netstat -a...): ssh: connect to
host test_vm port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1903)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:50|31467|93:93|reserved|
utils.pm:run_ssh_command(6729)|executing SSH command on test_vm: who
2009-02-06 14:58:53|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:53|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 1/3: failed to execute SSH
command on test_vm: who, exit status: 255, SSH exits with the exit
status of the remote command or with 255 if an error occurred, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:53|31467|93:93|reserved|
utils.pm:run_ssh_command(6733)|attempt 2/3: executing SSH command on
test_vm: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -
x test_vm 'who' 2>&1
2009-02-06 14:58:55|26991|vcld:main(164)|lastcheckin time updated for
management node 1: 2009-02-06 14:58:55
2009-02-06 14:58:56|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:56|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 2/3: failed to execute SSH
command on test_vm: who, exit status: 255, SSH exits with the exit
status of the remote command or with 255 if an error occurred, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:58:56|31467|93:93|reserved|
utils.pm:run_ssh_command(6733)|attempt 3/3: executing SSH command on
test_vm: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -
x test_vm 'who' 2>&1
2009-02-06 14:58:59|31467|93:93|reserved|
utils.pm:run_ssh_command(6748)|$?: 65280, signal: 0, core dump: 0,
exit status: 255
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:59|31467|93:93|reserved|
utils.pm:run_ssh_command(6786)|attempt 3/3: failed to execute SSH
command on test_vm: who, exit status: 255, SSH exits with the exit
status of the remote command or with 255 if an error occurred, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6786)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:59|31467|93:93|reserved|
utils.pm:run_ssh_command(6807)|failed to run SSH command after 3
attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l
root -p 22 -x test_vm 'who' 2>&1, exit status: 255, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
2009-02-06 14:59:01|31543|93:93|image|image.pm:process(199)|esx3-
lamp-1.3.2-x86-v7 image files successfully saved
2009-02-06 14:59:01|31543|93:93|image|
utils.pm:update_request_state(2237)|request 93 state updated to:
completed, laststate to: image
2009-02-06 14:59:01|31543|93:93|image|image.pm:process(202)|request
state updated to completed, laststate to image
2009-02-06 14:59:01|31543|93:93|image|esx.pm:get_image_size(771)|
getting size of image: esx3-lamp-1.3.2-x86-v7
2009-02-06 14:59:01|31543|93:93|image|image.pm:process(211)|size of
esx3-lamp-1.3.2-x86-v7: 995
2009-02-06 14:59:01|31543|93:93|image|
DataStructure.pm:_automethod(628)|data structure updated: image_size =
995
2009-02-06 14:59:01|31543|93:93|image|image.pm:process(240)|image and
imagerevision tables updated for image=10, imagerevision=18, name=esx3-
lamp-1.3.2-x86-v7, lastupdate=2009-02-06 14:56:55, deleted=0, size=995
2009-02-06 14:59:01|31543|93:93|image|utils.pm:mail(1281)|SUCCESS --
Sending mail To: root@localhost, VCL -- ESX LAMP VM Image Creation
Succeeded
2009-02-06 14:59:01|31543|93:93|image|utils.pm:mail(1281)|SUCCESS --
Sending mail To: root@localhost, VCL IMAGE Creation Completed: esx3-
lamp-1.3.2-x86-v7
Brian Bouterse
Secure Open Systems Initiative
919.698.8796
On Feb 6, 2009, at 4:38 PM, Brian Bouterse wrote:
> Our testing steps are these:
>
> - make the reservation/request
> - wait for connect button
> - click connect button
> - connect to the machine
> - touch a file for uniqueness purposes
> - then you can hit the create image button
> - indicate we want to update the image (not create a new image from
> this one).
>
> Still looking into why the child create to handle the original
> imaging reservation/request continues to try to ssh continuously
> instead of marking the imaging event as in use like it should. That
> log is attached; it's towards the end, and really shows itself with
> repeated messages like:
>
> |31467|93:93|reserved| ---- WARNING ----
> |31467|93:93|reserved| 2009-02-06 14:58:21|31467|93:93|reserved|
> utils.pm:run_ssh_command(6807)|failed to run SSH command after 3
> attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l
> root -p 22 -x test_vm 'who' 2>&1, exit status: 255, output:
> |31467|93:93|reserved| ssh output (who): ssh: connect to host
> test_vm port 22: No route to host
> |31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
> |31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
> |31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
> |31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
> |31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
> |31467|93:93|reserved| (-5) vcld, main (line: 347)
>
>
> Brian
>
>
>
>
> Brian Bouterse
> Secure Open Systems Initiative
> 919.698.8796
>
>
>
>
> On Feb 6, 2009, at 3:41 PM, Aaron Peeler wrote:
>
>> What are your steps when you start the imaging process?
>>
>> You have to do the following:
>> - make the reservation/request
>> - wait for connect button
>> - click connect button
>> - connect to the machine
>> - then you can hit the create image button to start the imaging
>> process.
>>
>> If your in the reserved state as reported in the log file below -
>> the process is polling the node looking for the user to connect.
>> Once connected then it proceeds to the inuse state.
>>
>> This is actually a user interface bug. We're assuming users will
>> actually login to the remote node and install something - but in
>> your case if your just testing I can see going straight to the
>> image creation mode. Created JIRA issue VCL-84.
>>
>> Under normal operation you don't need to clean on the
>> computerloadlog. Since your developing and likely having to break
>> things, it's going to behave a little bit different.
>>
>> Aaron
>>
>> --On February 6, 2009 3:12:16 PM -0500 Andrew Brown <brownan@gmail.com
>> > wrote:
>>
>>> So we have things mostly working except for one detail: When we
>>> check out
>>> a computer for capture through "update image", all that goes fine.
>>> But
>>> when we go and hit "create" to actually capture the image, vcld
>>> hangs
>>> with the log messages that Brian posted, repeating.
>>>
>>> If we go in and remove the database entry in the computerloadlog
>>> table,
>>> the capture goes through; however, there seems to be a process that
>>> lingers and keeps trying to ssh to the machine even when our
>>> capture is
>>> proceeding. It gives us log entries like this:
>>> 2009-02-06
>>> 14:56:57|31467|93:93|reserved|utils.pm:check_connection(1828)|
>>> checking
>>> for connection by administrator on test_vm, attempt 4
>>> 2009-02-06
>>> 14:56:57|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|
>>> executing
>>> SSH command on test_vm: netstat -an
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
>>> signal: 0, core dump: 0, exit status: 0
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH
>>> command
>>> executed on test_vm: netstat -an, returning (0, output)
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|
>>> executing
>>> SSH command on test_vm: who
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
>>> signal: 0, core dump: 0, exit status: 0
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6796)|
>>> run_ssh_comm
>>> and output: admin pts/0 Feb 6 15:59 (152.14.244.35)
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH
>>> command
>>> executed on test_vm: who, returning (0, output)
>>> (the pid being different from the one that is going into our capture
>>> routine)
>>>
>>> This may be unrelated, I don't know. But we need to fix it so we
>>> don't
>>> need to manually delete the computerloadlog entry each time, as
>>> well as
>>> fix it so this second process... goes away or something.
>>>
>>> -Andrew
>>>
>>> On Fri, Feb 6, 2009 at 1:23 PM, Aaron Peeler <aa...@ncsu.edu>
>>> wrote:
>>>
>>>> Actually it's computerloadlog
>>>>
>>>> The computerloadlog table has three purposes.
>>>> It also has two dependent tables computerloadflow and
>>>> computerloadstate
>>>>
>>>>
>>>> 1) Provides information to the user about what state user request
>>>> is in,
>>>> where it's at in loading, performing post install tasks, adding
>>>> user
>>>> acct, etc. This information is seen on the current reservations
>>>> page by
>>>> the user clicking on the pending link. This link opens a ajax based
>>>> window and will provided the state the load is in. At some point
>>>> during
>>>> your load routine - you'll want to add these type of calls in the
>>>> appropriate places.
>>>>
>>>> insertloadlog($reservation_id, $computer_id, "startload",
>>>> "$computer_node_name $image_name");
>>>>
>>>> But this isn't required to get a provision module functional.
>>>> It's a nice
>>>> feature, so the user can be aware of what's going on. We'll put
>>>> together
>>>> information on how to best use the insertloadlog.
>>>>
>>>> 2) If a new|reload request is started and the computer state is
>>>> reload or
>>>> reloading. The forked request process will use that table to
>>>> determine
>>>> what to do. Either take over, fail, or let the current owning
>>>> process
>>>> complete.
>>>>
>>>> 3) It is also used in vcld to prevent duplicate or competing
>>>> processes.
>>>>
>>>> It gets cleaned out on a per-reservation basis. When a
>>>> reservations goes
>>>> into the inuse state and when a reservation is removed from the db.
>>>>
>>>> In your case, since your just testing your module, just manually
>>>> empty
>>>> the computerloadlog table as needed.
>>>>
>>>> Aaron
>>>>
>>>>
>>>>
>>>> --On February 6, 2009 11:20:47 AM -0500 Brian Bouterse
>>>> <bm...@ncsu.edu> wrote:
>>>>
>>>> What is the purpose of the loadlog table? Currently the ESX
>>>> provisioning
>>>>> module doesn't update any information in the loadlog table, and
>>>>> the
>>>>> provisioning works fine. We're working on the capture()
>>>>> portions now,
>>>>> and when we go through the web interface to have it "capture" our
>>>>> reservation, vcld warns saying:
>>>>>
>>>>> | 25052|72:72|image| ---- WARNING ----
>>>>> | 25052|72:72|image| 2009-02-06
>>>>> | 11:19:56|25052|72:72|image|vcld:main(272)|reservation 72 is
>>>>> already
>>>>> | being processed 25052|72:72|image| ( 0) utils.pm, notify
>>>>> (line: 683)
>>>>> | 25052|72:72|image| (-1) vcld, main (line: 272)
>>>>>
>>>>> Is this normal for the "image update" process?
>>>>>
>>>>> Best,
>>>>> Brian
>>>>>
>>>>> Brian Bouterse
>>>>> Secure Open Systems Initiative
>>>>> 919.698.8796
>>>>>
>>>>>
>>>>>
>>>>> \
>
Re: utils.pm::check_connection()
Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Tuesday February 10, 2009, Brian Bouterse wrote:
> An aside question .... Why do we require VCL to be aware of when the
> users connect in the first place? What if I want to reserve a server
> instance, but never want to SSH into it. Currently, VCL would not
> allow that use case because of these additional checks.
All I can answer is the aside; so, I cut the rest out.
The reason for the user check is limited resources. We don't want users
making a reservation for 4 hours, using it for 30 minutes, and then leaving
it sitting there unused for 3.5 hours. You can disable the user checks for
the "server instance" case you brought up by going to Manage Images->Edit
Image Profiles->Edit (for the desired image)->Advanced Options and set "Check
for logged in user" to no. However, even in this case, the user still needs
to click the Connect button at least once for that reservation.
Josh
- --
- -------------------------------
Josh Thompson
Systems Programmer
Virtual Computing Lab (VCL)
North Carolina State University
Josh_Thompson@ncsu.edu
919-515-5323
my GPG/PGP key can be found at pgp.mit.edu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFJkfXXV/LQcNdtPQMRAv3qAJ9HC+FA0UlG6BDjfE+JIGsIl6/K/wCeNVCL
9OXQdWbrcZZDj6fAcabJPvc=
=oDaU
-----END PGP SIGNATURE-----
Re: utils.pm::check_connection()
Posted by Aaron Peeler <aa...@ncsu.edu>.
I don't think we can/should depend on the 'who' output. I've experienced
some cases where 'who' cmd reported someone, but in-fact no-one was
connected. Maybe it's due to a zombie or orphaned process, I don't know.
But in any case I think the best thing is to first update the regex to
handle the case brought up, then plan moving to the OS modules - if needed.
Aaron
--On February 12, 2009 4:54:56 PM -0500 Andy Kurth <an...@ncsu.edu>
wrote:
> Thanks for figuring out why the regex wasn't working. You're right,
> check_connection() needs to be modularized and moved to OS modules.
>
> Brian Bouterse wrote:
>
>> Problem 1) The linux portions of the utils.pm::check_connection()
>> function currently use two checks, one 'netstat' command check and one
>> 'who' command check. With IPv6 hosts, the netstat returned lines don't
>> look the same, therefore the regex on the netstat return doesn't stand a
>> chance of working. I'm concerned that this type of regex is a losing
>> battle. What do you all think? The who seems to work more reliable,
>> could we just take netstat out?
>
> Seems fine to me but modifications to the 'who' output checking will need
> to be done. check_connection() is currently only searching the 'who'
> output for the reservation username. netstat is checking for any
> established connection to port 22.
>
> If 'who' is the only method used, I think you would need to check its
> output for *any* logged in user. An example where it would certainly
> fail if the current 'who' code was used alone would be for images
> configured with user groups. Anyone in the user group may be logged in
> rather than the reservation user. It wouldn't catch this.
>
> We definitely want to error on the lenient side. It may be safer to keep
> both but loosen up the regex to something like this:
> /tcp.*($ipaddress:22)\s+.*(ESTABLISHED)/
>
>
>> Problem 2) When a reservation is made, VCL gives us a username,
>> password, and an IP to log in with. We logon to the web portal with the
>> 'admin' username, but when VCL goes to determine if the user has logged
>> in yet, it checks for the user "administrator". Why are they different?
>> How do we get it to use the same username that was given to the user?
>
> This is due to the bug I noted earlier. If you make a Windows imaging
> reservation, the username is always Administrator. If you make a Linux
> imaging reservation, the username is always the normal reservation login
> name (user.unityid).
>
> reserved.pm is only checking the forimaging flag in order to determine
> which username to pass to check_connection(). It should be checking both
> the forimaging flag and the OS.
>
> Current (bad):
> if ($forimaging==1) {
> check_connection(user='Administrator');
> }
> else {
> check_connection(user=$user_unityid);
> }
>
> Should be:
> if ($forimaging==1 && OS==Windows) {
> check_connection(user='Administrator');
> }
> else {
> check_connection(user=$user_unityid);
> }
>
>
> Hope this helps,
> Andy
Aaron Peeler
OIT Advanced Computing
College of Engineering-NCSU
919.513.4571
http://vcl.ncsu.edu
Re: utils.pm::check_connection()
Posted by Andy Kurth <an...@ncsu.edu>.
Looks OK to me. I also just realized reserved.pm isn't using the DataStructure
object, but the old $request_data->{} hash. I'll work on this in the near future.
-Andy
Andrew Brown wrote:
> We've made your suggested change, and verified that it works. Below is the
> exact change we made. Brian has just created issue VCL-101 for this, and
> unless there are any objections, we'll commit this patch.
>
> This still doesn't address the issue with netstat and the invalid regex, but
> at least logged in users are detected now with the who command when imaging.
> -Andrew
>
> Index: reserved.pm
> ===================================================================
> --- reserved.pm (revision 750099)
> +++ reserved.pm (working copy)
> @@ -417,12 +417,12 @@
> notify($ERRORS{'OK'}, 0, "checkuser flag is set to 1,
> checking user connection");
> # Check for the normal user ID if this isn't an imaging
> request
> # Check for "administrator" if this is an imaging request
> - if ($request_forimaging) {
> - notify($ERRORS{'OK'}, 0, "forimaging flag is set to
> 1, checking for connection by administrator");
> + if ($request_forimaging && $image_os_name =~
> /win|vmwarewin/) {
> + notify($ERRORS{'OK'}, 0, "forimaging flag is set to
> 1 and imageosname is $image_os_name, checking for connection by
> administrator");
> $retval_conn = check_connection($nodename,
> $computer_ip_address, $computer_type, $remote_ip, $time_limit,
> $image_os_name, 0, $request_id, "administrator",$image_os_type);
> }
> else {
> - notify($ERRORS{'OK'}, 0, "forimaging flag is set to
> 0, checking for connection by $user_unityid");
> + notify($ERRORS{'OK'}, 0, "forimaging flag is set to
> $request_forimaging and imageosname is $image_os_name, checking for
> connection by $user_unityid");
> $retval_conn = check_connection($nodename,
> $computer_ip_address, $computer_type, $remote_ip, $time_limit,
> $image_os_name, 0, $request_id, $user_unityid,$image_os_type);
> }
> } ## end else [ if (!$imagemeta_checkuser)
>
--
Andy Kurth
Virtual Computing Lab
Office of Information Technology
North Carolina State University
andy_kurth@ncsu.edu
919.513.4090
Re: utils.pm::check_connection()
Posted by Andrew Brown <br...@gmail.com>.
>
> Problem 2) When a reservation is made, VCL gives us a username, password,
>> and an IP to log in with. We logon to the web portal with the 'admin'
>> username, but when VCL goes to determine if the user has logged in yet, it
>> checks for the user "administrator". Why are they different? How do we get
>> it to use the same username that was given to the user?
>>
>
> This is due to the bug I noted earlier. If you make a Windows imaging
> reservation, the username is always Administrator. If you make a Linux
> imaging reservation, the username is always the normal reservation login
> name (user.unityid).
>
We've made your suggested change, and verified that it works. Below is the
exact change we made. Brian has just created issue VCL-101 for this, and
unless there are any objections, we'll commit this patch.
This still doesn't address the issue with netstat and the invalid regex, but
at least logged in users are detected now with the who command when imaging.
-Andrew
Index: reserved.pm
===================================================================
--- reserved.pm (revision 750099)
+++ reserved.pm (working copy)
@@ -417,12 +417,12 @@
notify($ERRORS{'OK'}, 0, "checkuser flag is set to 1,
checking user connection");
# Check for the normal user ID if this isn't an imaging
request
# Check for "administrator" if this is an imaging request
- if ($request_forimaging) {
- notify($ERRORS{'OK'}, 0, "forimaging flag is set to
1, checking for connection by administrator");
+ if ($request_forimaging && $image_os_name =~
/win|vmwarewin/) {
+ notify($ERRORS{'OK'}, 0, "forimaging flag is set to
1 and imageosname is $image_os_name, checking for connection by
administrator");
$retval_conn = check_connection($nodename,
$computer_ip_address, $computer_type, $remote_ip, $time_limit,
$image_os_name, 0, $request_id, "administrator",$image_os_type);
}
else {
- notify($ERRORS{'OK'}, 0, "forimaging flag is set to
0, checking for connection by $user_unityid");
+ notify($ERRORS{'OK'}, 0, "forimaging flag is set to
$request_forimaging and imageosname is $image_os_name, checking for
connection by $user_unityid");
$retval_conn = check_connection($nodename,
$computer_ip_address, $computer_type, $remote_ip, $time_limit,
$image_os_name, 0, $request_id, $user_unityid,$image_os_type);
}
} ## end else [ if (!$imagemeta_checkuser)
Re: utils.pm::check_connection()
Posted by Andy Kurth <an...@ncsu.edu>.
Thanks for figuring out why the regex wasn't working. You're right,
check_connection() needs to be modularized and moved to OS modules.
Brian Bouterse wrote:
> Problem 1) The linux portions of the utils.pm::check_connection()
> function currently use two checks, one 'netstat' command check and one
> 'who' command check. With IPv6 hosts, the netstat returned lines don't
> look the same, therefore the regex on the netstat return doesn't stand a
> chance of working. I'm concerned that this type of regex is a losing
> battle. What do you all think? The who seems to work more reliable,
> could we just take netstat out?
Seems fine to me but modifications to the 'who' output checking will need to be
done. check_connection() is currently only searching the 'who' output for the
reservation username. netstat is checking for any established connection to
port 22.
If 'who' is the only method used, I think you would need to check its output for
*any* logged in user. An example where it would certainly fail if the current
'who' code was used alone would be for images configured with user groups.
Anyone in the user group may be logged in rather than the reservation user. It
wouldn't catch this.
We definitely want to error on the lenient side. It may be safer to keep both
but loosen up the regex to something like this:
/tcp.*($ipaddress:22)\s+.*(ESTABLISHED)/
> Problem 2) When a reservation is made, VCL gives us a username,
> password, and an IP to log in with. We logon to the web portal with the
> 'admin' username, but when VCL goes to determine if the user has logged
> in yet, it checks for the user "administrator". Why are they different?
> How do we get it to use the same username that was given to the user?
This is due to the bug I noted earlier. If you make a Windows imaging
reservation, the username is always Administrator. If you make a Linux imaging
reservation, the username is always the normal reservation login name
(user.unityid).
reserved.pm is only checking the forimaging flag in order to determine which
username to pass to check_connection(). It should be checking both the
forimaging flag and the OS.
Current (bad):
if ($forimaging==1) {
check_connection(user='Administrator');
}
else {
check_connection(user=$user_unityid);
}
Should be:
if ($forimaging==1 && OS==Windows) {
check_connection(user='Administrator');
}
else {
check_connection(user=$user_unityid);
}
Hope this helps,
Andy
utils.pm::check_connection()
Posted by Brian Bouterse <bm...@ncsu.edu>.
The goal of the utils.pm::check_connection() function is to answer the
question "has a user connected to the provisioned image or not?" It
involves ssh'ing to the OS and asking it to list the current
connections to the OS. This contains OS dependant stuff, and the
check_connection() function has not been modularized into the OS
modules yet, so it sits entirely in utils.pm::check_connection().
There are two problems on the table currently:
Problem 1) The linux portions of the utils.pm::check_connection()
function currently use two checks, one 'netstat' command check and one
'who' command check. With IPv6 hosts, the netstat returned lines
don't look the same, therefore the regex on the netstat return doesn't
stand a chance of working. I'm concerned that this type of regex is a
losing battle. What do you all think? The who seems to work more
reliable, could we just take netstat out?
An aside question .... Why do we require VCL to be aware of when the
users connect in the first place? What if I want to reserve a server
instance, but never want to SSH into it. Currently, VCL would not
allow that use case because of these additional checks.
Problem 2) When a reservation is made, VCL gives us a username,
password, and an IP to log in with. We logon to the web portal with
the 'admin' username, but when VCL goes to determine if the user has
logged in yet, it checks for the user "administrator". Why are they
different? How do we get it to use the same username that was given to
the user?
Thanks!
Brian
Brian Bouterse
Secure Open Systems Initiative
919.698.8796
On Feb 10, 2009, at 1:24 PM, Brian Bouterse wrote:
> When running the netstat -an command on our reserved machine we get
> the following line:
>
> tcp 0 0 ::ffff:152.14.17.82:22 ::ffff:152.14.243:50257
> ESTABLISHED
>
> The ::ffff: explains why the regex below doesn't match. Any
> comments on what to do about this? Change the regex?
>
> /tcp\s+([0-9]*)\s+([0-9]*)\s($ipaddress:22)\s+([.0-9]*):([0-9]*)(.*)
> (ESTABLISHED)/
>
> Brian
>
>
> Brian Bouterse
> Secure Open Systems Initiative
> 919.698.8796
>
>
>
>
> On Feb 7, 2009, at 12:15 PM, Andy Kurth wrote:
>
>> The reserved process is not detecting that you have logged in to
>> the node. The problem is occurring in
>> utils.pm::check_connection(). For Linux images, the subroutine
>> attempts to determine if the user is logged in via "netstat -an"
>> and "who".
>>
>> The "who" command is failing to determine that you logged in
>> because the login name you're using is "admin" instead of
>> "administrator". I think you'll be able to get past this problem
>> if you change your reservation login name (user.unityid) to
>> "administrator".
>>
>> (Note: There's a bug in reserved.pm which is causing this. It is
>> passing "Administrator" instead of the correct user name to
>> check_connection() if forimaging=1, OS=Linux, and userstandalone=1.)
>>
>> The "netstat -an" command is failing because the regex in
>> utils.pm::check_connection() isn't finding a match. Here's the
>> regex and what the "netstat -an" output line looks like when I
>> connect to a Linux image via SSH:
>>
>> /tcp\s+([0-9]*)\s+([0-9]*)\s($ipaddress:22)\s+([.0-9]*):([0-9]*)(.*)
>> (ESTABLISHED)/
>>
>> tcp 0 0 152.46.18.198:22
>> 69.111.94.75:51803 ESTABLISHED
>>
>> I'm not sure exactly why it isn't finding a match because the
>> netstat output isn't displayed in the log file. To troubleshoot
>> this, first remove the last "1" argument from this command near
>> line 1911 in utils.pm:
>> run_ssh_command($shortnodename, $identity, "netstat -an", "root",
>> 22, 1);
>>
>> The last argument instructs run_ssh_command to not display the
>> command output. This was added because netstat returns a lot of
>> output and is called many times when a request is in the reserved
>> state. The output was causing the log file to grow quickly.
>>
>> Once you see the netstat output in the log file you should be able
>> to tell why the regex isn't matching. The $ipaddress variable is
>> the IP of the node. In your log file, it is 152.14.17.82. The 4th
>> grouping returned by the regex should match your remote IP, which
>> was 152.14.244.35 in your log file. You should expect to see a
>> netstat line like this:
>> tcp 0 0 152.14.17.82:22
>> 152.14.244.35:51803 ESTABLISHED
>>
>> The "connect to host test_vm port 22: No route to host" failures at
>> the end of the log file begin to occur after you manually delete
>> the computerloadlog 'begin' entry, the image.pm process is then
>> able to start up, then this is called:
>>
>> 2009-02-06 14:56:56|31543|93:93|image|esx.pm:capture(524)|Power off
>> command: /usr/lib/vmware-viperl/apps/vm/vmcontrol.pl --server
>> 'blade2-7.
>> cnl.ncsu.edu' --vmname test_vm --operation poweroff --username root
>> --password 'xxx'
>>
>> 2009-02-06 14:57:03|31543|93:93|image|esx.pm:capture(527)|Powered
>> off:
>> |31543|93:93|image| virtual machine 'test_vm' under host
>> blade2-7.cnl.ncsu.edu powered off
>>
>> Hope this helps,
>> Andy
>>
>> Brian Bouterse wrote:
>>> Our testing steps are these:
>>> - make the reservation/request
>>> - wait for connect button
>>> - click connect button
>>> - connect to the machine
>>> - touch a file for uniqueness purposes
>>> - then you can hit the create image button
>>> - indicate we want to update the image (not create a new image
>>> from this one).
>>> Still looking into why the child create to handle the original
>>> imaging reservation/request continues to try to ssh continuously
>>> instead of marking the imaging event as in use like it should.
>>> That log is attached; it's towards the end, and really shows
>>> itself with repeated messages like:
>>> |31467|93:93|reserved| ---- WARNING ----
>>> |31467|93:93|reserved| 2009-02-06 14:58:21|31467|93:93|reserved|
>>> utils.pm:run_ssh_command(6807)|failed to run SSH command after 3
>>> attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -
>>> l root -p 22 -x test_vm 'who' 2>&1, exit status: 255, output:
>>> |31467|93:93|reserved| ssh output (who): ssh: connect to host
>>> test_vm port 22: No route to host
>>> |31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
>>> |31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
>>> |31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
>>> |31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
>>> |31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
>>> |31467|93:93|reserved| (-5) vcld, main (line: 347)
>>> Brian
>>> Brian Bouterse
>>> Secure Open Systems Initiative
>>> 919.698.8796
>>> On Feb 6, 2009, at 3:41 PM, Aaron Peeler wrote:
>>>> What are your steps when you start the imaging process?
>>>>
>>>> You have to do the following:
>>>> - make the reservation/request
>>>> - wait for connect button
>>>> - click connect button
>>>> - connect to the machine
>>>> - then you can hit the create image button to start the imaging
>>>> process.
>>>>
>>>> If your in the reserved state as reported in the log file below -
>>>> the process is polling the node looking for the user to connect.
>>>> Once connected then it proceeds to the inuse state.
>>>>
>>>> This is actually a user interface bug. We're assuming users will
>>>> actually login to the remote node and install something - but in
>>>> your case if your just testing I can see going straight to the
>>>> image creation mode. Created JIRA issue VCL-84.
>>>>
>>>> Under normal operation you don't need to clean on the
>>>> computerloadlog. Since your developing and likely having to break
>>>> things, it's going to behave a little bit different.
>>>>
>>>> Aaron
>>>>
>>>> --On February 6, 2009 3:12:16 PM -0500 Andrew Brown <brownan@gmail.com
>>>> > wrote:
>>>>
>>>>> So we have things mostly working except for one detail: When we
>>>>> check out
>>>>> a computer for capture through "update image", all that goes
>>>>> fine. But
>>>>> when we go and hit "create" to actually capture the image, vcld
>>>>> hangs
>>>>> with the log messages that Brian posted, repeating.
>>>>>
>>>>> If we go in and remove the database entry in the computerloadlog
>>>>> table,
>>>>> the capture goes through; however, there seems to be a process
>>>>> that
>>>>> lingers and keeps trying to ssh to the machine even when our
>>>>> capture is
>>>>> proceeding. It gives us log entries like this:
>>>>> 2009-02-06
>>>>> 14:56:57|31467|93:93|reserved|utils.pm:check_connection(1828)|
>>>>> checking
>>>>> for connection by administrator on test_vm, attempt 4
>>>>> 2009-02-06
>>>>> 14:56:57|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|
>>>>> executing
>>>>> SSH command on test_vm: netstat -an
>>>>> 2009-02-06
>>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?:
>>>>> 0,
>>>>> signal: 0, core dump: 0, exit status: 0
>>>>> 2009-02-06
>>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH
>>>>> command
>>>>> executed on test_vm: netstat -an, returning (0, output)
>>>>> 2009-02-06
>>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|
>>>>> executing
>>>>> SSH command on test_vm: who
>>>>> 2009-02-06
>>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?:
>>>>> 0,
>>>>> signal: 0, core dump: 0, exit status: 0
>>>>> 2009-02-06
>>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6796)|
>>>>> run_ssh_comm
>>>>> and output: admin pts/0 Feb 6 15:59 (152.14.244.35)
>>>>> 2009-02-06
>>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH
>>>>> command
>>>>> executed on test_vm: who, returning (0, output)
>>>>> (the pid being different from the one that is going into our
>>>>> capture
>>>>> routine)
>>>>>
>>>>> This may be unrelated, I don't know. But we need to fix it so we
>>>>> don't
>>>>> need to manually delete the computerloadlog entry each time, as
>>>>> well as
>>>>> fix it so this second process... goes away or something.
>>>>>
>>>>> -Andrew
>>>>>
>>>>> On Fri, Feb 6, 2009 at 1:23 PM, Aaron Peeler <aaron_peeler@ncsu.edu
>>>>> >
>>>>> wrote:
>>>>>
>>>>>> Actually it's computerloadlog
>>>>>>
>>>>>> The computerloadlog table has three purposes.
>>>>>> It also has two dependent tables computerloadflow and
>>>>>> computerloadstate
>>>>>>
>>>>>>
>>>>>> 1) Provides information to the user about what state user
>>>>>> request is in,
>>>>>> where it's at in loading, performing post install tasks, adding
>>>>>> user
>>>>>> acct, etc. This information is seen on the current reservations
>>>>>> page by
>>>>>> the user clicking on the pending link. This link opens a ajax
>>>>>> based
>>>>>> window and will provided the state the load is in. At some
>>>>>> point during
>>>>>> your load routine - you'll want to add these type of calls in the
>>>>>> appropriate places.
>>>>>>
>>>>>> insertloadlog($reservation_id, $computer_id, "startload",
>>>>>> "$computer_node_name $image_name");
>>>>>>
>>>>>> But this isn't required to get a provision module functional.
>>>>>> It's a nice
>>>>>> feature, so the user can be aware of what's going on. We'll put
>>>>>> together
>>>>>> information on how to best use the insertloadlog.
>>>>>>
>>>>>> 2) If a new|reload request is started and the computer state is
>>>>>> reload or
>>>>>> reloading. The forked request process will use that table to
>>>>>> determine
>>>>>> what to do. Either take over, fail, or let the current owning
>>>>>> process
>>>>>> complete.
>>>>>>
>>>>>> 3) It is also used in vcld to prevent duplicate or competing
>>>>>> processes.
>>>>>>
>>>>>> It gets cleaned out on a per-reservation basis. When a
>>>>>> reservations goes
>>>>>> into the inuse state and when a reservation is removed from the
>>>>>> db.
>>>>>>
>>>>>> In your case, since your just testing your module, just
>>>>>> manually empty
>>>>>> the computerloadlog table as needed.
>>>>>>
>>>>>> Aaron
>>>>>>
>>>>>>
>>>>>>
>>>>>> --On February 6, 2009 11:20:47 AM -0500 Brian Bouterse
>>>>>> <bm...@ncsu.edu> wrote:
>>>>>>
>>>>>> What is the purpose of the loadlog table? Currently the ESX
>>>>>> provisioning
>>>>>>> module doesn't update any information in the loadlog table,
>>>>>>> and the
>>>>>>> provisioning works fine. We're working on the capture()
>>>>>>> portions now,
>>>>>>> and when we go through the web interface to have it "capture"
>>>>>>> our
>>>>>>> reservation, vcld warns saying:
>>>>>>>
>>>>>>> | 25052|72:72|image| ---- WARNING ----
>>>>>>> | 25052|72:72|image| 2009-02-06
>>>>>>> | 11:19:56|25052|72:72|image|vcld:main(272)|reservation 72 is
>>>>>>> already
>>>>>>> | being processed 25052|72:72|image| ( 0) utils.pm, notify
>>>>>>> (line: 683)
>>>>>>> | 25052|72:72|image| (-1) vcld, main (line: 272)
>>>>>>>
>>>>>>> Is this normal for the "image update" process?
>>>>>>>
>>>>>>> Best,
>>>>>>> Brian
>>>>>>>
>>>>>>> Brian Bouterse
>>>>>>> Secure Open Systems Initiative
>>>>>>> 919.698.8796
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> \
>>
>
Re: role of loadlog table - computerloadlog
Posted by Brian Bouterse <bm...@ncsu.edu>.
When running the netstat -an command on our reserved machine we get
the following line:
tcp 0 0 ::ffff:152.14.17.82:22 ::ffff:152.14.243:50257
ESTABLISHED
The ::ffff: explains why the regex below doesn't match. Any comments
on what to do about this? Change the regex?
/tcp\s+([0-9]*)\s+([0-9]*)\s($ipaddress:22)\s+([.0-9]*):([0-9]*)(.*)
(ESTABLISHED)/
Brian
Brian Bouterse
Secure Open Systems Initiative
919.698.8796
On Feb 7, 2009, at 12:15 PM, Andy Kurth wrote:
> The reserved process is not detecting that you have logged in to the
> node. The problem is occurring in utils.pm::check_connection().
> For Linux images, the subroutine attempts to determine if the user
> is logged in via "netstat -an" and "who".
>
> The "who" command is failing to determine that you logged in because
> the login name you're using is "admin" instead of "administrator".
> I think you'll be able to get past this problem if you change your
> reservation login name (user.unityid) to "administrator".
>
> (Note: There's a bug in reserved.pm which is causing this. It is
> passing "Administrator" instead of the correct user name to
> check_connection() if forimaging=1, OS=Linux, and userstandalone=1.)
>
> The "netstat -an" command is failing because the regex in
> utils.pm::check_connection() isn't finding a match. Here's the
> regex and what the "netstat -an" output line looks like when I
> connect to a Linux image via SSH:
>
> /tcp\s+([0-9]*)\s+([0-9]*)\s($ipaddress:22)\s+([.0-9]*):([0-9]*)(.*)
> (ESTABLISHED)/
>
> tcp 0 0 152.46.18.198:22
> 69.111.94.75:51803 ESTABLISHED
>
> I'm not sure exactly why it isn't finding a match because the
> netstat output isn't displayed in the log file. To troubleshoot
> this, first remove the last "1" argument from this command near line
> 1911 in utils.pm:
> run_ssh_command($shortnodename, $identity, "netstat -an", "root",
> 22, 1);
>
> The last argument instructs run_ssh_command to not display the
> command output. This was added because netstat returns a lot of
> output and is called many times when a request is in the reserved
> state. The output was causing the log file to grow quickly.
>
> Once you see the netstat output in the log file you should be able
> to tell why the regex isn't matching. The $ipaddress variable is
> the IP of the node. In your log file, it is 152.14.17.82. The 4th
> grouping returned by the regex should match your remote IP, which
> was 152.14.244.35 in your log file. You should expect to see a
> netstat line like this:
> tcp 0 0 152.14.17.82:22
> 152.14.244.35:51803 ESTABLISHED
>
> The "connect to host test_vm port 22: No route to host" failures at
> the end of the log file begin to occur after you manually delete the
> computerloadlog 'begin' entry, the image.pm process is then able to
> start up, then this is called:
>
> 2009-02-06 14:56:56|31543|93:93|image|esx.pm:capture(524)|Power off
> command: /usr/lib/vmware-viperl/apps/vm/vmcontrol.pl --server
> 'blade2-7.
> cnl.ncsu.edu' --vmname test_vm --operation poweroff --username root
> --password 'xxx'
>
> 2009-02-06 14:57:03|31543|93:93|image|esx.pm:capture(527)|Powered off:
> |31543|93:93|image| virtual machine 'test_vm' under host
> blade2-7.cnl.ncsu.edu powered off
>
> Hope this helps,
> Andy
>
> Brian Bouterse wrote:
>> Our testing steps are these:
>> - make the reservation/request
>> - wait for connect button
>> - click connect button
>> - connect to the machine
>> - touch a file for uniqueness purposes
>> - then you can hit the create image button
>> - indicate we want to update the image (not create a new image from
>> this one).
>> Still looking into why the child create to handle the original
>> imaging reservation/request continues to try to ssh continuously
>> instead of marking the imaging event as in use like it should.
>> That log is attached; it's towards the end, and really shows itself
>> with repeated messages like:
>> |31467|93:93|reserved| ---- WARNING ----
>> |31467|93:93|reserved| 2009-02-06 14:58:21|31467|93:93|reserved|
>> utils.pm:run_ssh_command(6807)|failed to run SSH command after 3
>> attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -
>> l root -p 22 -x test_vm 'who' 2>&1, exit status: 255, output:
>> |31467|93:93|reserved| ssh output (who): ssh: connect to host
>> test_vm port 22: No route to host
>> |31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
>> |31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
>> |31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
>> |31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
>> |31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
>> |31467|93:93|reserved| (-5) vcld, main (line: 347)
>> Brian
>> Brian Bouterse
>> Secure Open Systems Initiative
>> 919.698.8796
>> On Feb 6, 2009, at 3:41 PM, Aaron Peeler wrote:
>>> What are your steps when you start the imaging process?
>>>
>>> You have to do the following:
>>> - make the reservation/request
>>> - wait for connect button
>>> - click connect button
>>> - connect to the machine
>>> - then you can hit the create image button to start the imaging
>>> process.
>>>
>>> If your in the reserved state as reported in the log file below -
>>> the process is polling the node looking for the user to connect.
>>> Once connected then it proceeds to the inuse state.
>>>
>>> This is actually a user interface bug. We're assuming users will
>>> actually login to the remote node and install something - but in
>>> your case if your just testing I can see going straight to the
>>> image creation mode. Created JIRA issue VCL-84.
>>>
>>> Under normal operation you don't need to clean on the
>>> computerloadlog. Since your developing and likely having to break
>>> things, it's going to behave a little bit different.
>>>
>>> Aaron
>>>
>>> --On February 6, 2009 3:12:16 PM -0500 Andrew Brown <brownan@gmail.com
>>> > wrote:
>>>
>>>> So we have things mostly working except for one detail: When we
>>>> check out
>>>> a computer for capture through "update image", all that goes
>>>> fine. But
>>>> when we go and hit "create" to actually capture the image, vcld
>>>> hangs
>>>> with the log messages that Brian posted, repeating.
>>>>
>>>> If we go in and remove the database entry in the computerloadlog
>>>> table,
>>>> the capture goes through; however, there seems to be a process that
>>>> lingers and keeps trying to ssh to the machine even when our
>>>> capture is
>>>> proceeding. It gives us log entries like this:
>>>> 2009-02-06
>>>> 14:56:57|31467|93:93|reserved|utils.pm:check_connection(1828)|
>>>> checking
>>>> for connection by administrator on test_vm, attempt 4
>>>> 2009-02-06
>>>> 14:56:57|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|
>>>> executing
>>>> SSH command on test_vm: netstat -an
>>>> 2009-02-06
>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
>>>> signal: 0, core dump: 0, exit status: 0
>>>> 2009-02-06
>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH
>>>> command
>>>> executed on test_vm: netstat -an, returning (0, output)
>>>> 2009-02-06
>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|
>>>> executing
>>>> SSH command on test_vm: who
>>>> 2009-02-06
>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
>>>> signal: 0, core dump: 0, exit status: 0
>>>> 2009-02-06
>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6796)|
>>>> run_ssh_comm
>>>> and output: admin pts/0 Feb 6 15:59 (152.14.244.35)
>>>> 2009-02-06
>>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH
>>>> command
>>>> executed on test_vm: who, returning (0, output)
>>>> (the pid being different from the one that is going into our
>>>> capture
>>>> routine)
>>>>
>>>> This may be unrelated, I don't know. But we need to fix it so we
>>>> don't
>>>> need to manually delete the computerloadlog entry each time, as
>>>> well as
>>>> fix it so this second process... goes away or something.
>>>>
>>>> -Andrew
>>>>
>>>> On Fri, Feb 6, 2009 at 1:23 PM, Aaron Peeler
>>>> <aa...@ncsu.edu>
>>>> wrote:
>>>>
>>>>> Actually it's computerloadlog
>>>>>
>>>>> The computerloadlog table has three purposes.
>>>>> It also has two dependent tables computerloadflow and
>>>>> computerloadstate
>>>>>
>>>>>
>>>>> 1) Provides information to the user about what state user
>>>>> request is in,
>>>>> where it's at in loading, performing post install tasks, adding
>>>>> user
>>>>> acct, etc. This information is seen on the current reservations
>>>>> page by
>>>>> the user clicking on the pending link. This link opens a ajax
>>>>> based
>>>>> window and will provided the state the load is in. At some point
>>>>> during
>>>>> your load routine - you'll want to add these type of calls in the
>>>>> appropriate places.
>>>>>
>>>>> insertloadlog($reservation_id, $computer_id, "startload",
>>>>> "$computer_node_name $image_name");
>>>>>
>>>>> But this isn't required to get a provision module functional.
>>>>> It's a nice
>>>>> feature, so the user can be aware of what's going on. We'll put
>>>>> together
>>>>> information on how to best use the insertloadlog.
>>>>>
>>>>> 2) If a new|reload request is started and the computer state is
>>>>> reload or
>>>>> reloading. The forked request process will use that table to
>>>>> determine
>>>>> what to do. Either take over, fail, or let the current owning
>>>>> process
>>>>> complete.
>>>>>
>>>>> 3) It is also used in vcld to prevent duplicate or competing
>>>>> processes.
>>>>>
>>>>> It gets cleaned out on a per-reservation basis. When a
>>>>> reservations goes
>>>>> into the inuse state and when a reservation is removed from the
>>>>> db.
>>>>>
>>>>> In your case, since your just testing your module, just manually
>>>>> empty
>>>>> the computerloadlog table as needed.
>>>>>
>>>>> Aaron
>>>>>
>>>>>
>>>>>
>>>>> --On February 6, 2009 11:20:47 AM -0500 Brian Bouterse
>>>>> <bm...@ncsu.edu> wrote:
>>>>>
>>>>> What is the purpose of the loadlog table? Currently the ESX
>>>>> provisioning
>>>>>> module doesn't update any information in the loadlog table, and
>>>>>> the
>>>>>> provisioning works fine. We're working on the capture()
>>>>>> portions now,
>>>>>> and when we go through the web interface to have it "capture" our
>>>>>> reservation, vcld warns saying:
>>>>>>
>>>>>> | 25052|72:72|image| ---- WARNING ----
>>>>>> | 25052|72:72|image| 2009-02-06
>>>>>> | 11:19:56|25052|72:72|image|vcld:main(272)|reservation 72 is
>>>>>> already
>>>>>> | being processed 25052|72:72|image| ( 0) utils.pm, notify
>>>>>> (line: 683)
>>>>>> | 25052|72:72|image| (-1) vcld, main (line: 272)
>>>>>>
>>>>>> Is this normal for the "image update" process?
>>>>>>
>>>>>> Best,
>>>>>> Brian
>>>>>>
>>>>>> Brian Bouterse
>>>>>> Secure Open Systems Initiative
>>>>>> 919.698.8796
>>>>>>
>>>>>>
>>>>>>
>>>>>> \
>
Re: role of loadlog table - computerloadlog
Posted by Andy Kurth <an...@ncsu.edu>.
The reserved process is not detecting that you have logged in to the
node. The problem is occurring in utils.pm::check_connection(). For
Linux images, the subroutine attempts to determine if the user is logged
in via "netstat -an" and "who".
The "who" command is failing to determine that you logged in because the
login name you're using is "admin" instead of "administrator". I think
you'll be able to get past this problem if you change your reservation
login name (user.unityid) to "administrator".
(Note: There's a bug in reserved.pm which is causing this. It is
passing "Administrator" instead of the correct user name to
check_connection() if forimaging=1, OS=Linux, and userstandalone=1.)
The "netstat -an" command is failing because the regex in
utils.pm::check_connection() isn't finding a match. Here's the regex
and what the "netstat -an" output line looks like when I connect to a
Linux image via SSH:
/tcp\s+([0-9]*)\s+([0-9]*)\s($ipaddress:22)\s+([.0-9]*):([0-9]*)(.*)(ESTABLISHED)/
tcp 0 0 152.46.18.198:22 69.111.94.75:51803
ESTABLISHED
I'm not sure exactly why it isn't finding a match because the netstat
output isn't displayed in the log file. To troubleshoot this, first
remove the last "1" argument from this command near line 1911 in utils.pm:
run_ssh_command($shortnodename, $identity, "netstat -an", "root", 22, 1);
The last argument instructs run_ssh_command to not display the command
output. This was added because netstat returns a lot of output and is
called many times when a request is in the reserved state. The output
was causing the log file to grow quickly.
Once you see the netstat output in the log file you should be able to
tell why the regex isn't matching. The $ipaddress variable is the IP of
the node. In your log file, it is 152.14.17.82. The 4th grouping
returned by the regex should match your remote IP, which was
152.14.244.35 in your log file. You should expect to see a netstat line
like this:
tcp 0 0 152.14.17.82:22 152.14.244.35:51803
ESTABLISHED
The "connect to host test_vm port 22: No route to host" failures at the
end of the log file begin to occur after you manually delete the
computerloadlog 'begin' entry, the image.pm process is then able to
start up, then this is called:
2009-02-06 14:56:56|31543|93:93|image|esx.pm:capture(524)|Power off
command: /usr/lib/vmware-viperl/apps/vm/vmcontrol.pl --server 'blade2-7.
cnl.ncsu.edu' --vmname test_vm --operation poweroff --username root
--password 'xxx'
2009-02-06 14:57:03|31543|93:93|image|esx.pm:capture(527)|Powered off:
|31543|93:93|image| virtual machine 'test_vm' under host
blade2-7.cnl.ncsu.edu powered off
Hope this helps,
Andy
Brian Bouterse wrote:
> Our testing steps are these:
>
> - make the reservation/request
> - wait for connect button
> - click connect button
> - connect to the machine
> - touch a file for uniqueness purposes
> - then you can hit the create image button
> - indicate we want to update the image (not create a new image from this
> one).
>
> Still looking into why the child create to handle the original imaging
> reservation/request continues to try to ssh continuously instead of
> marking the imaging event as in use like it should. That log is
> attached; it's towards the end, and really shows itself with repeated
> messages like:
>
> |31467|93:93|reserved| ---- WARNING ----
> |31467|93:93|reserved| 2009-02-06
> 14:58:21|31467|93:93|reserved|utils.pm:run_ssh_command(6807)|failed to
> run SSH command after 3 attempts, command: /usr/bin/ssh -i
> /etc/vcl/bladelinuxkey_id_rsa -l root -p 22 -x test_vm 'who' 2>&1, exit
> status: 255, output:
> |31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
> port 22: No route to host
> |31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
> |31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
> |31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
> |31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
> |31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
> |31467|93:93|reserved| (-5) vcld, main (line: 347)
>
>
> Brian
>
>
>
>
> Brian Bouterse
> Secure Open Systems Initiative
> 919.698.8796
>
>
>
>
> On Feb 6, 2009, at 3:41 PM, Aaron Peeler wrote:
>
>> What are your steps when you start the imaging process?
>>
>> You have to do the following:
>> - make the reservation/request
>> - wait for connect button
>> - click connect button
>> - connect to the machine
>> - then you can hit the create image button to start the imaging process.
>>
>> If your in the reserved state as reported in the log file below - the
>> process is polling the node looking for the user to connect. Once
>> connected then it proceeds to the inuse state.
>>
>> This is actually a user interface bug. We're assuming users will
>> actually login to the remote node and install something - but in your
>> case if your just testing I can see going straight to the image
>> creation mode. Created JIRA issue VCL-84.
>>
>> Under normal operation you don't need to clean on the computerloadlog.
>> Since your developing and likely having to break things, it's going
>> to behave a little bit different.
>>
>> Aaron
>>
>> --On February 6, 2009 3:12:16 PM -0500 Andrew Brown
>> <br...@gmail.com> wrote:
>>
>>> So we have things mostly working except for one detail: When we check
>>> out
>>> a computer for capture through "update image", all that goes fine. But
>>> when we go and hit "create" to actually capture the image, vcld hangs
>>> with the log messages that Brian posted, repeating.
>>>
>>> If we go in and remove the database entry in the computerloadlog table,
>>> the capture goes through; however, there seems to be a process that
>>> lingers and keeps trying to ssh to the machine even when our capture is
>>> proceeding. It gives us log entries like this:
>>> 2009-02-06
>>> 14:56:57|31467|93:93|reserved|utils.pm:check_connection(1828)|checking
>>> for connection by administrator on test_vm, attempt 4
>>> 2009-02-06
>>> 14:56:57|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|executing
>>> SSH command on test_vm: netstat -an
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
>>> signal: 0, core dump: 0, exit status: 0
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH command
>>> executed on test_vm: netstat -an, returning (0, output)
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|executing
>>> SSH command on test_vm: who
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
>>> signal: 0, core dump: 0, exit status: 0
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6796)|run_ssh_comm
>>>
>>> and output: admin pts/0 Feb 6 15:59 (152.14.244.35)
>>> 2009-02-06
>>> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH command
>>> executed on test_vm: who, returning (0, output)
>>> (the pid being different from the one that is going into our capture
>>> routine)
>>>
>>> This may be unrelated, I don't know. But we need to fix it so we don't
>>> need to manually delete the computerloadlog entry each time, as well as
>>> fix it so this second process... goes away or something.
>>>
>>> -Andrew
>>>
>>> On Fri, Feb 6, 2009 at 1:23 PM, Aaron Peeler <aa...@ncsu.edu>
>>> wrote:
>>>
>>>> Actually it's computerloadlog
>>>>
>>>> The computerloadlog table has three purposes.
>>>> It also has two dependent tables computerloadflow and computerloadstate
>>>>
>>>>
>>>> 1) Provides information to the user about what state user request is
>>>> in,
>>>> where it's at in loading, performing post install tasks, adding user
>>>> acct, etc. This information is seen on the current reservations page by
>>>> the user clicking on the pending link. This link opens a ajax based
>>>> window and will provided the state the load is in. At some point during
>>>> your load routine - you'll want to add these type of calls in the
>>>> appropriate places.
>>>>
>>>> insertloadlog($reservation_id, $computer_id, "startload",
>>>> "$computer_node_name $image_name");
>>>>
>>>> But this isn't required to get a provision module functional. It's a
>>>> nice
>>>> feature, so the user can be aware of what's going on. We'll put
>>>> together
>>>> information on how to best use the insertloadlog.
>>>>
>>>> 2) If a new|reload request is started and the computer state is
>>>> reload or
>>>> reloading. The forked request process will use that table to determine
>>>> what to do. Either take over, fail, or let the current owning process
>>>> complete.
>>>>
>>>> 3) It is also used in vcld to prevent duplicate or competing processes.
>>>>
>>>> It gets cleaned out on a per-reservation basis. When a reservations
>>>> goes
>>>> into the inuse state and when a reservation is removed from the db.
>>>>
>>>> In your case, since your just testing your module, just manually empty
>>>> the computerloadlog table as needed.
>>>>
>>>> Aaron
>>>>
>>>>
>>>>
>>>> --On February 6, 2009 11:20:47 AM -0500 Brian Bouterse
>>>> <bm...@ncsu.edu> wrote:
>>>>
>>>> What is the purpose of the loadlog table? Currently the ESX
>>>> provisioning
>>>>> module doesn't update any information in the loadlog table, and the
>>>>> provisioning works fine. We're working on the capture() portions now,
>>>>> and when we go through the web interface to have it "capture" our
>>>>> reservation, vcld warns saying:
>>>>>
>>>>> | 25052|72:72|image| ---- WARNING ----
>>>>> | 25052|72:72|image| 2009-02-06
>>>>> | 11:19:56|25052|72:72|image|vcld:main(272)|reservation 72 is already
>>>>> | being processed 25052|72:72|image| ( 0) utils.pm, notify (line: 683)
>>>>> | 25052|72:72|image| (-1) vcld, main (line: 272)
>>>>>
>>>>> Is this normal for the "image update" process?
>>>>>
>>>>> Best,
>>>>> Brian
>>>>>
>>>>> Brian Bouterse
>>>>> Secure Open Systems Initiative
>>>>> 919.698.8796
>>>>>
>>>>>
>>>>>
>>>>> \
>
Re: role of loadlog table - computerloadlog
Posted by Brian Bouterse <bm...@ncsu.edu>.
Our testing steps are these:
- make the reservation/request
- wait for connect button
- click connect button
- connect to the machine
- touch a file for uniqueness purposes
- then you can hit the create image button
- indicate we want to update the image (not create a new image from
this one).
Still looking into why the child create to handle the original imaging
reservation/request continues to try to ssh continuously instead of
marking the imaging event as in use like it should. That log is
attached; it's towards the end, and really shows itself with repeated
messages like:
|31467|93:93|reserved| ---- WARNING ----
|31467|93:93|reserved| 2009-02-06 14:58:21|31467|93:93|reserved|
utils.pm:run_ssh_command(6807)|failed to run SSH command after 3
attempts, command: /usr/bin/ssh -i /etc/vcl/bladelinuxkey_id_rsa -l
root -p 22 -x test_vm 'who' 2>&1, exit status: 255, output:
|31467|93:93|reserved| ssh output (who): ssh: connect to host test_vm
port 22: No route to host
|31467|93:93|reserved| ( 0) utils.pm, notify (line: 683)
|31467|93:93|reserved| (-1) utils.pm, run_ssh_command (line: 6807)
|31467|93:93|reserved| (-2) utils.pm, check_connection (line: 1931)
|31467|93:93|reserved| (-3) reserved.pm, process (line: 410)
|31467|93:93|reserved| (-4) vcld, make_new_child (line: 599)
|31467|93:93|reserved| (-5) vcld, main (line: 347)
Brian
Re: role of loadlog table - computerloadlog
Posted by Aaron Peeler <aa...@ncsu.edu>.
What are your steps when you start the imaging process?
You have to do the following:
- make the reservation/request
- wait for connect button
- click connect button
- connect to the machine
- then you can hit the create image button to start the imaging process.
If your in the reserved state as reported in the log file below - the
process is polling the node looking for the user to connect. Once connected
then it proceeds to the inuse state.
This is actually a user interface bug. We're assuming users will actually
login to the remote node and install something - but in your case if your
just testing I can see going straight to the image creation mode. Created
JIRA issue VCL-84.
Under normal operation you don't need to clean on the computerloadlog.
Since your developing and likely having to break things, it's going to
behave a little bit different.
Aaron
--On February 6, 2009 3:12:16 PM -0500 Andrew Brown <br...@gmail.com>
wrote:
> So we have things mostly working except for one detail: When we check out
> a computer for capture through "update image", all that goes fine. But
> when we go and hit "create" to actually capture the image, vcld hangs
> with the log messages that Brian posted, repeating.
>
> If we go in and remove the database entry in the computerloadlog table,
> the capture goes through; however, there seems to be a process that
> lingers and keeps trying to ssh to the machine even when our capture is
> proceeding. It gives us log entries like this:
> 2009-02-06
> 14:56:57|31467|93:93|reserved|utils.pm:check_connection(1828)|checking
> for connection by administrator on test_vm, attempt 4
> 2009-02-06
> 14:56:57|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|executing
> SSH command on test_vm: netstat -an
> 2009-02-06
> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
> signal: 0, core dump: 0, exit status: 0
> 2009-02-06
> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH command
> executed on test_vm: netstat -an, returning (0, output)
> 2009-02-06
> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6729)|executing
> SSH command on test_vm: who
> 2009-02-06
> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6748)|$?: 0,
> signal: 0, core dump: 0, exit status: 0
> 2009-02-06
> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6796)|run_ssh_comm
> and output: admin pts/0 Feb 6 15:59 (152.14.244.35)
> 2009-02-06
> 14:56:58|31467|93:93|reserved|utils.pm:run_ssh_command(6799)|SSH command
> executed on test_vm: who, returning (0, output)
> (the pid being different from the one that is going into our capture
> routine)
>
> This may be unrelated, I don't know. But we need to fix it so we don't
> need to manually delete the computerloadlog entry each time, as well as
> fix it so this second process... goes away or something.
>
> -Andrew
>
> On Fri, Feb 6, 2009 at 1:23 PM, Aaron Peeler <aa...@ncsu.edu>
> wrote:
>
>> Actually it's computerloadlog
>>
>> The computerloadlog table has three purposes.
>> It also has two dependent tables computerloadflow and computerloadstate
>>
>>
>> 1) Provides information to the user about what state user request is in,
>> where it's at in loading, performing post install tasks, adding user
>> acct, etc. This information is seen on the current reservations page by
>> the user clicking on the pending link. This link opens a ajax based
>> window and will provided the state the load is in. At some point during
>> your load routine - you'll want to add these type of calls in the
>> appropriate places.
>>
>> insertloadlog($reservation_id, $computer_id, "startload",
>> "$computer_node_name $image_name");
>>
>> But this isn't required to get a provision module functional. It's a nice
>> feature, so the user can be aware of what's going on. We'll put together
>> information on how to best use the insertloadlog.
>>
>> 2) If a new|reload request is started and the computer state is reload or
>> reloading. The forked request process will use that table to determine
>> what to do. Either take over, fail, or let the current owning process
>> complete.
>>
>> 3) It is also used in vcld to prevent duplicate or competing processes.
>>
>> It gets cleaned out on a per-reservation basis. When a reservations goes
>> into the inuse state and when a reservation is removed from the db.
>>
>> In your case, since your just testing your module, just manually empty
>> the computerloadlog table as needed.
>>
>> Aaron
>>
>>
>>
>> --On February 6, 2009 11:20:47 AM -0500 Brian Bouterse
>> <bm...@ncsu.edu> wrote:
>>
>> What is the purpose of the loadlog table? Currently the ESX
>> provisioning
>>> module doesn't update any information in the loadlog table, and the
>>> provisioning works fine. We're working on the capture() portions now,
>>> and when we go through the web interface to have it "capture" our
>>> reservation, vcld warns saying:
>>>
>>> | 25052|72:72|image| ---- WARNING ----
>>> | 25052|72:72|image| 2009-02-06
>>> | 11:19:56|25052|72:72|image|vcld:main(272)|reservation 72 is already
>>> | being processed 25052|72:72|image| ( 0) utils.pm, notify (line: 683)
>>> | 25052|72:72|image| (-1) vcld, main (line: 272)
>>>
>>> Is this normal for the "image update" process?
>>>
>>> Best,
>>> Brian
>>>
>>> Brian Bouterse
>>> Secure Open Systems Initiative
>>> 919.698.8796
>>>
>>>
>>>
>>> \