Posted to common-user@hadoop.apache.org by Brian Ferris <bd...@cs.washington.edu> on 2009/05/13 19:51:53 UTC
HOD 0.20.0 - deallocate fails to terminate cluster
I've been running HOD with hadoop-0.20.0 successfully on our Torque
cluster and it runs map-reduce jobs as advertised. The only hiccup
comes when I try to deallocate the cluster. I run the following
command:
> hod deallocate path/to/my/cluster_config
The HOD executable cleans up the cluster_config directory, but it
doesn't seem to terminate the HOD job in the Torque queue. I can
still see the job when I run "qstat", and it doesn't go away, even
after waiting a few minutes. Eventually, I have to kill the job
with "qdel".
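For anyone who wants to script the manual check described above, here is a minimal Python sketch. It assumes the default column layout of "qstat" (job ID first, job name second) and that the HOD job appears under the name "HOD"; both are assumptions and should be checked against the local Torque setup. The function names are hypothetical and not part of HOD or Torque.

```python
import subprocess


def find_jobs(qstat_output, job_name):
    """Return the IDs of jobs whose Name column equals job_name.

    Assumes default-format `qstat` output: job ID in the first
    column and job name in the second.
    """
    job_ids = []
    for line in qstat_output.splitlines():
        fields = line.split()
        # Skip blank or short lines, the header row, and the dashed separator row.
        if len(fields) < 6 or fields[0] == "Job" or set(line.strip()) <= {"-", " "}:
            continue
        if fields[1] == job_name:
            job_ids.append(fields[0])
    return job_ids


def lingering_hod_jobs(job_name="HOD"):
    """Run `qstat` and list jobs still queued under job_name.

    The default name "HOD" is an assumption; adjust it to whatever
    name the job shows under in your queue.
    """
    result = subprocess.run(["qstat"], capture_output=True, text=True, check=True)
    return find_jobs(result.stdout, job_name)
```

Calling lingering_hod_jobs() after "hod deallocate" would list any job IDs still in the queue, which can then be passed to "qdel" by hand. The sketch deliberately does not call "qdel" itself, so nothing is removed without a look first.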
Shouldn't the deallocate operation automatically stop the HOD job? Is
this a bug? Something I haven't configured correctly? How might I go
about debugging what's wrong if it is a bug?
This obviously isn't mission critical, since everything else is
working correctly and qdel seems to do the trick either way. Mostly
I'm just curious.
More details and version information:
Fedora 7
hadoop-0.20.0
torque-2.1.10-1.fc7
torque-client-2.1.10-1.fc7
torque-scheduler-2.1.10-1.fc7
libtorque-2.1.10-1.fc7
torque-gui-2.1.10-1.fc7
torque-docs-2.1.10-1.fc7
torque-server-2.1.10-1.fc7
Thanks,
Brian