Posted to common-user@hadoop.apache.org by Boyu Zhang <bo...@gmail.com> on 2010/04/08 20:09:53 UTC
HOD: JobTracker failed to initialise
Dear All,
I am trying to install HOD on a cluster. When I tried to allocate a new
Hadoop cluster, I got the following error:
[2010-04-08 13:47:25,304] CRITICAL/50 hadoop:303 - Cluster could not be allocated because of the following errors.
Hodring at n0 failed with following errors:
JobTracker failed to initialise
*The log file ringmaster.log has the following message:*
[2010-04-08 13:46:22,297] DEBUG/10 ringMaster:479 - getServiceAddr name: hdfs
[2010-04-08 13:46:22,299] DEBUG/10 ringMaster:487 - getServiceAddr service: <hodlib.GridServices.hdfs.Hdfs instance at 0x2057b758>
[2010-04-08 13:46:22,300] DEBUG/10 ringMaster:504 - getServiceAddr addr hdfs: not found
*The log file hodring.log has the following message:*
[2010-04-08 13:46:31,749] DEBUG/10 hodRing:416 - hadoopThread still == None
...
[2010-04-08 13:46:31,750] DEBUG/10 hodRing:419 - hadoop input: None
[2010-04-08 13:46:31,752] DEBUG/10 hodRing:428 - isForground: False
[2010-04-08 13:46:31,753] DEBUG/10 hodRing:440 - hadoop run status: True
[2010-04-08 13:46:31,754] DEBUG/10 hodRing:657 - Waiting for jobtracker to initialise
[2010-04-08 13:46:31,755] DEBUG/10 hodRing:659 - jobtracker version : 20
[2010-04-08 13:46:31,756] DEBUG/10 hodRing:664 - jobtracker rpc server : n2:59664
[2010-04-08 13:46:31,757] DEBUG/10 hodRing:670 - Jobtracker jetty : n2:57775
[2010-04-08 13:46:32,042] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 0.5
[2010-04-08 13:46:33,544] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 1.0
[2010-04-08 13:46:35,545] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 2.0
[2010-04-08 13:46:38,546] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 4.0
[2010-04-08 13:46:43,547] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 8.0
[2010-04-08 13:46:52,548] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 16.0
4864033937778270/hdfs-nn/dfs-name']
[2010-04-08 13:47:08,552] CRITICAL/50 hodRing:723 - Jobtracker failed to initialise.
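The sleep intervals in the hodring.log excerpt above double after every failed attempt (0.5, 1.0, 2.0, 4.0, 8.0, 16.0 seconds). The following is a minimal Python sketch of that kind of retry loop, assuming a plain TCP connect as the probe. The names `backoff_delays` and `wait_for_port` are hypothetical and are not taken from HOD's own code.

```python
import socket
import time


def backoff_delays(initial=0.5, factor=2.0, attempts=6):
    """Yield the sleep intervals of a doubling backoff schedule."""
    delay = initial
    for _ in range(attempts):
        yield delay
        delay *= factor


def wait_for_port(host, port, delays, timeout=1.0, sleep=time.sleep):
    """Return True once a TCP connect to host:port succeeds, else False.

    After each failed connect the function sleeps for the next delay;
    when the delays run out it gives up.
    """
    for delay in delays:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            sleep(delay)
    return False


# The schedule matches the "Sleeping for ..." values in the log.
print(list(backoff_delays()))  # [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]

# Hypothetical usage against the Jetty address from the log:
# wait_for_port("n2", 57775, backoff_delays())
```

With these defaults the loop sleeps for 31.5 seconds in total before giving up, which is in the same range as the gap between the first socket error and the CRITICAL entry in the log.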
*The log file hadoop.log on the compute node n0 has the following message:*
2010-04-08 17:47:24,424 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /scratch/hod/mapredsys/zhang/mapredsystem/85.geronimo.gcl.cis.udel.edu/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
--------------------------------------------------------------------------------------------------
It looks like the HDFS daemon failed to start, so the JobTracker has nothing to
communicate with, and Jetty then gave a socket error.
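When comparing ringmaster.log and hodring.log, it can help to pull out only the high-severity entries. The sketch below assumes the `[timestamp] LEVEL/number source:line - message` layout seen in the excerpts above, with each entry on a single line, and assumes the numbers follow Python's logging levels (DEBUG is 10, CRITICAL is 50). The function name `severe_entries` and the threshold of 40 are illustrative choices.

```python
import re

# Matches entries such as:
# [2010-04-08 13:47:08,552] CRITICAL/50 hodRing:723 - Jobtracker failed to initialise.
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\s+(?P<level>[A-Z]+)/(?P<num>\d+)\s+"
    r"(?P<source>\w+):(?P<line>\d+)\s+-\s+(?P<message>.*)$"
)


def severe_entries(lines, min_level=40):
    """Return (time, source, message) for entries at or above min_level."""
    found = []
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match and int(match.group("num")) >= min_level:
            found.append(
                (match.group("time"), match.group("source"), match.group("message"))
            )
    return found


sample = [
    "[2010-04-08 13:46:31,754] DEBUG/10 hodRing:657 - Waiting for jobtracker to initialise",
    "[2010-04-08 13:46:52,548] DEBUG/10 hodRing:713 - Jetty gave a socket error. Sleeping for 16.0",
    "[2010-04-08 13:47:08,552] CRITICAL/50 hodRing:723 - Jobtracker failed to initialise.",
]

for entry in severe_entries(sample):
    print(entry)
```

Lines that do not match the pattern (for example, wrapped continuation lines) are skipped rather than treated as errors.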
I am using Hadoop 0.20.2 on Scyld OS; the cluster uses 0-5 (n0-n5) to refer to the
back-end compute nodes. Has anyone had this problem before? Any help would be
appreciated.
P.S. Jetty*** temp files are being generated under /tmp on the compute nodes,
even though I set all the temp dirs to /home or /scratch. Any idea why?
Here is my hod conf file:
[hod]
stream = True
java-home = /usr
cluster = geronimo
cluster-factor = 1.8
xrs-port-range = 32768-65536
debug = 4
allocate-wait-time = 3600
temp-dir = /home/zhang/hodtmp.$PBS_JOBID
[ringmaster]
register = True
stream = False
temp-dir = /scratch/hod/ringmastertmp.$PBS_JOBID
http-port-range = 8000-9000
work-dirs = /scratch/hod/tmp/1,/scratch/hod/tmp/2
xrs-port-range = 32768-65536
debug = 4
[hodring]
stream = False
temp-dir = /scratch/hod/hodringtmp.$PBS_JOBID
register = True
java-home = /usr
http-port-range = 8000-9000
xrs-port-range = 32768-65536
debug = 4
mapred-system-dir-root = /scratch/hod/mapredsys
[resource_manager]
queue = batch
batch-home = /usr
id = torque
env-vars = HOD_PYTHON_HOME=/opt/python/2.5.1/bin/python
[gridservice-mapred]
external = False
pkgs = /home/zhang/hadoop-0.20.2
tracker_port = 8030
info_port = 50080
[gridservice-hdfs]
external = False
pkgs = /home/zhang/hadoop-0.20.2
fs_port = 8020
info_port = 50070
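To double-check which directories this file actually configures, the directory-related options can be listed with Python's standard `configparser`. This is only a sketch: it embeds a subset of the options above as a string, and the helper names `directory_options` and `under_tmp` are hypothetical. Treating every option whose name contains "dir" as a directory setting is also an assumption.

```python
import configparser

# A subset of the hod conf file above, embedded so the sketch is self-contained.
CONF_TEXT = """
[hod]
temp-dir = /home/zhang/hodtmp.$PBS_JOBID
xrs-port-range = 32768-65536

[ringmaster]
temp-dir = /scratch/hod/ringmastertmp.$PBS_JOBID
work-dirs = /scratch/hod/tmp/1,/scratch/hod/tmp/2

[hodring]
temp-dir = /scratch/hod/hodringtmp.$PBS_JOBID
mapred-system-dir-root = /scratch/hod/mapredsys
"""


def directory_options(text):
    """Map (section, option) to a list of paths for options naming directories."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        for key, value in parser.items(section):
            if "dir" in key:
                result[(section, key)] = [p.strip() for p in value.split(",")]
    return result


def under_tmp(options):
    """Return the (section, option, path) triples that point at /tmp."""
    return [
        (section, key, path)
        for (section, key), paths in options.items()
        for path in paths
        if path == "/tmp" or path.startswith("/tmp/")
    ]


options = directory_options(CONF_TEXT)
for (section, key), paths in options.items():
    print(section, key, paths)
print("under /tmp:", under_tmp(options))
```

None of the listed options points at /tmp, which suggests the Jetty files come from a setting outside this file, possibly the JVM's default temporary directory, though that is a guess.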
Thanks a lot!!
Boyu