Posted to common-user@hadoop.apache.org by Dieter Plaetinck <di...@intec.ugent.be> on 2011/04/04 18:45:49 UTC

INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion and other errors

Hi,
I have a cluster of 4 Debian Squeeze machines, all running the same
version (hadoop-0.20.2.tar.gz).

I have: n-0 as namenode, n-1 as jobtracker, and n-{0,1,2,3} as slaves,
but you can see all my configs in more detail at
http://pastie.org/1754875

The machines have 3 GiB of RAM.
I don't think disk space is an issue anywhere, but FWIW:
/var/bomvl (which houses the HDFS data) is a 140 GB filesystem on
all nodes, and they all have 1.8 GB free on /.


So, I start the cluster on n-0:

$ bin/hadoop namenode -format
11/04/04 18:27:33 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = n-0.map-reduce.search.wall2.test/10.2.1.90
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
11/04/04 18:27:33 INFO namenode.FSNamesystem: fsOwner=dplaetin,Search,root
11/04/04 18:27:33 INFO namenode.FSNamesystem: supergroup=supergroup
11/04/04 18:27:33 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/04/04 18:27:33 INFO common.Storage: Image file of size 98 saved in 0 seconds.
11/04/04 18:27:34 INFO common.Storage: Storage directory /var/bomvl/name has been successfully formatted.
11/04/04 18:27:34 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at n-0.map-reduce.search.wall2.test/10.2.1.90
************************************************************/

$ bin/start-all.sh
starting namenode, logging to /var/log/hadoop/hadoop-dplaetin-namenode-n-0.map-reduce.search.wall2.test.out
n-0: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-0.map-reduce.search.wall2.test.out
n-2: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-2.map-reduce.search.wall2.test.out
n-3: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-3.map-reduce.search.wall2.test.out
n-1: starting datanode, logging to /var/log/hadoop/hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.out
localhost: starting secondarynamenode, logging to /var/log/hadoop/hadoop-dplaetin-secondarynamenode-n-0.map-reduce.search.wall2.test.out
starting jobtracker, logging to /var/log/hadoop/hadoop-dplaetin-jobtracker-n-0.map-reduce.search.wall2.test.out
n-2: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-2.map-reduce.search.wall2.test.out
n-3: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-3.map-reduce.search.wall2.test.out
n-0: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-0.map-reduce.search.wall2.test.out
n-1: starting tasktracker, logging to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-1.map-reduce.search.wall2.test.out


However, in n-0:/var/log/hadoop/hadoop-dplaetin-namenode-n-0.map-reduce.search.wall2.test.log,
I see this:
2011-04-04 18:27:34,760 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = n-0.map-reduce.search.wall2.test/10.2.1.90
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-04-04 18:27:34,846 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=9000
2011-04-04 18:27:34,851 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: n-0-lan0/10.1.1.2:9000
2011-04-04 18:27:34,853 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2011-04-04 18:27:34,854 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=dplaetin,Search,root
2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-04-04 18:27:34,893 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-04-04 18:27:34,899 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2011-04-04 18:27:34,900 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2011-04-04 18:27:34,927 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
2011-04-04 18:27:34,931 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
2011-04-04 18:27:34,931 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 98 loaded in 0 seconds.
2011-04-04 18:27:34,932 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /var/bomvl/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
2011-04-04 18:27:34,971 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 98 saved in 0 seconds.
2011-04-04 18:27:35,146 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 263 msecs
2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of blocks = 0
2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of invalid blocks = 0
2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of under-replicated blocks = 0
2011-04-04 18:27:35,147 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of  over-replicated blocks = 0
2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 0 secs.
2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
2011-04-04 18:27:35,148 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
2011-04-04 18:27:40,247 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-04-04 18:27:40,299 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
2011-04-04 18:27:40,300 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
2011-04-04 18:27:40,300 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
2011-04-04 18:27:40,300 INFO org.mortbay.log: jetty-6.1.14
2011-04-04 18:27:40,518 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50070
2011-04-04 18:27:40,518 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070
2011-04-04 18:27:40,521 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-04-04 18:27:40,521 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9000: starting
2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000: starting
2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000: starting
2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000: starting
2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9000: starting
2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000: starting
2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000: starting
2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: starting
2011-04-04 18:27:40,527 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000: starting
2011-04-04 18:27:40,528 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000: starting
2011-04-04 18:27:40,527 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000: starting
2011-04-04 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
        at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-04-04 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
        at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-04-04 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion
java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
        at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
        at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-04-04 18:29:20,803 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.5:50010 storage DS-1679058086-10.2.0.80-50010-1301934560735
2011-04-04 18:29:20,806 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.5:50010
2011-04-04 18:29:32,077 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.3:50010 storage DS-195369122-10.2.1.91-50010-1301934572032
2011-04-04 18:29:32,077 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.3:50010
2011-04-04 18:29:39,321 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.4:50010 storage DS-564465257-10.2.1.79-50010-1301934579273
2011-04-04 18:29:39,321 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.4:50010
2011-04-04 18:30:58,007 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.1.1.2:50010 storage DS-166622494-10.2.1.90-50010-1301934658005
2011-04-04 18:30:58,007 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/10.1.1.2:50010
2011-04-04 18:33:13,788 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 10.1.1.2
2011-04-04 18:33:13,788 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0 
2011-04-04 18:33:14,091 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll FSImage from 10.1.1.2
2011-04-04 18:33:14,092 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 33 


Does anyone know what the issue could be? The suggestions I found in other threads are "check disk space" and "make sure the versions are the same",
but the former is fine, and the latter is not an issue here.
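For what it's worth, here is a quick sketch of how both suggestions can be re-checked on every node in one pass (assuming passwordless SSH to each slave, as start-all.sh already requires, and that HADOOP_HOME points at the install dir on each node; adjust to taste):

```sh
# Hedged sketch: print Hadoop version and free disk space on every node,
# so a version mismatch or full filesystem stands out immediately.
for h in n-0 n-1 n-2 n-3; do
  echo "== $h =="
  ssh "$h" '$HADOOP_HOME/bin/hadoop version | head -n 1; df -h /var/bomvl /'
done
```

All four nodes should report the same "Hadoop 0.20.2" line and comfortable free space on /var/bomvl.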

When I try to put a file on HDFS, I get:
2011-04-04 17:10:28,306 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call addBlock(/tmp/hadoop-dplaetin/mapred/system/jobtracker.info, DFSClient_371874060) from 10.1.1.3:47491: error: java.io.IOException: File /tmp/hadoop-dplaetin/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-dplaetin/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
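When "could only be replicated to 0 nodes" appears, one sanity check is to ask the namenode itself how many datanodes it currently sees and how much capacity each reports. A sketch, run from the Hadoop install dir on n-0:

```sh
# Ask the namenode for its view of the cluster: live vs. dead datanodes,
# and configured vs. remaining capacity per node.
bin/hadoop dfsadmin -report

# Check overall filesystem health and block placement.
bin/hadoop fsck / -files -blocks
```

If -report shows datanodes as live but with 0 remaining capacity, the datanodes cannot accept blocks even though the web interface calls them live.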


On n-1, hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.log looks good:
$ cat hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.log
2011-04-04 18:27:35,886 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = n-1.map-reduce.search.wall2.test/10.2.1.91
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-04-04 18:27:40,535 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /var/bomvl/data is not formatted.
2011-04-04 18:27:40,535 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
2011-04-04 18:27:40,828 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2011-04-04 18:27:40,829 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2011-04-04 18:27:40,831 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2011-04-04 18:27:45,876 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2011-04-04 18:27:45,940 INFO org.mortbay.log: jetty-6.1.14
2011-04-04 18:29:27,006 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
2011-04-04 18:29:27,011 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2011-04-04 18:29:32,026 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=DataNode, port=50020
2011-04-04 18:29:32,029 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-04-04 18:29:32,029 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2011-04-04 18:29:32,031 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(n144-17a.wall2.test:50010, storageID=, infoPort=50075, ipcPort=50020)
2011-04-04 18:29:32,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: New storage id DS-195369122-10.2.1.91-50010-1301934572032 is assigned to data-node 10.1.1.3:50010
2011-04-04 18:29:32,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.1.1.3:50010, storageID=DS-195369122-10.2.1.91-50010-1301934572032, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/var/bomvl/data/current'}
2011-04-04 18:29:32,062 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2011-04-04 18:29:32,070 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 2 msecs
2011-04-04 18:29:32,071 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.

However, this doesn't look right:
$ head -n 40 hadoop-dplaetin-tasktracker-n-1.map-reduce.search.wall2.test.log
2011-04-04 18:27:39,041 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = n-1.map-reduce.search.wall2.test/10.2.1.91
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-04-04 18:27:44,176 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-04-04 18:27:44,263 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
2011-04-04 18:27:44,269 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
2011-04-04 18:27:44,269 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
2011-04-04 18:27:44,269 INFO org.mortbay.log: jetty-6.1.14
2011-04-04 18:28:05,858 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
2011-04-04 18:28:05,863 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
2011-04-04 18:28:05,872 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=TaskTracker, port=42914
2011-04-04 18:28:05,908 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2011-04-04 18:28:05,908 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 42914: starting
2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 42914: starting
2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 42914: starting
2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 42914: starting
2011-04-04 18:28:05,910 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:42914
2011-04-04 18:28:05,910 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_n144-17a.wall2.test:localhost/127.0.0.1:42914
2011-04-04 18:28:06,933 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 0 time(s).
2011-04-04 18:28:07,934 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 1 time(s).
2011-04-04 18:28:08,934 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 2 time(s).
2011-04-04 18:28:09,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 3 time(s).
2011-04-04 18:28:10,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 4 time(s).
2011-04-04 18:28:11,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 5 time(s).
2011-04-04 18:28:12,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 6 time(s).
2011-04-04 18:28:13,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 7 time(s).
2011-04-04 18:28:14,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 8 time(s).
2011-04-04 18:28:15,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 9 time(s).
2011-04-04 18:28:15,938 INFO org.apache.hadoop.ipc.RPC: Server at n-1/10.1.1.3:9001 not available yet, Zzzzz...
2011-04-04 18:28:17,939 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 0 time(s).

Indeed, the jobtracker does not seem to be running (although that's probably another problem):
$ sudo netstat -n -l -p | grep 9001
$ 
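It may be worth comparing where the tasktrackers expect the jobtracker (the retry log above targets n-1:9001) with where start-all.sh actually started it (the startup output logged the jobtracker on n-0). A sketch, assuming the stock conf/ layout:

```sh
# Where do the slaves think the jobtracker lives?
grep -A1 mapred.job.tracker conf/mapred-site.xml

# On the host named there: is anything listening on that port,
# and what does the jobtracker's own log say?
sudo netstat -tlnp | grep 9001
tail -n 50 /var/log/hadoop/hadoop-*-jobtracker-*.log
```

If mapred.job.tracker names n-1 but the daemon was launched on n-0, the tasktrackers will retry forever against an empty port.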

Thanks for any help,
Dieter

Re: INFO org.apache.hadoop.ipc.Server: Error register getProtocolVersion and other errors

Posted by Dieter Plaetinck <di...@intec.ugent.be>.
Anyone? Anyone at all?
I figured out the issue with the jobtracker, but I still have the
errors:

* Error register getProtocolVersion
* File (..) could only be replicated to 0 nodes, instead of 1

as explained in my first mail.

The 2nd error can appear without _any_ errors in _any_ of the datanode
or tasktracker logs. And the NameNode web interface even tells me all
nodes are live, none dead.

This is effectively preventing me from using the cluster;
I'm completely in the dark, and I find it very frustrating. :(

Thank you,

Dieter

On Mon, 4 Apr 2011 18:45:49 +0200
Dieter Plaetinck <di...@intec.ugent.be> wrote:

> Hi,
> I have a cluster of 4 debian squeeze machines, on all of them I
> installed the same version ( hadoop-0.20.2.tar.gz )
> 
> I have : n-0 namenode, n-1: jobtracker and n-{0,1,2,3} slaves
> but you can see all my configs in more detail @
> http://pastie.org/1754875
> 
> the machines have 3GiB RAM.
> I don't think disk space is an issue anywhere, but FWIW:
> /var/bomvl (which would house hdfs stuff) is a filesystem of 140GB on
> all nodes. they all have 1.8GB free on /
> 
> 
> so, I invoke the cluster on n-0:
> 
> $ bin/hadoop namenode -format
> 11/04/04 18:27:33 INFO namenode.NameNode: STARTUP_MSG: 
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = n-0.map-reduce.search.wall2.test/10.2.1.90
> STARTUP_MSG:   args = [-format]
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20
> -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 11/04/04 18:27:33 INFO namenode.FSNamesystem:
> fsOwner=dplaetin,Search,root 11/04/04 18:27:33 INFO
> namenode.FSNamesystem: supergroup=supergroup 11/04/04 18:27:33 INFO
> namenode.FSNamesystem: isPermissionEnabled=true 11/04/04 18:27:33
> INFO common.Storage: Image file of size 98 saved in 0 seconds.
> 11/04/04 18:27:34 INFO common.Storage: Storage
> directory /var/bomvl/name has been successfully formatted. 11/04/04
> 18:27:34 INFO namenode.NameNode:
> SHUTDOWN_MSG: /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at
> n-0.map-reduce.search.wall2.test/10.2.1.90
> ************************************************************/
> 
> $ bin/start-all.sh
> starting namenode, logging
> to /var/log/hadoop/hadoop-dplaetin-namenode-n-0.map-reduce.search.wall2.test.out
> n-0: starting datanode, logging
> to /var/log/hadoop/hadoop-dplaetin-datanode-n-0.map-reduce.search.wall2.test.out
> n-2: starting datanode, logging
> to /var/log/hadoop/hadoop-dplaetin-datanode-n-2.map-reduce.search.wall2.test.out
> n-3: starting datanode, logging
> to /var/log/hadoop/hadoop-dplaetin-datanode-n-3.map-reduce.search.wall2.test.out
> n-1: starting datanode, logging
> to /var/log/hadoop/hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.out
> localhost: starting secondarynamenode, logging
> to /var/log/hadoop/hadoop-dplaetin-secondarynamenode-n-0.map-reduce.search.wall2.test.out
> starting jobtracker, logging
> to /var/log/hadoop/hadoop-dplaetin-jobtracker-n-0.map-reduce.search.wall2.test.out
> n-2: starting tasktracker, logging
> to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-2.map-reduce.search.wall2.test.out
> n-3: starting tasktracker, logging
> to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-3.map-reduce.search.wall2.test.out
> n-0: starting tasktracker, logging
> to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-0.map-reduce.search.wall2.test.out
> n-1: starting tasktracker, logging
> to /var/log/hadoop/hadoop-dplaetin-tasktracker-n-1.map-reduce.search.wall2.test.out
> 
> 
> However, in
> n-0:/var/log/hadoop/hadoop-dplaetin-namenode-n-0.map-reduce.search.wall2.test.log,
> I see this: 2011-04-04 18:27:34,760 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode:
> STARTUP_MSG: /************************************************************
> STARTUP_MSG: Starting NameNode STARTUP_MSG:   host =
> n-0.map-reduce.search.wall2.test/10.2.1.90 STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20
> -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 2011-04-04 18:27:34,846 INFO
> org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics
> with hostName=NameNode, port=9000 2011-04-04 18:27:34,851 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at:
> n-0-lan0/10.1.1.2:9000 2011-04-04 18:27:34,853 INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics
> with processName=NameNode, sessionId=null 2011-04-04 18:27:34,854
> INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
> Initializing NameNodeMeterics using context
> object:org.apache.hadoop.metrics.spi.NullContext 2011-04-04
> 18:27:34,893 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> fsOwner=dplaetin,Search,root 2011-04-04 18:27:34,893 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> supergroup=supergroup 2011-04-04 18:27:34,893 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> isPermissionEnabled=true 2011-04-04 18:27:34,899 INFO
> org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
> Initializing FSNamesystemMetrics using context
> object:org.apache.hadoop.metrics.spi.NullContext 2011-04-04
> 18:27:34,900 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
> FSNamesystemStatusMBean 2011-04-04 18:27:34,927 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
> 2011-04-04 18:27:34,931 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Number of files under
> construction = 0 2011-04-04 18:27:34,931 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Image file of size 98
> loaded in 0 seconds. 2011-04-04 18:27:34,932 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Edits
> file /var/bomvl/name/current/edits of size 4 edits # 0 loaded in 0
> seconds. 2011-04-04 18:27:34,971 INFO
> org.apache.hadoop.hdfs.server.common.Storage: Image file of size 98
> saved in 0 seconds. 2011-04-04 18:27:35,146 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading
> FSImage in 263 msecs 2011-04-04 18:27:35,147 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of
> blocks = 0 2011-04-04 18:27:35,147 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
> invalid blocks = 0 2011-04-04 18:27:35,147 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
> under-replicated blocks = 0 2011-04-04 18:27:35,147 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
> over-replicated blocks = 0 2011-04-04 18:27:35,148 INFO
> org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 0
> secs. 2011-04-04 18:27:35,148 INFO
> org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0
> racks and 0 datanodes 2011-04-04 18:27:35,148 INFO
> org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has
> 0 blocks 2011-04-04 18:27:40,247 INFO org.mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog 2011-04-04 18:27:40,299 INFO
> org.apache.hadoop.http.HttpServer: Port returned by
> webServer.getConnectors()[0].getLocalPort() before open() is -1.
> Opening the listener on 50070 2011-04-04 18:27:40,300 INFO
> org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned
> 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
> 2011-04-04 18:27:40,300 INFO org.apache.hadoop.http.HttpServer: Jetty
> bound to port 50070 2011-04-04 18:27:40,300 INFO org.mortbay.log:
> jetty-6.1.14 2011-04-04 18:27:40,518 INFO org.mortbay.log: Started
> SelectChannelConnector@0.0.0.0:50070 2011-04-04 18:27:40,518 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at:
> 0.0.0.0:50070 2011-04-04 18:27:40,521 INFO
> org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2011-04-04 18:27:40,521 INFO org.apache.hadoop.ipc.Server: IPC Server
> listener on 9000: starting 2011-04-04 18:27:40,522 INFO
> org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9000: starting
> 2011-04-04 18:27:40,522 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 9000: starting 2011-04-04 18:27:40,522 INFO
> org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000: starting
> 2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 3 on 9000: starting 2011-04-04 18:27:40,526 INFO
> org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000: starting
> 2011-04-04 18:27:40,526 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 5 on 9000: starting 2011-04-04 18:27:40,526 INFO
> org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: starting
> 2011-04-04 18:27:40,527 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 8 on 9000: starting 2011-04-04 18:27:40,528 INFO
> org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000: starting
> 2011-04-04 18:27:40,527 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 7 on 9000: starting 2011-04-04 18:27:40,550 INFO
> org.apache.hadoop.ipc.Server: Error register getProtocolVersion
> java.lang.IllegalArgumentException: Duplicate
> metricsName:getProtocolVersion at
> org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
> at
> org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
> at
> org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523) at
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at
> java.security.AccessController.doPrivileged(Native Method) at
> javax.security.auth.Subject.doAs(Subject.java:396) at
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) 2011-04-04
> 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register
> getProtocolVersion java.lang.IllegalArgumentException: Duplicate
> metricsName:getProtocolVersion at
> org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
> at
> org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
> at
> org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523) at
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at
> java.security.AccessController.doPrivileged(Native Method) at
> javax.security.auth.Subject.doAs(Subject.java:396) at
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) 2011-04-04
> 18:27:40,550 INFO org.apache.hadoop.ipc.Server: Error register
> getProtocolVersion java.lang.IllegalArgumentException: Duplicate
> metricsName:getProtocolVersion at
> org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
> at
> org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
> at
> org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523) at
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at
> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at
> java.security.AccessController.doPrivileged(Native Method) at
> javax.security.auth.Subject.doAs(Subject.java:396) at
> org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) 2011-04-04
> 18:29:20,803 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.registerDatanode: node registration from 10.1.1.5:50010
> storage DS-1679058086-10.2.0.80-50010-1301934560735 2011-04-04
> 18:29:20,806 INFO org.apache.hadoop.net.NetworkTopology: Adding a new
> node: /default-rack/10.1.1.5:50010 2011-04-04 18:29:32,077 INFO
> org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.registerDatanode: node registration from 10.1.1.3:50010
> storage DS-195369122-10.2.1.91-50010-1301934572032 2011-04-04
> 18:29:32,077 INFO org.apache.hadoop.net.NetworkTopology: Adding a new
> node: /default-rack/10.1.1.3:50010 2011-04-04 18:29:39,321 INFO
> org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.registerDatanode: node registration from 10.1.1.4:50010
> storage DS-564465257-10.2.1.79-50010-1301934579273 2011-04-04
> 18:29:39,321 INFO org.apache.hadoop.net.NetworkTopology: Adding a new
> node: /default-rack/10.1.1.4:50010 2011-04-04 18:30:58,007 INFO
> org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.registerDatanode: node registration from 10.1.1.2:50010
> storage DS-166622494-10.2.1.90-50010-1301934658005 2011-04-04
> 18:30:58,007 INFO org.apache.hadoop.net.NetworkTopology: Adding a new
> node: /default-rack/10.1.1.2:50010 2011-04-04 18:33:13,788 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log
> from 10.1.1.2 2011-04-04 18:33:13,788 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
> transactions: 0 Total time for transactions(ms): 0Number of
> transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
> 2011-04-04 18:33:14,091 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll FSImage
> from 10.1.1.2 2011-04-04 18:33:14,092 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of
> transactions: 0 Total time for transactions(ms): 0Number of
> transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 33 
> 
> 
> Does anyone know what the issue could be? The suggestions I could find
> in other threads are "check disk space" and "make sure the versions
> are the same", but disk space is fine here and all nodes run the same
> version, so neither applies.
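
As an aside, the "check disk space" suggestion is easy to verify programmatically rather than by eye. A minimal sketch using only the Python standard library (Unix-only; the `/var/bomvl` path is simply the HDFS directory from this setup, not anything Hadoop checks itself):

```python
import os

def free_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize / (1024 ** 3)

# Run on each node; /var/bomvl holds HDFS data in this cluster's layout.
for p in ("/", "/var/bomvl"):
    if os.path.exists(p):
        print(f"{p}: {free_gib(p):.1f} GiB free")
```

Note that `f_bavail` counts blocks available to unprivileged users, which is what matters for a datanode running as a normal user.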
> 
> When I try to put a file on HDFS, I get:
> 2011-04-04 17:10:28,306 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call addBlock(/tmp/hadoop-dplaetin/mapred/system/jobtracker.info, DFSClient_371874060) from 10.1.1.3:47491: error: java.io.IOException: File /tmp/hadoop-dplaetin/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
> java.io.IOException: File /tmp/hadoop-dplaetin/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
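
"could only be replicated to 0 nodes" generally means the namenode saw no live datanodes at the moment of the write; note this entry is timestamped 17:10, over an hour before the namenode startup and datanode registrations quoted above, so it may well be from an earlier run. A rough, hypothetical way to check from any machine whether each slave's datanode port is even reachable — a plain TCP probe, not a Hadoop API (hosts and port taken from the namenode log above):

```python
import socket

def port_open(host, port, timeout=1.0):
    """True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Datanode addresses and data-transfer port as seen in the log above.
    for host in ("10.1.1.2", "10.1.1.3", "10.1.1.4", "10.1.1.5"):
        print(host, "reachable" if port_open(host, 50010) else "unreachable")
```

If the ports are reachable but the error persists, the next thing to compare is the timestamp of the failed write against the datanode registration times in the namenode log.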
> 
> 
> On n-1, hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.log looks good:
> $ cat hadoop-dplaetin-datanode-n-1.map-reduce.search.wall2.test.log
> 2011-04-04 18:27:35,886 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = n-1.map-reduce.search.wall2.test/10.2.1.91
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 2011-04-04 18:27:40,535 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /var/bomvl/data is not formatted.
> 2011-04-04 18:27:40,535 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
> 2011-04-04 18:27:40,828 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
> 2011-04-04 18:27:40,829 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
> 2011-04-04 18:27:40,831 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
> 2011-04-04 18:27:45,876 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
> 2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
> 2011-04-04 18:27:45,940 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
> 2011-04-04 18:27:45,940 INFO org.mortbay.log: jetty-6.1.14
> 2011-04-04 18:29:27,006 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
> 2011-04-04 18:29:27,011 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
> 2011-04-04 18:29:32,026 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=DataNode, port=50020
> 2011-04-04 18:29:32,029 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2011-04-04 18:29:32,029 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
> 2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
> 2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
> 2011-04-04 18:29:32,031 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
> 2011-04-04 18:29:32,031 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(n144-17a.wall2.test:50010, storageID=, infoPort=50075, ipcPort=50020)
> 2011-04-04 18:29:32,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: New storage id DS-195369122-10.2.1.91-50010-1301934572032 is assigned to data-node 10.1.1.3:50010
> 2011-04-04 18:29:32,061 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.1.1.3:50010, storageID=DS-195369122-10.2.1.91-50010-1301934572032, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/var/bomvl/data/current'}
> 2011-04-04 18:29:32,062 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
> 2011-04-04 18:29:32,070 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 2 msecs
> 2011-04-04 18:29:32,071 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
> 
> However, this doesn't seem right:
> $ head -n 40 hadoop-dplaetin-tasktracker-n-1.map-reduce.search.wall2.test.log
> 2011-04-04 18:27:39,041 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = n-1.map-reduce.search.wall2.test/10.2.1.91
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 2011-04-04 18:27:44,176 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2011-04-04 18:27:44,263 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> 2011-04-04 18:27:44,269 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
> 2011-04-04 18:27:44,269 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
> 2011-04-04 18:27:44,269 INFO org.mortbay.log: jetty-6.1.14
> 2011-04-04 18:28:05,858 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
> 2011-04-04 18:28:05,863 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
> 2011-04-04 18:28:05,872 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=TaskTracker, port=42914
> 2011-04-04 18:28:05,908 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2011-04-04 18:28:05,908 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 42914: starting
> 2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 42914: starting
> 2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 42914: starting
> 2011-04-04 18:28:05,909 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 42914: starting
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:42914
> 2011-04-04 18:28:05,910 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_n144-17a.wall2.test:localhost/127.0.0.1:42914
> 2011-04-04 18:28:06,933 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 0 time(s).
> 2011-04-04 18:28:07,934 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 1 time(s).
> 2011-04-04 18:28:08,934 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 2 time(s).
> 2011-04-04 18:28:09,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 3 time(s).
> 2011-04-04 18:28:10,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 4 time(s).
> 2011-04-04 18:28:11,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 5 time(s).
> 2011-04-04 18:28:12,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 6 time(s).
> 2011-04-04 18:28:13,936 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 7 time(s).
> 2011-04-04 18:28:14,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 8 time(s).
> 2011-04-04 18:28:15,937 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 9 time(s).
> 2011-04-04 18:28:15,938 INFO org.apache.hadoop.ipc.RPC: Server at n-1/10.1.1.3:9001 not available yet, Zzzzz...
> 2011-04-04 18:28:17,939 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: n-1/10.1.1.3:9001. Already tried 0 time(s).
> 
> Indeed, the jobtracker does not seem to be running (although that is
> probably a separate problem):
> $ sudo netstat -n -l -p | grep 9001
> $ 
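
When nothing is listening on 9001, the most likely explanation is that the jobtracker JVM exited during startup, so its own log on n-1 is the place to look. A hypothetical triage sketch; the log path just follows the naming pattern of the other logs quoted above and may need adjusting:

```shell
# Is anything listening on the jobtracker port? (same check as the
# netstat above, with a fallback message instead of empty output)
netstat -tln 2>/dev/null | grep ':9001 ' || echo "nothing on port 9001"

# If not, the jobtracker's own log should say why the daemon died.
tail -n 50 /var/log/hadoop/hadoop-*-jobtracker-*.log 2>/dev/null \
    || echo "no jobtracker log found"
```

If the log shows the jobtracker also failing on the "replicated to 0 nodes" write of jobtracker.info, the two problems share a cause: the jobtracker cannot come up until HDFS accepts writes.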
> 
> Thanks for any help,
> Dieter