Posted to common-user@hadoop.apache.org by Björn-Elmar Macek <ma...@cs.uni-kassel.de> on 2012/08/13 14:31:06 UTC

DataNode and TaskTracker communication

Hi,

I am currently trying to run my Hadoop program on a cluster. Sadly, 
my datanodes and tasktrackers seem to have difficulties communicating, 
as their logs say:
* Some datanodes and tasktrackers seem to have port problems of some 
kind, as can be seen in the logs below. I wondered whether this might be 
correlated with the localhost entry in /etc/hosts, as you can read in 
a lot of posts with similar errors, but I checked the file: neither 
localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can 
ping localhost... the technician of the cluster said he'd look into 
how localhost gets resolved.)
* The other nodes cannot talk to the namenode and jobtracker 
(its-cs131), although it is absolutely not clear why this is happening: 
the "dfs -put" I do directly before the job runs fine, which seems 
to imply that communication between those servers works flawlessly.
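One way to narrow this down is to probe the master's RPC ports directly from a worker node. Below is a minimal sketch (not part of the original setup); the host name and port numbers are the ones appearing in the logs further down and may differ on your cluster:

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures alike.
        return False

if __name__ == "__main__":
    # NameNode and JobTracker RPC ports taken from the logs below.
    for port in (35554, 35555):
        state = "reachable" if can_connect("its-cs131", port) else "unreachable"
        print(f"its-cs131:{port} is {state}")
```

If the probe reports "unreachable" from a worker but works from the master itself, a firewall or binding problem is more likely than a DNS one.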

Is there any reason why this might happen?


Regards,
Elmar

LOGS BELOW:

\____Datanodes

After successfully putting the data into HDFS (at this point I thought 
the namenode and datanodes have to communicate), I get the following 
errors when starting the job:

I found two kinds of logs: the first one is big (about 12 MB) and 
looks like this:
############################### LOG TYPE 1 
############################################################
2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
2012-08-13 08:23:36,335 WARN 
org.apache.hadoop.hdfs.server.datanode.DataNode: 
java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed 
on connection exception: java.net.ConnectException: Connection refused
     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
     at $Proxy5.sendHeartbeat(Unknown Source)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
     at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.ConnectException: Connection refused
     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
     at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
     at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
     at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
     at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
     ... 5 more

... (this continues until the end of the log)

The second kind is short:
########################### LOG TYPE 2 
############################################################
2012-08-13 00:59:19,038 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.2
STARTUP_MSG:   build = 
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 
1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
************************************************************/
2012-08-13 00:59:19,203 INFO 
org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
hadoop-metrics2.properties
2012-08-13 00:59:19,216 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
MetricsSystem,sub=Stats registered.
2012-08-13 00:59:19,217 INFO 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot 
period at 10 second(s).
2012-08-13 00:59:19,218 INFO 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics 
system started
2012-08-13 00:59:19,306 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
ugi registered.
2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: 
Loaded the native-hadoop library
2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
2012-08-13 00:59:21,584 INFO 
org.apache.hadoop.hdfs.server.common.Storage: Storage directory 
/home/work/bmacek/hadoop/hdfs/slave is not formatted.
2012-08-13 00:59:21,584 INFO 
org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
2012-08-13 00:59:21,787 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: Registered 
FSDatasetStatusMBean
2012-08-13 00:59:21,897 INFO 
org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: 
Shutting down all async disk service threads...
2012-08-13 00:59:21,897 INFO 
org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All 
async disk service threads have been shut down.
2012-08-13 00:59:21,898 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: 
Problem binding to /0.0.0.0:50010 : Address already in use
     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
     at 
org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
Caused by: java.net.BindException: Address already in use
     at sun.nio.ch.Net.bind(Native Method)
     at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
     ... 7 more

2012-08-13 00:59:21,899 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at 
its-cs133.its.uni-kassel.de/141.51.205.43
************************************************************/





\_____TaskTracker
With the TaskTrackers it is the same: there are two kinds of logs.
############################### LOG TYPE 1 
############################################################
2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: 
Resending 'status' to 'its-cs131' with reponseId '879
2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: 
Caught exception: java.net.ConnectException: Call to 
its-cs131/141.51.205.41:35555 failed on connection exception: 
java.net.ConnectException: Connection refused
     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
     at 
org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
     at 
org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
Caused by: java.net.ConnectException: Connection refused
     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
     at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
     at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
     at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
     at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
     ... 6 more


########################### LOG TYPE 2 
############################################################
2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: 
STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.2
STARTUP_MSG:   build = 
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 
1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
************************************************************/
2012-08-13 00:59:24,569 INFO 
org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
hadoop-metrics2.properties
2012-08-13 00:59:24,626 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
MetricsSystem,sub=Stats registered.
2012-08-13 00:59:24,627 INFO 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot 
period at 10 second(s).
2012-08-13 00:59:24,627 INFO 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics 
system started
2012-08-13 00:59:24,950 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
ugi registered.
2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via 
org.mortbay.log.Slf4jLog
2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added 
global filtersafety 
(class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: 
Starting tasktracker with owner as bmacek
2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good 
mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: 
Loaded the native-hadoop library
2012-08-13 00:59:25,255 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
jvm registered.
2012-08-13 00:59:25,256 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
TaskTrackerMetrics registered.
2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting 
SocketReader
2012-08-13 00:59:25,282 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
RpcDetailedActivityForPort54850 registered.
2012-08-13 00:59:25,282 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
RpcActivityForPort54850 registered.
2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server 
Responder: starting
2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server 
listener on 54850: starting
2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server 
handler 0 on 54850: starting
2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server 
handler 1 on 54850: starting
2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: 
TaskTracker up at: localhost/127.0.0.1:54850
2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server 
handler 3 on 54850: starting
2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server 
handler 2 on 54850: starting
2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: 
Starting tracker 
tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying 
connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: 
Starting thread: Map-events fetcher for all reduce tasks on 
tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid 
exited with exit code 0
2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using 
ResourceCalculatorPlugin : 
org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: 
TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is 
disabled.
2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: 
IndexCache created with max memory = 10485760
2012-08-13 00:59:38,158 INFO 
org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source 
ShuffleServerMetrics registered.
2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port 
returned by webServer.getConnectors()[0].getLocalPort() before open() is 
-1. Opening the listener on 50060
2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can 
not start task tracker because java.net.BindException: Address already 
in use
     at sun.nio.ch.Net.bind(Native Method)
     at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
     at 
org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)

2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: 
SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at 
its-cs133.its.uni-kassel.de/141.51.205.43
************************************************************/

Re: DataNode and TaskTracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hi James,

thank you for your reply!

I tried to, but I can only see my own processes, since I am not a root user. :(
I have already sent a request to the cluster admins to sort this out for me.

Regards,
Björn


Am 14.08.2012 08:51, schrieb James Brown:
> Hi Bjorn,
>
> For the two items below, it is possible datanodes and tasktrackers are 
> already running.
>
> This command will show processes bound to the datanode port:
> netstat -putan | grep 50010
>
> tasktracker port:
> netstat -putan | grep 50060
>
> If your netstat command does not support the -p option try lsof.
>
>
>> \____Datanodes
> ...
>> The second is short kind:
>> ########################### LOG TYPE 2
>> ############################################################
>> 2012-08-13 00:59:19,038 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> ...
>> 2012-08-13 00:59:21,898 ERROR
>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
>> Problem binding to /0.0.0.0:50010 : Address already in use
>
> ...
>
>> \_____TaskTracker
> ...
>> ########################### LOG TYPE 2
>> ############################################################
>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>> STARTUP_MSG:
> ...
>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>> not start task tracker because java.net.BindException: Address already
>> in use
>
>
>
>
>


Re: DataNode and TaskTracker communication

Posted by James Brown <jb...@syndicate.net>.
Hi Bjorn,

For the two items below, it is possible datanodes and tasktrackers are 
already running.

This command will show processes bound to the datanode port:
netstat -putan | grep 50010

tasktracker port:
netstat -putan | grep 50060

If your netstat command does not support the -p option try lsof.
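If you'd rather script the check, the same "is something already bound?" question can be asked with a small bind test. This is a minimal sketch, not part of the original advice; 50010 and 50060 are the Hadoop 1.x default DataNode and TaskTracker HTTP ports seen in the logs:

```python
import socket

def port_free(port, host="0.0.0.0"):
    """Try to bind the port; True means nothing else currently holds it."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return True
    except OSError:
        # "Address already in use" lands here, matching the BindException.
        return False
    finally:
        s.close()

if __name__ == "__main__":
    for port in (50010, 50060):
        print(port, "free" if port_free(port) else "already in use")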


> \____Datanodes
...
> The second is short kind:
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:19,038 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
...
> 2012-08-13 00:59:21,898 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
> Problem binding to /0.0.0.0:50010 : Address already in use

...

> \_____TaskTracker
...
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
> STARTUP_MSG:
...
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
> not start task tracker because java.net.BindException: Address already
> in use





Re: DataNode and TaskTracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Hi,

by "using DNS" you mean using the servers' non-IP names, right?
If so, I do use DNS. Since I am working in a SLURM environment and get 
a list of nodes for every job I schedule, I construct the config files 
for every job by taking the list of assigned nodes and dividing the 
roles (NameNode, JobTracker, SecondaryNameNode, TaskTrackers, DataNodes) 
over this set of machines. SLURM gives me names like "its-cs<nodenumber>", 
which is enough for ssh to connect - maybe it isn't for all Hadoop 
processes. The complete names would be 
"its-cs<nodenumber>.its.uni-kassel.de". I will add this part of the 
address for testing, but I fear it won't help a lot, because the 
JobTracker's log seems to know the full names:
###
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000887 has split on 
node:/default-rack/its-cs202.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000888 has split on 
node:/default-rack/its-cs202.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000889 has split on 
node:/default-rack/its-cs195.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000890 has split on 
node:/default-rack/its-cs196.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000891 has split on 
node:/default-rack/its-cs201.its.uni-kassel.de
###

Pings work, by the way: I could ping the NameNode from all problematic 
nodes. And lsof -i didn't show any other programs running on the 
NameNode/JobTracker node with the problematic ports. :( Maybe something 
worth noting: the NameNode/JobTracker server is at the moment no longer 
running, although the DataNode/TaskTracker logs are still growing.


Concerning IPv6: as far as I can see, I would have to modify global 
config files to disable it. Since I am only a user of this cluster with 
very limited insight into why the machines are configured the way they 
are, I want to be very careful about asking the technicians to make 
changes to their setup. I don't want to be disrespectful.
I will try using the full names first, and if this doesn't help, I will 
of course ask them whether other options are available.
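To see whether the short names and the full names actually resolve the same way on a worker node, a quick lookup probe can help. A minimal sketch (the host names are the ones from this thread; substitute your own node list):

```python
import socket

def resolve(name):
    """Return the IPv4 address for name, or None if it does not resolve."""
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        return None

if __name__ == "__main__":
    for name in ("its-cs131", "its-cs131.its.uni-kassel.de"):
        print(name, "->", resolve(name))
```

If the short name resolves to a different address than the FQDN (or not at all), that mismatch would explain Hadoop daemons failing where ssh succeeds.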


Am 13.08.12 16:12, schrieb Mohammad Tariq:
> Hi Michael,
>        I asked for hosts file because there seems to be some loopback 
> prob to me. The log shows that call is going at 0.0.0.0. Apart from 
> what you have said, I think disabling IPv6 and making sure that there 
> is no prob with the DNS resolution is also necessary. Please correct 
> me if I am wrong. Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel 
> <michael_segel@hotmail.com> wrote:
>
>     Based on your /etc/hosts output, why aren't you using DNS?
>
>     Outside of MapR, multihomed machines can be problematic. Hadoop
>     doesn't generally work well when you're not using the FQDN or its
>     alias.
>
>     The issue isn't the SSH, but if you go to the node which is having
>     trouble connecting to another node,  then try to ping it, or some
>     other general communication,  if it succeeds, your issue is that
>     the port you're trying to communicate with is blocked.  Then its
>     more than likely an ipconfig or firewall issue.
>
>     On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>     <ema@cs.uni-kassel.de> wrote:
>
>>     Hi Michael,
>>
>>     well i can ssh from any node to any other without being prompted.
>>     The reason for this is, that my home dir is mounted in every
>>     server in the cluster.
>>
>>     If the machines are multihomed: i dont know. i could ask if this
>>     would be of importance.
>>
>>     Shall i?
>>
>>     Regards,
>>     Elmar
>>
>>     Am 13.08.12 14:59, schrieb Michael Segel:
>>>     If the nodes can communicate and distribute data, then the odds
>>>     are that the issue isn't going to be in his /etc/hosts.
>>>
>>>     A more relevant question is if he's running a firewall on each
>>>     of these machines?
>>>
>>>     A simple test... ssh to one node, ping other nodes and the
>>>     control nodes at random to see if they can see one another. Then
>>>     check to see if there is a firewall running which would limit
>>>     the types of traffic between nodes.
>>>
>>>     One other side note... are these machines multi-homed?
>>>
>>>     On Aug 13, 2012, at 7:51 AM, Mohammad Tariq
>>>     <dontariq@gmail.com> wrote:
>>>
>>>>     Hello there,
>>>>
>>>>          Could you please share your /etc/hosts file, if you don't
>>>>     mind.
>>>>
>>>>     Regards,
>>>>         Mohammad Tariq
>>>>
>>>>
>>>>
>>>>     On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>     <macek@cs.uni-kassel.de> wrote:
>>>>
>>>>         Hi,
>>>>
>>>>         i am currently trying to run my hadoop program on a
>>>>         cluster. Sadly though my datanodes and tasktrackers seem to
>>>>         have difficulties with their communication as their logs say:
>>>>         * Some datanodes and tasktrackers seem to have portproblems
>>>>         of some kind as it can be seen in the logs below. I
>>>>         wondered if this might be due to reasons correllated with
>>>>         the localhost entry in /etc/hosts as you can read in alot
>>>>         of posts with similar errors, but i checked the file
>>>>         neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (although you
>>>>         can ping localhost... the technician of the cluster said
>>>>         he'd be looking for the mechanics resolving localhost)
>>>>         * The other nodes can not speak with the namenode and
>>>>         jobtracker (its-cs131). Although it is absolutely not
>>>>         clear, why this is happening: the "dfs -put" i do directly
>>>>         before the job is running fine, which seems to imply that
>>>>         communication between those servers is working flawlessly.
>>>>
>>>>         Is there any reason why this might happen?
>>>>
>>>>
>>>>         Regards,
>>>>         Elmar
>>>>
>>>>         LOGS BELOW:
>>>>
>>>>         \____Datanodes
>>>>
>>>>         After successfully putting the data into HDFS (at which
>>>>         point I assumed the namenode and datanodes must have
>>>>         communicated), I get the following errors when starting the
>>>>         job:
>>>>
>>>>         There are 2 kinds of logs I found: the first one is big
>>>>         (about 12MB) and looks like this:
>>>>         ############################### LOG TYPE 1
>>>>         ############################################################
>>>>         2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>         2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>         2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>         2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>         2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>         2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>         2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>         2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>         2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>         2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>         2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>             at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>             at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>             at $Proxy5.sendHeartbeat(Unknown Source)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>             at java.lang.Thread.run(Thread.java:619)
>>>>         Caused by: java.net.ConnectException: Connection refused
>>>>             at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>             at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>             at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>             at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>             at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>             at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>             at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>             at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>             ... 5 more
>>>>
>>>>         ... (this continues until the end of the log)
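With a ~12 MB log made up almost entirely of such lines, a quick way to summarize it is to count the retries per target server; a small sketch (the two embedded sample lines stand in for the real log file, which you would feed in instead):

```shell
# Count IPC retry attempts per target server in a log excerpt.
# The sample mirrors the log format above; in practice, replace the
# printf with: cat /path/to/datanode.log
log='2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).'
printf '%s\n' "$log" | sed -n 's/.*server: \(.*\)\. Already.*/\1/p' | sort | uniq -c
```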
>>>>
>>>>         The second kind is short:
>>>>         ########################### LOG TYPE 2
>>>>         ############################################################
>>>>         2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>         /************************************************************
>>>>         STARTUP_MSG: Starting DataNode
>>>>         STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         STARTUP_MSG:   args = []
>>>>         STARTUP_MSG:   version = 1.0.2
>>>>         STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>         ************************************************************/
>>>>         2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>         2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>         2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>         2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>         2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>         2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>         2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>         2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>         2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>         2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>         2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>         2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>         2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>             at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>             at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>         Caused by: java.net.BindException: Address already in use
>>>>             at sun.nio.ch.Net.bind(Native Method)
>>>>             at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>             at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>             at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>             ... 7 more
>>>>
>>>>         2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>         /************************************************************
>>>>         SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         ************************************************************/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>         \_____TaskTracker
>>>>         With TaskTrackers it is the same: there are 2 kinds.
>>>>         ############################### LOG TYPE 1
>>>>         ############################################################
>>>>         2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>         2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>         2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>         2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>         2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>         2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>         2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>         2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>         2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>         2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>         2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>         2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>             at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>             at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>             at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>             at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>             at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>             at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>             at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>         Caused by: java.net.ConnectException: Connection refused
>>>>             at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>             at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>             at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>             at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>             at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>             at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>             at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>             at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>             ... 6 more
>>>>
>>>>
>>>>         ########################### LOG TYPE 2
>>>>         ############################################################
>>>>         2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>         /************************************************************
>>>>         STARTUP_MSG: Starting TaskTracker
>>>>         STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         STARTUP_MSG:   args = []
>>>>         STARTUP_MSG:   version = 1.0.2
>>>>         STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>         ************************************************************/
>>>>         2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>         2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>         2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>         2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>         2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>         2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>         2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>         2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>         2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>         2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>         2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>         2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>         2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>         2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>         2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>         2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>         2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>         2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>         2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>         2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>         2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>         2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>         2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>         2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>         2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>         2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>         2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>             at sun.nio.ch.Net.bind(Native Method)
>>>>             at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>             at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>             at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>             at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>             at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>             at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>
>>>>         2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>         /************************************************************
>>>>         SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         ************************************************************/
>>>>
>>>>
>>>
>>
>
>


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
OK, here is how I solved the namespace errors on the datanodes, the 
startup problems, and the communication problems between 
datanodes/tasktrackers and namenode/jobtracker:

As you can read on several sites, there are two strategies for fixing 
datanode namespaces. Since I prefer to delete the old state (that seems 
more reliable to me), I wrote this script, which can be called at any 
time to fix namespaces in an arbitrarily complex environment:

############ SCRIPT OVER HERE ##########
#!/bin/sh
~/hadoop-1.0.2/bin/stop-all.sh

rm -f curclean.sh

sleep 3

echo "#!/bin/sh" > curclean.sh
while read line
do
    echo "ssh $line 'rm -rf /home/work/bmacek/hadoop/hdfs/slave'" >> curclean.sh
done < "/home/fb16/bmacek/hadoop-1.0.2/conf/slaves"

chmod +x curclean.sh
/home/fb16/bmacek/curclean.sh

sleep 3

ssh $(< ~/hadoop-1.0.2/conf/namenode) "~/hadoop-1.0.2/bin/hadoop namenode -format"

#####################################

!!! WARNING ADAPT PATHS !!!
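Before pointing the script at a real cluster, its command-generation step can be dry-run against a throwaway slaves file and inspected instead of executed; a sketch (all paths and node names here are examples):

```shell
# Dry-run: generate the per-slave cleanup commands from a sample slaves
# file, then print them rather than running ssh.
printf 'node1\nnode2\n' > /tmp/slaves.sample
echo '#!/bin/sh' > /tmp/curclean.sample.sh
while read line
do
    echo "ssh $line 'rm -rf /home/work/bmacek/hadoop/hdfs/slave'" >> /tmp/curclean.sample.sh
done < /tmp/slaves.sample
cat /tmp/curclean.sample.sh
```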



The next two problems could be avoided by setting the following 
properties in mapred-site.xml:

############## FIX PORT PROBLEMS FOR SLAVES #############
    <property>
        <name>mapred.task.tracker.http.address</name>
        <value>0.0.0.0:0</value>
    </property>
    <property>
        <name>dfs.datanode.port</name>
        <value>0</value>
    </property>
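A truncated tag in these files (for instance a missing </value>) leaves the whole configuration unreadable by the daemons, so it is worth sanity-checking the XML before restarting anything; a sketch using Python's stdlib parser from the shell (the file path is an example):

```shell
# Write a sample config snippet, then verify it is well-formed XML.
cat > /tmp/mapred-site.sample.xml <<'EOF'
<configuration>
  <property>
    <name>mapred.task.tracker.http.address</name>
    <value>0.0.0.0:0</value>
  </property>
  <property>
    <name>dfs.datanode.port</name>
    <value>0</value>
  </property>
</configuration>
EOF
# ET.parse raises ParseError (non-zero exit) on malformed XML.
python3 -c "import xml.etree.ElementTree as ET; ET.parse('/tmp/mapred-site.sample.xml'); print('well-formed')"
```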



For people who are working with huge data, I strongly recommend also setting:
    <property>
        <name>mapred.task.timeout</name>
        <value>0</value>
    </property>
Otherwise your job might fail for reasons that you do not want 
influencing the job execution.


So much from me ... for now. ;)


Best regards and thanks for having a look into my problems here and there.
Björn


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hello again,

well, I have now ruled out nearly all my doubts that the communication 
problems are related to the infrastructure. Instead, in a new run of my 
program, I found that for some unknown and untracked reason the 
namenode and the jobtracker stop their services after too many failed 
map tasks. See the logs below.

From that time on, of course, the running datanodes/tasktrackers cannot 
communicate with the jobtracker/namenode.


What I do not understand is why the jobs do not answer or fail. I 
wanted to look it up in the logs, but somehow they do not contain 
anything from before 24:00/0:00 o'clock - a time at which the master(s) 
had already been dead for 2 hours.

Are there any suggestions? Maybe I did something wrong in the Mapper?

Regards,
Elmar

#################################### JOBLOG ... LAST LINES #################################
Task attempt_201208152128_0001_m_000007_1 failed to report status for 601 seconds. Killing!
12/08/15 21:50:10 INFO mapred.JobClient:  map 50% reduce 0%
12/08/15 21:50:12 INFO mapred.JobClient:  map 39% reduce 0%
12/08/15 21:50:13 INFO mapred.JobClient:  map 23% reduce 0%
12/08/15 21:50:14 INFO mapred.JobClient:  map 19% reduce 0%
12/08/15 21:50:15 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000014_1, Status : FAILED
Task attempt_201208152128_0001_m_000014_1 failed to report status for 602 seconds. Killing!
12/08/15 21:50:15 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000015_1, Status : FAILED
Task attempt_201208152128_0001_m_000015_1 failed to report status for 602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000004_1, Status : FAILED
Task attempt_201208152128_0001_m_000004_1 failed to report status for 602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000005_1, Status : FAILED
Task attempt_201208152128_0001_m_000005_1 failed to report status for 602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000012_1, Status : FAILED
Task attempt_201208152128_0001_m_000012_1 failed to report status for 602 seconds. Killing!
12/08/15 21:50:18 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000008_1, Status : FAILED
Task attempt_201208152128_0001_m_000008_1 failed to report status for 602 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000009_1, Status : FAILED
Task attempt_201208152128_0001_m_000009_1 failed to report status for 601 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000000_1, Status : FAILED
Task attempt_201208152128_0001_m_000000_1 failed to report status for 601 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000010_1, Status : FAILED
Task attempt_201208152128_0001_m_000010_1 failed to report status for 601 seconds. Killing!
12/08/15 21:50:20 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000002_1, Status : FAILED
Task attempt_201208152128_0001_m_000002_1 failed to report status for 601 seconds. Killing!
12/08/15 21:50:21 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000003_1, Status : FAILED
Task attempt_201208152128_0001_m_000003_1 failed to report status for 601 seconds. Killing!
12/08/15 21:50:22 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000013_1, Status : FAILED
Task attempt_201208152128_0001_m_000013_1 failed to report status for 602 seconds. Killing!
12/08/15 21:50:22 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000001_1, Status : FAILED
Task attempt_201208152128_0001_m_000001_1 failed to report status for 601 seconds. Killing!
12/08/15 21:50:23 INFO mapred.JobClient:  map 11% reduce 0%
12/08/15 21:50:25 INFO mapred.JobClient:  map 17% reduce 0%
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000006_1, Status : FAILED
Task attempt_201208152128_0001_m_000006_1 failed to report status for 602 seconds. Killing!
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000013_2, Status : FAILED
Task attempt_201208152128_0001_m_000013_2 failed to report status for 602 seconds. Killing!
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000001_2, Status : FAILED
Task attempt_201208152128_0001_m_000001_2 failed to report status for 602 seconds. Killing!
12/08/15 21:50:28 INFO mapred.JobClient:  map 40% reduce 0%
12/08/15 21:50:29 INFO mapred.JobClient: Task Id : attempt_201208152128_0001_m_000011_2, Status : FAILED
Task attempt_201208152128_0001_m_000011_2 failed to report status for 601 seconds. Killing!
12/08/15 21:50:30 INFO mapred.JobClient:  map 42% reduce 0%
12/08/15 21:50:31 INFO mapred.JobClient:  map 52% reduce 0%
12/08/15 21:50:33 INFO mapred.JobClient:  map 54% reduce 0%
12/08/15 21:50:37 INFO mapred.JobClient:  map 58% reduce 0%
12/08/15 21:50:39 INFO mapred.JobClient:  map 61% reduce 0%
12/08/15 21:50:42 INFO mapred.JobClient:  map 62% reduce 0%
12/08/15 21:50:46 INFO mapred.JobClient:  map 58% reduce 0%
12/08/15 21:50:55 INFO mapred.JobClient:  map 54% reduce 0%
12/08/15 21:50:57 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000011_1, Status : FAILED
Task attempt_201208152128_0001_m_000011_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:52:10 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000006_2, Status : FAILED
Task attempt_201208152128_0001_m_000006_2 failed to report status for 
602 seconds. Killing!
12/08/15 22:00:25 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000007_2, Status : FAILED
Task attempt_201208152128_0001_m_000007_2 failed to report status for 
602 seconds. Killing!
12/08/15 22:00:29 INFO mapred.JobClient:  map 50% reduce 0%
12/08/15 22:00:32 INFO mapred.JobClient:  map 46% reduce 0%
12/08/15 22:00:34 INFO mapred.JobClient:  map 39% reduce 0%
12/08/15 22:00:35 INFO mapred.JobClient:  map 38% reduce 0%
12/08/15 22:00:40 INFO mapred.JobClient: Job complete: job_201208152128_0001
12/08/15 22:00:40 INFO mapred.JobClient: Counters: 8
12/08/15 22:00:40 INFO mapred.JobClient:   Job Counters
12/08/15 22:00:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=8861935
12/08/15 22:00:40 INFO mapred.JobClient:     Total time spent by all 
reduces waiting after reserving slots (ms)=0
12/08/15 22:00:40 INFO mapred.JobClient:     Total time spent by all 
maps waiting after reserving slots (ms)=0
12/08/15 22:00:40 INFO mapred.JobClient:     Rack-local map tasks=52
12/08/15 22:00:40 INFO mapred.JobClient:     Launched map tasks=59
12/08/15 22:00:40 INFO mapred.JobClient:     Data-local map tasks=7
12/08/15 22:00:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
12/08/15 22:00:40 INFO mapred.JobClient:     Failed map tasks=1
12/08/15 22:00:40 INFO mapred.JobClient: Job Failed: # of failed Map 
Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: 
task_201208152128_0001_m_000007
java.io.IOException: Job failed!
     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
     at 
uni.kassel.macek.rtprep.RetweetApplication.run(RetweetApplication.java:76)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at 
uni.kassel.macek.rtprep.RetweetApplication.main(RetweetApplication.java:27)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:601)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
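[Editor's note on the failure mode above: in Hadoop 1.x a task attempt is killed when it neither emits output nor reports progress within mapred.task.timeout, which defaults to 600000 ms and matches the roughly 600 seconds in the "failed to report status" messages. If the maps legitimately run that long between outputs (rather than hanging on blocked network traffic), one workaround is raising the timeout; a sketch for mapred-site.xml, assuming the timeout itself, not connectivity, is the limiting factor:]

```xml
<!-- mapred-site.xml: raise the per-task heartbeat timeout.
     Default is 600000 ms (10 min); this only helps if the maps are
     genuinely slow, not if traffic to the jobtracker is blocked. -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
</property>
```

[The cleaner in-code fix is to call context.progress() (new API) or reporter.progress() (old API) periodically inside long-running map loops.]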

#################################### NAMENODE ... LAST LINES 
#################################
2012-08-15 21:44:32,976 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameSystem.processReport: from 141.51.205.30:50010, blocks: 3, 
processing time: 0 msecs
2012-08-15 21:48:44,346 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:48:44,346 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 21:50:04,135 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of 
transactions: 0 Total time for transactions(ms): 0Number of transactions 
batched in Syncs: 16 Number of syncs: 1 SyncTimes(ms): 45
2012-08-15 21:53:44,355 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:53:44,355 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 21:56:17,856 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameSystem.processReport: from 141.51.205.112:50010, blocks: 8, 
processing time: 0 msecs
2012-08-15 21:58:44,362 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:58:44,363 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 22:00:19,904 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of 
transactions: 0 Total time for transactions(ms): 0Number of transactions 
batched in Syncs: 36 Number of syncs: 1 SyncTimes(ms): 45
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.112:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.118:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.30:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.115:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.114:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.119:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.115:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.114:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.112:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.30:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.118:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.119:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_-2215699895714614889 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_-9153653706918013228 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.115:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.118:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.114:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:39,717 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar. 
blk_8780584073579865736_1010
2012-08-15 22:00:39,724 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 141.51.205.113:50010 is 
added to blk_8780584073579865736_1010 size 39356
2012-08-15 22:00:39,726 INFO org.apache.hadoop.hdfs.StateChange: 
Removing lease on  file 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
from client DFSClient_-1191648872
2012-08-15 22:00:39,727 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.completeFile: file 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
is closed by DFSClient_-1191648872
2012-08-15 22:00:39,750 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_2297057220333714498 is added to 
invalidSet of 141.51.205.114:50010
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 >>>>>>>>>>>>> Compare last time with above log<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

#################################### JOBTRACKER ... LAST LINES 
#################################

2012-08-15 22:00:33,295 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000008_2'
2012-08-15 22:00:33,296 INFO org.apache.hadoop.mapred.JobTracker: Adding 
task (TASK_CLEANUP) 'attempt_201208152128_0001_m_000008_2' to tip 
task_201208152128_0001_m_000008, for tracker 
'tracker_its-cs208.its.uni-kassel.de:localhost/127.0.0.1:58503'
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201208152128_0001_m_000007 has failed 4 times.
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.JobInProgress: 
Aborting job job_201208152128_0001
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.JobInProgress: 
Killing job 'job_201208152128_0001'
2012-08-15 22:00:33,697 INFO org.apache.hadoop.mapred.JobTracker: Adding 
task (JOB_CLEANUP) 'attempt_201208152128_0001_m_000016_0' to tip 
task_201208152128_0001_m_000016, for tracker 
'tracker_its-cs120.its.uni-kassel.de:localhost/127.0.0.1:52467'
2012-08-15 22:00:33,698 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000007_3'
2012-08-15 22:00:33,705 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000015_2: Task 
attempt_201208152128_0001_m_000015_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:33,706 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000015_2'
2012-08-15 22:00:36,200 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000003_2: Task 
attempt_201208152128_0001_m_000003_2 failed to report status for 600 
seconds. Killing!
2012-08-15 22:00:36,201 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000003_2'
2012-08-15 22:00:36,534 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000014_2'
2012-08-15 22:00:36,702 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000012_2: Task 
attempt_201208152128_0001_m_000012_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,703 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000012_2'
2012-08-15 22:00:36,708 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000009_2: Task 
attempt_201208152128_0001_m_000009_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,709 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000009_2'
2012-08-15 22:00:36,778 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000000_2: Task 
attempt_201208152128_0001_m_000000_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,779 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000000_2'
2012-08-15 22:00:36,779 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000010_2'
2012-08-15 22:00:37,024 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201208152128_0001_m_000007 has failed 4 times.
2012-08-15 22:00:37,085 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000007_4'
2012-08-15 22:00:37,085 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000014_3'
2012-08-15 22:00:38,758 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000002_2: Task 
attempt_201208152128_0001_m_000002_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:38,759 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000002_2'
2012-08-15 22:00:38,759 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000013_3'
2012-08-15 22:00:39,202 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000001_4'
2012-08-15 22:00:39,202 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000003_3'
2012-08-15 22:00:39,205 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000004_2'
2012-08-15 22:00:39,206 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000005_2'
2012-08-15 22:00:39,206 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000009_3'
2012-08-15 22:00:39,240 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000000_3'
2012-08-15 22:00:39,240 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000010_3'
2012-08-15 22:00:39,303 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000001_3'
2012-08-15 22:00:39,303 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000008_2'
2012-08-15 22:00:39,539 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000006_3'
2012-08-15 22:00:39,707 INFO org.apache.hadoop.mapred.JobInProgress: 
Task 'attempt_201208152128_0001_m_000016_0' has completed 
task_201208152128_0001_m_000016 successfully.
2012-08-15 22:00:39,712 INFO 
org.apache.hadoop.mapred.JobInProgress$JobSummary: 
jobId=job_201208152128_0001,submitTime=1345058961257,launchTime=1345058961719,firstMapTaskLaunchTime=1345058968763,firstJobSetupTaskLaunchTime=1345058962669,firstJobCleanupTaskLaunchTime=1345060833697,finishTime=1345060839709,numMaps=16,numSlotsPerMap=1,numReduces=1,numSlotsPerReduce=1,user=bmacek,queue=default,status=FAILED,mapSlotSeconds=8861,reduceSlotsSeconds=0,clusterMapCapacity=22,clusterReduceCapacity=22
2012-08-15 22:00:39,742 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000016_0'
2012-08-15 22:00:39,742 INFO org.apache.hadoop.mapred.JobHistory: 
Creating DONE subfolder at 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
2012-08-15 22:00:39,747 INFO org.apache.hadoop.mapred.JobHistory: Moving 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
to 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
2012-08-15 22:00:39,762 INFO org.apache.hadoop.mapred.JobHistory: Moving 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/job_201208152128_0001_conf.xml 
to 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 >>>>>>>>>>>>> Compare last time with above log<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
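[Editor's note: since the thread keeps circling back to whether specific ports are blocked, and the datanode logs show endless "Retrying connect to server: its-cs131/141.51.205.41:35554", a minimal TCP probe can separate the cases. This is a sketch; the host and port below merely stand in for whichever namenode/jobtracker endpoint fails in the logs:]

```python
import socket

def probe(host, port, timeout=5.0):
    """Try a TCP connect and classify the outcome."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"      # something is listening on that port
    except socket.timeout:
        return "filtered"  # no answer at all: likely a firewall drop
    except socket.error:
        return "refused"   # host reachable, but nothing bound to the port
    finally:
        s.close()

if __name__ == "__main__":
    # Hypothetical endpoint taken from the log lines above.
    print(probe("its-cs131", 35554))
```

[Run from a worker node: "refused" means the daemon is not listening on that port (wrong port or daemon down), while "filtered" points at iptables or another firewall between the nodes.]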





On 14.08.2012 at 13:25, Björn-Elmar Macek wrote:
> Hi Michael and Mohammad,
>
> thanks a lot for your input!
> I have pinged the people at the cluster so they can (eventually)
> disable IPv6 and definitely check the ports corresponding to the
> appropriate machines. I will keep you updated.
>
> Regards,
> Elmar
>
>
> On 13.08.2012 at 22:39, Michael Segel wrote:
>>
>> The key is to think about what can go wrong, but start with the low 
>> hanging fruit.
>>
>> I mean you could be right; however, you're jumping the gun and
>> overlooking simpler issues.
>>
>> The most common issue is that the networking traffic is being filtered.
>> Of course since we're both diagnosing this with minimal information, 
>> we're kind of shooting from the hip.
>>
>> This is why I'm asking if there is any networking traffic between the 
>> nodes.  If you have partial communication, then focus on why you 
>> can't see the specific traffic.
>>
>>
>> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>> Thank you so very much for the detailed response Michael. I'll keep 
>>> the tip in mind. Please pardon my ignorance, as I am still in the 
>>> learning phase.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel
>>> <michael_segel@hotmail.com> wrote:
>>>
>>>     0.0.0.0 means that the call is going to all interfaces on the
>>>     machine.  (Shouldn't be an issue...)
>>>
>>>     IPv4 vs IPv6? Could be an issue, however OP says he can write
>>>     data to DNs and they seem to communicate, therefore if it's IPv6
>>>     related, wouldn't it impact all traffic and not just a specific
>>>     port?
>>>     I agree... shut down IPv6 if you can.
>>>
>>>     I don't disagree with your assessment. I am just suggesting that
>>>     before you do a really deep dive, you think about the more
>>>     obvious stuff first.
>>>
>>>     There are a couple of other things... like do all of the
>>>     /etc/hosts files on all of the machines match?
>>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>>
>>>     BTW, you said DNS in your response. if you're using DNS, then
>>>     you don't really want to have much info in the /etc/hosts file
>>>     except loopback and the server's IP address.
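[Editor's note: as an illustration of that rule of thumb, a sketch of the kind of /etc/hosts Michael describes when DNS is authoritative. The address and FQDN are taken from the logs; the layout itself is an assumption:]

```
127.0.0.1       localhost
141.51.205.41   its-cs131.its.uni-kassel.de   its-cs131   # this host's own entry
```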
>>>
>>>     Looking at the problem OP is indicating some traffic works,
>>>     while other traffic doesn't. Most likely something is blocking
>>>     the ports. Iptables is the first place to look.
>>>
>>>     Just saying. ;-)
>>>
>>>
>>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>
>>>>     Hi Michael,
>>>>            I asked for the hosts file because it looks like some
>>>>     loopback problem to me. The log shows that the call is going to
>>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>>     and making sure that there is no problem with the DNS resolution
>>>>     is also necessary. Please correct me if I am wrong. Thank you.
>>>>
>>>>     Regards,
>>>>         Mohammad Tariq
>>>>
>>>>
>>>>
>>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel
>>>>     <michael_segel@hotmail.com> wrote:
>>>>
>>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>>
>>>>         Outside of MapR, multihomed machines can be problematic.
>>>>         Hadoop doesn't generally work well when you're not using
>>>>         the FQDN or its alias.
>>>>
>>>>         The issue isn't the SSH. If you go to the node which is
>>>>         having trouble connecting to another node, then try to
>>>>         ping it or some other general communication; if it
>>>>         succeeds, your issue is that the port you're trying to
>>>>         communicate with is blocked. Then it's more than likely an
>>>>         ipconfig or firewall issue.
>>>>
>>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>>>>         <ema@cs.uni-kassel.de> wrote:
>>>>
>>>>>         Hi Michael,
>>>>>
>>>>>         well, I can ssh from any node to any other without being
>>>>>         prompted. The reason for this is that my home dir is
>>>>>         mounted on every server in the cluster.
>>>>>
>>>>>         Whether the machines are multihomed, I don't know. I could
>>>>>         ask if this would be of importance.
>>>>>
>>>>>         Shall I?
>>>>>
>>>>>         Regards,
>>>>>         Elmar
>>>>>
>>>>>         On 13.08.12 at 14:59, Michael Segel wrote:
>>>>>>         If the nodes can communicate and distribute data, then
>>>>>>         the odds are that the issue isn't going to be in his
>>>>>>         /etc/hosts.
>>>>>>
>>>>>>         A more relevant question is whether he's running a firewall
>>>>>>         on each of these machines.
>>>>>>
>>>>>>         A simple test... ssh to one node, ping other nodes and
>>>>>>         the control nodes at random to see if they can see one
>>>>>>         another. Then check to see if there is a firewall running
>>>>>>         which would limit the types of traffic between nodes.
>>>>>>
>>>>>>         One other side note... are these machines multi-homed?
>>>>>>
>>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq
>>>>>>         <dontariq@gmail.com> wrote:
>>>>>>
>>>>>>>         Hello there,
>>>>>>>
>>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>>         don't mind.
>>>>>>>
>>>>>>>         Regards,
>>>>>>>          Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>>>>         <macek@cs.uni-kassel.de> wrote:
>>>>>>>
>>>>>>>             Hi,
>>>>>>>
>>>>>>>             i am currently trying to run my Hadoop program on a
>>>>>>>             cluster. Sadly, my datanodes and tasktrackers
>>>>>>>             seem to have difficulties with their communication,
>>>>>>>             as their logs say:
>>>>>>>             * Some datanodes and tasktrackers seem to have
>>>>>>>             port problems of some kind, as can be seen in the
>>>>>>>             logs below. I wondered if this might be due to
>>>>>>>             reasons correlated with the localhost entry in
>>>>>>>             /etc/hosts, as you can read in a lot of posts with
>>>>>>>             similar errors, but I checked the file: neither
>>>>>>>             localhost nor 127.0.0.1/127.0.1.1 is bound there.
>>>>>>>             (Although you can ping localhost... the technician
>>>>>>>             of the cluster said he'd be looking into the
>>>>>>>             mechanism resolving localhost.)
>>>>>>>             * The other nodes cannot speak with the namenode
>>>>>>>             and jobtracker (its-cs131), although it is
>>>>>>>             absolutely not clear why this is happening: the
>>>>>>>             "dfs -put" I do directly before the job runs
>>>>>>>             fine, which seems to imply that communication
>>>>>>>             between those servers is working flawlessly.
>>>>>>>
>>>>>>>             Is there any reason why this might happen?
>>>>>>>
>>>>>>>
>>>>>>>             Regards,
>>>>>>>             Elmar
>>>>>>>
>>>>>>>             LOGS BELOW:
>>>>>>>
>>>>>>>             \____Datanodes
>>>>>>>
>>>>>>>             After successfully putting the data to HDFS (at this
>>>>>>>             point I thought namenode and datanodes have to
>>>>>>>             communicate), I get the following errors when
>>>>>>>             starting the job:
>>>>>>>
>>>>>>>             There are 2 kinds of logs I found: the first one is
>>>>>>>             big (about 12 MB) and looks like this:
>>>>>>>             ############################### LOG TYPE 1
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 08:23:27,331 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>>             2012-08-13 08:23:28,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>>>>             2012-08-13 08:23:29,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>>>>             2012-08-13 08:23:30,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>>>>             2012-08-13 08:23:31,333 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>>>>             2012-08-13 08:23:32,333 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>>>>             2012-08-13 08:23:33,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>>>>             2012-08-13 08:23:34,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>>>>             2012-08-13 08:23:35,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>>>>             2012-08-13 08:23:36,335 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>>>>             2012-08-13 08:23:36,335 WARN
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             java.net.ConnectException: Call to
>>>>>>>             its-cs131/141.51.205.41:35554 failed on connection
>>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>                 at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>>                 at java.lang.Thread.run(Thread.java:619)
>>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>>                 ... 5 more
>>>>>>>
>>>>>>>             ... (this continues til the end of the log)
>>>>>>>
>>>>>>>             The second is short kind:
>>>>>>>             ########################### LOG TYPE 2
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 00:59:19,038 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             STARTUP_MSG:
>>>>>>>             /************************************************************
>>>>>>>             STARTUP_MSG: Starting DataNode
>>>>>>>             STARTUP_MSG: host =
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             STARTUP_MSG: args = []
>>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>>             STARTUP_MSG: build =
>>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>>             23:58:21 UTC 2012
>>>>>>>             ************************************************************/
>>>>>>>             2012-08-13 00:59:19,203 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig:
>>>>>>>             loaded properties from hadoop-metrics2.properties
>>>>>>>             2012-08-13 00:59:19,216 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source MetricsSystem,sub=Stats registered.
>>>>>>>             2012-08-13 00:59:19,217 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>>             2012-08-13 00:59:19,218 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             DataNode metrics system started
>>>>>>>             2012-08-13 00:59:19,306 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ugi registered.
>>>>>>>             2012-08-13 00:59:19,346 INFO
>>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>>             native-hadoop library
>>>>>>>             2012-08-13 00:59:20,482 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>             Storage directory
>>>>>>>             /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>             Formatting ...
>>>>>>>             2012-08-13 00:59:21,787 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             Registered FSDatasetStatusMBean
>>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>>             Shutting down all async disk service threads...
>>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>>             All async disk service threads have been shut down.
>>>>>>>             2012-08-13 00:59:21,898 ERROR
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             java.net.BindException: Problem binding to
>>>>>>>             /0.0.0.0:50010 : Address
>>>>>>>             already in use
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>>>             Caused by: java.net.BindException: Address already
>>>>>>>             in use
>>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>>                 ... 7 more
>>>>>>>
>>>>>>>             2012-08-13 00:59:21,899 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             SHUTDOWN_MSG:
>>>>>>>             /************************************************************
>>>>>>>             SHUTDOWN_MSG: Shutting down DataNode at
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             ************************************************************/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>             \_____TaskTracker
>>>>>>>             With TaskTrackers it is the same: there are 2 kinds.
>>>>>>>             ############################### LOG TYPE 1
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 02:09:54,645 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Resending
>>>>>>>             'status' to 'its-cs131' with reponseId '879
>>>>>>>             2012-08-13 02:09:55,646 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>>             2012-08-13 02:09:56,646 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>>>>             2012-08-13 02:09:57,647 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>>>>             2012-08-13 02:09:58,647 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>>>>             2012-08-13 02:09:59,648 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>>>>             2012-08-13 02:10:00,648 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>>>>             2012-08-13 02:10:01,649 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>>>>             2012-08-13 02:10:02,649 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>>>>             2012-08-13 02:10:03,650 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>>>>             2012-08-13 02:10:04,650 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>>>>             2012-08-13 02:10:04,651 ERROR
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Caught
>>>>>>>             exception: java.net.ConnectException: Call to
>>>>>>>             its-cs131/141.51.205.41:35555 failed on connection
>>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown
>>>>>>>             Source)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>>                 ... 6 more
>>>>>>>
>>>>>>>
>>>>>>>             ########################### LOG TYPE 2
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 00:59:24,376 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>>>             /************************************************************
>>>>>>>             STARTUP_MSG: Starting TaskTracker
>>>>>>>             STARTUP_MSG: host =
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             STARTUP_MSG: args = []
>>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>>             STARTUP_MSG: build =
>>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>>             23:58:21 UTC 2012
>>>>>>>             ************************************************************/
>>>>>>>             2012-08-13 00:59:24,569 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig:
>>>>>>>             loaded properties from hadoop-metrics2.properties
>>>>>>>             2012-08-13 00:59:24,626 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source MetricsSystem,sub=Stats registered.
>>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             TaskTracker metrics system started
>>>>>>>             2012-08-13 00:59:24,950 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ugi registered.
>>>>>>>             2012-08-13 00:59:25,146 INFO org.mortbay.log:
>>>>>>>             Logging to
>>>>>>>             org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log)
>>>>>>>             via org.mortbay.log.Slf4jLog
>>>>>>>             2012-08-13 00:59:25,206 INFO
>>>>>>>             org.apache.hadoop.http.HttpServer: Added global
>>>>>>>             filtersafety
>>>>>>>             (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>>>             2012-08-13 00:59:25,232 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskLogsTruncater:
>>>>>>>             Initializing logs' truncater with mapRetainSize=-1
>>>>>>>             and reduceRetainSize=-1
>>>>>>>             2012-08-13 00:59:25,237 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             tasktracker with owner as bmacek
>>>>>>>             2012-08-13 00:59:25,239 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Good mapred
>>>>>>>             local directories are:
>>>>>>>             /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>>>             2012-08-13 00:59:25,244 INFO
>>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>>             native-hadoop library
>>>>>>>             2012-08-13 00:59:25,255 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source jvm registered.
>>>>>>>             2012-08-13 00:59:25,256 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source TaskTrackerMetrics registered.
>>>>>>>             2012-08-13 00:59:25,279 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source RpcDetailedActivityForPort54850 registered.
>>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source RpcActivityForPort54850 registered.
>>>>>>>             2012-08-13 00:59:25,287 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server Responder:
>>>>>>>             starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server listener on
>>>>>>>             54850: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 0
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 1
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker up
>>>>>>>             at: localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 3
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 2
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             tracker
>>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:26,321 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>>             2012-08-13 00:59:38,104 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             thread: Map-events fetcher for all reduce tasks on
>>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:38,120 INFO
>>>>>>>             org.apache.hadoop.util.ProcessTree: setsid exited
>>>>>>>             with exit code 0
>>>>>>>             2012-08-13 00:59:38,134 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Using
>>>>>>>             ResourceCalculatorPlugin :
>>>>>>>             org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>>>             2012-08-13 00:59:38,137 WARN
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker's
>>>>>>>             totalMemoryAllottedForTasks is -1. TaskMemoryManager
>>>>>>>             is disabled.
>>>>>>>             2012-08-13 00:59:38,145 INFO
>>>>>>>             org.apache.hadoop.mapred.IndexCache: IndexCache
>>>>>>>             created with max memory = 10485760
>>>>>>>             2012-08-13 00:59:38,158 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ShuffleServerMetrics registered.
>>>>>>>             2012-08-13 00:59:38,161 INFO
>>>>>>>             org.apache.hadoop.http.HttpServer: Port returned by
>>>>>>>             webServer.getConnectors()[0].getLocalPort() before
>>>>>>>             open() is -1. Opening the listener on 50060
>>>>>>>             2012-08-13 00:59:38,161 ERROR
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Can not start
>>>>>>>             task tracker because java.net.BindException: Address
>>>>>>>             already in use
>>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>>                 at
>>>>>>>             org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>>
>>>>>>>             2012-08-13 00:59:38,163 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>>>             /************************************************************
>>>>>>>             SHUTDOWN_MSG: Shutting down TaskTracker at
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             ************************************************************/
>>>>>>>
>>>>>>>
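A side note on the BindException in both shutdown logs above: "Address already in use" on 50010 (DataNode data port) and 50060 (TaskTracker HTTP port) usually means a stale daemon from an earlier start attempt is still holding the port, so it is worth checking for leftover Hadoop processes (e.g. with jps) before restarting. The failure mode itself is easy to reproduce outside Hadoop; here is a minimal, self-contained sketch (BindConflictDemo is an illustrative name, not Hadoop code):

```java
import java.net.BindException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class BindConflictDemo {
    /** Returns true if binding the same port a second time fails. */
    static boolean conflictsOnSecondBind() throws Exception {
        // First "daemon" grabs an OS-assigned free port.
        ServerSocketChannel first = ServerSocketChannel.open();
        first.bind(new InetSocketAddress("127.0.0.1", 0));
        int port = ((InetSocketAddress) first.getLocalAddress()).getPort();

        // Second "daemon" tries the same port while the first still holds it,
        // just like a restarted DataNode on 50010 / TaskTracker on 50060.
        ServerSocketChannel second = ServerSocketChannel.open();
        try {
            second.bind(new InetSocketAddress("127.0.0.1", port));
            return false;
        } catch (BindException e) {
            return true;   // "Address already in use"
        } finally {
            second.close();
            first.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(conflictsOnSecondBind());   // expect: true
    }
}
```

Until the old process releases (or is killed and frees) the port, every restart of the daemon will die with exactly this exception.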


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hello again,

well, I have by now ruled out just about all doubts that the
communication problems are related to the infrastructure. Instead, I
found in a new execution of my program that, for some unknown and
untracked reason, the namenode and the jobtracker stop their services
after too many failed map tasks. See the logs below.

From that time on, of course, the running datanodes/tasktrackers cannot
communicate with the jobtracker/namenode.


What I do not understand is why the tasks neither report progress nor
fail cleanly. I wanted to look it up in the logs, but somehow they do
not contain anything from times prior to 24:00/0:00 - a time at which
the master(s) had already been dead for 2 hours.

Are there any suggestions? Maybe I did something wrong in the Mapper?

Regards,
Elmar
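On the Mapper question: the framework kills any task that neither emits output nor reports progress for mapred.task.timeout milliseconds (600 000 by default), which matches the "failed to report status for 60x seconds. Killing!" lines in the job log below. A map() that does long computations per record has to ping the framework itself. An untested sketch against the old mapred API used with JobClient.runJob - the method names for the per-record work are placeholders, not taken from the actual job:

```java
public void map(LongWritable key, Text value,
                OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {
    while (moreExpensiveWorkToDo()) {   // placeholder for long per-record work
        doNextChunkOfWork();            // placeholder
        // Heartbeat to the TaskTracker so the task is not killed
        // after mapred.task.timeout ms of silence:
        reporter.progress();
    }
    output.collect(outKey, outValue);
}
```

If the per-record work is legitimately that slow, the timeout can also be raised in mapred-site.xml (value in milliseconds; 0 disables the check entirely):

```xml
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
</property>
```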

#################################### JOBLOG ... LAST LINES 
#################################
Task attempt_201208152128_0001_m_000007_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:10 INFO mapred.JobClient:  map 50% reduce 0%
12/08/15 21:50:12 INFO mapred.JobClient:  map 39% reduce 0%
12/08/15 21:50:13 INFO mapred.JobClient:  map 23% reduce 0%
12/08/15 21:50:14 INFO mapred.JobClient:  map 19% reduce 0%
12/08/15 21:50:15 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000014_1, Status : FAILED
Task attempt_201208152128_0001_m_000014_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:15 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000015_1, Status : FAILED
Task attempt_201208152128_0001_m_000015_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000004_1, Status : FAILED
Task attempt_201208152128_0001_m_000004_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000005_1, Status : FAILED
Task attempt_201208152128_0001_m_000005_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000012_1, Status : FAILED
Task attempt_201208152128_0001_m_000012_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:18 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000008_1, Status : FAILED
Task attempt_201208152128_0001_m_000008_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000009_1, Status : FAILED
Task attempt_201208152128_0001_m_000009_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000000_1, Status : FAILED
Task attempt_201208152128_0001_m_000000_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000010_1, Status : FAILED
Task attempt_201208152128_0001_m_000010_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:20 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000002_1, Status : FAILED
Task attempt_201208152128_0001_m_000002_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:21 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000003_1, Status : FAILED
Task attempt_201208152128_0001_m_000003_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:22 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000013_1, Status : FAILED
Task attempt_201208152128_0001_m_000013_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:22 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000001_1, Status : FAILED
Task attempt_201208152128_0001_m_000001_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:23 INFO mapred.JobClient:  map 11% reduce 0%
12/08/15 21:50:25 INFO mapred.JobClient:  map 17% reduce 0%
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000006_1, Status : FAILED
Task attempt_201208152128_0001_m_000006_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000013_2, Status : FAILED
Task attempt_201208152128_0001_m_000013_2 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000001_2, Status : FAILED
Task attempt_201208152128_0001_m_000001_2 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:28 INFO mapred.JobClient:  map 40% reduce 0%
12/08/15 21:50:29 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000011_2, Status : FAILED
Task attempt_201208152128_0001_m_000011_2 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:30 INFO mapred.JobClient:  map 42% reduce 0%
12/08/15 21:50:31 INFO mapred.JobClient:  map 52% reduce 0%
12/08/15 21:50:33 INFO mapred.JobClient:  map 54% reduce 0%
12/08/15 21:50:37 INFO mapred.JobClient:  map 58% reduce 0%
12/08/15 21:50:39 INFO mapred.JobClient:  map 61% reduce 0%
12/08/15 21:50:42 INFO mapred.JobClient:  map 62% reduce 0%
12/08/15 21:50:46 INFO mapred.JobClient:  map 58% reduce 0%
12/08/15 21:50:55 INFO mapred.JobClient:  map 54% reduce 0%
12/08/15 21:50:57 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000011_1, Status : FAILED
Task attempt_201208152128_0001_m_000011_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:52:10 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000006_2, Status : FAILED
Task attempt_201208152128_0001_m_000006_2 failed to report status for 
602 seconds. Killing!
12/08/15 22:00:25 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000007_2, Status : FAILED
Task attempt_201208152128_0001_m_000007_2 failed to report status for 
602 seconds. Killing!
12/08/15 22:00:29 INFO mapred.JobClient:  map 50% reduce 0%
12/08/15 22:00:32 INFO mapred.JobClient:  map 46% reduce 0%
12/08/15 22:00:34 INFO mapred.JobClient:  map 39% reduce 0%
12/08/15 22:00:35 INFO mapred.JobClient:  map 38% reduce 0%
12/08/15 22:00:40 INFO mapred.JobClient: Job complete: job_201208152128_0001
12/08/15 22:00:40 INFO mapred.JobClient: Counters: 8
12/08/15 22:00:40 INFO mapred.JobClient:   Job Counters
12/08/15 22:00:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=8861935
12/08/15 22:00:40 INFO mapred.JobClient:     Total time spent by all 
reduces waiting after reserving slots (ms)=0
12/08/15 22:00:40 INFO mapred.JobClient:     Total time spent by all 
maps waiting after reserving slots (ms)=0
12/08/15 22:00:40 INFO mapred.JobClient:     Rack-local map tasks=52
12/08/15 22:00:40 INFO mapred.JobClient:     Launched map tasks=59
12/08/15 22:00:40 INFO mapred.JobClient:     Data-local map tasks=7
12/08/15 22:00:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
12/08/15 22:00:40 INFO mapred.JobClient:     Failed map tasks=1
12/08/15 22:00:40 INFO mapred.JobClient: Job Failed: # of failed Map 
Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: 
task_201208152128_0001_m_000007
java.io.IOException: Job failed!
     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
     at 
uni.kassel.macek.rtprep.RetweetApplication.run(RetweetApplication.java:76)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at 
uni.kassel.macek.rtprep.RetweetApplication.main(RetweetApplication.java:27)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:601)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

#################################### NAMENODE ... LAST LINES 
#################################
2012-08-15 21:44:32,976 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameSystem.processReport: from 141.51.205.30:50010, blocks: 3, 
processing time: 0 msecs
2012-08-15 21:48:44,346 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:48:44,346 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 21:50:04,135 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of 
transactions: 0 Total time for transactions(ms): 0Number of transactions 
batched in Syncs: 16 Number of syncs: 1 SyncTimes(ms): 45
2012-08-15 21:53:44,355 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:53:44,355 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 21:56:17,856 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameSystem.processReport: from 141.51.205.112:50010, blocks: 8, 
processing time: 0 msecs
2012-08-15 21:58:44,362 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:58:44,363 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 22:00:19,904 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of 
transactions: 0 Total time for transactions(ms): 0Number of transactions 
batched in Syncs: 36 Number of syncs: 1 SyncTimes(ms): 45
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.112:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.118:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.30:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.115:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.114:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.119:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.115:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.114:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.112:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.30:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.118:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.119:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_-2215699895714614889 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_-9153653706918013228 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.115:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.118:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.114:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:39,717 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar. 
blk_8780584073579865736_1010
2012-08-15 22:00:39,724 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 141.51.205.113:50010 is 
added to blk_8780584073579865736_1010 size 39356
2012-08-15 22:00:39,726 INFO org.apache.hadoop.hdfs.StateChange: 
Removing lease on  file 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
from client DFSClient_-1191648872
2012-08-15 22:00:39,727 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.completeFile: file 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
is closed by DFSClient_-1191648872
2012-08-15 22:00:39,750 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_2297057220333714498 is added to 
invalidSet of 141.51.205.114:50010
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 >>>>>>>>>>>>> Compare last time with above log<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

#################################### JOBTRACKER ... LAST LINES 
#################################

2012-08-15 22:00:33,295 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000008_2'
2012-08-15 22:00:33,296 INFO org.apache.hadoop.mapred.JobTracker: Adding 
task (TASK_CLEANUP) 'attempt_201208152128_0001_m_000008_2' to tip 
task_201208152128_0001_m_000008, for tracker 
'tracker_its-cs208.its.uni-kassel.de:localhost/127.0.0.1:58503'
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201208152128_0001_m_000007 has failed 4 times.
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.JobInProgress: 
Aborting job job_201208152128_0001
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.JobInProgress: 
Killing job 'job_201208152128_0001'
2012-08-15 22:00:33,697 INFO org.apache.hadoop.mapred.JobTracker: Adding 
task (JOB_CLEANUP) 'attempt_201208152128_0001_m_000016_0' to tip 
task_201208152128_0001_m_000016, for tracker 
'tracker_its-cs120.its.uni-kassel.de:localhost/127.0.0.1:52467'
2012-08-15 22:00:33,698 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000007_3'
2012-08-15 22:00:33,705 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000015_2: Task 
attempt_201208152128_0001_m_000015_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:33,706 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000015_2'
2012-08-15 22:00:36,200 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000003_2: Task 
attempt_201208152128_0001_m_000003_2 failed to report status for 600 
seconds. Killing!
2012-08-15 22:00:36,201 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000003_2'
2012-08-15 22:00:36,534 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000014_2'
2012-08-15 22:00:36,702 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000012_2: Task 
attempt_201208152128_0001_m_000012_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,703 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000012_2'
2012-08-15 22:00:36,708 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000009_2: Task 
attempt_201208152128_0001_m_000009_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,709 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000009_2'
2012-08-15 22:00:36,778 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000000_2: Task 
attempt_201208152128_0001_m_000000_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,779 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000000_2'
2012-08-15 22:00:36,779 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000010_2'
2012-08-15 22:00:37,024 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201208152128_0001_m_000007 has failed 4 times.
2012-08-15 22:00:37,085 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000007_4'
2012-08-15 22:00:37,085 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000014_3'
2012-08-15 22:00:38,758 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000002_2: Task 
attempt_201208152128_0001_m_000002_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:38,759 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000002_2'
2012-08-15 22:00:38,759 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000013_3'
2012-08-15 22:00:39,202 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000001_4'
2012-08-15 22:00:39,202 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000003_3'
2012-08-15 22:00:39,205 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000004_2'
2012-08-15 22:00:39,206 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000005_2'
2012-08-15 22:00:39,206 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000009_3'
2012-08-15 22:00:39,240 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000000_3'
2012-08-15 22:00:39,240 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000010_3'
2012-08-15 22:00:39,303 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000001_3'
2012-08-15 22:00:39,303 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000008_2'
2012-08-15 22:00:39,539 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000006_3'
2012-08-15 22:00:39,707 INFO org.apache.hadoop.mapred.JobInProgress: 
Task 'attempt_201208152128_0001_m_000016_0' has completed 
task_201208152128_0001_m_000016 successfully.
2012-08-15 22:00:39,712 INFO 
org.apache.hadoop.mapred.JobInProgress$JobSummary: 
jobId=job_201208152128_0001,submitTime=1345058961257,launchTime=1345058961719,firstMapTaskLaunchTime=1345058968763,firstJobSetupTaskLaunchTime=1345058962669,firstJobCleanupTaskLaunchTime=1345060833697,finishTime=1345060839709,numMaps=16,numSlotsPerMap=1,numReduces=1,numSlotsPerReduce=1,user=bmacek,queue=default,status=FAILED,mapSlotSeconds=8861,reduceSlotsSeconds=0,clusterMapCapacity=22,clusterReduceCapacity=22
2012-08-15 22:00:39,742 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000016_0'
2012-08-15 22:00:39,742 INFO org.apache.hadoop.mapred.JobHistory: 
Creating DONE subfolder at 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
2012-08-15 22:00:39,747 INFO org.apache.hadoop.mapred.JobHistory: Moving 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
to 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
2012-08-15 22:00:39,762 INFO org.apache.hadoop.mapred.JobHistory: Moving 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/job_201208152128_0001_conf.xml 
to 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 >>>>>>>>>>>>> Compare last time with above log<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>





On 14.08.2012 13:25, Björn-Elmar Macek wrote:
> Hi Michael and Mohammad,
>
> thanks a lot for your input!
> I have pinged the people at the cluster to (eventually) disable IPv6 
> and to definitely check the ports corresponding to the appropriate 
> machines. I will keep you updated.
>
> Regards,
> Elmar
>
>
> On 13.08.2012 22:39, Michael Segel wrote:
>>
>> The key is to think about what can go wrong, but start with the 
>> low-hanging fruit.
>>
>> I mean, you could be right; however, you're jumping the gun and are 
>> overlooking simpler issues.
>>
>> The most common issue is that the networking traffic is being filtered.
>> Of course since we're both diagnosing this with minimal information, 
>> we're kind of shooting from the hip.
>>
>> This is why I'm asking if there is any networking traffic between the 
>> nodes.  If you have partial communication, then focus on why you 
>> can't see the specific traffic.
>>
>>
>> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>> Thank you so very much for the detailed response Michael. I'll keep 
>>> the tip in mind. Please pardon my ignorance, as I am still in the 
>>> learning phase.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <michael_segel@hotmail.com> wrote:
>>>
>>>     0.0.0.0 means that the call is going to all interfaces on the
>>>     machine.  (Shouldn't be an issue...)
>>>
>>>     IPv4 vs IPv6? Could be an issue, however OP says he can write
>>>     data to DNs and they seem to communicate, therefore if its IPv6
>>>     related, wouldn't it impact all traffic and not just a specific
>>>     port?
>>>     I agree... shut down IPv6 if you can.
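[Editor's note on the IPv6 advice above: if the admins cannot disable IPv6 system-wide, a commonly used per-daemon workaround is to make the Hadoop JVMs prefer the IPv4 stack. A minimal sketch, assuming the Hadoop 1.0.x layout the logs show, with its conf/hadoop-env.sh:]

```shell
# Sketch, not verified on this cluster: append to conf/hadoop-env.sh on
# every node, then restart the daemons. The flag tells the JVM to use
# IPv4 sockets even when IPv6 is enabled on the host.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
```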
>>>
>>>     I don't disagree with your assessment. I am just suggesting that
>>>     before you do a really deep dive, you think about the more
>>>     obvious stuff first.
>>>
>>>     There are a couple of other things... like do all of the
>>>     /etc/hosts files on all of the machines match?
>>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>>
>>>     BTW, you said DNS in your response. if you're using DNS, then
>>>     you don't really want to have much info in the /etc/hosts file
>>>     except loopback and the server's IP address.
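[Editor's note: the /etc/hosts-vs-DNS consistency check described above can be automated by dumping each node's resolver view and diffing the output across machines. A minimal sketch; the two hostnames are just the ones mentioned in this thread and stand in for the real node list:]

```python
#!/usr/bin/env python3
"""Print what the local resolver (DNS and/or /etc/hosts) returns for each
cluster hostname. Run on every node and diff the output between nodes."""
import socket

# Assumption: replace with the real master/slave host list.
HOSTS = ["its-cs131", "its-cs133.its.uni-kassel.de"]

def resolve(name):
    """Return the sorted IPv4 addresses for `name`, or an error marker."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET)
        return sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        return ["unresolvable: %s" % exc]

if __name__ == "__main__":
    for host in HOSTS:
        print(host, "->", resolve(host))
```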
>>>
>>>     Looking at the problem OP is indicating some traffic works,
>>>     while other traffic doesn't. Most likely something is blocking
>>>     the ports. Iptables is the first place to look.
>>>
>>>     Just saying. ;-)
>>>
>>>
>>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>
>>>>     Hi Michael,
>>>>            I asked for hosts file because there seems to be some
>>>>     loopback prob to me. The log shows that call is going at
>>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>>     and making sure that there is no prob with the DNS resolution
>>>>     is also necessary. Please correct me if I am wrong. Thank you.
>>>>
>>>>     Regards,
>>>>         Mohammad Tariq
>>>>
>>>>
>>>>
>>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <michael_segel@hotmail.com> wrote:
>>>>
>>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>>
>>>>         Outside of MapR, multihomed machines can be problematic.
>>>>         Hadoop doesn't generally work well when you're not using
>>>>         the FQDN or its alias.
>>>>
>>>>         The issue isn't the SSH, but if you go to the node which is
>>>>         having trouble connecting to another node,  then try to
>>>>         ping it, or some other general communication,  if it
>>>>         succeeds, your issue is that the port you're trying to
>>>>         communicate with is blocked.  Then its more than likely an
>>>>         ipconfig or firewall issue.
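[Editor's note: the ping-vs-port distinction above can be scripted. The sketch below reports whether a TCP connect succeeds, is refused (host up, nothing listening on that port, which matches the "Connection refused" in the logs), or times out (typically filtering). The its-cs131:35554 address is just the NameNode RPC address from the logs:]

```python
#!/usr/bin/env python3
"""Distinguish "host reachable but port closed" (the 'Connection refused'
in the logs) from "traffic filtered" (a connect attempt that times out)."""
import socket

def probe(host, port, timeout=3.0):
    """Return 'open', 'refused', or 'filtered/unreachable' for host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"                 # host answered; nothing listening
    except OSError:
        return "filtered/unreachable"    # timeout, firewall drop, routing, ...

if __name__ == "__main__":
    # Assumption: host/port taken from the retry lines in the logs.
    print("its-cs131:35554 ->", probe("its-cs131", 35554))
```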
>>>>
>>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <ema@cs.uni-kassel.de> wrote:
>>>>
>>>>>         Hi Michael,
>>>>>
>>>>>         well I can ssh from any node to any other without being 
>>>>>         prompted. The reason for this is that my home dir is 
>>>>>         mounted on every server in the cluster.
>>>>>
>>>>>         If the machines are multihomed: I don't know. I could ask 
>>>>>         if this is of importance.
>>>>>
>>>>>         Shall i?
>>>>>
>>>>>         Regards,
>>>>>         Elmar
>>>>>
>>>>>         On 13.08.12 14:59, Michael Segel wrote:
>>>>>>         If the nodes can communicate and distribute data, then
>>>>>>         the odds are that the issue isn't going to be in his
>>>>>>         /etc/hosts.
>>>>>>
>>>>>>         A more relevant question is if he's running a firewall on
>>>>>>         each of these machines?
>>>>>>
>>>>>>         A simple test... ssh to one node, ping other nodes and
>>>>>>         the control nodes at random to see if they can see one
>>>>>>         another. Then check to see if there is a firewall running
>>>>>>         which would limit the types of traffic between nodes.
>>>>>>
>>>>>>         One other side note... are these machines multi-homed?
>>>>>>
>>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>>
>>>>>>>         Hello there,
>>>>>>>
>>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>>         don't mind.
>>>>>>>
>>>>>>>         Regards,
>>>>>>>          Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <macek@cs.uni-kassel.de> wrote:
>>>>>>>
>>>>>>>             Hi,
>>>>>>>
>>>>>>>             I am currently trying to run my Hadoop program on a
>>>>>>>             cluster. Sadly though, my datanodes and tasktrackers
>>>>>>>             seem to have difficulties with their communication,
>>>>>>>             as their logs say:
>>>>>>>             * Some datanodes and tasktrackers seem to have port
>>>>>>>             problems of some kind, as can be seen in the logs
>>>>>>>             below. I wondered if this might be due to reasons
>>>>>>>             correlated with the localhost entry in /etc/hosts,
>>>>>>>             as you can read in a lot of posts with similar
>>>>>>>             errors, but I checked the file: neither localhost
>>>>>>>             nor 127.0.0.1/127.0.1.1 is bound there. (Although
>>>>>>>             you can ping localhost... the technician of the
>>>>>>>             cluster said he'd be looking for the mechanism
>>>>>>>             resolving localhost.)
>>>>>>>             * The other nodes cannot speak with the namenode
>>>>>>>             and jobtracker (its-cs131), though it is absolutely
>>>>>>>             unclear why this is happening: the "dfs -put" I do
>>>>>>>             directly before the job runs fine, which seems to
>>>>>>>             imply that communication between those servers is
>>>>>>>             working flawlessly.
>>>>>>>
>>>>>>>             Is there any reason why this might happen?
>>>>>>>
>>>>>>>
>>>>>>>             Regards,
>>>>>>>             Elmar
>>>>>>>
>>>>>>>             LOGS BELOW:
>>>>>>>
>>>>>>>             \____Datanodes
>>>>>>>
>>>>>>>             After successfully putting the data to HDFS (at
>>>>>>>             this point I thought namenode and datanodes have to
>>>>>>>             communicate), I get the following errors when
>>>>>>>             starting the job:
>>>>>>>
>>>>>>>             There are 2 kinds of logs I found: the first one is
>>>>>>>             big (about 12 MB) and looks like this:
>>>>>>>             ############################### LOG TYPE 1
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>>             2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>>>>             2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>>>>             2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>>>>             2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>>>>             2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>>>>             2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>>>>             2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>>>>             2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>>>>             2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>>>>             2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>>>                 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>                 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>                 at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>>                 at java.lang.Thread.run(Thread.java:619)
>>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>>                 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>                 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>>                 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>>                 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>>                 ... 5 more
>>>>>>>
>>>>>>>             ... (this continues until the end of the log)
>>>>>>>
>>>>>>>             The second is short kind:
>>>>>>>             ########################### LOG TYPE 2
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 00:59:19,038 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             STARTUP_MSG:
>>>>>>>             /************************************************************
>>>>>>>             STARTUP_MSG: Starting DataNode
>>>>>>>             STARTUP_MSG: host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             STARTUP_MSG: args = []
>>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>>             STARTUP_MSG: build =
>>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>>             23:58:21 UTC 2012
>>>>>>>             ************************************************************/
>>>>>>>             2012-08-13 00:59:19,203 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig:
>>>>>>>             loaded properties from hadoop-metrics2.properties
>>>>>>>             2012-08-13 00:59:19,216 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source MetricsSystem,sub=Stats registered.
>>>>>>>             2012-08-13 00:59:19,217 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>>             2012-08-13 00:59:19,218 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             DataNode metrics system started
>>>>>>>             2012-08-13 00:59:19,306 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ugi registered.
>>>>>>>             2012-08-13 00:59:19,346 INFO
>>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>>             native-hadoop library
>>>>>>>             2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>             Storage directory
>>>>>>>             /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>             Formatting ...
>>>>>>>             2012-08-13 00:59:21,787 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             Registered FSDatasetStatusMBean
>>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>>             Shutting down all async disk service threads...
>>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>>             All async disk service threads have been shut down.
>>>>>>>             2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>>>             Caused by: java.net.BindException: Address already in use
>>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>>                 ... 7 more
>>>>>>>
>>>>>>>             2012-08-13 00:59:21,899 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             SHUTDOWN_MSG:
>>>>>>>             /************************************************************
>>>>>>>             SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             ************************************************************/
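[Editor's note: the BindException in LOG TYPE 2 above ("Address already in use" on 0.0.0.0:50010) usually means something, often a leftover DataNode from a previous start, still holds the data-transfer port. A hedged sketch of a pre-start check follows; 50010 is the default dfs.datanode.address port seen in the log, and the netstat hint in the output is a suggestion, not taken from the thread:]

```python
#!/usr/bin/env python3
"""Check whether the DataNode's data-transfer port can be bound, i.e.
reproduce the java.net.BindException from the log without starting the
daemon."""
import errno
import socket

def port_free(port, host="0.0.0.0"):
    """Attempt the same wildcard bind the DataNode does; True if it works."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False                 # e.g. a leftover DataNode process
        raise
    finally:
        sock.close()

if __name__ == "__main__":
    if not port_free(50010):
        print("50010 busy; find the holder, e.g. netstat -tlnp | grep 50010")
```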
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>             \_____TastTracker
>>>>>>>             With TaskTrackers it is the same: there are 2 kinds.
>>>>>>>             ############################### LOG TYPE 1
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 02:09:54,645 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Resending
>>>>>>>             'status' to 'its-cs131' with reponseId '879
>>>>>>>             2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>>             2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>>>>             2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>>>>             2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>>>>             2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>>>>             2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>>>>             2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>>>>             2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>>>>             2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>>>>             2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>>>>             2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>>>                 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>                 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>                 at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>>                 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>                 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>>                 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>>                 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>>                 ... 6 more
>>>>>>>
>>>>>>>
>>>>>>>             ########################### LOG TYPE 2
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 00:59:24,376 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>>>             /************************************************************
>>>>>>>             STARTUP_MSG: Starting TaskTracker
>>>>>>>             STARTUP_MSG: host =
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             STARTUP_MSG: args = []
>>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>>             STARTUP_MSG: build =
>>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>>             23:58:21 UTC 2012
>>>>>>>             ************************************************************/
>>>>>>>             2012-08-13 00:59:24,569 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig:
>>>>>>>             loaded properties from hadoop-metrics2.properties
>>>>>>>             2012-08-13 00:59:24,626 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source MetricsSystem,sub=Stats registered.
>>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             TaskTracker metrics system started
>>>>>>>             2012-08-13 00:59:24,950 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ugi registered.
>>>>>>>             2012-08-13 00:59:25,146 INFO org.mortbay.log:
>>>>>>>             Logging to
>>>>>>>             org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log)
>>>>>>>             via org.mortbay.log.Slf4jLog
>>>>>>>             2012-08-13 00:59:25,206 INFO
>>>>>>>             org.apache.hadoop.http.HttpServer: Added global
>>>>>>>             filtersafety
>>>>>>>             (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>>>             2012-08-13 00:59:25,232 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskLogsTruncater:
>>>>>>>             Initializing logs' truncater with mapRetainSize=-1
>>>>>>>             and reduceRetainSize=-1
>>>>>>>             2012-08-13 00:59:25,237 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             tasktracker with owner as bmacek
>>>>>>>             2012-08-13 00:59:25,239 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Good mapred
>>>>>>>             local directories are:
>>>>>>>             /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>>>             2012-08-13 00:59:25,244 INFO
>>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>>             native-hadoop library
>>>>>>>             2012-08-13 00:59:25,255 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source jvm registered.
>>>>>>>             2012-08-13 00:59:25,256 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source TaskTrackerMetrics registered.
>>>>>>>             2012-08-13 00:59:25,279 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source RpcDetailedActivityForPort54850 registered.
>>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source RpcActivityForPort54850 registered.
>>>>>>>             2012-08-13 00:59:25,287 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server Responder:
>>>>>>>             starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server listener on
>>>>>>>             54850: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 0
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 1
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker up
>>>>>>>             at: localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 3
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 2
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             tracker
>>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:26,321 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>>             2012-08-13 00:59:38,104 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             thread: Map-events fetcher for all reduce tasks on
>>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:38,120 INFO
>>>>>>>             org.apache.hadoop.util.ProcessTree: setsid exited
>>>>>>>             with exit code 0
>>>>>>>             2012-08-13 00:59:38,134 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Using
>>>>>>>             ResourceCalculatorPlugin :
>>>>>>>             org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>>>             2012-08-13 00:59:38,137 WARN
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker's
>>>>>>>             totalMemoryAllottedForTasks is -1. TaskMemoryManager
>>>>>>>             is disabled.
>>>>>>>             2012-08-13 00:59:38,145 INFO
>>>>>>>             org.apache.hadoop.mapred.IndexCache: IndexCache
>>>>>>>             created with max memory = 10485760
>>>>>>>             2012-08-13 00:59:38,158 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ShuffleServerMetrics registered.
>>>>>>>             2012-08-13 00:59:38,161 INFO
>>>>>>>             org.apache.hadoop.http.HttpServer: Port returned by
>>>>>>>             webServer.getConnectors()[0].getLocalPort() before
>>>>>>>             open() is -1. Opening the listener on 50060
>>>>>>>             2012-08-13 00:59:38,161 ERROR
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Can not start
>>>>>>>             task tracker because java.net.BindException: Address
>>>>>>>             already in use
>>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>>                 at
>>>>>>>             org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>>
>>>>>>>             2012-08-13 00:59:38,163 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>>>             /************************************************************
>>>>>>>             SHUTDOWN_MSG: Shutting down TaskTracker at
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             ************************************************************/
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
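[Editor's note: the BindException in LOG TYPE 2 above ("Address already in use" when opening 50060) means another process, often a leftover TaskTracker from a previous run, still holds the HTTP port. The failure mode itself is plain TCP, not Hadoop-specific; a minimal sketch that reproduces it with raw sockets (the helper name is mine, not Hadoop API):]

```python
import socket

def bind_or_error(port, host="127.0.0.1"):
    """Try to bind host:port; return (socket, None) on success or (None, error) on failure."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return s, None
    except OSError as e:
        s.close()
        return None, e

# Reproduce the TaskTracker's failure: a second bind on an occupied port.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))     # let the OS pick a free port (50060 in the log)
holder.listen(1)
port = holder.getsockname()[1]

sock, err = bind_or_error(port)   # fails: the OS reports "Address already in use"
```

On the cluster node itself, `netstat -tlnp` (or `lsof -i :50060`) identifies which process holds the port before the TaskTracker is restarted.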


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hello again,

I have now more or less ruled out the infrastructure as the cause of the
communication problems. Instead, in a new run of my program I found that,
for reasons I could not yet trace, the namenode and the jobtracker stop
their services after too many map tasks have failed. See the logs below.

From that time on, of course, the running datanodes/tasktrackers cannot
communicate with the jobtracker/namenode.
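[Editor's note: the repeated "Connection refused" retries in the logs mean that, from the worker's point of view, nothing is listening on the master's RPC port. A quick way to confirm this from a tasktracker node is a plain TCP connect check; a minimal sketch (the helper name is mine; host and port are taken from the log):]

```python
import socket

def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:   # refused, timed out, or unresolvable
        return False

# From a worker node one would check the jobtracker's RPC port, e.g.:
# port_reachable("its-cs131", 35555)
```

If this returns False while `ping its-cs131` works, the RPC port specifically is closed or filtered (the daemon died, or iptables is blocking it), which matches the pattern of `dfs -put` succeeding while heartbeats fail.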


What I do not understand is why the tasks neither report status nor fail
outright. I wanted to look this up in the logs, but somehow they contain
nothing from before midnight (24:00/0:00), by which time the master(s)
had already been dead for two hours.

Are there any suggestions? Did I perhaps do something wrong in the Mapper?
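[Editor's note: a map task that computes for longer than the task timeout (600 s by default) without reporting progress is killed exactly as in the JOBLOG below. In a Java Mapper the fix is to call Reporter.progress()/context.progress() periodically inside long loops; with Hadoop Streaming the equivalent is writing "reporter:status:..." lines to stderr. A sketch of the streaming variant, since this is the mechanism easiest to show outside Java (the function name and record interval are illustrative):]

```python
import sys

def run_mapper(lines, out=sys.stdout, err=sys.stderr, every=1000):
    """Pass records through while emitting a status line every `every` records.
    Hadoop Streaming forwards 'reporter:status:...' lines on stderr to the
    framework, which resets the task's progress timer."""
    n = 0
    for line in lines:
        out.write(line)   # the real per-record work would happen here
        n += 1
        if n % every == 0:
            err.write("reporter:status:processed %d records\n" % n)
    return n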

Regards,
Elmar

#################################### JOBLOG ... LAST LINES 
#################################
Task attempt_201208152128_0001_m_000007_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:10 INFO mapred.JobClient:  map 50% reduce 0%
12/08/15 21:50:12 INFO mapred.JobClient:  map 39% reduce 0%
12/08/15 21:50:13 INFO mapred.JobClient:  map 23% reduce 0%
12/08/15 21:50:14 INFO mapred.JobClient:  map 19% reduce 0%
12/08/15 21:50:15 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000014_1, Status : FAILED
Task attempt_201208152128_0001_m_000014_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:15 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000015_1, Status : FAILED
Task attempt_201208152128_0001_m_000015_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000004_1, Status : FAILED
Task attempt_201208152128_0001_m_000004_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000005_1, Status : FAILED
Task attempt_201208152128_0001_m_000005_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000012_1, Status : FAILED
Task attempt_201208152128_0001_m_000012_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:18 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000008_1, Status : FAILED
Task attempt_201208152128_0001_m_000008_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000009_1, Status : FAILED
Task attempt_201208152128_0001_m_000009_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000000_1, Status : FAILED
Task attempt_201208152128_0001_m_000000_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000010_1, Status : FAILED
Task attempt_201208152128_0001_m_000010_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:20 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000002_1, Status : FAILED
Task attempt_201208152128_0001_m_000002_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:21 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000003_1, Status : FAILED
Task attempt_201208152128_0001_m_000003_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:22 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000013_1, Status : FAILED
Task attempt_201208152128_0001_m_000013_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:22 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000001_1, Status : FAILED
Task attempt_201208152128_0001_m_000001_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:23 INFO mapred.JobClient:  map 11% reduce 0%
12/08/15 21:50:25 INFO mapred.JobClient:  map 17% reduce 0%
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000006_1, Status : FAILED
Task attempt_201208152128_0001_m_000006_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000013_2, Status : FAILED
Task attempt_201208152128_0001_m_000013_2 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000001_2, Status : FAILED
Task attempt_201208152128_0001_m_000001_2 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:28 INFO mapred.JobClient:  map 40% reduce 0%
12/08/15 21:50:29 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000011_2, Status : FAILED
Task attempt_201208152128_0001_m_000011_2 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:30 INFO mapred.JobClient:  map 42% reduce 0%
12/08/15 21:50:31 INFO mapred.JobClient:  map 52% reduce 0%
12/08/15 21:50:33 INFO mapred.JobClient:  map 54% reduce 0%
12/08/15 21:50:37 INFO mapred.JobClient:  map 58% reduce 0%
12/08/15 21:50:39 INFO mapred.JobClient:  map 61% reduce 0%
12/08/15 21:50:42 INFO mapred.JobClient:  map 62% reduce 0%
12/08/15 21:50:46 INFO mapred.JobClient:  map 58% reduce 0%
12/08/15 21:50:55 INFO mapred.JobClient:  map 54% reduce 0%
12/08/15 21:50:57 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000011_1, Status : FAILED
Task attempt_201208152128_0001_m_000011_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:52:10 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000006_2, Status : FAILED
Task attempt_201208152128_0001_m_000006_2 failed to report status for 
602 seconds. Killing!
12/08/15 22:00:25 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000007_2, Status : FAILED
Task attempt_201208152128_0001_m_000007_2 failed to report status for 
602 seconds. Killing!
12/08/15 22:00:29 INFO mapred.JobClient:  map 50% reduce 0%
12/08/15 22:00:32 INFO mapred.JobClient:  map 46% reduce 0%
12/08/15 22:00:34 INFO mapred.JobClient:  map 39% reduce 0%
12/08/15 22:00:35 INFO mapred.JobClient:  map 38% reduce 0%
12/08/15 22:00:40 INFO mapred.JobClient: Job complete: job_201208152128_0001
12/08/15 22:00:40 INFO mapred.JobClient: Counters: 8
12/08/15 22:00:40 INFO mapred.JobClient:   Job Counters
12/08/15 22:00:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=8861935
12/08/15 22:00:40 INFO mapred.JobClient:     Total time spent by all 
reduces waiting after reserving slots (ms)=0
12/08/15 22:00:40 INFO mapred.JobClient:     Total time spent by all 
maps waiting after reserving slots (ms)=0
12/08/15 22:00:40 INFO mapred.JobClient:     Rack-local map tasks=52
12/08/15 22:00:40 INFO mapred.JobClient:     Launched map tasks=59
12/08/15 22:00:40 INFO mapred.JobClient:     Data-local map tasks=7
12/08/15 22:00:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
12/08/15 22:00:40 INFO mapred.JobClient:     Failed map tasks=1
12/08/15 22:00:40 INFO mapred.JobClient: Job Failed: # of failed Map 
Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: 
task_201208152128_0001_m_000007
java.io.IOException: Job failed!
     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
     at 
uni.kassel.macek.rtprep.RetweetApplication.run(RetweetApplication.java:76)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at 
uni.kassel.macek.rtprep.RetweetApplication.main(RetweetApplication.java:27)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:601)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

#################################### NAMENODE ... LAST LINES 
#################################
2012-08-15 21:44:32,976 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameSystem.processReport: from 141.51.205.30:50010, blocks: 3, 
processing time: 0 msecs
2012-08-15 21:48:44,346 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:48:44,346 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 21:50:04,135 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of 
transactions: 0 Total time for transactions(ms): 0Number of transactions 
batched in Syncs: 16 Number of syncs: 1 SyncTimes(ms): 45
2012-08-15 21:53:44,355 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:53:44,355 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 21:56:17,856 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameSystem.processReport: from 141.51.205.112:50010, blocks: 8, 
processing time: 0 msecs
2012-08-15 21:58:44,362 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:58:44,363 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 22:00:19,904 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of 
transactions: 0 Total time for transactions(ms): 0Number of transactions 
batched in Syncs: 36 Number of syncs: 1 SyncTimes(ms): 45
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.112:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.118:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.30:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.115:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.114:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.119:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.115:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.114:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.112:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.30:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.118:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.119:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_-2215699895714614889 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_-9153653706918013228 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.115:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.118:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.114:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:39,717 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar. 
blk_8780584073579865736_1010
2012-08-15 22:00:39,724 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 141.51.205.113:50010 is 
added to blk_8780584073579865736_1010 size 39356
2012-08-15 22:00:39,726 INFO org.apache.hadoop.hdfs.StateChange: 
Removing lease on  file 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
from client DFSClient_-1191648872
2012-08-15 22:00:39,727 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.completeFile: file 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
is closed by DFSClient_-1191648872
2012-08-15 22:00:39,750 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_2297057220333714498 is added to 
invalidSet of 141.51.205.114:50010
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 >>>>>>>>>>>>> Compare last time with above log<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

#################################### JOBTRACKER ... LAST LINES 
#################################

2012-08-15 22:00:33,295 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000008_2'
2012-08-15 22:00:33,296 INFO org.apache.hadoop.mapred.JobTracker: Adding 
task (TASK_CLEANUP) 'attempt_201208152128_0001_m_000008_2' to tip 
task_201208152128_0001_m_000008, for tracker 
'tracker_its-cs208.its.uni-kassel.de:localhost/127.0.0.1:58503'
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201208152128_0001_m_000007 has failed 4 times.
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.JobInProgress: 
Aborting job job_201208152128_0001
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.JobInProgress: 
Killing job 'job_201208152128_0001'
2012-08-15 22:00:33,697 INFO org.apache.hadoop.mapred.JobTracker: Adding 
task (JOB_CLEANUP) 'attempt_201208152128_0001_m_000016_0' to tip 
task_201208152128_0001_m_000016, for tracker 
'tracker_its-cs120.its.uni-kassel.de:localhost/127.0.0.1:52467'
2012-08-15 22:00:33,698 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000007_3'
2012-08-15 22:00:33,705 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000015_2: Task 
attempt_201208152128_0001_m_000015_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:33,706 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000015_2'
2012-08-15 22:00:36,200 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000003_2: Task 
attempt_201208152128_0001_m_000003_2 failed to report status for 600 
seconds. Killing!
2012-08-15 22:00:36,201 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000003_2'
2012-08-15 22:00:36,534 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000014_2'
2012-08-15 22:00:36,702 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000012_2: Task 
attempt_201208152128_0001_m_000012_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,703 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000012_2'
2012-08-15 22:00:36,708 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000009_2: Task 
attempt_201208152128_0001_m_000009_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,709 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000009_2'
2012-08-15 22:00:36,778 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000000_2: Task 
attempt_201208152128_0001_m_000000_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,779 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000000_2'
2012-08-15 22:00:36,779 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000010_2'
2012-08-15 22:00:37,024 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201208152128_0001_m_000007 has failed 4 times.
2012-08-15 22:00:37,085 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000007_4'
2012-08-15 22:00:37,085 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000014_3'
2012-08-15 22:00:38,758 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000002_2: Task 
attempt_201208152128_0001_m_000002_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:38,759 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000002_2'
2012-08-15 22:00:38,759 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000013_3'
2012-08-15 22:00:39,202 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000001_4'
2012-08-15 22:00:39,202 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000003_3'
2012-08-15 22:00:39,205 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000004_2'
2012-08-15 22:00:39,206 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000005_2'
2012-08-15 22:00:39,206 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000009_3'
2012-08-15 22:00:39,240 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000000_3'
2012-08-15 22:00:39,240 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000010_3'
2012-08-15 22:00:39,303 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000001_3'
2012-08-15 22:00:39,303 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000008_2'
2012-08-15 22:00:39,539 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000006_3'
2012-08-15 22:00:39,707 INFO org.apache.hadoop.mapred.JobInProgress: 
Task 'attempt_201208152128_0001_m_000016_0' has completed 
task_201208152128_0001_m_000016 successfully.
2012-08-15 22:00:39,712 INFO 
org.apache.hadoop.mapred.JobInProgress$JobSummary: 
jobId=job_201208152128_0001,submitTime=1345058961257,launchTime=1345058961719,firstMapTaskLaunchTime=1345058968763,firstJobSetupTaskLaunchTime=1345058962669,firstJobCleanupTaskLaunchTime=1345060833697,finishTime=1345060839709,numMaps=16,numSlotsPerMap=1,numReduces=1,numSlotsPerReduce=1,user=bmacek,queue=default,status=FAILED,mapSlotSeconds=8861,reduceSlotsSeconds=0,clusterMapCapacity=22,clusterReduceCapacity=22
2012-08-15 22:00:39,742 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000016_0'
2012-08-15 22:00:39,742 INFO org.apache.hadoop.mapred.JobHistory: 
Creating DONE subfolder at 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
2012-08-15 22:00:39,747 INFO org.apache.hadoop.mapred.JobHistory: Moving 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
to 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
2012-08-15 22:00:39,762 INFO org.apache.hadoop.mapred.JobHistory: Moving 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/job_201208152128_0001_conf.xml 
to 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 >>>>>>>>>>>>> Compare last time with above log<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
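[Editor's note: the 601/602-second kills above are governed by mapred.task.timeout (default 600000 ms in Hadoop 1.x; 0 disables the check). Raising it is only a stopgap while debugging why the maps stall, but it can keep a job alive long enough to gather evidence. A mapred-site.xml fragment, with an arbitrary 30-minute value:]

```xml
<property>
  <name>mapred.task.timeout</name>
  <!-- milliseconds a task may run without reporting progress; 0 disables -->
  <value>1800000</value>
</property>
```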





Am 14.08.2012 13:25, schrieb Björn-Elmar Macek:
> Hi Michael and Mohammad,
>
> thanks a lot for your input!
> i have pinged the people at the cluster, asking them to (possibly)
> disable IPv6 and to definitely check the ports corresponding to the
> appropriate machines. I will keep you updated.
>
> Regards,
> Elmar
>
>
> Am 13.08.2012 22:39, schrieb Michael Segel:
>>
>> The key is to think about what can go wrong, but start with the low 
>> hanging fruit.
>>
>> I mean you could be right, however you're jumping the gun and
>> overlooking simpler issues.
>>
>> The most common issue is that the networking traffic is being filtered.
>> Of course since we're both diagnosing this with minimal information, 
>> we're kind of shooting from the hip.
>>
>> This is why I'm asking if there is any networking traffic between the 
>> nodes.  If you have partial communication, then focus on why you 
>> can't see the specific traffic.
>>
>>
>> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>> Thank you so very much for the detailed response Michael. I'll keep 
>>> the tip in mind. Please pardon my ignorance, as I am still in the 
>>> learning phase.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel
>>> <michael_segel@hotmail.com> wrote:
>>>
>>>     0.0.0.0 means that the call is going to all interfaces on the
>>>     machine.  (Shouldn't be an issue...)
>>>
>>>     IPv4 vs IPv6? Could be an issue, however OP says he can write
>>>     data to DNs and they seem to communicate, therefore if its IPv6
>>>     related, wouldn't it impact all traffic and not just a specific
>>>     port?
>>>     I agree... shut down IPv6 if you can.
>>>
>>>     I don't disagree with your assessment. I am just suggesting that
>>>     before you do a really deep dive, you think about the more
>>>     obvious stuff first.
>>>
>>>     There are a couple of other things... like do all of the
>>>     /etc/hosts files on all of the machines match?
>>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>>
>>>     BTW, you said DNS in your response. if you're using DNS, then
>>>     you don't really want to have much info in the /etc/hosts file
>>>     except loopback and the server's IP address.
>>>
>>>     Looking at the problem OP is indicating some traffic works,
>>>     while other traffic doesn't. Most likely something is blocking
>>>     the ports. Iptables is the first place to look.
>>>
>>>     Just saying. ;-)
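The /etc/hosts checks suggested above can be scripted. Below is a minimal sketch (hostnames are taken from the logs in this thread; the function names are my own, not from any Hadoop tool) that flags cluster hostnames resolving to a loopback address, the classic cause of a daemon binding to 127.0.x.1 instead of its real interface:

```python
def parse_hosts(text):
    """Map each hostname/alias in /etc/hosts-style text to its IP."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split()
        ip, names = parts[0], parts[1:]
        for name in names:
            mapping[name] = ip
    return mapping

def loopback_conflicts(mapping, cluster_hosts):
    """Return cluster hostnames that resolve to a loopback address."""
    return [h for h in cluster_hosts if mapping.get(h, "").startswith("127.")]

if __name__ == "__main__":
    sample = (
        "127.0.0.1   localhost\n"
        "127.0.1.1   its-cs133.its.uni-kassel.de its-cs133\n"  # the bad pattern
        "141.51.205.41   its-cs131.its.uni-kassel.de its-cs131\n"
    )
    print(loopback_conflicts(parse_hosts(sample), ["its-cs131", "its-cs133"]))
    # -> ['its-cs133']
```

Running the same check on every node would catch both the loopback mapping and out-of-sync hosts files across the cluster.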
>>>
>>>
>>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq
>>>     <dontariq@gmail.com> wrote:
>>>
>>>>     Hi Michael,
>>>>            I asked for the hosts file because there seems to be a
>>>>     loopback problem to me. The log shows that the call is going to
>>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>>     and making sure that there is no problem with the DNS
>>>>     resolution is also necessary. Please correct me if I am wrong.
>>>>     Thank you.
>>>>
>>>>     Regards,
>>>>         Mohammad Tariq
>>>>
>>>>
>>>>
>>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel
>>>>     <michael_segel@hotmail.com> wrote:
>>>>
>>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>>
>>>>         Outside of MapR, multihomed machines can be problematic.
>>>>         Hadoop doesn't generally work well when you're not using
>>>>         the FQDN or its alias.
>>>>
>>>>         The issue isn't the SSH, but if you go to the node which
>>>>         is having trouble connecting to another node, then try to
>>>>         ping it, or some other general communication. If it
>>>>         succeeds, your issue is that the port you're trying to
>>>>         communicate with is blocked. Then it's more than likely an
>>>>         ipconfig or firewall issue.
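The ping-then-port test described above can also be scripted. A minimal sketch (the host/port values come from the logs in this thread; the helper name is mine), doing the same plain TCP connect the Hadoop IPC client attempts:

```python
import socket

def port_open(host, port, timeout=2.0):
    """Attempt a plain TCP connect, as the Hadoop IPC client would."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # From a worker node: is the master's RPC port reachable?
    print(port_open("its-cs131", 35554))
```

If ping succeeds but this returns False, something between the nodes (iptables, for instance) is filtering that specific port.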
>>>>
>>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>>>>         <ema@cs.uni-kassel.de> wrote:
>>>>
>>>>>         Hi Michael,
>>>>>
>>>>>         well, I can ssh from any node to any other without being
>>>>>         prompted. The reason for this is that my home dir is
>>>>>         mounted on every server in the cluster.
>>>>>
>>>>>         If the machines are multihomed: I don't know. I could ask
>>>>>         whether this would be of importance.
>>>>>
>>>>>         Shall I?
>>>>>
>>>>>         Regards,
>>>>>         Elmar
>>>>>
>>>>>         On 13.08.12 14:59, Michael Segel wrote:
>>>>>>         If the nodes can communicate and distribute data, then
>>>>>>         the odds are that the issue isn't going to be in his
>>>>>>         /etc/hosts.
>>>>>>
>>>>>>         A more relevant question is if he's running a firewall on
>>>>>>         each of these machines?
>>>>>>
>>>>>>         A simple test... ssh to one node, ping other nodes and
>>>>>>         the control nodes at random to see if they can see one
>>>>>>         another. Then check to see if there is a firewall running
>>>>>>         which would limit the types of traffic between nodes.
>>>>>>
>>>>>>         One other side note... are these machines multi-homed?
>>>>>>
>>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq
>>>>>>         <dontariq@gmail.com> wrote:
>>>>>>
>>>>>>>         Hello there,
>>>>>>>
>>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>>         don't mind.
>>>>>>>
>>>>>>>         Regards,
>>>>>>>          Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>>>>         <macek@cs.uni-kassel.de> wrote:
>>>>>>>
>>>>>>>             Hi,
>>>>>>>
>>>>>>>             I am currently trying to run my hadoop program on a
>>>>>>>             cluster. Sadly, my datanodes and tasktrackers seem
>>>>>>>             to have difficulties with their communication, as
>>>>>>>             their logs say:
>>>>>>>             * Some datanodes and tasktrackers seem to have port
>>>>>>>             problems of some kind, as can be seen in the logs
>>>>>>>             below. I wondered if this might be correlated with
>>>>>>>             the localhost entry in /etc/hosts, as you can read
>>>>>>>             in a lot of posts with similar errors, but I checked
>>>>>>>             the file: neither localhost nor 127.0.0.1/127.0.1.1
>>>>>>>             is bound there. (Although you can ping localhost...
>>>>>>>             the technician of the cluster said he'd be looking
>>>>>>>             for the mechanics resolving localhost.)
>>>>>>>             * The other nodes cannot talk to the namenode and
>>>>>>>             jobtracker (its-cs131), although it is absolutely
>>>>>>>             not clear why this is happening: the "dfs -put" I do
>>>>>>>             directly before the job runs fine, which seems to
>>>>>>>             imply that communication between those servers is
>>>>>>>             working flawlessly.
>>>>>>>
>>>>>>>             Is there any reason why this might happen?
>>>>>>>
>>>>>>>
>>>>>>>             Regards,
>>>>>>>             Elmar
>>>>>>>
>>>>>>>             LOGS BELOW:
>>>>>>>
>>>>>>>             \____Datanodes
>>>>>>>
>>>>>>>             After successfully putting the data into HDFS (at
>>>>>>>             this point I thought the namenode and datanodes have
>>>>>>>             to communicate), I get the following errors when
>>>>>>>             starting the job:
>>>>>>>
>>>>>>>             There are 2 kinds of logs I found: the first one is
>>>>>>>             big (about 12 MB) and looks like this:
>>>>>>>             ############################### LOG TYPE 1
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 08:23:27,331 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>>             2012-08-13 08:23:28,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>>>>             2012-08-13 08:23:29,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>>>>             2012-08-13 08:23:30,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>>>>             2012-08-13 08:23:31,333 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>>>>             2012-08-13 08:23:32,333 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>>>>             2012-08-13 08:23:33,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>>>>             2012-08-13 08:23:34,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>>>>             2012-08-13 08:23:35,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>>>>             2012-08-13 08:23:36,335 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>>>>             2012-08-13 08:23:36,335 WARN
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             java.net.ConnectException: Call to
>>>>>>>             its-cs131/141.51.205.41:35554 failed on connection
>>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>                 at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>>                 at java.lang.Thread.run(Thread.java:619)
>>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>>                 ... 5 more
>>>>>>>
>>>>>>>             ... (this continues til the end of the log)
>>>>>>>
>>>>>>>             The second is short kind:
>>>>>>>             ########################### LOG TYPE 2
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 00:59:19,038 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             STARTUP_MSG:
>>>>>>>             /************************************************************
>>>>>>>             STARTUP_MSG: Starting DataNode
>>>>>>>             STARTUP_MSG: host =
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             STARTUP_MSG: args = []
>>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>>             STARTUP_MSG: build =
>>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>>             23:58:21 UTC 2012
>>>>>>>             ************************************************************/
>>>>>>>             2012-08-13 00:59:19,203 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig:
>>>>>>>             loaded properties from hadoop-metrics2.properties
>>>>>>>             2012-08-13 00:59:19,216 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source MetricsSystem,sub=Stats registered.
>>>>>>>             2012-08-13 00:59:19,217 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>>             2012-08-13 00:59:19,218 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             DataNode metrics system started
>>>>>>>             2012-08-13 00:59:19,306 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ugi registered.
>>>>>>>             2012-08-13 00:59:19,346 INFO
>>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>>             native-hadoop library
>>>>>>>             2012-08-13 00:59:20,482 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>             Storage directory
>>>>>>>             /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>>             Formatting ...
>>>>>>>             2012-08-13 00:59:21,787 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             Registered FSDatasetStatusMBean
>>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>>             Shutting down all async disk service threads...
>>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>>             All async disk service threads have been shut down.
>>>>>>>             2012-08-13 00:59:21,898 ERROR
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             java.net.BindException: Problem binding to
>>>>>>>             /0.0.0.0:50010 <http://0.0.0.0:50010/> : Address
>>>>>>>             already in use
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>>>             Caused by: java.net.BindException: Address already
>>>>>>>             in use
>>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>>                 ... 7 more
>>>>>>>
>>>>>>>             2012-08-13 00:59:21,899 INFO
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             SHUTDOWN_MSG:
>>>>>>>             /************************************************************
>>>>>>>             SHUTDOWN_MSG: Shutting down DataNode at
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             ************************************************************/
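The BindException in this second log type is a different problem from the connection retries: some other process (often a daemon left over from an earlier start) already holds port 50010 on that node. The failure mode is easy to reproduce in a few lines (a standalone sketch, not Hadoop code):

```python
import errno
import socket

# First socket takes a port; the second tries the same port and fails
# exactly like the DataNode did ("Address already in use").
a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
a.bind(("127.0.0.1", 0))      # kernel picks a free port
a.listen(1)
port = a.getsockname()[1]

b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    b.bind(("127.0.0.1", port))  # same port -> EADDRINUSE
except OSError as e:
    print(e.errno == errno.EADDRINUSE)  # True
finally:
    b.close()
    a.close()
```

On the affected node, something like `netstat -tlnp | grep 50010` should show which process holds the port before the daemons are restarted.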
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>             \_____TaskTracker
>>>>>>>             With TaskTrackers it is the same: there are 2 kinds.
>>>>>>>             ############################### LOG TYPE 1
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 02:09:54,645 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Resending
>>>>>>>             'status' to 'its-cs131' with reponseId '879
>>>>>>>             2012-08-13 02:09:55,646 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>>             2012-08-13 02:09:56,646 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>>>>             2012-08-13 02:09:57,647 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>>>>             2012-08-13 02:09:58,647 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>>>>             2012-08-13 02:09:59,648 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>>>>             2012-08-13 02:10:00,648 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>>>>             2012-08-13 02:10:01,649 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>>>>             2012-08-13 02:10:02,649 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>>>>             2012-08-13 02:10:03,650 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>>>>             2012-08-13 02:10:04,650 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>>>>             2012-08-13 02:10:04,651 ERROR
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Caught
>>>>>>>             exception: java.net.ConnectException: Call to
>>>>>>>             its-cs131/141.51.205.41:35555 failed on connection
>>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown
>>>>>>>             Source)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>>                 ... 6 more
>>>>>>>
>>>>>>>
>>>>>>>             ########################### LOG TYPE 2
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 00:59:24,376 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>>>             /************************************************************
>>>>>>>             STARTUP_MSG: Starting TaskTracker
>>>>>>>             STARTUP_MSG: host =
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             STARTUP_MSG: args = []
>>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>>             STARTUP_MSG: build =
>>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>>             23:58:21 UTC 2012
>>>>>>>             ************************************************************/
>>>>>>>             2012-08-13 00:59:24,569 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig:
>>>>>>>             loaded properties from hadoop-metrics2.properties
>>>>>>>             2012-08-13 00:59:24,626 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source MetricsSystem,sub=Stats registered.
>>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>>             TaskTracker metrics system started
>>>>>>>             2012-08-13 00:59:24,950 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ugi registered.
>>>>>>>             2012-08-13 00:59:25,146 INFO org.mortbay.log:
>>>>>>>             Logging to
>>>>>>>             org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log)
>>>>>>>             via org.mortbay.log.Slf4jLog
>>>>>>>             2012-08-13 00:59:25,206 INFO
>>>>>>>             org.apache.hadoop.http.HttpServer: Added global
>>>>>>>             filtersafety
>>>>>>>             (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>>>             2012-08-13 00:59:25,232 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskLogsTruncater:
>>>>>>>             Initializing logs' truncater with mapRetainSize=-1
>>>>>>>             and reduceRetainSize=-1
>>>>>>>             2012-08-13 00:59:25,237 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             tasktracker with owner as bmacek
>>>>>>>             2012-08-13 00:59:25,239 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Good mapred
>>>>>>>             local directories are:
>>>>>>>             /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>>>             2012-08-13 00:59:25,244 INFO
>>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>>             native-hadoop library
>>>>>>>             2012-08-13 00:59:25,255 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source jvm registered.
>>>>>>>             2012-08-13 00:59:25,256 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source TaskTrackerMetrics registered.
>>>>>>>             2012-08-13 00:59:25,279 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source RpcDetailedActivityForPort54850 registered.
>>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source RpcActivityForPort54850 registered.
>>>>>>>             2012-08-13 00:59:25,287 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server Responder:
>>>>>>>             starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server listener on
>>>>>>>             54850: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 0
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 1
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker up
>>>>>>>             at: localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 3
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 2
>>>>>>>             on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             tracker
>>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:26,321 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>>             2012-08-13 00:59:38,104 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>>             thread: Map-events fetcher for all reduce tasks on
>>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:38,120 INFO
>>>>>>>             org.apache.hadoop.util.ProcessTree: setsid exited
>>>>>>>             with exit code 0
>>>>>>>             2012-08-13 00:59:38,134 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Using
>>>>>>>             ResourceCalculatorPlugin :
>>>>>>>             org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>>>             2012-08-13 00:59:38,137 WARN
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker's
>>>>>>>             totalMemoryAllottedForTasks is -1. TaskMemoryManager
>>>>>>>             is disabled.
>>>>>>>             2012-08-13 00:59:38,145 INFO
>>>>>>>             org.apache.hadoop.mapred.IndexCache: IndexCache
>>>>>>>             created with max memory = 10485760
>>>>>>>             2012-08-13 00:59:38,158 INFO
>>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>>>>             for source ShuffleServerMetrics registered.
>>>>>>>             2012-08-13 00:59:38,161 INFO
>>>>>>>             org.apache.hadoop.http.HttpServer: Port returned by
>>>>>>>             webServer.getConnectors()[0].getLocalPort() before
>>>>>>>             open() is -1. Opening the listener on 50060
>>>>>>>             2012-08-13 00:59:38,161 ERROR
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: Can not start
>>>>>>>             task tracker because java.net.BindException: Address
>>>>>>>             already in use
>>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>>                 at
>>>>>>>             org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>>
>>>>>>>             2012-08-13 00:59:38,163 INFO
>>>>>>>             org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>>>             /************************************************************
>>>>>>>             SHUTDOWN_MSG: Shutting down TaskTracker at
>>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             ************************************************************/
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hello again,

well, I have now ruled out just about all of my doubts that the
communication problems are related to the infrastructure. Instead, in a
new execution of my program, I found that, for some unknown and
untracked reason, the namenode and the jobtracker stop their services
after too many failed map tasks. See the logs below.

 From that time on, of course, the running datanodes/tasktrackers cannot
communicate with the jobtracker/namenode.


What I do not understand is why the jobs neither respond nor fail. I
wanted to look it up in the logs, but somehow they do not contain
anything from before 24:00/0:00 - a time at which the master(s) had
already been dead for 2 hours.

Are there any suggestions? Maybe did i do something wrong in the Mapper?

Regards,
Elmar
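[Editorial note: the "failed to report status for 60x seconds. Killing!" messages below mean the attempts exceeded mapred.task.timeout, which defaults to 600000 ms in Hadoop 1.0.2. If a single map() call legitimately runs that long, the usual fixes are to call reporter.progress() (or context.progress() with the new API) periodically inside the Mapper, or to raise the timeout. A hedged sketch of the latter for mapred-site.xml - the value is an example to illustrate the knob, not a recommendation:]

```xml
<!-- mapred-site.xml: raise the task liveness timeout from the
     default 600000 ms (10 min); 1800000 ms = 30 min is only an
     example value, tune it for your job -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
</property>
```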

#################################### JOBLOG ... LAST LINES 
#################################
Task attempt_201208152128_0001_m_000007_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:10 INFO mapred.JobClient:  map 50% reduce 0%
12/08/15 21:50:12 INFO mapred.JobClient:  map 39% reduce 0%
12/08/15 21:50:13 INFO mapred.JobClient:  map 23% reduce 0%
12/08/15 21:50:14 INFO mapred.JobClient:  map 19% reduce 0%
12/08/15 21:50:15 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000014_1, Status : FAILED
Task attempt_201208152128_0001_m_000014_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:15 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000015_1, Status : FAILED
Task attempt_201208152128_0001_m_000015_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000004_1, Status : FAILED
Task attempt_201208152128_0001_m_000004_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000005_1, Status : FAILED
Task attempt_201208152128_0001_m_000005_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:17 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000012_1, Status : FAILED
Task attempt_201208152128_0001_m_000012_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:18 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000008_1, Status : FAILED
Task attempt_201208152128_0001_m_000008_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000009_1, Status : FAILED
Task attempt_201208152128_0001_m_000009_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000000_1, Status : FAILED
Task attempt_201208152128_0001_m_000000_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:19 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000010_1, Status : FAILED
Task attempt_201208152128_0001_m_000010_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:20 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000002_1, Status : FAILED
Task attempt_201208152128_0001_m_000002_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:21 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000003_1, Status : FAILED
Task attempt_201208152128_0001_m_000003_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:22 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000013_1, Status : FAILED
Task attempt_201208152128_0001_m_000013_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:22 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000001_1, Status : FAILED
Task attempt_201208152128_0001_m_000001_1 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:23 INFO mapred.JobClient:  map 11% reduce 0%
12/08/15 21:50:25 INFO mapred.JobClient:  map 17% reduce 0%
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000006_1, Status : FAILED
Task attempt_201208152128_0001_m_000006_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000013_2, Status : FAILED
Task attempt_201208152128_0001_m_000013_2 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:27 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000001_2, Status : FAILED
Task attempt_201208152128_0001_m_000001_2 failed to report status for 
602 seconds. Killing!
12/08/15 21:50:28 INFO mapred.JobClient:  map 40% reduce 0%
12/08/15 21:50:29 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000011_2, Status : FAILED
Task attempt_201208152128_0001_m_000011_2 failed to report status for 
601 seconds. Killing!
12/08/15 21:50:30 INFO mapred.JobClient:  map 42% reduce 0%
12/08/15 21:50:31 INFO mapred.JobClient:  map 52% reduce 0%
12/08/15 21:50:33 INFO mapred.JobClient:  map 54% reduce 0%
12/08/15 21:50:37 INFO mapred.JobClient:  map 58% reduce 0%
12/08/15 21:50:39 INFO mapred.JobClient:  map 61% reduce 0%
12/08/15 21:50:42 INFO mapred.JobClient:  map 62% reduce 0%
12/08/15 21:50:46 INFO mapred.JobClient:  map 58% reduce 0%
12/08/15 21:50:55 INFO mapred.JobClient:  map 54% reduce 0%
12/08/15 21:50:57 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000011_1, Status : FAILED
Task attempt_201208152128_0001_m_000011_1 failed to report status for 
602 seconds. Killing!
12/08/15 21:52:10 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000006_2, Status : FAILED
Task attempt_201208152128_0001_m_000006_2 failed to report status for 
602 seconds. Killing!
12/08/15 22:00:25 INFO mapred.JobClient: Task Id : 
attempt_201208152128_0001_m_000007_2, Status : FAILED
Task attempt_201208152128_0001_m_000007_2 failed to report status for 
602 seconds. Killing!
12/08/15 22:00:29 INFO mapred.JobClient:  map 50% reduce 0%
12/08/15 22:00:32 INFO mapred.JobClient:  map 46% reduce 0%
12/08/15 22:00:34 INFO mapred.JobClient:  map 39% reduce 0%
12/08/15 22:00:35 INFO mapred.JobClient:  map 38% reduce 0%
12/08/15 22:00:40 INFO mapred.JobClient: Job complete: job_201208152128_0001
12/08/15 22:00:40 INFO mapred.JobClient: Counters: 8
12/08/15 22:00:40 INFO mapred.JobClient:   Job Counters
12/08/15 22:00:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=8861935
12/08/15 22:00:40 INFO mapred.JobClient:     Total time spent by all 
reduces waiting after reserving slots (ms)=0
12/08/15 22:00:40 INFO mapred.JobClient:     Total time spent by all 
maps waiting after reserving slots (ms)=0
12/08/15 22:00:40 INFO mapred.JobClient:     Rack-local map tasks=52
12/08/15 22:00:40 INFO mapred.JobClient:     Launched map tasks=59
12/08/15 22:00:40 INFO mapred.JobClient:     Data-local map tasks=7
12/08/15 22:00:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
12/08/15 22:00:40 INFO mapred.JobClient:     Failed map tasks=1
12/08/15 22:00:40 INFO mapred.JobClient: Job Failed: # of failed Map 
Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: 
task_201208152128_0001_m_000007
java.io.IOException: Job failed!
     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
     at 
uni.kassel.macek.rtprep.RetweetApplication.run(RetweetApplication.java:76)
     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
     at 
uni.kassel.macek.rtprep.RetweetApplication.main(RetweetApplication.java:27)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:601)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

#################################### NAMENODE ... LAST LINES 
#################################
2012-08-15 21:44:32,976 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameSystem.processReport: from 141.51.205.30:50010, blocks: 3, 
processing time: 0 msecs
2012-08-15 21:48:44,346 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:48:44,346 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 21:50:04,135 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of 
transactions: 0 Total time for transactions(ms): 0Number of transactions 
batched in Syncs: 16 Number of syncs: 1 SyncTimes(ms): 45
2012-08-15 21:53:44,355 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:53:44,355 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 21:56:17,856 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* 
NameSystem.processReport: from 141.51.205.112:50010, blocks: 8, 
processing time: 0 msecs
2012-08-15 21:58:44,362 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 
141.51.205.28
2012-08-15 21:58:44,363 WARN 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit 
log, edits.new files already exists in all healthy directories:
   /work/bmacek/hdfs/current/edits.new
2012-08-15 22:00:19,904 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of 
transactions: 0 Total time for transactions(ms): 0Number of transactions 
batched in Syncs: 36 Number of syncs: 1 SyncTimes(ms): 45
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.112:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.118:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.30:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.115:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.114:50010
2012-08-15 22:00:35,486 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_8707503977973501396 is added to 
invalidSet of 141.51.205.119:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.115:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.114:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.112:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.30:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.118:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_331880463236233459 is added to 
invalidSet of 141.51.205.119:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_-2215699895714614889 is added to 
invalidSet of 141.51.205.113:50010
2012-08-15 22:00:35,487 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_-9153653706918013228 is added to 
invalidSet of 141.51.205.117:50010
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.115:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.118:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:37,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
ask 141.51.205.114:50010 to delete blk_331880463236233459_1005 
blk_8707503977973501396_1004
2012-08-15 22:00:39,717 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.allocateBlock: 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar. 
blk_8780584073579865736_1010
2012-08-15 22:00:39,724 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addStoredBlock: blockMap updated: 141.51.205.113:50010 is 
added to blk_8780584073579865736_1010 size 39356
2012-08-15 22:00:39,726 INFO org.apache.hadoop.hdfs.StateChange: 
Removing lease on  file 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
from client DFSClient_-1191648872
2012-08-15 22:00:39,727 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.completeFile: file 
/output/_logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
is closed by DFSClient_-1191648872
2012-08-15 22:00:39,750 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
NameSystem.addToInvalidates: blk_2297057220333714498 is added to 
invalidSet of 141.51.205.114:50010
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 >>>>>>>>>>>>> Compare last time with above log<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

#################################### JOBTRACKER ... LAST LINES 
#################################

2012-08-15 22:00:33,295 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000008_2'
2012-08-15 22:00:33,296 INFO org.apache.hadoop.mapred.JobTracker: Adding 
task (TASK_CLEANUP) 'attempt_201208152128_0001_m_000008_2' to tip 
task_201208152128_0001_m_000008, for tracker 
'tracker_its-cs208.its.uni-kassel.de:localhost/127.0.0.1:58503'
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201208152128_0001_m_000007 has failed 4 times.
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.JobInProgress: 
Aborting job job_201208152128_0001
2012-08-15 22:00:33,696 INFO org.apache.hadoop.mapred.JobInProgress: 
Killing job 'job_201208152128_0001'
2012-08-15 22:00:33,697 INFO org.apache.hadoop.mapred.JobTracker: Adding 
task (JOB_CLEANUP) 'attempt_201208152128_0001_m_000016_0' to tip 
task_201208152128_0001_m_000016, for tracker 
'tracker_its-cs120.its.uni-kassel.de:localhost/127.0.0.1:52467'
2012-08-15 22:00:33,698 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000007_3'
2012-08-15 22:00:33,705 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000015_2: Task 
attempt_201208152128_0001_m_000015_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:33,706 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000015_2'
2012-08-15 22:00:36,200 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000003_2: Task 
attempt_201208152128_0001_m_000003_2 failed to report status for 600 
seconds. Killing!
2012-08-15 22:00:36,201 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000003_2'
2012-08-15 22:00:36,534 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000014_2'
2012-08-15 22:00:36,702 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000012_2: Task 
attempt_201208152128_0001_m_000012_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,703 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000012_2'
2012-08-15 22:00:36,708 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000009_2: Task 
attempt_201208152128_0001_m_000009_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,709 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000009_2'
2012-08-15 22:00:36,778 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000000_2: Task 
attempt_201208152128_0001_m_000000_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:36,779 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000000_2'
2012-08-15 22:00:36,779 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000010_2'
2012-08-15 22:00:37,024 INFO org.apache.hadoop.mapred.TaskInProgress: 
TaskInProgress task_201208152128_0001_m_000007 has failed 4 times.
2012-08-15 22:00:37,085 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000007_4'
2012-08-15 22:00:37,085 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000014_3'
2012-08-15 22:00:38,758 INFO org.apache.hadoop.mapred.TaskInProgress: 
Error from attempt_201208152128_0001_m_000002_2: Task 
attempt_201208152128_0001_m_000002_2 failed to report status for 601 
seconds. Killing!
2012-08-15 22:00:38,759 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000002_2'
2012-08-15 22:00:38,759 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000013_3'
2012-08-15 22:00:39,202 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000001_4'
2012-08-15 22:00:39,202 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000003_3'
2012-08-15 22:00:39,205 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000004_2'
2012-08-15 22:00:39,206 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000005_2'
2012-08-15 22:00:39,206 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000009_3'
2012-08-15 22:00:39,240 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000000_3'
2012-08-15 22:00:39,240 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000010_3'
2012-08-15 22:00:39,303 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000001_3'
2012-08-15 22:00:39,303 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000008_2'
2012-08-15 22:00:39,539 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000006_3'
2012-08-15 22:00:39,707 INFO org.apache.hadoop.mapred.JobInProgress: 
Task 'attempt_201208152128_0001_m_000016_0' has completed 
task_201208152128_0001_m_000016 successfully.
2012-08-15 22:00:39,712 INFO 
org.apache.hadoop.mapred.JobInProgress$JobSummary: 
jobId=job_201208152128_0001,submitTime=1345058961257,launchTime=1345058961719,firstMapTaskLaunchTime=1345058968763,firstJobSetupTaskLaunchTime=1345058962669,firstJobCleanupTaskLaunchTime=1345060833697,finishTime=1345060839709,numMaps=16,numSlotsPerMap=1,numReduces=1,numSlotsPerReduce=1,user=bmacek,queue=default,status=FAILED,mapSlotSeconds=8861,reduceSlotsSeconds=0,clusterMapCapacity=22,clusterReduceCapacity=22
2012-08-15 22:00:39,742 INFO org.apache.hadoop.mapred.JobTracker: 
Removing task 'attempt_201208152128_0001_m_000016_0'
2012-08-15 22:00:39,742 INFO org.apache.hadoop.mapred.JobHistory: 
Creating DONE subfolder at 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
2012-08-15 22:00:39,747 INFO org.apache.hadoop.mapred.JobHistory: Moving 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/job_201208152128_0001_1345058961257_bmacek_rtprep.jar 
to 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
2012-08-15 22:00:39,762 INFO org.apache.hadoop.mapred.JobHistory: Moving 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/job_201208152128_0001_conf.xml 
to 
file:/gpfs/home02/fb16/bmacek/hadoop-1.0.2/logs/history/done/version-1/its-cs117.its.uni-kassel.de_1345058925479_/2012/08/15/000000
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 >>>>>>>>>>>>> Compare last time with above log<<<<<<<<<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
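[Editorial note: for the "Retrying connect to server" / Connection refused errors quoted further down, a quick reachability probe run from each worker against the master's IPC ports helps separate a firewall/iptables block from a daemon that is simply down. A minimal sketch - the hostname and port are taken from the logs in this thread and are assumptions about the cluster, not verified values:]

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # JobTracker IPC port as seen in the retry logs; adjust to your cluster.
    for host, port in [("its-cs131", 35554)]:
        state = "reachable" if port_open(host, port) else "refused/blocked"
        print("%s:%d -> %s" % (host, port, state))
```

If the probe fails immediately, the daemon is likely down (connection refused); if it hangs until the timeout, a firewall is probably dropping the packets.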





Am 14.08.2012 13:25, schrieb Björn-Elmar Macek:
> Hi Michael and Mohammad,
>
> thanks a lot for your input!
> i have pinged the people running the cluster in order to (eventually 
> disable IPv6) and to definitely check the ports corresponding to the 
> appropriate machines. I will keep you updated.
>
> Regards,
> Elmar
>
>
> Am 13.08.2012 22:39, schrieb Michael Segel:
>>
>> The key is to think about what can go wrong, but start with the low 
>> hanging fruit.
>>
>> I mean you could be right, however you're jumping the gun and 
>> overlooking simpler issues.
>>
>> The most common issue is that the networking traffic is being filtered.
>> Of course since we're both diagnosing this with minimal information, 
>> we're kind of shooting from the hip.
>>
>> This is why I'm asking if there is any networking traffic between the 
>> nodes.  If you have partial communication, then focus on why you 
>> can't see the specific traffic.
>>
>>
>> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <dontariq@gmail.com 
>> <ma...@gmail.com>> wrote:
>>
>>> Thank you so very much for the detailed response Michael. I'll keep 
>>> the tip in mind. Please pardon my ignorance, as I am still in the 
>>> learning phase.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel 
>>> <michael_segel@hotmail.com <ma...@hotmail.com>> wrote:
>>>
>>>     0.0.0.0 means that the call is going to all interfaces on the
>>>     machine.  (Shouldn't be an issue...)
>>>
>>>     IPv4 vs IPv6? Could be an issue, however OP says he can write
>>>     data to DNs and they seem to communicate, therefore if its IPv6
>>>     related, wouldn't it impact all traffic and not just a specific
>>>     port?
>>>     I agree... shut down IPv6 if you can.
>>>
>>>     I don't disagree with your assessment. I am just suggesting that
>>>     before you do a really deep dive, you think about the more
>>>     obvious stuff first.
>>>
>>>     There are a couple of other things... like do all of the
>>>     /etc/hosts files on all of the machines match?
>>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>>
>>>     BTW, you said DNS in your response. if you're using DNS, then
>>>     you don't really want to have much info in the /etc/hosts file
>>>     except loopback and the server's IP address.
>>>
>>>     Looking at the problem OP is indicating some traffic works,
>>>     while other traffic doesn't. Most likely something is blocking
>>>     the ports. Iptables is the first place to look.
>>>
>>>     Just saying. ;-)
>>>
>>>
>>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <dontariq@gmail.com
>>>     <ma...@gmail.com>> wrote:
>>>
>>>>     Hi Michael,
>>>>            I asked for the hosts file because there seems to be some
>>>>     loopback problem to me. The log shows that the call is going to
>>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>>     and making sure that there is no problem with the DNS resolution
>>>>     is also necessary. Please correct me if I am wrong. Thank you.
>>>>
>>>>     Regards,
>>>>         Mohammad Tariq
>>>>
>>>>
>>>>
>>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel
>>>>     <michael_segel@hotmail.com <ma...@hotmail.com>>
>>>>     wrote:
>>>>
>>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>>
>>>>         Outside of MapR, multihomed machines can be problematic.
>>>>         Hadoop doesn't generally work well when you're not using
>>>>         the FQDN or its alias.
>>>>
>>>>         The issue isn't the SSH. If you go to the node which is
>>>>         having trouble connecting to another node and try to
>>>>         ping it, or some other general communication, and that
>>>>         succeeds, your issue is that the port you're trying to
>>>>         communicate with is blocked. Then it's more than likely an
>>>>         ipconfig or firewall issue.
>>>>
>>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>>>>         <ema@cs.uni-kassel.de <ma...@cs.uni-kassel.de>> wrote:
>>>>
>>>>>         Hi Michael,
>>>>>
>>>>>         well i can ssh from any node to any other without being
>>>>>         prompted. The reason for this is that my home dir is
>>>>>         mounted on every server in the cluster.
>>>>>
>>>>>         If the machines are multihomed: i dont know. i could ask
>>>>>         if this would be of importance.
>>>>>
>>>>>         Shall i?
>>>>>
>>>>>         Regards,
>>>>>         Elmar
>>>>>
>>>>>         Am 13.08.12 14:59, schrieb Michael Segel:
>>>>>>         If the nodes can communicate and distribute data, then
>>>>>>         the odds are that the issue isn't going to be in his
>>>>>>         /etc/hosts.
>>>>>>
>>>>>>         A more relevant question is if he's running a firewall on
>>>>>>         each of these machines?
>>>>>>
>>>>>>         A simple test... ssh to one node, ping other nodes and
>>>>>>         the control nodes at random to see if they can see one
>>>>>>         another. Then check to see if there is a firewall running
>>>>>>         which would limit the types of traffic between nodes.
>>>>>>
>>>>>>         One other side note... are these machines multi-homed?
>>>>>>
>>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq
>>>>>>         <dontariq@gmail.com <ma...@gmail.com>> wrote:
>>>>>>
>>>>>>>         Hello there,
>>>>>>>
>>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>>         don't mind.
>>>>>>>
>>>>>>>         Regards,
>>>>>>>          Mohammad Tariq
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>>>>         <macek@cs.uni-kassel.de <ma...@cs.uni-kassel.de>>
>>>>>>>         wrote:
>>>>>>>
>>>>>>>             Hi,
>>>>>>>
>>>>>>>             i am currently trying to run my hadoop program on a
>>>>>>>             cluster. Sadly though my datanodes and tasktrackers
>>>>>>>             seem to have difficulties with their communication
>>>>>>>             as their logs say:
>>>>>>>             * Some datanodes and tasktrackers seem to have
>>>>>>>             portproblems of some kind as it can be seen in the
>>>>>>>             logs below. I wondered if this might be due to
>>>>>>>             reasons correllated with the localhost entry in
>>>>>>>             /etc/hosts as you can read in alot of posts with
>>>>>>>             similar errors, but i checked the file neither
>>>>>>>             localhost nor 127.0.0.1/127.0.1.1 is bound there.
>>>>>>>             (although you can ping localhost... the technician
>>>>>>>             of the cluster said he'd be looking for the
>>>>>>>             mechanics resolving localhost)
>>>>>>>             * The other nodes can not speak with the namenode
>>>>>>>             and jobtracker (its-cs131). Although it is
>>>>>>>             absolutely not clear, why this is happening: the
>>>>>>>             "dfs -put" i do directly before the job is running
>>>>>>>             fine, which seems to imply that communication
>>>>>>>             between those servers is working flawlessly.
>>>>>>>
>>>>>>>             Is there any reason why this might happen?
>>>>>>>
>>>>>>>
>>>>>>>             Regards,
>>>>>>>             Elmar
>>>>>>>
>>>>>>>             LOGS BELOW:
>>>>>>>
>>>>>>>             \____Datanodes
>>>>>>>
>>>>>>>             After successfully putting the data to hdfs (at this
>>>>>>>             point i thought namenode and datanodes have to
>>>>>>>             communicate), i get the following errors when
>>>>>>>             starting the job:
>>>>>>>
>>>>>>>             There are 2 kinds of logs i found: the first one is
>>>>>>>             big (about 12MB) and looks like this:
>>>>>>>             ############################### LOG TYPE 1
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 08:23:27,331 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>>             2012-08-13 08:23:28,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>>>>             2012-08-13 08:23:29,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>>>>             2012-08-13 08:23:30,332 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>>>>             2012-08-13 08:23:31,333 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>>>>             2012-08-13 08:23:32,333 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>>>>             2012-08-13 08:23:33,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>>>>             2012-08-13 08:23:34,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>>>>             2012-08-13 08:23:35,334 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>>>>             2012-08-13 08:23:36,335 INFO
>>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>>>>             2012-08-13 08:23:36,335 WARN
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>>             java.net.ConnectException: Call to
>>>>>>>             its-cs131/141.51.205.41:35554 failed on connection
>>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>                 at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>>                 at java.lang.Thread.run(Thread.java:619)
>>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>                 at
>>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>>                 at org.apache.hadoop.net
>>>>>>>             <http://org.apache.hadoop.net/>.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>>                 at org.apache.hadoop.net
>>>>>>>             <http://org.apache.hadoop.net/>.NetUtils.connect(NetUtils.java:489)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>>                 at
>>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>>                 ... 5 more
>>>>>>>
>>>>>>>             ... (this continues til the end of the log)
>>>>>>>
>>>>>>>             The second is short kind:
>>>>>>>             ########################### LOG TYPE 2
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>>>>             /************************************************************
>>>>>>>             STARTUP_MSG: Starting DataNode
>>>>>>>             STARTUP_MSG: host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             STARTUP_MSG: args = []
>>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>>             STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>>>>             ************************************************************/
>>>>>>>             2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>>>>             2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>>>>             2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>>>>             2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>>>>             2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>>>>             2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>>             2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>>             2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>>>>             2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>>>>             2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>>>>             2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>>>>             2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>>>>             2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>>>             Caused by: java.net.BindException: Address already in use
>>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>>                 ... 7 more
>>>>>>>
>>>>>>>             2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>>>>             /************************************************************
>>>>>>>             SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             ************************************************************/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>             \____TaskTracker
>>>>>>>             With TaskTrackers it is the same: there are 2 kinds.
>>>>>>>             ############################### LOG TYPE 1
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>>>>             2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>>             2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>>>>             2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>>>>             2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>>>>             2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>>>>             2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>>>>             2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>>>>             2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>>>>             2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>>>>             2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>>>>             2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>>>                 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>>                 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>>                 at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>>                 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>                 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>>                 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>>                 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>>                 ... 6 more
>>>>>>>
>>>>>>>
>>>>>>>             ########################### LOG TYPE 2
>>>>>>>             ############################################################
>>>>>>>             2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>>>             /************************************************************
>>>>>>>             STARTUP_MSG: Starting TaskTracker
>>>>>>>             STARTUP_MSG: host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             STARTUP_MSG: args = []
>>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>>             STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>>>>             ************************************************************/
>>>>>>>             2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>>>>             2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>>>>             2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>>>>             2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>>>>             2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>>>>             2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>>>>             2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>>>             2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>>>>             2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>>>>             2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>>>             2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>>             2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>>>>             2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>>>>             2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>>>             2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>>>>             2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>>>>             2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>>             2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>>             2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>>>>             2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>>>             2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>>>>             2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>>>>             2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>>>>             2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>>>>             2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>>                 at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>>                 at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>>
>>>>>>>             2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>>>             /************************************************************
>>>>>>>             SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>>             ************************************************************/
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hi Michael and Mohammad,

thanks a lot for your input!
I have pinged the people at the cluster to (eventually) disable
IPv6 and definitely check the ports corresponding to the appropriate
machines. I will keep you updated.

Regards,
Elmar
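For anyone hitting the same wall, the two failure modes in the quoted logs can be checked outside Hadoop entirely. Below is a minimal Python sketch (not from the thread; the its-cs131 hostname and the ports 35554/35555/50010/50060 are assumptions copied from the logs above) that separates "Connection refused" (nothing listening on the master's RPC port, or traffic being filtered) from "Address already in use" (a stale daemon still holding a local port):

```python
import socket

# Hypothetical endpoints taken from the logs in this thread; adjust to
# your cluster. Nothing below is Hadoop-specific: it only exercises TCP.
NAMENODE = ("its-cs131", 35554)      # namenode RPC port from LOG TYPE 1
JOBTRACKER = ("its-cs131", 35555)    # jobtracker RPC port from the TaskTracker log
LOCAL_DAEMON_PORTS = [50010, 50060]  # DataNode data port, TaskTracker HTTP port

def port_reachable(host, port, timeout=2.0):
    """True if a TCP connect to host:port succeeds. False matches the
    'Retrying connect ... Connection refused' lines in LOG TYPE 1."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def port_bindable(port, host="0.0.0.0"):
    """True if we can bind host:port locally. False matches the
    'java.net.BindException: Address already in use' in LOG TYPE 2,
    i.e. a leftover process is still holding the port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return True
    except OSError:
        return False
    finally:
        s.close()

if __name__ == "__main__":
    for host, port in (NAMENODE, JOBTRACKER):
        print(f"{host}:{port} reachable: {port_reachable(host, port)}")
    for port in LOCAL_DAEMON_PORTS:
        print(f"local port {port} free to bind: {port_bindable(port)}")
```

If port_reachable is False for the master while plain ping works, suspect iptables or the daemon binding to a different interface; if port_bindable is False on a slave, a leftover DataNode/TaskTracker from a previous run is most likely still alive and should be killed before restarting.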


On 13.08.2012 22:39, Michael Segel wrote:
>
> The key is to think about what can go wrong, but start with the
> low-hanging fruit.
>
> I mean, you could be right, but you're jumping the gun and overlooking
> simpler issues.
>
> The most common issue is that the networking traffic is being filtered.
> Of course since we're both diagnosing this with minimal information, 
> we're kind of shooting from the hip.
>
> This is why I'm asking if there is any networking traffic between the 
> nodes.  If you have partial communication, then focus on why you can't 
> see the specific traffic.
>
>
> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>
>> Thank you so very much for the detailed response, Michael. I'll keep
>> the tip in mind. Please pardon my ignorance, as I am still in the
>> learning phase.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <michael_segel@hotmail.com> wrote:
>>
>>     0.0.0.0 means that the call is going to all interfaces on the
>>     machine.  (Shouldn't be an issue...)
>>
>>     IPv4 vs IPv6? Could be an issue; however, the OP says he can write
>>     data to DNs and they seem to communicate, so if it's IPv6
>>     related, wouldn't it impact all traffic and not just a specific port?
>>     I agree... shut down IPv6 if you can.
>>
>>     I don't disagree with your assessment. I am just suggesting that
>>     before you do a really deep dive, you think about the more
>>     obvious stuff first.
>>
>>     There are a couple of other things... like do all of the
>>     /etc/hosts files on all of the machines match?
>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>
>>     BTW, you said DNS in your response. If you're using DNS, then you
>>     don't really want to have much info in the /etc/hosts file except
>>     loopback and the server's IP address.
>>
>>     Looking at the problem OP is indicating some traffic works, while
>>     other traffic doesn't. Most likely something is blocking the
>>     ports. Iptables is the first place to look.
>>
>>     Just saying. ;-)
>>
>>
>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>>     Hi Michael,
>>>            I asked for the hosts file because it looks like a
>>>     loopback problem to me. The log shows that the call is going to
>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>     and making sure that there is no problem with DNS resolution is
>>>     also necessary. Please correct me if I am wrong. Thank you.
>>>
>>>     Regards,
>>>         Mohammad Tariq
>>>
>>>
>>>
>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <michael_segel@hotmail.com> wrote:
>>>
>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>
>>>         Outside of MapR, multihomed machines can be problematic.
>>>         Hadoop doesn't generally work well when you're not using the
>>>         FQDN or its alias.
>>>
>>>         The issue isn't SSH. If you go to the node which is
>>>         having trouble connecting to another node and then try to ping
>>>         it, or some other general communication, and that succeeds,
>>>         your issue is that the port you're trying to communicate
>>>         on is blocked. Then it's more than likely an IP configuration
>>>         or firewall issue.
>>>
>>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <ema@cs.uni-kassel.de> wrote:
>>>
>>>>         Hi Michael,
>>>>
>>>>         Well, I can ssh from any node to any other without being
>>>>         prompted. The reason for this is that my home dir is
>>>>         mounted on every server in the cluster.
>>>>
>>>>         Whether the machines are multihomed: I don't know. I could
>>>>         ask if this would be of importance.
>>>>
>>>>         Shall I?
>>>>
>>>>         Regards,
>>>>         Elmar
>>>>
>>>>         On 13.08.12 14:59, Michael Segel wrote:
>>>>>         If the nodes can communicate and distribute data, then the
>>>>>         odds are that the issue isn't going to be in his /etc/hosts.
>>>>>
>>>>>         A more relevant question is if he's running a firewall on
>>>>>         each of these machines?
>>>>>
>>>>>         A simple test... ssh to one node, ping other nodes and the
>>>>>         control nodes at random to see if they can see one
>>>>>         another. Then check to see if there is a firewall running
>>>>>         which would limit the types of traffic between nodes.
>>>>>
>>>>>         One other side note... are these machines multi-homed?
>>>>>
>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>>>
>>>>>>         Hello there,
>>>>>>
>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>         don't mind.
>>>>>>
>>>>>>         Regards,
>>>>>>          Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <macek@cs.uni-kassel.de> wrote:
>>>>>>
>>>>>>             Hi,
>>>>>>
>>>>>>             I am currently trying to run my Hadoop program on a
>>>>>>             cluster. Sadly, my datanodes and tasktrackers
>>>>>>             seem to have difficulties with their communication, as
>>>>>>             their logs say:
>>>>>>             * Some datanodes and tasktrackers seem to have
>>>>>>             port problems of some kind, as can be seen in the
>>>>>>             logs below. I wondered if this might be
>>>>>>             correlated with the localhost entry in
>>>>>>             /etc/hosts, as you can read in a lot of posts with
>>>>>>             similar errors, but I checked the file: neither
>>>>>>             localhost nor 127.0.0.1/127.0.1.1 is bound there.
>>>>>>             (Although you can ping localhost... the technician of
>>>>>>             the cluster said he'd be looking for the mechanism
>>>>>>             resolving localhost.)
>>>>>>             * The other nodes can not speak with the namenode and
>>>>>>             jobtracker (its-cs131), although it is absolutely not
>>>>>>             clear why this is happening: the "dfs -put" I do
>>>>>>             directly before the job runs fine, which seems
>>>>>>             to imply that communication between those servers is
>>>>>>             working flawlessly.
>>>>>>
>>>>>>             Is there any reason why this might happen?
>>>>>>
>>>>>>
>>>>>>             Regards,
>>>>>>             Elmar
>>>>>>
>>>>>>             LOGS BELOW:
>>>>>>
>>>>>>             \____Datanodes
>>>>>>
>>>>>>             After successfully putting the data to hdfs (at this
>>>>>>             point i thought namenode and datanodes have to
>>>>>>             communicate), i get the following errors when
>>>>>>             starting the job:
>>>>>>
>>>>>>             There are 2 kinds of logs i found: the first one is
>>>>>>             big (about 12MB) and looks like this:
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 08:23:27,331 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 0 time(s).
>>>>>>             2012-08-13 08:23:28,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 1 time(s).
>>>>>>             2012-08-13 08:23:29,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 2 time(s).
>>>>>>             2012-08-13 08:23:30,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>>>             2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>>>             2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>>>             2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>>>             2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>>>             2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>>>             2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>>>             2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>>                 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>                 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>                 at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>                 at java.lang.Thread.run(Thread.java:619)
>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>                 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>                 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>                 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>                 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>                 ... 5 more
>>>>>>
>>>>>>             ... (this continues til the end of the log)
>>>>>>
>>>>>>             The second is the short kind:
>>>>>>             ########################### LOG TYPE 2
>>>>>>             ############################################################
>>>>>>             2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>>>             /************************************************************
>>>>>>             STARTUP_MSG: Starting DataNode
>>>>>>             STARTUP_MSG: host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             STARTUP_MSG: args = []
>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>             STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>>>             ************************************************************/
>>>>>>             2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>>>             2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>>>             2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>>>             2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>>>             2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>>>             2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>             2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>             2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>>>             2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>>>             2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>>>             2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>>>             2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>>>             2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>>             Caused by: java.net.BindException: Address already in use
>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>                 ... 7 more
>>>>>>
>>>>>>             2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>>>             /************************************************************
>>>>>>             SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             ************************************************************/
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             \_____TaskTracker
>>>>>>             With the TaskTrackers it is the same: there are 2 kinds.
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>>>             2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>             2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>>>             2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>>>             2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>>>             2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>>>             2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>>>             2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>>>             2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>>>             2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>>>             2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>>>             2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>>                 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>                 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>                 at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>                 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>                 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>                 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>                 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>                 ... 6 more
>>>>>>
>>>>>>
>>>>>>             ########################### LOG TYPE 2
>>>>>>             ############################################################
>>>>>>             2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>>             /************************************************************
>>>>>>             STARTUP_MSG: Starting TaskTracker
>>>>>>             STARTUP_MSG: host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             STARTUP_MSG: args = []
>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>             STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>>>             ************************************************************/
>>>>>>             2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>>>             2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>>>             2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>>>             2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>>>             2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>>>             2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>>>             2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>>             2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>>>             2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>>>             2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>>             2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>             2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>>>             2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>>>             2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>>             2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>>>             2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>>>             2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>             2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>>>             2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>>             2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>>>             2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>>>             2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>>>             2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>>>             2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>                 at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>                 at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>
>>>>>>             2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>>             /************************************************************
>>>>>>             SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             ************************************************************/
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hi Michael and Mohammad,

thanks a lot for your input!
I have pinged the people at the cluster in order to (eventually) disable 
IPv6 and to definitely check the ports corresponding to the appropriate 
machines. I will keep you updated.
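
In the meantime, here is a rough sketch of the checks I have in mind, 
runnable from any worker node. The host its-cs131 and the ports 
35554/35555 come from the logs above; 50010 (DataNode data transfer) and 
50060 (TaskTracker HTTP) are the stock Hadoop 1.x defaults; the helper 
function names are my own:

```shell
#!/usr/bin/env bash
# Quick connectivity triage. The demo calls at the bottom run against
# loopback only, so the script is safe to run anywhere; on the cluster
# the hostnames/ports below would be replaced by the real ones.

check_host() {                 # basic ICMP reachability
  ping -c 1 -W 2 "$1" >/dev/null 2>&1 \
    && echo "$1: reachable" \
    || echo "$1: no ICMP reply"
}

check_port() {                 # TCP connect test via bash's /dev/tcp
  local host=$1 port=$2
  if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
    echo "$host:$port open"
  else
    # "connection refused" and "firewalled" look the same from here
    echo "$host:$port closed or filtered"
  fi
}

port_owner() {                 # what already holds a local port ("Address already in use")
  ss -ltnp 2>/dev/null | grep ":$1 " || echo "port $1: nothing listening locally"
}

# Demo against loopback; on the cluster one would run e.g.:
#   check_host its-cs131; check_port its-cs131 35554; check_port its-cs131 35555
check_port 127.0.0.1 1   # port 1 should be closed almost everywhere
port_owner 50010         # the port from the DataNode BindException
```

If check_port reports "closed or filtered" for 35554/35555 from a worker 
while the daemons are up on its-cs131, a packet filter (iptables) between 
the nodes is the first suspect; if port_owner shows a stale process on 
50010 or 50060, that would explain the "Address already in use" shutdowns.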

Regards,
Elmar


On 13.08.2012 22:39, Michael Segel wrote:
>
> The key is to think about what can go wrong, but start with the low 
> hanging fruit.
>
> I mean you could be right; however, you're jumping the gun and 
> overlooking simpler issues.
>
> The most common issue is that the networking traffic is being filtered.
> Of course since we're both diagnosing this with minimal information, 
> we're kind of shooting from the hip.
>
> This is why I'm asking if there is any networking traffic between the 
> nodes.  If you have partial communication, then focus on why you can't 
> see the specific traffic.
>
>
> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>
>> Thank you so very much for the detailed response Michael. I'll keep 
>> the tip in mind. Please pardon my ignorance, as I am still in the 
>> learning phase.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel 
>> <michael_segel@hotmail.com> wrote:
>>
>>     0.0.0.0 means that the call is going to all interfaces on the
>>     machine.  (Shouldn't be an issue...)
>>
>>     IPv4 vs IPv6? Could be an issue; however, the OP says he can write
>>     data to DNs and they seem to communicate, so if it's IPv6
>>     related, wouldn't it impact all traffic and not just a specific port?
>>     I agree... shut down IPv6 if you can.
>>
>>     I don't disagree with your assessment. I am just suggesting that
>>     before you do a really deep dive, you think about the more
>>     obvious stuff first.
>>
>>     There are a couple of other things... like do all of the
>>     /etc/hosts files on all of the machines match?
>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>
>>     BTW, you said DNS in your response. If you're using DNS, then you
>>     don't really want to have much info in the /etc/hosts file except
>>     the loopback and the server's IP address.
>>
>>     Looking at the problem OP is indicating some traffic works, while
>>     other traffic doesn't. Most likely something is blocking the
>>     ports. Iptables is the first place to look.
>>
>>     Just saying. ;-)
>>
>>
>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>
>>>     Hi Michael,
>>>            I asked for the hosts file because there seems to be some
>>>     loopback problem to me. The log shows that the call is going to
>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>     and making sure that there is no problem with the DNS resolution
>>>     is also necessary. Please correct me if I am wrong. Thank you.
>>>
>>>     Regards,
>>>         Mohammad Tariq
>>>
>>>
>>>
>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel
>>>     <michael_segel@hotmail.com> wrote:
>>>
>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>
>>>         Outside of MapR, multihomed machines can be problematic.
>>>         Hadoop doesn't generally work well when you're not using the
>>>         FQDN or its alias.
>>>
>>>         The issue isn't the SSH. If you go to the node which is
>>>         having trouble connecting to another node, then try to ping
>>>         it, or some other general communication; if that succeeds,
>>>         your issue is that the port you're trying to communicate
>>>         with is blocked. Then it's more than likely an ipconfig or
>>>         firewall issue.
>>>
>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>>>         <ema@cs.uni-kassel.de> wrote:
>>>
>>>>         Hi Michael,
>>>>
>>>>         well, I can ssh from any node to any other without being
>>>>         prompted for a password. The reason for this is that my home
>>>>         directory is mounted on every server in the cluster.
>>>>
>>>>         Whether the machines are multihomed: I don't know. I could
>>>>         ask if this is of importance.
>>>>
>>>>         Shall I?
>>>>
>>>>         Regards,
>>>>         Elmar
>>>>
>>>>         On 13.08.12 14:59, Michael Segel wrote:
>>>>>         If the nodes can communicate and distribute data, then the
>>>>>         odds are that the issue isn't going to be in his /etc/hosts.
>>>>>
>>>>>         A more relevant question is if he's running a firewall on
>>>>>         each of these machines?
>>>>>
>>>>>         A simple test... ssh to one node, ping other nodes and the
>>>>>         control nodes at random to see if they can see one
>>>>>         another. Then check to see if there is a firewall running
>>>>>         which would limit the types of traffic between nodes.
>>>>>
>>>>>         One other side note... are these machines multi-homed?
>>>>>
>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq
>>>>>         <dontariq@gmail.com> wrote:
>>>>>
>>>>>>         Hello there,
>>>>>>
>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>         don't mind.
>>>>>>
>>>>>>         Regards,
>>>>>>          Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>>>         <macek@cs.uni-kassel.de> wrote:
>>>>>>
>>>>>>             Hi,
>>>>>>
>>>>>>             I am currently trying to run my Hadoop program on a
>>>>>>             cluster. Sadly, my datanodes and tasktrackers seem to
>>>>>>             have difficulties communicating, as their logs say:
>>>>>>             * Some datanodes and tasktrackers seem to have port
>>>>>>             problems of some kind, as can be seen in the logs
>>>>>>             below. I wondered if this might be correlated with the
>>>>>>             localhost entry in /etc/hosts, as you can read in a
>>>>>>             lot of posts with similar errors, but I checked the
>>>>>>             file: neither localhost nor 127.0.0.1/127.0.1.1 is
>>>>>>             bound there. (Although you can ping localhost... the
>>>>>>             technician of the cluster said he'd be looking into
>>>>>>             the mechanism resolving localhost.)
>>>>>>             * The other nodes cannot speak with the namenode and
>>>>>>             jobtracker (its-cs131), although it is absolutely not
>>>>>>             clear why this is happening: the "dfs -put" I do
>>>>>>             directly before the job runs fine, which seems to
>>>>>>             imply that communication between those servers is
>>>>>>             working flawlessly.
>>>>>>
>>>>>>             Is there any reason why this might happen?
>>>>>>
>>>>>>
>>>>>>             Regards,
>>>>>>             Elmar
>>>>>>
>>>>>>             LOGS BELOW:
>>>>>>
>>>>>>             \____Datanodes
>>>>>>
>>>>>>             After successfully putting the data to HDFS (at this
>>>>>>             point I thought the namenode and datanodes have to
>>>>>>             communicate), I get the following errors when
>>>>>>             starting the job:
>>>>>>
>>>>>>             There are 2 kinds of logs I found: the first one is
>>>>>>             big (about 12 MB) and looks like this:
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 08:23:27,331 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 0 time(s).
>>>>>>             2012-08-13 08:23:28,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 1 time(s).
>>>>>>             2012-08-13 08:23:29,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 2 time(s).
>>>>>>             2012-08-13 08:23:30,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 3 time(s).
>>>>>>             2012-08-13 08:23:31,333 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 4 time(s).
>>>>>>             2012-08-13 08:23:32,333 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 5 time(s).
>>>>>>             2012-08-13 08:23:33,334 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 6 time(s).
>>>>>>             2012-08-13 08:23:34,334 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 7 time(s).
>>>>>>             2012-08-13 08:23:35,334 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 8 time(s).
>>>>>>             2012-08-13 08:23:36,335 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 9 time(s).
>>>>>>             2012-08-13 08:23:36,335 WARN
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             java.net.ConnectException: Call to
>>>>>>             its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/> failed on connection
>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>                 at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>                 at java.lang.Thread.run(Thread.java:619)
>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>                 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>                 ... 5 more
>>>>>>
>>>>>>             ... (this continues till the end of the log)
>>>>>>
>>>>>>             The second is the short kind:
>>>>>>             ########################### LOG TYPE 2
>>>>>>             ############################################################
>>>>>>             2012-08-13 00:59:19,038 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             STARTUP_MSG:
>>>>>>             /************************************************************
>>>>>>             STARTUP_MSG: Starting DataNode
>>>>>>             STARTUP_MSG: host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             STARTUP_MSG: args = []
>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>             STARTUP_MSG: build =
>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>             23:58:21 UTC 2012
>>>>>>             ************************************************************/
>>>>>>             2012-08-13 00:59:19,203 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig: loaded
>>>>>>             properties from hadoop-metrics2.properties
>>>>>>             2012-08-13 00:59:19,216 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source MetricsSystem,sub=Stats registered.
>>>>>>             2012-08-13 00:59:19,217 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>             2012-08-13 00:59:19,218 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>             DataNode metrics system started
>>>>>>             2012-08-13 00:59:19,306 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source ugi registered.
>>>>>>             2012-08-13 00:59:19,346 INFO
>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>             native-hadoop library
>>>>>>             2012-08-13 00:59:20,482 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>             org.apache.hadoop.hdfs.server.common.Storage: Storage
>>>>>>             directory /home/work/bmacek/hadoop/hdfs/slave is not
>>>>>>             formatted.
>>>>>>             2012-08-13 00:59:21,584 INFO
>>>>>>             org.apache.hadoop.hdfs.server.common.Storage:
>>>>>>             Formatting ...
>>>>>>             2012-08-13 00:59:21,787 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             Registered FSDatasetStatusMBean
>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>             Shutting down all async disk service threads...
>>>>>>             2012-08-13 00:59:21,897 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>>>             All async disk service threads have been shut down.
>>>>>>             2012-08-13 00:59:21,898 ERROR
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             java.net.BindException: Problem binding to
>>>>>>             /0.0.0.0:50010 : Address already in use
>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>>             Caused by: java.net.BindException: Address already in use
>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>                 ... 7 more
>>>>>>
>>>>>>             2012-08-13 00:59:21,899 INFO
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             SHUTDOWN_MSG:
>>>>>>             /************************************************************
>>>>>>             SHUTDOWN_MSG: Shutting down DataNode at
>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             ************************************************************/
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             \____TaskTracker
>>>>>>             With TaskTrackers it is the same: there are 2 kinds.
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 02:09:54,645 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Resending
>>>>>>             'status' to 'its-cs131' with reponseId '879
>>>>>>             2012-08-13 02:09:55,646 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>             2012-08-13 02:09:56,646 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>>>             2012-08-13 02:09:57,647 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>>>             2012-08-13 02:09:58,647 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>>>             2012-08-13 02:09:59,648 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>>>             2012-08-13 02:10:00,648 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>>>             2012-08-13 02:10:01,649 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>>>             2012-08-13 02:10:02,649 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>>>             2012-08-13 02:10:03,650 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>>>             2012-08-13 02:10:04,650 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>>>             2012-08-13 02:10:04,651 ERROR
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Caught
>>>>>>             exception: java.net.ConnectException: Call to
>>>>>>             its-cs131/141.51.205.41:35555 failed on connection
>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown
>>>>>>             Source)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>                 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>                 ... 6 more
>>>>>>
>>>>>>
>>>>>>             ########################### LOG TYPE 2
>>>>>>             ############################################################
>>>>>>             2012-08-13 00:59:24,376 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>>             /************************************************************
>>>>>>             STARTUP_MSG: Starting TaskTracker
>>>>>>             STARTUP_MSG: host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             STARTUP_MSG: args = []
>>>>>>             STARTUP_MSG: version = 1.0.2
>>>>>>             STARTUP_MSG: build =
>>>>>>             https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>>>             -r 1304954; compiled by 'hortonfo' on Sat Mar 24
>>>>>>             23:58:21 UTC 2012
>>>>>>             ************************************************************/
>>>>>>             2012-08-13 00:59:24,569 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsConfig: loaded
>>>>>>             properties from hadoop-metrics2.properties
>>>>>>             2012-08-13 00:59:24,626 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source MetricsSystem,sub=Stats registered.
>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>             Scheduled snapshot period at 10 second(s).
>>>>>>             2012-08-13 00:59:24,627 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>>>             TaskTracker metrics system started
>>>>>>             2012-08-13 00:59:24,950 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source ugi registered.
>>>>>>             2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging
>>>>>>             to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log)
>>>>>>             via org.mortbay.log.Slf4jLog
>>>>>>             2012-08-13 00:59:25,206 INFO
>>>>>>             org.apache.hadoop.http.HttpServer: Added global
>>>>>>             filtersafety
>>>>>>             (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>>             2012-08-13 00:59:25,232 INFO
>>>>>>             org.apache.hadoop.mapred.TaskLogsTruncater:
>>>>>>             Initializing logs' truncater with mapRetainSize=-1
>>>>>>             and reduceRetainSize=-1
>>>>>>             2012-08-13 00:59:25,237 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>             tasktracker with owner as bmacek
>>>>>>             2012-08-13 00:59:25,239 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Good mapred
>>>>>>             local directories are:
>>>>>>             /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>>             2012-08-13 00:59:25,244 INFO
>>>>>>             org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>>>             native-hadoop library
>>>>>>             2012-08-13 00:59:25,255 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source jvm registered.
>>>>>>             2012-08-13 00:59:25,256 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source TaskTrackerMetrics registered.
>>>>>>             2012-08-13 00:59:25,279 INFO
>>>>>>             org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source RpcDetailedActivityForPort54850
>>>>>>             registered.
>>>>>>             2012-08-13 00:59:25,282 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source RpcActivityForPort54850 registered.
>>>>>>             2012-08-13 00:59:25,287 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server Responder:
>>>>>>             starting
>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server listener on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 0 on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,288 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 1 on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker up
>>>>>>             at: localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 3 on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>             org.apache.hadoop.ipc.Server: IPC Server handler 2 on
>>>>>>             54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>             tracker
>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:26,321 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35555
>>>>>>             <http://141.51.205.41:35555/>. Already tried 0 time(s).
>>>>>>             2012-08-13 00:59:38,104 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Starting
>>>>>>             thread: Map-events fetcher for all reduce tasks on
>>>>>>             tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:38,120 INFO
>>>>>>             org.apache.hadoop.util.ProcessTree: setsid exited
>>>>>>             with exit code 0
>>>>>>             2012-08-13 00:59:38,134 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Using
>>>>>>             ResourceCalculatorPlugin :
>>>>>>             org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>>             2012-08-13 00:59:38,137 WARN
>>>>>>             org.apache.hadoop.mapred.TaskTracker: TaskTracker's
>>>>>>             totalMemoryAllottedForTasks is -1. TaskMemoryManager
>>>>>>             is disabled.
>>>>>>             2012-08-13 00:59:38,145 INFO
>>>>>>             org.apache.hadoop.mapred.IndexCache: IndexCache
>>>>>>             created with max memory = 10485760
>>>>>>             2012-08-13 00:59:38,158 INFO
>>>>>>             org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
>>>>>>             MBean for source ShuffleServerMetrics registered.
>>>>>>             2012-08-13 00:59:38,161 INFO
>>>>>>             org.apache.hadoop.http.HttpServer: Port returned by
>>>>>>             webServer.getConnectors()[0].getLocalPort() before
>>>>>>             open() is -1. Opening the listener on 50060
>>>>>>             2012-08-13 00:59:38,161 ERROR
>>>>>>             org.apache.hadoop.mapred.TaskTracker: Can not start
>>>>>>             task tracker because java.net.BindException: Address
>>>>>>             already in use
>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>                 at
>>>>>>             org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>                 at
>>>>>>             org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>                 at
>>>>>>             org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>
>>>>>>             2012-08-13 00:59:38,163 INFO
>>>>>>             org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>>             /************************************************************
>>>>>>             SHUTDOWN_MSG: Shutting down TaskTracker at
>>>>>>             its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             ************************************************************/
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
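Both LOG TYPE 2 dumps above end in java.net.BindException: Address already in use (port 50010 for the DataNode, 50060 for the TaskTracker's HTTP server), which usually means a daemon from an earlier start attempt is still holding the port. A hypothetical sketch for spotting the stale listener before restarting (the netstat flags are the common Linux ones; verify on your distro):

```shell
# Hypothetical check for the "Address already in use" errors above:
# report what, if anything, is listening on a given TCP port.
port_listener() {
    # $1 = TCP port number
    netstat -tln 2>/dev/null | grep "[:.]$1 " \
        || echo "nothing listening on port $1"
}

# Example:
#   port_listener 50010    # DataNode data-transfer port
#   port_listener 50060    # TaskTracker HTTP port
#
# If a stale Hadoop daemon holds the port, stop it cleanly first:
#   bin/hadoop-daemon.sh stop datanode
#   bin/hadoop-daemon.sh stop tasktracker
```

If a port shows a listener after the daemons are supposedly stopped, the leftover process has to go away before a restart can succeed.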


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hi Michael and Mohammad,

thanks a lot for your input!
I have pinged the people at the cluster to (eventually) disable IPv6 
and to double-check the ports on the machines in question. I will keep 
you updated.

Regards,
Elmar
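
The port checks mentioned here can be sketched as a small probe script. This is a hypothetical sketch, not a verified fix: the hostname its-cs131 and ports 35554/35555 are taken from the logs in this thread, and the /dev/tcp redirection assumes bash.

```shell
# Hypothetical probe, run from a worker node. The hostname and ports
# (35554 = namenode RPC, 35555 = jobtracker RPC) come from the logs in
# this thread; adjust to your cluster. /dev/tcp is a bash feature.
check_port() {
    # $1 = host, $2 = TCP port
    if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
        echo "$1:$2 open"
    else
        echo "$1:$2 closed (blocked, or nothing listening)"
    fi
}

# Example:
#   check_port its-cs131 35554   # namenode RPC
#   check_port its-cs131 35555   # jobtracker RPC
# "closed" here corresponds to the java.net.ConnectException:
# Connection refused entries in the DataNode and TaskTracker logs.
```

A "closed" result from a worker while the daemon is running on the master points at a firewall or a daemon bound to the wrong interface rather than at Hadoop itself.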


On 13.08.2012 22:39, Michael Segel wrote:
>
> The key is to think about what can go wrong, but start with the
> low-hanging fruit.
>
> I mean, you could be right, but you may be jumping the gun and 
> overlooking simpler issues.
>
> The most common issue is that the networking traffic is being filtered.
> Of course since we're both diagnosing this with minimal information, 
> we're kind of shooting from the hip.
>
> This is why I'm asking if there is any networking traffic between the 
> nodes.  If you have partial communication, then focus on why you can't 
> see the specific traffic.
>
>
> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq 
> <dontariq@gmail.com> wrote:
>
>> Thank you so very much for the detailed response Michael. I'll keep 
>> the tip in mind. Please pardon my ignorance, as I am still in the 
>> learning phase.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel 
>> <michael_segel@hotmail.com> wrote:
>>
>>     0.0.0.0 means that the call is going to all interfaces on the
>>     machine.  (Shouldn't be an issue...)
>>
>>     IPv4 vs IPv6? Could be an issue, however OP says he can write
>>     data to DNs and they seem to communicate; therefore, if it's IPv6
>>     related, wouldn't it impact all traffic and not just a specific port?
>>     I agree... shut down IPv6 if you can.
>>
>>     I don't disagree with your assessment. I am just suggesting that
>>     before you do a really deep dive, you think about the more
>>     obvious stuff first.
>>
>>     There are a couple of other things... like do all of the
>>     /etc/hosts files on all of the machines match?
>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>
>>     BTW, you said DNS in your response. if you're using DNS, then you
>>     don't really want to have much info in the /etc/hosts file except
>>     loopback and the server's IP address.
>>
>>     Looking at the problem OP is indicating some traffic works, while
>>     other traffic doesn't. Most likely something is blocking the
>>     ports. Iptables is the first place to look.
>>
>>     Just saying. ;-)
>>
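Michael's question above, whether all the /etc/hosts files match and agree with DNS, can be checked mechanically. A hypothetical helper (it assumes the standard hosts-file layout; the node names in the example are placeholders):

```shell
# Hypothetical helper for the /etc/hosts consistency check above:
# print the address a hosts file maps a hostname to, so the result can
# be compared across nodes (e.g. via ssh) and against `getent hosts`.
hosts_entry() {
    # $1 = path to a hosts file, $2 = hostname to look up
    awk -v h="$2" '$0 !~ /^[[:space:]]*#/ {
        for (i = 2; i <= NF; i++) if ($i == h) print $1
    }' "$1"
}

# Example (node list is a placeholder):
#   for n in its-cs131 its-cs132 its-cs133; do
#       ssh "$n" cat /etc/hosts > "/tmp/hosts.$n"
#       hosts_entry "/tmp/hosts.$n" its-cs131
#   done
# All nodes should print the same address, and it should match DNS:
#   getent hosts its-cs131
```

Any node whose hosts file disagrees with the others, or with DNS, is a prime suspect for the partial-communication symptoms in this thread.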
>>
>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq 
>>     <dontariq@gmail.com> wrote:
>>
>>>     Hi Michael,
>>>            I asked for the hosts file because there seems to be a
>>>     loopback problem. The log shows that the call is going to
>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>     and making sure that there is no problem with DNS resolution is
>>>     also necessary. Please correct me if I am wrong. Thank you.
>>>
>>>     Regards,
>>>         Mohammad Tariq
>>>
>>>
>>>
>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel
>>>     <michael_segel@hotmail.com> wrote:
>>>
>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>
>>>         Outside of MapR, multihomed machines can be problematic.
>>>         Hadoop doesn't generally work well when you're not using the
>>>         FQDN or its alias.
>>>
>>>         The issue isn't the SSH. If you go to the node which is
>>>         having trouble connecting to another node, then try to ping
>>>         it or some other general communication; if that succeeds,
>>>         your issue is that the port you're trying to communicate
>>>         with is blocked. Then it's more than likely a network
>>>         configuration or firewall issue.
>>>
>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>>>         <ema@cs.uni-kassel.de> wrote:
>>>
>>>>         Hi Michael,
>>>>
>>>>         well, I can ssh from any node to any other without being
>>>>         prompted for a password. The reason for this is that my
>>>>         home dir is mounted on every server in the cluster.
>>>>
>>>>         Whether the machines are multihomed, I don't know. I could
>>>>         ask if this would be of importance.
>>>>
>>>>         Shall i?
>>>>
>>>>         Regards,
>>>>         Elmar
>>>>
>>>>         On 13.08.12 14:59, Michael Segel wrote:
>>>>>         If the nodes can communicate and distribute data, then the
>>>>>         odds are that the issue isn't going to be in his /etc/hosts.
>>>>>
>>>>>         A more relevant question is if he's running a firewall on
>>>>>         each of these machines?
>>>>>
>>>>>         A simple test... ssh to one node, ping other nodes and the
>>>>>         control nodes at random to see if they can see one
>>>>>         another. Then check to see if there is a firewall running
>>>>>         which would limit the types of traffic between nodes.
>>>>>
>>>>>         One other side note... are these machines multi-homed?
>>>>>
>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq
>>>>>         <dontariq@gmail.com> wrote:
>>>>>
>>>>>>         Hello there,
>>>>>>
>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>         don't mind.
>>>>>>
>>>>>>         Regards,
>>>>>>          Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>>>         <macek@cs.uni-kassel.de> wrote:
>>>>>>
>>>>>>             Hi,
>>>>>>
>>>>>>             I am currently trying to run my Hadoop program on a
>>>>>>             cluster. Sadly, my datanodes and tasktrackers
>>>>>>             seem to have difficulties with their communication, as
>>>>>>             their logs say:
>>>>>>             * Some datanodes and tasktrackers seem to have
>>>>>>             port problems of some kind, as can be seen in the
>>>>>>             logs below. I wondered if this might be due to
>>>>>>             reasons correlated with the localhost entry in
>>>>>>             /etc/hosts, as you can read in a lot of posts with
>>>>>>             similar errors, but I checked the file: neither
>>>>>>             localhost nor 127.0.0.1/127.0.1.1 is bound there.
>>>>>>             (Although you can ping localhost... the technician of
>>>>>>             the cluster said he'd be looking into the mechanism
>>>>>>             resolving localhost.)
>>>>>>             * The other nodes cannot speak with the namenode and
>>>>>>             jobtracker (its-cs131), although it is absolutely not
>>>>>>             clear why this is happening: the "dfs -put" I do
>>>>>>             directly before the job runs fine, which seems
>>>>>>             to imply that communication between those servers is
>>>>>>             working flawlessly.
>>>>>>
>>>>>>             Is there any reason why this might happen?
>>>>>>
>>>>>>
>>>>>>             Regards,
>>>>>>             Elmar
>>>>>>
>>>>>>             LOGS BELOW:
>>>>>>
>>>>>>             \____Datanodes
>>>>>>
>>>>>>             After successfully putting the data to hdfs (at this
>>>>>>             point I thought namenode and datanodes have to
>>>>>>             communicate), I get the following errors when
>>>>>>             starting the job:
>>>>>>
>>>>>>             There are 2 kinds of logs i found: the first one is
>>>>>>             big (about 12MB) and looks like this:
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 08:23:27,331 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>             2012-08-13 08:23:28,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>>>             2012-08-13 08:23:29,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>>>             2012-08-13 08:23:30,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>>>             2012-08-13 08:23:31,333 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>>>             2012-08-13 08:23:32,333 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>>>             2012-08-13 08:23:33,334 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>>>             2012-08-13 08:23:34,334 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>>>             2012-08-13 08:23:35,334 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>>>             2012-08-13 08:23:36,335 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>>>             2012-08-13 08:23:36,335 WARN
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>>>             java.net.ConnectException: Call to
>>>>>>             its-cs131/141.51.205.41:35554 failed on connection
>>>>>>             exception: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>                 at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>                 at
>>>>>>             org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>                 at java.lang.Thread.run(Thread.java:619)
>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>                 at
>>>>>>             sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>                 at org.apache.hadoop.net
>>>>>>             <http://org.apache.hadoop.net/>.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>                 at org.apache.hadoop.net
>>>>>>             <http://org.apache.hadoop.net/>.NetUtils.connect(NetUtils.java:489)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>                 at
>>>>>>             org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>                 ... 5 more
>>>>>>
>>>>>>             ... (this continues until the end of the log)
>>>>>>
>>>>>>             The second kind is short:
>>>>>>             ########################### LOG TYPE 2
>>>>>>             ############################################################
>>>>>>             2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>>>             /************************************************************
>>>>>>             STARTUP_MSG: Starting DataNode
>>>>>>             STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             STARTUP_MSG:   args = []
>>>>>>             STARTUP_MSG:   version = 1.0.2
>>>>>>             STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>>>             ************************************************************/
>>>>>>             2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>>>             2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>>>             2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>>>             2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>>>             2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>>>             2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>             2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>>             2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>>>             2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>>>             2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>>>             2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>>>             2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>>>             2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>                 at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>>             Caused by: java.net.BindException: Address already in use
>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>                 at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>                 ... 7 more
>>>>>>
>>>>>>             2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>>>             /************************************************************
>>>>>>             SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             ************************************************************/
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>             \_____TaskTracker
>>>>>>             With TaskTrackers it is the same: there are 2 kinds.
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>>>             2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>             2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>>>             2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>>>             2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>>>             2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>>>             2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>>>             2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>>>             2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>>>             2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>>>             2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>>>             2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>>                 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>                 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>                 at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>>             Caused by: java.net.ConnectException: Connection refused
>>>>>>                 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>                 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>                 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>                 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>                 at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>                 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>                 at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>                 ... 6 more
>>>>>>
>>>>>>
>>>>>>             ########################### LOG TYPE 2
>>>>>>             ############################################################
>>>>>>             2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>>             /************************************************************
>>>>>>             STARTUP_MSG: Starting TaskTracker
>>>>>>             STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             STARTUP_MSG:   args = []
>>>>>>             STARTUP_MSG:   version = 1.0.2
>>>>>>             STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>>>             ************************************************************/
>>>>>>             2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>>>             2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>>>             2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>>>             2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>>>             2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>>>             2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>>>             2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>>             2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>>>             2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>>>             2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>>             2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>             2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>>>             2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>>>             2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>>             2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>>>             2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>>>             2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>>>             2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>>>             2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>>             2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>>             2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>>>             2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>>             2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>>>             2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>>>             2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>>>             2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>>>             2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>>>                 at sun.nio.ch.Net.bind(Native Method)
>>>>>>                 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>                 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>                 at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>                 at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>                 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>
>>>>>>             2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>>             /************************************************************
>>>>>>             SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>>             ************************************************************/
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <ma...@cs.uni-kassel.de>.
Hi Michael and Mohammad,

thanks a lot for your input!
I have pinged the people at the cluster in order to (eventually)
disable IPv6 and to definitely check the ports corresponding to the
appropriate machines. I will keep you updated.

Regards,
Elmar
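
For reference, the checks suggested in this thread (ping, port reachability, stale daemons still holding the DataNode/TaskTracker ports, iptables rules) can be bundled into one small script. This is only a sketch: the master hostname and the RPC ports (35554/35555) are taken from the logs above, the local ports 50010/50060 are the default DataNode/TaskTracker listeners, and everything else is illustrative.

```shell
#!/usr/bin/env bash
# Sketch of the connectivity checks discussed in this thread.
# Hostname and ports come from the logs above; adjust for your cluster.
MASTER=${1:-its-cs131}

# Succeeds if a TCP connection to host $1, port $2 can be opened.
# Uses bash's /dev/tcp so it works without nc/telnet installed.
port_open() { (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; }

# 1. Basic reachability (rules out DNS/routing problems).
ping -c 2 "$MASTER" >/dev/null 2>&1 && echo "$MASTER is reachable" \
  || echo "$MASTER is NOT reachable (DNS/routing?)"

# 2. Are the master RPC ports open? "Connection refused" in the logs
#    means the TCP handshake was rejected, not that it was slow.
for p in 35554 35555; do
  port_open "$MASTER" "$p" && echo "port $p open on $MASTER" \
    || echo "port $p closed/filtered on $MASTER"
done

# 3. Who holds the local ports? This explains the "Address already in
#    use" BindException: a stale DataNode/TaskTracker is still running.
netstat -tln 2>/dev/null | grep -E ':(50010|50060) ' \
  || echo "ports 50010/50060 are free locally"

# 4. Any firewall rules filtering traffic between nodes? (needs root)
iptables -L -n 2>/dev/null | head -n 5 || true
```

Run from a worker node (e.g. its-cs133): if the ping succeeds but the RPC ports are reported closed, a firewall or iptables rule between the nodes is the most likely culprit; if 50010/50060 are already taken, kill the stale daemons before restarting.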


On 13.08.2012 22:39, Michael Segel wrote:
>
> The key is to think about what can go wrong, but start with the low 
> hanging fruit.
>
> I mean you could be right, however you're jumping the gun and 
> overlooking simpler issues.
>
> The most common issue is that the networking traffic is being filtered.
> Of course since we're both diagnosing this with minimal information, 
> we're kind of shooting from the hip.
>
> This is why I'm asking if there is any networking traffic between the 
> nodes.  If you have partial communication, then focus on why you can't 
> see the specific traffic.
>
>
> On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <dontariq@gmail.com 
> <ma...@gmail.com>> wrote:
>
>> Thank you so very much for the detailed response Michael. I'll keep 
>> the tip in mind. Please pardon my ignorance, as I am still in the 
>> learning phase.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel 
>> <michael_segel@hotmail.com <ma...@hotmail.com>> wrote:
>>
>>     0.0.0.0 means that the call is going to all interfaces on the
>>     machine.  (Shouldn't be an issue...)
>>
>>     IPv4 vs IPv6? Could be an issue, however OP says he can write
>>     data to DNs and they seem to communicate, therefore if its IPv6
>>     related, wouldn't it impact all traffic and not just a specific port?
>>     I agree... shut down IPv6 if you can.
>>
>>     I don't disagree with your assessment. I am just suggesting that
>>     before you do a really deep dive, you think about the more
>>     obvious stuff first.
>>
>>     There are a couple of other things... like do all of the
>>     /etc/hosts files on all of the machines match?
>>     Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>
>>     BTW, you said DNS in your response. If you're using DNS, then you
>>     don't really want to have much info in the /etc/hosts file except
>>     loopback and the server's IP address.
>>
>>     Looking at the problem OP is indicating some traffic works, while
>>     other traffic doesn't. Most likely something is blocking the
>>     ports. Iptables is the first place to look.
>>
>>     Just saying. ;-)
>>
>>
>>     On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <dontariq@gmail.com
>>     <ma...@gmail.com>> wrote:
>>
>>>     Hi Michael,
>>>            I asked for the hosts file because there seems to be a
>>>     loopback problem to me. The log shows that the call is going to
>>>     0.0.0.0. Apart from what you have said, I think disabling IPv6
>>>     and making sure that there is no problem with the DNS resolution
>>>     are also necessary. Please correct me if I am wrong. Thank you.
>>>
>>>     Regards,
>>>         Mohammad Tariq
>>>
>>>
>>>
>>>     On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel
>>>     <michael_segel@hotmail.com <ma...@hotmail.com>>
>>>     wrote:
>>>
>>>         Based on your /etc/hosts output, why aren't you using DNS?
>>>
>>>         Outside of MapR, multihomed machines can be problematic.
>>>         Hadoop doesn't generally work well when you're not using the
>>>         FQDN or its alias.
>>>
>>>         The issue isn't the SSH. If you go to the node which is
>>>         having trouble connecting to another node and try to ping
>>>         it, or attempt some other general communication, and it
>>>         succeeds, then your issue is that the port you're trying
>>>         to communicate on is blocked. Then it's more than likely
>>>         an ipconfig or firewall issue.
>>>
>>>         On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>>>         <ema@cs.uni-kassel.de <ma...@cs.uni-kassel.de>> wrote:
>>>
>>>>         Hi Michael,
>>>>
>>>>         well, I can ssh from any node to any other without being
>>>>         prompted. The reason for this is that my home dir is
>>>>         mounted on every server in the cluster.
>>>>
>>>>         Whether the machines are multihomed: I don't know. I
>>>>         could ask if this is of importance.
>>>>
>>>>         Shall I?
>>>>
>>>>         Regards,
>>>>         Elmar
>>>>
>>>>         Am 13.08.12 14:59, schrieb Michael Segel:
>>>>>         If the nodes can communicate and distribute data, then the
>>>>>         odds are that the issue isn't going to be in his /etc/hosts.
>>>>>
>>>>>         A more relevant question is if he's running a firewall on
>>>>>         each of these machines?
>>>>>
>>>>>         A simple test... ssh to one node, ping other nodes and the
>>>>>         control nodes at random to see if they can see one
>>>>>         another. Then check to see if there is a firewall running
>>>>>         which would limit the types of traffic between nodes.
>>>>>
>>>>>         One other side note... are these machines multi-homed?
>>>>>
>>>>>         On Aug 13, 2012, at 7:51 AM, Mohammad Tariq
>>>>>         <dontariq@gmail.com <ma...@gmail.com>> wrote:
>>>>>
>>>>>>         Hello there,
>>>>>>
>>>>>>          Could you please share your /etc/hosts file, if you
>>>>>>         don't mind.
>>>>>>
>>>>>>         Regards,
>>>>>>          Mohammad Tariq
>>>>>>
>>>>>>
>>>>>>
>>>>>>         On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>>>         <macek@cs.uni-kassel.de <ma...@cs.uni-kassel.de>>
>>>>>>         wrote:
>>>>>>
>>>>>>             Hi,
>>>>>>
>>>>>>             i am currently trying to run my hadoop program on a
>>>>>>             cluster. Sadly though my datanodes and tasktrackers
>>>>>>             seem to have difficulties with their communication as
>>>>>>             their logs say:
>>>>>>             * Some datanodes and tasktrackers seem to have
>>>>>>             portproblems of some kind as it can be seen in the
>>>>>>             logs below. I wondered if this might be due to
>>>>>>             reasons correllated with the localhost entry in
>>>>>>             /etc/hosts as you can read in alot of posts with
>>>>>>             similar errors, but i checked the file neither
>>>>>>             localhost nor 127.0.0.1/127.0.1.1
>>>>>>             <http://127.0.0.1/127.0.1.1> is bound there.
>>>>>>             (although you can ping localhost... the technician of
>>>>>>             the cluster said he'd be looking for the mechanics
>>>>>>             resolving localhost)
>>>>>>             * The other nodes can not speak with the namenode and
>>>>>>             jobtracker (its-cs131). Although it is absolutely not
>>>>>>             clear, why this is happening: the "dfs -put" i do
>>>>>>             directly before the job is running fine, which seems
>>>>>>             to imply that communication between those servers is
>>>>>>             working flawlessly.
>>>>>>
>>>>>>             Is there any reason why this might happen?
>>>>>>
>>>>>>
>>>>>>             Regards,
>>>>>>             Elmar
>>>>>>
>>>>>>             LOGS BELOW:
>>>>>>
>>>>>>             \____Datanodes
>>>>>>
>>>>>>             After successfully putting the data to hdfs (at this
>>>>>>             point i thought namenode and datanodes have to
>>>>>>             communicate), i get the following errors when
>>>>>>             starting the job:
>>>>>>
>>>>>>             There are 2 kinds of logs i found: the first one is
>>>>>>             big (about 12MB) and looks like this:
>>>>>>             ############################### LOG TYPE 1
>>>>>>             ############################################################
>>>>>>             2012-08-13 08:23:27,331 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 0 time(s).
>>>>>>             2012-08-13 08:23:28,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 1 time(s).
>>>>>>             2012-08-13 08:23:29,332 INFO
>>>>>>             org.apache.hadoop.ipc.Client: Retrying connect to
>>>>>>             server: its-cs131/141.51.205.41:35554
>>>>>>             <http://141.51.205.41:35554/>. Already tried 2 time(s).
>>>>>>             2012-08-13 08:23:30,332 INFO
>>>>>> org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>>     at java.lang.Thread.run(Thread.java:619)
>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>     ... 5 more
>>>>>>
>>>>>> ... (this continues til the end of the log)
>>>>>>
>>>>>> The second is the short kind:
>>>>>> ########################### LOG TYPE 2 ############################################################
>>>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>>> /************************************************************
>>>>>> STARTUP_MSG: Starting DataNode
>>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>> STARTUP_MSG:   args = []
>>>>>> STARTUP_MSG:   version = 1.0.2
>>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>>> ************************************************************/
>>>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>>> Caused by: java.net.BindException: Address already in use
>>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>>     ... 7 more
>>>>>>
>>>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>>> /************************************************************
>>>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>> ************************************************************/
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> \_____TaskTracker
>>>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>>>> ############################### LOG TYPE 1 ############################################################
>>>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>>     ... 6 more
>>>>>>
>>>>>> ########################### LOG TYPE 2 ############################################################
>>>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>>> /************************************************************
>>>>>> STARTUP_MSG: Starting TaskTracker
>>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>> STARTUP_MSG:   args = []
>>>>>> STARTUP_MSG:   version = 1.0.2
>>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>>> ************************************************************/
>>>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>>>
>>>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>>> /************************************************************
>>>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>>> ************************************************************/
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
The key is to think about what can go wrong, but start with the low-hanging fruit. 

I mean, you could be right, but you're jumping the gun and overlooking simpler issues. 

The most common issue is that the network traffic is being filtered. 
Of course, since we're both diagnosing this with minimal information, we're kind of shooting from the hip. 

This is why I'm asking if there is any network traffic between the nodes. If you have partial communication, then focus on why you can't see the specific traffic.
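
[Editor's note: the advice above can be acted on without Hadoop at all, by probing the master's IPC port directly from a worker node. The sketch below is plain Python; the host and port are placeholders taken from the logs, not values the thread confirms. It distinguishes "connection refused" (the host answered, but nothing is listening or a firewall actively rejected the connection, as in the DataNode log) from a silent timeout (traffic being dropped, e.g. by a filtering firewall).]

```python
import errno
import socket

def probe(host, port, timeout=3.0):
    """Try a TCP connect; return 'open', 'refused', or 'filtered'."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"                  # something is listening on host:port
    except socket.timeout:
        return "filtered"              # no answer at all: likely dropped by a firewall
    except ConnectionRefusedError:
        return "refused"               # host reachable, but no listener on that port
    finally:
        s.close()

# e.g. probe("its-cs131", 35554) from a worker, for the NameNode IPC port in the logs
```

Run from a node whose daemon logs show the retries: "refused" points at the master process not listening (or rejecting), while "filtered" points at packet filtering between the nodes.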


On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Thank you so very much for the detailed response Michael. I'll keep the tip in mind. Please pardon my ignorance, as I am still in the learning phase.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com> wrote:
> 0.0.0.0 means that the call is going to all interfaces on the machine.  (Shouldn't be an issue...)
> 
> IPv4 vs. IPv6? Could be an issue; however, the OP says he can write data to the DNs and they seem to communicate. Therefore, if it's IPv6-related, wouldn't it impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
> 
> I don't disagree with your assessment. I am just suggesting that before you do a really deep dive, you think about the more obvious stuff first. 
> 
> There are a couple of other things... like do all of the /etc/hosts files on all of the machines match? 
> Is the OP using both /etc/hosts and DNS? If so, are they in sync? 
> 
> BTW, you said DNS in your response. if you're using DNS, then you don't really want to have much info in the /etc/hosts file except loopback and the server's IP address. 
> 
> Looking at the problem OP is indicating some traffic works, while other traffic doesn't. Most likely something is blocking the ports. Iptables is the first place to look. 
> 
> Just saying. ;-) 
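
[Editor's note: the other failure in the logs, the BindException "Address already in use", is what the OS raises when some process, e.g. a leftover DataNode or TaskTracker from an earlier run, still holds the port. Nothing Hadoop-specific is needed to see the same failure mode; the sketch below reproduces it on an arbitrary free port.]

```python
import socket

# First socket grabs an arbitrary free port and listens on it,
# playing the role of the stale daemon still holding the port.
a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
a.bind(("127.0.0.1", 0))
a.listen(1)
port = a.getsockname()[1]

# Second socket tries to bind the same port, like a restarted daemon would.
b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
bind_error = None
try:
    b.bind(("127.0.0.1", port))
except OSError as e:
    bind_error = e          # same failure mode as the DataNode's BindException
b.close()
a.close()

if bind_error is not None:
    print("bind failed:", bind_error.strerror)   # "Address already in use" on Linux
```

So before digging into the network, it is worth checking (e.g. with `jps`) whether an old DataNode/TaskTracker is still running on ports 50010 and 50060.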
> 
> 
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
> 
>> Hi Michael,
>>        I asked for the hosts file because it looks like a loopback problem to me. The log shows that the call is going to 0.0.0.0. Apart from what you have said, I think disabling IPv6 and making sure that there is no problem with DNS resolution is also necessary. Please correct me if I am wrong. Thank you.
>> 
>> Regards,
>>     Mohammad Tariq
>> 
>> 
>> 
>> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
>> Based on your /etc/hosts output, why aren't you using DNS? 
>> 
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 
>> 
>> The issue isn't SSH. If you go to the node that is having trouble connecting to another node and try to ping it, or some other general communication, and that succeeds, then your issue is that the port you're trying to communicate with is blocked. In that case it's more than likely a network-config or firewall issue.
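
[Editor's note: the FQDN point above can also be checked programmatically. The sketch below (the function name `resolution_report` is my own, not a Hadoop API) reports how the node's own hostname and localhost resolve; if a worker's own name maps to a loopback address, daemons can end up advertising it, exactly as in the "TaskTracker up at: localhost/127.0.0.1:54850" log line later in the thread.]

```python
import socket

def resolution_report():
    """Show how this host's names resolve, to spot loopback mix-ups."""
    short = socket.gethostname()
    try:
        own_ip = socket.gethostbyname(short)
    except socket.gaierror:
        own_ip = None        # hostname not resolvable at all: also a problem
    return {
        "hostname": short,
        "fqdn": socket.getfqdn(),
        "hostname->ip": own_ip,
        "localhost->ip": socket.gethostbyname("localhost"),
    }

# On a healthy Hadoop worker, "hostname->ip" should be the node's real
# address, NOT 127.0.0.1/127.0.1.1; only "localhost->ip" should be loopback.
```

Running this on each node and comparing the output against /etc/hosts and DNS is a quick way to find the inconsistency Michael describes.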
>> 
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:
>> 
>>> Hi Michael,
>>> 
>>> Well, I can ssh from any node to any other without being prompted. The reason for this is that my home dir is mounted on every server in the cluster. 
>>> 
>>> If the machines are multihomed: I don't know. I could ask if this would be of importance.
>>> 
>>> Shall i?
>>> 
>>> Regards,
>>> Elmar
>>> 
>>> Am 13.08.12 14:59, schrieb Michael Segel:
>>>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>>>> 
>>>> A more relevant question is if he's running a firewall on each of these machines? 
>>>> 
>>>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>>>> 
>>>> One other side note... are these machines multi-homed?
>>>> 
>>>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>>> 
>>>>> Hello there,
>>>>> 
>>>>>      Could you please share your /etc/hosts file, if you don't mind.
>>>>> 
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>>>> Hi,
>>>>> 
>>>>> I am currently trying to run my Hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties with their communication, as their logs show:
>>>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd look into the mechanism resolving localhost.)
>>>>> * The other nodes cannot talk to the namenode and jobtracker (its-cs131), although it is absolutely unclear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>>>> 
>>>>> Is there any reason why this might happen?
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Elmar
>>>>> 
>>>>> LOGS BELOW:
>>>>> 
>>>>> \____Datanodes
>>>>> 
>>>>> After successfully putting the data to HDFS (at this point I thought namenode and datanodes have to communicate), I get the following errors when starting the job:
>>>>> 
>>>>> There are 2 kinds of logs I found: the first one is big (about 12 MB) and looks like this:
>>>>> ############################### LOG TYPE 1 ############################################################
>>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>     at java.lang.Thread.run(Thread.java:619)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>     ... 5 more
>>>>> 
>>>>> ... (this continues til the end of the log)
>>>>> 
>>>>> The second is the short kind:
>>>>> ########################### LOG TYPE 2 ############################################################
>>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>> /************************************************************
>>>>> STARTUP_MSG: Starting DataNode
>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> STARTUP_MSG:   args = []
>>>>> STARTUP_MSG:   version = 1.0.2
>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>> ************************************************************/
>>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>> Caused by: java.net.BindException: Address already in use
>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>     ... 7 more
>>>>> 
>>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>> /************************************************************
>>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> ************************************************************/
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> \_____TaskTracker
>>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>>> ############################### LOG TYPE 1 ############################################################
>>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>     ... 6 more
>>>>> 
>>>>> 
>>>>> ########################### LOG TYPE 2 ############################################################
>>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>> /************************************************************
>>>>> STARTUP_MSG: Starting TaskTracker
>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> STARTUP_MSG:   args = []
>>>>> STARTUP_MSG:   version = 1.0.2
>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>> ************************************************************/
>>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>> 
>>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>> /************************************************************
>>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> ************************************************************/
>>>>> 
>>>> 
>>> 
>> 
>> 
> 
> 


Re: DataNode and Tasttracker communication

Posted by Sriram Ramachandrasekaran <sr...@gmail.com>.
The logs indicate an "Address already in use" exception. Is that a sign? :)
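That "already in use" failure can be reproduced outside Hadoop entirely: java.net.BindException means a second process tried to bind a port a first process still holds. A minimal Python sketch of the same failure mode (the port here is picked by the kernel, not Hadoop's 50010):

```python
import errno
import socket

# Reproduce the DataNode's failure mode: "Address already in use" means
# a second process tried to bind a port a first process still holds.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))        # 0 = let the kernel pick a free port
port = first.getsockname()[1]
first.listen(1)

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))  # same port -> EADDRINUSE
    raise AssertionError("second bind unexpectedly succeeded")
except OSError as exc:
    assert exc.errno == errno.EADDRINUSE
finally:
    second.close()
    first.close()
```

On the affected node, `lsof -i :50010` (or `netstat -tlnp`) shows which process holds the port; a stale DataNode or TaskTracker left over from an earlier start is a common culprit.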
On 13 Aug 2012 20:36, "Mohammad Tariq" <do...@gmail.com> wrote:

> Thank you so very much for the detailed response Michael. I'll keep the
> tip in mind. Please pardon my ignorance, as I am still in the learning
> phase.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com>wrote:
>
>> 0.0.0.0 means that the call is going to all interfaces on the machine.
>>  (Shouldn't be an issue...)
>>
>> IPv4 vs IPv6? It could be an issue; however, the OP says he can write data to DNs
>> and they seem to communicate, so if it's IPv6 related, wouldn't it
>> impact all traffic and not just a specific port?
>> I agree... shut down IPv6 if you can.
>>
>> I don't disagree with your assessment. I am just suggesting that before
>> you do a really deep dive, you think about the more obvious stuff first.
>>
>> There are a couple of other things... like do all of the /etc/hosts files
>> on all of the machines match?
>> Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>
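One way to act on the "do all of the /etc/hosts files match" question is to compare a normalized digest of each node's copy, so comment and ordering differences don't count as drift. A sketch (the sample contents are hypothetical; actually collecting the files from each node, e.g. with scp, is left out):

```python
import hashlib

def hosts_digest(text: str) -> str:
    """Digest of an /etc/hosts file, ignoring comments, blank lines and
    line order so purely cosmetic differences don't count as drift."""
    entries = [line.split("#", 1)[0].strip() for line in text.splitlines()]
    body = "\n".join(sorted(e for e in entries if e))
    return hashlib.md5(body.encode()).hexdigest()

# Hypothetical copies gathered from two nodes:
node_a = "141.51.205.41 its-cs131\n141.51.205.43 its-cs133\n"
node_b = "# managed copy\n141.51.205.43 its-cs133\n141.51.205.41 its-cs131\n"

assert hosts_digest(node_a) == hosts_digest(node_b)   # in sync
assert hosts_digest(node_a) != hosts_digest("127.0.1.1 its-cs133\n")
```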
>> BTW, you said DNS in your response. If you're using DNS, then you don't
>> really want to have much info in the /etc/hosts file except loopback and
>> the server's IP address.
>>
>> Looking at the problem OP is indicating some traffic works, while other
>> traffic doesn't. Most likely something is blocking the ports. Iptables is
>> the first place to look.
>>
>> Just saying. ;-)
>>
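The three outcomes of that port test can be told apart with a small probe: "Connection refused", as in the logs, means the host answered but nothing is listening on the port, while a firewall silently dropping packets usually shows up as a timeout instead. A sketch (a sketch only; it asserts against a local port, since the thread's hosts aren't reachable here):

```python
import errno
import socket

def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connect attempt -- the three outcomes hidden
    behind Hadoop's 'Retrying connect to server' loop."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        rc = s.connect_ex((host, port))
    except socket.timeout:
        rc = errno.ETIMEDOUT
    finally:
        s.close()
    if rc == 0:
        return "open"                  # a daemon is listening
    if rc == errno.ECONNREFUSED:
        return "refused"               # host reachable, nothing listening (the log's case)
    return "filtered/unreachable"      # frequently a firewall dropping packets

# Find a port with no listener on this machine; connecting to it is
# refused, just like the TaskTracker's calls to its-cs131:35555.
tmp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tmp.bind(("127.0.0.1", 0))
free_port = tmp.getsockname()[1]
tmp.close()
assert probe("127.0.0.1", free_port) == "refused"
```

On the nodes themselves, `iptables -L -n` lists the active filter rules that could be doing the dropping.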
>>
>> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hi Michael,
>>        I asked for the hosts file because there seems to be a loopback
>> problem to me. The log shows the call going to 0.0.0.0. Apart from what
>> you have said, I think disabling IPv6 and making sure that there is no
>> problem with the DNS resolution is also necessary. Please correct me if I am wrong.
>> Thank you.
>>
>> Regards,
>>     Mohammad Tariq
>>
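Since the TaskTracker registered itself as localhost/127.0.0.1:54850, the DNS-resolution check is worth making concrete: ask the node's resolver (hosts file first, then DNS, per nsswitch.conf) what a name maps to. A small sketch; only "localhost" is asserted here because the thread's node hostnames aren't resolvable elsewhere:

```python
import socket

def ipv4_addresses(name: str) -> list:
    """IPv4 addresses the local resolver returns for a name."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    return sorted({sockaddr[0] for _, _, _, _, sockaddr in infos})

# The TaskTracker came up as localhost/127.0.0.1:54850, so on each node it
# is worth checking both 'localhost' and the node's own hostname this way;
# the hostname should map to the cluster-facing IP, not a loopback address.
assert ipv4_addresses("localhost") == ["127.0.0.1"]
```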
>>
>>
>> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <michael_segel@hotmail.com
>> > wrote:
>>
>>> Based on your /etc/hosts output, why aren't you using DNS?
>>>
>>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
>>> generally work well when you're not using the FQDN or its alias.
>>>
>>> The issue isn't the SSH, but if you go to the node which is having
>>> trouble connecting to another node, then try to ping it or some other
>>> general communication. If it succeeds, your issue is that the port you're
>>> trying to communicate with is blocked. Then it's more than likely an IP
>>> configuration or firewall issue.
>>>
>>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
>>> wrote:
>>>
>>>  Hi Michael,
>>>
>>> Well, I can ssh from any node to any other without being prompted. The
>>> reason for this is that my home dir is mounted on every server in the
>>> cluster.
>>>
>>> If the machines are multihomed: I don't know. I could ask whether this
>>> is important.
>>>
>>> Shall I?
>>>
>>> Regards,
>>> Elmar
>>>
>>> Am 13.08.12 14:59, schrieb Michael Segel:
>>>
>>> If the nodes can communicate and distribute data, then the odds are that
>>> the issue isn't going to be in his /etc/hosts.
>>>
>>>  A more relevant question is if he's running a firewall on each of
>>> these machines?
>>>
>>>  A simple test... ssh to one node, ping other nodes and the control
>>> nodes at random to see if they can see one another. Then check to see if
>>> there is a firewall running which would limit the types of traffic between
>>> nodes.
>>>
>>>  One other side note... are these machines multi-homed?
>>>
>>>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com>
>>> wrote:
>>>
>>> Hello there,
>>>
>>>       Could you please share your /etc/hosts file, if you don't mind.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <
>>> macek@cs.uni-kassel.de> wrote:
>>>
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at
>>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>

Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
The key is to think about what can go wrong, but start with the low-hanging fruit.

I mean, you could be right; however, you're jumping the gun and overlooking simpler issues.

The most common issue is that the network traffic is being filtered.
Of course, since we're both diagnosing this with minimal information, we're kind of shooting from the hip.

This is why I'm asking if there is any network traffic between the nodes. If you have partial communication, then focus on why you can't see the specific traffic.
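Since ping can succeed while one specific port is filtered, the quickest check is a plain TCP connect test from the struggling node. A minimal sketch in Python; the host/port pairs below are placeholders taken from the logs, so substitute your actual NameNode/JobTracker address:

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, timed out, unreachable, DNS failure
        return False

if __name__ == "__main__":
    # Placeholder targets from the logs -- replace with your own daemons.
    for host, port in [("its-cs131", 35554), ("its-cs131", 35555)]:
        state = "reachable" if port_reachable(host, port) else "blocked/refused"
        print(f"{host}:{port} {state}")
```

If ping works but this test fails, something between the nodes (iptables, for instance) is dropping that port, or nothing is listening on it.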


On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Thank you so very much for the detailed response Michael. I'll keep the tip in mind. Please pardon my ignorance, as I am still in the learning phase.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com> wrote:
> 0.0.0.0 means that the call is going to all interfaces on the machine.  (Shouldn't be an issue...)
> 
> IPv4 vs IPv6? Could be an issue, however OP says he can write data to DNs and they seem to communicate, therefore if its IPv6 related, wouldn't it impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
> 
> I don't disagree with your assessment. I am just suggesting that before you do a really deep dive, you think about the more obvious stuff first. 
> 
> There are a couple of other things... like do all of the /etc/hosts files on all of the machines match? 
> Is the OP using both /etc/hosts and DNS? If so, are they in sync? 
> 
> BTW, you said DNS in your response. if you're using DNS, then you don't really want to have much info in the /etc/hosts file except loopback and the server's IP address. 
> 
> Looking at the problem OP is indicating some traffic works, while other traffic doesn't. Most likely something is blocking the ports. Iptables is the first place to look. 
> 
> Just saying. ;-) 
> 
> 
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
> 
>> Hi Michael,
>>        I asked for hosts file because there seems to be some loopback prob to me. The log shows that call is going at 0.0.0.0. Apart from what you have said, I think disabling IPv6 and making sure that there is no prob with the DNS resolution is also necessary. Please correct me if I am wrong. Thank you.
>> 
>> Regards,
>>     Mohammad Tariq
>> 
>> 
>> 
>> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
>> Based on your /etc/hosts output, why aren't you using DNS? 
>> 
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 
>> 
>> The issue isn't the SSH, but if you go to the node which is having trouble connecting to another node,  then try to ping it, or some other general communication,  if it succeeds, your issue is that the port you're trying to communicate with is blocked.  Then its more than likely an ipconfig or firewall issue.
>> 
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:
>> 
>>> Hi Michael,
>>> 
>>> Well, I can ssh from any node to any other without being prompted. The reason for this is that my home dir is mounted on every server in the cluster.
>>> 
>>> If the machines are multihomed: I don't know. I could ask if this would be of importance.
>>> 
>>> Shall I?
>>> 
>>> Regards,
>>> Elmar
>>> 
>>> Am 13.08.12 14:59, schrieb Michael Segel:
>>>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>>>> 
>>>> A more relevant question is if he's running a firewall on each of these machines? 
>>>> 
>>>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>>>> 
>>>> One other side note... are these machines multi-homed?
>>>> 
>>>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>>> 
>>>>> Hello there,
>>>>> 
>>>>>      Could you please share your /etc/hosts file, if you don't mind.
>>>>> 
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>>>> Hi,
>>>>> 
>>>>> I am currently trying to run my hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties with their communication, as their logs say:
>>>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be due to reasons correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd be looking for the mechanism resolving localhost.)
>>>>> * The other nodes cannot speak with the namenode and jobtracker (its-cs131), although it is absolutely not clear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>>>> 
>>>>> Is there any reason why this might happen?
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Elmar
>>>>> 
>>>>> LOGS BELOW:
>>>>> 
>>>>> \____Datanodes
>>>>> 
>>>>> After successfully putting the data to HDFS (at this point I thought namenode and datanodes have to communicate), I get the following errors when starting the job:
>>>>> 
>>>>> There are 2 kinds of logs I found: the first is big (about 12 MB) and looks like this:
>>>>> ############################### LOG TYPE 1 ############################################################
>>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>     at java.lang.Thread.run(Thread.java:619)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>     ... 5 more
>>>>> 
>>>>> ... (this continues til the end of the log)
>>>>> 
>>>>> The second is the short kind:
>>>>> ########################### LOG TYPE 2 ############################################################
>>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>> /************************************************************
>>>>> STARTUP_MSG: Starting DataNode
>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> STARTUP_MSG:   args = []
>>>>> STARTUP_MSG:   version = 1.0.2
>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>> ************************************************************/
>>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>> Caused by: java.net.BindException: Address already in use
>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>     ... 7 more
>>>>> 
>>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>> /************************************************************
>>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> ************************************************************/
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> \_____TastTracker
>>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>>> ############################### LOG TYPE 1 ############################################################
>>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>     ... 6 more
>>>>> 
>>>>> 
>>>>> ########################### LOG TYPE 2 ############################################################
>>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>> /************************************************************
>>>>> STARTUP_MSG: Starting TaskTracker
>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> STARTUP_MSG:   args = []
>>>>> STARTUP_MSG:   version = 1.0.2
>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>> ************************************************************/
>>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>> 
>>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>> /************************************************************
>>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> ************************************************************/
>>>>> 
>>>> 
>>> 
>> 
>> 
> 
> 


Re: DataNode and Tasttracker communication

Posted by Sriram Ramachandrasekaran <sr...@gmail.com>.
The logs indicate an "Address already in use" exception. Is that some sign? :)
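Both shutdown logs do end in java.net.BindException: Address already in use on 50010/50060, i.e. something (probably a leftover DataNode/TaskTracker from an earlier start attempt) is still holding those ports. A small Python sketch of that condition, useful for probing a port before restarting the daemons:

```python
import errno
import socket

def try_bind(port: int):
    """Try to bind a listener on all interfaces; return the socket,
    or None if the port is already taken (Hadoop's BindException case)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("0.0.0.0", port))
        s.listen(1)
        return s
    except OSError as e:
        s.close()
        if e.errno == errno.EADDRINUSE:
            return None
        raise

# Demonstrate the collision: hold an ephemeral port, then try to bind it again.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("0.0.0.0", 0))
holder.listen(1)
taken = holder.getsockname()[1]
assert try_bind(taken) is None  # second bind fails: address already in use
holder.close()
```

On the nodes themselves, `netstat -tlnp` (or `lsof -i :50060`) shows which process holds the port; killing the stale daemon before restarting should clear this particular error.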
On 13 Aug 2012 20:36, "Mohammad Tariq" <do...@gmail.com> wrote:

> Thank you so very much for the detailed response Michael. I'll keep the
> tip in mind. Please pardon my ignorance, as I am still in the learning
> phase.
>
> Regards,
>     Mohammad Tariq
>
>
>
>>>
>>>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com>
>>> wrote:
>>>
>>> Hello there,
>>>
>>>       Could you please share your /etc/hosts file, if you don't mind.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <
>>> macek@cs.uni-kassel.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> i am currently trying to run my hadoop program on a cluster. Sadly
>>>> though my datanodes and tasktrackers seem to have difficulties with their
>>>> communication as their logs say:
>>>> * Some datanodes and tasktrackers seem to have portproblems of some
>>>> kind as it can be seen in the logs below. I wondered if this might be due
>>>> to reasons correllated with the localhost entry in /etc/hosts as you can
>>>> read in alot of posts with similar errors, but i checked the file neither
>>>> localhost nor 127.0.0.1/127.0.1.1 is bound there. (although you can
>>>> ping localhost... the technician of the cluster said he'd be looking for
>>>> the mechanics resolving localhost)
>>>> * The other nodes can not speak with the namenode and jobtracker
>>>> (its-cs131). Although it is absolutely not clear, why this is happening:
>>>> the "dfs -put" i do directly before the job is running fine, which seems to
>>>> imply that communication between those servers is working flawlessly.
>>>>
>>>> Is there any reason why this might happen?
>>>>
>>>>
>>>> Regards,
>>>> Elmar
>>>>
>>>> LOGS BELOW:
>>>>
>>>> \____Datanodes
>>>>
>>>> After successfully putting the data to hdfs (at this point i thought
>>>> namenode and datanodes have to communicate), i get the following errors
>>>> when starting the job:
>>>>
>>>> There are 2 kinds of logs i found: the first one is big (about 12MB)
>>>> and looks like this:
>>>> ############################### LOG TYPE 1
>>>> ############################################################
>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>>>> time(s).
>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 1
>>>> time(s).
>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 2
>>>> time(s).
>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 3
>>>> time(s).
>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 4
>>>> time(s).
>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 5
>>>> time(s).
>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 6
>>>> time(s).
>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 7
>>>> time(s).
>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 8
>>>> time(s).
>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 9
>>>> time(s).
>>>> 2012-08-13 08:23:36,335 WARN
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException:
>>>> Call to its-cs131/141.51.205.41:35554 failed on connection exception:
>>>> java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at
>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net
>>>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at
>>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at
>>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at
>>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 5 more
>>>>
>>>> ... (this continues til the end of the log)
>>>>
>>>> The second is short kind:
>>>> ########################### LOG TYPE 2
>>>> ############################################################
>>>> 2012-08-13 00:59:19,038 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting DataNode
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build =
>>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2-r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:19,203 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>>> hadoop-metrics2.properties
>>>> 2012-08-13 00:59:19,216 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:19,217 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>>> period at 10 second(s).
>>>> 2012-08-13 00:59:19,218 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system
>>>> started
>>>> 2012-08-13 00:59:19,306 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>>> registered.
>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader:
>>>> Loaded the native-hadoop library
>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>>>> time(s).
>>>> 2012-08-13 00:59:21,584 INFO
>>>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>>>> /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>> 2012-08-13 00:59:21,584 INFO
>>>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>> 2012-08-13 00:59:21,787 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>>> FSDatasetStatusMBean
>>>> 2012-08-13 00:59:21,897 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting
>>>> down all async disk service threads...
>>>> 2012-08-13 00:59:21,897 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async
>>>> disk service threads have been shut down.
>>>> 2012-08-13 00:59:21,898 ERROR
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
>>>> Problem binding to /0.0.0.0:50010 : Address already in use
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>     at
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>> Caused by: java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch
>>>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>     ... 7 more
>>>>
>>>> 2012-08-13 00:59:21,899 INFO
>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down DataNode at
>>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> \_____TaskTracker
>>>> With the TaskTrackers it is the same: there are 2 kinds of logs.
>>>> ############################### LOG TYPE 1
>>>> ############################################################
>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Resending 'status' to 'its-cs131' with reponseId '879
>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>>> time(s).
>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 1
>>>> time(s).
>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 2
>>>> time(s).
>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 3
>>>> time(s).
>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 4
>>>> time(s).
>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 5
>>>> time(s).
>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 6
>>>> time(s).
>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 7
>>>> time(s).
>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 8
>>>> time(s).
>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 9
>>>> time(s).
>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker:
>>>> Caught exception: java.net.ConnectException: Call to its-cs131/
>>>> 141.51.205.41:35555 failed on connection exception:
>>>> java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>     at
>>>> org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>     at
>>>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at
>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net
>>>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at
>>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at
>>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at
>>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 6 more
>>>>
>>>>
>>>> ########################### LOG TYPE 2
>>>> ############################################################
>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting TaskTracker
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build =
>>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2-r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:24,569 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>>> hadoop-metrics2.properties
>>>> 2012-08-13 00:59:24,626 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:24,627 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>>> period at 10 second(s).
>>>> 2012-08-13 00:59:24,627 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>>>> system started
>>>> 2012-08-13 00:59:24,950 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>>> registered.
>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>>> org.mortbay.log.Slf4jLog
>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
>>>> global filtersafety
>>>> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>> 2012-08-13 00:59:25,232 INFO
>>>> org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater
>>>> with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Starting tasktracker with owner as bmacek
>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
>>>> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
>>>> Loaded the native-hadoop library
>>>> 2012-08-13 00:59:25,255 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>>>> registered.
>>>> 2012-08-13 00:59:25,256 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> TaskTrackerMetrics registered.
>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
>>>> SocketReader
>>>> 2012-08-13 00:59:25,282 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> RpcDetailedActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,282 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> RpcActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> Responder: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> listener on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 0 on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 1 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> TaskTracker up at: localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 3 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 2 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>>>> 127.0.0.1:54850
>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>>> time(s).
>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Starting thread: Map-events fetcher for all reduce tasks on
>>>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>>>> exited with exit code 0
>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Using ResourceCalculatorPlugin :
>>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>>>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>>>> disabled.
>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>>>> IndexCache created with max memory = 10485760
>>>> 2012-08-13 00:59:38,158 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> ShuffleServerMetrics registered.
>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>>>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>>>> -1. Opening the listener on 50060
>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>>>> not start task tracker because java.net.BindException: Address already in
>>>> use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch
>>>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at
>>>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>     at
>>>> org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>
>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at
>>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>

Re: DataNode and Tasttracker communication

Posted by Sriram Ramachandrasekaran <sr...@gmail.com>.
The logs indicate an "Address already in use" exception. Is that a sign of something? :)
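For the "Address already in use" bind failures, one quick check is whether anything is still holding the daemon ports before restarting. A minimal sketch using bash's /dev/tcp redirection (the ports are the Hadoop 1.x DataNode defaults seen in the logs; `lsof` is only mentioned in a comment, as a follow-up you would run on the live node):

```shell
#!/usr/bin/env bash
# Sketch: report whether the default DataNode ports are already bound on
# this host, as in the BindException from the logs above. A successful
# /dev/tcp connect means some process is already listening there.
port_free() {
  ! (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

for p in 50010 50060; do
  if port_free "$p"; then
    echo "port $p looks free"
  else
    echo "port $p is already in use (find the holder with: lsof -i :$p)"
  fi
done
```

If a port shows as in use, an old DataNode/TaskTracker instance from a previous run is the usual culprit.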
On 13 Aug 2012 20:36, "Mohammad Tariq" <do...@gmail.com> wrote:

> Thank you so very much for the detailed response Michael. I'll keep the
> tip in mind. Please pardon my ignorance, as I am still in the learning
> phase.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com>wrote:
>
>> 0.0.0.0 means that the call is going to all interfaces on the machine.
>>  (Shouldn't be an issue...)
>>
>> IPv4 vs IPv6? It could be an issue; however, the OP says he can write data
>> to the DNs and they seem to communicate, so if it were IPv6-related,
>> wouldn't it impact all traffic and not just a specific port?
>> I agree... shut down IPv6 if you can.
>>
>> I don't disagree with your assessment. I am just suggesting that before
>> you do a really deep dive, you think about the more obvious stuff first.
>>
>> There are a couple of other things... like do all of the /etc/hosts files
>> on all of the machines match?
>> Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>
>> BTW, you said DNS in your response. If you're using DNS, then you don't
>> really want much info in the /etc/hosts file except the loopback entry
>> and the server's own IP address.
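Following that advice, a minimal /etc/hosts on each node might look like this (the hostname and address below are taken from the logs in this thread, purely as an illustration; every other name resolves via DNS):

```
127.0.0.1       localhost
141.51.205.43   its-cs133.its.uni-kassel.de its-cs133
```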
>>
>> Looking at the problem OP is indicating some traffic works, while other
>> traffic doesn't. Most likely something is blocking the ports. Iptables is
>> the first place to look.
>>
>> Just saying. ;-)
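Before digging into iptables rules, it helps to know exactly which master endpoints the workers are failing to reach, and the retry lines already contain them. A small sketch that pulls the distinct host:port targets out of a log (the two sample lines are copied verbatim from the logs quoted in this thread):

```shell
#!/usr/bin/env bash
# List the distinct master endpoints a worker keeps retrying --
# these are the host:port pairs any firewall rules must allow.
log=$(mktemp)
cat > "$log" <<'EOF'
2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
EOF
grep -o 'server: [^ ]*:[0-9]*' "$log" | sort -u
# -> server: its-cs131/141.51.205.41:35554
# -> server: its-cs131/141.51.205.41:35555
rm -f "$log"
```

Here the two ports are the NameNode (35554) and JobTracker (35555) RPC ports from the thread; on a real node you would point the grep at the actual DataNode/TaskTracker log files instead of the sample.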
>>
>>
>> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hi Michael,
>>        I asked for the hosts file because it looks like a loopback
>> problem to me. The log shows the call going to 0.0.0.0. Apart from what
>> you have said, I think disabling IPv6 and making sure that there is no
>> problem with the DNS resolution is also necessary. Please correct me if
>> I am wrong. Thank you.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <michael_segel@hotmail.com
>> > wrote:
>>
>>> Based on your /etc/hosts output, why aren't you using DNS?
>>>
>>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
>>> generally work well when you're not using the FQDN or its alias.
>>>
>>> The issue isn't SSH. If you go to the node which is having trouble
>>> connecting to another node and try to ping it, or attempt some other
>>> general communication, and that succeeds, then the port you're trying
>>> to communicate on is blocked, and it's more than likely a network
>>> configuration or firewall issue.
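The two-step check described above (general reachability first, then the specific port) can be sketched like this; the host and port are the values from the logs, and `timeout` from coreutils is assumed to be available:

```shell
#!/usr/bin/env bash
# Step 1: general reachability (ICMP). Step 2: the specific RPC port.
# If step 1 succeeds but step 2 fails, the daemon is down or the port
# is blocked by a firewall -- exactly the case described above.
host=its-cs131    # master host from the thread; substitute your own
port=35554        # NameNode RPC port seen in the logs
if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
  echo "$host answers ping"
else
  echo "$host does not answer ping -- check routing/DNS first"
fi
if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
  echo "$host:$port is open"
else
  echo "$host:$port is closed or filtered -- check the daemon and iptables"
fi
```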
>>>
>>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
>>> wrote:
>>>
>>>  Hi Michael,
>>>
>>> Well, I can ssh from any node to any other without being prompted. The
>>> reason for this is that my home dir is mounted on every server in the
>>> cluster.
>>>
>>> Whether the machines are multihomed: I don't know. I could ask whether
>>> this is of importance.
>>>
>>> Shall I?
>>>
>>> Regards,
>>> Elmar
>>>
>>> Am 13.08.12 14:59, schrieb Michael Segel:
>>>
>>> If the nodes can communicate and distribute data, then the odds are that
>>> the issue isn't going to be in his /etc/hosts.
>>>
>>>  A more relevant question is if he's running a firewall on each of
>>> these machines?
>>>
>>>  A simple test... ssh to one node, ping other nodes and the control
>>> nodes at random to see if they can see one another. Then check to see if
>>> there is a firewall running which would limit the types of traffic between
>>> nodes.
>>>
>>>  One other side note... are these machines multi-homed?
>>>
>>>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com>
>>> wrote:
>>>
>>> Hello there,
>>>
>>>       Could you please share your /etc/hosts file, if you don't mind.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>>
>>>
>>>
>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <
>>> macek@cs.uni-kassel.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> i am currently trying to run my hadoop program on a cluster. Sadly,
>>>> though, my datanodes and tasktrackers seem to have difficulties with
>>>> their communication, as their logs say:
>>>> * Some datanodes and tasktrackers seem to have port problems of some
>>>> kind, as can be seen in the logs below. I wondered if this might be due
>>>> to reasons correlated with the localhost entry in /etc/hosts, as you can
>>>> read in a lot of posts with similar errors, but I checked the file:
>>>> neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you
>>>> can ping localhost... the technician of the cluster said he'd be looking
>>>> into the mechanism resolving localhost.)
>>>> * The other nodes cannot speak with the namenode and jobtracker
>>>> (its-cs131), though it is absolutely not clear why this is happening:
>>>> the "dfs -put" I do directly before the job runs fine, which seems to
>>>> imply that communication between those servers is working flawlessly.
>>>>
>>>> Is there any reason why this might happen?
>>>>
>>>>
>>>> Regards,
>>>> Elmar
>>>>
>>>> LOGS BELOW:
>>>>
>>>> \____Datanodes
>>>>
>>>> After successfully putting the data to hdfs (at this point i thought
>>>> namenode and datanodes have to communicate), i get the following errors
>>>> when starting the job:
>>>>
>>>> There are 2 kinds of logs i found: the first one is big (about 12MB)
>>>> and looks like this:
>>>> ############################### LOG TYPE 1
>>>> ############################################################
>>>> [log excerpts snipped -- identical to the DataNode and TaskTracker logs
>>>> quoted earlier in this thread]
>>>> RpcDetailedActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,282 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> RpcActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> Responder: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> listener on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 0 on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 1 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> TaskTracker up at: localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 3 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 2 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>>>> 127.0.0.1:54850
>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>>> time(s).
>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Starting thread: Map-events fetcher for all reduce tasks on
>>>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>>>> exited with exit code 0
>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Using ResourceCalculatorPlugin :
>>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>>>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>>>> disabled.
>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>>>> IndexCache created with max memory = 10485760
>>>> 2012-08-13 00:59:38,158 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> ShuffleServerMetrics registered.
>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>>>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>>>> -1. Opening the listener on 50060
>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>>>> not start task tracker because java.net.BindException: Address already in
>>>> use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at
>>>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>     at
>>>> org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>
>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at
>>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
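Both shutdown logs above end in java.net.BindException: Address already in use (0.0.0.0:50010 for the DataNode, 50060 for the TaskTracker HTTP server). A minimal pre-flight sketch, not from the thread itself, that checks whether those ports are still bound before restarting a daemon; the port numbers are the ones these logs mention, so adjust them to your own configuration:

```shell
#!/usr/bin/env bash
# Check whether the ports a Hadoop daemon needs are already bound.
# Ports 50010 (DataNode data port) and 50060 (TaskTracker HTTP) are the
# ones from the logs above -- substitute your own.

port_free() {
  # Succeeds if nothing accepts TCP connections on 127.0.0.1:$1.
  # Uses bash's built-in /dev/tcp, so it needs no nc/ss/lsof.
  ! (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

for port in 50010 50060; do
  if port_free "$port"; then
    echo "port $port looks free"
  else
    echo "port $port is already in use -- a stale daemon may still be running"
  fi
done
```

If a port reports as in use, a stale DataNode or TaskTracker left over from a previous run is the usual culprit; running `jps` on that node should show it.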

Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
The key is to think about what can go wrong, but start with the low hanging fruit. 

I mean you could be right, however you're jumping the gun and overlooking simpler issues. 

The most common issue is that the networking traffic is being filtered. 
Of course since we're both diagnosing this with minimal information, we're kind of shooting from the hip. 

This is why I'm asking if there is any networking traffic between the nodes.  If you have partial communication, then focus on why you can't see the specific traffic.
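The layered test being suggested here (plain reachability first, then the specific ports) can be sketched as a small script. Assumptions, flagged loudly: the hostname and ports (its-cs131, 35554 for the NameNode RPC, 35555 for the JobTracker RPC) are taken from the logs earlier in this thread; substitute your own.

```shell
#!/usr/bin/env bash
# Layered connectivity check: general reachability first, then the exact
# RPC ports the daemons are retrying against in the logs.
# Host/ports below come from this thread -- substitute your own.

host="its-cs131"
ports=(35554 35555)

check_tcp() {
  # Succeeds if a TCP connect to $1:$2 completes within 3 seconds.
  # bash's built-in /dev/tcp avoids a dependency on nc or telnet.
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if command -v ping >/dev/null && ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
  echo "$host answers ping; now checking the service ports"
  for p in "${ports[@]}"; do
    if check_tcp "$host" "$p"; then
      echo "  $host:$p reachable"
    else
      echo "  $host:$p blocked or nothing listening -- check iptables on both ends"
    fi
  done
else
  echo "$host not reachable at all -- check routing and name resolution first"
fi
```

If ping works but the specific port does not, that points at a firewall or a daemon bound to the wrong interface rather than a routing problem.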


On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Thank you so very much for the detailed response Michael. I'll keep the tip in mind. Please pardon my ignorance, as I am still in the learning phase.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com> wrote:
> 0.0.0.0 means that the call is going to all interfaces on the machine.  (Shouldn't be an issue...)
> 
> IPv4 vs IPv6? Could be an issue; however, the OP says he can write data to DNs and they seem to communicate, therefore if it's IPv6 related, wouldn't it impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
> 
> I don't disagree with your assessment. I am just suggesting that before you do a really deep dive, you think about the more obvious stuff first. 
> 
> There are a couple of other things... like do all of the /etc/hosts files on all of the machines match? 
> Is the OP using both /etc/hosts and DNS? If so, are they in sync? 
> 
> BTW, you said DNS in your response. If you're using DNS, then you don't really want to have much info in the /etc/hosts file except loopback and the server's IP address. 
> 
> Looking at the problem, the OP is indicating that some traffic works while other traffic doesn't. Most likely something is blocking the ports. Iptables is the first place to look. 
> 
> Just saying. ;-) 
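The "do all of the /etc/hosts files match?" question above can be checked mechanically. A sketch, assuming passwordless ssh between nodes (which the OP says he has); the node names are hypothetical placeholders:

```shell
#!/usr/bin/env bash
# Fingerprint /etc/hosts on every node and flag mismatches.
# Node names are placeholders -- substitute your cluster's hosts.

hosts_in_sync() {
  # Reads "node checksum" lines on stdin; succeeds iff all checksums match.
  [ "$(awk '{print $2}' | sort -u | wc -l)" -le 1 ]
}

nodes=(its-cs131 its-cs132 its-cs133)

collect() {
  for n in "${nodes[@]}"; do
    # md5sum prints "<hash>  /etc/hosts"; keep node name + hash only
    printf '%s %s\n' "$n" \
      "$(ssh -o ConnectTimeout=3 "$n" md5sum /etc/hosts 2>/dev/null | awk '{print $1}')"
  done
}

collect | tee /dev/stderr | hosts_in_sync \
  && echo "all /etc/hosts files match" \
  || echo "mismatch -- diff the files on the odd node out"
```

A mismatching checksum tells you which node to `diff` against the others; it does not tell you which copy is correct, so compare against what DNS says too.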
> 
> 
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
> 
>> Hi Michael,
>>        I asked for hosts file because there seems to be some loopback prob to me. The log shows that call is going at 0.0.0.0. Apart from what you have said, I think disabling IPv6 and making sure that there is no prob with the DNS resolution is also necessary. Please correct me if I am wrong. Thank you.
>> 
>> Regards,
>>     Mohammad Tariq
>> 
>> 
>> 
>> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
>> Based on your /etc/hosts output, why aren't you using DNS? 
>> 
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 
>> 
>> The issue isn't the SSH. If you go to the node which is having trouble connecting to another node and try to ping it, or some other general communication, and it succeeds, your issue is that the port you're trying to communicate with is blocked. Then it's more than likely an ipconfig or firewall issue.
>> 
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:
>> 
>>> Hi Michael,
>>> 
>>> Well, I can ssh from any node to any other without being prompted. The reason for this is that my home dir is mounted on every server in the cluster. 
>>> 
>>> Whether the machines are multihomed: I don't know. I could ask, if this would be of importance.
>>> 
>>> Shall i?
>>> 
>>> Regards,
>>> Elmar
>>> 
>>> Am 13.08.12 14:59, schrieb Michael Segel:
>>>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>>>> 
>>>> A more relevant question is if he's running a firewall on each of these machines? 
>>>> 
>>>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>>>> 
>>>> One other side note... are these machines multi-homed?
>>>> 
>>>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>>> 
>>>>> Hello there,
>>>>> 
>>>>>      Could you please share your /etc/hosts file, if you don't mind.
>>>>> 
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>>>> Hi,
>>>>> 
>>>>> i am currently trying to run my hadoop program on a cluster. Sadly, though, my datanodes and tasktrackers seem to have difficulties with their communication, as their logs say:
>>>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be due to reasons correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd be looking for the mechanism resolving localhost.)
>>>>> * The other nodes can not speak with the namenode and jobtracker (its-cs131), although it is absolutely not clear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>>>> 
>>>>> Is there any reason why this might happen?
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Elmar
>>>>> 
>>>>> LOGS BELOW:
>>>>> 
>>>>> \____Datanodes
>>>>> 
>>>>> After successfully putting the data to hdfs (at this point i thought namenode and datanodes have to communicate), i get the following errors when starting the job:
>>>>> 
>>>>> There are 2 kinds of logs i found: the first one is big (about 12MB) and looks like this:
>>>>> ############################### LOG TYPE 1 ############################################################
>>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>     at java.lang.Thread.run(Thread.java:619)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>     ... 5 more
>>>>> 
>>>>> ... (this continues til the end of the log)
>>>>> 
>>>>> The second is short kind:
>>>>> ########################### LOG TYPE 2 ############################################################
>>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>> /************************************************************
>>>>> STARTUP_MSG: Starting DataNode
>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> STARTUP_MSG:   args = []
>>>>> STARTUP_MSG:   version = 1.0.2
>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>> ************************************************************/
>>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>> Caused by: java.net.BindException: Address already in use
>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>     ... 7 more
>>>>> 
>>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>> /************************************************************
>>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> ************************************************************/
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> \_____TaskTracker
>>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>>> ############################### LOG TYPE 1 ############################################################
>>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>     ... 6 more
>>>>> 
>>>>> 
>>>>> ########################### LOG TYPE 2 ############################################################
>>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>> /************************************************************
>>>>> STARTUP_MSG: Starting TaskTracker
>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> STARTUP_MSG:   args = []
>>>>> STARTUP_MSG:   version = 1.0.2
>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>> ************************************************************/
>>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>> 
>>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>> /************************************************************
>>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> ************************************************************/
>>>>> 
>>>> 
>>> 
>> 
>> 
> 
> 


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
The key is to think about what can go wrong, but start with the low hanging fruit. 

I mean you could be right, however you're jumping the gun and are over looking simpler issues. 

The most common issue is that the networking traffic is being filtered. 
Of course since we're both diagnosing this with minimal information, we're kind of shooting from the hip. 

This is why I'm asking if there is any networking traffic between the nodes.  If you have partial communication, then focus on why you can't see the specific traffic.


On Aug 13, 2012, at 10:05 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Thank you so very much for the detailed response Michael. I'll keep the tip in mind. Please pardon my ignorance, as I am still in the learning phase.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com> wrote:
> 0.0.0.0 means that the call is going to all interfaces on the machine.  (Shouldn't be an issue...)
> 
> IPv4 vs IPv6? Could be an issue, however OP says he can write data to DNs and they seem to communicate, therefore if its IPv6 related, wouldn't it impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
> 
> I don't disagree with your assessment. I am just suggesting that before you do a really deep dive, you think about the more obvious stuff first. 
> 
> There are a couple of other things... like do all of the /etc/hosts files on all of the machines match? 
> Is the OP using both /etc/hosts and DNS? If so, are they in sync? 
> 
> BTW, you said DNS in your response. if you're using DNS, then you don't really want to have much info in the /etc/hosts file except loopback and the server's IP address. 
> 
> Looking at the problem OP is indicating some traffic works, while other traffic doesn't. Most likely something is blocking the ports. Iptables is the first place to look. 
> 
> Just saying. ;-) 
> 
> 
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
> 
>> Hi Michael,
>>        I asked for hosts file because there seems to be some loopback prob to me. The log shows that call is going at 0.0.0.0. Apart from what you have said, I think disabling IPv6 and making sure that there is no prob with the DNS resolution is also necessary. Please correct me if I am wrong. Thank you.
>> 
>> Regards,
>>     Mohammad Tariq
>> 
>> 
>> 
>> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
>> Based on your /etc/hosts output, why aren't you using DNS? 
>> 
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 
>> 
>> The issue isn't the SSH, but if you go to the node which is having trouble connecting to another node,  then try to ping it, or some other general communication,  if it succeeds, your issue is that the port you're trying to communicate with is blocked.  Then its more than likely an ipconfig or firewall issue.
>> 
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:
>> 
>>> Hi Michael,
>>> 
>>> well i can ssh from any node to any other without being prompted. The reason for this is, that my home dir is mounted in every server in the cluster. 
>>> 
>>> If the machines are multihomed: i dont know. i could ask if this would be of importance.
>>> 
>>> Shall i?
>>> 
>>> Regards,
>>> Elmar
>>> 
>>> Am 13.08.12 14:59, schrieb Michael Segel:
>>>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>>>> 
>>>> A more relevant question is if he's running a firewall on each of these machines? 
>>>> 
>>>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>>>> 
>>>> One other side note... are these machines multi-homed?
>>>> 
>>>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>>> 
>>>>> Hello there,
>>>>> 
>>>>>      Could you please share your /etc/hosts file, if you don't mind.
>>>>> 
>>>>> Regards,
>>>>>     Mohammad Tariq
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>>>> Hi,
>>>>> 
>>>>> I am currently trying to run my hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties with their communication, as their logs say:
>>>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd be looking into the mechanism resolving localhost.)
>>>>> * The other nodes cannot speak with the namenode and jobtracker (its-cs131). It is absolutely not clear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>>>> 
>>>>> Is there any reason why this might happen?
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Elmar
>>>>> 
>>>>> LOGS BELOW:
>>>>> 
>>>>> \____Datanodes
>>>>> 
>>>>> After successfully putting the data into HDFS (at this point I thought the namenode and datanodes have to communicate), I get the following errors when starting the job:
>>>>> 
>>>>> There are 2 kinds of logs I found: the first one is big (about 12 MB) and looks like this:
>>>>> ############################### LOG TYPE 1 ############################################################
>>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>>     at java.lang.Thread.run(Thread.java:619)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>     ... 5 more
>>>>> 
>>>>> ... (this continues until the end of the log)
>>>>> 
>>>>> The second is the short kind:
>>>>> ########################### LOG TYPE 2 ############################################################
>>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>> /************************************************************
>>>>> STARTUP_MSG: Starting DataNode
>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> STARTUP_MSG:   args = []
>>>>> STARTUP_MSG:   version = 1.0.2
>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>> ************************************************************/
>>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>> Caused by: java.net.BindException: Address already in use
>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>>     ... 7 more
>>>>> 
>>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>> /************************************************************
>>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> ************************************************************/
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> \_____TaskTracker
>>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>>> ############################### LOG TYPE 1 ############################################################
>>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>>     ... 6 more
>>>>> 
>>>>> 
>>>>> ########################### LOG TYPE 2 ############################################################
>>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>> /************************************************************
>>>>> STARTUP_MSG: Starting TaskTracker
>>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> STARTUP_MSG:   args = []
>>>>> STARTUP_MSG:   version = 1.0.2
>>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>>> ************************************************************/
>>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>> 
>>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>> /************************************************************
>>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>>> ************************************************************/
>>>>> 
>>>> 
>>> 
>> 
>> 
> 
> 


Re: DataNode and Tasttracker communication

Posted by Sriram Ramachandrasekaran <sr...@gmail.com>.
The logs indicate an "already in use" exception. Is that some sign? :)
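It does: both LOG TYPE 2 startup logs above end in java.net.BindException (the DataNode on 50010, the TaskTracker web UI on 50060), which usually means a daemon left over from an earlier run is still holding the port. A minimal sketch of the check, assuming Python is available on the node (the port numbers are the defaults seen in the logs; everything else is illustrative):

```python
import errno
import socket

def port_is_free(port, host="0.0.0.0"):
    """Return True if the port can be bound, False if something already holds it."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return True
    except OSError as e:
        if e.errno == errno.EADDRINUSE:
            return False
        raise
    finally:
        s.close()

# DataNode data port and TaskTracker HTTP port, as reported in the logs above.
for p in (50010, 50060):
    print(p, "free" if port_is_free(p) else "ALREADY IN USE")
```

If a port reports in use, `jps` on that node should list the stale DataNode/TaskTracker process to kill before restarting the daemons.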
On 13 Aug 2012 20:36, "Mohammad Tariq" <do...@gmail.com> wrote:

> Thank you so very much for the detailed response Michael. I'll keep the
> tip in mind. Please pardon my ignorance, as I am still in the learning
> phase.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com>wrote:
>
>> 0.0.0.0 means that the call is going to all interfaces on the machine.
>>  (Shouldn't be an issue...)
>>
>> IPv4 vs IPv6? Could be an issue; however, the OP says he can write data to DNs and they seem to communicate, so if it's IPv6-related, wouldn't it impact all traffic and not just a specific port?
>> I agree... shut down IPv6 if you can.
>>
>> I don't disagree with your assessment. I am just suggesting that before
>> you do a really deep dive, you think about the more obvious stuff first.
>>
>> There are a couple of other things... like do all of the /etc/hosts files
>> on all of the machines match?
>> Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>>
>> BTW, you said DNS in your response. if you're using DNS, then you don't
>> really want to have much info in the /etc/hosts file except loopback and
>> the server's IP address.
>>
>> Looking at the problem, the OP is indicating that some traffic works while other traffic doesn't. Most likely something is blocking the ports. Iptables is the first place to look.
>>
>> Just saying. ;-)
>>
>>
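A way to script the connectivity test described above, assuming Python on the worker nodes; the host and port are the JobTracker values from the logs in this thread and are only examples:

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Plain TCP connect: separates 'port open' from 'refused, blocked, or unresolvable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# JobTracker RPC address taken from the TaskTracker log above (example values).
print(can_connect("its-cs131", 35555))
```

If ping reaches the host but this returns False, the port itself is refused or filtered, which points at iptables or the service not actually listening there.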
>> [remainder of quoted thread trimmed; the messages and full logs quoted here appear verbatim earlier in the thread]
>>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 6 more
>>>>
>>>>
>>>> ########################### LOG TYPE 2
>>>> ############################################################
>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting TaskTracker
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build =
>>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2-r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:24,569 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>>> hadoop-metrics2.properties
>>>> 2012-08-13 00:59:24,626 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:24,627 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>>> period at 10 second(s).
>>>> 2012-08-13 00:59:24,627 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>>>> system started
>>>> 2012-08-13 00:59:24,950 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>>> registered.
>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>>> org.mortbay.log.Slf4jLog
>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
>>>> global filtersafety
>>>> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>> 2012-08-13 00:59:25,232 INFO
>>>> org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater
>>>> with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Starting tasktracker with owner as bmacek
>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
>>>> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
>>>> Loaded the native-hadoop library
>>>> 2012-08-13 00:59:25,255 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>>>> registered.
>>>> 2012-08-13 00:59:25,256 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> TaskTrackerMetrics registered.
>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
>>>> SocketReader
>>>> 2012-08-13 00:59:25,282 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> RpcDetailedActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,282 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> RpcActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> Responder: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> listener on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 0 on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 1 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> TaskTracker up at: localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 3 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>>> handler 2 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>>>> 127.0.0.1:54850
>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>>> time(s).
>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Starting thread: Map-events fetcher for all reduce tasks on
>>>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>>>> exited with exit code 0
>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> Using ResourceCalculatorPlugin :
>>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>>>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>>>> disabled.
>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>>>> IndexCache created with max memory = 10485760
>>>> 2012-08-13 00:59:38,158 INFO
>>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>>> ShuffleServerMetrics registered.
>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>>>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>>>> -1. Opening the listener on 50060
>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>>>> not start task tracker because java.net.BindException: Address already in
>>>> use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch
>>>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at
>>>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>     at
>>>> org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>
>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>>>> SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at
>>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
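[Editor's note] Both shutdown logs above end the same way: a java.net.BindException ("Address already in use") on the DataNode's 0.0.0.0:50010 and the TaskTracker's HTTP port 50060, which usually means a stale daemon from an earlier start is still holding the port. A minimal bash sketch to confirm this from the affected node; the port numbers are the Hadoop 1.x defaults seen in the logs, and `port_in_use` is a helper name invented for this sketch:

```shell
#!/usr/bin/env bash
# Return success if something is already listening on 127.0.0.1:$1.
# Uses bash's /dev/tcp pseudo-device, so no netstat/lsof is required:
# a successful local connect means some process is already bound there.
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# 50010 = DataNode data-transfer port, 50060 = TaskTracker HTTP port
# (the two ports that failed to bind in the logs above).
for port in 50010 50060; do
    if port_in_use "$port"; then
        echo "port $port: in use -- a stale DataNode/TaskTracker may still be running"
    else
        echo "port $port: free"
    fi
done
```

If a stale daemon is found, stopping it cleanly (`hadoop-daemon.sh stop datanode` / `hadoop-daemon.sh stop tasktracker`) before restarting is usually enough.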

Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you so very much for the detailed response, Michael. I'll keep the tip
in mind. Please pardon my ignorance, as I am still in the learning phase.

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com>wrote:

> 0.0.0.0 means that the call is going to all interfaces on the machine.
>  (Shouldn't be an issue...)
>
> IPv4 vs IPv6? Could be an issue; however, the OP says he can write data to
> DNs and they seem to communicate, so if it's IPv6-related, wouldn't it
> impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
>
> I don't disagree with your assessment. I am just suggesting that before
> you do a really deep dive, you think about the more obvious stuff first.
>
> There are a couple of other things... like do all of the /etc/hosts files
> on all of the machines match?
> Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>
> BTW, you said DNS in your response. If you're using DNS, then you don't
> really want to have much info in the /etc/hosts file except loopback and
> the server's IP address.
>
> Looking at the problem, the OP is indicating that some traffic works while
> other traffic doesn't. Most likely something is blocking the ports. Iptables
> is the first place to look.
>
> Just saying. ;-)
>
>
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Hi Michael,
>        I asked for the hosts file because there seems to be some loopback
> problem to me. The log shows that the call is going to 0.0.0.0. Apart from
> what you have said, I think disabling IPv6 and making sure that there is no
> problem with the DNS resolution is also necessary. Please correct me if I am wrong.
> Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com>wrote:
>
>> Based on your /etc/hosts output, why aren't you using DNS?
>>
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
>> generally work well when you're not using the FQDN or its alias.
>>
>> The issue isn't SSH. If you go to the node which is having trouble
>> connecting to another node and try to ping it, or some other general
>> communication, and it succeeds, your issue is that the port you're trying
>> to communicate with is blocked. Then it's more than likely an IP
>> configuration or firewall issue.
>>
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
>> wrote:
>>
>>  Hi Michael,
>>
>> Well, I can SSH from any node to any other without being prompted. The
>> reason for this is that my home dir is mounted on every server in the
>> cluster.
>>
>> Whether the machines are multihomed: I don't know. I could ask if this
>> would be of importance.
>>
>> Shall I?
>>
>> Regards,
>> Elmar
>>
>> Am 13.08.12 14:59, schrieb Michael Segel:
>>
>> If the nodes can communicate and distribute data, then the odds are that
>> the issue isn't going to be in his /etc/hosts.
>>
>>  A more relevant question is whether he's running a firewall on each of
>> these machines.
>>
>>  A simple test... ssh to one node, ping other nodes and the control
>> nodes at random to see if they can see one another. Then check to see if
>> there is a firewall running which would limit the types of traffic between
>> nodes.
>>
>>  One other side note... are these machines multi-homed?
>>
>>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hello there,
>>
>>       Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <
>> macek@cs.uni-kassel.de> wrote:
>>
>>> Hi,
>>>
>>> i am currently trying to run my Hadoop program on a cluster. Sadly,
>>> though, my datanodes and tasktrackers seem to have difficulties with their
>>> communication, as their logs say:
>>> * Some datanodes and tasktrackers seem to have port problems of some kind,
>>> as can be seen in the logs below. I wondered if this might be due to
>>> reasons correlated with the localhost entry in /etc/hosts, as you can read
>>> in a lot of posts with similar errors, but I checked the file: neither
>>> localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can
>>> ping localhost... the technician of the cluster said he'd be looking for
>>> the mechanics resolving localhost.)
>>> * The other nodes cannot speak with the namenode and jobtracker
>>> (its-cs131), although it is absolutely not clear why this is happening:
>>> the "dfs -put" I do directly before the job is running fine, which seems to
>>> imply that communication between those servers is working flawlessly.
>>>
>>> Is there any reason why this might happen?
>>>
>>>
>>> Regards,
>>> Elmar
>>>
>>> LOGS BELOW:
>>>
>>> \____Datanodes
>>>
>>> After successfully putting the data to hdfs (at this point i thought
>>> namenode and datanodes have to communicate), i get the following errors
>>> when starting the job:
>>>
>>> There are 2 kinds of logs i found: the first one is big (about 12MB) and
>>> looks like this:
>>> ############################### LOG TYPE 1
>>> ############################################################
>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>>> time(s).
>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 1
>>> time(s).
>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 2
>>> time(s).
>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 3
>>> time(s).
>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 4
>>> time(s).
>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 5
>>> time(s).
>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 6
>>> time(s).
>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 7
>>> time(s).
>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 8
>>> time(s).
>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 9
>>> time(s).
>>> 2012-08-13 08:23:36,335 WARN
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException:
>>> Call to its-cs131/141.51.205.41:35554 failed on connection exception:
>>> java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>     at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at
>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net
>>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 5 more
>>>
>>> ... (this continues til the end of the log)
>>>
>>> The second is short kind:
>>> ########################### LOG TYPE 2
>>> ############################################################
>>> 2012-08-13 00:59:19,038 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting DataNode
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build =
>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:19,203 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>> hadoop-metrics2.properties
>>> 2012-08-13 00:59:19,216 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:19,217 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>> period at 10 second(s).
>>> 2012-08-13 00:59:19,218 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system
>>> started
>>> 2012-08-13 00:59:19,306 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>> registered.
>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader:
>>> Loaded the native-hadoop library
>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>>> time(s).
>>> 2012-08-13 00:59:21,584 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>>> /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>> 2012-08-13 00:59:21,584 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>> 2012-08-13 00:59:21,787 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>> FSDatasetStatusMBean
>>> 2012-08-13 00:59:21,897 INFO
>>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting
>>> down all async disk service threads...
>>> 2012-08-13 00:59:21,897 INFO
>>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async
>>> disk service threads have been shut down.
>>> 2012-08-13 00:59:21,898 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
>>> Problem binding to /0.0.0.0:50010 : Address already in use
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>> Caused by: java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch
>>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>     ... 7 more
>>>
>>> 2012-08-13 00:59:21,899 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down DataNode at
>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>>
>>>
>>>
>>>
>>>
>>> \_____TaskTracker
>>> With TaskTrackers it is the same: there are 2 kinds.
>>> ############################### LOG TYPE 1
>>> ############################################################
>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Resending 'status' to 'its-cs131' with reponseId '879
>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>> time(s).
>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 1
>>> time(s).
>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 2
>>> time(s).
>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 3
>>> time(s).
>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 4
>>> time(s).
>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 5
>>> time(s).
>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 6
>>> time(s).
>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 7
>>> time(s).
>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 8
>>> time(s).
>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 9
>>> time(s).
>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker:
>>> Caught exception: java.net.ConnectException: Call to its-cs131/
>>> 141.51.205.41:35555 failed on connection exception:
>>> java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>     at
>>> org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>     at
>>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at
>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net
>>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 6 more
>>>
>>>
>>> ########################### LOG TYPE 2
>>> ############################################################
>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>>> STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting TaskTracker
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build =
>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:24,569 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>> hadoop-metrics2.properties
>>> 2012-08-13 00:59:24,626 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:24,627 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>> period at 10 second(s).
>>> 2012-08-13 00:59:24,627 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>>> system started
>>> 2012-08-13 00:59:24,950 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>> registered.
>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>> org.mortbay.log.Slf4jLog
>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
>>> global filtersafety
>>> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>>> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting tasktracker with owner as bmacek
>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
>>> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
>>> Loaded the native-hadoop library
>>> 2012-08-13 00:59:25,255 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>>> registered.
>>> 2012-08-13 00:59:25,256 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> TaskTrackerMetrics registered.
>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
>>> SocketReader
>>> 2012-08-13 00:59:25,282 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> RpcDetailedActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,282 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> RpcActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> Responder: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> listener on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>> TaskTracker up at: localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 3 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>>> 127.0.0.1:54850
>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>> time(s).
>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting thread: Map-events fetcher for all reduce tasks on
>>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>>> exited with exit code 0
>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using
>>> ResourceCalculatorPlugin :
>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>>> disabled.
>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>>> IndexCache created with max memory = 10485760
>>> 2012-08-13 00:59:38,158 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> ShuffleServerMetrics registered.
>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>>> -1. Opening the listener on 50060
>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>>> not start task tracker because java.net.BindException: Address already in
>>> use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch
>>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at
>>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>
>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>>> SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at
>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>>
>>
>>
>>
>>
>>
>
>
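[Editor's note] The test Michael suggests (ping succeeds, so check whether the specific RPC ports are blocked) can be scripted. This is only a sketch, run from a worker node, using the master hostname and port numbers that appear in the logs above (its-cs131; 35554 for the NameNode RPC, 35555 for the JobTracker RPC in this cluster's configuration); `check_port` is a helper name invented here:

```shell
#!/usr/bin/env bash
# Succeed if a TCP connection to host $1, port $2 can be opened.
# Uses bash's /dev/tcp pseudo-device; connect errors are silenced.
check_port() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

MASTER=its-cs131   # master node (NameNode + JobTracker) from the logs

for port in 35554 35555; do   # 35554 = NameNode RPC, 35555 = JobTracker RPC
    if check_port "$MASTER" "$port"; then
        echo "$MASTER:$port reachable"
    else
        echo "$MASTER:$port NOT reachable -- daemon down, or traffic filtered"
    fi
done

# If ping works but the ports are unreachable, look for filtering rules:
#   sudo iptables -L -n | grep -E 'DROP|REJECT'
# And check that /etc/hosts is identical on the nodes, as suggested above:
#   for h in its-cs131 its-cs133; do ssh "$h" md5sum /etc/hosts; done
```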

Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you so very much for the detailed response Michael. I'll keep the tip
in mind. Please pardon my ignorance, as I am still in the learning phase.

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com>wrote:

> 0.0.0.0 means that the call is going to all interfaces on the machine.
>  (Shouldn't be an issue...)
>
> IPv4 vs IPv6? Could be an issue, however OP says he can write data to DNs
> and they seem to communicate, therefore if its IPv6 related, wouldn't it
> impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
>
> I don't disagree with your assessment. I am just suggesting that before
> you do a really deep dive, you think about the more obvious stuff first.
>
> There are a couple of other things... like do all of the /etc/hosts files
> on all of the machines match?
> Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>
> BTW, you said DNS in your response. if you're using DNS, then you don't
> really want to have much info in the /etc/hosts file except loopback and
> the server's IP address.
>
> Looking at the problem OP is indicating some traffic works, while other
> traffic doesn't. Most likely something is blocking the ports. Iptables is
> the first place to look.
>
> Just saying. ;-)
>
>
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Hi Michael,
>        I asked for the hosts file because there seems to be some loopback
> problem to me. The log shows that the call is going to 0.0.0.0. Apart from
> what you have said, I think disabling IPv6 and making sure that there is no
> problem with the DNS resolution is also necessary. Please correct me if I am
> wrong. Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
>
>> Based on your /etc/hosts output, why aren't you using DNS?
>>
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
>> generally work well when you're not using the FQDN or its alias.
>>
>> The issue isn't the SSH. But if you go to the node which is having
>> trouble connecting to another node and try to ping it, or some other
>> general communication, and that succeeds, your issue is that the port
>> you're trying to communicate with is blocked. Then it's more than likely
>> an IP configuration or firewall issue.
>>
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
>> wrote:
>>
>>  Hi Michael,
>>
>> Well, I can SSH from any node to any other without being prompted. The
>> reason for this is that my home dir is mounted on every server in the
>> cluster.
>>
>> Whether the machines are multihomed: I don't know. I could ask, if this
>> would be of importance.
>>
>> Shall I?
>>
>> Regards,
>> Elmar
>>
>> On 13.08.12 14:59, Michael Segel wrote:
>>
>> If the nodes can communicate and distribute data, then the odds are that
>> the issue isn't going to be in his /etc/hosts.
>>
>>  A more relevant question is if he's running a firewall on each of these
>> machines?
>>
>>  A simple test... ssh to one node, ping other nodes and the control
>> nodes at random to see if they can see one another. Then check to see if
>> there is a firewall running which would limit the types of traffic between
>> nodes.
>>
>>  One other side note... are these machines multi-homed?
>>
>>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hello there,
>>
>>       Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <
>> macek@cs.uni-kassel.de> wrote:
>>
>>> Hi,
>>>
>>> I am currently trying to run my Hadoop program on a cluster. Sadly,
>>> though, my datanodes and tasktrackers seem to have difficulties with their
>>> communication, as their logs say:
>>> * Some datanodes and tasktrackers seem to have port problems of some kind,
>>> as can be seen in the logs below. I wondered if this might be due to
>>> reasons correlated with the localhost entry in /etc/hosts, as you can read
>>> in a lot of posts with similar errors, but I checked the file: neither
>>> localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can
>>> ping localhost... the technician of the cluster said he'd be looking into
>>> the mechanism resolving localhost.)
>>> * The other nodes cannot talk to the namenode and jobtracker
>>> (its-cs131), although it is absolutely not clear why this is happening:
>>> the "dfs -put" I do directly before the job runs fine, which seems to
>>> imply that communication between those servers is working flawlessly.
>>>
>>> Is there any reason why this might happen?
>>>
>>>
>>> Regards,
>>> Elmar
>>>
>>> LOGS BELOW:
>>>
>>> \____Datanodes
>>>
>>> After successfully putting the data to HDFS (at this point I thought
>>> the namenode and datanodes have to communicate), I get the following
>>> errors when starting the job:
>>>
>>> There are 2 kinds of logs I found: the first one is big (about 12 MB) and
>>> looks like this:
>>> ############################### LOG TYPE 1
>>> ############################################################
>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>>> time(s).
>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 1
>>> time(s).
>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 2
>>> time(s).
>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 3
>>> time(s).
>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 4
>>> time(s).
>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 5
>>> time(s).
>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 6
>>> time(s).
>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 7
>>> time(s).
>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 8
>>> time(s).
>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 9
>>> time(s).
>>> 2012-08-13 08:23:36,335 WARN
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException:
>>> Call to its-cs131/141.51.205.41:35554 failed on connection exception:
>>> java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>     at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at
>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net
>>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 5 more
>>>
>>> ... (this continues until the end of the log)
>>>
>>> The second is the short kind:
>>> ########################### LOG TYPE 2
>>> ############################################################
>>> 2012-08-13 00:59:19,038 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting DataNode
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build =
>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:19,203 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>> hadoop-metrics2.properties
>>> 2012-08-13 00:59:19,216 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:19,217 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>> period at 10 second(s).
>>> 2012-08-13 00:59:19,218 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system
>>> started
>>> 2012-08-13 00:59:19,306 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>> registered.
>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader:
>>> Loaded the native-hadoop library
>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>>> time(s).
>>> 2012-08-13 00:59:21,584 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>>> /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>> 2012-08-13 00:59:21,584 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>> 2012-08-13 00:59:21,787 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>> FSDatasetStatusMBean
>>> 2012-08-13 00:59:21,897 INFO
>>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting
>>> down all async disk service threads...
>>> 2012-08-13 00:59:21,897 INFO
>>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async
>>> disk service threads have been shut down.
>>> 2012-08-13 00:59:21,898 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
>>> Problem binding to /0.0.0.0:50010 : Address already in use
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>> Caused by: java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch
>>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>     ... 7 more
>>>
>>> 2012-08-13 00:59:21,899 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down DataNode at
>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
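The BindException above ("Problem binding to /0.0.0.0:50010 : Address already in use") means some process, most likely a DataNode left over from an earlier start attempt, already owns the port; `netstat -tlnp | grep 50010` or `jps` on the node will name it. As a minimal sketch of the same check (the port number is the default DataNode data-transfer port taken from the log):

```python
import socket

def port_in_use(port, host="0.0.0.0"):
    """Return True if TCP host:port cannot be bound, i.e. some other
    process already holds it -- the condition behind the DataNode's
    'Address already in use' BindException."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return False
    except OSError:
        return True
    finally:
        s.close()

if __name__ == "__main__":
    # 50010 is the default DataNode data-transfer port seen in the log.
    print("50010 in use:", port_in_use(50010))
```

If the port is taken, killing the stale daemon (or running `stop-dfs.sh` before `start-dfs.sh` again) clears the condition.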
>>>
>>>
>>>
>>>
>>>
>>> \_____TaskTracker
>>> With TaskTrackers it is the same: there are 2 kinds.
>>> ############################### LOG TYPE 1
>>> ############################################################
>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Resending 'status' to 'its-cs131' with reponseId '879
>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>> time(s).
>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 1
>>> time(s).
>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 2
>>> time(s).
>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 3
>>> time(s).
>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 4
>>> time(s).
>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 5
>>> time(s).
>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 6
>>> time(s).
>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 7
>>> time(s).
>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 8
>>> time(s).
>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 9
>>> time(s).
>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker:
>>> Caught exception: java.net.ConnectException: Call to its-cs131/
>>> 141.51.205.41:35555 failed on connection exception:
>>> java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>     at
>>> org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>     at
>>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at
>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net
>>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 6 more
>>>
>>>
>>> ########################### LOG TYPE 2
>>> ############################################################
>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>>> STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting TaskTracker
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build =
>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:24,569 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>> hadoop-metrics2.properties
>>> 2012-08-13 00:59:24,626 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:24,627 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>> period at 10 second(s).
>>> 2012-08-13 00:59:24,627 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>>> system started
>>> 2012-08-13 00:59:24,950 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>> registered.
>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>> org.mortbay.log.Slf4jLog
>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
>>> global filtersafety
>>> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>>> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting tasktracker with owner as bmacek
>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
>>> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
>>> Loaded the native-hadoop library
>>> 2012-08-13 00:59:25,255 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>>> registered.
>>> 2012-08-13 00:59:25,256 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> TaskTrackerMetrics registered.
>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
>>> SocketReader
>>> 2012-08-13 00:59:25,282 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> RpcDetailedActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,282 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> RpcActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> Responder: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> listener on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>> TaskTracker up at: localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 3 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>>> 127.0.0.1:54850
>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>> time(s).
>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting thread: Map-events fetcher for all reduce tasks on
>>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>>> exited with exit code 0
>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using
>>> ResourceCalculatorPlugin :
>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>>> disabled.
>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>>> IndexCache created with max memory = 10485760
>>> 2012-08-13 00:59:38,158 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> ShuffleServerMetrics registered.
>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>>> -1. Opening the listener on 50060
>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>>> not start task tracker because java.net.BindException: Address already in
>>> use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch
>>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at
>>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>
>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>>> SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at
>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>>
>>
>>
>>
>>
>>
>
>
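Michael's question above about whether /etc/hosts and DNS are in sync can be checked mechanically. The sketch below parses the local hosts file and compares each entry against what the resolver actually returns (which follows the nsswitch order, so a disagreement points at conflicting sources); the node names in the demo are taken from this thread and are placeholders for any other cluster.

```python
import socket

def hosts_file_entries(path="/etc/hosts"):
    """Parse hostname -> IP mappings from an /etc/hosts-style file,
    skipping comments and blank lines; first mapping for a name wins."""
    mapping = {}
    with open(path) as fh:
        for line in fh:
            line = line.split("#", 1)[0].strip()
            if not line:
                continue
            parts = line.split()
            ip, names = parts[0], parts[1:]
            for name in names:
                mapping.setdefault(name, ip)
    return mapping

def check_sync(hostnames, path="/etc/hosts"):
    """Return {name: (hosts_file_ip, resolved_ip)} for every hostname
    whose /etc/hosts entry disagrees with what the resolver returns."""
    entries = hosts_file_entries(path)
    mismatches = {}
    for name in hostnames:
        try:
            resolved = socket.gethostbyname(name)
        except OSError:
            resolved = None
        if name in entries and entries[name] != resolved:
            mismatches[name] = (entries[name], resolved)
    return mismatches

if __name__ == "__main__":
    # Node names from the thread; adjust per cluster.
    print(check_sync(["its-cs131", "its-cs133.its.uni-kassel.de"]))
```

Running this on every node (the script only sees the local hosts file, so it has to run per machine) answers both of Michael's questions: whether the files match across nodes, and whether each file agrees with DNS.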

Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you so very much for the detailed response Michael. I'll keep the tip
in mind. Please pardon my ignorance, as I am still in the learning phase.

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com>wrote:

> 0.0.0.0 means that the call is going to all interfaces on the machine.
>  (Shouldn't be an issue...)
>
> IPv4 vs IPv6? Could be an issue, however OP says he can write data to DNs
> and they seem to communicate, therefore if its IPv6 related, wouldn't it
> impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
>
> I don't disagree with your assessment. I am just suggesting that before
> you do a really deep dive, you think about the more obvious stuff first.
>
> There are a couple of other things... like do all of the /etc/hosts files
> on all of the machines match?
> Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>
> BTW, you said DNS in your response. if you're using DNS, then you don't
> really want to have much info in the /etc/hosts file except loopback and
> the server's IP address.
>
> Looking at the problem OP is indicating some traffic works, while other
> traffic doesn't. Most likely something is blocking the ports. Iptables is
> the first place to look.
>
> Just saying. ;-)
>
>
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Hi Michael,
>        I asked for hosts file because there seems to be some loopback prob
> to me. The log shows that call is going at 0.0.0.0. Apart from what you
> have said, I think disabling IPv6 and making sure that there is no prob
> with the DNS resolution is also necessary. Please correct me if I am wrong.
> Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com>wrote:
>
>> Based on your /etc/hosts output, why aren't you using DNS?
>>
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
>> generally work well when you're not using the FQDN or its alias.
>>
>> The issue isn't the SSH, but if you go to the node which is having
>> trouble connecting to another node,  then try to ping it, or some other
>> general communication,  if it succeeds, your issue is that the port you're
>> trying to communicate with is blocked.  Then its more than likely an
>> ipconfig or firewall issue.
>>
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
>> wrote:
>>
>>  Hi Michael,
>>
>> well i can ssh from any node to any other without being prompted. The
>> reason for this is, that my home dir is mounted in every server in the
>> cluster.
>>
>> If the machines are multihomed: i dont know. i could ask if this would be
>> of importance.
>>
>> Shall i?
>>
>> Regards,
>> Elmar
>>
>> Am 13.08.12 14:59, schrieb Michael Segel:
>>
>> If the nodes can communicate and distribute data, then the odds are that
>> the issue isn't going to be in his /etc/hosts.
>>
>>  A more relevant question is if he's running a firewall on each of these
>> machines?
>>
>>  A simple test... ssh to one node, ping other nodes and the control
>> nodes at random to see if they can see one another. Then check to see if
>> there is a firewall running which would limit the types of traffic between
>> nodes.
>>
>>  One other side note... are these machines multi-homed?
>>
>>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hello there,
>>
>>       Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <
>> macek@cs.uni-kassel.de> wrote:
>>
>>> Hi,
>>>
>>> i am currently trying to run my hadoop program on a cluster. Sadly
>>> though my datanodes and tasktrackers seem to have difficulties with their
>>> communication as their logs say:
>>> * Some datanodes and tasktrackers seem to have portproblems of some kind
>>> as it can be seen in the logs below. I wondered if this might be due to
>>> reasons correllated with the localhost entry in /etc/hosts as you can read
>>> in alot of posts with similar errors, but i checked the file neither
>>> localhost nor 127.0.0.1/127.0.1.1 is bound there. (although you can
>>> ping localhost... the technician of the cluster said he'd be looking for
>>> the mechanics resolving localhost)
>>> * The other nodes can not speak with the namenode and jobtracker
>>> (its-cs131). Although it is absolutely not clear, why this is happening:
>>> the "dfs -put" i do directly before the job is running fine, which seems to
>>> imply that communication between those servers is working flawlessly.
>>>
>>> Is there any reason why this might happen?
>>>
>>>
>>> Regards,
>>> Elmar
>>>
>>> LOGS BELOW:
>>>
>>> \____Datanodes
>>>
>>> After successfully putting the data to hdfs (at this point i thought
>>> namenode and datanodes have to communicate), i get the following errors
>>> when starting the job:
>>>
>>> There are 2 kinds of logs i found: the first one is big (about 12MB) and
>>> looks like this:
>>> ############################### LOG TYPE 1
>>> ############################################################
>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>>> time(s).
>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 1
>>> time(s).
>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 2
>>> time(s).
>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 3
>>> time(s).
>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 4
>>> time(s).
>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 5
>>> time(s).
>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 6
>>> time(s).
>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 7
>>> time(s).
>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 8
>>> time(s).
>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 9
>>> time(s).
>>> 2012-08-13 08:23:36,335 WARN
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException:
>>> Call to its-cs131/141.51.205.41:35554 failed on connection exception:
>>> java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>     at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at
>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net
>>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 5 more
>>>
>>> ... (this continues til the end of the log)
>>>
>>> The second is short kind:
>>> ########################### LOG TYPE 2
>>> ############################################################
>>> 2012-08-13 00:59:19,038 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting DataNode
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build =
>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:19,203 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>> hadoop-metrics2.properties
>>> 2012-08-13 00:59:19,216 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:19,217 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>> period at 10 second(s).
>>> 2012-08-13 00:59:19,218 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system
>>> started
>>> 2012-08-13 00:59:19,306 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>> registered.
>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader:
>>> Loaded the native-hadoop library
>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>>> time(s).
>>> 2012-08-13 00:59:21,584 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>>> /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>> 2012-08-13 00:59:21,584 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>> 2012-08-13 00:59:21,787 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>> FSDatasetStatusMBean
>>> 2012-08-13 00:59:21,897 INFO
>>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting
>>> down all async disk service threads...
>>> 2012-08-13 00:59:21,897 INFO
>>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async
>>> disk service threads have been shut down.
>>> 2012-08-13 00:59:21,898 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
>>> Problem binding to /0.0.0.0:50010 : Address already in use
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>     at
>>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>> Caused by: java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch
>>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>     ... 7 more
>>>
>>> 2012-08-13 00:59:21,899 INFO
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down DataNode at
>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>>
>>>
>>>
>>>
>>>
>>> \_____TaskTracker
>>> With TaskTrackers it is the same: there are 2 kinds.
>>> ############################### LOG TYPE 1
>>> ############################################################
>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Resending 'status' to 'its-cs131' with reponseId '879
>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>> time(s).
>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 1
>>> time(s).
>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 2
>>> time(s).
>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 3
>>> time(s).
>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 4
>>> time(s).
>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 5
>>> time(s).
>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 6
>>> time(s).
>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 7
>>> time(s).
>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 8
>>> time(s).
>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 9
>>> time(s).
>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker:
>>> Caught exception: java.net.ConnectException: Call to its-cs131/
>>> 141.51.205.41:35555 failed on connection exception:
>>> java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>     at
>>> org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>     at
>>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at
>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 6 more
>>>
>>>
>>> ########################### LOG TYPE 2
>>> ############################################################
>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>>> STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting TaskTracker
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build =
>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:24,569 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>>> hadoop-metrics2.properties
>>> 2012-08-13 00:59:24,626 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:24,627 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>>> period at 10 second(s).
>>> 2012-08-13 00:59:24,627 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>>> system started
>>> 2012-08-13 00:59:24,950 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>>> registered.
>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>> org.mortbay.log.Slf4jLog
>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
>>> global filtersafety
>>> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>>> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting tasktracker with owner as bmacek
>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
>>> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
>>> Loaded the native-hadoop library
>>> 2012-08-13 00:59:25,255 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>>> registered.
>>> 2012-08-13 00:59:25,256 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> TaskTrackerMetrics registered.
>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
>>> SocketReader
>>> 2012-08-13 00:59:25,282 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> RpcDetailedActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,282 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> RpcActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> Responder: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> listener on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>> TaskTracker up at: localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 3 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>>> 127.0.0.1:54850
>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>>> time(s).
>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting thread: Map-events fetcher for all reduce tasks on
>>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>>> exited with exit code 0
>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using
>>> ResourceCalculatorPlugin :
>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>>> disabled.
>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>>> IndexCache created with max memory = 10485760
>>> 2012-08-13 00:59:38,158 INFO
>>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>>> ShuffleServerMetrics registered.
>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>>> -1. Opening the listener on 50060
>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>>> not start task tracker because java.net.BindException: Address already in
>>> use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at
>>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>
>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>>> SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at
>>> its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>>
>>
>>
>>
>>
>>
>
>
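The BindException in the DataNode log above ("Problem binding to /0.0.0.0:50010 : Address already in use") means some other process was still holding the port when the daemon started, typically a leftover DataNode or TaskTracker from an earlier run. A minimal sketch of the same failure mode, using a throwaway ephemeral port instead of Hadoop's 50010:

```python
import socket

def try_bind(port):
    """Try to bind and listen on a TCP port; return (socket, None) or (None, error text)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("0.0.0.0", port))
        s.listen(1)
        return s, None
    except OSError as e:
        s.close()
        return None, str(e)

# Bind an ephemeral port (0 lets the OS pick one), then try to bind the same
# port again: the second attempt fails exactly like the DataNode's bind on 50010.
first, _ = try_bind(0)
port = first.getsockname()[1]
second, err = try_bind(port)
print(err)  # e.g. "[Errno 98] Address already in use"
first.close()
```

On the cluster itself, `netstat -tlnp | grep 50010` (or `lsof -i :50010`) shows which process holds the port; killing the stale daemon and restarting usually clears this error.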

Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you so very much for the detailed response, Michael. I'll keep the tip
in mind. Please pardon my ignorance, as I am still in the learning phase.

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 8:29 PM, Michael Segel <mi...@hotmail.com>wrote:

> 0.0.0.0 means that the call is going to all interfaces on the machine.
>  (Shouldn't be an issue...)
>
> IPv4 vs IPv6? Could be an issue; however, the OP says he can write data to DNs
> and they seem to communicate, so if it were IPv6-related, wouldn't it
> impact all traffic and not just a specific port?
> I agree... shut down IPv6 if you can.
>
> I don't disagree with your assessment. I am just suggesting that before
> you do a really deep dive, you think about the more obvious stuff first.
>
> There are a couple of other things... like do all of the /etc/hosts files
> on all of the machines match?
> Is the OP using both /etc/hosts and DNS? If so, are they in sync?
>
> BTW, you said DNS in your response. If you're using DNS, then you don't
> really want to have much info in the /etc/hosts file except loopback and
> the server's IP address.
>
> Looking at the problem, the OP is indicating that some traffic works while
> other traffic doesn't. Most likely something is blocking the ports; iptables
> is the first place to look.
>
> Just saying. ;-)
>
>
> On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Hi Michael,
>        I asked for the hosts file because there seems to be a loopback problem
> to me. The log shows the call going to 0.0.0.0. Apart from what you
> have said, I think disabling IPv6 and making sure that there is no problem
> with DNS resolution is also necessary. Please correct me if I am wrong.
> Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com>wrote:
>
>> Based on your /etc/hosts output, why aren't you using DNS?
>>
>> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
>> generally work well when you're not using the FQDN or its alias.
>>
>> The issue isn't SSH. If you go to the node which is having
>> trouble connecting to another node, then try to ping it or some other
>> general communication, and it succeeds, your issue is that the port you're
>> trying to communicate with is blocked. Then it's more than likely an
>> IP configuration or firewall issue.
>>
>> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
>> wrote:
>>
>>  Hi Michael,
>>
>> Well, I can ssh from any node to any other without being prompted. The
>> reason for this is that my home dir is mounted on every server in the
>> cluster.
>>
>> Whether the machines are multihomed: I don't know. I could ask if this would be
>> of importance.
>>
>> Shall I?
>>
>> Regards,
>> Elmar
>>
>> Am 13.08.12 14:59, schrieb Michael Segel:
>>
>> If the nodes can communicate and distribute data, then the odds are that
>> the issue isn't going to be in his /etc/hosts.
>>
>>  A more relevant question is if he's running a firewall on each of these
>> machines?
>>
>>  A simple test... ssh to one node, ping other nodes and the control
>> nodes at random to see if they can see one another. Then check to see if
>> there is a firewall running which would limit the types of traffic between
>> nodes.
>>
>>  One other side note... are these machines multi-homed?
>>
>>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>
>> Hello there,
>>
>>       Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq

Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
0.0.0.0 means that the call is going to all interfaces on the machine.  (Shouldn't be an issue...)

IPv4 vs IPv6? Could be an issue; however, the OP says he can write data to DNs and they seem to communicate, so if it's IPv6 related, wouldn't it impact all traffic and not just a specific port?
I agree... shut down IPv6 if you can.

I don't disagree with your assessment. I am just suggesting that before you do a really deep dive, you think about the more obvious stuff first. 

There are a couple of other things... like do all of the /etc/hosts files on all of the machines match? 
Is the OP using both /etc/hosts and DNS? If so, are they in sync? 

BTW, you said DNS in your response. If you're using DNS, then you don't really want to have much info in the /etc/hosts file except loopback and the server's own IP address. 
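For what it's worth, a minimal /etc/hosts along those lines might look like this (the name and address below are taken from the logs in this thread; it is a sketch per machine, not a prescription):

```
127.0.0.1       localhost
# this machine's own name, matching what DNS returns:
141.51.205.43   its-cs133.its.uni-kassel.de its-cs133
```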

Looking at the problem: the OP indicates that some traffic works while other traffic doesn't. Most likely something is blocking the ports. Iptables is the first place to look. 

Just saying. ;-) 
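A quick way to test exactly that from a worker node is sketched below, using bash's built-in /dev/tcp so no extra tools are needed. The host and ports are the ones from the OP's logs; adjust them to your own config. If a probe fails while ping works, compare against `sudo iptables -L -n` on the target.

```shell
#!/bin/bash
# probe HOST PORT: report whether a TCP connection can be opened.
probe() {
  local host=$1 port=$2
  # bash opens /dev/tcp/<host>/<port> as a TCP connection; the subshell
  # keeps fd 3 from leaking into the caller.
  if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port refused or filtered"
  fi
}

# Example: JobTracker RPC port from the OP's logs (assumed values).
probe its-cs131 35555
```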


On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hi Michael,
>        I asked for the hosts file because it looks like a loopback problem to me. The log shows that the call is going to 0.0.0.0. Apart from what you have said, I think disabling IPv6 and making sure that there is no problem with the DNS resolution is also necessary. Please correct me if I am wrong. Thank you.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
> Based on your /etc/hosts output, why aren't you using DNS? 
> 
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 
> 
> The issue isn't the SSH, but if you go to the node which is having trouble connecting to another node, then try to ping it, or some other general communication, if it succeeds, your issue is that the port you're trying to communicate with is blocked. Then it's more than likely an IP config or firewall issue.
> 
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:
> 
>> Hi Michael,
>> 
>> well i can ssh from any node to any other without being prompted. The reason for this is, that my home dir is mounted in every server in the cluster. 
>> 
>> If the machines are multihomed: i dont know. i could ask if this would be of importance.
>> 
>> Shall i?
>> 
>> Regards,
>> Elmar
>> 
>> On 13.08.12 14:59, Michael Segel wrote:
>>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>>> 
>>> A more relevant question is if he's running a firewall on each of these machines? 
>>> 
>>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>>> 
>>> One other side note... are these machines multi-homed?
>>> 
>>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>> 
>>>> Hello there,
>>>> 
>>>>      Could you please share your /etc/hosts file, if you don't mind.
>>>> 
>>>> Regards,
>>>>     Mohammad Tariq
>>>> 
>>>> 
>>>> 
>>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>>> Hi,
>>>> 
>>>> I am currently trying to run my hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties with their communication, as their logs say:
>>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be due to reasons correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd be looking into the mechanism resolving localhost.)
>>>> * The other nodes cannot speak with the namenode and jobtracker (its-cs131), although it is absolutely not clear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>>> 
>>>> Is there any reason why this might happen?
>>>> 
>>>> 
>>>> Regards,
>>>> Elmar
>>>> 
>>>> LOGS BELOW:
>>>> 
>>>> \____Datanodes
>>>> 
>>>> After successfully putting the data to HDFS (at this point I thought the namenode and datanodes have to communicate), I get the following errors when starting the job:
>>>>
>>>> There are 2 kinds of logs I found: the first one is big (about 12 MB) and looks like this:
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 5 more
>>>> 
>>>> ... (this continues til the end of the log)
>>>> 
>>>> The second is the short kind:
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting DataNode
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>> Caused by: java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>     ... 7 more
>>>> 
>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> \_____TaskTracker
>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 6 more
>>>> 
>>>> 
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting TaskTracker
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>> 
>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>> 
>>> 
>> 
> 
> 


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Hi,

with "using DNS" you mean using the servers' non-IP names, right?
If so, I do use DNS. Since I am working in a SLURM environment and get 
a list of nodes for every job I schedule, I construct the config files 
for every job by taking the list of assigned nodes and dividing the 
roles (NameNode, JobTracker, SecondaryNameNode, TaskTrackers, DataNodes) 
over this set of machines. SLURM offers me names like "its-cs<nodenumber>", 
which is enough for ssh to connect - maybe it isn't for all hadoop 
processes. The complete names would be 
"its-cs<nodenumber>.its.uni-kassel.de". I will add this part of the 
address for testing. But I fear it won't help a lot, because the 
JobTracker's log seems to know the full names:
###
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000887 has split on 
node:/default-rack/its-cs202.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000888 has split on 
node:/default-rack/its-cs202.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000889 has split on 
node:/default-rack/its-cs195.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000890 has split on 
node:/default-rack/its-cs196.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000891 has split on 
node:/default-rack/its-cs201.its.uni-kassel.de
###
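The qualification step I described can be sketched as a tiny helper. The domain suffix is the one from this cluster; feeding the helper from `scontrol show hostnames "$SLURM_JOB_NODELIST"` is an assumption about how the SLURM allocation is read, not something I have verified here.

```shell
#!/bin/bash
# qualify DOMAIN: append the cluster domain to each short host name
# read from stdin, producing FQDNs for the masters/slaves files.
qualify() {
  local domain=$1
  while IFS= read -r h; do
    printf '%s.%s\n' "$h" "$domain"
  done
}

# In a job script one might do (assumed SLURM usage):
#   scontrol show hostnames "$SLURM_JOB_NODELIST" \
#     | qualify its.uni-kassel.de > "$HADOOP_CONF_DIR/slaves"
echo its-cs202 | qualify its.uni-kassel.de
```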

Pings work, btw: I could ping the NameNode from all problematic nodes. 
And "lsof -i" didn't show any other programs running on the 
NameNode/JobTracker node with the problematic ports. :( Maybe something 
to note: the NameNode/JobTracker server is at the moment not running 
anymore, although the DataNode/TaskTracker logs are still growing.


Concerning IPv6: as far as I can see, I would have to modify global 
config files to disable it. Since I am only a user of this cluster with 
very limited insight into why the machines are configured the way they 
are, I want to be very careful about asking the technicians to make 
changes to their setup. I don't want to be disrespectful.
I will try using the full names first, and if this doesn't help, I will 
of course ask them whether other options are available.


On 13.08.12 16:12, Mohammad Tariq wrote:
> Hi Michael,
>        I asked for hosts file because there seems to be some loopback 
> prob to me. The log shows that call is going at 0.0.0.0. Apart from 
> what you have said, I think disabling IPv6 and making sure that there 
> is no prob with the DNS resolution is also necessary. Please correct 
> me if I am wrong. Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel 
> <michael_segel@hotmail.com> wrote:
>
>     Based on your /etc/hosts output, why aren't you using DNS?
>
>     Outside of MapR, multihomed machines can be problematic. Hadoop
>     doesn't generally work well when you're not using the FQDN or its
>     alias.
>
>     The issue isn't the SSH, but if you go to the node which is having
>     trouble connecting to another node,  then try to ping it, or some
>     other general communication,  if it succeeds, your issue is that
>     the port you're trying to communicate with is blocked.  Then its
>     more than likely an ipconfig or firewall issue.
>
>     On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>     <ema@cs.uni-kassel.de> wrote:
>
>>     Hi Michael,
>>
>>     well i can ssh from any node to any other without being prompted.
>>     The reason for this is, that my home dir is mounted in every
>>     server in the cluster.
>>
>>     If the machines are multihomed: i dont know. i could ask if this
>>     would be of importance.
>>
>>     Shall i?
>>
>>     Regards,
>>     Elmar
>>
>>     On 13.08.12 14:59, Michael Segel wrote:
>>>     If the nodes can communicate and distribute data, then the odds
>>>     are that the issue isn't going to be in his /etc/hosts.
>>>
>>>     A more relevant question is if he's running a firewall on each
>>>     of these machines?
>>>
>>>     A simple test... ssh to one node, ping other nodes and the
>>>     control nodes at random to see if they can see one another. Then
>>>     check to see if there is a firewall running which would limit
>>>     the types of traffic between nodes.
>>>
>>>     One other side note... are these machines multi-homed?
>>>
>>>     On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>
>>>>     Hello there,
>>>>
>>>>          Could you please share your /etc/hosts file, if you don't
>>>>     mind.
>>>>
>>>>     Regards,
>>>>         Mohammad Tariq
>>>>
>>>>
>>>>
>>>>         directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>         2012-08-13 00:59:21,584 INFO
>>>>         org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>         2012-08-13 00:59:21,787 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>>>         FSDatasetStatusMBean
>>>>         2012-08-13 00:59:21,897 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>         Shutting down all async disk service threads...
>>>>         2012-08-13 00:59:21,897 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>         All async disk service threads have been shut down.
>>>>         2012-08-13 00:59:21,898 ERROR
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>         java.net.BindException: Problem binding to /0.0.0.0:50010
>>>>         <http://0.0.0.0:50010/> : Address already in use
>>>>             at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>         Caused by: java.net.BindException: Address already in use
>>>>             at sun.nio.ch.Net.bind(Native Method)
>>>>             at sun.nio.ch
>>>>         <http://sun.nio.ch/>.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>             at sun.nio.ch
>>>>         <http://sun.nio.ch/>.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>             at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>             ... 7 more
>>>>
>>>>         2012-08-13 00:59:21,899 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>         /************************************************************
>>>>         SHUTDOWN_MSG: Shutting down DataNode at
>>>>         its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         <http://its-cs133.its.uni-kassel.de/141.51.205.43>
>>>>         ************************************************************/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>         \_____TastTracker
>>>>         With TaskTrackers it is the same: there are 2 kinds.
>>>>         ############################### LOG TYPE 1
>>>>         ############################################################
>>>>         2012-08-13 02:09:54,645 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Resending 'status' to
>>>>         'its-cs131' with reponseId '879
>>>>         2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 0 time(s).
>>>>         2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 1 time(s).
>>>>         2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 2 time(s).
>>>>         2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 3 time(s).
>>>>         2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 4 time(s).
>>>>         2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 5 time(s).
>>>>         2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 6 time(s).
>>>>         2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 7 time(s).
>>>>         2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 8 time(s).
>>>>         2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 9 time(s).
>>>>         2012-08-13 02:10:04,651 ERROR
>>>>         org.apache.hadoop.mapred.TaskTracker: Caught exception:
>>>>         java.net.ConnectException: Call to
>>>>         its-cs131/141.51.205.41:35555 <http://141.51.205.41:35555/>
>>>>         failed on connection exception: java.net.ConnectException:
>>>>         Connection refused
>>>>             at
>>>>         org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>             at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>             at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown
>>>>         Source)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>         Caused by: java.net.ConnectException: Connection refused
>>>>             at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>             at
>>>>         sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>             at org.apache.hadoop.net
>>>>         <http://org.apache.hadoop.net/>.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>             at org.apache.hadoop.net
>>>>         <http://org.apache.hadoop.net/>.NetUtils.connect(NetUtils.java:489)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>             ... 6 more
>>>>
>>>>
>>>>         ########################### LOG TYPE 2
>>>>         ############################################################
>>>>         2012-08-13 00:59:24,376 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>         /************************************************************
>>>>         STARTUP_MSG: Starting TaskTracker
>>>>         STARTUP_MSG:   host =
>>>>         its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         <http://its-cs133.its.uni-kassel.de/141.51.205.43>
>>>>         STARTUP_MSG:   args = []
>>>>         STARTUP_MSG:   version = 1.0.2
>>>>         STARTUP_MSG:   build =
>>>>         https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>         -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21
>>>>         UTC 2012
>>>>         ************************************************************/
>>>>         2012-08-13 00:59:24,569 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsConfig: loaded
>>>>         properties from hadoop-metrics2.properties
>>>>         2012-08-13 00:59:24,626 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source MetricsSystem,sub=Stats registered.
>>>>         2012-08-13 00:59:24,627 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>         Scheduled snapshot period at 10 second(s).
>>>>         2012-08-13 00:59:24,627 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>         TaskTracker metrics system started
>>>>         2012-08-13 00:59:24,950 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source ugi registered.
>>>>         2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>>>>         org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>>>         org.mortbay.log.Slf4jLog
>>>>         2012-08-13 00:59:25,206 INFO
>>>>         org.apache.hadoop.http.HttpServer: Added global
>>>>         filtersafety
>>>>         (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>         2012-08-13 00:59:25,232 INFO
>>>>         org.apache.hadoop.mapred.TaskLogsTruncater: Initializing
>>>>         logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>         2012-08-13 00:59:25,237 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Starting tasktracker
>>>>         with owner as bmacek
>>>>         2012-08-13 00:59:25,239 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Good mapred local
>>>>         directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>         2012-08-13 00:59:25,244 INFO
>>>>         org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>         native-hadoop library
>>>>         2012-08-13 00:59:25,255 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source jvm registered.
>>>>         2012-08-13 00:59:25,256 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source TaskTrackerMetrics registered.
>>>>         2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server:
>>>>         Starting SocketReader
>>>>         2012-08-13 00:59:25,282 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source RpcDetailedActivityForPort54850 registered.
>>>>         2012-08-13 00:59:25,282 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source RpcActivityForPort54850 registered.
>>>>         2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server Responder: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server listener on 54850: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server handler 0 on 54850: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server handler 1 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: TaskTracker up at:
>>>>         localhost/127.0.0.1:54850 <http://127.0.0.1:54850/>
>>>>         2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server handler 3 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server handler 2 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Starting tracker
>>>>         tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>         <http://127.0.0.1:54850/>
>>>>         2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 0 time(s).
>>>>         2012-08-13 00:59:38,104 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Starting thread:
>>>>         Map-events fetcher for all reduce tasks on
>>>>         tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>         <http://127.0.0.1:54850/>
>>>>         2012-08-13 00:59:38,120 INFO
>>>>         org.apache.hadoop.util.ProcessTree: setsid exited with exit
>>>>         code 0
>>>>         2012-08-13 00:59:38,134 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Using
>>>>         ResourceCalculatorPlugin :
>>>>         org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>         2012-08-13 00:59:38,137 WARN
>>>>         org.apache.hadoop.mapred.TaskTracker: TaskTracker's
>>>>         totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>>>>         disabled.
>>>>         2012-08-13 00:59:38,145 INFO
>>>>         org.apache.hadoop.mapred.IndexCache: IndexCache created
>>>>         with max memory = 10485760
>>>>         2012-08-13 00:59:38,158 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source ShuffleServerMetrics registered.
>>>>         2012-08-13 00:59:38,161 INFO
>>>>         org.apache.hadoop.http.HttpServer: Port returned by
>>>>         webServer.getConnectors()[0].getLocalPort() before open()
>>>>         is -1. Opening the listener on 50060
>>>>         2012-08-13 00:59:38,161 ERROR
>>>>         org.apache.hadoop.mapred.TaskTracker: Can not start task
>>>>         tracker because java.net.BindException: Address already in use
>>>>             at sun.nio.ch.Net.bind(Native Method)
>>>>             at sun.nio.ch
>>>>         <http://sun.nio.ch/>.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>             at sun.nio.ch
>>>>         <http://sun.nio.ch/>.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>             at
>>>>         org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>             at
>>>>         org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>
>>>>         2012-08-13 00:59:38,163 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>         /************************************************************
>>>>         SHUTDOWN_MSG: Shutting down TaskTracker at
>>>>         its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         <http://its-cs133.its.uni-kassel.de/141.51.205.43>
>>>>         ************************************************************/
>>>>
>>>>
>>>
>>
>
>


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
0.0.0.0 means that the call is going to all interfaces on the machine.  (Shouldn't be an issue...)

IPv4 vs IPv6? Could be an issue; however, the OP says he can write data to the DNs and they seem to communicate, so if it's IPv6-related, wouldn't it impact all traffic and not just a specific port?
I agree... shut down IPv6 if you can.
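If IPv6 can't be disabled system-wide, a common workaround is to pin Hadoop's JVMs to the IPv4 stack. A sketch of the config fragment (conf/hadoop-env.sh on every node; merge with any HADOOP_OPTS you already set there):

```
# conf/hadoop-env.sh -- force the JVM to use IPv4 sockets only
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_OPTS"
```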

I don't disagree with your assessment. I am just suggesting that before you do a really deep dive, you think about the more obvious stuff first. 

There are a couple of other things... like do all of the /etc/hosts files on all of the machines match? 
Is the OP using both /etc/hosts and DNS? If so, are they in sync? 

BTW, you said DNS in your response. if you're using DNS, then you don't really want to have much info in the /etc/hosts file except loopback and the server's IP address. 
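For illustration, a consistent /etc/hosts for this cluster might look like the fragment below. The IPs and its-cs133's FQDN are taken from the logs in this thread; the FQDN for its-cs131 is an assumption by analogy, not something the OP has confirmed:

```
127.0.0.1      localhost
141.51.205.41  its-cs131.its.uni-kassel.de  its-cs131
141.51.205.43  its-cs133.its.uni-kassel.de  its-cs133
```

Running `md5sum /etc/hosts` on each node is a quick way to see whether the copies match.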

Looking at the problem, the OP is indicating that some traffic works while other traffic doesn't. Most likely something is blocking the ports; iptables is the first place to look.
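As a quick way to check whether a given port is actually reachable from a node, a minimal probe can be sketched in Python (standard library only; the host/port values below are just examples pulled from the logs):

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused, timeouts, and DNS failures
        return False
```

For example, `port_open("its-cs131", 35554)` run from a worker node should return True if the NameNode RPC port is reachable; False points at a firewall, or a service that isn't listening.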

Just saying. ;-) 


On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hi Michael,
>        I asked for the hosts file because it looks like a loopback problem to me. The log shows the call going to 0.0.0.0. Apart from what you have said, I think disabling IPv6 and making sure there is no problem with DNS resolution is also necessary. Please correct me if I am wrong. Thank you.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
> Based on your /etc/hosts output, why aren't you using DNS? 
> 
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 
> 
> The issue isn't SSH. If you go to the node that is having trouble connecting to another node and try to ping it (or some other general communication) and it succeeds, your issue is that the port you're trying to communicate with is blocked. Then it's more than likely an IP configuration or firewall issue.
> 
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:
> 
>> Hi Michael,
>> 
>> Well, I can ssh from any node to any other without being prompted. The reason for this is that my home dir is mounted on every server in the cluster.
>> 
>> If the machines are multihomed: I don't know. I could ask, if this is of importance.
>> 
>> Shall I?
>> 
>> Regards,
>> Elmar
>> 
>> Am 13.08.12 14:59, schrieb Michael Segel:
>>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>>> 
>>> A more relevant question is if he's running a firewall on each of these machines? 
>>> 
>>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>>> 
>>> One other side note... are these machines multi-homed?
>>> 
>>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>> 
>>>> Hello there,
>>>> 
>>>>      Could you please share your /etc/hosts file, if you don't mind.
>>>> 
>>>> Regards,
>>>>     Mohammad Tariq
>>>> 
>>>> 
>>>> 
>>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>>> Hi,
>>>> 
>>>> I am currently trying to run my Hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties with their communication, as their logs say:
>>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be due to reasons correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd be looking into the mechanics resolving localhost.)
>>>> * The other nodes cannot speak with the namenode and jobtracker (its-cs131), although it is absolutely not clear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>>> 
>>>> Is there any reason why this might happen?
>>>> 
>>>> 
>>>> Regards,
>>>> Elmar
>>>> 
>>>> LOGS BELOW:
>>>> 
>>>> \____Datanodes
>>>> 
>>>> After successfully putting the data to HDFS (at this point I thought namenode and datanodes have to communicate), I get the following errors when starting the job:
>>>> 
>>>> There are 2 kinds of logs I found: the first one is big (about 12 MB) and looks like this:
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 5 more
>>>> 
>>>> ... (this continues til the end of the log)
>>>> 
>>>> The second is the short kind:
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting DataNode
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>> Caused by: java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>     ... 7 more
>>>> 
>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> \_____TaskTracker
>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 6 more
>>>> 
>>>> 
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting TaskTracker
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>> 
>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>> 
>>> 
>> 
> 
> 


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Hi,

With "using DNS" you mean using the servers' non-IP names, right?
If so, I do use DNS. Since I am working in a SLURM environment and get 
a list of nodes for every job I schedule, I construct the config files 
for every job by taking the list of assigned nodes and dividing the 
roles (NameNode, JobTracker, SecondaryNameNode, TaskTrackers, DataNodes) 
over this set of machines. SLURM offers me names like 
"its-cs<nodenumber>", which is enough for ssh to connect, though maybe 
not for all Hadoop processes. The complete names would be 
"its-cs<nodenumber>.its.uni-kassel.de". I will add this part of the 
address for testing. But I fear it won't help a lot, because the 
JobTracker's log already seems to know the full names:
###
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000887 has split on 
node:/default-rack/its-cs202.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000888 has split on 
node:/default-rack/its-cs202.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000889 has split on 
node:/default-rack/its-cs195.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000890 has split on 
node:/default-rack/its-cs196.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000891 has split on 
node:/default-rack/its-cs201.its.uni-kassel.de
###
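
For illustration only, a minimal sketch of what that per-job role
assignment could look like in shell. The node names below are
stand-ins (a real script would expand $SLURM_JOB_NODELIST, e.g. with
`scontrol show hostnames`), and the masters/slaves file names follow
the usual Hadoop 1.x conventions:

```shell
# Stand-in for the expanded SLURM node list (one short name per node);
# appending the domain suffix yields the FQDNs discussed above.
DOMAIN=".its.uni-kassel.de"
NODES="its-cs131 its-cs133 its-cs195"

set -- $NODES
MASTER="$1$DOMAIN"; shift        # first node: NameNode + JobTracker
echo "$MASTER" > masters
: > slaves
for n in "$@"; do
    echo "$n$DOMAIN" >> slaves   # remaining nodes: DataNode + TaskTracker
done
```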

Pings work, by the way: I could ping the NameNode from all problematic 
nodes. And lsof -i did not show any other programs occupying the 
problematic ports on the NameNode/JobTracker node. :( One thing worth 
noting: the NameNode/JobTracker process is currently not running 
anymore, although the DataNode/TaskTracker logs are still growing.
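
As a hedged sketch, this kind of unprivileged TCP probe can tell
"connection refused" apart from "host unreachable" without root. It
assumes bash (for the /dev/tcp pseudo-device) and coreutils `timeout`;
host name and ports are the ones from the logs:

```shell
# port_open HOST PORT: succeeds only if a TCP connect to HOST:PORT completes.
# Uses bash's /dev/tcp redirection, so no extra tools or privileges needed.
port_open() {
    timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Probe the NameNode/JobTracker RPC ports seen in the retry loops above.
for port in 35554 35555; do
    if port_open its-cs131 "$port"; then
        echo "its-cs131:$port reachable"
    else
        echo "its-cs131:$port refused or filtered"
    fi
done
```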


Concerning IPv6: as far as I can see, I would have to modify global 
config files to disable it. Since I am only a user of this cluster, 
with very limited insight into why the machines are configured the way 
they are, I want to be careful about asking the technicians to make 
changes to their setup. I don't want to be disrespectful.
I will try using the full names first, and if that doesn't help, I 
will of course ask them whether other options are available.
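
One commonly suggested middle ground, sketched here as a suggestion
rather than something from this thread: instead of disabling IPv6
system-wide, ask the JVM to prefer the IPv4 stack from the per-user
Hadoop configuration, which needs no admin rights:

```shell
# In conf/hadoop-env.sh of your own Hadoop install (no root needed):
# make all Hadoop daemons' JVMs prefer IPv4 sockets over IPv6.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
```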


Am 13.08.12 16:12, schrieb Mohammad Tariq:
> Hi Michael,
>        I asked for hosts file because there seems to be some loopback 
> prob to me. The log shows that call is going at 0.0.0.0. Apart from 
> what you have said, I think disabling IPv6 and making sure that there 
> is no prob with the DNS resolution is also necessary. Please correct 
> me if I am wrong. Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel 
> <michael_segel@hotmail.com <ma...@hotmail.com>> wrote:
>
>     Based on your /etc/hosts output, why aren't you using DNS?
>
>     Outside of MapR, multihomed machines can be problematic. Hadoop
>     doesn't generally work well when you're not using the FQDN or its
>     alias.
>
>     The issue isn't the SSH, but if you go to the node which is having
>     trouble connecting to another node,  then try to ping it, or some
>     other general communication,  if it succeeds, your issue is that
>     the port you're trying to communicate with is blocked.  Then its
>     more than likely an ipconfig or firewall issue.
>
>     On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>     <ema@cs.uni-kassel.de <ma...@cs.uni-kassel.de>> wrote:
>
>>     Hi Michael,
>>
>>     well i can ssh from any node to any other without being prompted.
>>     The reason for this is, that my home dir is mounted in every
>>     server in the cluster.
>>
>>     If the machines are multihomed: i dont know. i could ask if this
>>     would be of importance.
>>
>>     Shall i?
>>
>>     Regards,
>>     Elmar
>>
>>     Am 13.08.12 14:59, schrieb Michael Segel:
>>>     If the nodes can communicate and distribute data, then the odds
>>>     are that the issue isn't going to be in his /etc/hosts.
>>>
>>>     A more relevant question is if he's running a firewall on each
>>>     of these machines?
>>>
>>>     A simple test... ssh to one node, ping other nodes and the
>>>     control nodes at random to see if they can see one another. Then
>>>     check to see if there is a firewall running which would limit
>>>     the types of traffic between nodes.
>>>
>>>     One other side note... are these machines multi-homed?
>>>
>>>     On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com
>>>     <ma...@gmail.com>> wrote:
>>>
>>>>     Hello there,
>>>>
>>>>          Could you please share your /etc/hosts file, if you don't
>>>>     mind.
>>>>
>>>>     Regards,
>>>>         Mohammad Tariq
>>>>
>>>>
>>>>
>>>>     On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>     <macek@cs.uni-kassel.de <ma...@cs.uni-kassel.de>> wrote:
>>>>
>>>>         Hi,
>>>>
>>>>         i am currently trying to run my hadoop program on a
>>>>         cluster. Sadly though my datanodes and tasktrackers seem to
>>>>         have difficulties with their communication as their logs say:
>>>>         * Some datanodes and tasktrackers seem to have portproblems
>>>>         of some kind as it can be seen in the logs below. I
>>>>         wondered if this might be due to reasons correllated with
>>>>         the localhost entry in /etc/hosts as you can read in alot
>>>>         of posts with similar errors, but i checked the file
>>>>         neither localhost nor 127.0.0.1/127.0.1.1
>>>>         <http://127.0.0.1/127.0.1.1> is bound there. (although you
>>>>         can ping localhost... the technician of the cluster said
>>>>         he'd be looking for the mechanics resolving localhost)
>>>>         * The other nodes can not speak with the namenode and
>>>>         jobtracker (its-cs131). Although it is absolutely not
>>>>         clear, why this is happening: the "dfs -put" i do directly
>>>>         before the job is running fine, which seems to imply that
>>>>         communication between those servers is working flawlessly.
>>>>
>>>>         Is there any reason why this might happen?
>>>>
>>>>
>>>>         Regards,
>>>>         Elmar
>>>>
>>>>         LOGS BELOW:
>>>>
>>>>         \____Datanodes
>>>>
>>>>         After successfully putting the data to hdfs (at this point
>>>>         i thought namenode and datanodes have to communicate), i
>>>>         get the following errors when starting the job:
>>>>
>>>>         There are 2 kinds of logs i found: the first one is big
>>>>         (about 12MB) and looks like this:
>>>>         ############################### LOG TYPE 1
>>>>         ############################################################
>>>>         2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 0 time(s).
>>>>         2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 1 time(s).
>>>>         2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 2 time(s).
>>>>         2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 3 time(s).
>>>>         2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 4 time(s).
>>>>         2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 5 time(s).
>>>>         2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 6 time(s).
>>>>         2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 7 time(s).
>>>>         2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 8 time(s).
>>>>         2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 9 time(s).
>>>>         2012-08-13 08:23:36,335 WARN
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>         java.net.ConnectException: Call to
>>>>         its-cs131/141.51.205.41:35554 <http://141.51.205.41:35554/>
>>>>         failed on connection exception: java.net.ConnectException:
>>>>         Connection refused
>>>>             at
>>>>         org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>             at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>             at $Proxy5.sendHeartbeat(Unknown Source)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>             at java.lang.Thread.run(Thread.java:619)
>>>>         Caused by: java.net.ConnectException: Connection refused
>>>>             at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>             at
>>>>         sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>             at org.apache.hadoop.net
>>>>         <http://org.apache.hadoop.net/>.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>             at org.apache.hadoop.net
>>>>         <http://org.apache.hadoop.net/>.NetUtils.connect(NetUtils.java:489)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>             ... 5 more
>>>>
>>>>         ... (this continues til the end of the log)
>>>>
>>>>         The second is short kind:
>>>>         ########################### LOG TYPE 2
>>>>         ############################################################
>>>>         2012-08-13 00:59:19,038 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>>         /************************************************************
>>>>         STARTUP_MSG: Starting DataNode
>>>>         STARTUP_MSG:   host =
>>>>         its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         <http://its-cs133.its.uni-kassel.de/141.51.205.43>
>>>>         STARTUP_MSG:   args = []
>>>>         STARTUP_MSG:   version = 1.0.2
>>>>         STARTUP_MSG:   build =
>>>>         https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>         -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21
>>>>         UTC 2012
>>>>         ************************************************************/
>>>>         2012-08-13 00:59:19,203 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsConfig: loaded
>>>>         properties from hadoop-metrics2.properties
>>>>         2012-08-13 00:59:19,216 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source MetricsSystem,sub=Stats registered.
>>>>         2012-08-13 00:59:19,217 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>         Scheduled snapshot period at 10 second(s).
>>>>         2012-08-13 00:59:19,218 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode
>>>>         metrics system started
>>>>         2012-08-13 00:59:19,306 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source ugi registered.
>>>>         2012-08-13 00:59:19,346 INFO
>>>>         org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>         native-hadoop library
>>>>         2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554
>>>>         <http://141.51.205.41:35554/>. Already tried 0 time(s).
>>>>         2012-08-13 00:59:21,584 INFO
>>>>         org.apache.hadoop.hdfs.server.common.Storage: Storage
>>>>         directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>>         2012-08-13 00:59:21,584 INFO
>>>>         org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>>         2012-08-13 00:59:21,787 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>>>         FSDatasetStatusMBean
>>>>         2012-08-13 00:59:21,897 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>         Shutting down all async disk service threads...
>>>>         2012-08-13 00:59:21,897 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>>>         All async disk service threads have been shut down.
>>>>         2012-08-13 00:59:21,898 ERROR
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>         java.net.BindException: Problem binding to /0.0.0.0:50010
>>>>         <http://0.0.0.0:50010/> : Address already in use
>>>>             at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>>         Caused by: java.net.BindException: Address already in use
>>>>             at sun.nio.ch.Net.bind(Native Method)
>>>>             at sun.nio.ch
>>>>         <http://sun.nio.ch/>.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>             at sun.nio.ch
>>>>         <http://sun.nio.ch/>.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>             at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>             ... 7 more
>>>>
>>>>         2012-08-13 00:59:21,899 INFO
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>>         /************************************************************
>>>>         SHUTDOWN_MSG: Shutting down DataNode at
>>>>         its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         <http://its-cs133.its.uni-kassel.de/141.51.205.43>
>>>>         ************************************************************/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>         \_____TastTracker
>>>>         With TaskTrackers it is the same: there are 2 kinds.
>>>>         ############################### LOG TYPE 1
>>>>         ############################################################
>>>>         2012-08-13 02:09:54,645 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Resending 'status' to
>>>>         'its-cs131' with reponseId '879
>>>>         2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 0 time(s).
>>>>         2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 1 time(s).
>>>>         2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 2 time(s).
>>>>         2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 3 time(s).
>>>>         2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 4 time(s).
>>>>         2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 5 time(s).
>>>>         2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 6 time(s).
>>>>         2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 7 time(s).
>>>>         2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 8 time(s).
>>>>         2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 9 time(s).
>>>>         2012-08-13 02:10:04,651 ERROR
>>>>         org.apache.hadoop.mapred.TaskTracker: Caught exception:
>>>>         java.net.ConnectException: Call to
>>>>         its-cs131/141.51.205.41:35555 <http://141.51.205.41:35555/>
>>>>         failed on connection exception: java.net.ConnectException:
>>>>         Connection refused
>>>>             at
>>>>         org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>             at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>             at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown
>>>>         Source)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>>         Caused by: java.net.ConnectException: Connection refused
>>>>             at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>             at
>>>>         sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>             at org.apache.hadoop.net
>>>>         <http://org.apache.hadoop.net/>.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>             at org.apache.hadoop.net
>>>>         <http://org.apache.hadoop.net/>.NetUtils.connect(NetUtils.java:489)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>             ... 6 more
>>>>
>>>>
>>>>         ########################### LOG TYPE 2
>>>>         ############################################################
>>>>         2012-08-13 00:59:24,376 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>>         /************************************************************
>>>>         STARTUP_MSG: Starting TaskTracker
>>>>         STARTUP_MSG:   host =
>>>>         its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         <http://its-cs133.its.uni-kassel.de/141.51.205.43>
>>>>         STARTUP_MSG:   args = []
>>>>         STARTUP_MSG:   version = 1.0.2
>>>>         STARTUP_MSG:   build =
>>>>         https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>>>         -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21
>>>>         UTC 2012
>>>>         ************************************************************/
>>>>         2012-08-13 00:59:24,569 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsConfig: loaded
>>>>         properties from hadoop-metrics2.properties
>>>>         2012-08-13 00:59:24,626 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source MetricsSystem,sub=Stats registered.
>>>>         2012-08-13 00:59:24,627 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>         Scheduled snapshot period at 10 second(s).
>>>>         2012-08-13 00:59:24,627 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
>>>>         TaskTracker metrics system started
>>>>         2012-08-13 00:59:24,950 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source ugi registered.
>>>>         2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>>>>         org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>>>         org.mortbay.log.Slf4jLog
>>>>         2012-08-13 00:59:25,206 INFO
>>>>         org.apache.hadoop.http.HttpServer: Added global
>>>>         filtersafety
>>>>         (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>>         2012-08-13 00:59:25,232 INFO
>>>>         org.apache.hadoop.mapred.TaskLogsTruncater: Initializing
>>>>         logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>         2012-08-13 00:59:25,237 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Starting tasktracker
>>>>         with owner as bmacek
>>>>         2012-08-13 00:59:25,239 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Good mapred local
>>>>         directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>>         2012-08-13 00:59:25,244 INFO
>>>>         org.apache.hadoop.util.NativeCodeLoader: Loaded the
>>>>         native-hadoop library
>>>>         2012-08-13 00:59:25,255 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source jvm registered.
>>>>         2012-08-13 00:59:25,256 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source TaskTrackerMetrics registered.
>>>>         2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server:
>>>>         Starting SocketReader
>>>>         2012-08-13 00:59:25,282 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source RpcDetailedActivityForPort54850 registered.
>>>>         2012-08-13 00:59:25,282 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source RpcActivityForPort54850 registered.
>>>>         2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server Responder: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server listener on 54850: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server handler 0 on 54850: starting
>>>>         2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server handler 1 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: TaskTracker up at:
>>>>         localhost/127.0.0.1:54850 <http://127.0.0.1:54850/>
>>>>         2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server handler 3 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server:
>>>>         IPC Server handler 2 on 54850: starting
>>>>         2012-08-13 00:59:25,289 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Starting tracker
>>>>         tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>         <http://127.0.0.1:54850/>
>>>>         2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35555
>>>>         <http://141.51.205.41:35555/>. Already tried 0 time(s).
>>>>         2012-08-13 00:59:38,104 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Starting thread:
>>>>         Map-events fetcher for all reduce tasks on
>>>>         tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>>         <http://127.0.0.1:54850/>
>>>>         2012-08-13 00:59:38,120 INFO
>>>>         org.apache.hadoop.util.ProcessTree: setsid exited with exit
>>>>         code 0
>>>>         2012-08-13 00:59:38,134 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: Using
>>>>         ResourceCalculatorPlugin :
>>>>         org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>>         2012-08-13 00:59:38,137 WARN
>>>>         org.apache.hadoop.mapred.TaskTracker: TaskTracker's
>>>>         totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>>>>         disabled.
>>>>         2012-08-13 00:59:38,145 INFO
>>>>         org.apache.hadoop.mapred.IndexCache: IndexCache created
>>>>         with max memory = 10485760
>>>>         2012-08-13 00:59:38,158 INFO
>>>>         org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean
>>>>         for source ShuffleServerMetrics registered.
>>>>         2012-08-13 00:59:38,161 INFO
>>>>         org.apache.hadoop.http.HttpServer: Port returned by
>>>>         webServer.getConnectors()[0].getLocalPort() before open()
>>>>         is -1. Opening the listener on 50060
>>>>         2012-08-13 00:59:38,161 ERROR
>>>>         org.apache.hadoop.mapred.TaskTracker: Can not start task
>>>>         tracker because java.net.BindException: Address already in use
>>>>             at sun.nio.ch.Net.bind(Native Method)
>>>>             at
>>>>         sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>             at
>>>>         sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>             at
>>>>         org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>             at
>>>>         org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>             at
>>>>         org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>
>>>>         2012-08-13 00:59:38,163 INFO
>>>>         org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>>         /************************************************************
>>>>         SHUTDOWN_MSG: Shutting down TaskTracker at
>>>>         its-cs133.its.uni-kassel.de/141.51.205.43
>>>>         ************************************************************/
>>>>
>>>>
>>>
>>
>
>


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
0.0.0.0 means the daemon is binding to all interfaces on the machine. (Shouldn't be an issue...)

IPv4 vs IPv6? Could be an issue, however the OP says he can write data to the DNs and they seem to communicate, so if it were IPv6-related, wouldn't it impact all traffic and not just a specific port?
I agree... shut down IPv6 if you can.

I don't disagree with your assessment. I am just suggesting that before you do a really deep dive, you think about the more obvious stuff first.

There are a couple of other things... like, do all of the /etc/hosts files on all of the machines match?
Is the OP using both /etc/hosts and DNS? If so, are they in sync?

BTW, you said DNS in your response. If you're using DNS, then you don't really want much info in the /etc/hosts file except the loopback entry and the server's own IP address.

Looking at the problem: the OP is indicating that some traffic works while other traffic doesn't. Most likely something is blocking the ports. iptables is the first place to look.

Just saying. ;-) 
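One quick check for that (a minimal sketch; its-cs131 and port 35555 are taken from the OP's logs, adjust for your cluster) is to probe the port directly from a worker node:

```shell
# Probe a TCP port using bash's built-in /dev/tcp -- no extra tools needed.
# its-cs131:35555 is the JobTracker address from the logs in this thread.
host=its-cs131
port=35555

if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "port $port on $host is reachable"
else
    echo "port $port on $host is NOT reachable (refused, filtered, or down)"
fi

# On the master itself you can then check whether anything is listening,
# and whether a firewall rule could be dropping the traffic:
#   netstat -tlnp | grep 35555
#   iptables -L -n
```

If the probe fails while plain ping succeeds, it's the port that is blocked, not the host.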


On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hi Michael,
>        I asked for hosts file because there seems to be some loopback prob to me. The log shows that call is going at 0.0.0.0. Apart from what you have said, I think disabling IPv6 and making sure that there is no prob with the DNS resolution is also necessary. Please correct me if I am wrong. Thank you.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
> Based on your /etc/hosts output, why aren't you using DNS? 
> 
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 
> 
> The issue isn't the SSH, but if you go to the node which is having trouble connecting to another node,  then try to ping it, or some other general communication,  if it succeeds, your issue is that the port you're trying to communicate with is blocked.  Then its more than likely an ipconfig or firewall issue.
> 
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:
> 
>> Hi Michael,
>> 
>> well i can ssh from any node to any other without being prompted. The reason for this is, that my home dir is mounted in every server in the cluster. 
>> 
>> If the machines are multihomed: i dont know. i could ask if this would be of importance.
>> 
>> Shall i?
>> 
>> Regards,
>> Elmar
>> 
>> Am 13.08.12 14:59, schrieb Michael Segel:
>>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>>> 
>>> A more relevant question is if he's running a firewall on each of these machines? 
>>> 
>>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>>> 
>>> One other side note... are these machines multi-homed?
>>> 
>>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>> 
>>>> Hello there,
>>>> 
>>>>      Could you please share your /etc/hosts file, if you don't mind.
>>>> 
>>>> Regards,
>>>>     Mohammad Tariq
>>>> 
>>>> 
>>>> 
>>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>>> Hi,
>>>> 
>>>> i am currently trying to run my hadoop program on a cluster. Sadly though my datanodes and tasktrackers seem to have difficulties with their communication as their logs say:
>>>> * Some datanodes and tasktrackers seem to have portproblems of some kind as it can be seen in the logs below. I wondered if this might be due to reasons correllated with the localhost entry in /etc/hosts as you can read in alot of posts with similar errors, but i checked the file neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (although you can ping localhost... the technician of the cluster said he'd be looking for the mechanics resolving localhost)
>>>> * The other nodes can not speak with the namenode and jobtracker (its-cs131). Although it is absolutely not clear, why this is happening: the "dfs -put" i do directly before the job is running fine, which seems to imply that communication between those servers is working flawlessly.
>>>> 
>>>> Is there any reason why this might happen?
>>>> 
>>>> 
>>>> Regards,
>>>> Elmar
>>>> 
>>>> LOGS BELOW:
>>>> 
>>>> \____Datanodes
>>>> 
>>>> After successfully putting the data to hdfs (at this point i thought namenode and datanodes have to communicate), i get the following errors when starting the job:
>>>> 
>>>> There are 2 kinds of logs i found: the first one is big (about 12MB) and looks like this:
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 5 more
>>>> 
>>>> ... (this continues til the end of the log)
>>>> 
>>>> The second is short kind:
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting DataNode
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>> Caused by: java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>     ... 7 more
>>>> 
>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> \_____TastTracker
>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 6 more
>>>> 
>>>> 
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting TaskTracker
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>> 
>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>> 
>>> 
>> 
> 
> 


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Hi,

with "using DNS" you mean using the servers' non-IP names, right?
If so, i do use DNS. Since i am working in a SLURM environment and get 
a list of nodes for every job i schedule, i construct the config files 
for every job by taking the list of assigned nodes and dividing the 
roles (NameNode, JobTracker, SecondaryNameNode, TaskTrackers, DataNodes) 
over this set of machines. SLURM offers me names like "its-cs<nodenumber>", 
which is enough for ssh to connect - maybe it isn't for all hadoop 
processes. The complete names would be 
"its-cs<nodenumber>.its.uni-kassel.de". I will add this part of the 
address for testing. But i fear it won't help a lot, because the 
JobTracker's log seems to know the full names:
###
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000887 has split on 
node:/default-rack/its-cs202.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000888 has split on 
node:/default-rack/its-cs202.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000889 has split on 
node:/default-rack/its-cs195.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000890 has split on 
node:/default-rack/its-cs196.its.uni-kassel.de
2012-08-13 01:12:02,770 INFO org.apache.hadoop.mapred.JobInProgress: 
tip:task_201208130059_0001_m_000891 has split on 
node:/default-rack/its-cs201.its.uni-kassel.de
###

Pings work btw: i could ping the NameNode from all problematic nodes. 
And lsof -i didn't yield any other programs running on the 
NameNode/JobTracker node with the problematic ports. :( Maybe something 
to notice is, that the NameNode/JobTracker server is at the moment not 
running anymore, although the DataNode/TaskTracker logs are still growing.
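For the "Address already in use" part, one possible explanation (an assumption on my side) is a stale daemon left over from an earlier job still holding the port. A minimal check on the affected node, with the ports taken from the BindException messages:

```shell
# Look for whatever is already bound to the DataNode/TaskTracker ports.
# 50010 and 50060 are the ports from the BindException messages in the logs.
for p in 50010 50060; do
    echo "== port $p =="
    lsof -iTCP:"$p" -sTCP:LISTEN 2>/dev/null || true   # owning PID, if lsof is available
    netstat -tln 2>/dev/null | grep ":$p " || true     # fallback check
done
# If a stale DataNode/TaskTracker shows up here, killing it and restarting
# the daemon should clear the BindException.
```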


Concerning IPv6: as far as i can see i would have to modify global 
config files to disable it. Since i am only a user of this cluster with 
very limited insight into why the machines are configured the way they 
are, i want to be very careful about asking the technicians to make 
changes to their setup. I don't want to be disrespectful.
I will try using the full names first and if this doesn't help, i will 
of course ask them if no other options are available.
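Since the config files are generated per job anyway, the domain could be appended at generation time. A rough sketch (the domain and the fallback node names are assumptions based on this thread; `scontrol show hostnames` does the SLURM nodelist expansion):

```shell
# Expand the SLURM nodelist into fully qualified names for masters/slaves.
DOMAIN=its.uni-kassel.de
CONF=${HADOOP_CONF_DIR:-/tmp/hadoop-conf}
mkdir -p "$CONF"

# scontrol turns a compressed list like "its-cs[131-133]" into one name per
# line; fall back to sample names so the sketch runs outside SLURM too.
nodes=$(scontrol show hostnames "$SLURM_JOB_NODELIST" 2>/dev/null \
        || printf 'its-cs131\nits-cs132\nits-cs133\n')

# First node becomes NameNode/JobTracker, the rest become workers.
master=$(echo "$nodes" | head -n 1)
echo "$master.$DOMAIN" > "$CONF/masters"
echo "$nodes" | tail -n +2 | sed "s/\$/.$DOMAIN/" > "$CONF/slaves"

cat "$CONF/masters" "$CONF/slaves"
```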


Am 13.08.12 16:12, schrieb Mohammad Tariq:
> Hi Michael,
>        I asked for hosts file because there seems to be some loopback 
> prob to me. The log shows that call is going at 0.0.0.0. Apart from 
> what you have said, I think disabling IPv6 and making sure that there 
> is no prob with the DNS resolution is also necessary. Please correct 
> me if I am wrong. Thank you.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel 
> <michael_segel@hotmail.com> wrote:
>
>     Based on your /etc/hosts output, why aren't you using DNS?
>
>     Outside of MapR, multihomed machines can be problematic. Hadoop
>     doesn't generally work well when you're not using the FQDN or its
>     alias.
>
>     The issue isn't the SSH, but if you go to the node which is having
>     trouble connecting to another node,  then try to ping it, or some
>     other general communication,  if it succeeds, your issue is that
>     the port you're trying to communicate with is blocked.  Then its
>     more than likely an ipconfig or firewall issue.
>
>     On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek
>     <ema@cs.uni-kassel.de> wrote:
>
>>     Hi Michael,
>>
>>     well i can ssh from any node to any other without being prompted.
>>     The reason for this is, that my home dir is mounted in every
>>     server in the cluster.
>>
>>     If the machines are multihomed: i dont know. i could ask if this
>>     would be of importance.
>>
>>     Shall i?
>>
>>     Regards,
>>     Elmar
>>
>>     Am 13.08.12 14:59, schrieb Michael Segel:
>>>     If the nodes can communicate and distribute data, then the odds
>>>     are that the issue isn't going to be in his /etc/hosts.
>>>
>>>     A more relevant question is if he's running a firewall on each
>>>     of these machines?
>>>
>>>     A simple test... ssh to one node, ping other nodes and the
>>>     control nodes at random to see if they can see one another. Then
>>>     check to see if there is a firewall running which would limit
>>>     the types of traffic between nodes.
>>>
>>>     One other side note... are these machines multi-homed?
>>>
>>>     On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>>>
>>>>     Hello there,
>>>>
>>>>          Could you please share your /etc/hosts file, if you don't
>>>>     mind.
>>>>
>>>>     Regards,
>>>>         Mohammad Tariq
>>>>
>>>>
>>>>
>>>>     On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
>>>>     <macek@cs.uni-kassel.de> wrote:
>>>>
>>>>         Hi,
>>>>
>>>>         i am currently trying to run my hadoop program on a
>>>>         cluster. Sadly though my datanodes and tasktrackers seem to
>>>>         have difficulties with their communication as their logs say:
>>>>         * Some datanodes and tasktrackers seem to have portproblems
>>>>         of some kind as it can be seen in the logs below. I
>>>>         wondered if this might be due to reasons correllated with
>>>>         the localhost entry in /etc/hosts as you can read in alot
>>>>         of posts with similar errors, but i checked the file
>>>>         neither localhost nor 127.0.0.1/127.0.1.1
>>>>         is bound there. (although you
>>>>         can ping localhost... the technician of the cluster said
>>>>         he'd be looking for the mechanics resolving localhost)
>>>>         * The other nodes can not speak with the namenode and
>>>>         jobtracker (its-cs131). Although it is absolutely not
>>>>         clear, why this is happening: the "dfs -put" i do directly
>>>>         before the job is running fine, which seems to imply that
>>>>         communication between those servers is working flawlessly.
>>>>
>>>>         Is there any reason why this might happen?
>>>>
>>>>
>>>>         Regards,
>>>>         Elmar
>>>>
>>>>         LOGS BELOW:
>>>>
>>>>         \____Datanodes
>>>>
>>>>         After successfully putting the data to hdfs (at this point
>>>>         i thought namenode and datanodes have to communicate), i
>>>>         get the following errors when starting the job:
>>>>
>>>>         There are 2 kinds of logs i found: the first one is big
>>>>         (about 12MB) and looks like this:
>>>>         ############################### LOG TYPE 1
>>>>         ############################################################
>>>>         2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 0 time(s).
>>>>         2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 1 time(s).
>>>>         2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 2 time(s).
>>>>         2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 3 time(s).
>>>>         2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 4 time(s).
>>>>         2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 5 time(s).
>>>>         2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 6 time(s).
>>>>         2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 7 time(s).
>>>>         2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 8 time(s).
>>>>         2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client:
>>>>         Retrying connect to server: its-cs131/141.51.205.41:35554.
>>>>         Already tried 9 time(s).
>>>>         2012-08-13 08:23:36,335 WARN
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode:
>>>>         java.net.ConnectException: Call to
>>>>         its-cs131/141.51.205.41:35554 <http://141.51.205.41:35554/>
>>>>         failed on connection exception: java.net.ConnectException:
>>>>         Connection refused
>>>>             at
>>>>         org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>             at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>             at $Proxy5.sendHeartbeat(Unknown Source)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>             at
>>>>         org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>             at java.lang.Thread.run(Thread.java:619)
>>>>         Caused by: java.net.ConnectException: Connection refused
>>>>             at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>             at
>>>>         sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>             at org.apache.hadoop.net
>>>>         <http://org.apache.hadoop.net/>.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>             at org.apache.hadoop.net
>>>>         <http://org.apache.hadoop.net/>.NetUtils.connect(NetUtils.java:489)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>             at
>>>>         org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>             at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>             ... 5 more
>>>>
>>>>         ... (this continues til the end of the log)
>>>>
>>>> The second is the short kind:
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting DataNode
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>> Caused by: java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>     ... 7 more
>>>>
>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>>
>>>>
>>>>
>>>> \_____TaskTracker
>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 6 more
>>>>
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting TaskTracker
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>>
>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>>
>>>>
>>>
>>
>
>


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
0.0.0.0 means that the call is going to all interfaces on the machine.  (Shouldn't be an issue...)

IPv4 vs IPv6? Could be an issue; however, the OP says he can write data to DNs and they seem to communicate, so if it's IPv6 related, wouldn't it impact all traffic and not just a specific port?
I agree... shut down IPv6 if you can.

I don't disagree with your assessment. I am just suggesting that before you do a really deep dive, you think about the more obvious stuff first. 

There are a couple of other things... like do all of the /etc/hosts files on all of the machines match? 
Is the OP using both /etc/hosts and DNS? If so, are they in sync? 
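
A rough way to check that is to normalize each node's hosts file into a name-to-IP map and diff the maps. This is just an illustrative sketch (the hostnames/IPs in the test data are taken from the logs, the helper names are mine), not anything Hadoop ships:

```python
# Sketch: parse /etc/hosts content into {hostname: ip} and report
# entries where two nodes disagree. Run parse_hosts() on the text of
# each node's /etc/hosts (e.g. fetched over ssh) and compare.

def parse_hosts(text):
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:  # a line may list several aliases
            mapping[name] = ip
    return mapping

def hosts_mismatches(text_a, text_b):
    a, b = parse_hosts(text_a), parse_hosts(text_b)
    return {n: (a.get(n), b.get(n))
            for n in set(a) | set(b) if a.get(n) != b.get(n)}
```

Any name that resolves to different addresses on different nodes (a classic one is a worker's own name pointing at 127.0.1.1 locally) is exactly the kind of skew that makes daemons register or bind on the wrong interface.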

BTW, you said DNS in your response. If you're using DNS, then you don't really want to have much info in the /etc/hosts file except loopback and the server's IP address. 

Looking at the problem OP is indicating some traffic works, while other traffic doesn't. Most likely something is blocking the ports. Iptables is the first place to look. 

Just saying. ;-) 
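
To separate "nothing listening" from "port filtered", a quick probe from a worker against the master's RPC ports helps: a fast "connection refused" means you reached the host but no daemon holds that port, while a timeout usually means a firewall or routing problem. A minimal sketch (ports 35554/35555 are the ones from the OP's logs, not defaults):

```python
import errno
import socket

# Probe a TCP port roughly the way the Hadoop IPC client connects.
# Returns "open", "refused" (host reached, nothing listening there ->
# daemon down or bound to another port/interface), or
# "filtered/unreachable" (likely iptables or routing).
def probe(host, port, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "filtered/unreachable"
    except OSError as e:
        if e.errno == errno.ECONNREFUSED:
            return "refused"
        return "filtered/unreachable"

# e.g., from a worker node: probe("its-cs131", 35554), probe("its-cs131", 35555)
```

Given the "Connection refused" in the OP's stack traces, the workers are reaching its-cs131 but nothing is answering on 35554/35555, which points at the namenode/jobtracker not listening there rather than at a blocked port.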


On Aug 13, 2012, at 9:12 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hi Michael,
>        I asked for the hosts file because there seems to be some loopback problem to me. The log shows that the call is going to 0.0.0.0. Apart from what you have said, I think disabling IPv6 and making sure that there is no problem with the DNS resolution is also necessary. Please correct me if I am wrong. Thank you.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com> wrote:
> Based on your /etc/hosts output, why aren't you using DNS? 
> 
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 
> 
> The issue isn't the SSH, but if you go to the node which is having trouble connecting to another node, then try to ping it, or some other general communication; if it succeeds, your issue is that the port you're trying to communicate with is blocked. Then it's more than likely an ipconfig or firewall issue.
> 
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:
> 
>> Hi Michael,
>> 
>> Well, I can ssh from any node to any other without being prompted. The reason for this is that my home dir is mounted on every server in the cluster. 
>> 
>> If the machines are multihomed: I don't know. I could ask if this would be of importance.
>> 
>> Shall i?
>> 
>> Regards,
>> Elmar
>> 
>> Am 13.08.12 14:59, schrieb Michael Segel:
>>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>>> 
>>> A more relevant question is if he's running a firewall on each of these machines? 
>>> 
>>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>>> 
>>> One other side note... are these machines multi-homed?
>>> 
>>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>>> 
>>>> Hello there,
>>>> 
>>>>      Could you please share your /etc/hosts file, if you don't mind.
>>>> 
>>>> Regards,
>>>>     Mohammad Tariq
>>>> 
>>>> 
>>>> 
>>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>>> Hi,
>>>> 
>>>> I am currently trying to run my hadoop program on a cluster. Sadly, though, my datanodes and tasktrackers seem to have difficulties with their communication, as their logs say:
>>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be due to reasons correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file and neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd be looking for the mechanism resolving localhost.)
>>>> * The other nodes cannot speak with the namenode and jobtracker (its-cs131), although it is absolutely not clear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>>> 
>>>> Is there any reason why this might happen?
>>>> 
>>>> 
>>>> Regards,
>>>> Elmar
>>>> 
>>>> LOGS BELOW:
>>>> 
>>>> \____Datanodes
>>>> 
>>>> After successfully putting the data to HDFS (at this point I thought namenode and datanodes have to communicate), I get the following errors when starting the job:
>>>> 
>>>> There are 2 kinds of logs I found: the first one is big (about 12 MB) and looks like this:
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 5 more
>>>> 
>>>> ... (this continues til the end of the log)
>>>> 
>>>> The second is the short kind:
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting DataNode
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>>> Caused by: java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>>     ... 7 more
>>>> 
>>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> \_____TaskTracker
>>>> With TaskTrackers it is the same: there are 2 kinds.
>>>> ############################### LOG TYPE 1 ############################################################
>>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>>     ... 6 more
>>>> 
>>>> 
>>>> ########################### LOG TYPE 2 ############################################################
>>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>>> /************************************************************
>>>> STARTUP_MSG: Starting TaskTracker
>>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>>> STARTUP_MSG:   args = []
>>>> STARTUP_MSG:   version = 1.0.2
>>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>>> ************************************************************/
>>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>>     at sun.nio.ch.Net.bind(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>>> 
>>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>>> /************************************************************
>>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>>> ************************************************************/
>>>> 
>>> 
>> 
> 
> 


Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Michael,
       I asked for the hosts file because it looks to me like a loopback
problem. The log shows that the call is going to 0.0.0.0. Apart from what you
have said, I think disabling IPv6 and making sure that there is no problem
with the DNS resolution is also necessary. Please correct me if I am wrong.
Thank you.

Regards,
    Mohammad Tariq
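To make the loopback/DNS point concrete, here is a sketch of what the name setup could look like on such a cluster. The /etc/hosts contents are assumptions built from the hostnames and addresses in the logs, not Elmar's actual file, and the hadoop-env.sh line is the standard way to prefer IPv4 in the JVM:

```shell
#!/bin/sh
# /etc/hosts on every node: FQDN first, and no 127.0.1.1 alias for the
# machine's own hostname (a common source of daemons binding to loopback):
#   141.51.205.41  its-cs131.its.uni-kassel.de  its-cs131
#   141.51.205.43  its-cs133.its.uni-kassel.de  its-cs133
# Verify that the hostname resolves to the real interface, not 127.x:
getent hosts its-cs133.its.uni-kassel.de || true
# Prefer IPv4 so Hadoop does not end up on an IPv6 socket; this export
# belongs in conf/hadoop-env.sh:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
echo "$HADOOP_OPTS"
```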



On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com>wrote:

> Based on your /etc/hosts output, why aren't you using DNS?
>
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
> generally work well when you're not using the FQDN or its alias.
>
> The issue isn't SSH. If you go to the node which is having trouble
> connecting to another node and try to ping it, or some other general
> communication, and it succeeds, then your issue is that the port you're
> trying to communicate with is blocked. It's then more than likely an
> ipconfig or firewall issue.
>
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
> wrote:
>
>  Hi Michael,
>
> Well, I can ssh from any node to any other without being prompted. The
> reason for this is that my home dir is mounted on every server in the
> cluster.
>
> If the machines are multihomed: I don't know. I could ask if this is
> important.
>
> Shall i?
>
> Regards,
> Elmar
>
> Am 13.08.12 14:59, schrieb Michael Segel:
>
> If the nodes can communicate and distribute data, then the odds are that
> the issue isn't going to be in his /etc/hosts.
>
>  A more relevant question is whether he's running a firewall on each of
> these machines.
>
>  A simple test... ssh to one node, ping other nodes and the control nodes
> at random to see if they can see one another. Then check to see if there is
> a firewall running which would limit the types of traffic between nodes.
>
>  One other side note... are these machines multi-homed?
>
>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Hello there,
>
>       Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <macek@cs.uni-kassel.de
> > wrote:
>
>> Hi,
>>
>> I am currently trying to run my Hadoop program on a cluster. Sadly,
>> my datanodes and tasktrackers seem to have difficulties with their
>> communication, as their logs show:
>> * Some datanodes and tasktrackers seem to have port problems of some kind,
>> as can be seen in the logs below. I wondered if this might be correlated
>> with the localhost entry in /etc/hosts, as you can read in a lot of posts
>> with similar errors, but I checked the file: neither localhost nor
>> 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost...
>> the technician of the cluster said he'd look into the mechanism that
>> resolves localhost.)
>> * The other nodes cannot talk to the namenode and jobtracker
>> (its-cs131), though it is not at all clear why this is happening:
>> the "dfs -put" I do directly before the job runs fine, which seems to
>> imply that communication between those servers is working flawlessly.
>>
>> Is there any reason why this might happen?
>>
>>
>> Regards,
>> Elmar
>>
>> LOGS BELOW:
>>
>> \____Datanodes
>>
>> After successfully putting the data to HDFS (at this point, I thought,
>> namenode and datanodes have to communicate), I get the following errors
>> when starting the job:
>>
>> There are 2 kinds of logs I found: the first one is big (about 12 MB) and
>> looks like this:
>> ############################### LOG TYPE 1
>> ############################################################
>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>> time(s).
>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 1
>> time(s).
>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 2
>> time(s).
>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 3
>> time(s).
>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 4
>> time(s).
>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 5
>> time(s).
>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 6
>> time(s).
>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 7
>> time(s).
>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 8
>> time(s).
>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 9
>> time(s).
>> 2012-08-13 08:23:36,335 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException:
>> Call to its-cs131/141.51.205.41:35554 failed on connection exception:
>> java.net.ConnectException: Connection refused
>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>     at java.lang.Thread.run(Thread.java:619)
>> Caused by: java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>     ... 5 more
>>
>> ... (this continues until the end of the log)
>>
>> The second is the short kind:
>> ########################### LOG TYPE 2
>> ############################################################
>> 2012-08-13 00:59:19,038 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting DataNode
>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>> STARTUP_MSG:   args = []
>> STARTUP_MSG:   version = 1.0.2
>> STARTUP_MSG:   build =
>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>> ************************************************************/
>> 2012-08-13 00:59:19,203 INFO
>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>> hadoop-metrics2.properties
>> 2012-08-13 00:59:19,216 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> MetricsSystem,sub=Stats registered.
>> 2012-08-13 00:59:19,217 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>> period at 10 second(s).
>> 2012-08-13 00:59:19,218 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system
>> started
>> 2012-08-13 00:59:19,306 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>> registered.
>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader:
>> Loaded the native-hadoop library
>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>> time(s).
>> 2012-08-13 00:59:21,584 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>> /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>> 2012-08-13 00:59:21,584 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>> 2012-08-13 00:59:21,787 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>> FSDatasetStatusMBean
>> 2012-08-13 00:59:21,897 INFO
>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting
>> down all async disk service threads...
>> 2012-08-13 00:59:21,897 INFO
>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async
>> disk service threads have been shut down.
>> 2012-08-13 00:59:21,898 ERROR
>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
>> Problem binding to /0.0.0.0:50010 : Address already in use
>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>> Caused by: java.net.BindException: Address already in use
>>     at sun.nio.ch.Net.bind(Native Method)
>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>     ... 7 more
>>
>> 2012-08-13 00:59:21,899 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down DataNode at
>> its-cs133.its.uni-kassel.de/141.51.205.43
>> ************************************************************/
>>
>>
>>
>>
>>
>> \_____TaskTracker
>> With TaskTrackers it is the same: there are 2 kinds.
>> ############################### LOG TYPE 1
>> ############################################################
>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>> Resending 'status' to 'its-cs131' with reponseId '879
>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>> time(s).
>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 1
>> time(s).
>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 2
>> time(s).
>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 3
>> time(s).
>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 4
>> time(s).
>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 5
>> time(s).
>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 6
>> time(s).
>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 7
>> time(s).
>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 8
>> time(s).
>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 9
>> time(s).
>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker:
>> Caught exception: java.net.ConnectException: Call to its-cs131/
>> 141.51.205.41:35555 failed on connection exception:
>> java.net.ConnectException: Connection refused
>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>     at
>> org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>     at
>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>> Caused by: java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>     ... 6 more
>>
>>
>> ########################### LOG TYPE 2
>> ############################################################
>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>> STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting TaskTracker
>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>> STARTUP_MSG:   args = []
>> STARTUP_MSG:   version = 1.0.2
>> STARTUP_MSG:   build =
>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>> ************************************************************/
>> 2012-08-13 00:59:24,569 INFO
>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>> hadoop-metrics2.properties
>> 2012-08-13 00:59:24,626 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> MetricsSystem,sub=Stats registered.
>> 2012-08-13 00:59:24,627 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>> period at 10 second(s).
>> 2012-08-13 00:59:24,627 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>> system started
>> 2012-08-13 00:59:24,950 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>> registered.
>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>> org.mortbay.log.Slf4jLog
>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
>> global filtersafety
>> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting tasktracker with owner as bmacek
>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
>> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
>> Loaded the native-hadoop library
>> 2012-08-13 00:59:25,255 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>> registered.
>> 2012-08-13 00:59:25,256 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> TaskTrackerMetrics registered.
>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
>> SocketReader
>> 2012-08-13 00:59:25,282 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> RpcDetailedActivityForPort54850 registered.
>> 2012-08-13 00:59:25,282 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> RpcActivityForPort54850 registered.
>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>> Responder: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> listener on 54850: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 54850: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker up at: localhost/127.0.0.1:54850
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 3 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 2 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>> 127.0.0.1:54850
>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>> time(s).
>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting thread: Map-events fetcher for all reduce tasks on
>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>> exited with exit code 0
>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using
>> ResourceCalculatorPlugin :
>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>> disabled.
>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>> IndexCache created with max memory = 10485760
>> 2012-08-13 00:59:38,158 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> ShuffleServerMetrics registered.
>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>> -1. Opening the listener on 50060
>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>> not start task tracker because java.net.BindException: Address already in
>> use
>>     at sun.nio.ch.Net.bind(Native Method)
>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>     at
>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>
>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>> SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down TaskTracker at
>> its-cs133.its.uni-kassel.de/141.51.205.43
>> ************************************************************/
>>
>
>
>
>
>
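Michael's ping-and-port test above can be scripted. The following sketch probes the NameNode and JobTracker RPC ports from a worker node; the hostname and port numbers are taken from the logs in this thread, and the availability of `nc` on the cluster nodes is an assumption:

```shell
#!/bin/sh
MASTER=its-cs131
for port in 35554 35555; do          # NameNode / JobTracker RPC ports from the logs
  if nc -z -w 2 "$MASTER" "$port" 2>/dev/null; then
    echo "$MASTER:$port reachable"
  else
    echo "$MASTER:$port refused or filtered (daemon down? firewall?)"
  fi
done
# Also check what the master's name resolves to on this side:
getent hosts "$MASTER" || echo "$MASTER does not resolve"
```

If ping works but these ports are refused, the daemons are down or a firewall is dropping the traffic, which matches the "Connection refused" retries in the logs.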

Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Michael,
       I asked for hosts file because there seems to be some loopback prob
to me. The log shows that call is going at 0.0.0.0. Apart from what you
have said, I think disabling IPv6 and making sure that there is no prob
with the DNS resolution is also necessary. Please correct me if I am wrong.
Thank you.

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com>wrote:

> Based on your /etc/hosts output, why aren't you using DNS?
>
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
> generally work well when you're not using the FQDN or its alias.
>
> The issue isn't the SSH, but if you go to the node which is having trouble
> connecting to another node,  then try to ping it, or some other general
> communication,  if it succeeds, your issue is that the port you're trying
> to communicate with is blocked.  Then its more than likely an ipconfig or
> firewall issue.
>
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
> wrote:
>
>  Hi Michael,
>
> well i can ssh from any node to any other without being prompted. The
> reason for this is, that my home dir is mounted in every server in the
> cluster.
>
> If the machines are multihomed: i dont know. i could ask if this would be
> of importance.
>
> Shall i?
>
> Regards,
> Elmar
>
> Am 13.08.12 14:59, schrieb Michael Segel:
>
> If the nodes can communicate and distribute data, then the odds are that
> the issue isn't going to be in his /etc/hosts.
>
>  A more relevant question is if he's running a firewall on each of these
> machines?
>
>  A simple test... ssh to one node, ping other nodes and the control nodes
> at random to see if they can see one another. Then check to see if there is
> a firewall running which would limit the types of traffic between nodes.
>
>  One other side note... are these machines multi-homed?
>
>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Hello there,
>
>       Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <macek@cs.uni-kassel.de
> > wrote:
>
>> Hi,
>>
>> i am currently trying to run my hadoop program on a cluster. Sadly though
>> my datanodes and tasktrackers seem to have difficulties with their
>> communication as their logs say:
>> * Some datanodes and tasktrackers seem to have portproblems of some kind
>> as it can be seen in the logs below. I wondered if this might be due to
>> reasons correllated with the localhost entry in /etc/hosts as you can read
>> in alot of posts with similar errors, but i checked the file neither
>> localhost nor 127.0.0.1/127.0.1.1 is bound there. (although you can ping
>> localhost... the technician of the cluster said he'd be looking for the
>> mechanics resolving localhost)
>> * The other nodes can not speak with the namenode and jobtracker
>> (its-cs131). Although it is absolutely not clear, why this is happening:
>> the "dfs -put" i do directly before the job is running fine, which seems to
>> imply that communication between those servers is working flawlessly.
>>
>> Is there any reason why this might happen?
>>
>>
>> Regards,
>> Elmar
>>
>> LOGS BELOW:
>>
>> \____Datanodes
>>
>> After successfully putting the data into HDFS (at which point, I assume,
>> the namenode and datanodes must already have communicated), I get the
>> following errors when starting the job:
>>
>> There are two kinds of logs I found. The first kind is big (about 12 MB)
>> and looks like this:
>> ############################### LOG TYPE 1
>> ############################################################
>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>> time(s).
>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 1
>> time(s).
>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 2
>> time(s).
>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 3
>> time(s).
>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 4
>> time(s).
>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 5
>> time(s).
>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 6
>> time(s).
>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 7
>> time(s).
>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 8
>> time(s).
>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 9
>> time(s).
>> 2012-08-13 08:23:36,335 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException:
>> Call to its-cs131/141.51.205.41:35554 failed on connection exception:
>> java.net.ConnectException: Connection refused
>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>     at java.lang.Thread.run(Thread.java:619)
>> Caused by: java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.hadoop.net
>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>     ... 5 more
>>
>> ... (this continues until the end of the log)
>>
>> The second kind is short:
>> ########################### LOG TYPE 2
>> ############################################################
>> 2012-08-13 00:59:19,038 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting DataNode
>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>> STARTUP_MSG:   args = []
>> STARTUP_MSG:   version = 1.0.2
>> STARTUP_MSG:   build =
>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>> ************************************************************/
>> 2012-08-13 00:59:19,203 INFO
>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>> hadoop-metrics2.properties
>> 2012-08-13 00:59:19,216 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> MetricsSystem,sub=Stats registered.
>> 2012-08-13 00:59:19,217 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>> period at 10 second(s).
>> 2012-08-13 00:59:19,218 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system
>> started
>> 2012-08-13 00:59:19,306 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>> registered.
>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader:
>> Loaded the native-hadoop library
>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>> time(s).
>> 2012-08-13 00:59:21,584 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>> /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>> 2012-08-13 00:59:21,584 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>> 2012-08-13 00:59:21,787 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>> FSDatasetStatusMBean
>> 2012-08-13 00:59:21,897 INFO
>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting
>> down all async disk service threads...
>> 2012-08-13 00:59:21,897 INFO
>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async
>> disk service threads have been shut down.
>> 2012-08-13 00:59:21,898 ERROR
>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
>> Problem binding to /0.0.0.0:50010 : Address already in use
>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>> Caused by: java.net.BindException: Address already in use
>>     at sun.nio.ch.Net.bind(Native Method)
>>     at sun.nio.ch
>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>     ... 7 more
>>
>> 2012-08-13 00:59:21,899 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down DataNode at
>> its-cs133.its.uni-kassel.de/141.51.205.43
>> ************************************************************/
>>
>>
>>
>>
>>
>> \_____TaskTracker
>> With the TaskTrackers it is the same: there are two kinds of logs.
>> ############################### LOG TYPE 1
>> ############################################################
>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>> Resending 'status' to 'its-cs131' with reponseId '879
>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>> time(s).
>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 1
>> time(s).
>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 2
>> time(s).
>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 3
>> time(s).
>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 4
>> time(s).
>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 5
>> time(s).
>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 6
>> time(s).
>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 7
>> time(s).
>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 8
>> time(s).
>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 9
>> time(s).
>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker:
>> Caught exception: java.net.ConnectException: Call to its-cs131/
>> 141.51.205.41:35555 failed on connection exception:
>> java.net.ConnectException: Connection refused
>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>     at
>> org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>     at
>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>> Caused by: java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.hadoop.net
>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>     ... 6 more
>>
>>
>> ########################### LOG TYPE 2
>> ############################################################
>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>> STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting TaskTracker
>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>> STARTUP_MSG:   args = []
>> STARTUP_MSG:   version = 1.0.2
>> STARTUP_MSG:   build =
>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>> ************************************************************/
>> 2012-08-13 00:59:24,569 INFO
>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>> hadoop-metrics2.properties
>> 2012-08-13 00:59:24,626 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> MetricsSystem,sub=Stats registered.
>> 2012-08-13 00:59:24,627 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>> period at 10 second(s).
>> 2012-08-13 00:59:24,627 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>> system started
>> 2012-08-13 00:59:24,950 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>> registered.
>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>> org.mortbay.log.Slf4jLog
>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
>> global filtersafety
>> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting tasktracker with owner as bmacek
>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
>> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
>> Loaded the native-hadoop library
>> 2012-08-13 00:59:25,255 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>> registered.
>> 2012-08-13 00:59:25,256 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> TaskTrackerMetrics registered.
>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
>> SocketReader
>> 2012-08-13 00:59:25,282 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> RpcDetailedActivityForPort54850 registered.
>> 2012-08-13 00:59:25,282 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> RpcActivityForPort54850 registered.
>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>> Responder: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> listener on 54850: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 54850: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker up at: localhost/127.0.0.1:54850
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 3 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 2 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>> 127.0.0.1:54850
>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>> time(s).
>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting thread: Map-events fetcher for all reduce tasks on
>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>> exited with exit code 0
>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using
>> ResourceCalculatorPlugin :
>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>> disabled.
>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>> IndexCache created with max memory = 10485760
>> 2012-08-13 00:59:38,158 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> ShuffleServerMetrics registered.
>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>> -1. Opening the listener on 50060
>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>> not start task tracker because java.net.BindException: Address already in
>> use
>>     at sun.nio.ch.Net.bind(Native Method)
>>     at sun.nio.ch
>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>     at
>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>
>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>> SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down TaskTracker at
>> its-cs133.its.uni-kassel.de/141.51.205.43
>> ************************************************************/
>>
>
>
>
>
>
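
The multi-homing question raised above can be checked directly on each node. A minimal sketch (not part of the original thread; plain Python standard library) that prints every address the local resolver returns for the machine's own hostname:

```python
import socket

# Print every address the local resolver returns for this machine's own
# hostname. If only 127.0.0.1 / 127.0.1.1 shows up, Hadoop daemons may end
# up advertising themselves on the loopback interface (compare the
# "TaskTracker up at: localhost/127.0.0.1" line in the logs above); several
# distinct non-loopback addresses suggest a multi-homed machine.
hostname = socket.gethostname()
try:
    addrs = sorted({info[4][0] for info in socket.getaddrinfo(hostname, None)})
except socket.gaierror:
    addrs = []  # the hostname itself does not resolve -- also a red flag
print(hostname, "->", addrs)
```

Run once per node and compare the output against the addresses listed in the cluster's configuration files.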

Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Michael,
       I asked for the hosts file because this looks like a loopback
problem to me. The log shows the daemon binding at 0.0.0.0. Apart from what
you have said, I think disabling IPv6 and making sure that there is no
problem with DNS resolution is also necessary. Please correct me if I am
wrong. Thank you.

Regards,
    Mohammad Tariq
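
The resolution and reachability checks discussed in this thread can be scripted. A minimal sketch (not from the original thread; the hostname and ports below are taken from the logs above, everything else is an assumption), using only the Python standard library:

```python
import socket

def resolves(host):
    """Return True if the system resolver can turn `host` into an address."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, timed out, and unresolvable hosts
        return False

if __name__ == "__main__":
    # Master host and RPC ports as they appear in the logs in this thread.
    for host, port in [("its-cs131", 35554), ("its-cs131", 35555)]:
        print(host, "resolves:", resolves(host),
              "- port", port, "open:", port_open(host, port))
```

If the host resolves but the port check fails from a worker node while it succeeds on the master itself, a firewall or a bind-to-loopback misconfiguration is the likely cause.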



On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com>wrote:

> Based on your /etc/hosts output, why aren't you using DNS?
>
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
> generally work well when you're not using the FQDN or its alias.
>
> The issue isn't SSH. If you go to the node that is having trouble
> connecting to another node and try to ping it (or some other general
> communication) and that succeeds, then the port you're trying to
> communicate with is blocked, and it's more than likely an
> interface-configuration or firewall issue.
>
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
> wrote:
>
>  Hi Michael,
>
> Well, I can ssh from any node to any other without being prompted. The
> reason for this is that my home directory is mounted on every server in
> the cluster.
>
> Whether the machines are multihomed: I don't know. I could ask, if this
> would be of importance.
>
> Shall I?
>
> Regards,
> Elmar
>
> On 13.08.12 at 14:59, Michael Segel wrote:
>
> If the nodes can communicate and distribute data, then the odds are that
> the issue isn't going to be in his /etc/hosts.
>
>  A more relevant question is if he's running a firewall on each of these
> machines?
>
>  A simple test... ssh to one node, ping other nodes and the control nodes
> at random to see if they can see one another. Then check to see if there is
> a firewall running which would limit the types of traffic between nodes.
>
>  One other side note... are these machines multi-homed?
>
>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Hello there,
>
>       Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <macek@cs.uni-kassel.de
> > wrote:
>
>> [...]
>> handler 2 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>> 127.0.0.1:54850
>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>> time(s).
>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting thread: Map-events fetcher for all reduce tasks on
>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>> exited with exit code 0
>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using
>> ResourceCalculatorPlugin :
>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>> disabled.
>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>> IndexCache created with max memory = 10485760
>> 2012-08-13 00:59:38,158 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> ShuffleServerMetrics registered.
>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>> -1. Opening the listener on 50060
>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>> not start task tracker because java.net.BindException: Address already in
>> use
>>     at sun.nio.ch.Net.bind(Native Method)
>>     at sun.nio.ch
>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>     at
>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>
>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>> SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down TaskTracker at
>> its-cs133.its.uni-kassel.de/141.51.205.43
>> ************************************************************/
>>
>
>
>
>
>

Re: DataNode and TaskTracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Hi Michael,
       I asked for the hosts file because this looks like a loopback problem
to me. The log shows the call going to 0.0.0.0. Apart from what you have
said, I think disabling IPv6 and making sure there is no problem with DNS
resolution is also necessary. Please correct me if I am wrong.
Thank you.
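A minimal sketch of the resolution check (the hostname below is the master from this thread, used purely as an illustration):

```python
import socket

def resolve(hostname):
    """Forward-resolve a hostname; return its addresses, or the error text."""
    try:
        return sorted({ai[4][0] for ai in socket.getaddrinfo(hostname, None)})
    except socket.gaierror as exc:
        return f"resolution failed: {exc}"

# Every node should map the master to the same routable address --
# never to 127.0.0.1/127.0.1.1 (its-cs131 is the master from this thread).
print(resolve("its-cs131"))
print(resolve("localhost"))
```

If a worker resolves its own or the master's name to a loopback address, the daemons advertise localhost, which is consistent with the "TaskTracker up at: localhost/127.0.0.1" lines in the logs.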

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 7:06 PM, Michael Segel <mi...@hotmail.com>wrote:

> Based on your /etc/hosts output, why aren't you using DNS?
>
> Outside of MapR, multihomed machines can be problematic. Hadoop doesn't
> generally work well when you're not using the FQDN or its alias.
>
> The issue isn't SSH. Go to the node which is having trouble connecting to
> another node and try to ping it, or attempt some other general
> communication. If that succeeds, the port you're trying to communicate
> with is blocked, and it's more than likely an ifconfig or firewall issue.
>
> On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de>
> wrote:
>
>  Hi Michael,
>
> Well, I can ssh from any node to any other without being prompted. The
> reason is that my home dir is mounted on every server in the cluster.
>
> Whether the machines are multihomed: I don't know. I could ask, if this
> would be of importance.
>
> Shall I?
>
> Regards,
> Elmar
>
> Am 13.08.12 14:59, schrieb Michael Segel:
>
> If the nodes can communicate and distribute data, then the odds are that
> the issue isn't going to be in his /etc/hosts.
>
>  A more relevant question is if he's running a firewall on each of these
> machines?
>
>  A simple test... ssh to one node, ping other nodes and the control nodes
> at random to see if they can see one another. Then check to see if there is
> a firewall running which would limit the types of traffic between nodes.
>
>  One other side note... are these machines multi-homed?
>
>   On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Hello there,
>
>       Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <macek@cs.uni-kassel.de
> > wrote:
>
>> Hi,
>>
>> I am currently trying to run my Hadoop program on a cluster. Sadly, my
>> datanodes and tasktrackers seem to have difficulties with their
>> communication, as their logs say:
>> * Some datanodes and tasktrackers seem to have port problems of some
>> kind, as can be seen in the logs below. I wondered if this might be
>> correlated with the localhost entry in /etc/hosts, as you can read in a
>> lot of posts with similar errors, but I checked the file: neither
>> localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping
>> localhost... the technician of the cluster said he'd be looking into the
>> mechanism resolving localhost.)
>> * The other nodes cannot speak with the namenode and jobtracker
>> (its-cs131), although it is absolutely not clear why this is happening:
>> the "dfs -put" I do directly before the job runs fine, which seems to
>> imply that communication between those servers is working flawlessly.
>>
>> Is there any reason why this might happen?
>>
>>
>> Regards,
>> Elmar
>>
>> LOGS BELOW:
>>
>> \____Datanodes
>>
>> After successfully putting the data into HDFS (at this point, I thought,
>> the namenode and datanodes have to communicate), I get the following
>> errors when starting the job:
>>
>> There are two kinds of logs I found: the first one is big (about 12MB)
>> and looks like this:
>> ############################### LOG TYPE 1
>> ############################################################
>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>> time(s).
>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 1
>> time(s).
>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 2
>> time(s).
>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 3
>> time(s).
>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 4
>> time(s).
>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 5
>> time(s).
>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 6
>> time(s).
>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 7
>> time(s).
>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 8
>> time(s).
>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 9
>> time(s).
>> 2012-08-13 08:23:36,335 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException:
>> Call to its-cs131/141.51.205.41:35554 failed on connection exception:
>> java.net.ConnectException: Connection refused
>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>     at java.lang.Thread.run(Thread.java:619)
>> Caused by: java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.hadoop.net
>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>     ... 5 more
>>
>> ... (this continues til the end of the log)
>>
>> The second kind is short:
>> ########################### LOG TYPE 2
>> ############################################################
>> 2012-08-13 00:59:19,038 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting DataNode
>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>> STARTUP_MSG:   args = []
>> STARTUP_MSG:   version = 1.0.2
>> STARTUP_MSG:   build =
>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>> ************************************************************/
>> 2012-08-13 00:59:19,203 INFO
>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>> hadoop-metrics2.properties
>> 2012-08-13 00:59:19,216 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> MetricsSystem,sub=Stats registered.
>> 2012-08-13 00:59:19,217 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>> period at 10 second(s).
>> 2012-08-13 00:59:19,218 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system
>> started
>> 2012-08-13 00:59:19,306 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>> registered.
>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader:
>> Loaded the native-hadoop library
>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35554. Already tried 0
>> time(s).
>> 2012-08-13 00:59:21,584 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>> /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>> 2012-08-13 00:59:21,584 INFO
>> org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>> 2012-08-13 00:59:21,787 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>> FSDatasetStatusMBean
>> 2012-08-13 00:59:21,897 INFO
>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting
>> down all async disk service threads...
>> 2012-08-13 00:59:21,897 INFO
>> org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async
>> disk service threads have been shut down.
>> 2012-08-13 00:59:21,898 ERROR
>> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
>> Problem binding to /0.0.0.0:50010 : Address already in use
>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>     at
>> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>> Caused by: java.net.BindException: Address already in use
>>     at sun.nio.ch.Net.bind(Native Method)
>>     at sun.nio.ch
>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>     ... 7 more
>>
>> 2012-08-13 00:59:21,899 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down DataNode at
>> its-cs133.its.uni-kassel.de/141.51.205.43
>> ************************************************************/
>>
>>
>>
>>
>>
>> \_____TaskTracker
>> With TaskTrackers it is the same: there are two kinds.
>> ############################### LOG TYPE 1
>> ############################################################
>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>> Resending 'status' to 'its-cs131' with reponseId '879
>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>> time(s).
>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 1
>> time(s).
>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 2
>> time(s).
>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 3
>> time(s).
>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 4
>> time(s).
>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 5
>> time(s).
>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 6
>> time(s).
>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 7
>> time(s).
>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 8
>> time(s).
>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 9
>> time(s).
>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker:
>> Caught exception: java.net.ConnectException: Call to its-cs131/
>> 141.51.205.41:35555 failed on connection exception:
>> java.net.ConnectException: Connection refused
>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>     at
>> org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>     at
>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>> Caused by: java.net.ConnectException: Connection refused
>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>     at
>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>     at org.apache.hadoop.net
>> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>     at
>> org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>     ... 6 more
>>
>>
>> ########################### LOG TYPE 2
>> ############################################################
>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>> STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting TaskTracker
>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>> STARTUP_MSG:   args = []
>> STARTUP_MSG:   version = 1.0.2
>> STARTUP_MSG:   build =
>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r
>> 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>> ************************************************************/
>> 2012-08-13 00:59:24,569 INFO
>> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from
>> hadoop-metrics2.properties
>> 2012-08-13 00:59:24,626 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> MetricsSystem,sub=Stats registered.
>> 2012-08-13 00:59:24,627 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
>> period at 10 second(s).
>> 2012-08-13 00:59:24,627 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics
>> system started
>> 2012-08-13 00:59:24,950 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi
>> registered.
>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>> org.mortbay.log.Slf4jLog
>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
>> global filtersafety
>> (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
>> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting tasktracker with owner as bmacek
>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
>> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
>> Loaded the native-hadoop library
>> 2012-08-13 00:59:25,255 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm
>> registered.
>> 2012-08-13 00:59:25,256 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> TaskTrackerMetrics registered.
>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
>> SocketReader
>> 2012-08-13 00:59:25,282 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> RpcDetailedActivityForPort54850 registered.
>> 2012-08-13 00:59:25,282 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> RpcActivityForPort54850 registered.
>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
>> Responder: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> listener on 54850: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 54850: starting
>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker up at: localhost/127.0.0.1:54850
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 3 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 2 on 54850: starting
>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/
>> 127.0.0.1:54850
>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect to server: its-cs131/141.51.205.41:35555. Already tried 0
>> time(s).
>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting thread: Map-events fetcher for all reduce tasks on
>> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
>> exited with exit code 0
>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using
>> ResourceCalculatorPlugin :
>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
>> disabled.
>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>> IndexCache created with max memory = 10485760
>> 2012-08-13 00:59:38,158 INFO
>> org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source
>> ShuffleServerMetrics registered.
>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
>> returned by webServer.getConnectors()[0].getLocalPort() before open() is
>> -1. Opening the listener on 50060
>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>> not start task tracker because java.net.BindException: Address already in
>> use
>>     at sun.nio.ch.Net.bind(Native Method)
>>     at sun.nio.ch
>> .ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>     at
>> org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>
>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>> SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down TaskTracker at
>> its-cs133.its.uni-kassel.de/141.51.205.43
>> ************************************************************/
>>
>
>
>
>
>

Re: DataNode and TaskTracker communication

Posted by Michael Segel <mi...@hotmail.com>.
Based on your /etc/hosts output, why aren't you using DNS? 

Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 

The issue isn't SSH. Go to the node which is having trouble connecting to another node and try to ping it, or attempt some other general communication. If that succeeds, the port you're trying to communicate with is blocked, and it's more than likely an ifconfig or firewall issue.

On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:

> Hi Michael,
> 
> Well, I can ssh from any node to any other without being prompted. The reason is that my home dir is mounted on every server in the cluster.
> 
> Whether the machines are multihomed: I don't know. I could ask, if this would be of importance.
> 
> Shall I?
> 
> Regards,
> Elmar
> 
> Am 13.08.12 14:59, schrieb Michael Segel:
>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>> 
>> A more relevant question is if he's running a firewall on each of these machines? 
>> 
>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>> 
>> One other side note... are these machines multi-homed?
>> 
>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>> 
>>> Hello there,
>>> 
>>>      Could you please share your /etc/hosts file, if you don't mind.
>>> 
>>> Regards,
>>>     Mohammad Tariq
>>> 
>>> 
>>> 
>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>> Hi,
>>> 
>>> I am currently trying to run my Hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties with their communication, as their logs say:
>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd be looking into the mechanism resolving localhost.)
>>> * The other nodes cannot speak with the namenode and jobtracker (its-cs131), although it is absolutely not clear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>> 
>>> Is there any reason why this might happen?
>>> 
>>> 
>>> Regards,
>>> Elmar
>>> 
>>> LOGS BELOW:
>>> 
>>> \____Datanodes
>>> 
>>> After successfully putting the data into HDFS (at this point, I thought, the namenode and datanodes have to communicate), I get the following errors when starting the job:
>>> 
>>> There are two kinds of logs I found: the first one is big (about 12MB) and looks like this:
>>> ############################### LOG TYPE 1 ############################################################
>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>     at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 5 more
>>> 
>>> ... (this continues til the end of the log)
>>> 
>>> The second kind is short:
>>> ########################### LOG TYPE 2 ############################################################
>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting DataNode
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>> Caused by: java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>     ... 7 more
>>> 
>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>> 
>>> 
>>> 
>>> 
>>> 
>>> \_____TaskTracker
>>> With the TaskTrackers it is the same: there are 2 kinds.
>>> ############################### LOG TYPE 1 ############################################################
>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 6 more
>>> 
>>> 
>>> ########################### LOG TYPE 2 ############################################################
>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting TaskTracker
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>> 
>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>> 
>> 
> 


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
Based on your /etc/hosts output, why aren't you using DNS? 

Outside of MapR, multihomed machines can be problematic. Hadoop generally doesn't work well when you're not using the FQDN or its alias. 

The issue isn't SSH. If you go to the node that is having trouble connecting, try to ping the other node or attempt some other general communication; if that succeeds, the port you're trying to reach is being blocked, and it's then more than likely an IP-configuration or firewall issue.
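That kind of check can be scripted. Below is a minimal sketch (not a definitive tool) that classifies a plain TCP connect attempt the way the stack traces above suggest: "refused" means the host is reachable but nothing is listening on the port, while a timeout often points at a firewall silently dropping packets. The host name and ports are placeholders taken from the logs in this thread.

```python
import socket

def check_tcp(host, port, timeout=3.0):
    """Try a TCP connect and classify the result.

    Returns one of: "open", "refused", "timeout", "unresolved".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Host reachable, but no process listens on the port --
        # this matches "java.net.ConnectException: Connection refused".
        return "refused"
    except socket.timeout:
        # No answer at all; often a firewall dropping the packets.
        return "timeout"
    except socket.gaierror:
        # The name did not resolve at all (/etc/hosts or DNS problem).
        return "unresolved"

if __name__ == "__main__":
    # Placeholders from the logs: the NameNode/JobTracker host and ports.
    for port in (35554, 35555):
        print("its-cs131", port, check_tcp("its-cs131", port))
```

Running this from one of the failing datanodes against the master would separate a name-resolution problem from a blocked or unbound port.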

On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:

> Hi Michael,
> 
> Well, I can ssh from any node to any other without being prompted for a password; the reason is that my home directory is mounted on every server in the cluster. 
> 
> Whether the machines are multihomed, I don't know. I could ask, if that would be of importance.
> 
> Shall I?
> 
> Regards,
> Elmar
> 
> Am 13.08.12 14:59, schrieb Michael Segel:
>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>> 
>> A more relevant question is if he's running a firewall on each of these machines? 
>> 
>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>> 
>> One other side note... are these machines multi-homed?
>> 
>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>> 
>>> Hello there,
>>> 
>>>      Could you please share your /etc/hosts file, if you don't mind.
>>> 
>>> Regards,
>>>     Mohammad Tariq
>>> 
>>> 
>>> 
>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>> Hi,
>>> 
>>> I am currently trying to run my Hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties communicating, as their logs show:
>>> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered whether this might be correlated with the localhost entry in /etc/hosts, as suggested in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost; the technician of the cluster said he'd look into the mechanism that resolves localhost.)
>>> * The other nodes cannot talk to the namenode and jobtracker (its-cs131). It is not at all clear why this happens: the "dfs -put" I run directly before the job completes fine, which seems to imply that communication between those servers works flawlessly.
>>> 
>>> Is there any reason why this might happen?
>>> 
>>> 
>>> Regards,
>>> Elmar
>>> 
>>> LOGS BELOW:
>>> 
>>> \____Datanodes
>>> 
>>> After successfully putting the data onto HDFS (at which point, I assumed, the namenode and datanodes must communicate), I get the following errors when starting the job:
>>> 
>>> There are 2 kinds of logs I found. The first one is big (about 12 MB) and looks like this:
>>> ############################### LOG TYPE 1 ############################################################
>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>     at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 5 more
>>> 
>>> ... (this continues until the end of the log)
>>> 
>>> The second is the short kind:
>>> ########################### LOG TYPE 2 ############################################################
>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting DataNode
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>> Caused by: java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>     ... 7 more
>>> 
>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>> 
>>> 
>>> 
>>> 
>>> 
>>> \_____TaskTracker
>>> With the TaskTrackers it is the same: there are 2 kinds.
>>> ############################### LOG TYPE 1 ############################################################
>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 6 more
>>> 
>>> 
>>> ########################### LOG TYPE 2 ############################################################
>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting TaskTracker
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>> 
>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>> 
>> 
> 


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
Based on your /etc/hosts output, why aren't you using DNS? 

Outside of MapR, multihomed machines can be problematic. Hadoop doesn't generally work well when you're not using the FQDN or its alias. 

The issue isn't the SSH, but if you go to the node which is having trouble connecting to another node,  then try to ping it, or some other general communication,  if it succeeds, your issue is that the port you're trying to communicate with is blocked.  Then its more than likely an ipconfig or firewall issue.

On Aug 13, 2012, at 8:17 AM, Björn-Elmar Macek <em...@cs.uni-kassel.de> wrote:

> Hi Michael,
> 
> well, I can SSH from any node to any other without being prompted for a password. The reason for this is that my home dir is mounted on every server in the cluster.
> 
> Whether the machines are multihomed: I don't know. I could ask if that would be of importance.
> 
> Shall I?
> 
> Regards,
> Elmar
> 
> Am 13.08.12 14:59, schrieb Michael Segel:
>> If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 
>> 
>> A more relevant question is whether he's running a firewall on each of these machines.
>> 
>> A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 
>> 
>> One other side note... are these machines multi-homed?
>> 
>> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:
>> 
>>> Hello there,
>>> 
>>>      Could you please share your /etc/hosts file, if you don't mind.
>>> 
>>> Regards,
>>>     Mohammad Tariq
>>> 
>>> 
>>> 
>>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
>>> Hi,
>>> 
>>> I am currently trying to run my Hadoop program on a cluster. Sadly, my DataNodes and TaskTrackers seem to have difficulties communicating, as their logs show:
>>> * Some DataNodes and TaskTrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered whether this might be correlated with the localhost entry in /etc/hosts, as you can read in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost; the technician of the cluster said he'd look for the mechanism resolving localhost.)
>>> * The other nodes cannot speak with the NameNode and JobTracker (its-cs131), although it is absolutely not clear why this is happening: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers is working flawlessly.
>>> 
>>> Is there any reason why this might happen?
>>> 
>>> 
>>> Regards,
>>> Elmar
>>> 
>>> LOGS BELOW:
>>> 
>>> \____Datanodes
>>> 
>>> After successfully putting the data to HDFS (at this point I thought the NameNode and DataNodes have to communicate), I get the following errors when starting the job:
>>> 
>>> There are 2 kinds of logs I found: the first one is big (about 12 MB) and looks like this:
>>> ############################### LOG TYPE 1 ############################################################
>>> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at $Proxy5.sendHeartbeat(Unknown Source)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>>     at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 5 more
>>> 
>>> ... (this continues until the end of the log)
>>> 
>>> The second is the short kind:
>>> ########################### LOG TYPE 2 ############################################################
>>> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting DataNode
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>> Caused by: java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>>     ... 7 more
>>> 
>>> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
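
The "Address already in use" failure in the DataNode log above means some process (possibly a stale DataNode left over from an earlier start) already holds port 50010 on that slave. A quick way to verify this, sketched here as a small helper rather than any official Hadoop tooling:

```python
import errno
import socket

def port_in_use(port, host="0.0.0.0"):
    """True if bind() fails with EADDRINUSE, i.e. some process already owns the port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return False
    except OSError as e:
        if e.errno == errno.EADDRINUSE:
            return True
        raise
    finally:
        s.close()

# port_in_use(50010) returning True on the slave would reproduce the
# DataNode BindException above; the fix is to find and kill the stale process.
```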
>>> 
>>> 
>>> 
>>> 
>>> 
>>> \_____TaskTracker
>>> With the TaskTrackers it is the same: there are 2 kinds.
>>> ############################### LOG TYPE 1 ############################################################
>>> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>> Caused by: java.net.ConnectException: Connection refused
>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>>     ... 6 more
>>> 
>>> 
>>> ########################### LOG TYPE 2 ############################################################
>>> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>> /************************************************************
>>> STARTUP_MSG: Starting TaskTracker
>>> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>> STARTUP_MSG:   args = []
>>> STARTUP_MSG:   version = 1.0.2
>>> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>> ************************************************************/
>>> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>> 
>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
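
Note the line "TaskTracker up at: localhost/127.0.0.1:54850" in the log above: the slave resolved its own name to the loopback address, which no other node can reach. A minimal check of how a name resolves, assuming standard resolver behavior:

```python
import socket

def resolution_report(name=None):
    """Show how a hostname resolves; a daemon that registers as 127.x.x.x
    advertises an address other cluster nodes cannot connect to."""
    name = name or socket.gethostname()
    addr = socket.gethostbyname(name)
    return name, addr, addr.startswith("127.")

# resolution_report("its-cs133.its.uni-kassel.de") ending in True would mean
# the slave resolves its own FQDN to the loopback address.
```

Running this on each slave for its own hostname would confirm whether the "mechanism resolving localhost" the cluster technician mentioned is mapping the node's name onto the loopback interface.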
>>> 
>> 
> 


>>> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>>     at sun.nio.ch.Net.bind(Native Method)
>>>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>> 
>>> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>> ************************************************************/
>>> 
>> 
> 
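
The second failure mode quoted above ("Can not start task tracker because java.net.BindException: Address already in use") means some other process is still bound to the port, typically a TaskTracker or DataNode left over from an earlier run. A minimal sketch of the underlying condition, in plain Python stdlib (nothing Hadoop-specific; the port here is OS-chosen, standing in for Hadoop's 50060/50010):

```python
import errno
import socket

# Two sockets cannot bind the same address/port: the second bind fails
# with EADDRINUSE, which surfaces in Java as java.net.BindException.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))           # let the OS pick a free port
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
caught = None
try:
    second.bind(("127.0.0.1", port))   # same port -> "Address already in use"
except OSError as e:
    caught = e.errno
finally:
    second.close()
    first.close()

print(caught == errno.EADDRINUSE)      # True
```

On the affected node, standard Linux tooling such as `netstat -tlnp | grep 50060` (run as root) should show which process still holds the port; stopping the stale daemon before restarting usually clears this error.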


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Hi Michael,

Well, I can ssh from any node to any other without being prompted; the
reason for this is that my home dir is mounted on every server in the
cluster.

Whether the machines are multihomed: I don't know. I could ask if this
would be of importance.

Shall I?

Regards,
Elmar
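
Note that a working ssh only proves that port 22 is reachable; the failing heartbeats in the logs go to the JobTracker/NameNode RPC ports (35554 and 35555 on its-cs131). A small sketch for probing a specific TCP port, using plain Python stdlib rather than anything Hadoop provides (`port_open` is an illustrative helper, not part of Hadoop):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        return s.connect_ex((host, port)) == 0

# From a worker node one would check the ports from the logs, e.g.:
#   for p in (35554, 35555):
#       print(p, port_open("its-cs131", p))
# False here, while ssh works, points at a firewall rule or at the
# daemon not listening on the expected interface.
```

This separates "the host is reachable" from "the daemon's port is reachable", which is exactly the distinction the `dfs -put` success obscures.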

On 13.08.12 14:59, Michael Segel wrote:
> If the nodes can communicate and distribute data, then the odds are 
> that the issue isn't going to be in his /etc/hosts.
>
> A more relevant question is if he's running a firewall on each of 
> these machines?
>
> A simple test... ssh to one node, ping other nodes and the control 
> nodes at random to see if they can see one another. Then check to see 
> if there is a firewall running which would limit the types of traffic 
> between nodes.
>
> One other side note... are these machines multi-homed?
>
> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>
>> Hello there,
>>
>>      Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <macek@cs.uni-kassel.de> wrote:
>>
>>     Hi,
>>
>>     I am currently trying to run my Hadoop program on a cluster.
>>     Sadly, my datanodes and tasktrackers seem to have difficulties
>>     with their communication, as their logs say:
>>     * Some datanodes and tasktrackers seem to have port problems of
>>     some kind, as can be seen in the logs below. I wondered if this
>>     might be correlated with the localhost entry in /etc/hosts, as
>>     you can read in a lot of posts with similar errors, but I checked
>>     the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound
>>     there. (Although you can ping localhost... the technician of the
>>     cluster said he'd be looking into the mechanism resolving
>>     localhost.)
>>     * The other nodes cannot speak with the namenode and jobtracker
>>     (its-cs131), although it is absolutely not clear why this is
>>     happening: the "dfs -put" I do directly before the job runs
>>     fine, which seems to imply that communication between those
>>     servers is working flawlessly.
>>
>>     Is there any reason why this might happen?
>>
>>
>>     Regards,
>>     Elmar
>>
>>     LOGS BELOW:
>>
>>     \____Datanodes
>>
>>     After successfully putting the data to HDFS (at which point, I
>>     thought, namenode and datanodes have to communicate), I get the
>>     following errors when starting the job:
>>
>>     There are 2 kinds of logs I found: the first one is big (about
>>     12MB) and looks like this:
>>     ############################### LOG TYPE 1
>>     ############################################################
>>     2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>     2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>     2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>     2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>     2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>     2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>     2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>     2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>     2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>     2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>     2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>         at $Proxy5.sendHeartbeat(Unknown Source)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>         at java.lang.Thread.run(Thread.java:619)
>>     Caused by: java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>         ... 5 more
>>
>>     ... (this continues until the end of the log)
>>
>>     The second is the short kind:
>>     ########################### LOG TYPE 2
>>     ############################################################
>>     2012-08-13 00:59:19,038 INFO
>>     org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>     /************************************************************
>>     STARTUP_MSG: Starting DataNode
>>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>     STARTUP_MSG:   args = []
>>     STARTUP_MSG:   version = 1.0.2
>>     STARTUP_MSG:   build =
>>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>     ************************************************************/
>>     2012-08-13 00:59:19,203 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>>     from hadoop-metrics2.properties
>>     2012-08-13 00:59:19,216 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source MetricsSystem,sub=Stats registered.
>>     2012-08-13 00:59:19,217 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>>     snapshot period at 10 second(s).
>>     2012-08-13 00:59:19,218 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode
>>     metrics system started
>>     2012-08-13 00:59:19,306 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source ugi registered.
>>     2012-08-13 00:59:19,346 INFO
>>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>>     library
>>     2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>     2012-08-13 00:59:21,584 INFO
>>     org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>>     /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>     2012-08-13 00:59:21,584 INFO
>>     org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>     2012-08-13 00:59:21,787 INFO
>>     org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>     FSDatasetStatusMBean
>>     2012-08-13 00:59:21,897 INFO
>>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>     Shutting down all async disk service threads...
>>     2012-08-13 00:59:21,897 INFO
>>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>     All async disk service threads have been shut down.
>>     2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>         at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>     Caused by: java.net.BindException: Address already in use
>>         at sun.nio.ch.Net.bind(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>         at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>         ... 7 more
>>
>>     2012-08-13 00:59:21,899 INFO
>>     org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>     /************************************************************
>>     SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>     ************************************************************/
>>
>>
>>
>>
>>
>>     \_____TaskTracker
>>     With TaskTrackers it is the same: there are 2 kinds.
>>     ############################### LOG TYPE 1
>>     ############################################################
>>     2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>     2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>     2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>     2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>     2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>     2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>     2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>     2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>     2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>     2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>     2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>     2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>         at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>         at
>>     org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>         at
>>     org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>         at
>>     org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>         at
>>     org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>     Caused by: java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at
>>     sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>         at
>>     org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>         at
>>     org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>         at
>>     org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>         ... 6 more
>>
>>
>>     ########################### LOG TYPE 2
>>     ############################################################
>>     2012-08-13 00:59:24,376 INFO
>>     org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>     /************************************************************
>>     STARTUP_MSG: Starting TaskTracker
>>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>     STARTUP_MSG:   args = []
>>     STARTUP_MSG:   version = 1.0.2
>>     STARTUP_MSG:   build =
>>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>     ************************************************************/
>>     2012-08-13 00:59:24,569 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>>     from hadoop-metrics2.properties
>>     2012-08-13 00:59:24,626 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source MetricsSystem,sub=Stats registered.
>>     2012-08-13 00:59:24,627 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>>     snapshot period at 10 second(s).
>>     2012-08-13 00:59:24,627 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker
>>     metrics system started
>>     2012-08-13 00:59:24,950 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source ugi registered.
>>     2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>>     org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>     org.mortbay.log.Slf4jLog
>>     2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer:
>>     Added global filtersafety
>>     (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>     2012-08-13 00:59:25,232 INFO
>>     org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
>>     truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>     2012-08-13 00:59:25,237 INFO
>>     org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with
>>     owner as bmacek
>>     2012-08-13 00:59:25,239 INFO
>>     org.apache.hadoop.mapred.TaskTracker: Good mapred local
>>     directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>     2012-08-13 00:59:25,244 INFO
>>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>>     library
>>     2012-08-13 00:59:25,255 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source jvm registered.
>>     2012-08-13 00:59:25,256 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source TaskTrackerMetrics registered.
>>     2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server:
>>     Starting SocketReader
>>     2012-08-13 00:59:25,282 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source RpcDetailedActivityForPort54850 registered.
>>     2012-08-13 00:59:25,282 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source RpcActivityForPort54850 registered.
>>     2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC
>>     Server Responder: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>>     Server listener on 54850: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>>     Server handler 0 on 54850: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>>     Server handler 1 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC
>>     Server handler 3 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC
>>     Server handler 2 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>     2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>     2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>     2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree:
>>     setsid exited with exit code 0
>>     2012-08-13 00:59:38,134 INFO
>>     org.apache.hadoop.mapred.TaskTracker: Using
>>     ResourceCalculatorPlugin :
>>     org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>     2012-08-13 00:59:38,137 WARN
>>     org.apache.hadoop.mapred.TaskTracker: TaskTracker's
>>     totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>     2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>>     IndexCache created with max memory = 10485760
>>     2012-08-13 00:59:38,158 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source ShuffleServerMetrics registered.
>>     2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer:
>>     Port returned by webServer.getConnectors()[0].getLocalPort()
>>     before open() is -1. Opening the listener on 50060
>>     2012-08-13 00:59:38,161 ERROR
>>     org.apache.hadoop.mapred.TaskTracker: Can not start task tracker
>>     because java.net.BindException: Address already in use
>>         at sun.nio.ch.Net.bind(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>         at
>>     org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>         at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>         at
>>     org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>         at
>>     org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>
>>     2012-08-13 00:59:38,163 INFO
>>     org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>     /************************************************************
>>     SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>     ************************************************************/
>>
>>
>
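
One more hint hiding in the quoted TaskTracker log: the tracker registers as tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850, i.e. it advertises a loopback address for itself, which usually traces back to the node's own hostname resolving to a 127.x entry. A sketch for spotting such entries; the helper and the sample file content below are illustrative assumptions, not the cluster's real /etc/hosts:

```python
import socket

def loopback_mappings(hosts_text):
    """Return {hostname: address} for names mapped to a loopback
    address (127.x.x.x) in /etc/hosts-style text."""
    found = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments/blanks
        if not line:
            continue
        addr, *names = line.split()
        if addr.startswith("127."):
            for name in names:
                found[name] = addr
    return found

# Hypothetical hosts file for a worker node:
sample = """
127.0.0.1     localhost
127.0.1.1     its-cs133.its.uni-kassel.de its-cs133
141.51.205.41 its-cs131
"""
print(loopback_mappings(sample))
# A node whose own hostname appears on a 127.x line will advertise
# itself as localhost/127.0.0.1, as in the TaskTracker log above.
```

Comparing `socket.getfqdn()` on each node against the output of this check (run on the real /etc/hosts) would show whether the resolver, rather than the network, is behind the localhost registration.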


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Hi Michael,

well i can ssh from any node to any other without being prompted. The 
reason for this is, that my home dir is mounted in every server in the 
cluster.

If the machines are multihomed: i dont know. i could ask if this would 
be of importance.

Shall i?

Regards,
Elmar

Am 13.08.12 14:59, schrieb Michael Segel:
> If the nodes can communicate and distribute data, then the odds are 
> that the issue isn't going to be in his /etc/hosts.
>
> A more relevant question is if he's running a firewall on each of 
> these machines?
>
> A simple test... ssh to one node, ping other nodes and the control 
> nodes at random to see if they can see one another. Then check to see 
> if there is a firewall running which would limit the types of traffic 
> between nodes.
>
> One other side note... are these machines multi-homed?
>
> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com 
> <ma...@gmail.com>> wrote:
>
>> Hello there,
>>
>>      Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek 
>> <macek@cs.uni-kassel.de <ma...@cs.uni-kassel.de>> wrote:
>>
>>     Hi,
>>
>>     i am currently trying to run my hadoop program on a cluster.
>>     Sadly though my datanodes and tasktrackers seem to have
>>     difficulties with their communication as their logs say:
>>     * Some datanodes and tasktrackers seem to have portproblems of
>>     some kind as it can be seen in the logs below. I wondered if this
>>     might be due to reasons correllated with the localhost entry in
>>     /etc/hosts as you can read in alot of posts with similar errors,
>>     but i checked the file neither localhost nor 127.0.0.1/127.0.1.1
>>     <http://127.0.0.1/127.0.1.1> is bound there. (although you can
>>     ping localhost... the technician of the cluster said he'd be
>>     looking for the mechanics resolving localhost)
>>     * The other nodes can not speak with the namenode and jobtracker
>>     (its-cs131). Although it is absolutely not clear, why this is
>>     happening: the "dfs -put" i do directly before the job is running
>>     fine, which seems to imply that communication between those
>>     servers is working flawlessly.
>>
>>     Is there any reason why this might happen?
>>
>>
>>     Regards,
>>     Elmar
>>
>>     LOGS BELOW:
>>
>>     \____Datanodes
>>
>>     After successfully putting the data to hdfs (at this point i
>>     thought namenode and datanodes have to communicate), i get the
>>     following errors when starting the job:
>>
>>     There are 2 kinds of logs i found: the first one is big (about
>>     12MB) and looks like this:
>>     ############################### LOG TYPE 1
>>     ############################################################
>>     2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>     2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>     2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>     2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>     2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>     2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>     2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>     2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>     2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>     2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>     2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>         at $Proxy5.sendHeartbeat(Unknown Source)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>         at java.lang.Thread.run(Thread.java:619)
>>     Caused by: java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>         ... 5 more
>>
>>     ... (this continues until the end of the log)
>>
>>     The second is the short kind:
>>     ########################### LOG TYPE 2
>>     ############################################################
>>     2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>     /************************************************************
>>     STARTUP_MSG: Starting DataNode
>>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>     STARTUP_MSG:   args = []
>>     STARTUP_MSG:   version = 1.0.2
>>     STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>     ************************************************************/
>>     2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>     2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>     2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>     2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>     2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>     2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>     2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>     2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>     2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>     2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>     2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>     2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>     2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>         at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>     Caused by: java.net.BindException: Address already in use
>>         at sun.nio.ch.Net.bind(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>         at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>         ... 7 more
>>
>>     2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>     /************************************************************
>>     SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>     ************************************************************/
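The BindException at the end of this log is a separate problem from the namenode connectivity: something on its-cs133 already holds port 50010, typically a DataNode left over from an earlier, half-failed start. The failure mode itself is easy to reproduce with two plain sockets (a sketch, not Hadoop code; any free port stands in for 50010):

```python
import errno
import socket

# First socket stands in for the stale process already bound to the port.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))        # kernel picks a free port
port = first.getsockname()[1]
first.listen(1)

# Second socket is what the restarted DataNode attempts.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
in_use = False
try:
    second.bind(("127.0.0.1", port))
except OSError as e:
    in_use = (e.errno == errno.EADDRINUSE)  # "Address already in use"
finally:
    second.close()
    first.close()
```

On the node itself, `netstat -tlnp | grep 50010` or `lsof -i :50010` shows which process holds the port; stopping the stale daemon before restarting usually clears this error.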
>>
>>
>>
>>
>>
>>     \_____TaskTracker
>>     With the TaskTrackers it is the same: there are 2 kinds.
>>     ############################### LOG TYPE 1
>>     ############################################################
>>     2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>     2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>     2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>     2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>     2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>     2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>     2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>     2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>     2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>     2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>     2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>     2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>         at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>     Caused by: java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>         ... 6 more
>>
>>
>>     ########################### LOG TYPE 2
>>     ############################################################
>>     2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>     /************************************************************
>>     STARTUP_MSG: Starting TaskTracker
>>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>     STARTUP_MSG:   args = []
>>     STARTUP_MSG:   version = 1.0.2
>>     STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>     ************************************************************/
>>     2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>     2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>     2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>     2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>     2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>     2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>     2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>     2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>     2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>     2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>     2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>     2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>     2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>     2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>     2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>     2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>     2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>     2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>     2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>     2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>     2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>     2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>     2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>     2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>     2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>     2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>         at sun.nio.ch.Net.bind(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>         at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>         at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>
>>     2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>     /************************************************************
>>     SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>     ************************************************************/
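One line in this second TaskTracker log stands out: "TaskTracker up at: localhost/127.0.0.1:54850". The tracker registered itself under a loopback address, which the jobtracker can never connect back to, and which points at the /etc/hosts question raised in the original post. A minimal check of whether a worker's own hostname resolves only to loopback, as a Python 3 sketch run on the worker itself:

```python
import socket

# Does this node's fully qualified hostname resolve to a real address,
# or only to 127.x (as the "localhost/127.0.0.1:54850" log line suggests)?
host = socket.getfqdn()
try:
    addrs = sorted({ai[4][0] for ai in
                    socket.getaddrinfo(host, None, socket.AF_INET)})
except socket.gaierror:
    addrs = []  # hostname does not resolve at all

if not addrs:
    print(f"{host} does not resolve; check /etc/hosts and DNS")
elif all(a.startswith("127.") for a in addrs):
    print(f"{host} resolves only to loopback {addrs}; check /etc/hosts")
else:
    print(f"{host} resolves to {addrs}")
```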
>>
>>
>


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Hi Michael,

Well, I can ssh from any node to any other without being prompted for a 
password; my home directory is mounted on every server in the cluster.

Whether the machines are multihomed, I don't know. I could ask, if that 
is of importance.

Shall I?
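In the meantime, a small socket probe run from one of the workers should tell a "connection refused" (no listener on the port, or a rejecting firewall) apart from a silent drop (a filtering firewall or dead route). A minimal Python sketch; the host and ports in the usage comment are the ones from my logs:

```python
import socket

def probe(host, port, timeout=3.0):
    """Rough diagnosis of host:port, mirroring what Hadoop's ipc.Client
    experiences when it logs 'Retrying connect to server'."""
    try:
        addr = socket.gethostbyname(host)  # does the name resolve at all?
    except socket.gaierror:
        return "unresolvable"
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((addr, port))
        return "open"
    except ConnectionRefusedError:
        return "refused"        # host up, no listener (or firewall REJECT)
    except (socket.timeout, OSError):
        return "filtered/unreachable"  # firewall DROP, routing, or host down
    finally:
        s.close()

# Run on each worker, e.g.:
# print(probe("its-cs131", 35554))  # namenode RPC port from the logs
# print(probe("its-cs131", 35555))  # jobtracker RPC port from the logs
```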

Regards,
Elmar

On 13.08.12 14:59, Michael Segel wrote:
> If the nodes can communicate and distribute data, then the odds are 
> that the issue isn't going to be in his /etc/hosts.
>
> A more relevant question is if he's running a firewall on each of 
> these machines?
>
> A simple test... ssh to one node, ping other nodes and the control 
> nodes at random to see if they can see one another. Then check to see 
> if there is a firewall running which would limit the types of traffic 
> between nodes.
>
> One other side note... are these machines multi-homed?
>
> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>
>> Hello there,
>>
>>      Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek 
>> <macek@cs.uni-kassel.de> wrote:
>>
>>     Hi,
>>
>>     I am currently trying to run my Hadoop program on a cluster.
>>     Sadly, though, my datanodes and tasktrackers seem to have
>>     difficulties communicating, as their logs show:
>>     * Some datanodes and tasktrackers seem to have port problems of
>>     some kind, as can be seen in the logs below. I wondered if this
>>     might be related to the localhost entry in /etc/hosts, as you can
>>     read in a lot of posts with similar errors, but I checked the
>>     file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there.
>>     (Although you can ping localhost... the technician of the cluster
>>     said he'd look into the mechanism resolving localhost.)
>>     * The other nodes cannot talk to the namenode and jobtracker
>>     (its-cs131), although it is absolutely unclear why this happens:
>>     the "dfs -put" I do directly before the job runs fine, which
>>     seems to imply that communication between those servers is
>>     working flawlessly.
>>
>>     Is there any reason why this might happen?
>>
>>
>>     Regards,
>>     Elmar
>>
>>     LOGS BELOW:
>>
>>     \____Datanodes
>>
>>     After successfully putting the data to hdfs (at this point i
>>     thought namenode and datanodes have to communicate), i get the
>>     following errors when starting the job:
>>
>>     There are 2 kinds of logs i found: the first one is big (about
>>     12MB) and looks like this:
>>     ############################### LOG TYPE 1
>>     ############################################################
>>     2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 0 time(s).
>>     2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 1 time(s).
>>     2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 2 time(s).
>>     2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 3 time(s).
>>     2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 4 time(s).
>>     2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 5 time(s).
>>     2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 6 time(s).
>>     2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 7 time(s).
>>     2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 8 time(s).
>>     2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 9 time(s).
>>     2012-08-13 08:23:36,335 WARN
>>     org.apache.hadoop.hdfs.server.datanode.DataNode:
>>     java.net.ConnectException: Call to its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/> failed on connection exception:
>>     java.net.ConnectException: Connection refused
>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>         at $Proxy5.sendHeartbeat(Unknown Source)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>         at java.lang.Thread.run(Thread.java:619)
>>     Caused by: java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at
>>     sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.hadoop.net
>>     <http://org.apache.hadoop.net/>.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net
>>     <http://org.apache.hadoop.net/>.NetUtils.connect(NetUtils.java:489)
>>         at
>>     org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>         at
>>     org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>         at
>>     org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>         ... 5 more
>>
>>     ... (this continues til the end of the log)
>>
>>     The second is short kind:
>>     ########################### LOG TYPE 2
>>     ############################################################
>>     2012-08-13 00:59:19,038 INFO
>>     org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>     /************************************************************
>>     STARTUP_MSG: Starting DataNode
>>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>     <http://its-cs133.its.uni-kassel.de/141.51.205.43>
>>     STARTUP_MSG:   args = []
>>     STARTUP_MSG:   version = 1.0.2
>>     STARTUP_MSG:   build =
>>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>     ************************************************************/
>>     2012-08-13 00:59:19,203 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>>     from hadoop-metrics2.properties
>>     2012-08-13 00:59:19,216 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source MetricsSystem,sub=Stats registered.
>>     2012-08-13 00:59:19,217 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>>     snapshot period at 10 second(s).
>>     2012-08-13 00:59:19,218 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode
>>     metrics system started
>>     2012-08-13 00:59:19,306 INFO
>>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>>     source ugi registered.
>>     2012-08-13 00:59:19,346 INFO
>>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>>     library
>>     2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35554
>>     <http://141.51.205.41:35554/>. Already tried 0 time(s).
>>     2012-08-13 00:59:21,584 INFO
>>     org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>>     /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>     2012-08-13 00:59:21,584 INFO
>>     org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>     2012-08-13 00:59:21,787 INFO
>>     org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>>     FSDatasetStatusMBean
>>     2012-08-13 00:59:21,897 INFO
>>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>     Shutting down all async disk service threads...
>>     2012-08-13 00:59:21,897 INFO
>>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>>     All async disk service threads have been shut down.
>>     2012-08-13 00:59:21,898 ERROR
>>     org.apache.hadoop.hdfs.server.datanode.DataNode:
>>     java.net.BindException: Problem binding to /0.0.0.0:50010
>>     <http://0.0.0.0:50010/> : Address already in use
>>         at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>         at
>>     org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>     Caused by: java.net.BindException: Address already in use
>>         at sun.nio.ch.Net.bind(Native Method)
>>         at sun.nio.ch
>>     <http://sun.nio.ch/>.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>         at sun.nio.ch
>>     <http://sun.nio.ch/>.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>         at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>         ... 7 more
>>
>>     2012-08-13 00:59:21,899 INFO
>>     org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>     /************************************************************
>>     SHUTDOWN_MSG: Shutting down DataNode at
>>     its-cs133.its.uni-kassel.de/141.51.205.43
>>     <http://its-cs133.its.uni-kassel.de/141.51.205.43>
>>     ************************************************************/
>>
>>
>>
>>
>>
>>     \_____TastTracker
>>     With TaskTrackers it is the same: there are 2 kinds.
>>     ############################### LOG TYPE 1
>>     ############################################################
>>     2012-08-13 02:09:54,645 INFO
>>     org.apache.hadoop.mapred.TaskTracker: Resending 'status' to
>>     'its-cs131' with reponseId '879
>>     2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35555
>>     <http://141.51.205.41:35555/>. Already tried 0 time(s).
>>     2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35555
>>     <http://141.51.205.41:35555/>. Already tried 1 time(s).
>>     2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35555
>>     <http://141.51.205.41:35555/>. Already tried 2 time(s).
>>     2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35555
>>     <http://141.51.205.41:35555/>. Already tried 3 time(s).
>>     2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35555
>>     <http://141.51.205.41:35555/>. Already tried 4 time(s).
>>     2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35555
>>     <http://141.51.205.41:35555/>. Already tried 5 time(s).
>>     2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client:
>>     Retrying connect to server: its-cs131/141.51.205.41:35555
>>     Already tried 6 time(s).
>>     2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>     2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>     2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>     2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>         at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>     Caused by: java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>         ... 6 more
>>
>>
>>     ########################### LOG TYPE 2 ############################################################
>>     2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>     /************************************************************
>>     STARTUP_MSG: Starting TaskTracker
>>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>     STARTUP_MSG:   args = []
>>     STARTUP_MSG:   version = 1.0.2
>>     STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>     ************************************************************/
>>     2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>     2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>     2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>     2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>     2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>     2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>     2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>     2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>     2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>     2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>     2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>     2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>     2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>     2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>     2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>     2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>     2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>     2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>     2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>     2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>     2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>     2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>     2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>     2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>     2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>     2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>         at sun.nio.ch.Net.bind(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>         at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>         at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>
>>     2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>     /************************************************************
>>     SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>     ************************************************************/
>>
>>
>


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Hi Michael,

Well, I can ssh from any node to any other without being prompted. The 
reason for this is that my home dir is mounted on every server in the 
cluster.

Whether the machines are multi-homed: I don't know. I could ask if that 
would be of importance.

Shall I?

Regards,
Elmar
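
For what it's worth, the two exceptions in the logs are generic socket-level conditions and can be reproduced outside Hadoop entirely. A minimal, self-contained Python sketch (ports are picked dynamically; nothing here is Hadoop-specific):

```python
import socket

# "Address already in use": a second process binding a port that is
# still held, as when a leftover DataNode/TaskTracker occupies 50010/50060.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))           # let the OS pick a free port
port = first.getsockname()[1]
first.listen(1)

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))   # same port -> EADDRINUSE
except OSError as exc:
    bind_error = exc                   # e.g. [Errno 98] Address already in use
finally:
    second.close()

# "Connection refused": connecting to a port with no listener, as when
# the JobTracker/NameNode RPC service is not (yet) up on its port.
first.close()                          # nothing listens on `port` anymore
try:
    socket.create_connection(("127.0.0.1", port), timeout=2)
    connect_error = None
except ConnectionRefusedError:
    connect_error = "Connection refused"

print(bind_error.strerror)
print(connect_error)
```

If either condition shows up on a node, the fix is operational (kill the stale process holding the port, or start/unblock the service on the target host) rather than anything in the job itself.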

On 13.08.12 14:59, Michael Segel wrote:
> If the nodes can communicate and distribute data, then the odds are 
> that the issue isn't going to be in his /etc/hosts.
>
> A more relevant question is if he's running a firewall on each of 
> these machines?
>
> A simple test... ssh to one node, ping other nodes and the control 
> nodes at random to see if they can see one another. Then check to see 
> if there is a firewall running which would limit the types of traffic 
> between nodes.
>
> One other side note... are these machines multi-homed?
>
> On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <dontariq@gmail.com> wrote:
>
>> Hello there,
>>
>>      Could you please share your /etc/hosts file, if you don't mind.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek 
>> <macek@cs.uni-kassel.de> wrote:
>>
>>     Hi,
>>
>>     I am currently trying to run my Hadoop program on a cluster. Sadly,
>>     my datanodes and tasktrackers seem to have difficulties
>>     communicating, as their logs show:
>>     * Some datanodes and tasktrackers seem to have port problems of
>>     some kind, as can be seen in the logs below. I wondered if this
>>     might be correlated with the localhost entry in /etc/hosts, as
>>     suggested in a lot of posts with similar errors, but I checked the
>>     file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there.
>>     (Although you can ping localhost... the technician of the cluster
>>     said he'd look into how localhost gets resolved.)
>>     * The other nodes cannot talk to the namenode and jobtracker
>>     (its-cs131), although it is absolutely not clear why this happens:
>>     the "dfs -put" I issue directly before the job finishes without
>>     errors, which seems to imply that communication between those
>>     servers works flawlessly.
>>
>>     Is there any reason why this might happen?
>>
>>
>>     Regards,
>>     Elmar
>>
>>     LOGS BELOW:
>>
>>     \____Datanodes
>>
>>     After successfully putting the data into HDFS (at which point I
>>     assume the namenode and datanodes have to communicate), I get the
>>     following errors when starting the job:
>>
>>     There are 2 kinds of logs I found: the first one is big (about
>>     12MB) and looks like this:
>>     ############################### LOG TYPE 1 ############################################################
>>     2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>     2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>>     2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>>     2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>>     2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>>     2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>>     2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>>     2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>>     2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>>     2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>>     2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>         at $Proxy5.sendHeartbeat(Unknown Source)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>>         at java.lang.Thread.run(Thread.java:619)
>>     Caused by: java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>         ... 5 more
>>
>>     ... (this continues til the end of the log)
>>
>>     The second is the short kind:
>>     ########################### LOG TYPE 2 ############################################################
>>     2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>>     /************************************************************
>>     STARTUP_MSG: Starting DataNode
>>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>     STARTUP_MSG:   args = []
>>     STARTUP_MSG:   version = 1.0.2
>>     STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>     ************************************************************/
>>     2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>     2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>     2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>     2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
>>     2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>     2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>     2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>>     2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>>     2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>>     2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
>>     2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
>>     2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
>>     2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>>         at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>>         at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>>     Caused by: java.net.BindException: Address already in use
>>         at sun.nio.ch.Net.bind(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>         at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>>         ... 7 more
>>
>>     2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>>     /************************************************************
>>     SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>>     ************************************************************/
>>
>>
>>
>>
>>
>>     \_____TaskTracker
>>     With the TaskTrackers it is the same: there are 2 kinds.
>>     ############################### LOG TYPE 1 ############################################################
>>     2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
>>     2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>     2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>>     2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>>     2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>>     2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>>     2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>>     2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>>     2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>>     2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>>     2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>>     2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>         at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>>     Caused by: java.net.ConnectException: Connection refused
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>>         ... 6 more
>>
>>
>>     ########################### LOG TYPE 2 ############################################################
>>     2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
>>     /************************************************************
>>     STARTUP_MSG: Starting TaskTracker
>>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>>     STARTUP_MSG:   args = []
>>     STARTUP_MSG:   version = 1.0.2
>>     STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>>     ************************************************************/
>>     2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
>>     2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
>>     2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>>     2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>>     2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>>     2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>>     2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>>     2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>     2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>>     2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>>     2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>>     2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>>     2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>>     2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>>     2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>>     2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>>     2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>     2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>>     2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>>     2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>>     2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>>     2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>>     2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>>     2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>>     2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>>     2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>>         at sun.nio.ch.Net.bind(Native Method)
>>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>>         at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>>         at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>>
>>     2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>>     /************************************************************
>>     SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>>     ************************************************************/
>>
>>
>


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 

A more relevant question is if he's running a firewall on each of these machines? 

A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 

One other side note... are these machines multi-homed?
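
The test described above (can each node reach the control nodes, and is anything blocking the JobTracker/NameNode RPC ports?) can also be scripted rather than done by hand. A minimal sketch using only the Python standard library; the host/port pairs are placeholders, which for this cluster would be its-cs131's RPC ports 35554 and 35555:

```python
import socket

def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a plain TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused (host reachable, but nothing listening or firewall REJECT)"
    except socket.timeout:
        return "timeout (host unreachable or firewall DROP)"
    except socket.gaierror:
        return "hostname does not resolve"
    except OSError as exc:
        return f"error: {exc.strerror}"

if __name__ == "__main__":
    # Placeholder targets; substitute the cluster's actual nodes/ports.
    for host, port in [("its-cs131", 35554), ("its-cs131", 35555)]:
        print(f"{host}:{port} -> {probe(host, port)}")
```

Run from each worker node, "refused" or "timeout" against the master's RPC ports would point at exactly the firewall or service-not-listening situation suspected here, while "hostname does not resolve" would point back at name resolution.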

On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello there,
> 
>      Could you please share your /etc/hosts file, if you don't mind.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
> [...]
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 5 more
> 
> ... (this continues until the end of the log)
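A minimal reachability probe, assuming the retries above mean the NameNode RPC port cannot be reached from this worker. The host and port (its-cs131:35554) are taken from the log lines; the bash-only /dev/tcp redirection avoids needing any extra tools on the node:

```shell
# Probe a TCP port via bash's /dev/tcp; "closed" covers both
# "connection refused" and an unresolvable hostname.
probe() {
    ( exec 3<>"/dev/tcp/$1/$2" ) 2>/dev/null && echo open || echo closed
}

probe its-cs131 35554   # NameNode RPC address from the retry messages
```

If this prints closed while the NameNode process is running, the daemon is bound to a different interface/port than the workers expect, or a firewall is dropping the connection.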
> 
> The second kind is short:
> ########################### LOG TYPE 2 ############################################################
> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ************************************************************/
> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>     ... 7 more
> 
> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
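The BindException above means some process already holds 0.0.0.0:50010 when this DataNode starts. A small sketch for finding it (netstat flags vary by distribution, so treat the exact invocation as an assumption to adapt):

```shell
# Look for an existing listener on the DataNode port (50010 is the
# default dfs.datanode.address port shown in the error above).
PORT=50010
netstat -tln 2>/dev/null | grep ":$PORT " || echo "no listener found on $PORT"
# A DataNode JVM left over from an earlier run would also show up in jps:
jps -l 2>/dev/null | grep -i datanode || true
```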
> 
> 
> 
> 
> 
> \_____TaskTracker
> With the TaskTrackers it is the same: there are two kinds of logs.
> ############################### LOG TYPE 1 ############################################################
> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 6 more
> 
> 
> ########################### LOG TYPE 2 ############################################################
> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ************************************************************/
> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
> 
> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
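Both bind failures (50010 on the DataNode, 50060 on the TaskTracker web UI) point at daemons from a previous run that never shut down. A hypothetical cleanup sketch, assuming HADOOP_HOME points at the 1.0.2 install on each node:

```shell
# Print PIDs of any leftover DataNode/TaskTracker JVMs on this node.
stale_pids() { jps -l 2>/dev/null | awk '/DataNode|TaskTracker/ {print $1}'; }

if [ -n "${HADOOP_HOME:-}" ]; then
    "$HADOOP_HOME"/bin/stop-all.sh   # ask all daemons to shut down first
    sleep 5
    stale_pids                       # kill these by hand if any remain
    "$HADOOP_HOME"/bin/start-all.sh  # then bring the cluster back up
fi
```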
> 


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Sure I can, but it is long, as it is a cluster:


141.51.12.86  hrz-cs400.hrz.uni-kassel.de hrz-cs400

141.51.204.11 hrz-cs401.hrz.uni-kassel.de hrz-cs401
141.51.204.12 hrz-cs402.hrz.uni-kassel.de hrz-cs402
141.51.204.13 hrz-cs403.hrz.uni-kassel.de hrz-cs403
141.51.204.14 hrz-cs404.hrz.uni-kassel.de hrz-cs404
141.51.204.15 hrz-cs405.hrz.uni-kassel.de hrz-cs405
141.51.204.16 hrz-cs406.hrz.uni-kassel.de hrz-cs406
141.51.204.17 hrz-cs407.hrz.uni-kassel.de hrz-cs407
141.51.204.18 hrz-cs408.hrz.uni-kassel.de hrz-cs408
141.51.204.19 hrz-cs409.hrz.uni-kassel.de hrz-cs409
141.51.204.20 hrz-cs410.hrz.uni-kassel.de hrz-cs410
141.51.204.21 hrz-cs411.hrz.uni-kassel.de hrz-cs411
141.51.204.22 hrz-cs412.hrz.uni-kassel.de hrz-cs412
141.51.204.23 hrz-cs413.hrz.uni-kassel.de hrz-cs413
141.51.204.24 hrz-cs414.hrz.uni-kassel.de hrz-cs414
141.51.204.25 hrz-cs415.hrz.uni-kassel.de hrz-cs415
141.51.204.26 hrz-cs416.hrz.uni-kassel.de hrz-cs416
141.51.204.27 hrz-cs417.hrz.uni-kassel.de hrz-cs417
141.51.204.28 hrz-cs418.hrz.uni-kassel.de hrz-cs418
141.51.204.29 hrz-cs419.hrz.uni-kassel.de hrz-cs419
141.51.204.31 hrz-cs421.hrz.uni-kassel.de hrz-cs421
141.51.204.32 hrz-cs422.hrz.uni-kassel.de hrz-cs422
141.51.204.33 hrz-cs423.hrz.uni-kassel.de hrz-cs423
141.51.204.34 hrz-cs424.hrz.uni-kassel.de hrz-cs424
141.51.204.35 hrz-cs425.hrz.uni-kassel.de hrz-cs425
141.51.204.36 hrz-cs426.hrz.uni-kassel.de hrz-cs426
141.51.204.37 hrz-cs427.hrz.uni-kassel.de hrz-cs427
141.51.204.38 hrz-cs428.hrz.uni-kassel.de hrz-cs428
141.51.204.39 hrz-cs429.hrz.uni-kassel.de hrz-cs429
141.51.204.40 hrz-cs430.hrz.uni-kassel.de hrz-cs430
141.51.204.47 hrz-cs437.hrz.uni-kassel.de hrz-cs437
141.51.204.48 hrz-cs438.hrz.uni-kassel.de hrz-cs438
141.51.204.49 hrz-cs439.hrz.uni-kassel.de hrz-cs439
141.51.204.50 hrz-cs440.hrz.uni-kassel.de hrz-cs440
141.51.204.51 hrz-cs441.hrz.uni-kassel.de hrz-cs441
141.51.204.54 hrz-cs444.hrz.uni-kassel.de hrz-cs444
141.51.204.65 hrz-cs455.hrz.uni-kassel.de hrz-cs455
141.51.204.66 hrz-cs456.hrz.uni-kassel.de hrz-cs456
141.51.204.69 hrz-cs459.hrz.uni-kassel.de hrz-cs459
141.51.204.70 hrz-cs460.hrz.uni-kassel.de hrz-cs460
141.51.204.71 hrz-cs461.hrz.uni-kassel.de hrz-cs461
141.51.204.72 hrz-cs462.hrz.uni-kassel.de hrz-cs462
141.51.204.73 hrz-cs463.hrz.uni-kassel.de hrz-cs463
141.51.204.74 hrz-cs464.hrz.uni-kassel.de hrz-cs464
141.51.204.75 hrz-cs465.hrz.uni-kassel.de hrz-cs465
141.51.204.76 hrz-cs466.hrz.uni-kassel.de hrz-cs466
141.51.204.77 hrz-cs467.hrz.uni-kassel.de hrz-cs467
141.51.204.78 hrz-cs468.hrz.uni-kassel.de hrz-cs468
141.51.204.79 hrz-cs469.hrz.uni-kassel.de hrz-cs469
141.51.204.80 hrz-cs470.hrz.uni-kassel.de hrz-cs470
141.51.204.81 hrz-cs471.hrz.uni-kassel.de hrz-cs471
141.51.204.82 hrz-cs472.hrz.uni-kassel.de hrz-cs472
141.51.204.83 hrz-cs473.hrz.uni-kassel.de hrz-cs473
141.51.204.84 hrz-cs474.hrz.uni-kassel.de hrz-cs474
141.51.204.85 hrz-cs475.hrz.uni-kassel.de hrz-cs475
141.51.204.86 hrz-cs476.hrz.uni-kassel.de hrz-cs476
141.51.204.87 hrz-cs477.hrz.uni-kassel.de hrz-cs477
141.51.204.88 hrz-cs478.hrz.uni-kassel.de hrz-cs478
141.51.204.89 hrz-cs479.hrz.uni-kassel.de hrz-cs479
141.51.204.90 hrz-cs480.hrz.uni-kassel.de hrz-cs480
141.51.204.91 hrz-cs481.hrz.uni-kassel.de hrz-cs481
141.51.204.92 hrz-cs482.hrz.uni-kassel.de hrz-cs482
141.51.204.93 hrz-cs483.hrz.uni-kassel.de hrz-cs483
141.51.204.94 hrz-cs484.hrz.uni-kassel.de hrz-cs484
141.51.204.95 hrz-cs485.hrz.uni-kassel.de hrz-cs485
141.51.204.96 hrz-cs486.hrz.uni-kassel.de hrz-cs486
141.51.204.97 hrz-cs487.hrz.uni-kassel.de hrz-cs487
141.51.204.98 hrz-cs488.hrz.uni-kassel.de hrz-cs488
141.51.204.99 hrz-cs489.hrz.uni-kassel.de hrz-cs489
141.51.204.100 hrz-cs490.hrz.uni-kassel.de hrz-cs490
141.51.204.101 hrz-cs491.hrz.uni-kassel.de hrz-cs491
141.51.204.102 hrz-cs492.hrz.uni-kassel.de hrz-cs492
141.51.204.103 hrz-cs493.hrz.uni-kassel.de hrz-cs493
141.51.204.104 hrz-cs494.hrz.uni-kassel.de hrz-cs494
141.51.204.105 hrz-cs495.hrz.uni-kassel.de hrz-cs495
141.51.204.106 hrz-cs496.hrz.uni-kassel.de hrz-cs496
141.51.204.107 hrz-cs497.hrz.uni-kassel.de hrz-cs497
141.51.204.108 hrz-cs498.hrz.uni-kassel.de hrz-cs498
141.51.204.109 hrz-cs499.hrz.uni-kassel.de hrz-cs499
141.51.204.110 hrz-cs500.hrz.uni-kassel.de hrz-cs500
141.51.204.111 hrz-cs501.hrz.uni-kassel.de hrz-cs501
141.51.204.112 hrz-cs502.hrz.uni-kassel.de hrz-cs502
141.51.204.113 hrz-cs503.hrz.uni-kassel.de hrz-cs503
141.51.204.114 hrz-cs504.hrz.uni-kassel.de hrz-cs504
141.51.204.115 hrz-cs505.hrz.uni-kassel.de hrz-cs505
141.51.204.116 hrz-cs506.hrz.uni-kassel.de hrz-cs506
141.51.204.117 hrz-cs507.hrz.uni-kassel.de hrz-cs507
141.51.204.118 hrz-cs508.hrz.uni-kassel.de hrz-cs508
141.51.204.119 hrz-cs509.hrz.uni-kassel.de hrz-cs509
141.51.204.120 hrz-cs510.hrz.uni-kassel.de hrz-cs510
141.51.204.121 hrz-cs511.hrz.uni-kassel.de hrz-cs511
141.51.204.122 hrz-cs512.hrz.uni-kassel.de hrz-cs512
141.51.204.123 hrz-cs513.hrz.uni-kassel.de hrz-cs513
141.51.204.124 hrz-cs514.hrz.uni-kassel.de hrz-cs514
141.51.204.125 hrz-cs515.hrz.uni-kassel.de hrz-cs515
141.51.204.126 hrz-cs516.hrz.uni-kassel.de hrz-cs516
141.51.204.127 hrz-cs517.hrz.uni-kassel.de hrz-cs517
141.51.204.128 hrz-cs518.hrz.uni-kassel.de hrz-cs518
141.51.204.129 hrz-cs519.hrz.uni-kassel.de hrz-cs519
141.51.204.130 hrz-cs520.hrz.uni-kassel.de hrz-cs520
141.51.204.131 hrz-cs521.hrz.uni-kassel.de hrz-cs521
141.51.204.132 hrz-cs522.hrz.uni-kassel.de hrz-cs522
141.51.204.133 hrz-cs523.hrz.uni-kassel.de hrz-cs523
141.51.204.134 hrz-cs524.hrz.uni-kassel.de hrz-cs524
141.51.204.135 hrz-cs525.hrz.uni-kassel.de hrz-cs525
141.51.204.136 hrz-cs526.hrz.uni-kassel.de hrz-cs526
141.51.204.137 hrz-cs527.hrz.uni-kassel.de hrz-cs527
141.51.204.138 hrz-cs528.hrz.uni-kassel.de hrz-cs528
141.51.204.139 hrz-cs529.hrz.uni-kassel.de hrz-cs529
141.51.204.140 hrz-cs530.hrz.uni-kassel.de hrz-cs530
141.51.204.141 hrz-cs531.hrz.uni-kassel.de hrz-cs531
141.51.204.142 hrz-cs532.hrz.uni-kassel.de hrz-cs532
141.51.204.143 hrz-cs533.hrz.uni-kassel.de hrz-cs533
141.51.204.144 hrz-cs534.hrz.uni-kassel.de hrz-cs534
141.51.204.145 hrz-cs535.hrz.uni-kassel.de hrz-cs535
141.51.204.146 hrz-cs536.hrz.uni-kassel.de hrz-cs536
141.51.204.147 hrz-cs537.hrz.uni-kassel.de hrz-cs537
141.51.204.148 hrz-cs538.hrz.uni-kassel.de hrz-cs538
141.51.204.149 hrz-cs539.hrz.uni-kassel.de hrz-cs539
141.51.204.150 hrz-cs540.hrz.uni-kassel.de hrz-cs540
141.51.204.151 hrz-cs541.hrz.uni-kassel.de hrz-cs541
141.51.204.152 hrz-cs542.hrz.uni-kassel.de hrz-cs542
141.51.204.153 hrz-cs543.hrz.uni-kassel.de hrz-cs543
141.51.204.154 hrz-cs544.hrz.uni-kassel.de hrz-cs544
141.51.204.155 hrz-cs545.hrz.uni-kassel.de hrz-cs545
141.51.204.156 hrz-cs546.hrz.uni-kassel.de hrz-cs546
141.51.204.157 hrz-cs547.hrz.uni-kassel.de hrz-cs547
141.51.204.158 hrz-cs548.hrz.uni-kassel.de hrz-cs548
141.51.204.159 hrz-cs549.hrz.uni-kassel.de hrz-cs549
141.51.204.160 hrz-cs550.hrz.uni-kassel.de hrz-cs550
141.51.204.161 hrz-cs551.hrz.uni-kassel.de hrz-cs551
141.51.204.162 hrz-cs552.hrz.uni-kassel.de hrz-cs552
141.51.204.163 hrz-cs553.hrz.uni-kassel.de hrz-cs553
141.51.204.164 hrz-cs554.hrz.uni-kassel.de hrz-cs554
141.51.204.165 hrz-cs555.hrz.uni-kassel.de hrz-cs555
141.51.204.166 hrz-cs556.hrz.uni-kassel.de hrz-cs556
141.51.204.167 hrz-cs557.hrz.uni-kassel.de hrz-cs557
141.51.204.168 hrz-cs558.hrz.uni-kassel.de hrz-cs558
141.51.204.169 hrz-cs559.hrz.uni-kassel.de hrz-cs559
141.51.204.215 hrz-cs560.hrz.uni-kassel.de hrz-cs560
141.51.204.216 hrz-cs561.hrz.uni-kassel.de hrz-cs561
141.51.204.217 hrz-cs562.hrz.uni-kassel.de hrz-cs562
141.51.204.218 hrz-cs563.hrz.uni-kassel.de hrz-cs563
141.51.204.219 hrz-cs564.hrz.uni-kassel.de hrz-cs564
141.51.204.220 hrz-cs565.hrz.uni-kassel.de hrz-cs565
141.51.204.221 hrz-cs566.hrz.uni-kassel.de hrz-cs566
141.51.204.222 hrz-cs567.hrz.uni-kassel.de hrz-cs567
141.51.204.223 hrz-cs568.hrz.uni-kassel.de hrz-cs568
141.51.204.224 hrz-cs569.hrz.uni-kassel.de hrz-cs569
141.51.204.225 hrz-cs570.hrz.uni-kassel.de hrz-cs570
141.51.204.226 hrz-cs571.hrz.uni-kassel.de hrz-cs571
141.51.204.227 hrz-cs572.hrz.uni-kassel.de hrz-cs572
141.51.204.228 hrz-cs573.hrz.uni-kassel.de hrz-cs573
141.51.204.229 hrz-cs574.hrz.uni-kassel.de hrz-cs574
141.51.204.230 hrz-cs575.hrz.uni-kassel.de hrz-cs575
141.51.204.231 hrz-cs576.hrz.uni-kassel.de hrz-cs576
141.51.204.232 hrz-cs577.hrz.uni-kassel.de hrz-cs577
141.51.204.233 hrz-cs578.hrz.uni-kassel.de hrz-cs578
141.51.204.234 hrz-cs579.hrz.uni-kassel.de hrz-cs579
141.51.204.235 hrz-cs580.hrz.uni-kassel.de hrz-cs580
141.51.204.236 hrz-cs581.hrz.uni-kassel.de hrz-cs581
141.51.204.237 hrz-cs582.hrz.uni-kassel.de hrz-cs582
141.51.204.238 hrz-cs583.hrz.uni-kassel.de hrz-cs583
141.51.204.239 hrz-cs584.hrz.uni-kassel.de hrz-cs584
141.51.204.240 hrz-cs585.hrz.uni-kassel.de hrz-cs585
141.51.204.241 hrz-cs586.hrz.uni-kassel.de hrz-cs586
141.51.204.242 hrz-cs587.hrz.uni-kassel.de hrz-cs587
141.51.204.243 hrz-cs588.hrz.uni-kassel.de hrz-cs588
141.51.204.244 hrz-cs589.hrz.uni-kassel.de hrz-cs589
141.51.204.245 hrz-cs590.hrz.uni-kassel.de hrz-cs590
141.51.204.246 hrz-cs591.hrz.uni-kassel.de hrz-cs591
141.51.204.247 hrz-cs592.hrz.uni-kassel.de hrz-cs592
141.51.204.248 hrz-cs593.hrz.uni-kassel.de hrz-cs593
141.51.204.249 hrz-cs594.hrz.uni-kassel.de hrz-cs594
141.51.204.250 hrz-cs595.hrz.uni-kassel.de hrz-cs595
141.51.204.251 hrz-cs596.hrz.uni-kassel.de hrz-cs596
141.51.204.252 hrz-cs597.hrz.uni-kassel.de hrz-cs597
141.51.204.253 hrz-cs598.hrz.uni-kassel.de hrz-cs598
141.51.204.254 hrz-cs599.hrz.uni-kassel.de hrz-cs599


192.168.204.11 hrz-no401.hrz.uni-kassel.de hrz-no401
192.168.204.12 hrz-no402.hrz.uni-kassel.de hrz-no402
192.168.204.13 hrz-no403.hrz.uni-kassel.de hrz-no403
192.168.204.14 hrz-no404.hrz.uni-kassel.de hrz-no404
192.168.204.15 hrz-no405.hrz.uni-kassel.de hrz-no405
192.168.204.16 hrz-no406.hrz.uni-kassel.de hrz-no406
192.168.204.17 hrz-no407.hrz.uni-kassel.de hrz-no407
192.168.204.18 hrz-no408.hrz.uni-kassel.de hrz-no408
192.168.204.19 hrz-no409.hrz.uni-kassel.de hrz-no409
192.168.204.20 hrz-no410.hrz.uni-kassel.de hrz-no410
192.168.204.21 hrz-no411.hrz.uni-kassel.de hrz-no411
192.168.204.22 hrz-no412.hrz.uni-kassel.de hrz-no412
192.168.204.23 hrz-no413.hrz.uni-kassel.de hrz-no413
192.168.204.24 hrz-no414.hrz.uni-kassel.de hrz-no414
192.168.204.25 hrz-no415.hrz.uni-kassel.de hrz-no415
192.168.204.26 hrz-no416.hrz.uni-kassel.de hrz-no416
192.168.204.27 hrz-no417.hrz.uni-kassel.de hrz-no417
192.168.204.28 hrz-no418.hrz.uni-kassel.de hrz-no418
192.168.204.29 hrz-no419.hrz.uni-kassel.de hrz-no419
192.168.204.31 hrz-no421.hrz.uni-kassel.de hrz-no421
192.168.204.32 hrz-no422.hrz.uni-kassel.de hrz-no422
192.168.204.33 hrz-no423.hrz.uni-kassel.de hrz-no423
192.168.204.34 hrz-no424.hrz.uni-kassel.de hrz-no424
192.168.204.35 hrz-no425.hrz.uni-kassel.de hrz-no425
192.168.204.36 hrz-no426.hrz.uni-kassel.de hrz-no426
192.168.204.37 hrz-no427.hrz.uni-kassel.de hrz-no427
192.168.204.38 hrz-no428.hrz.uni-kassel.de hrz-no428
192.168.204.39 hrz-no429.hrz.uni-kassel.de hrz-no429
192.168.204.40 hrz-no430.hrz.uni-kassel.de hrz-no430
192.168.204.47 hrz-no437.hrz.uni-kassel.de hrz-no437
192.168.204.48 hrz-no438.hrz.uni-kassel.de hrz-no438
192.168.204.49 hrz-no439.hrz.uni-kassel.de hrz-no439
192.168.204.50 hrz-no440.hrz.uni-kassel.de hrz-no440
192.168.204.51 hrz-no441.hrz.uni-kassel.de hrz-no441
192.168.204.54 hrz-no444.hrz.uni-kassel.de hrz-no444
192.168.204.65 hrz-no455.hrz.uni-kassel.de hrz-no455
192.168.204.66 hrz-no456.hrz.uni-kassel.de hrz-no456
192.168.204.69 hrz-no459.hrz.uni-kassel.de hrz-no459
192.168.204.70 hrz-no460.hrz.uni-kassel.de hrz-no460
192.168.204.71 hrz-no461.hrz.uni-kassel.de hrz-no461
192.168.204.72 hrz-no462.hrz.uni-kassel.de hrz-no462
192.168.204.73 hrz-no463.hrz.uni-kassel.de hrz-no463
192.168.204.74 hrz-no464.hrz.uni-kassel.de hrz-no464
192.168.204.75 hrz-no465.hrz.uni-kassel.de hrz-no465
192.168.204.76 hrz-no466.hrz.uni-kassel.de hrz-no466
192.168.204.77 hrz-no467.hrz.uni-kassel.de hrz-no467
192.168.204.78 hrz-no468.hrz.uni-kassel.de hrz-no468
192.168.204.79 hrz-no469.hrz.uni-kassel.de hrz-no469
192.168.204.80 hrz-no470.hrz.uni-kassel.de hrz-no470
192.168.204.81 hrz-no471.hrz.uni-kassel.de hrz-no471
192.168.204.82 hrz-no472.hrz.uni-kassel.de hrz-no472
192.168.204.83 hrz-no473.hrz.uni-kassel.de hrz-no473
192.168.204.84 hrz-no474.hrz.uni-kassel.de hrz-no474
192.168.204.85 hrz-no475.hrz.uni-kassel.de hrz-no475
192.168.204.86 hrz-no476.hrz.uni-kassel.de hrz-no476
192.168.204.87 hrz-no477.hrz.uni-kassel.de hrz-no477
192.168.204.88 hrz-no478.hrz.uni-kassel.de hrz-no478
192.168.204.89 hrz-no479.hrz.uni-kassel.de hrz-no479
192.168.204.90 hrz-no480.hrz.uni-kassel.de hrz-no480
192.168.204.91 hrz-no481.hrz.uni-kassel.de hrz-no481
192.168.204.92 hrz-no482.hrz.uni-kassel.de hrz-no482
192.168.204.93 hrz-no483.hrz.uni-kassel.de hrz-no483
192.168.204.94 hrz-no484.hrz.uni-kassel.de hrz-no484
192.168.204.95 hrz-no485.hrz.uni-kassel.de hrz-no485
192.168.204.96 hrz-no486.hrz.uni-kassel.de hrz-no486
192.168.204.97 hrz-no487.hrz.uni-kassel.de hrz-no487
192.168.204.98 hrz-no488.hrz.uni-kassel.de hrz-no488
192.168.204.99 hrz-no489.hrz.uni-kassel.de hrz-no489
192.168.204.100 hrz-no490.hrz.uni-kassel.de hrz-no490
192.168.204.101 hrz-no491.hrz.uni-kassel.de hrz-no491
192.168.204.102 hrz-no492.hrz.uni-kassel.de hrz-no492
192.168.204.103 hrz-no493.hrz.uni-kassel.de hrz-no493
192.168.204.104 hrz-no494.hrz.uni-kassel.de hrz-no494
192.168.204.105 hrz-no495.hrz.uni-kassel.de hrz-no495
192.168.204.106 hrz-no496.hrz.uni-kassel.de hrz-no496
192.168.204.107 hrz-no497.hrz.uni-kassel.de hrz-no497
192.168.204.108 hrz-no498.hrz.uni-kassel.de hrz-no498
192.168.204.109 hrz-no499.hrz.uni-kassel.de hrz-no499
192.168.204.110 hrz-no500.hrz.uni-kassel.de hrz-no500
192.168.204.111 hrz-no501.hrz.uni-kassel.de hrz-no501
192.168.204.112 hrz-no502.hrz.uni-kassel.de hrz-no502
192.168.204.113 hrz-no503.hrz.uni-kassel.de hrz-no503
192.168.204.114 hrz-no504.hrz.uni-kassel.de hrz-no504
192.168.204.115 hrz-no505.hrz.uni-kassel.de hrz-no505
192.168.204.116 hrz-no506.hrz.uni-kassel.de hrz-no506
192.168.204.117 hrz-no507.hrz.uni-kassel.de hrz-no507
192.168.204.118 hrz-no508.hrz.uni-kassel.de hrz-no508
192.168.204.119 hrz-no509.hrz.uni-kassel.de hrz-no509
192.168.204.120 hrz-no510.hrz.uni-kassel.de hrz-no510
192.168.204.121 hrz-no511.hrz.uni-kassel.de hrz-no511
192.168.204.122 hrz-no512.hrz.uni-kassel.de hrz-no512
192.168.204.123 hrz-no513.hrz.uni-kassel.de hrz-no513
192.168.204.124 hrz-no514.hrz.uni-kassel.de hrz-no514
192.168.204.125 hrz-no515.hrz.uni-kassel.de hrz-no515
192.168.204.126 hrz-no516.hrz.uni-kassel.de hrz-no516
192.168.204.127 hrz-no517.hrz.uni-kassel.de hrz-no517
192.168.204.128 hrz-no518.hrz.uni-kassel.de hrz-no518
192.168.204.129 hrz-no519.hrz.uni-kassel.de hrz-no519
192.168.204.130 hrz-no520.hrz.uni-kassel.de hrz-no520
192.168.204.131 hrz-no521.hrz.uni-kassel.de hrz-no521
192.168.204.132 hrz-no522.hrz.uni-kassel.de hrz-no522
192.168.204.133 hrz-no523.hrz.uni-kassel.de hrz-no523
192.168.204.134 hrz-no524.hrz.uni-kassel.de hrz-no524
192.168.204.135 hrz-no525.hrz.uni-kassel.de hrz-no525
192.168.204.136 hrz-no526.hrz.uni-kassel.de hrz-no526
192.168.204.137 hrz-no527.hrz.uni-kassel.de hrz-no527
192.168.204.138 hrz-no528.hrz.uni-kassel.de hrz-no528
192.168.204.139 hrz-no529.hrz.uni-kassel.de hrz-no529
192.168.204.140 hrz-no530.hrz.uni-kassel.de hrz-no530
192.168.204.141 hrz-no531.hrz.uni-kassel.de hrz-no531
192.168.204.142 hrz-no532.hrz.uni-kassel.de hrz-no532
192.168.204.143 hrz-no533.hrz.uni-kassel.de hrz-no533
192.168.204.144 hrz-no534.hrz.uni-kassel.de hrz-no534
192.168.204.145 hrz-no535.hrz.uni-kassel.de hrz-no535
192.168.204.146 hrz-no536.hrz.uni-kassel.de hrz-no536
192.168.204.147 hrz-no537.hrz.uni-kassel.de hrz-no537
192.168.204.148 hrz-no538.hrz.uni-kassel.de hrz-no538
192.168.204.149 hrz-no539.hrz.uni-kassel.de hrz-no539
192.168.204.150 hrz-no540.hrz.uni-kassel.de hrz-no540
192.168.204.151 hrz-no541.hrz.uni-kassel.de hrz-no541
192.168.204.152 hrz-no542.hrz.uni-kassel.de hrz-no542
192.168.204.153 hrz-no543.hrz.uni-kassel.de hrz-no543
192.168.204.154 hrz-no544.hrz.uni-kassel.de hrz-no544
192.168.204.155 hrz-no545.hrz.uni-kassel.de hrz-no545
192.168.204.156 hrz-no546.hrz.uni-kassel.de hrz-no546
192.168.204.157 hrz-no547.hrz.uni-kassel.de hrz-no547
192.168.204.158 hrz-no548.hrz.uni-kassel.de hrz-no548
192.168.204.159 hrz-no549.hrz.uni-kassel.de hrz-no549
192.168.204.160 hrz-no550.hrz.uni-kassel.de hrz-no550
192.168.204.161 hrz-no551.hrz.uni-kassel.de hrz-no551
192.168.204.162 hrz-no552.hrz.uni-kassel.de hrz-no552
192.168.204.163 hrz-no553.hrz.uni-kassel.de hrz-no553
192.168.204.164 hrz-no554.hrz.uni-kassel.de hrz-no554
192.168.204.165 hrz-no555.hrz.uni-kassel.de hrz-no555
192.168.204.166 hrz-no556.hrz.uni-kassel.de hrz-no556
192.168.204.167 hrz-no557.hrz.uni-kassel.de hrz-no557
192.168.204.168 hrz-no558.hrz.uni-kassel.de hrz-no558
192.168.204.169 hrz-no559.hrz.uni-kassel.de hrz-no559
192.168.204.215 hrz-no560.hrz.uni-kassel.de hrz-no560
192.168.204.216 hrz-no561.hrz.uni-kassel.de hrz-no561
192.168.204.217 hrz-no562.hrz.uni-kassel.de hrz-no562
192.168.204.218 hrz-no563.hrz.uni-kassel.de hrz-no563
192.168.204.219 hrz-no564.hrz.uni-kassel.de hrz-no564
192.168.204.220 hrz-no565.hrz.uni-kassel.de hrz-no565
192.168.204.221 hrz-no566.hrz.uni-kassel.de hrz-no566
192.168.204.222 hrz-no567.hrz.uni-kassel.de hrz-no567
192.168.204.223 hrz-no568.hrz.uni-kassel.de hrz-no568
192.168.204.224 hrz-no569.hrz.uni-kassel.de hrz-no569
192.168.204.225 hrz-no570.hrz.uni-kassel.de hrz-no570
192.168.204.226 hrz-no571.hrz.uni-kassel.de hrz-no571
192.168.204.227 hrz-no572.hrz.uni-kassel.de hrz-no572
192.168.204.228 hrz-no573.hrz.uni-kassel.de hrz-no573
192.168.204.229 hrz-no574.hrz.uni-kassel.de hrz-no574
192.168.204.230 hrz-no575.hrz.uni-kassel.de hrz-no575
192.168.204.231 hrz-no576.hrz.uni-kassel.de hrz-no576
192.168.204.232 hrz-no577.hrz.uni-kassel.de hrz-no577
192.168.204.233 hrz-no578.hrz.uni-kassel.de hrz-no578
192.168.204.234 hrz-no579.hrz.uni-kassel.de hrz-no579
192.168.204.235 hrz-no580.hrz.uni-kassel.de hrz-no580
192.168.204.236 hrz-no581.hrz.uni-kassel.de hrz-no581
192.168.204.237 hrz-no582.hrz.uni-kassel.de hrz-no582
192.168.204.238 hrz-no583.hrz.uni-kassel.de hrz-no583
192.168.204.239 hrz-no584.hrz.uni-kassel.de hrz-no584
192.168.204.240 hrz-no585.hrz.uni-kassel.de hrz-no585
192.168.204.241 hrz-no586.hrz.uni-kassel.de hrz-no586
192.168.204.242 hrz-no587.hrz.uni-kassel.de hrz-no587
192.168.204.243 hrz-no588.hrz.uni-kassel.de hrz-no588
192.168.204.244 hrz-no589.hrz.uni-kassel.de hrz-no589
192.168.204.245 hrz-no590.hrz.uni-kassel.de hrz-no590
192.168.204.246 hrz-no591.hrz.uni-kassel.de hrz-no591
192.168.204.247 hrz-no592.hrz.uni-kassel.de hrz-no592
192.168.204.248 hrz-no593.hrz.uni-kassel.de hrz-no593
192.168.204.249 hrz-no594.hrz.uni-kassel.de hrz-no594
192.168.204.250 hrz-no595.hrz.uni-kassel.de hrz-no595
192.168.204.251 hrz-no596.hrz.uni-kassel.de hrz-no596
192.168.204.252 hrz-no597.hrz.uni-kassel.de hrz-no597
192.168.204.253 hrz-no598.hrz.uni-kassel.de hrz-no598
192.168.204.254 hrz-no599.hrz.uni-kassel.de hrz-no599

141.51.204.190    hrz-gc100.hrz.uni-kassel.de hrz-gc100
141.51.204.191    hrz-gc101.hrz.uni-kassel.de hrz-gc101
141.51.204.192    hrz-gc102.hrz.uni-kassel.de hrz-gc102
141.51.204.193    hrz-gc103.hrz.uni-kassel.de hrz-gc103
141.51.204.194    hrz-gc104.hrz.uni-kassel.de hrz-gc104
141.51.204.195    hrz-gc105.hrz.uni-kassel.de hrz-gc105
141.51.204.196    hrz-gc106.hrz.uni-kassel.de hrz-gc106
141.51.204.197    hrz-gc107.hrz.uni-kassel.de hrz-gc107
141.51.204.198    hrz-gc108.hrz.uni-kassel.de hrz-gc108
141.51.204.199    hrz-gc109.hrz.uni-kassel.de hrz-gc109
141.51.204.200    hrz-gc110.hrz.uni-kassel.de hrz-gc110
141.51.204.201    hrz-gc111.hrz.uni-kassel.de hrz-gc111
141.51.204.202    hrz-gc112.hrz.uni-kassel.de hrz-gc112
141.51.204.203    hrz-gc113.hrz.uni-kassel.de hrz-gc113
141.51.204.204    hrz-gc114.hrz.uni-kassel.de hrz-gc114
141.51.204.205    hrz-gc115.hrz.uni-kassel.de hrz-gc115
141.51.204.206    hrz-gc116.hrz.uni-kassel.de hrz-gc116
141.51.204.207    hrz-gc117.hrz.uni-kassel.de hrz-gc117
141.51.204.208    hrz-gc118.hrz.uni-kassel.de hrz-gc118
141.51.204.209    hrz-gc119.hrz.uni-kassel.de hrz-gc119
141.51.204.210    hrz-gc120.hrz.uni-kassel.de hrz-gc120

# New cluster
141.51.204.30 its-cs1.its.uni-kassel.de its-cs1
141.51.204.170 its-cs10.its.uni-kassel.de its-cs10
141.51.204.171 its-cs11.its.uni-kassel.de its-cs11
141.51.204.172 its-cs12.its.uni-kassel.de its-cs12
141.51.204.173 its-cs13.its.uni-kassel.de its-cs13
141.51.204.174 its-cs14.its.uni-kassel.de its-cs14
141.51.204.175 its-cs15.its.uni-kassel.de its-cs15
141.51.204.176 its-cs16.its.uni-kassel.de its-cs16
141.51.204.177 its-cs17.its.uni-kassel.de its-cs17
141.51.204.178 its-cs18.its.uni-kassel.de its-cs18
141.51.204.179 its-cs19.its.uni-kassel.de its-cs19
141.51.205.10 its-cs100.its.uni-kassel.de its-cs100
141.51.205.11 its-cs101.its.uni-kassel.de its-cs101
141.51.205.12 its-cs102.its.uni-kassel.de its-cs102
141.51.205.13 its-cs103.its.uni-kassel.de its-cs103
141.51.205.14 its-cs104.its.uni-kassel.de its-cs104
141.51.205.15 its-cs105.its.uni-kassel.de its-cs105
141.51.205.16 its-cs106.its.uni-kassel.de its-cs106
141.51.205.17 its-cs107.its.uni-kassel.de its-cs107
141.51.205.18 its-cs108.its.uni-kassel.de its-cs108
141.51.205.19 its-cs109.its.uni-kassel.de its-cs109
141.51.205.20 its-cs110.its.uni-kassel.de its-cs110
141.51.205.21 its-cs111.its.uni-kassel.de its-cs111
141.51.205.22 its-cs112.its.uni-kassel.de its-cs112
141.51.205.23 its-cs113.its.uni-kassel.de its-cs113
141.51.205.24 its-cs114.its.uni-kassel.de its-cs114
141.51.205.25 its-cs115.its.uni-kassel.de its-cs115
141.51.205.26 its-cs116.its.uni-kassel.de its-cs116
141.51.205.27 its-cs117.its.uni-kassel.de its-cs117
141.51.205.28 its-cs118.its.uni-kassel.de its-cs118
141.51.205.29 its-cs119.its.uni-kassel.de its-cs119
141.51.205.30 its-cs120.its.uni-kassel.de its-cs120
141.51.205.31 its-cs121.its.uni-kassel.de its-cs121
141.51.205.32 its-cs122.its.uni-kassel.de its-cs122
141.51.205.33 its-cs123.its.uni-kassel.de its-cs123
141.51.205.34 its-cs124.its.uni-kassel.de its-cs124
141.51.205.35 its-cs125.its.uni-kassel.de its-cs125
141.51.205.36 its-cs126.its.uni-kassel.de its-cs126
141.51.205.37 its-cs127.its.uni-kassel.de its-cs127
141.51.205.38 its-cs128.its.uni-kassel.de its-cs128
141.51.205.39 its-cs129.its.uni-kassel.de its-cs129
141.51.205.40 its-cs130.its.uni-kassel.de its-cs130
141.51.205.41 its-cs131.its.uni-kassel.de its-cs131
141.51.205.42 its-cs132.its.uni-kassel.de its-cs132
141.51.205.43 its-cs133.its.uni-kassel.de its-cs133
141.51.205.44 its-cs134.its.uni-kassel.de its-cs134
141.51.205.45 its-cs135.its.uni-kassel.de its-cs135
141.51.205.46 its-cs136.its.uni-kassel.de its-cs136
141.51.205.47 its-cs137.its.uni-kassel.de its-cs137
141.51.205.48 its-cs138.its.uni-kassel.de its-cs138
141.51.205.49 its-cs139.its.uni-kassel.de its-cs139
141.51.205.50 its-cs140.its.uni-kassel.de its-cs140
141.51.205.51 its-cs141.its.uni-kassel.de its-cs141
141.51.205.52 its-cs142.its.uni-kassel.de its-cs142
141.51.205.53 its-cs143.its.uni-kassel.de its-cs143
141.51.205.54 its-cs144.its.uni-kassel.de its-cs144
141.51.205.55 its-cs145.its.uni-kassel.de its-cs145
141.51.205.56 its-cs146.its.uni-kassel.de its-cs146
141.51.205.57 its-cs147.its.uni-kassel.de its-cs147
141.51.205.58 its-cs148.its.uni-kassel.de its-cs148
141.51.205.59 its-cs149.its.uni-kassel.de its-cs149
141.51.205.60 its-cs150.its.uni-kassel.de its-cs150
141.51.205.61 its-cs151.its.uni-kassel.de its-cs151
141.51.205.62 its-cs152.its.uni-kassel.de its-cs152
141.51.205.63 its-cs153.its.uni-kassel.de its-cs153
141.51.205.64 its-cs154.its.uni-kassel.de its-cs154
141.51.205.65 its-cs155.its.uni-kassel.de its-cs155
141.51.205.66 its-cs156.its.uni-kassel.de its-cs156
141.51.205.67 its-cs157.its.uni-kassel.de its-cs157
141.51.205.68 its-cs158.its.uni-kassel.de its-cs158
141.51.205.69 its-cs159.its.uni-kassel.de its-cs159
141.51.205.70 its-cs160.its.uni-kassel.de its-cs160
141.51.205.71 its-cs161.its.uni-kassel.de its-cs161
141.51.205.72 its-cs162.its.uni-kassel.de its-cs162
141.51.205.73 its-cs163.its.uni-kassel.de its-cs163
141.51.205.74 its-cs164.its.uni-kassel.de its-cs164
141.51.205.75 its-cs165.its.uni-kassel.de its-cs165
141.51.205.76 its-cs166.its.uni-kassel.de its-cs166
141.51.205.77 its-cs167.its.uni-kassel.de its-cs167
141.51.205.78 its-cs168.its.uni-kassel.de its-cs168
141.51.205.79 its-cs169.its.uni-kassel.de its-cs169
141.51.205.80 its-cs170.its.uni-kassel.de its-cs170
141.51.205.81 its-cs171.its.uni-kassel.de its-cs171
141.51.205.82 its-cs172.its.uni-kassel.de its-cs172
141.51.205.83 its-cs173.its.uni-kassel.de its-cs173
141.51.205.84 its-cs174.its.uni-kassel.de its-cs174
141.51.205.85 its-cs175.its.uni-kassel.de its-cs175
141.51.205.86 its-cs176.its.uni-kassel.de its-cs176
141.51.205.87 its-cs177.its.uni-kassel.de its-cs177
141.51.205.88 its-cs178.its.uni-kassel.de its-cs178
141.51.205.89 its-cs179.its.uni-kassel.de its-cs179
141.51.205.90 its-cs180.its.uni-kassel.de its-cs180
141.51.205.91 its-cs181.its.uni-kassel.de its-cs181
141.51.205.92 its-cs182.its.uni-kassel.de its-cs182
141.51.205.93 its-cs183.its.uni-kassel.de its-cs183
141.51.205.94 its-cs184.its.uni-kassel.de its-cs184
141.51.205.95 its-cs185.its.uni-kassel.de its-cs185
141.51.205.96 its-cs186.its.uni-kassel.de its-cs186
141.51.205.97 its-cs187.its.uni-kassel.de its-cs187
141.51.205.98 its-cs188.its.uni-kassel.de its-cs188
141.51.205.99 its-cs189.its.uni-kassel.de its-cs189
141.51.205.100 its-cs190.its.uni-kassel.de its-cs190
141.51.205.101 its-cs191.its.uni-kassel.de its-cs191
141.51.205.102 its-cs192.its.uni-kassel.de its-cs192
141.51.205.103 its-cs193.its.uni-kassel.de its-cs193
141.51.205.104 its-cs194.its.uni-kassel.de its-cs194
141.51.205.105 its-cs195.its.uni-kassel.de its-cs195
141.51.205.106 its-cs196.its.uni-kassel.de its-cs196
141.51.205.107 its-cs197.its.uni-kassel.de its-cs197
141.51.205.108 its-cs198.its.uni-kassel.de its-cs198
141.51.205.109 its-cs199.its.uni-kassel.de its-cs199
141.51.205.110 its-cs200.its.uni-kassel.de its-cs200
141.51.205.111 its-cs201.its.uni-kassel.de its-cs201
141.51.205.112 its-cs202.its.uni-kassel.de its-cs202
141.51.205.113 its-cs203.its.uni-kassel.de its-cs203
141.51.205.114 its-cs204.its.uni-kassel.de its-cs204
141.51.205.115 its-cs205.its.uni-kassel.de its-cs205
141.51.205.116 its-cs206.its.uni-kassel.de its-cs206
141.51.205.117 its-cs207.its.uni-kassel.de its-cs207
141.51.205.118 its-cs208.its.uni-kassel.de its-cs208
141.51.205.119 its-cs209.its.uni-kassel.de its-cs209
141.51.205.120 its-cs210.its.uni-kassel.de its-cs210
141.51.205.121 its-cs211.its.uni-kassel.de its-cs211
141.51.205.122 its-cs212.its.uni-kassel.de its-cs212
141.51.205.123 its-cs213.its.uni-kassel.de its-cs213
141.51.205.124 its-cs214.its.uni-kassel.de its-cs214
141.51.205.125 its-cs215.its.uni-kassel.de its-cs215
141.51.205.126 its-cs216.its.uni-kassel.de its-cs216
141.51.205.127 its-cs217.its.uni-kassel.de its-cs217
141.51.205.128 its-cs218.its.uni-kassel.de its-cs218
141.51.205.129 its-cs219.its.uni-kassel.de its-cs219
141.51.205.130 its-cs220.its.uni-kassel.de its-cs220
141.51.205.131 its-cs221.its.uni-kassel.de its-cs221
141.51.205.132 its-cs222.its.uni-kassel.de its-cs222
141.51.205.133 its-cs223.its.uni-kassel.de its-cs223
141.51.205.134 its-cs224.its.uni-kassel.de its-cs224
141.51.205.135 its-cs225.its.uni-kassel.de its-cs225
141.51.205.136 its-cs226.its.uni-kassel.de its-cs226
141.51.205.137 its-cs227.its.uni-kassel.de its-cs227
141.51.205.138 its-cs228.its.uni-kassel.de its-cs228
141.51.205.139 its-cs229.its.uni-kassel.de its-cs229
141.51.205.140 its-cs230.its.uni-kassel.de its-cs230
141.51.205.141 its-cs231.its.uni-kassel.de its-cs231
141.51.205.142 its-cs232.its.uni-kassel.de its-cs232
141.51.205.143 its-cs233.its.uni-kassel.de its-cs233
141.51.205.144 its-cs234.its.uni-kassel.de its-cs234
141.51.205.145 its-cs235.its.uni-kassel.de its-cs235
141.51.205.146 its-cs236.its.uni-kassel.de its-cs236
141.51.205.147 its-cs237.its.uni-kassel.de its-cs237
141.51.205.148 its-cs238.its.uni-kassel.de its-cs238
141.51.205.149 its-cs239.its.uni-kassel.de its-cs239
141.51.205.150 its-cs240.its.uni-kassel.de its-cs240
141.51.205.151 its-cs241.its.uni-kassel.de its-cs241
141.51.205.152 its-cs242.its.uni-kassel.de its-cs242
141.51.205.153 its-cs243.its.uni-kassel.de its-cs243
141.51.205.154 its-cs244.its.uni-kassel.de its-cs244
141.51.205.155 its-cs245.its.uni-kassel.de its-cs245
141.51.205.156 its-cs246.its.uni-kassel.de its-cs246
141.51.205.157 its-cs247.its.uni-kassel.de its-cs247
141.51.205.158 its-cs248.its.uni-kassel.de its-cs248
141.51.205.159 its-cs249.its.uni-kassel.de its-cs249
141.51.205.160 its-cs250.its.uni-kassel.de its-cs250
141.51.205.161 its-cs251.its.uni-kassel.de its-cs251
141.51.205.162 its-cs252.its.uni-kassel.de its-cs252
141.51.205.163 its-cs253.its.uni-kassel.de its-cs253
141.51.205.164 its-cs254.its.uni-kassel.de its-cs254
141.51.205.165 its-cs255.its.uni-kassel.de its-cs255
141.51.205.166 its-cs256.its.uni-kassel.de its-cs256
141.51.205.167 its-cs257.its.uni-kassel.de its-cs257
141.51.205.168 its-cs258.its.uni-kassel.de its-cs258
141.51.205.169 its-cs259.its.uni-kassel.de its-cs259
141.51.205.170 its-cs260.its.uni-kassel.de its-cs260
141.51.205.171 its-cs261.its.uni-kassel.de its-cs261
141.51.205.172 its-cs262.its.uni-kassel.de its-cs262
141.51.205.173 its-cs263.its.uni-kassel.de its-cs263
141.51.205.174 its-cs264.its.uni-kassel.de its-cs264
141.51.205.175 its-cs265.its.uni-kassel.de its-cs265
141.51.205.176 its-cs266.its.uni-kassel.de its-cs266
141.51.205.177 its-cs267.its.uni-kassel.de its-cs267
141.51.205.178 its-cs268.its.uni-kassel.de its-cs268
141.51.205.179 its-cs269.its.uni-kassel.de its-cs269
141.51.205.180 its-cs270.its.uni-kassel.de its-cs270
141.51.205.181 its-cs271.its.uni-kassel.de its-cs271
141.51.205.182 its-cs272.its.uni-kassel.de its-cs272
141.51.205.183 its-cs273.its.uni-kassel.de its-cs273
141.51.205.184 its-cs274.its.uni-kassel.de its-cs274
141.51.205.185 its-cs275.its.uni-kassel.de its-cs275
141.51.205.186 its-cs276.its.uni-kassel.de its-cs276
141.51.205.187 its-cs277.its.uni-kassel.de its-cs277
141.51.205.188 its-cs278.its.uni-kassel.de its-cs278
141.51.205.189 its-cs279.its.uni-kassel.de its-cs279
141.51.205.190 its-cs280.its.uni-kassel.de its-cs280
141.51.205.191 its-cs281.its.uni-kassel.de its-cs281
141.51.205.192 its-cs282.its.uni-kassel.de its-cs282
141.51.205.193 its-cs283.its.uni-kassel.de its-cs283
141.51.205.194 its-cs284.its.uni-kassel.de its-cs284
141.51.205.195 its-cs285.its.uni-kassel.de its-cs285
141.51.205.196 its-cs286.its.uni-kassel.de its-cs286
141.51.205.197 its-cs287.its.uni-kassel.de its-cs287
141.51.205.198 its-cs288.its.uni-kassel.de its-cs288
141.51.205.199 its-cs289.its.uni-kassel.de its-cs289
141.51.205.200 its-cs290.its.uni-kassel.de its-cs290
141.51.205.201 its-cs291.its.uni-kassel.de its-cs291
141.51.205.202 its-cs292.its.uni-kassel.de its-cs292
141.51.205.203 its-cs293.its.uni-kassel.de its-cs293
141.51.205.204 its-cs294.its.uni-kassel.de its-cs294
141.51.205.205 its-cs295.its.uni-kassel.de its-cs295
141.51.205.206 its-cs296.its.uni-kassel.de its-cs296
141.51.205.207 its-cs297.its.uni-kassel.de its-cs297
141.51.205.208 its-cs298.its.uni-kassel.de its-cs298
141.51.205.209 its-cs299.its.uni-kassel.de its-cs299
192.168.204.30 its-no1.its.uni-kassel.de its-no1
192.168.204.170 its-no10.its.uni-kassel.de its-no10
192.168.204.171 its-no11.its.uni-kassel.de its-no11
192.168.204.172 its-no12.its.uni-kassel.de its-no12
192.168.204.173 its-no13.its.uni-kassel.de its-no13
192.168.204.174 its-no14.its.uni-kassel.de its-no14
192.168.204.175 its-no15.its.uni-kassel.de its-no15
192.168.204.176 its-no16.its.uni-kassel.de its-no16
192.168.204.177 its-no17.its.uni-kassel.de its-no17
192.168.204.178 its-no18.its.uni-kassel.de its-no18
192.168.204.179 its-no19.its.uni-kassel.de its-no19
192.168.205.10 its-no100.its.uni-kassel.de its-no100
192.168.205.11 its-no101.its.uni-kassel.de its-no101
192.168.205.12 its-no102.its.uni-kassel.de its-no102
192.168.205.13 its-no103.its.uni-kassel.de its-no103
192.168.205.14 its-no104.its.uni-kassel.de its-no104
192.168.205.15 its-no105.its.uni-kassel.de its-no105
192.168.205.16 its-no106.its.uni-kassel.de its-no106
192.168.205.17 its-no107.its.uni-kassel.de its-no107
192.168.205.18 its-no108.its.uni-kassel.de its-no108
192.168.205.19 its-no109.its.uni-kassel.de its-no109
192.168.205.20 its-no110.its.uni-kassel.de its-no110
192.168.205.21 its-no111.its.uni-kassel.de its-no111
192.168.205.22 its-no112.its.uni-kassel.de its-no112
192.168.205.23 its-no113.its.uni-kassel.de its-no113
192.168.205.24 its-no114.its.uni-kassel.de its-no114
192.168.205.25 its-no115.its.uni-kassel.de its-no115
192.168.205.26 its-no116.its.uni-kassel.de its-no116
192.168.205.27 its-no117.its.uni-kassel.de its-no117
192.168.205.28 its-no118.its.uni-kassel.de its-no118
192.168.205.29 its-no119.its.uni-kassel.de its-no119
192.168.205.30 its-no120.its.uni-kassel.de its-no120
192.168.205.31 its-no121.its.uni-kassel.de its-no121
192.168.205.32 its-no122.its.uni-kassel.de its-no122
192.168.205.33 its-no123.its.uni-kassel.de its-no123
192.168.205.34 its-no124.its.uni-kassel.de its-no124
192.168.205.35 its-no125.its.uni-kassel.de its-no125
192.168.205.36 its-no126.its.uni-kassel.de its-no126
192.168.205.37 its-no127.its.uni-kassel.de its-no127
192.168.205.38 its-no128.its.uni-kassel.de its-no128
192.168.205.39 its-no129.its.uni-kassel.de its-no129
192.168.205.40 its-no130.its.uni-kassel.de its-no130
192.168.205.41 its-no131.its.uni-kassel.de its-no131
192.168.205.42 its-no132.its.uni-kassel.de its-no132
192.168.205.43 its-no133.its.uni-kassel.de its-no133
192.168.205.44 its-no134.its.uni-kassel.de its-no134
192.168.205.45 its-no135.its.uni-kassel.de its-no135
192.168.205.46 its-no136.its.uni-kassel.de its-no136
192.168.205.47 its-no137.its.uni-kassel.de its-no137
192.168.205.48 its-no138.its.uni-kassel.de its-no138
192.168.205.49 its-no139.its.uni-kassel.de its-no139
192.168.205.50 its-no140.its.uni-kassel.de its-no140
192.168.205.51 its-no141.its.uni-kassel.de its-no141
192.168.205.52 its-no142.its.uni-kassel.de its-no142
192.168.205.53 its-no143.its.uni-kassel.de its-no143
192.168.205.54 its-no144.its.uni-kassel.de its-no144
192.168.205.55 its-no145.its.uni-kassel.de its-no145
192.168.205.56 its-no146.its.uni-kassel.de its-no146
192.168.205.57 its-no147.its.uni-kassel.de its-no147
192.168.205.58 its-no148.its.uni-kassel.de its-no148
192.168.205.59 its-no149.its.uni-kassel.de its-no149
192.168.205.60 its-no150.its.uni-kassel.de its-no150
192.168.205.61 its-no151.its.uni-kassel.de its-no151
192.168.205.62 its-no152.its.uni-kassel.de its-no152
192.168.205.63 its-no153.its.uni-kassel.de its-no153
192.168.205.64 its-no154.its.uni-kassel.de its-no154
192.168.205.65 its-no155.its.uni-kassel.de its-no155
192.168.205.66 its-no156.its.uni-kassel.de its-no156
192.168.205.67 its-no157.its.uni-kassel.de its-no157
192.168.205.68 its-no158.its.uni-kassel.de its-no158
192.168.205.69 its-no159.its.uni-kassel.de its-no159
192.168.205.70 its-no160.its.uni-kassel.de its-no160
192.168.205.71 its-no161.its.uni-kassel.de its-no161
192.168.205.72 its-no162.its.uni-kassel.de its-no162
192.168.205.73 its-no163.its.uni-kassel.de its-no163
192.168.205.74 its-no164.its.uni-kassel.de its-no164
192.168.205.75 its-no165.its.uni-kassel.de its-no165
192.168.205.76 its-no166.its.uni-kassel.de its-no166
192.168.205.77 its-no167.its.uni-kassel.de its-no167
192.168.205.78 its-no168.its.uni-kassel.de its-no168
192.168.205.79 its-no169.its.uni-kassel.de its-no169
192.168.205.80 its-no170.its.uni-kassel.de its-no170
192.168.205.81 its-no171.its.uni-kassel.de its-no171
192.168.205.82 its-no172.its.uni-kassel.de its-no172
192.168.205.83 its-no173.its.uni-kassel.de its-no173
192.168.205.84 its-no174.its.uni-kassel.de its-no174
192.168.205.85 its-no175.its.uni-kassel.de its-no175
192.168.205.86 its-no176.its.uni-kassel.de its-no176
192.168.205.87 its-no177.its.uni-kassel.de its-no177
192.168.205.88 its-no178.its.uni-kassel.de its-no178
192.168.205.89 its-no179.its.uni-kassel.de its-no179
192.168.205.90 its-no180.its.uni-kassel.de its-no180
192.168.205.91 its-no181.its.uni-kassel.de its-no181
192.168.205.92 its-no182.its.uni-kassel.de its-no182
192.168.205.93 its-no183.its.uni-kassel.de its-no183
192.168.205.94 its-no184.its.uni-kassel.de its-no184
192.168.205.95 its-no185.its.uni-kassel.de its-no185
192.168.205.96 its-no186.its.uni-kassel.de its-no186
192.168.205.97 its-no187.its.uni-kassel.de its-no187
192.168.205.98 its-no188.its.uni-kassel.de its-no188
192.168.205.99 its-no189.its.uni-kassel.de its-no189
192.168.205.100 its-no190.its.uni-kassel.de its-no190
192.168.205.101 its-no191.its.uni-kassel.de its-no191
192.168.205.102 its-no192.its.uni-kassel.de its-no192
192.168.205.103 its-no193.its.uni-kassel.de its-no193
192.168.205.104 its-no194.its.uni-kassel.de its-no194
192.168.205.105 its-no195.its.uni-kassel.de its-no195
192.168.205.106 its-no196.its.uni-kassel.de its-no196
192.168.205.107 its-no197.its.uni-kassel.de its-no197
192.168.205.108 its-no198.its.uni-kassel.de its-no198
192.168.205.109 its-no199.its.uni-kassel.de its-no199
192.168.205.110 its-no200.its.uni-kassel.de its-no200
192.168.205.111 its-no201.its.uni-kassel.de its-no201
192.168.205.112 its-no202.its.uni-kassel.de its-no202
192.168.205.113 its-no203.its.uni-kassel.de its-no203
192.168.205.114 its-no204.its.uni-kassel.de its-no204
192.168.205.115 its-no205.its.uni-kassel.de its-no205
192.168.205.116 its-no206.its.uni-kassel.de its-no206
192.168.205.117 its-no207.its.uni-kassel.de its-no207
192.168.205.118 its-no208.its.uni-kassel.de its-no208
192.168.205.119 its-no209.its.uni-kassel.de its-no209
192.168.205.120 its-no210.its.uni-kassel.de its-no210
192.168.205.121 its-no211.its.uni-kassel.de its-no211
192.168.205.122 its-no212.its.uni-kassel.de its-no212
192.168.205.123 its-no213.its.uni-kassel.de its-no213
192.168.205.124 its-no214.its.uni-kassel.de its-no214
192.168.205.125 its-no215.its.uni-kassel.de its-no215
192.168.205.126 its-no216.its.uni-kassel.de its-no216
192.168.205.127 its-no217.its.uni-kassel.de its-no217
192.168.205.128 its-no218.its.uni-kassel.de its-no218
192.168.205.129 its-no219.its.uni-kassel.de its-no219
192.168.205.130 its-no220.its.uni-kassel.de its-no220
192.168.205.131 its-no221.its.uni-kassel.de its-no221
192.168.205.132 its-no222.its.uni-kassel.de its-no222
192.168.205.133 its-no223.its.uni-kassel.de its-no223
192.168.205.134 its-no224.its.uni-kassel.de its-no224
192.168.205.135 its-no225.its.uni-kassel.de its-no225
192.168.205.136 its-no226.its.uni-kassel.de its-no226
192.168.205.137 its-no227.its.uni-kassel.de its-no227
192.168.205.138 its-no228.its.uni-kassel.de its-no228
192.168.205.139 its-no229.its.uni-kassel.de its-no229
192.168.205.140 its-no230.its.uni-kassel.de its-no230
192.168.205.141 its-no231.its.uni-kassel.de its-no231
192.168.205.142 its-no232.its.uni-kassel.de its-no232
192.168.205.143 its-no233.its.uni-kassel.de its-no233
192.168.205.144 its-no234.its.uni-kassel.de its-no234
192.168.205.145 its-no235.its.uni-kassel.de its-no235
192.168.205.146 its-no236.its.uni-kassel.de its-no236
192.168.205.147 its-no237.its.uni-kassel.de its-no237
192.168.205.148 its-no238.its.uni-kassel.de its-no238
192.168.205.149 its-no239.its.uni-kassel.de its-no239
192.168.205.150 its-no240.its.uni-kassel.de its-no240
192.168.205.151 its-no241.its.uni-kassel.de its-no241
192.168.205.152 its-no242.its.uni-kassel.de its-no242
192.168.205.153 its-no243.its.uni-kassel.de its-no243
192.168.205.154 its-no244.its.uni-kassel.de its-no244
192.168.205.155 its-no245.its.uni-kassel.de its-no245
192.168.205.156 its-no246.its.uni-kassel.de its-no246
192.168.205.157 its-no247.its.uni-kassel.de its-no247
192.168.205.158 its-no248.its.uni-kassel.de its-no248
192.168.205.159 its-no249.its.uni-kassel.de its-no249
192.168.205.160 its-no250.its.uni-kassel.de its-no250
192.168.205.161 its-no251.its.uni-kassel.de its-no251
192.168.205.162 its-no252.its.uni-kassel.de its-no252
192.168.205.163 its-no253.its.uni-kassel.de its-no253
192.168.205.164 its-no254.its.uni-kassel.de its-no254
192.168.205.165 its-no255.its.uni-kassel.de its-no255
192.168.205.166 its-no256.its.uni-kassel.de its-no256
192.168.205.167 its-no257.its.uni-kassel.de its-no257
192.168.205.168 its-no258.its.uni-kassel.de its-no258
192.168.205.169 its-no259.its.uni-kassel.de its-no259
192.168.205.170 its-no260.its.uni-kassel.de its-no260
192.168.205.171 its-no261.its.uni-kassel.de its-no261
192.168.205.172 its-no262.its.uni-kassel.de its-no262
192.168.205.173 its-no263.its.uni-kassel.de its-no263
192.168.205.174 its-no264.its.uni-kassel.de its-no264
192.168.205.175 its-no265.its.uni-kassel.de its-no265
192.168.205.176 its-no266.its.uni-kassel.de its-no266
192.168.205.177 its-no267.its.uni-kassel.de its-no267
192.168.205.178 its-no268.its.uni-kassel.de its-no268
192.168.205.179 its-no269.its.uni-kassel.de its-no269
192.168.205.180 its-no270.its.uni-kassel.de its-no270
192.168.205.181 its-no271.its.uni-kassel.de its-no271
192.168.205.182 its-no272.its.uni-kassel.de its-no272
192.168.205.183 its-no273.its.uni-kassel.de its-no273
192.168.205.184 its-no274.its.uni-kassel.de its-no274
192.168.205.185 its-no275.its.uni-kassel.de its-no275
192.168.205.186 its-no276.its.uni-kassel.de its-no276
192.168.205.187 its-no277.its.uni-kassel.de its-no277
192.168.205.188 its-no278.its.uni-kassel.de its-no278
192.168.205.189 its-no279.its.uni-kassel.de its-no279
192.168.205.190 its-no280.its.uni-kassel.de its-no280
192.168.205.191 its-no281.its.uni-kassel.de its-no281
192.168.205.192 its-no282.its.uni-kassel.de its-no282
192.168.205.193 its-no283.its.uni-kassel.de its-no283
192.168.205.194 its-no284.its.uni-kassel.de its-no284
192.168.205.195 its-no285.its.uni-kassel.de its-no285
192.168.205.196 its-no286.its.uni-kassel.de its-no286
192.168.205.197 its-no287.its.uni-kassel.de its-no287
192.168.205.198 its-no288.its.uni-kassel.de its-no288
192.168.205.199 its-no289.its.uni-kassel.de its-no289
192.168.205.200 its-no290.its.uni-kassel.de its-no290
192.168.205.201 its-no291.its.uni-kassel.de its-no291
192.168.205.202 its-no292.its.uni-kassel.de its-no292
192.168.205.203 its-no293.its.uni-kassel.de its-no293
192.168.205.204 its-no294.its.uni-kassel.de its-no294
192.168.205.205 its-no295.its.uni-kassel.de its-no295
192.168.205.206 its-no296.its.uni-kassel.de its-no296
192.168.205.207 its-no297.its.uni-kassel.de its-no297
192.168.205.208 its-no298.its.uni-kassel.de its-no298
192.168.205.209 its-no299.its.uni-kassel.de its-no299
192.168.168.30 its-ib1.its.uni-kassel.de its-ib1
192.168.168.170 its-ib10.its.uni-kassel.de its-ib10
192.168.168.171 its-ib11.its.uni-kassel.de its-ib11
192.168.168.172 its-ib12.its.uni-kassel.de its-ib12
192.168.168.173 its-ib13.its.uni-kassel.de its-ib13
192.168.168.174 its-ib14.its.uni-kassel.de its-ib14
192.168.168.175 its-ib15.its.uni-kassel.de its-ib15
192.168.168.176 its-ib16.its.uni-kassel.de its-ib16
192.168.168.177 its-ib17.its.uni-kassel.de its-ib17
192.168.168.178 its-ib18.its.uni-kassel.de its-ib18
192.168.168.179 its-ib19.its.uni-kassel.de its-ib19
192.168.169.10 its-ib100.its.uni-kassel.de its-ib100
192.168.169.11 its-ib101.its.uni-kassel.de its-ib101
192.168.169.12 its-ib102.its.uni-kassel.de its-ib102
192.168.169.13 its-ib103.its.uni-kassel.de its-ib103
192.168.169.14 its-ib104.its.uni-kassel.de its-ib104
192.168.169.15 its-ib105.its.uni-kassel.de its-ib105
192.168.169.16 its-ib106.its.uni-kassel.de its-ib106
192.168.169.17 its-ib107.its.uni-kassel.de its-ib107
192.168.169.18 its-ib108.its.uni-kassel.de its-ib108
192.168.169.19 its-ib109.its.uni-kassel.de its-ib109
192.168.169.20 its-ib110.its.uni-kassel.de its-ib110
192.168.169.21 its-ib111.its.uni-kassel.de its-ib111
192.168.169.22 its-ib112.its.uni-kassel.de its-ib112
192.168.169.23 its-ib113.its.uni-kassel.de its-ib113
192.168.169.24 its-ib114.its.uni-kassel.de its-ib114
192.168.169.25 its-ib115.its.uni-kassel.de its-ib115
192.168.169.26 its-ib116.its.uni-kassel.de its-ib116
192.168.169.27 its-ib117.its.uni-kassel.de its-ib117
192.168.169.28 its-ib118.its.uni-kassel.de its-ib118
192.168.169.29 its-ib119.its.uni-kassel.de its-ib119
192.168.169.30 its-ib120.its.uni-kassel.de its-ib120
192.168.169.31 its-ib121.its.uni-kassel.de its-ib121
192.168.169.32 its-ib122.its.uni-kassel.de its-ib122
192.168.169.33 its-ib123.its.uni-kassel.de its-ib123
192.168.169.34 its-ib124.its.uni-kassel.de its-ib124
192.168.169.35 its-ib125.its.uni-kassel.de its-ib125
192.168.169.36 its-ib126.its.uni-kassel.de its-ib126
192.168.169.37 its-ib127.its.uni-kassel.de its-ib127
192.168.169.38 its-ib128.its.uni-kassel.de its-ib128
192.168.169.39 its-ib129.its.uni-kassel.de its-ib129
192.168.169.40 its-ib130.its.uni-kassel.de its-ib130
192.168.169.41 its-ib131.its.uni-kassel.de its-ib131
192.168.169.42 its-ib132.its.uni-kassel.de its-ib132
192.168.169.43 its-ib133.its.uni-kassel.de its-ib133
192.168.169.44 its-ib134.its.uni-kassel.de its-ib134
192.168.169.45 its-ib135.its.uni-kassel.de its-ib135
192.168.169.46 its-ib136.its.uni-kassel.de its-ib136
192.168.169.47 its-ib137.its.uni-kassel.de its-ib137
192.168.169.48 its-ib138.its.uni-kassel.de its-ib138
192.168.169.49 its-ib139.its.uni-kassel.de its-ib139
192.168.169.50 its-ib140.its.uni-kassel.de its-ib140
192.168.169.51 its-ib141.its.uni-kassel.de its-ib141
192.168.169.52 its-ib142.its.uni-kassel.de its-ib142
192.168.169.53 its-ib143.its.uni-kassel.de its-ib143
192.168.169.54 its-ib144.its.uni-kassel.de its-ib144
192.168.169.55 its-ib145.its.uni-kassel.de its-ib145
192.168.169.56 its-ib146.its.uni-kassel.de its-ib146
192.168.169.57 its-ib147.its.uni-kassel.de its-ib147
192.168.169.58 its-ib148.its.uni-kassel.de its-ib148
192.168.169.59 its-ib149.its.uni-kassel.de its-ib149
192.168.169.60 its-ib150.its.uni-kassel.de its-ib150
192.168.169.61 its-ib151.its.uni-kassel.de its-ib151
192.168.169.62 its-ib152.its.uni-kassel.de its-ib152
192.168.169.63 its-ib153.its.uni-kassel.de its-ib153
192.168.169.64 its-ib154.its.uni-kassel.de its-ib154
192.168.169.65 its-ib155.its.uni-kassel.de its-ib155
192.168.169.66 its-ib156.its.uni-kassel.de its-ib156
192.168.169.67 its-ib157.its.uni-kassel.de its-ib157
192.168.169.68 its-ib158.its.uni-kassel.de its-ib158
192.168.169.69 its-ib159.its.uni-kassel.de its-ib159
192.168.169.70 its-ib160.its.uni-kassel.de its-ib160
192.168.169.71 its-ib161.its.uni-kassel.de its-ib161
192.168.169.72 its-ib162.its.uni-kassel.de its-ib162
192.168.169.73 its-ib163.its.uni-kassel.de its-ib163
192.168.169.74 its-ib164.its.uni-kassel.de its-ib164
192.168.169.75 its-ib165.its.uni-kassel.de its-ib165
192.168.169.76 its-ib166.its.uni-kassel.de its-ib166
192.168.169.77 its-ib167.its.uni-kassel.de its-ib167
192.168.169.78 its-ib168.its.uni-kassel.de its-ib168
192.168.169.79 its-ib169.its.uni-kassel.de its-ib169
192.168.169.80 its-ib170.its.uni-kassel.de its-ib170
192.168.169.81 its-ib171.its.uni-kassel.de its-ib171
192.168.169.82 its-ib172.its.uni-kassel.de its-ib172
192.168.169.83 its-ib173.its.uni-kassel.de its-ib173
192.168.169.84 its-ib174.its.uni-kassel.de its-ib174
192.168.169.85 its-ib175.its.uni-kassel.de its-ib175
192.168.169.86 its-ib176.its.uni-kassel.de its-ib176
192.168.169.87 its-ib177.its.uni-kassel.de its-ib177
192.168.169.88 its-ib178.its.uni-kassel.de its-ib178
192.168.169.89 its-ib179.its.uni-kassel.de its-ib179
192.168.169.90 its-ib180.its.uni-kassel.de its-ib180
192.168.169.91 its-ib181.its.uni-kassel.de its-ib181
192.168.169.92 its-ib182.its.uni-kassel.de its-ib182
192.168.169.93 its-ib183.its.uni-kassel.de its-ib183
192.168.169.94 its-ib184.its.uni-kassel.de its-ib184
192.168.169.95 its-ib185.its.uni-kassel.de its-ib185
192.168.169.96 its-ib186.its.uni-kassel.de its-ib186
192.168.169.97 its-ib187.its.uni-kassel.de its-ib187
192.168.169.98 its-ib188.its.uni-kassel.de its-ib188
192.168.169.99 its-ib189.its.uni-kassel.de its-ib189
192.168.169.100 its-ib190.its.uni-kassel.de its-ib190
192.168.169.101 its-ib191.its.uni-kassel.de its-ib191
192.168.169.102 its-ib192.its.uni-kassel.de its-ib192
192.168.169.103 its-ib193.its.uni-kassel.de its-ib193
192.168.169.104 its-ib194.its.uni-kassel.de its-ib194
192.168.169.105 its-ib195.its.uni-kassel.de its-ib195
192.168.169.106 its-ib196.its.uni-kassel.de its-ib196
192.168.169.107 its-ib197.its.uni-kassel.de its-ib197
192.168.169.108 its-ib198.its.uni-kassel.de its-ib198
192.168.169.109 its-ib199.its.uni-kassel.de its-ib199
192.168.169.110 its-ib200.its.uni-kassel.de its-ib200
192.168.169.111 its-ib201.its.uni-kassel.de its-ib201
192.168.169.112 its-ib202.its.uni-kassel.de its-ib202
192.168.169.113 its-ib203.its.uni-kassel.de its-ib203
192.168.169.114 its-ib204.its.uni-kassel.de its-ib204
192.168.169.115 its-ib205.its.uni-kassel.de its-ib205
192.168.169.116 its-ib206.its.uni-kassel.de its-ib206
192.168.169.117 its-ib207.its.uni-kassel.de its-ib207
192.168.169.118 its-ib208.its.uni-kassel.de its-ib208
192.168.169.119 its-ib209.its.uni-kassel.de its-ib209
192.168.169.120 its-ib210.its.uni-kassel.de its-ib210
192.168.169.121 its-ib211.its.uni-kassel.de its-ib211
192.168.169.122 its-ib212.its.uni-kassel.de its-ib212
192.168.169.123 its-ib213.its.uni-kassel.de its-ib213
192.168.169.124 its-ib214.its.uni-kassel.de its-ib214
192.168.169.125 its-ib215.its.uni-kassel.de its-ib215
192.168.169.126 its-ib216.its.uni-kassel.de its-ib216
192.168.169.127 its-ib217.its.uni-kassel.de its-ib217
192.168.169.128 its-ib218.its.uni-kassel.de its-ib218
192.168.169.129 its-ib219.its.uni-kassel.de its-ib219
192.168.169.130 its-ib220.its.uni-kassel.de its-ib220
192.168.169.131 its-ib221.its.uni-kassel.de its-ib221
192.168.169.132 its-ib222.its.uni-kassel.de its-ib222
192.168.169.133 its-ib223.its.uni-kassel.de its-ib223
192.168.169.134 its-ib224.its.uni-kassel.de its-ib224
192.168.169.135 its-ib225.its.uni-kassel.de its-ib225
192.168.169.136 its-ib226.its.uni-kassel.de its-ib226
192.168.169.137 its-ib227.its.uni-kassel.de its-ib227
192.168.169.138 its-ib228.its.uni-kassel.de its-ib228
192.168.169.139 its-ib229.its.uni-kassel.de its-ib229
192.168.169.140 its-ib230.its.uni-kassel.de its-ib230
192.168.169.141 its-ib231.its.uni-kassel.de its-ib231
192.168.169.142 its-ib232.its.uni-kassel.de its-ib232
192.168.169.143 its-ib233.its.uni-kassel.de its-ib233
192.168.169.144 its-ib234.its.uni-kassel.de its-ib234
192.168.169.145 its-ib235.its.uni-kassel.de its-ib235
192.168.169.146 its-ib236.its.uni-kassel.de its-ib236
192.168.169.147 its-ib237.its.uni-kassel.de its-ib237
192.168.169.148 its-ib238.its.uni-kassel.de its-ib238
192.168.169.149 its-ib239.its.uni-kassel.de its-ib239
192.168.169.150 its-ib240.its.uni-kassel.de its-ib240
192.168.169.151 its-ib241.its.uni-kassel.de its-ib241
192.168.169.152 its-ib242.its.uni-kassel.de its-ib242
192.168.169.153 its-ib243.its.uni-kassel.de its-ib243
192.168.169.154 its-ib244.its.uni-kassel.de its-ib244
192.168.169.155 its-ib245.its.uni-kassel.de its-ib245
192.168.169.156 its-ib246.its.uni-kassel.de its-ib246
192.168.169.157 its-ib247.its.uni-kassel.de its-ib247
192.168.169.158 its-ib248.its.uni-kassel.de its-ib248
192.168.169.159 its-ib249.its.uni-kassel.de its-ib249
192.168.169.160 its-ib250.its.uni-kassel.de its-ib250
192.168.169.161 its-ib251.its.uni-kassel.de its-ib251
192.168.169.162 its-ib252.its.uni-kassel.de its-ib252
192.168.169.163 its-ib253.its.uni-kassel.de its-ib253
192.168.169.164 its-ib254.its.uni-kassel.de its-ib254
192.168.169.165 its-ib255.its.uni-kassel.de its-ib255
192.168.169.166 its-ib256.its.uni-kassel.de its-ib256
192.168.169.167 its-ib257.its.uni-kassel.de its-ib257
192.168.169.168 its-ib258.its.uni-kassel.de its-ib258
192.168.169.169 its-ib259.its.uni-kassel.de its-ib259
192.168.169.170 its-ib260.its.uni-kassel.de its-ib260
192.168.169.171 its-ib261.its.uni-kassel.de its-ib261
192.168.169.172 its-ib262.its.uni-kassel.de its-ib262
192.168.169.173 its-ib263.its.uni-kassel.de its-ib263
192.168.169.174 its-ib264.its.uni-kassel.de its-ib264
192.168.169.175 its-ib265.its.uni-kassel.de its-ib265
192.168.169.176 its-ib266.its.uni-kassel.de its-ib266
192.168.169.177 its-ib267.its.uni-kassel.de its-ib267
192.168.169.178 its-ib268.its.uni-kassel.de its-ib268
192.168.169.179 its-ib269.its.uni-kassel.de its-ib269
192.168.169.180 its-ib270.its.uni-kassel.de its-ib270
192.168.169.181 its-ib271.its.uni-kassel.de its-ib271
192.168.169.182 its-ib272.its.uni-kassel.de its-ib272
192.168.169.183 its-ib273.its.uni-kassel.de its-ib273
192.168.169.184 its-ib274.its.uni-kassel.de its-ib274
192.168.169.185 its-ib275.its.uni-kassel.de its-ib275
192.168.169.186 its-ib276.its.uni-kassel.de its-ib276
192.168.169.187 its-ib277.its.uni-kassel.de its-ib277
192.168.169.188 its-ib278.its.uni-kassel.de its-ib278
192.168.169.189 its-ib279.its.uni-kassel.de its-ib279
192.168.169.190 its-ib280.its.uni-kassel.de its-ib280
192.168.169.191 its-ib281.its.uni-kassel.de its-ib281
192.168.169.192 its-ib282.its.uni-kassel.de its-ib282
192.168.169.193 its-ib283.its.uni-kassel.de its-ib283
192.168.169.194 its-ib284.its.uni-kassel.de its-ib284
192.168.169.195 its-ib285.its.uni-kassel.de its-ib285
192.168.169.196 its-ib286.its.uni-kassel.de its-ib286
192.168.169.197 its-ib287.its.uni-kassel.de its-ib287
192.168.169.198 its-ib288.its.uni-kassel.de its-ib288
192.168.169.199 its-ib289.its.uni-kassel.de its-ib289
192.168.169.200 its-ib290.its.uni-kassel.de its-ib290
192.168.169.201 its-ib291.its.uni-kassel.de its-ib291
192.168.169.202 its-ib292.its.uni-kassel.de its-ib292
192.168.169.203 its-ib293.its.uni-kassel.de its-ib293
192.168.169.204 its-ib294.its.uni-kassel.de its-ib294
192.168.169.205 its-ib295.its.uni-kassel.de its-ib295
192.168.169.206 its-ib296.its.uni-kassel.de its-ib296
192.168.169.207 its-ib297.its.uni-kassel.de its-ib297
192.168.169.208 its-ib298.its.uni-kassel.de its-ib298
192.168.169.209 its-ib299.its.uni-kassel.de its-ib299
141.51.205.210 its-cs300.its.uni-kassel.de its-cs300
141.51.205.211 its-cs301.its.uni-kassel.de its-cs301
141.51.205.212 its-cs302.its.uni-kassel.de its-cs302
141.51.205.213 its-cs303.its.uni-kassel.de its-cs303
141.51.205.214 its-cs304.its.uni-kassel.de its-cs304
141.51.205.215 its-cs305.its.uni-kassel.de its-cs305
141.51.205.216 its-cs306.its.uni-kassel.de its-cs306
141.51.205.217 its-cs307.its.uni-kassel.de its-cs307
141.51.205.218 its-cs308.its.uni-kassel.de its-cs308
141.51.205.219 its-cs309.its.uni-kassel.de its-cs309
141.51.205.220 its-cs310.its.uni-kassel.de its-cs310
141.51.205.221 its-cs311.its.uni-kassel.de its-cs311
141.51.205.222 its-cs312.its.uni-kassel.de its-cs312
141.51.205.223 its-cs313.its.uni-kassel.de its-cs313
141.51.205.224 its-cs314.its.uni-kassel.de its-cs314
141.51.205.225 its-cs315.its.uni-kassel.de its-cs315
141.51.205.226 its-cs316.its.uni-kassel.de its-cs316
141.51.205.227 its-cs317.its.uni-kassel.de its-cs317
141.51.205.228 its-cs318.its.uni-kassel.de its-cs318
141.51.205.229 its-cs319.its.uni-kassel.de its-cs319
141.51.205.230 its-cs320.its.uni-kassel.de its-cs320
141.51.205.231 its-cs321.its.uni-kassel.de its-cs321
141.51.205.232 its-cs322.its.uni-kassel.de its-cs322
141.51.205.233 its-cs323.its.uni-kassel.de its-cs323
141.51.205.234 its-cs324.its.uni-kassel.de its-cs324
141.51.205.235 its-cs325.its.uni-kassel.de its-cs325
141.51.205.236 its-cs326.its.uni-kassel.de its-cs326
141.51.205.237 its-cs327.its.uni-kassel.de its-cs327
141.51.205.238 its-cs328.its.uni-kassel.de its-cs328
141.51.205.239 its-cs329.its.uni-kassel.de its-cs329
141.51.205.240 its-cs330.its.uni-kassel.de its-cs330
141.51.205.241 its-cs331.its.uni-kassel.de its-cs331
141.51.205.242 its-cs332.its.uni-kassel.de its-cs332
141.51.205.243 its-cs333.its.uni-kassel.de its-cs333
141.51.205.244 its-cs334.its.uni-kassel.de its-cs334
141.51.205.245 its-cs335.its.uni-kassel.de its-cs335
141.51.205.246 its-cs336.its.uni-kassel.de its-cs336
141.51.205.247 its-cs337.its.uni-kassel.de its-cs337
141.51.205.248 its-cs338.its.uni-kassel.de its-cs338
141.51.205.249 its-cs339.its.uni-kassel.de its-cs339
141.51.205.250 its-cs340.its.uni-kassel.de its-cs340
141.51.205.251 its-cs341.its.uni-kassel.de its-cs341
141.51.205.252 its-cs342.its.uni-kassel.de its-cs342
141.51.205.253 its-cs343.its.uni-kassel.de its-cs343
141.51.205.254 its-cs344.its.uni-kassel.de its-cs344
192.168.205.210 its-no300.its.uni-kassel.de its-no300
192.168.205.211 its-no301.its.uni-kassel.de its-no301
192.168.205.212 its-no302.its.uni-kassel.de its-no302
192.168.205.213 its-no303.its.uni-kassel.de its-no303
192.168.205.214 its-no304.its.uni-kassel.de its-no304
192.168.205.215 its-no305.its.uni-kassel.de its-no305
192.168.205.216 its-no306.its.uni-kassel.de its-no306
192.168.205.217 its-no307.its.uni-kassel.de its-no307
192.168.205.218 its-no308.its.uni-kassel.de its-no308
192.168.205.219 its-no309.its.uni-kassel.de its-no309
192.168.205.220 its-no310.its.uni-kassel.de its-no310
192.168.205.221 its-no311.its.uni-kassel.de its-no311
192.168.205.222 its-no312.its.uni-kassel.de its-no312
192.168.205.223 its-no313.its.uni-kassel.de its-no313
192.168.205.224 its-no314.its.uni-kassel.de its-no314
192.168.205.225 its-no315.its.uni-kassel.de its-no315
192.168.205.226 its-no316.its.uni-kassel.de its-no316
192.168.205.227 its-no317.its.uni-kassel.de its-no317
192.168.205.228 its-no318.its.uni-kassel.de its-no318
192.168.205.229 its-no319.its.uni-kassel.de its-no319
192.168.205.230 its-no320.its.uni-kassel.de its-no320
192.168.205.231 its-no321.its.uni-kassel.de its-no321
192.168.205.232 its-no322.its.uni-kassel.de its-no322
192.168.205.233 its-no323.its.uni-kassel.de its-no323
192.168.205.234 its-no324.its.uni-kassel.de its-no324
192.168.205.235 its-no325.its.uni-kassel.de its-no325
192.168.205.236 its-no326.its.uni-kassel.de its-no326
192.168.205.237 its-no327.its.uni-kassel.de its-no327
192.168.205.238 its-no328.its.uni-kassel.de its-no328
192.168.205.239 its-no329.its.uni-kassel.de its-no329
192.168.205.240 its-no330.its.uni-kassel.de its-no330
192.168.205.241 its-no331.its.uni-kassel.de its-no331
192.168.205.242 its-no332.its.uni-kassel.de its-no332
192.168.205.243 its-no333.its.uni-kassel.de its-no333
192.168.205.244 its-no334.its.uni-kassel.de its-no334
192.168.205.245 its-no335.its.uni-kassel.de its-no335
192.168.205.246 its-no336.its.uni-kassel.de its-no336
192.168.205.247 its-no337.its.uni-kassel.de its-no337
192.168.205.248 its-no338.its.uni-kassel.de its-no338
192.168.205.249 its-no339.its.uni-kassel.de its-no339
192.168.205.250 its-no340.its.uni-kassel.de its-no340
192.168.205.251 its-no341.its.uni-kassel.de its-no341
192.168.205.252 its-no342.its.uni-kassel.de its-no342
192.168.205.253 its-no343.its.uni-kassel.de its-no343
192.168.205.254 its-no344.its.uni-kassel.de its-no344
192.168.169.210 its-ib300.its.uni-kassel.de its-ib300
192.168.169.211 its-ib301.its.uni-kassel.de its-ib301
192.168.169.212 its-ib302.its.uni-kassel.de its-ib302
192.168.169.213 its-ib303.its.uni-kassel.de its-ib303
192.168.169.214 its-ib304.its.uni-kassel.de its-ib304
192.168.169.215 its-ib305.its.uni-kassel.de its-ib305
192.168.169.216 its-ib306.its.uni-kassel.de its-ib306
192.168.169.217 its-ib307.its.uni-kassel.de its-ib307
192.168.169.218 its-ib308.its.uni-kassel.de its-ib308
192.168.169.219 its-ib309.its.uni-kassel.de its-ib309
192.168.169.220 its-ib310.its.uni-kassel.de its-ib310
192.168.169.221 its-ib311.its.uni-kassel.de its-ib311
192.168.169.222 its-ib312.its.uni-kassel.de its-ib312
192.168.169.223 its-ib313.its.uni-kassel.de its-ib313
192.168.169.224 its-ib314.its.uni-kassel.de its-ib314
192.168.169.225 its-ib315.its.uni-kassel.de its-ib315
192.168.169.226 its-ib316.its.uni-kassel.de its-ib316
192.168.169.227 its-ib317.its.uni-kassel.de its-ib317
192.168.169.228 its-ib318.its.uni-kassel.de its-ib318
192.168.169.229 its-ib319.its.uni-kassel.de its-ib319
192.168.169.230 its-ib320.its.uni-kassel.de its-ib320
192.168.169.231 its-ib321.its.uni-kassel.de its-ib321
192.168.169.232 its-ib322.its.uni-kassel.de its-ib322
192.168.169.233 its-ib323.its.uni-kassel.de its-ib323
192.168.169.234 its-ib324.its.uni-kassel.de its-ib324
192.168.169.235 its-ib325.its.uni-kassel.de its-ib325
192.168.169.236 its-ib326.its.uni-kassel.de its-ib326
192.168.169.237 its-ib327.its.uni-kassel.de its-ib327
192.168.169.238 its-ib328.its.uni-kassel.de its-ib328
192.168.169.239 its-ib329.its.uni-kassel.de its-ib329
192.168.169.240 its-ib330.its.uni-kassel.de its-ib330
192.168.169.241 its-ib331.its.uni-kassel.de its-ib331
192.168.169.242 its-ib332.its.uni-kassel.de its-ib332
192.168.169.243 its-ib333.its.uni-kassel.de its-ib333
192.168.169.244 its-ib334.its.uni-kassel.de its-ib334
192.168.169.245 its-ib335.its.uni-kassel.de its-ib335
192.168.169.246 its-ib336.its.uni-kassel.de its-ib336
192.168.169.247 its-ib337.its.uni-kassel.de its-ib337
192.168.169.248 its-ib338.its.uni-kassel.de its-ib338
192.168.169.249 its-ib339.its.uni-kassel.de its-ib339
192.168.169.250 its-ib340.its.uni-kassel.de its-ib340
192.168.169.251 its-ib341.its.uni-kassel.de its-ib341
192.168.169.252 its-ib342.its.uni-kassel.de its-ib342
192.168.169.253 its-ib343.its.uni-kassel.de its-ib343
192.168.169.254 its-ib344.its.uni-kassel.de its-ib344


> Hello there,
>
>      Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
> <macek@cs.uni-kassel.de> wrote:
>
>     Hi,
>
>     I am currently trying to run my Hadoop program on a cluster.
>     Unfortunately, my datanodes and tasktrackers seem to have
>     difficulties communicating, as their logs show:
>     * Some datanodes and tasktrackers appear to have port problems
>     of some kind, as can be seen in the logs below. I wondered
>     whether this might be correlated with the localhost entry in
>     /etc/hosts, as many posts with similar errors suggest, but I
>     checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is
>     bound there. (Although you can ping localhost... the technician
>     of the cluster said he'd look into how localhost gets resolved.)
>     * The other nodes cannot talk to the namenode and jobtracker
>     (its-cs131), although it is not at all clear why: the "dfs -put"
>     I do directly before the job runs fine, which seems to imply
>     that communication between those servers works flawlessly.
>
>     Is there any reason why this might happen?
>
>
>     Regards,
>     Elmar
>
>     LOGS BELOW:
>
>     \____Datanodes
>
>     After successfully putting the data into HDFS (at which point,
>     I assume, the namenode and datanodes have to communicate), I get
>     the following errors when starting the job:
>
>     I found two kinds of logs: the first one is big (about 12 MB)
>     and looks like this:
>     ############################### LOG TYPE 1
>     ############################################################
>     2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>     2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>     2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>     2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>     2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>     2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>     2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>     2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>     2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>     2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>     2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at $Proxy5.sendHeartbeat(Unknown Source)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>         at java.lang.Thread.run(Thread.java:619)
>     Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>         ... 5 more
>
>     ... (this continues until the end of the log)
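When a "Connection refused" like the one above shows up, it can help to take Hadoop out of the picture and probe the namenode's RPC port with a plain TCP connect from a failing node. A minimal sketch (the host name and port are just the ones appearing in the log, nothing Hadoop-specific):

```python
import socket

def probe(host, port, timeout=3.0):
    """Attempt a plain TCP connect; True if the port accepts connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:
        # "Connection refused" here means no process is listening on that
        # host:port, mirroring what the Hadoop IPC client reports.
        print(f"{host}:{port} -> {exc}")
        return False

# From a worker node, e.g.: probe("its-cs131", 35554)
```

If this already fails, the problem is at the network/service level (nothing listening, or a firewall), not in the Hadoop configuration of the worker.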
>
>     The second kind is short:
>     ########################### LOG TYPE 2
>     ############################################################
>     2012-08-13 00:59:19,038 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>     /************************************************************
>     STARTUP_MSG: Starting DataNode
>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>     STARTUP_MSG:   args = []
>     STARTUP_MSG:   version = 1.0.2
>     STARTUP_MSG:   build =
>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>     ************************************************************/
>     2012-08-13 00:59:19,203 INFO
>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>     from hadoop-metrics2.properties
>     2012-08-13 00:59:19,216 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source MetricsSystem,sub=Stats registered.
>     2012-08-13 00:59:19,217 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>     snapshot period at 10 second(s).
>     2012-08-13 00:59:19,218 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode
>     metrics system started
>     2012-08-13 00:59:19,306 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ugi registered.
>     2012-08-13 00:59:19,346 INFO
>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>     library
>     2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>     2012-08-13 00:59:21,584 INFO
>     org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>     /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>     2012-08-13 00:59:21,584 INFO
>     org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>     2012-08-13 00:59:21,787 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>     FSDatasetStatusMBean
>     2012-08-13 00:59:21,897 INFO
>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>     Shutting down all async disk service threads...
>     2012-08-13 00:59:21,897 INFO
>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>     All async disk service threads have been shut down.
>     2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>         at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>     Caused by: java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>         ... 7 more
>
>     2012-08-13 00:59:21,899 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>     /************************************************************
>     SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>     ************************************************************/
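The second log ends with "Address already in use" on 0.0.0.0:50010, which usually means a previous DataNode instance (or some other process) still holds the port. The same condition can be checked outside Hadoop with a bare bind attempt; a small sketch (the port numbers mentioned are the ones from these logs, not assumptions about your configuration):

```python
import socket

def port_free(port, host="0.0.0.0"):
    """Attempt to bind host:port; False reproduces the BindException condition."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return True
    except OSError as exc:
        # EADDRINUSE here corresponds to the java.net.BindException above.
        print(f"cannot bind {host}:{port}: {exc}")
        return False
    finally:
        s.close()

# Ports seen in the logs: 50010 (DataNode data port), 50060 (TaskTracker HTTP).
# e.g. port_free(50010)
```

If the port is taken, look for a leftover daemon on that node before restarting.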
>
>
>
>
>
>     \_____TaskTracker
>     With the TaskTrackers it is the same: there are two kinds of logs.
>     ############################### LOG TYPE 1
>     ############################################################
>     2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>     Resending 'status' to 'its-cs131' with reponseId '879
>     2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>     2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>     2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>     2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>     2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>     2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>     2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>     2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>     2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>     2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>     2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>     Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>         ... 6 more
>
>
>     ########################### LOG TYPE 2
>     ############################################################
>     2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>     STARTUP_MSG:
>     /************************************************************
>     STARTUP_MSG: Starting TaskTracker
>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>     STARTUP_MSG:   args = []
>     STARTUP_MSG:   version = 1.0.2
>     STARTUP_MSG:   build =
>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>     ************************************************************/
>     2012-08-13 00:59:24,569 INFO
>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>     from hadoop-metrics2.properties
>     2012-08-13 00:59:24,626 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source MetricsSystem,sub=Stats registered.
>     2012-08-13 00:59:24,627 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>     snapshot period at 10 second(s).
>     2012-08-13 00:59:24,627 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker
>     metrics system started
>     2012-08-13 00:59:24,950 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ugi registered.
>     2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>     org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>     org.mortbay.log.Slf4jLog
>     2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer:
>     Added global filtersafety
>     (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>     2012-08-13 00:59:25,232 INFO
>     org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
>     truncater with mapRetainSize=-1 and reduceRetainSize=-1
>     2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>     Starting tasktracker with owner as bmacek
>     2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker:
>     Good mapred local directories are:
>     /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>     2012-08-13 00:59:25,244 INFO
>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>     library
>     2012-08-13 00:59:25,255 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source jvm registered.
>     2012-08-13 00:59:25,256 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source TaskTrackerMetrics registered.
>     2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server:
>     Starting SocketReader
>     2012-08-13 00:59:25,282 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source RpcDetailedActivityForPort54850 registered.
>     2012-08-13 00:59:25,282 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source RpcActivityForPort54850 registered.
>     2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC
>     Server Responder: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server listener on 54850: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 0 on 54850: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 1 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 3 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 2 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>     2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>     2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>     2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree:
>     setsid exited with exit code 0
>     2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker:
>     Using ResourceCalculatorPlugin :
>     org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>     2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>     TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager
>     is disabled.
>     2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>     IndexCache created with max memory = 10485760
>     2012-08-13 00:59:38,158 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ShuffleServerMetrics registered.
>     2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer:
>     Port returned by webServer.getConnectors()[0].getLocalPort()
>     before open() is -1. Opening the listener on 50060
>     2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>         at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>
>     2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>     SHUTDOWN_MSG:
>     /************************************************************
>     SHUTDOWN_MSG: Shutting down TaskTracker at
>     its-cs133.its.uni-kassel.de/141.51.205.43
>     ************************************************************/
>
>
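The decisive line in the log above is the java.net.BindException: some other process (most likely a TaskTracker instance left over from an earlier start) is still listening on HTTP port 50060, so the new daemon cannot bind it. A minimal sketch of the same failure mode, in plain Python rather than Hadoop code (the port here is chosen by the OS, not 50060):

```python
# Reproduce "Address already in use" (EADDRINUSE), the errno behind the
# TaskTracker's java.net.BindException when port 50060 is already taken.
import errno
import os
import socket

a = socket.socket()
a.bind(("127.0.0.1", 0))        # let the OS pick a free port
a.listen(1)
port = a.getsockname()[1]

b = socket.socket()
caught = None
try:
    b.bind(("127.0.0.1", port)) # second bind on the same port fails
except OSError as e:
    caught = e.errno
finally:
    b.close()
    a.close()

print(os.strerror(caught))      # e.g. "Address already in use"
```

Checking with something like `netstat -tlnp | grep 50060` (or `lsof -i :50060`) on the affected node and killing the stale process, or pointing mapred.task.tracker.http.address at a free port, should let the TaskTracker come up.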


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Sure I can, but it is long, as it is a cluster:


141.51.12.86  hrz-cs400.hrz.uni-kassel.de hrz-cs400

141.51.204.11 hrz-cs401.hrz.uni-kassel.de hrz-cs401
141.51.204.12 hrz-cs402.hrz.uni-kassel.de hrz-cs402
141.51.204.13 hrz-cs403.hrz.uni-kassel.de hrz-cs403
141.51.204.14 hrz-cs404.hrz.uni-kassel.de hrz-cs404
141.51.204.15 hrz-cs405.hrz.uni-kassel.de hrz-cs405
141.51.204.16 hrz-cs406.hrz.uni-kassel.de hrz-cs406
141.51.204.17 hrz-cs407.hrz.uni-kassel.de hrz-cs407
141.51.204.18 hrz-cs408.hrz.uni-kassel.de hrz-cs408
141.51.204.19 hrz-cs409.hrz.uni-kassel.de hrz-cs409
141.51.204.20 hrz-cs410.hrz.uni-kassel.de hrz-cs410
141.51.204.21 hrz-cs411.hrz.uni-kassel.de hrz-cs411
141.51.204.22 hrz-cs412.hrz.uni-kassel.de hrz-cs412
141.51.204.23 hrz-cs413.hrz.uni-kassel.de hrz-cs413
141.51.204.24 hrz-cs414.hrz.uni-kassel.de hrz-cs414
141.51.204.25 hrz-cs415.hrz.uni-kassel.de hrz-cs415
141.51.204.26 hrz-cs416.hrz.uni-kassel.de hrz-cs416
141.51.204.27 hrz-cs417.hrz.uni-kassel.de hrz-cs417
141.51.204.28 hrz-cs418.hrz.uni-kassel.de hrz-cs418
141.51.204.29 hrz-cs419.hrz.uni-kassel.de hrz-cs419
141.51.204.31 hrz-cs421.hrz.uni-kassel.de hrz-cs421
141.51.204.32 hrz-cs422.hrz.uni-kassel.de hrz-cs422
141.51.204.33 hrz-cs423.hrz.uni-kassel.de hrz-cs423
141.51.204.34 hrz-cs424.hrz.uni-kassel.de hrz-cs424
141.51.204.35 hrz-cs425.hrz.uni-kassel.de hrz-cs425
141.51.204.36 hrz-cs426.hrz.uni-kassel.de hrz-cs426
141.51.204.37 hrz-cs427.hrz.uni-kassel.de hrz-cs427
141.51.204.38 hrz-cs428.hrz.uni-kassel.de hrz-cs428
141.51.204.39 hrz-cs429.hrz.uni-kassel.de hrz-cs429
141.51.204.40 hrz-cs430.hrz.uni-kassel.de hrz-cs430
141.51.204.47 hrz-cs437.hrz.uni-kassel.de hrz-cs437
141.51.204.48 hrz-cs438.hrz.uni-kassel.de hrz-cs438
141.51.204.49 hrz-cs439.hrz.uni-kassel.de hrz-cs439
141.51.204.50 hrz-cs440.hrz.uni-kassel.de hrz-cs440
141.51.204.51 hrz-cs441.hrz.uni-kassel.de hrz-cs441
141.51.204.54 hrz-cs444.hrz.uni-kassel.de hrz-cs444
141.51.204.65 hrz-cs455.hrz.uni-kassel.de hrz-cs455
141.51.204.66 hrz-cs456.hrz.uni-kassel.de hrz-cs456
141.51.204.69 hrz-cs459.hrz.uni-kassel.de hrz-cs459
141.51.204.70 hrz-cs460.hrz.uni-kassel.de hrz-cs460
141.51.204.71 hrz-cs461.hrz.uni-kassel.de hrz-cs461
141.51.204.72 hrz-cs462.hrz.uni-kassel.de hrz-cs462
141.51.204.73 hrz-cs463.hrz.uni-kassel.de hrz-cs463
141.51.204.74 hrz-cs464.hrz.uni-kassel.de hrz-cs464
141.51.204.75 hrz-cs465.hrz.uni-kassel.de hrz-cs465
141.51.204.76 hrz-cs466.hrz.uni-kassel.de hrz-cs466
141.51.204.77 hrz-cs467.hrz.uni-kassel.de hrz-cs467
141.51.204.78 hrz-cs468.hrz.uni-kassel.de hrz-cs468
141.51.204.79 hrz-cs469.hrz.uni-kassel.de hrz-cs469
141.51.204.80 hrz-cs470.hrz.uni-kassel.de hrz-cs470
141.51.204.81 hrz-cs471.hrz.uni-kassel.de hrz-cs471
141.51.204.82 hrz-cs472.hrz.uni-kassel.de hrz-cs472
141.51.204.83 hrz-cs473.hrz.uni-kassel.de hrz-cs473
141.51.204.84 hrz-cs474.hrz.uni-kassel.de hrz-cs474
141.51.204.85 hrz-cs475.hrz.uni-kassel.de hrz-cs475
141.51.204.86 hrz-cs476.hrz.uni-kassel.de hrz-cs476
141.51.204.87 hrz-cs477.hrz.uni-kassel.de hrz-cs477
141.51.204.88 hrz-cs478.hrz.uni-kassel.de hrz-cs478
141.51.204.89 hrz-cs479.hrz.uni-kassel.de hrz-cs479
141.51.204.90 hrz-cs480.hrz.uni-kassel.de hrz-cs480
141.51.204.91 hrz-cs481.hrz.uni-kassel.de hrz-cs481
141.51.204.92 hrz-cs482.hrz.uni-kassel.de hrz-cs482
141.51.204.93 hrz-cs483.hrz.uni-kassel.de hrz-cs483
141.51.204.94 hrz-cs484.hrz.uni-kassel.de hrz-cs484
141.51.204.95 hrz-cs485.hrz.uni-kassel.de hrz-cs485
141.51.204.96 hrz-cs486.hrz.uni-kassel.de hrz-cs486
141.51.204.97 hrz-cs487.hrz.uni-kassel.de hrz-cs487
141.51.204.98 hrz-cs488.hrz.uni-kassel.de hrz-cs488
141.51.204.99 hrz-cs489.hrz.uni-kassel.de hrz-cs489
141.51.204.100 hrz-cs490.hrz.uni-kassel.de hrz-cs490
141.51.204.101 hrz-cs491.hrz.uni-kassel.de hrz-cs491
141.51.204.102 hrz-cs492.hrz.uni-kassel.de hrz-cs492
141.51.204.103 hrz-cs493.hrz.uni-kassel.de hrz-cs493
141.51.204.104 hrz-cs494.hrz.uni-kassel.de hrz-cs494
141.51.204.105 hrz-cs495.hrz.uni-kassel.de hrz-cs495
141.51.204.106 hrz-cs496.hrz.uni-kassel.de hrz-cs496
141.51.204.107 hrz-cs497.hrz.uni-kassel.de hrz-cs497
141.51.204.108 hrz-cs498.hrz.uni-kassel.de hrz-cs498
141.51.204.109 hrz-cs499.hrz.uni-kassel.de hrz-cs499
141.51.204.110 hrz-cs500.hrz.uni-kassel.de hrz-cs500
141.51.204.111 hrz-cs501.hrz.uni-kassel.de hrz-cs501
141.51.204.112 hrz-cs502.hrz.uni-kassel.de hrz-cs502
141.51.204.113 hrz-cs503.hrz.uni-kassel.de hrz-cs503
141.51.204.114 hrz-cs504.hrz.uni-kassel.de hrz-cs504
141.51.204.115 hrz-cs505.hrz.uni-kassel.de hrz-cs505
141.51.204.116 hrz-cs506.hrz.uni-kassel.de hrz-cs506
141.51.204.117 hrz-cs507.hrz.uni-kassel.de hrz-cs507
141.51.204.118 hrz-cs508.hrz.uni-kassel.de hrz-cs508
141.51.204.119 hrz-cs509.hrz.uni-kassel.de hrz-cs509
141.51.204.120 hrz-cs510.hrz.uni-kassel.de hrz-cs510
141.51.204.121 hrz-cs511.hrz.uni-kassel.de hrz-cs511
141.51.204.122 hrz-cs512.hrz.uni-kassel.de hrz-cs512
141.51.204.123 hrz-cs513.hrz.uni-kassel.de hrz-cs513
141.51.204.124 hrz-cs514.hrz.uni-kassel.de hrz-cs514
141.51.204.125 hrz-cs515.hrz.uni-kassel.de hrz-cs515
141.51.204.126 hrz-cs516.hrz.uni-kassel.de hrz-cs516
141.51.204.127 hrz-cs517.hrz.uni-kassel.de hrz-cs517
141.51.204.128 hrz-cs518.hrz.uni-kassel.de hrz-cs518
141.51.204.129 hrz-cs519.hrz.uni-kassel.de hrz-cs519
141.51.204.130 hrz-cs520.hrz.uni-kassel.de hrz-cs520
141.51.204.131 hrz-cs521.hrz.uni-kassel.de hrz-cs521
141.51.204.132 hrz-cs522.hrz.uni-kassel.de hrz-cs522
141.51.204.133 hrz-cs523.hrz.uni-kassel.de hrz-cs523
141.51.204.134 hrz-cs524.hrz.uni-kassel.de hrz-cs524
141.51.204.135 hrz-cs525.hrz.uni-kassel.de hrz-cs525
141.51.204.136 hrz-cs526.hrz.uni-kassel.de hrz-cs526
141.51.204.137 hrz-cs527.hrz.uni-kassel.de hrz-cs527
141.51.204.138 hrz-cs528.hrz.uni-kassel.de hrz-cs528
141.51.204.139 hrz-cs529.hrz.uni-kassel.de hrz-cs529
141.51.204.140 hrz-cs530.hrz.uni-kassel.de hrz-cs530
141.51.204.141 hrz-cs531.hrz.uni-kassel.de hrz-cs531
141.51.204.142 hrz-cs532.hrz.uni-kassel.de hrz-cs532
141.51.204.143 hrz-cs533.hrz.uni-kassel.de hrz-cs533
141.51.204.144 hrz-cs534.hrz.uni-kassel.de hrz-cs534
141.51.204.145 hrz-cs535.hrz.uni-kassel.de hrz-cs535
141.51.204.146 hrz-cs536.hrz.uni-kassel.de hrz-cs536
141.51.204.147 hrz-cs537.hrz.uni-kassel.de hrz-cs537
141.51.204.148 hrz-cs538.hrz.uni-kassel.de hrz-cs538
141.51.204.149 hrz-cs539.hrz.uni-kassel.de hrz-cs539
141.51.204.150 hrz-cs540.hrz.uni-kassel.de hrz-cs540
141.51.204.151 hrz-cs541.hrz.uni-kassel.de hrz-cs541
141.51.204.152 hrz-cs542.hrz.uni-kassel.de hrz-cs542
141.51.204.153 hrz-cs543.hrz.uni-kassel.de hrz-cs543
141.51.204.154 hrz-cs544.hrz.uni-kassel.de hrz-cs544
141.51.204.155 hrz-cs545.hrz.uni-kassel.de hrz-cs545
141.51.204.156 hrz-cs546.hrz.uni-kassel.de hrz-cs546
141.51.204.157 hrz-cs547.hrz.uni-kassel.de hrz-cs547
141.51.204.158 hrz-cs548.hrz.uni-kassel.de hrz-cs548
141.51.204.159 hrz-cs549.hrz.uni-kassel.de hrz-cs549
141.51.204.160 hrz-cs550.hrz.uni-kassel.de hrz-cs550
141.51.204.161 hrz-cs551.hrz.uni-kassel.de hrz-cs551
141.51.204.162 hrz-cs552.hrz.uni-kassel.de hrz-cs552
141.51.204.163 hrz-cs553.hrz.uni-kassel.de hrz-cs553
141.51.204.164 hrz-cs554.hrz.uni-kassel.de hrz-cs554
141.51.204.165 hrz-cs555.hrz.uni-kassel.de hrz-cs555
141.51.204.166 hrz-cs556.hrz.uni-kassel.de hrz-cs556
141.51.204.167 hrz-cs557.hrz.uni-kassel.de hrz-cs557
141.51.204.168 hrz-cs558.hrz.uni-kassel.de hrz-cs558
141.51.204.169 hrz-cs559.hrz.uni-kassel.de hrz-cs559
141.51.204.215 hrz-cs560.hrz.uni-kassel.de hrz-cs560
141.51.204.216 hrz-cs561.hrz.uni-kassel.de hrz-cs561
141.51.204.217 hrz-cs562.hrz.uni-kassel.de hrz-cs562
141.51.204.218 hrz-cs563.hrz.uni-kassel.de hrz-cs563
141.51.204.219 hrz-cs564.hrz.uni-kassel.de hrz-cs564
141.51.204.220 hrz-cs565.hrz.uni-kassel.de hrz-cs565
141.51.204.221 hrz-cs566.hrz.uni-kassel.de hrz-cs566
141.51.204.222 hrz-cs567.hrz.uni-kassel.de hrz-cs567
141.51.204.223 hrz-cs568.hrz.uni-kassel.de hrz-cs568
141.51.204.224 hrz-cs569.hrz.uni-kassel.de hrz-cs569
141.51.204.225 hrz-cs570.hrz.uni-kassel.de hrz-cs570
141.51.204.226 hrz-cs571.hrz.uni-kassel.de hrz-cs571
141.51.204.227 hrz-cs572.hrz.uni-kassel.de hrz-cs572
141.51.204.228 hrz-cs573.hrz.uni-kassel.de hrz-cs573
141.51.204.229 hrz-cs574.hrz.uni-kassel.de hrz-cs574
141.51.204.230 hrz-cs575.hrz.uni-kassel.de hrz-cs575
141.51.204.231 hrz-cs576.hrz.uni-kassel.de hrz-cs576
141.51.204.232 hrz-cs577.hrz.uni-kassel.de hrz-cs577
141.51.204.233 hrz-cs578.hrz.uni-kassel.de hrz-cs578
141.51.204.234 hrz-cs579.hrz.uni-kassel.de hrz-cs579
141.51.204.235 hrz-cs580.hrz.uni-kassel.de hrz-cs580
141.51.204.236 hrz-cs581.hrz.uni-kassel.de hrz-cs581
141.51.204.237 hrz-cs582.hrz.uni-kassel.de hrz-cs582
141.51.204.238 hrz-cs583.hrz.uni-kassel.de hrz-cs583
141.51.204.239 hrz-cs584.hrz.uni-kassel.de hrz-cs584
141.51.204.240 hrz-cs585.hrz.uni-kassel.de hrz-cs585
141.51.204.241 hrz-cs586.hrz.uni-kassel.de hrz-cs586
141.51.204.242 hrz-cs587.hrz.uni-kassel.de hrz-cs587
141.51.204.243 hrz-cs588.hrz.uni-kassel.de hrz-cs588
141.51.204.244 hrz-cs589.hrz.uni-kassel.de hrz-cs589
141.51.204.245 hrz-cs590.hrz.uni-kassel.de hrz-cs590
141.51.204.246 hrz-cs591.hrz.uni-kassel.de hrz-cs591
141.51.204.247 hrz-cs592.hrz.uni-kassel.de hrz-cs592
141.51.204.248 hrz-cs593.hrz.uni-kassel.de hrz-cs593
141.51.204.249 hrz-cs594.hrz.uni-kassel.de hrz-cs594
141.51.204.250 hrz-cs595.hrz.uni-kassel.de hrz-cs595
141.51.204.251 hrz-cs596.hrz.uni-kassel.de hrz-cs596
141.51.204.252 hrz-cs597.hrz.uni-kassel.de hrz-cs597
141.51.204.253 hrz-cs598.hrz.uni-kassel.de hrz-cs598
141.51.204.254 hrz-cs599.hrz.uni-kassel.de hrz-cs599


192.168.204.11 hrz-no401.hrz.uni-kassel.de hrz-no401
192.168.204.12 hrz-no402.hrz.uni-kassel.de hrz-no402
192.168.204.13 hrz-no403.hrz.uni-kassel.de hrz-no403
192.168.204.14 hrz-no404.hrz.uni-kassel.de hrz-no404
192.168.204.15 hrz-no405.hrz.uni-kassel.de hrz-no405
192.168.204.16 hrz-no406.hrz.uni-kassel.de hrz-no406
192.168.204.17 hrz-no407.hrz.uni-kassel.de hrz-no407
192.168.204.18 hrz-no408.hrz.uni-kassel.de hrz-no408
192.168.204.19 hrz-no409.hrz.uni-kassel.de hrz-no409
192.168.204.20 hrz-no410.hrz.uni-kassel.de hrz-no410
192.168.204.21 hrz-no411.hrz.uni-kassel.de hrz-no411
192.168.204.22 hrz-no412.hrz.uni-kassel.de hrz-no412
192.168.204.23 hrz-no413.hrz.uni-kassel.de hrz-no413
192.168.204.24 hrz-no414.hrz.uni-kassel.de hrz-no414
192.168.204.25 hrz-no415.hrz.uni-kassel.de hrz-no415
192.168.204.26 hrz-no416.hrz.uni-kassel.de hrz-no416
192.168.204.27 hrz-no417.hrz.uni-kassel.de hrz-no417
192.168.204.28 hrz-no418.hrz.uni-kassel.de hrz-no418
192.168.204.29 hrz-no419.hrz.uni-kassel.de hrz-no419
192.168.204.31 hrz-no421.hrz.uni-kassel.de hrz-no421
192.168.204.32 hrz-no422.hrz.uni-kassel.de hrz-no422
192.168.204.33 hrz-no423.hrz.uni-kassel.de hrz-no423
192.168.204.34 hrz-no424.hrz.uni-kassel.de hrz-no424
192.168.204.35 hrz-no425.hrz.uni-kassel.de hrz-no425
192.168.204.36 hrz-no426.hrz.uni-kassel.de hrz-no426
192.168.204.37 hrz-no427.hrz.uni-kassel.de hrz-no427
192.168.204.38 hrz-no428.hrz.uni-kassel.de hrz-no428
192.168.204.39 hrz-no429.hrz.uni-kassel.de hrz-no429
192.168.204.40 hrz-no430.hrz.uni-kassel.de hrz-no430
192.168.204.47 hrz-no437.hrz.uni-kassel.de hrz-no437
192.168.204.48 hrz-no438.hrz.uni-kassel.de hrz-no438
192.168.204.49 hrz-no439.hrz.uni-kassel.de hrz-no439
192.168.204.50 hrz-no440.hrz.uni-kassel.de hrz-no440
192.168.204.51 hrz-no441.hrz.uni-kassel.de hrz-no441
192.168.204.54 hrz-no444.hrz.uni-kassel.de hrz-no444
192.168.204.65 hrz-no455.hrz.uni-kassel.de hrz-no455
192.168.204.66 hrz-no456.hrz.uni-kassel.de hrz-no456
192.168.204.69 hrz-no459.hrz.uni-kassel.de hrz-no459
192.168.204.70 hrz-no460.hrz.uni-kassel.de hrz-no460
192.168.204.71 hrz-no461.hrz.uni-kassel.de hrz-no461
192.168.204.72 hrz-no462.hrz.uni-kassel.de hrz-no462
192.168.204.73 hrz-no463.hrz.uni-kassel.de hrz-no463
192.168.204.74 hrz-no464.hrz.uni-kassel.de hrz-no464
192.168.204.75 hrz-no465.hrz.uni-kassel.de hrz-no465
192.168.204.76 hrz-no466.hrz.uni-kassel.de hrz-no466
192.168.204.77 hrz-no467.hrz.uni-kassel.de hrz-no467
192.168.204.78 hrz-no468.hrz.uni-kassel.de hrz-no468
192.168.204.79 hrz-no469.hrz.uni-kassel.de hrz-no469
192.168.204.80 hrz-no470.hrz.uni-kassel.de hrz-no470
192.168.204.81 hrz-no471.hrz.uni-kassel.de hrz-no471
192.168.204.82 hrz-no472.hrz.uni-kassel.de hrz-no472
192.168.204.83 hrz-no473.hrz.uni-kassel.de hrz-no473
192.168.204.84 hrz-no474.hrz.uni-kassel.de hrz-no474
192.168.204.85 hrz-no475.hrz.uni-kassel.de hrz-no475
192.168.204.86 hrz-no476.hrz.uni-kassel.de hrz-no476
192.168.204.87 hrz-no477.hrz.uni-kassel.de hrz-no477
192.168.204.88 hrz-no478.hrz.uni-kassel.de hrz-no478
192.168.204.89 hrz-no479.hrz.uni-kassel.de hrz-no479
192.168.204.90 hrz-no480.hrz.uni-kassel.de hrz-no480
192.168.204.91 hrz-no481.hrz.uni-kassel.de hrz-no481
192.168.204.92 hrz-no482.hrz.uni-kassel.de hrz-no482
192.168.204.93 hrz-no483.hrz.uni-kassel.de hrz-no483
192.168.204.94 hrz-no484.hrz.uni-kassel.de hrz-no484
192.168.204.95 hrz-no485.hrz.uni-kassel.de hrz-no485
192.168.204.96 hrz-no486.hrz.uni-kassel.de hrz-no486
192.168.204.97 hrz-no487.hrz.uni-kassel.de hrz-no487
192.168.204.98 hrz-no488.hrz.uni-kassel.de hrz-no488
192.168.204.99 hrz-no489.hrz.uni-kassel.de hrz-no489
192.168.204.100 hrz-no490.hrz.uni-kassel.de hrz-no490
192.168.204.101 hrz-no491.hrz.uni-kassel.de hrz-no491
192.168.204.102 hrz-no492.hrz.uni-kassel.de hrz-no492
192.168.204.103 hrz-no493.hrz.uni-kassel.de hrz-no493
192.168.204.104 hrz-no494.hrz.uni-kassel.de hrz-no494
192.168.204.105 hrz-no495.hrz.uni-kassel.de hrz-no495
192.168.204.106 hrz-no496.hrz.uni-kassel.de hrz-no496
192.168.204.107 hrz-no497.hrz.uni-kassel.de hrz-no497
192.168.204.108 hrz-no498.hrz.uni-kassel.de hrz-no498
192.168.204.109 hrz-no499.hrz.uni-kassel.de hrz-no499
192.168.204.110 hrz-no500.hrz.uni-kassel.de hrz-no500
192.168.204.111 hrz-no501.hrz.uni-kassel.de hrz-no501
192.168.204.112 hrz-no502.hrz.uni-kassel.de hrz-no502
192.168.204.113 hrz-no503.hrz.uni-kassel.de hrz-no503
192.168.204.114 hrz-no504.hrz.uni-kassel.de hrz-no504
192.168.204.115 hrz-no505.hrz.uni-kassel.de hrz-no505
192.168.204.116 hrz-no506.hrz.uni-kassel.de hrz-no506
192.168.204.117 hrz-no507.hrz.uni-kassel.de hrz-no507
192.168.204.118 hrz-no508.hrz.uni-kassel.de hrz-no508
192.168.204.119 hrz-no509.hrz.uni-kassel.de hrz-no509
192.168.204.120 hrz-no510.hrz.uni-kassel.de hrz-no510
192.168.204.121 hrz-no511.hrz.uni-kassel.de hrz-no511
192.168.204.122 hrz-no512.hrz.uni-kassel.de hrz-no512
192.168.204.123 hrz-no513.hrz.uni-kassel.de hrz-no513
192.168.204.124 hrz-no514.hrz.uni-kassel.de hrz-no514
192.168.204.125 hrz-no515.hrz.uni-kassel.de hrz-no515
192.168.204.126 hrz-no516.hrz.uni-kassel.de hrz-no516
192.168.204.127 hrz-no517.hrz.uni-kassel.de hrz-no517
192.168.204.128 hrz-no518.hrz.uni-kassel.de hrz-no518
192.168.204.129 hrz-no519.hrz.uni-kassel.de hrz-no519
192.168.204.130 hrz-no520.hrz.uni-kassel.de hrz-no520
192.168.204.131 hrz-no521.hrz.uni-kassel.de hrz-no521
192.168.204.132 hrz-no522.hrz.uni-kassel.de hrz-no522
192.168.204.133 hrz-no523.hrz.uni-kassel.de hrz-no523
192.168.204.134 hrz-no524.hrz.uni-kassel.de hrz-no524
192.168.204.135 hrz-no525.hrz.uni-kassel.de hrz-no525
192.168.204.136 hrz-no526.hrz.uni-kassel.de hrz-no526
192.168.204.137 hrz-no527.hrz.uni-kassel.de hrz-no527
192.168.204.138 hrz-no528.hrz.uni-kassel.de hrz-no528
192.168.204.139 hrz-no529.hrz.uni-kassel.de hrz-no529
192.168.204.140 hrz-no530.hrz.uni-kassel.de hrz-no530
192.168.204.141 hrz-no531.hrz.uni-kassel.de hrz-no531
192.168.204.142 hrz-no532.hrz.uni-kassel.de hrz-no532
192.168.204.143 hrz-no533.hrz.uni-kassel.de hrz-no533
192.168.204.144 hrz-no534.hrz.uni-kassel.de hrz-no534
192.168.204.145 hrz-no535.hrz.uni-kassel.de hrz-no535
192.168.204.146 hrz-no536.hrz.uni-kassel.de hrz-no536
192.168.204.147 hrz-no537.hrz.uni-kassel.de hrz-no537
192.168.204.148 hrz-no538.hrz.uni-kassel.de hrz-no538
192.168.204.149 hrz-no539.hrz.uni-kassel.de hrz-no539
192.168.204.150 hrz-no540.hrz.uni-kassel.de hrz-no540
192.168.204.151 hrz-no541.hrz.uni-kassel.de hrz-no541
192.168.204.152 hrz-no542.hrz.uni-kassel.de hrz-no542
192.168.204.153 hrz-no543.hrz.uni-kassel.de hrz-no543
192.168.204.154 hrz-no544.hrz.uni-kassel.de hrz-no544
192.168.204.155 hrz-no545.hrz.uni-kassel.de hrz-no545
192.168.204.156 hrz-no546.hrz.uni-kassel.de hrz-no546
192.168.204.157 hrz-no547.hrz.uni-kassel.de hrz-no547
192.168.204.158 hrz-no548.hrz.uni-kassel.de hrz-no548
192.168.204.159 hrz-no549.hrz.uni-kassel.de hrz-no549
192.168.204.160 hrz-no550.hrz.uni-kassel.de hrz-no550
192.168.204.161 hrz-no551.hrz.uni-kassel.de hrz-no551
192.168.204.162 hrz-no552.hrz.uni-kassel.de hrz-no552
192.168.204.163 hrz-no553.hrz.uni-kassel.de hrz-no553
192.168.204.164 hrz-no554.hrz.uni-kassel.de hrz-no554
192.168.204.165 hrz-no555.hrz.uni-kassel.de hrz-no555
192.168.204.166 hrz-no556.hrz.uni-kassel.de hrz-no556
192.168.204.167 hrz-no557.hrz.uni-kassel.de hrz-no557
192.168.204.168 hrz-no558.hrz.uni-kassel.de hrz-no558
192.168.204.169 hrz-no559.hrz.uni-kassel.de hrz-no559
192.168.204.215 hrz-no560.hrz.uni-kassel.de hrz-no560
192.168.204.216 hrz-no561.hrz.uni-kassel.de hrz-no561
192.168.204.217 hrz-no562.hrz.uni-kassel.de hrz-no562
192.168.204.218 hrz-no563.hrz.uni-kassel.de hrz-no563
192.168.204.219 hrz-no564.hrz.uni-kassel.de hrz-no564
192.168.204.220 hrz-no565.hrz.uni-kassel.de hrz-no565
192.168.204.221 hrz-no566.hrz.uni-kassel.de hrz-no566
192.168.204.222 hrz-no567.hrz.uni-kassel.de hrz-no567
192.168.204.223 hrz-no568.hrz.uni-kassel.de hrz-no568
192.168.204.224 hrz-no569.hrz.uni-kassel.de hrz-no569
192.168.204.225 hrz-no570.hrz.uni-kassel.de hrz-no570
192.168.204.226 hrz-no571.hrz.uni-kassel.de hrz-no571
192.168.204.227 hrz-no572.hrz.uni-kassel.de hrz-no572
192.168.204.228 hrz-no573.hrz.uni-kassel.de hrz-no573
192.168.204.229 hrz-no574.hrz.uni-kassel.de hrz-no574
192.168.204.230 hrz-no575.hrz.uni-kassel.de hrz-no575
192.168.204.231 hrz-no576.hrz.uni-kassel.de hrz-no576
192.168.204.232 hrz-no577.hrz.uni-kassel.de hrz-no577
192.168.204.233 hrz-no578.hrz.uni-kassel.de hrz-no578
192.168.204.234 hrz-no579.hrz.uni-kassel.de hrz-no579
192.168.204.235 hrz-no580.hrz.uni-kassel.de hrz-no580
192.168.204.236 hrz-no581.hrz.uni-kassel.de hrz-no581
192.168.204.237 hrz-no582.hrz.uni-kassel.de hrz-no582
192.168.204.238 hrz-no583.hrz.uni-kassel.de hrz-no583
192.168.204.239 hrz-no584.hrz.uni-kassel.de hrz-no584
192.168.204.240 hrz-no585.hrz.uni-kassel.de hrz-no585
192.168.204.241 hrz-no586.hrz.uni-kassel.de hrz-no586
192.168.204.242 hrz-no587.hrz.uni-kassel.de hrz-no587
192.168.204.243 hrz-no588.hrz.uni-kassel.de hrz-no588
192.168.204.244 hrz-no589.hrz.uni-kassel.de hrz-no589
192.168.204.245 hrz-no590.hrz.uni-kassel.de hrz-no590
192.168.204.246 hrz-no591.hrz.uni-kassel.de hrz-no591
192.168.204.247 hrz-no592.hrz.uni-kassel.de hrz-no592
192.168.204.248 hrz-no593.hrz.uni-kassel.de hrz-no593
192.168.204.249 hrz-no594.hrz.uni-kassel.de hrz-no594
192.168.204.250 hrz-no595.hrz.uni-kassel.de hrz-no595
192.168.204.251 hrz-no596.hrz.uni-kassel.de hrz-no596
192.168.204.252 hrz-no597.hrz.uni-kassel.de hrz-no597
192.168.204.253 hrz-no598.hrz.uni-kassel.de hrz-no598
192.168.204.254 hrz-no599.hrz.uni-kassel.de hrz-no599

141.51.204.190    hrz-gc100 hrz-gc100.hrz.uni-kassel.de
141.51.204.191    hrz-gc101.hrz.uni-kassel.de hrz-gc101
141.51.204.192    hrz-gc102.hrz.uni-kassel.de hrz-gc102
141.51.204.193    hrz-gc103.hrz.uni-kassel.de hrz-gc103
141.51.204.194    hrz-gc104.hrz.uni-kassel.de hrz-gc104
141.51.204.195    hrz-gc105.hrz.uni-kassel.de hrz-gc105
141.51.204.196    hrz-gc106.hrz.uni-kassel.de hrz-gc106
141.51.204.197    hrz-gc107.hrz.uni-kassel.de hrz-gc107
141.51.204.198    hrz-gc108.hrz.uni-kassel.de hrz-gc108
141.51.204.199    hrz-gc109.hrz.uni-kassel.de hrz-gc109
141.51.204.200    hrz-gc110.hrz.uni-kassel.de hrz-gc110
141.51.204.201    hrz-gc111.hrz.uni-kassel.de hrz-gc111
141.51.204.202    hrz-gc112.hrz.uni-kassel.de hrz-gc112
141.51.204.203    hrz-gc113.hrz.uni-kassel.de hrz-gc113
141.51.204.204    hrz-gc114.hrz.uni-kassel.de hrz-gc114
141.51.204.205    hrz-gc115.hrz.uni-kassel.de hrz-gc115
141.51.204.206    hrz-gc116.hrz.uni-kassel.de hrz-gc116
141.51.204.207    hrz-gc117.hrz.uni-kassel.de hrz-gc117
141.51.204.208    hrz-gc118.hrz.uni-kassel.de hrz-gc118
141.51.204.209    hrz-gc119.hrz.uni-kassel.de hrz-gc119
141.51.204.210    hrz-gc120.hrz.uni-kassel.de hrz-gc120

# Cluster neu
141.51.204.30 its-cs1.its.uni-kassel.de its-cs1
141.51.204.170 its-cs10.its.uni-kassel.de its-cs10
141.51.204.171 its-cs11.its.uni-kassel.de its-cs11
141.51.204.172 its-cs12.its.uni-kassel.de its-cs12
141.51.204.173 its-cs13.its.uni-kassel.de its-cs13
141.51.204.174 its-cs14.its.uni-kassel.de its-cs14
141.51.204.175 its-cs15.its.uni-kassel.de its-cs15
141.51.204.176 its-cs16.its.uni-kassel.de its-cs16
141.51.204.177 its-cs17.its.uni-kassel.de its-cs17
141.51.204.178 its-cs18.its.uni-kassel.de its-cs18
141.51.204.179 its-cs19.its.uni-kassel.de its-cs19
141.51.205.10 its-cs100.its.uni-kassel.de its-cs100
141.51.205.11 its-cs101.its.uni-kassel.de its-cs101
141.51.205.12 its-cs102.its.uni-kassel.de its-cs102
141.51.205.13 its-cs103.its.uni-kassel.de its-cs103
141.51.205.14 its-cs104.its.uni-kassel.de its-cs104
141.51.205.15 its-cs105.its.uni-kassel.de its-cs105
141.51.205.16 its-cs106.its.uni-kassel.de its-cs106
141.51.205.17 its-cs107.its.uni-kassel.de its-cs107
141.51.205.18 its-cs108.its.uni-kassel.de its-cs108
141.51.205.19 its-cs109.its.uni-kassel.de its-cs109
141.51.205.20 its-cs110.its.uni-kassel.de its-cs110
141.51.205.21 its-cs111.its.uni-kassel.de its-cs111
141.51.205.22 its-cs112.its.uni-kassel.de its-cs112
141.51.205.23 its-cs113.its.uni-kassel.de its-cs113
141.51.205.24 its-cs114.its.uni-kassel.de its-cs114
141.51.205.25 its-cs115.its.uni-kassel.de its-cs115
141.51.205.26 its-cs116.its.uni-kassel.de its-cs116
141.51.205.27 its-cs117.its.uni-kassel.de its-cs117
141.51.205.28 its-cs118.its.uni-kassel.de its-cs118
141.51.205.29 its-cs119.its.uni-kassel.de its-cs119
141.51.205.30 its-cs120.its.uni-kassel.de its-cs120
141.51.205.31 its-cs121.its.uni-kassel.de its-cs121
141.51.205.32 its-cs122.its.uni-kassel.de its-cs122
141.51.205.33 its-cs123.its.uni-kassel.de its-cs123
141.51.205.34 its-cs124.its.uni-kassel.de its-cs124
141.51.205.35 its-cs125.its.uni-kassel.de its-cs125
141.51.205.36 its-cs126.its.uni-kassel.de its-cs126
141.51.205.37 its-cs127.its.uni-kassel.de its-cs127
141.51.205.38 its-cs128.its.uni-kassel.de its-cs128
141.51.205.39 its-cs129.its.uni-kassel.de its-cs129
141.51.205.40 its-cs130.its.uni-kassel.de its-cs130
141.51.205.41 its-cs131.its.uni-kassel.de its-cs131
141.51.205.42 its-cs132.its.uni-kassel.de its-cs132
141.51.205.43 its-cs133.its.uni-kassel.de its-cs133
141.51.205.44 its-cs134.its.uni-kassel.de its-cs134
141.51.205.45 its-cs135.its.uni-kassel.de its-cs135
141.51.205.46 its-cs136.its.uni-kassel.de its-cs136
141.51.205.47 its-cs137.its.uni-kassel.de its-cs137
141.51.205.48 its-cs138.its.uni-kassel.de its-cs138
141.51.205.49 its-cs139.its.uni-kassel.de its-cs139
141.51.205.50 its-cs140.its.uni-kassel.de its-cs140
141.51.205.51 its-cs141.its.uni-kassel.de its-cs141
141.51.205.52 its-cs142.its.uni-kassel.de its-cs142
141.51.205.53 its-cs143.its.uni-kassel.de its-cs143
141.51.205.54 its-cs144.its.uni-kassel.de its-cs144
141.51.205.55 its-cs145.its.uni-kassel.de its-cs145
141.51.205.56 its-cs146.its.uni-kassel.de its-cs146
141.51.205.57 its-cs147.its.uni-kassel.de its-cs147
141.51.205.58 its-cs148.its.uni-kassel.de its-cs148
141.51.205.59 its-cs149.its.uni-kassel.de its-cs149
141.51.205.60 its-cs150.its.uni-kassel.de its-cs150
141.51.205.61 its-cs151.its.uni-kassel.de its-cs151
141.51.205.62 its-cs152.its.uni-kassel.de its-cs152
141.51.205.63 its-cs153.its.uni-kassel.de its-cs153
141.51.205.64 its-cs154.its.uni-kassel.de its-cs154
141.51.205.65 its-cs155.its.uni-kassel.de its-cs155
141.51.205.66 its-cs156.its.uni-kassel.de its-cs156
141.51.205.67 its-cs157.its.uni-kassel.de its-cs157
141.51.205.68 its-cs158.its.uni-kassel.de its-cs158
141.51.205.69 its-cs159.its.uni-kassel.de its-cs159
141.51.205.70 its-cs160.its.uni-kassel.de its-cs160
141.51.205.71 its-cs161.its.uni-kassel.de its-cs161
141.51.205.72 its-cs162.its.uni-kassel.de its-cs162
141.51.205.73 its-cs163.its.uni-kassel.de its-cs163
141.51.205.74 its-cs164.its.uni-kassel.de its-cs164
141.51.205.75 its-cs165.its.uni-kassel.de its-cs165
141.51.205.76 its-cs166.its.uni-kassel.de its-cs166
141.51.205.77 its-cs167.its.uni-kassel.de its-cs167
141.51.205.78 its-cs168.its.uni-kassel.de its-cs168
141.51.205.79 its-cs169.its.uni-kassel.de its-cs169
141.51.205.80 its-cs170.its.uni-kassel.de its-cs170
141.51.205.81 its-cs171.its.uni-kassel.de its-cs171
141.51.205.82 its-cs172.its.uni-kassel.de its-cs172
141.51.205.83 its-cs173.its.uni-kassel.de its-cs173
141.51.205.84 its-cs174.its.uni-kassel.de its-cs174
141.51.205.85 its-cs175.its.uni-kassel.de its-cs175
141.51.205.86 its-cs176.its.uni-kassel.de its-cs176
141.51.205.87 its-cs177.its.uni-kassel.de its-cs177
141.51.205.88 its-cs178.its.uni-kassel.de its-cs178
141.51.205.89 its-cs179.its.uni-kassel.de its-cs179
141.51.205.90 its-cs180.its.uni-kassel.de its-cs180
141.51.205.91 its-cs181.its.uni-kassel.de its-cs181
141.51.205.92 its-cs182.its.uni-kassel.de its-cs182
141.51.205.93 its-cs183.its.uni-kassel.de its-cs183
141.51.205.94 its-cs184.its.uni-kassel.de its-cs184
141.51.205.95 its-cs185.its.uni-kassel.de its-cs185
141.51.205.96 its-cs186.its.uni-kassel.de its-cs186
141.51.205.97 its-cs187.its.uni-kassel.de its-cs187
141.51.205.98 its-cs188.its.uni-kassel.de its-cs188
141.51.205.99 its-cs189.its.uni-kassel.de its-cs189
141.51.205.100 its-cs190.its.uni-kassel.de its-cs190
141.51.205.101 its-cs191.its.uni-kassel.de its-cs191
141.51.205.102 its-cs192.its.uni-kassel.de its-cs192
141.51.205.103 its-cs193.its.uni-kassel.de its-cs193
141.51.205.104 its-cs194.its.uni-kassel.de its-cs194
141.51.205.105 its-cs195.its.uni-kassel.de its-cs195
141.51.205.106 its-cs196.its.uni-kassel.de its-cs196
141.51.205.107 its-cs197.its.uni-kassel.de its-cs197
141.51.205.108 its-cs198.its.uni-kassel.de its-cs198
141.51.205.109 its-cs199.its.uni-kassel.de its-cs199
141.51.205.110 its-cs200.its.uni-kassel.de its-cs200
141.51.205.111 its-cs201.its.uni-kassel.de its-cs201
141.51.205.112 its-cs202.its.uni-kassel.de its-cs202
141.51.205.113 its-cs203.its.uni-kassel.de its-cs203
141.51.205.114 its-cs204.its.uni-kassel.de its-cs204
141.51.205.115 its-cs205.its.uni-kassel.de its-cs205
141.51.205.116 its-cs206.its.uni-kassel.de its-cs206
141.51.205.117 its-cs207.its.uni-kassel.de its-cs207
141.51.205.118 its-cs208.its.uni-kassel.de its-cs208
141.51.205.119 its-cs209.its.uni-kassel.de its-cs209
141.51.205.120 its-cs210.its.uni-kassel.de its-cs210
141.51.205.121 its-cs211.its.uni-kassel.de its-cs211
141.51.205.122 its-cs212.its.uni-kassel.de its-cs212
141.51.205.123 its-cs213.its.uni-kassel.de its-cs213
141.51.205.124 its-cs214.its.uni-kassel.de its-cs214
141.51.205.125 its-cs215.its.uni-kassel.de its-cs215
141.51.205.126 its-cs216.its.uni-kassel.de its-cs216
141.51.205.127 its-cs217.its.uni-kassel.de its-cs217
141.51.205.128 its-cs218.its.uni-kassel.de its-cs218
141.51.205.129 its-cs219.its.uni-kassel.de its-cs219
141.51.205.130 its-cs220.its.uni-kassel.de its-cs220
141.51.205.131 its-cs221.its.uni-kassel.de its-cs221
141.51.205.132 its-cs222.its.uni-kassel.de its-cs222
141.51.205.133 its-cs223.its.uni-kassel.de its-cs223
141.51.205.134 its-cs224.its.uni-kassel.de its-cs224
141.51.205.135 its-cs225.its.uni-kassel.de its-cs225
141.51.205.136 its-cs226.its.uni-kassel.de its-cs226
141.51.205.137 its-cs227.its.uni-kassel.de its-cs227
141.51.205.138 its-cs228.its.uni-kassel.de its-cs228
141.51.205.139 its-cs229.its.uni-kassel.de its-cs229
141.51.205.140 its-cs230.its.uni-kassel.de its-cs230
141.51.205.141 its-cs231.its.uni-kassel.de its-cs231
141.51.205.142 its-cs232.its.uni-kassel.de its-cs232
141.51.205.143 its-cs233.its.uni-kassel.de its-cs233
141.51.205.144 its-cs234.its.uni-kassel.de its-cs234
141.51.205.145 its-cs235.its.uni-kassel.de its-cs235
141.51.205.146 its-cs236.its.uni-kassel.de its-cs236
141.51.205.147 its-cs237.its.uni-kassel.de its-cs237
141.51.205.148 its-cs238.its.uni-kassel.de its-cs238
141.51.205.149 its-cs239.its.uni-kassel.de its-cs239
141.51.205.150 its-cs240.its.uni-kassel.de its-cs240
141.51.205.151 its-cs241.its.uni-kassel.de its-cs241
141.51.205.152 its-cs242.its.uni-kassel.de its-cs242
141.51.205.153 its-cs243.its.uni-kassel.de its-cs243
141.51.205.154 its-cs244.its.uni-kassel.de its-cs244
141.51.205.155 its-cs245.its.uni-kassel.de its-cs245
141.51.205.156 its-cs246.its.uni-kassel.de its-cs246
141.51.205.157 its-cs247.its.uni-kassel.de its-cs247
141.51.205.158 its-cs248.its.uni-kassel.de its-cs248
141.51.205.159 its-cs249.its.uni-kassel.de its-cs249
141.51.205.160 its-cs250.its.uni-kassel.de its-cs250
141.51.205.161 its-cs251.its.uni-kassel.de its-cs251
141.51.205.162 its-cs252.its.uni-kassel.de its-cs252
141.51.205.163 its-cs253.its.uni-kassel.de its-cs253
141.51.205.164 its-cs254.its.uni-kassel.de its-cs254
141.51.205.165 its-cs255.its.uni-kassel.de its-cs255
141.51.205.166 its-cs256.its.uni-kassel.de its-cs256
141.51.205.167 its-cs257.its.uni-kassel.de its-cs257
141.51.205.168 its-cs258.its.uni-kassel.de its-cs258
141.51.205.169 its-cs259.its.uni-kassel.de its-cs259
141.51.205.170 its-cs260.its.uni-kassel.de its-cs260
141.51.205.171 its-cs261.its.uni-kassel.de its-cs261
141.51.205.172 its-cs262.its.uni-kassel.de its-cs262
141.51.205.173 its-cs263.its.uni-kassel.de its-cs263
141.51.205.174 its-cs264.its.uni-kassel.de its-cs264
141.51.205.175 its-cs265.its.uni-kassel.de its-cs265
141.51.205.176 its-cs266.its.uni-kassel.de its-cs266
141.51.205.177 its-cs267.its.uni-kassel.de its-cs267
141.51.205.178 its-cs268.its.uni-kassel.de its-cs268
141.51.205.179 its-cs269.its.uni-kassel.de its-cs269
141.51.205.180 its-cs270.its.uni-kassel.de its-cs270
141.51.205.181 its-cs271.its.uni-kassel.de its-cs271
141.51.205.182 its-cs272.its.uni-kassel.de its-cs272
141.51.205.183 its-cs273.its.uni-kassel.de its-cs273
141.51.205.184 its-cs274.its.uni-kassel.de its-cs274
141.51.205.185 its-cs275.its.uni-kassel.de its-cs275
141.51.205.186 its-cs276.its.uni-kassel.de its-cs276
141.51.205.187 its-cs277.its.uni-kassel.de its-cs277
141.51.205.188 its-cs278.its.uni-kassel.de its-cs278
141.51.205.189 its-cs279.its.uni-kassel.de its-cs279
141.51.205.190 its-cs280.its.uni-kassel.de its-cs280
141.51.205.191 its-cs281.its.uni-kassel.de its-cs281
141.51.205.192 its-cs282.its.uni-kassel.de its-cs282
141.51.205.193 its-cs283.its.uni-kassel.de its-cs283
141.51.205.194 its-cs284.its.uni-kassel.de its-cs284
141.51.205.195 its-cs285.its.uni-kassel.de its-cs285
141.51.205.196 its-cs286.its.uni-kassel.de its-cs286
141.51.205.197 its-cs287.its.uni-kassel.de its-cs287
141.51.205.198 its-cs288.its.uni-kassel.de its-cs288
141.51.205.199 its-cs289.its.uni-kassel.de its-cs289
141.51.205.200 its-cs290.its.uni-kassel.de its-cs290
141.51.205.201 its-cs291.its.uni-kassel.de its-cs291
141.51.205.202 its-cs292.its.uni-kassel.de its-cs292
141.51.205.203 its-cs293.its.uni-kassel.de its-cs293
141.51.205.204 its-cs294.its.uni-kassel.de its-cs294
141.51.205.205 its-cs295.its.uni-kassel.de its-cs295
141.51.205.206 its-cs296.its.uni-kassel.de its-cs296
141.51.205.207 its-cs297.its.uni-kassel.de its-cs297
141.51.205.208 its-cs298.its.uni-kassel.de its-cs298
141.51.205.209 its-cs299.its.uni-kassel.de its-cs299
192.168.204.30 its-no1.its.uni-kassel.de its-no1
192.168.204.170 its-no10.its.uni-kassel.de its-no10
192.168.204.171 its-no11.its.uni-kassel.de its-no11
192.168.204.172 its-no12.its.uni-kassel.de its-no12
192.168.204.173 its-no13.its.uni-kassel.de its-no13
192.168.204.174 its-no14.its.uni-kassel.de its-no14
192.168.204.175 its-no15.its.uni-kassel.de its-no15
192.168.204.176 its-no16.its.uni-kassel.de its-no16
192.168.204.177 its-no17.its.uni-kassel.de its-no17
192.168.204.178 its-no18.its.uni-kassel.de its-no18
192.168.204.179 its-no19.its.uni-kassel.de its-no19
192.168.205.10 its-no100.its.uni-kassel.de its-no100
192.168.205.11 its-no101.its.uni-kassel.de its-no101
192.168.205.12 its-no102.its.uni-kassel.de its-no102
192.168.205.13 its-no103.its.uni-kassel.de its-no103
192.168.205.14 its-no104.its.uni-kassel.de its-no104
192.168.205.15 its-no105.its.uni-kassel.de its-no105
192.168.205.16 its-no106.its.uni-kassel.de its-no106
192.168.205.17 its-no107.its.uni-kassel.de its-no107
192.168.205.18 its-no108.its.uni-kassel.de its-no108
192.168.205.19 its-no109.its.uni-kassel.de its-no109
192.168.205.20 its-no110.its.uni-kassel.de its-no110
192.168.205.21 its-no111.its.uni-kassel.de its-no111
192.168.205.22 its-no112.its.uni-kassel.de its-no112
192.168.205.23 its-no113.its.uni-kassel.de its-no113
192.168.205.24 its-no114.its.uni-kassel.de its-no114
192.168.205.25 its-no115.its.uni-kassel.de its-no115
192.168.205.26 its-no116.its.uni-kassel.de its-no116
192.168.205.27 its-no117.its.uni-kassel.de its-no117
192.168.205.28 its-no118.its.uni-kassel.de its-no118
192.168.205.29 its-no119.its.uni-kassel.de its-no119
192.168.205.30 its-no120.its.uni-kassel.de its-no120
192.168.205.31 its-no121.its.uni-kassel.de its-no121
192.168.205.32 its-no122.its.uni-kassel.de its-no122
192.168.205.33 its-no123.its.uni-kassel.de its-no123
192.168.205.34 its-no124.its.uni-kassel.de its-no124
192.168.205.35 its-no125.its.uni-kassel.de its-no125
192.168.205.36 its-no126.its.uni-kassel.de its-no126
192.168.205.37 its-no127.its.uni-kassel.de its-no127
192.168.205.38 its-no128.its.uni-kassel.de its-no128
192.168.205.39 its-no129.its.uni-kassel.de its-no129
192.168.205.40 its-no130.its.uni-kassel.de its-no130
192.168.205.41 its-no131.its.uni-kassel.de its-no131
192.168.205.42 its-no132.its.uni-kassel.de its-no132
192.168.205.43 its-no133.its.uni-kassel.de its-no133
192.168.205.44 its-no134.its.uni-kassel.de its-no134
192.168.205.45 its-no135.its.uni-kassel.de its-no135
192.168.205.46 its-no136.its.uni-kassel.de its-no136
192.168.205.47 its-no137.its.uni-kassel.de its-no137
192.168.205.48 its-no138.its.uni-kassel.de its-no138
192.168.205.49 its-no139.its.uni-kassel.de its-no139
192.168.205.50 its-no140.its.uni-kassel.de its-no140
192.168.205.51 its-no141.its.uni-kassel.de its-no141
192.168.205.52 its-no142.its.uni-kassel.de its-no142
192.168.205.53 its-no143.its.uni-kassel.de its-no143
192.168.205.54 its-no144.its.uni-kassel.de its-no144
192.168.205.55 its-no145.its.uni-kassel.de its-no145
192.168.205.56 its-no146.its.uni-kassel.de its-no146
192.168.205.57 its-no147.its.uni-kassel.de its-no147
192.168.205.58 its-no148.its.uni-kassel.de its-no148
192.168.205.59 its-no149.its.uni-kassel.de its-no149
192.168.205.60 its-no150.its.uni-kassel.de its-no150
192.168.205.61 its-no151.its.uni-kassel.de its-no151
192.168.205.62 its-no152.its.uni-kassel.de its-no152
192.168.205.63 its-no153.its.uni-kassel.de its-no153
192.168.205.64 its-no154.its.uni-kassel.de its-no154
192.168.205.65 its-no155.its.uni-kassel.de its-no155
192.168.205.66 its-no156.its.uni-kassel.de its-no156
192.168.205.67 its-no157.its.uni-kassel.de its-no157
192.168.205.68 its-no158.its.uni-kassel.de its-no158
192.168.205.69 its-no159.its.uni-kassel.de its-no159
192.168.205.70 its-no160.its.uni-kassel.de its-no160
192.168.205.71 its-no161.its.uni-kassel.de its-no161
192.168.205.72 its-no162.its.uni-kassel.de its-no162
192.168.205.73 its-no163.its.uni-kassel.de its-no163
192.168.205.74 its-no164.its.uni-kassel.de its-no164
192.168.205.75 its-no165.its.uni-kassel.de its-no165
192.168.205.76 its-no166.its.uni-kassel.de its-no166
192.168.205.77 its-no167.its.uni-kassel.de its-no167
192.168.205.78 its-no168.its.uni-kassel.de its-no168
192.168.205.79 its-no169.its.uni-kassel.de its-no169
192.168.205.80 its-no170.its.uni-kassel.de its-no170
192.168.205.81 its-no171.its.uni-kassel.de its-no171
192.168.205.82 its-no172.its.uni-kassel.de its-no172
192.168.205.83 its-no173.its.uni-kassel.de its-no173
192.168.205.84 its-no174.its.uni-kassel.de its-no174
192.168.205.85 its-no175.its.uni-kassel.de its-no175
192.168.205.86 its-no176.its.uni-kassel.de its-no176
192.168.205.87 its-no177.its.uni-kassel.de its-no177
192.168.205.88 its-no178.its.uni-kassel.de its-no178
192.168.205.89 its-no179.its.uni-kassel.de its-no179
192.168.205.90 its-no180.its.uni-kassel.de its-no180
192.168.205.91 its-no181.its.uni-kassel.de its-no181
192.168.205.92 its-no182.its.uni-kassel.de its-no182
192.168.205.93 its-no183.its.uni-kassel.de its-no183
192.168.205.94 its-no184.its.uni-kassel.de its-no184
192.168.205.95 its-no185.its.uni-kassel.de its-no185
192.168.205.96 its-no186.its.uni-kassel.de its-no186
192.168.205.97 its-no187.its.uni-kassel.de its-no187
192.168.205.98 its-no188.its.uni-kassel.de its-no188
192.168.205.99 its-no189.its.uni-kassel.de its-no189
192.168.205.100 its-no190.its.uni-kassel.de its-no190
192.168.205.101 its-no191.its.uni-kassel.de its-no191
192.168.205.102 its-no192.its.uni-kassel.de its-no192
192.168.205.103 its-no193.its.uni-kassel.de its-no193
192.168.205.104 its-no194.its.uni-kassel.de its-no194
192.168.205.105 its-no195.its.uni-kassel.de its-no195
192.168.205.106 its-no196.its.uni-kassel.de its-no196
192.168.205.107 its-no197.its.uni-kassel.de its-no197
192.168.205.108 its-no198.its.uni-kassel.de its-no198
192.168.205.109 its-no199.its.uni-kassel.de its-no199
192.168.205.110 its-no200.its.uni-kassel.de its-no200
192.168.205.111 its-no201.its.uni-kassel.de its-no201
192.168.205.112 its-no202.its.uni-kassel.de its-no202
192.168.205.113 its-no203.its.uni-kassel.de its-no203
192.168.205.114 its-no204.its.uni-kassel.de its-no204
192.168.205.115 its-no205.its.uni-kassel.de its-no205
192.168.205.116 its-no206.its.uni-kassel.de its-no206
192.168.205.117 its-no207.its.uni-kassel.de its-no207
192.168.205.118 its-no208.its.uni-kassel.de its-no208
192.168.205.119 its-no209.its.uni-kassel.de its-no209
192.168.205.120 its-no210.its.uni-kassel.de its-no210
192.168.205.121 its-no211.its.uni-kassel.de its-no211
192.168.205.122 its-no212.its.uni-kassel.de its-no212
192.168.205.123 its-no213.its.uni-kassel.de its-no213
192.168.205.124 its-no214.its.uni-kassel.de its-no214
192.168.205.125 its-no215.its.uni-kassel.de its-no215
192.168.205.126 its-no216.its.uni-kassel.de its-no216
192.168.205.127 its-no217.its.uni-kassel.de its-no217
192.168.205.128 its-no218.its.uni-kassel.de its-no218
192.168.205.129 its-no219.its.uni-kassel.de its-no219
192.168.205.130 its-no220.its.uni-kassel.de its-no220
192.168.205.131 its-no221.its.uni-kassel.de its-no221
192.168.205.132 its-no222.its.uni-kassel.de its-no222
192.168.205.133 its-no223.its.uni-kassel.de its-no223
192.168.205.134 its-no224.its.uni-kassel.de its-no224
192.168.205.135 its-no225.its.uni-kassel.de its-no225
192.168.205.136 its-no226.its.uni-kassel.de its-no226
192.168.205.137 its-no227.its.uni-kassel.de its-no227
192.168.205.138 its-no228.its.uni-kassel.de its-no228
192.168.205.139 its-no229.its.uni-kassel.de its-no229
192.168.205.140 its-no230.its.uni-kassel.de its-no230
192.168.205.141 its-no231.its.uni-kassel.de its-no231
192.168.205.142 its-no232.its.uni-kassel.de its-no232
192.168.205.143 its-no233.its.uni-kassel.de its-no233
192.168.205.144 its-no234.its.uni-kassel.de its-no234
192.168.205.145 its-no235.its.uni-kassel.de its-no235
192.168.205.146 its-no236.its.uni-kassel.de its-no236
192.168.205.147 its-no237.its.uni-kassel.de its-no237
192.168.205.148 its-no238.its.uni-kassel.de its-no238
192.168.205.149 its-no239.its.uni-kassel.de its-no239
192.168.205.150 its-no240.its.uni-kassel.de its-no240
192.168.205.151 its-no241.its.uni-kassel.de its-no241
192.168.205.152 its-no242.its.uni-kassel.de its-no242
192.168.205.153 its-no243.its.uni-kassel.de its-no243
192.168.205.154 its-no244.its.uni-kassel.de its-no244
192.168.205.155 its-no245.its.uni-kassel.de its-no245
192.168.205.156 its-no246.its.uni-kassel.de its-no246
192.168.205.157 its-no247.its.uni-kassel.de its-no247
192.168.205.158 its-no248.its.uni-kassel.de its-no248
192.168.205.159 its-no249.its.uni-kassel.de its-no249
192.168.205.160 its-no250.its.uni-kassel.de its-no250
192.168.205.161 its-no251.its.uni-kassel.de its-no251
192.168.205.162 its-no252.its.uni-kassel.de its-no252
192.168.205.163 its-no253.its.uni-kassel.de its-no253
192.168.205.164 its-no254.its.uni-kassel.de its-no254
192.168.205.165 its-no255.its.uni-kassel.de its-no255
192.168.205.166 its-no256.its.uni-kassel.de its-no256
192.168.205.167 its-no257.its.uni-kassel.de its-no257
192.168.205.168 its-no258.its.uni-kassel.de its-no258
192.168.205.169 its-no259.its.uni-kassel.de its-no259
192.168.205.170 its-no260.its.uni-kassel.de its-no260
192.168.205.171 its-no261.its.uni-kassel.de its-no261
192.168.205.172 its-no262.its.uni-kassel.de its-no262
192.168.205.173 its-no263.its.uni-kassel.de its-no263
192.168.205.174 its-no264.its.uni-kassel.de its-no264
192.168.205.175 its-no265.its.uni-kassel.de its-no265
192.168.205.176 its-no266.its.uni-kassel.de its-no266
192.168.205.177 its-no267.its.uni-kassel.de its-no267
192.168.205.178 its-no268.its.uni-kassel.de its-no268
192.168.205.179 its-no269.its.uni-kassel.de its-no269
192.168.205.180 its-no270.its.uni-kassel.de its-no270
192.168.205.181 its-no271.its.uni-kassel.de its-no271
192.168.205.182 its-no272.its.uni-kassel.de its-no272
192.168.205.183 its-no273.its.uni-kassel.de its-no273
192.168.205.184 its-no274.its.uni-kassel.de its-no274
192.168.205.185 its-no275.its.uni-kassel.de its-no275
192.168.205.186 its-no276.its.uni-kassel.de its-no276
192.168.205.187 its-no277.its.uni-kassel.de its-no277
192.168.205.188 its-no278.its.uni-kassel.de its-no278
192.168.205.189 its-no279.its.uni-kassel.de its-no279
192.168.205.190 its-no280.its.uni-kassel.de its-no280
192.168.205.191 its-no281.its.uni-kassel.de its-no281
192.168.205.192 its-no282.its.uni-kassel.de its-no282
192.168.205.193 its-no283.its.uni-kassel.de its-no283
192.168.205.194 its-no284.its.uni-kassel.de its-no284
192.168.205.195 its-no285.its.uni-kassel.de its-no285
192.168.205.196 its-no286.its.uni-kassel.de its-no286
192.168.205.197 its-no287.its.uni-kassel.de its-no287
192.168.205.198 its-no288.its.uni-kassel.de its-no288
192.168.205.199 its-no289.its.uni-kassel.de its-no289
192.168.205.200 its-no290.its.uni-kassel.de its-no290
192.168.205.201 its-no291.its.uni-kassel.de its-no291
192.168.205.202 its-no292.its.uni-kassel.de its-no292
192.168.205.203 its-no293.its.uni-kassel.de its-no293
192.168.205.204 its-no294.its.uni-kassel.de its-no294
192.168.205.205 its-no295.its.uni-kassel.de its-no295
192.168.205.206 its-no296.its.uni-kassel.de its-no296
192.168.205.207 its-no297.its.uni-kassel.de its-no297
192.168.205.208 its-no298.its.uni-kassel.de its-no298
192.168.205.209 its-no299.its.uni-kassel.de its-no299
192.168.168.30 its-ib1.its.uni-kassel.de its-ib1
192.168.168.170 its-ib10.its.uni-kassel.de its-ib10
192.168.168.171 its-ib11.its.uni-kassel.de its-ib11
192.168.168.172 its-ib12.its.uni-kassel.de its-ib12
192.168.168.173 its-ib13.its.uni-kassel.de its-ib13
192.168.168.174 its-ib14.its.uni-kassel.de its-ib14
192.168.168.175 its-ib15.its.uni-kassel.de its-ib15
192.168.168.176 its-ib16.its.uni-kassel.de its-ib16
192.168.168.177 its-ib17.its.uni-kassel.de its-ib17
192.168.168.178 its-ib18.its.uni-kassel.de its-ib18
192.168.168.179 its-ib19.its.uni-kassel.de its-ib19
192.168.169.10 its-ib100.its.uni-kassel.de its-ib100
192.168.169.11 its-ib101.its.uni-kassel.de its-ib101
192.168.169.12 its-ib102.its.uni-kassel.de its-ib102
192.168.169.13 its-ib103.its.uni-kassel.de its-ib103
192.168.169.14 its-ib104.its.uni-kassel.de its-ib104
192.168.169.15 its-ib105.its.uni-kassel.de its-ib105
192.168.169.16 its-ib106.its.uni-kassel.de its-ib106
192.168.169.17 its-ib107.its.uni-kassel.de its-ib107
192.168.169.18 its-ib108.its.uni-kassel.de its-ib108
192.168.169.19 its-ib109.its.uni-kassel.de its-ib109
192.168.169.20 its-ib110.its.uni-kassel.de its-ib110
192.168.169.21 its-ib111.its.uni-kassel.de its-ib111
192.168.169.22 its-ib112.its.uni-kassel.de its-ib112
192.168.169.23 its-ib113.its.uni-kassel.de its-ib113
192.168.169.24 its-ib114.its.uni-kassel.de its-ib114
192.168.169.25 its-ib115.its.uni-kassel.de its-ib115
192.168.169.26 its-ib116.its.uni-kassel.de its-ib116
192.168.169.27 its-ib117.its.uni-kassel.de its-ib117
192.168.169.28 its-ib118.its.uni-kassel.de its-ib118
192.168.169.29 its-ib119.its.uni-kassel.de its-ib119
192.168.169.30 its-ib120.its.uni-kassel.de its-ib120
192.168.169.31 its-ib121.its.uni-kassel.de its-ib121
192.168.169.32 its-ib122.its.uni-kassel.de its-ib122
192.168.169.33 its-ib123.its.uni-kassel.de its-ib123
192.168.169.34 its-ib124.its.uni-kassel.de its-ib124
192.168.169.35 its-ib125.its.uni-kassel.de its-ib125
192.168.169.36 its-ib126.its.uni-kassel.de its-ib126
192.168.169.37 its-ib127.its.uni-kassel.de its-ib127
192.168.169.38 its-ib128.its.uni-kassel.de its-ib128
192.168.169.39 its-ib129.its.uni-kassel.de its-ib129
192.168.169.40 its-ib130.its.uni-kassel.de its-ib130
192.168.169.41 its-ib131.its.uni-kassel.de its-ib131
192.168.169.42 its-ib132.its.uni-kassel.de its-ib132
192.168.169.43 its-ib133.its.uni-kassel.de its-ib133
192.168.169.44 its-ib134.its.uni-kassel.de its-ib134
192.168.169.45 its-ib135.its.uni-kassel.de its-ib135
192.168.169.46 its-ib136.its.uni-kassel.de its-ib136
192.168.169.47 its-ib137.its.uni-kassel.de its-ib137
192.168.169.48 its-ib138.its.uni-kassel.de its-ib138
192.168.169.49 its-ib139.its.uni-kassel.de its-ib139
192.168.169.50 its-ib140.its.uni-kassel.de its-ib140
192.168.169.51 its-ib141.its.uni-kassel.de its-ib141
192.168.169.52 its-ib142.its.uni-kassel.de its-ib142
192.168.169.53 its-ib143.its.uni-kassel.de its-ib143
192.168.169.54 its-ib144.its.uni-kassel.de its-ib144
192.168.169.55 its-ib145.its.uni-kassel.de its-ib145
192.168.169.56 its-ib146.its.uni-kassel.de its-ib146
192.168.169.57 its-ib147.its.uni-kassel.de its-ib147
192.168.169.58 its-ib148.its.uni-kassel.de its-ib148
192.168.169.59 its-ib149.its.uni-kassel.de its-ib149
192.168.169.60 its-ib150.its.uni-kassel.de its-ib150
192.168.169.61 its-ib151.its.uni-kassel.de its-ib151
192.168.169.62 its-ib152.its.uni-kassel.de its-ib152
192.168.169.63 its-ib153.its.uni-kassel.de its-ib153
192.168.169.64 its-ib154.its.uni-kassel.de its-ib154
192.168.169.65 its-ib155.its.uni-kassel.de its-ib155
192.168.169.66 its-ib156.its.uni-kassel.de its-ib156
192.168.169.67 its-ib157.its.uni-kassel.de its-ib157
192.168.169.68 its-ib158.its.uni-kassel.de its-ib158
192.168.169.69 its-ib159.its.uni-kassel.de its-ib159
192.168.169.70 its-ib160.its.uni-kassel.de its-ib160
192.168.169.71 its-ib161.its.uni-kassel.de its-ib161
192.168.169.72 its-ib162.its.uni-kassel.de its-ib162
192.168.169.73 its-ib163.its.uni-kassel.de its-ib163
192.168.169.74 its-ib164.its.uni-kassel.de its-ib164
192.168.169.75 its-ib165.its.uni-kassel.de its-ib165
192.168.169.76 its-ib166.its.uni-kassel.de its-ib166
192.168.169.77 its-ib167.its.uni-kassel.de its-ib167
192.168.169.78 its-ib168.its.uni-kassel.de its-ib168
192.168.169.79 its-ib169.its.uni-kassel.de its-ib169
192.168.169.80 its-ib170.its.uni-kassel.de its-ib170
192.168.169.81 its-ib171.its.uni-kassel.de its-ib171
192.168.169.82 its-ib172.its.uni-kassel.de its-ib172
192.168.169.83 its-ib173.its.uni-kassel.de its-ib173
192.168.169.84 its-ib174.its.uni-kassel.de its-ib174
192.168.169.85 its-ib175.its.uni-kassel.de its-ib175
192.168.169.86 its-ib176.its.uni-kassel.de its-ib176
192.168.169.87 its-ib177.its.uni-kassel.de its-ib177
192.168.169.88 its-ib178.its.uni-kassel.de its-ib178
192.168.169.89 its-ib179.its.uni-kassel.de its-ib179
192.168.169.90 its-ib180.its.uni-kassel.de its-ib180
192.168.169.91 its-ib181.its.uni-kassel.de its-ib181
192.168.169.92 its-ib182.its.uni-kassel.de its-ib182
192.168.169.93 its-ib183.its.uni-kassel.de its-ib183
192.168.169.94 its-ib184.its.uni-kassel.de its-ib184
192.168.169.95 its-ib185.its.uni-kassel.de its-ib185
192.168.169.96 its-ib186.its.uni-kassel.de its-ib186
192.168.169.97 its-ib187.its.uni-kassel.de its-ib187
192.168.169.98 its-ib188.its.uni-kassel.de its-ib188
192.168.169.99 its-ib189.its.uni-kassel.de its-ib189
192.168.169.100 its-ib190.its.uni-kassel.de its-ib190
192.168.169.101 its-ib191.its.uni-kassel.de its-ib191
192.168.169.102 its-ib192.its.uni-kassel.de its-ib192
192.168.169.103 its-ib193.its.uni-kassel.de its-ib193
192.168.169.104 its-ib194.its.uni-kassel.de its-ib194
192.168.169.105 its-ib195.its.uni-kassel.de its-ib195
192.168.169.106 its-ib196.its.uni-kassel.de its-ib196
192.168.169.107 its-ib197.its.uni-kassel.de its-ib197
192.168.169.108 its-ib198.its.uni-kassel.de its-ib198
192.168.169.109 its-ib199.its.uni-kassel.de its-ib199
192.168.169.110 its-ib200.its.uni-kassel.de its-ib200
192.168.169.111 its-ib201.its.uni-kassel.de its-ib201
192.168.169.112 its-ib202.its.uni-kassel.de its-ib202
192.168.169.113 its-ib203.its.uni-kassel.de its-ib203
192.168.169.114 its-ib204.its.uni-kassel.de its-ib204
192.168.169.115 its-ib205.its.uni-kassel.de its-ib205
192.168.169.116 its-ib206.its.uni-kassel.de its-ib206
192.168.169.117 its-ib207.its.uni-kassel.de its-ib207
192.168.169.118 its-ib208.its.uni-kassel.de its-ib208
192.168.169.119 its-ib209.its.uni-kassel.de its-ib209
192.168.169.120 its-ib210.its.uni-kassel.de its-ib210
192.168.169.121 its-ib211.its.uni-kassel.de its-ib211
192.168.169.122 its-ib212.its.uni-kassel.de its-ib212
192.168.169.123 its-ib213.its.uni-kassel.de its-ib213
192.168.169.124 its-ib214.its.uni-kassel.de its-ib214
192.168.169.125 its-ib215.its.uni-kassel.de its-ib215
192.168.169.126 its-ib216.its.uni-kassel.de its-ib216
192.168.169.127 its-ib217.its.uni-kassel.de its-ib217
192.168.169.128 its-ib218.its.uni-kassel.de its-ib218
192.168.169.129 its-ib219.its.uni-kassel.de its-ib219
192.168.169.130 its-ib220.its.uni-kassel.de its-ib220
192.168.169.131 its-ib221.its.uni-kassel.de its-ib221
192.168.169.132 its-ib222.its.uni-kassel.de its-ib222
192.168.169.133 its-ib223.its.uni-kassel.de its-ib223
192.168.169.134 its-ib224.its.uni-kassel.de its-ib224
192.168.169.135 its-ib225.its.uni-kassel.de its-ib225
192.168.169.136 its-ib226.its.uni-kassel.de its-ib226
192.168.169.137 its-ib227.its.uni-kassel.de its-ib227
192.168.169.138 its-ib228.its.uni-kassel.de its-ib228
192.168.169.139 its-ib229.its.uni-kassel.de its-ib229
192.168.169.140 its-ib230.its.uni-kassel.de its-ib230
192.168.169.141 its-ib231.its.uni-kassel.de its-ib231
192.168.169.142 its-ib232.its.uni-kassel.de its-ib232
192.168.169.143 its-ib233.its.uni-kassel.de its-ib233
192.168.169.144 its-ib234.its.uni-kassel.de its-ib234
192.168.169.145 its-ib235.its.uni-kassel.de its-ib235
192.168.169.146 its-ib236.its.uni-kassel.de its-ib236
192.168.169.147 its-ib237.its.uni-kassel.de its-ib237
192.168.169.148 its-ib238.its.uni-kassel.de its-ib238
192.168.169.149 its-ib239.its.uni-kassel.de its-ib239
192.168.169.150 its-ib240.its.uni-kassel.de its-ib240
192.168.169.151 its-ib241.its.uni-kassel.de its-ib241
192.168.169.152 its-ib242.its.uni-kassel.de its-ib242
192.168.169.153 its-ib243.its.uni-kassel.de its-ib243
192.168.169.154 its-ib244.its.uni-kassel.de its-ib244
192.168.169.155 its-ib245.its.uni-kassel.de its-ib245
192.168.169.156 its-ib246.its.uni-kassel.de its-ib246
192.168.169.157 its-ib247.its.uni-kassel.de its-ib247
192.168.169.158 its-ib248.its.uni-kassel.de its-ib248
192.168.169.159 its-ib249.its.uni-kassel.de its-ib249
192.168.169.160 its-ib250.its.uni-kassel.de its-ib250
192.168.169.161 its-ib251.its.uni-kassel.de its-ib251
192.168.169.162 its-ib252.its.uni-kassel.de its-ib252
192.168.169.163 its-ib253.its.uni-kassel.de its-ib253
192.168.169.164 its-ib254.its.uni-kassel.de its-ib254
192.168.169.165 its-ib255.its.uni-kassel.de its-ib255
192.168.169.166 its-ib256.its.uni-kassel.de its-ib256
192.168.169.167 its-ib257.its.uni-kassel.de its-ib257
192.168.169.168 its-ib258.its.uni-kassel.de its-ib258
192.168.169.169 its-ib259.its.uni-kassel.de its-ib259
192.168.169.170 its-ib260.its.uni-kassel.de its-ib260
192.168.169.171 its-ib261.its.uni-kassel.de its-ib261
192.168.169.172 its-ib262.its.uni-kassel.de its-ib262
192.168.169.173 its-ib263.its.uni-kassel.de its-ib263
192.168.169.174 its-ib264.its.uni-kassel.de its-ib264
192.168.169.175 its-ib265.its.uni-kassel.de its-ib265
192.168.169.176 its-ib266.its.uni-kassel.de its-ib266
192.168.169.177 its-ib267.its.uni-kassel.de its-ib267
192.168.169.178 its-ib268.its.uni-kassel.de its-ib268
192.168.169.179 its-ib269.its.uni-kassel.de its-ib269
192.168.169.180 its-ib270.its.uni-kassel.de its-ib270
192.168.169.181 its-ib271.its.uni-kassel.de its-ib271
192.168.169.182 its-ib272.its.uni-kassel.de its-ib272
192.168.169.183 its-ib273.its.uni-kassel.de its-ib273
192.168.169.184 its-ib274.its.uni-kassel.de its-ib274
192.168.169.185 its-ib275.its.uni-kassel.de its-ib275
192.168.169.186 its-ib276.its.uni-kassel.de its-ib276
192.168.169.187 its-ib277.its.uni-kassel.de its-ib277
192.168.169.188 its-ib278.its.uni-kassel.de its-ib278
192.168.169.189 its-ib279.its.uni-kassel.de its-ib279
192.168.169.190 its-ib280.its.uni-kassel.de its-ib280
192.168.169.191 its-ib281.its.uni-kassel.de its-ib281
192.168.169.192 its-ib282.its.uni-kassel.de its-ib282
192.168.169.193 its-ib283.its.uni-kassel.de its-ib283
192.168.169.194 its-ib284.its.uni-kassel.de its-ib284
192.168.169.195 its-ib285.its.uni-kassel.de its-ib285
192.168.169.196 its-ib286.its.uni-kassel.de its-ib286
192.168.169.197 its-ib287.its.uni-kassel.de its-ib287
192.168.169.198 its-ib288.its.uni-kassel.de its-ib288
192.168.169.199 its-ib289.its.uni-kassel.de its-ib289
192.168.169.200 its-ib290.its.uni-kassel.de its-ib290
192.168.169.201 its-ib291.its.uni-kassel.de its-ib291
192.168.169.202 its-ib292.its.uni-kassel.de its-ib292
192.168.169.203 its-ib293.its.uni-kassel.de its-ib293
192.168.169.204 its-ib294.its.uni-kassel.de its-ib294
192.168.169.205 its-ib295.its.uni-kassel.de its-ib295
192.168.169.206 its-ib296.its.uni-kassel.de its-ib296
192.168.169.207 its-ib297.its.uni-kassel.de its-ib297
192.168.169.208 its-ib298.its.uni-kassel.de its-ib298
192.168.169.209 its-ib299.its.uni-kassel.de its-ib299
141.51.205.210 its-cs300.its.uni-kassel.de its-cs300
141.51.205.211 its-cs301.its.uni-kassel.de its-cs301
141.51.205.212 its-cs302.its.uni-kassel.de its-cs302
141.51.205.213 its-cs303.its.uni-kassel.de its-cs303
141.51.205.214 its-cs304.its.uni-kassel.de its-cs304
141.51.205.215 its-cs305.its.uni-kassel.de its-cs305
141.51.205.216 its-cs306.its.uni-kassel.de its-cs306
141.51.205.217 its-cs307.its.uni-kassel.de its-cs307
141.51.205.218 its-cs308.its.uni-kassel.de its-cs308
141.51.205.219 its-cs309.its.uni-kassel.de its-cs309
141.51.205.220 its-cs310.its.uni-kassel.de its-cs310
141.51.205.221 its-cs311.its.uni-kassel.de its-cs311
141.51.205.222 its-cs312.its.uni-kassel.de its-cs312
141.51.205.223 its-cs313.its.uni-kassel.de its-cs313
141.51.205.224 its-cs314.its.uni-kassel.de its-cs314
141.51.205.225 its-cs315.its.uni-kassel.de its-cs315
141.51.205.226 its-cs316.its.uni-kassel.de its-cs316
141.51.205.227 its-cs317.its.uni-kassel.de its-cs317
141.51.205.228 its-cs318.its.uni-kassel.de its-cs318
141.51.205.229 its-cs319.its.uni-kassel.de its-cs319
141.51.205.230 its-cs320.its.uni-kassel.de its-cs320
141.51.205.231 its-cs321.its.uni-kassel.de its-cs321
141.51.205.232 its-cs322.its.uni-kassel.de its-cs322
141.51.205.233 its-cs323.its.uni-kassel.de its-cs323
141.51.205.234 its-cs324.its.uni-kassel.de its-cs324
141.51.205.235 its-cs325.its.uni-kassel.de its-cs325
141.51.205.236 its-cs326.its.uni-kassel.de its-cs326
141.51.205.237 its-cs327.its.uni-kassel.de its-cs327
141.51.205.238 its-cs328.its.uni-kassel.de its-cs328
141.51.205.239 its-cs329.its.uni-kassel.de its-cs329
141.51.205.240 its-cs330.its.uni-kassel.de its-cs330
141.51.205.241 its-cs331.its.uni-kassel.de its-cs331
141.51.205.242 its-cs332.its.uni-kassel.de its-cs332
141.51.205.243 its-cs333.its.uni-kassel.de its-cs333
141.51.205.244 its-cs334.its.uni-kassel.de its-cs334
141.51.205.245 its-cs335.its.uni-kassel.de its-cs335
141.51.205.246 its-cs336.its.uni-kassel.de its-cs336
141.51.205.247 its-cs337.its.uni-kassel.de its-cs337
141.51.205.248 its-cs338.its.uni-kassel.de its-cs338
141.51.205.249 its-cs339.its.uni-kassel.de its-cs339
141.51.205.250 its-cs340.its.uni-kassel.de its-cs340
141.51.205.251 its-cs341.its.uni-kassel.de its-cs341
141.51.205.252 its-cs342.its.uni-kassel.de its-cs342
141.51.205.253 its-cs343.its.uni-kassel.de its-cs343
141.51.205.254 its-cs344.its.uni-kassel.de its-cs344
192.168.205.210 its-no300.its.uni-kassel.de its-no300
192.168.205.211 its-no301.its.uni-kassel.de its-no301
192.168.205.212 its-no302.its.uni-kassel.de its-no302
192.168.205.213 its-no303.its.uni-kassel.de its-no303
192.168.205.214 its-no304.its.uni-kassel.de its-no304
192.168.205.215 its-no305.its.uni-kassel.de its-no305
192.168.205.216 its-no306.its.uni-kassel.de its-no306
192.168.205.217 its-no307.its.uni-kassel.de its-no307
192.168.205.218 its-no308.its.uni-kassel.de its-no308
192.168.205.219 its-no309.its.uni-kassel.de its-no309
192.168.205.220 its-no310.its.uni-kassel.de its-no310
192.168.205.221 its-no311.its.uni-kassel.de its-no311
192.168.205.222 its-no312.its.uni-kassel.de its-no312
192.168.205.223 its-no313.its.uni-kassel.de its-no313
192.168.205.224 its-no314.its.uni-kassel.de its-no314
192.168.205.225 its-no315.its.uni-kassel.de its-no315
192.168.205.226 its-no316.its.uni-kassel.de its-no316
192.168.205.227 its-no317.its.uni-kassel.de its-no317
192.168.205.228 its-no318.its.uni-kassel.de its-no318
192.168.205.229 its-no319.its.uni-kassel.de its-no319
192.168.205.230 its-no320.its.uni-kassel.de its-no320
192.168.205.231 its-no321.its.uni-kassel.de its-no321
192.168.205.232 its-no322.its.uni-kassel.de its-no322
192.168.205.233 its-no323.its.uni-kassel.de its-no323
192.168.205.234 its-no324.its.uni-kassel.de its-no324
192.168.205.235 its-no325.its.uni-kassel.de its-no325
192.168.205.236 its-no326.its.uni-kassel.de its-no326
192.168.205.237 its-no327.its.uni-kassel.de its-no327
192.168.205.238 its-no328.its.uni-kassel.de its-no328
192.168.205.239 its-no329.its.uni-kassel.de its-no329
192.168.205.240 its-no330.its.uni-kassel.de its-no330
192.168.205.241 its-no331.its.uni-kassel.de its-no331
192.168.205.242 its-no332.its.uni-kassel.de its-no332
192.168.205.243 its-no333.its.uni-kassel.de its-no333
192.168.205.244 its-no334.its.uni-kassel.de its-no334
192.168.205.245 its-no335.its.uni-kassel.de its-no335
192.168.205.246 its-no336.its.uni-kassel.de its-no336
192.168.205.247 its-no337.its.uni-kassel.de its-no337
192.168.205.248 its-no338.its.uni-kassel.de its-no338
192.168.205.249 its-no339.its.uni-kassel.de its-no339
192.168.205.250 its-no340.its.uni-kassel.de its-no340
192.168.205.251 its-no341.its.uni-kassel.de its-no341
192.168.205.252 its-no342.its.uni-kassel.de its-no342
192.168.205.253 its-no343.its.uni-kassel.de its-no343
192.168.205.254 its-no344.its.uni-kassel.de its-no344
192.168.169.210 its-ib300.its.uni-kassel.de its-ib300
192.168.169.211 its-ib301.its.uni-kassel.de its-ib301
192.168.169.212 its-ib302.its.uni-kassel.de its-ib302
192.168.169.213 its-ib303.its.uni-kassel.de its-ib303
192.168.169.214 its-ib304.its.uni-kassel.de its-ib304
192.168.169.215 its-ib305.its.uni-kassel.de its-ib305
192.168.169.216 its-ib306.its.uni-kassel.de its-ib306
192.168.169.217 its-ib307.its.uni-kassel.de its-ib307
192.168.169.218 its-ib308.its.uni-kassel.de its-ib308
192.168.169.219 its-ib309.its.uni-kassel.de its-ib309
192.168.169.220 its-ib310.its.uni-kassel.de its-ib310
192.168.169.221 its-ib311.its.uni-kassel.de its-ib311
192.168.169.222 its-ib312.its.uni-kassel.de its-ib312
192.168.169.223 its-ib313.its.uni-kassel.de its-ib313
192.168.169.224 its-ib314.its.uni-kassel.de its-ib314
192.168.169.225 its-ib315.its.uni-kassel.de its-ib315
192.168.169.226 its-ib316.its.uni-kassel.de its-ib316
192.168.169.227 its-ib317.its.uni-kassel.de its-ib317
192.168.169.228 its-ib318.its.uni-kassel.de its-ib318
192.168.169.229 its-ib319.its.uni-kassel.de its-ib319
192.168.169.230 its-ib320.its.uni-kassel.de its-ib320
192.168.169.231 its-ib321.its.uni-kassel.de its-ib321
192.168.169.232 its-ib322.its.uni-kassel.de its-ib322
192.168.169.233 its-ib323.its.uni-kassel.de its-ib323
192.168.169.234 its-ib324.its.uni-kassel.de its-ib324
192.168.169.235 its-ib325.its.uni-kassel.de its-ib325
192.168.169.236 its-ib326.its.uni-kassel.de its-ib326
192.168.169.237 its-ib327.its.uni-kassel.de its-ib327
192.168.169.238 its-ib328.its.uni-kassel.de its-ib328
192.168.169.239 its-ib329.its.uni-kassel.de its-ib329
192.168.169.240 its-ib330.its.uni-kassel.de its-ib330
192.168.169.241 its-ib331.its.uni-kassel.de its-ib331
192.168.169.242 its-ib332.its.uni-kassel.de its-ib332
192.168.169.243 its-ib333.its.uni-kassel.de its-ib333
192.168.169.244 its-ib334.its.uni-kassel.de its-ib334
192.168.169.245 its-ib335.its.uni-kassel.de its-ib335
192.168.169.246 its-ib336.its.uni-kassel.de its-ib336
192.168.169.247 its-ib337.its.uni-kassel.de its-ib337
192.168.169.248 its-ib338.its.uni-kassel.de its-ib338
192.168.169.249 its-ib339.its.uni-kassel.de its-ib339
192.168.169.250 its-ib340.its.uni-kassel.de its-ib340
192.168.169.251 its-ib341.its.uni-kassel.de its-ib341
192.168.169.252 its-ib342.its.uni-kassel.de its-ib342
192.168.169.253 its-ib343.its.uni-kassel.de its-ib343
192.168.169.254 its-ib344.its.uni-kassel.de its-ib344
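A quick sanity check on a hosts file this large (a diagnostic sketch, not part of the original thread): grep for the loopback addresses and the literal name to confirm Elmar's observation that neither localhost nor 127.0.0.1/127.0.1.1 is bound in it.

```shell
# Report any line of /etc/hosts that binds a loopback address or the
# name "localhost". No output means no such entry exists in the file.
grep -nE '(^|[[:space:]])(127\.0\.0\.1|127\.0\.1\.1|localhost)([[:space:]]|$)' /etc/hosts
```

If this prints nothing yet `ping localhost` still resolves, the name is coming from somewhere else (e.g. the resolver configuration), which matches what the cluster technician was asked to investigate.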


> Hello there,
>
>      Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek 
> <macek@cs.uni-kassel.de <ma...@cs.uni-kassel.de>> wrote:
>
>     Hi,
>
>     I am currently trying to run my Hadoop program on a cluster. Sadly,
>     my datanodes and tasktrackers seem to have difficulties with their
>     communication, as their logs show:
>     * Some datanodes and tasktrackers appear to have port problems of
>     some kind, as can be seen in the logs below. I wondered if this
>     might be correlated with the localhost entry in /etc/hosts, as
>     suggested in a lot of posts with similar errors, but I checked the
>     file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there.
>     (Although you can ping localhost... the technician of the cluster
>     said he would look into how localhost gets resolved.)
>     * The other nodes cannot talk to the namenode and jobtracker
>     (its-cs131). It is entirely unclear why this happens: the
>     "dfs -put" that I run directly before submitting the job completes
>     fine, which seems to imply that communication between those
>     servers works flawlessly.
>
>     Is there any reason why this might happen?
>
>
>     Regards,
>     Elmar
>
>     LOGS BELOW:
>
>     \____Datanodes
>
>     After successfully putting the data into HDFS (at which point I
>     assumed the namenode and datanodes had to communicate), I get the
>     following errors when starting the job:
>
>     There are 2 kinds of logs I found: the first one is big (about
>     12 MB) and looks like this:
>     ############################### LOG TYPE 1
>     ############################################################
>     2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>     2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>     2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>     2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>     2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>     2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>     2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>     2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>     2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>     2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>     2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at $Proxy5.sendHeartbeat(Unknown Source)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>         at java.lang.Thread.run(Thread.java:619)
>     Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>         ... 5 more
>
>     ... (this continues until the end of the log)
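"Connection refused" in the trace above means nothing was accepting connections on its-cs131:35554 at that moment. A quick reachability probe can separate "namenode not listening" from "name resolves to the wrong address"; a minimal sketch using bash's built-in /dev/tcp device (host and port copied from the log, swap in your own):

```shell
#!/usr/bin/env bash
# Probe whether host:port accepts TCP connections, using bash's /dev/tcp.
probe() {
  local host=$1 port=$2
  # Opening fd 3 on /dev/tcp/<host>/<port> attempts a TCP connect;
  # the subshell keeps the fd from leaking into the caller.
  if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Values taken from the log above; run this from an affected datanode.
probe its-cs131 35554
```

If this prints "closed" while the NameNode process is up, it is worth checking on its-cs131 which interface the RPC server actually bound to.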
>
>     The second kind is short:
>     ########################### LOG TYPE 2
>     ############################################################
>     2012-08-13 00:59:19,038 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>     /************************************************************
>     STARTUP_MSG: Starting DataNode
>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>     STARTUP_MSG:   args = []
>     STARTUP_MSG:   version = 1.0.2
>     STARTUP_MSG:   build =
>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>     ************************************************************/
>     2012-08-13 00:59:19,203 INFO
>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>     from hadoop-metrics2.properties
>     2012-08-13 00:59:19,216 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source MetricsSystem,sub=Stats registered.
>     2012-08-13 00:59:19,217 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>     snapshot period at 10 second(s).
>     2012-08-13 00:59:19,218 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode
>     metrics system started
>     2012-08-13 00:59:19,306 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ugi registered.
>     2012-08-13 00:59:19,346 INFO
>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>     library
>     2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>     2012-08-13 00:59:21,584 INFO
>     org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>     /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>     2012-08-13 00:59:21,584 INFO
>     org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>     2012-08-13 00:59:21,787 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>     FSDatasetStatusMBean
>     2012-08-13 00:59:21,897 INFO
>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>     Shutting down all async disk service threads...
>     2012-08-13 00:59:21,897 INFO
>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>     All async disk service threads have been shut down.
>     2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>         at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>     Caused by: java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>         ... 7 more
>
>     2012-08-13 00:59:21,899 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>     /************************************************************
>     SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>     ************************************************************/
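The BindException in log type 2 means some process already holds port 50010 on that node, often a DataNode left over from a previous run that was not fully stopped. A way to check, assuming a Linux node with `ss` available (the port numbers are the ones from these logs):

```shell
#!/usr/bin/env bash
# Report anything already listening on the ports from the logs.
check_port() {
  local port=$1
  # -tlnp: TCP listeners, numeric, with owning process (process info
  # may need root). Falls back to a "free" message when nothing matches.
  ss -tlnp 2>/dev/null | grep ":${port} " || echo "port ${port} is free"
}

check_port 50010   # DataNode data transfer port (log type 2 above)
check_port 50060   # TaskTracker HTTP port (same BindException later in the thread)
```

If a stale Hadoop process shows up, stop or kill it before restarting the daemons; `jps` on the node lists the running Hadoop JVMs.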
>
>
>
>
>
>     \_____TaskTracker
>     With the TaskTrackers it is the same: there are two kinds of logs.
>     ############################### LOG TYPE 1
>     ############################################################
>     2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>     Resending 'status' to 'its-cs131' with reponseId '879
>     2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>     2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>     2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>     2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>     2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>     2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>     2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>     2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>     2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>     2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>     2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>     Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>         ... 6 more
>
>
>     ########################### LOG TYPE 2
>     ############################################################
>     2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>     STARTUP_MSG:
>     /************************************************************
>     STARTUP_MSG: Starting TaskTracker
>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>     STARTUP_MSG:   args = []
>     STARTUP_MSG:   version = 1.0.2
>     STARTUP_MSG:   build =
>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>     ************************************************************/
>     2012-08-13 00:59:24,569 INFO
>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>     from hadoop-metrics2.properties
>     2012-08-13 00:59:24,626 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source MetricsSystem,sub=Stats registered.
>     2012-08-13 00:59:24,627 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>     snapshot period at 10 second(s).
>     2012-08-13 00:59:24,627 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker
>     metrics system started
>     2012-08-13 00:59:24,950 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ugi registered.
>     2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>     org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>     org.mortbay.log.Slf4jLog
>     2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer:
>     Added global filtersafety
>     (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>     2012-08-13 00:59:25,232 INFO
>     org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
>     truncater with mapRetainSize=-1 and reduceRetainSize=-1
>     2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>     Starting tasktracker with owner as bmacek
>     2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker:
>     Good mapred local directories are:
>     /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>     2012-08-13 00:59:25,244 INFO
>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>     library
>     2012-08-13 00:59:25,255 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source jvm registered.
>     2012-08-13 00:59:25,256 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source TaskTrackerMetrics registered.
>     2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server:
>     Starting SocketReader
>     2012-08-13 00:59:25,282 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source RpcDetailedActivityForPort54850 registered.
>     2012-08-13 00:59:25,282 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source RpcActivityForPort54850 registered.
>     2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC
>     Server Responder: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server listener on 54850: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 0 on 54850: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 1 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 3 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 2 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>     2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>     2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>     2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree:
>     setsid exited with exit code 0
>     2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker:
>     Using ResourceCalculatorPlugin :
>     org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>     2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>     TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager
>     is disabled.
>     2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>     IndexCache created with max memory = 10485760
>     2012-08-13 00:59:38,158 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ShuffleServerMetrics registered.
>     2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer:
>     Port returned by webServer.getConnectors()[0].getLocalPort()
>     before open() is -1. Opening the listener on 50060
>     2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>         at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>
>     2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>     SHUTDOWN_MSG:
>     /************************************************************
>     SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>     ************************************************************/
>
>


Re: DataNode and Tasttracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Sure I can, but it is long, since it is a cluster:


141.51.12.86  hrz-cs400.hrz.uni-kassel.de hrz-cs400

141.51.204.11 hrz-cs401.hrz.uni-kassel.de hrz-cs401
141.51.204.12 hrz-cs402.hrz.uni-kassel.de hrz-cs402
141.51.204.13 hrz-cs403.hrz.uni-kassel.de hrz-cs403
141.51.204.14 hrz-cs404.hrz.uni-kassel.de hrz-cs404
141.51.204.15 hrz-cs405.hrz.uni-kassel.de hrz-cs405
141.51.204.16 hrz-cs406.hrz.uni-kassel.de hrz-cs406
141.51.204.17 hrz-cs407.hrz.uni-kassel.de hrz-cs407
141.51.204.18 hrz-cs408.hrz.uni-kassel.de hrz-cs408
141.51.204.19 hrz-cs409.hrz.uni-kassel.de hrz-cs409
141.51.204.20 hrz-cs410.hrz.uni-kassel.de hrz-cs410
141.51.204.21 hrz-cs411.hrz.uni-kassel.de hrz-cs411
141.51.204.22 hrz-cs412.hrz.uni-kassel.de hrz-cs412
141.51.204.23 hrz-cs413.hrz.uni-kassel.de hrz-cs413
141.51.204.24 hrz-cs414.hrz.uni-kassel.de hrz-cs414
141.51.204.25 hrz-cs415.hrz.uni-kassel.de hrz-cs415
141.51.204.26 hrz-cs416.hrz.uni-kassel.de hrz-cs416
141.51.204.27 hrz-cs417.hrz.uni-kassel.de hrz-cs417
141.51.204.28 hrz-cs418.hrz.uni-kassel.de hrz-cs418
141.51.204.29 hrz-cs419.hrz.uni-kassel.de hrz-cs419
141.51.204.31 hrz-cs421.hrz.uni-kassel.de hrz-cs421
141.51.204.32 hrz-cs422.hrz.uni-kassel.de hrz-cs422
141.51.204.33 hrz-cs423.hrz.uni-kassel.de hrz-cs423
141.51.204.34 hrz-cs424.hrz.uni-kassel.de hrz-cs424
141.51.204.35 hrz-cs425.hrz.uni-kassel.de hrz-cs425
141.51.204.36 hrz-cs426.hrz.uni-kassel.de hrz-cs426
141.51.204.37 hrz-cs427.hrz.uni-kassel.de hrz-cs427
141.51.204.38 hrz-cs428.hrz.uni-kassel.de hrz-cs428
141.51.204.39 hrz-cs429.hrz.uni-kassel.de hrz-cs429
141.51.204.40 hrz-cs430.hrz.uni-kassel.de hrz-cs430
141.51.204.47 hrz-cs437.hrz.uni-kassel.de hrz-cs437
141.51.204.48 hrz-cs438.hrz.uni-kassel.de hrz-cs438
141.51.204.49 hrz-cs439.hrz.uni-kassel.de hrz-cs439
141.51.204.50 hrz-cs440.hrz.uni-kassel.de hrz-cs440
141.51.204.51 hrz-cs441.hrz.uni-kassel.de hrz-cs441
141.51.204.54 hrz-cs444.hrz.uni-kassel.de hrz-cs444
141.51.204.65 hrz-cs455.hrz.uni-kassel.de hrz-cs455
141.51.204.66 hrz-cs456.hrz.uni-kassel.de hrz-cs456
141.51.204.69 hrz-cs459.hrz.uni-kassel.de hrz-cs459
141.51.204.70 hrz-cs460.hrz.uni-kassel.de hrz-cs460
141.51.204.71 hrz-cs461.hrz.uni-kassel.de hrz-cs461
141.51.204.72 hrz-cs462.hrz.uni-kassel.de hrz-cs462
141.51.204.73 hrz-cs463.hrz.uni-kassel.de hrz-cs463
141.51.204.74 hrz-cs464.hrz.uni-kassel.de hrz-cs464
141.51.204.75 hrz-cs465.hrz.uni-kassel.de hrz-cs465
141.51.204.76 hrz-cs466.hrz.uni-kassel.de hrz-cs466
141.51.204.77 hrz-cs467.hrz.uni-kassel.de hrz-cs467
141.51.204.78 hrz-cs468.hrz.uni-kassel.de hrz-cs468
141.51.204.79 hrz-cs469.hrz.uni-kassel.de hrz-cs469
141.51.204.80 hrz-cs470.hrz.uni-kassel.de hrz-cs470
141.51.204.81 hrz-cs471.hrz.uni-kassel.de hrz-cs471
141.51.204.82 hrz-cs472.hrz.uni-kassel.de hrz-cs472
141.51.204.83 hrz-cs473.hrz.uni-kassel.de hrz-cs473
141.51.204.84 hrz-cs474.hrz.uni-kassel.de hrz-cs474
141.51.204.85 hrz-cs475.hrz.uni-kassel.de hrz-cs475
141.51.204.86 hrz-cs476.hrz.uni-kassel.de hrz-cs476
141.51.204.87 hrz-cs477.hrz.uni-kassel.de hrz-cs477
141.51.204.88 hrz-cs478.hrz.uni-kassel.de hrz-cs478
141.51.204.89 hrz-cs479.hrz.uni-kassel.de hrz-cs479
141.51.204.90 hrz-cs480.hrz.uni-kassel.de hrz-cs480
141.51.204.91 hrz-cs481.hrz.uni-kassel.de hrz-cs481
141.51.204.92 hrz-cs482.hrz.uni-kassel.de hrz-cs482
141.51.204.93 hrz-cs483.hrz.uni-kassel.de hrz-cs483
141.51.204.94 hrz-cs484.hrz.uni-kassel.de hrz-cs484
141.51.204.95 hrz-cs485.hrz.uni-kassel.de hrz-cs485
141.51.204.96 hrz-cs486.hrz.uni-kassel.de hrz-cs486
141.51.204.97 hrz-cs487.hrz.uni-kassel.de hrz-cs487
141.51.204.98 hrz-cs488.hrz.uni-kassel.de hrz-cs488
141.51.204.99 hrz-cs489.hrz.uni-kassel.de hrz-cs489
141.51.204.100 hrz-cs490.hrz.uni-kassel.de hrz-cs490
141.51.204.101 hrz-cs491.hrz.uni-kassel.de hrz-cs491
141.51.204.102 hrz-cs492.hrz.uni-kassel.de hrz-cs492
141.51.204.103 hrz-cs493.hrz.uni-kassel.de hrz-cs493
141.51.204.104 hrz-cs494.hrz.uni-kassel.de hrz-cs494
141.51.204.105 hrz-cs495.hrz.uni-kassel.de hrz-cs495
141.51.204.106 hrz-cs496.hrz.uni-kassel.de hrz-cs496
141.51.204.107 hrz-cs497.hrz.uni-kassel.de hrz-cs497
141.51.204.108 hrz-cs498.hrz.uni-kassel.de hrz-cs498
141.51.204.109 hrz-cs499.hrz.uni-kassel.de hrz-cs499
141.51.204.110 hrz-cs500.hrz.uni-kassel.de hrz-cs500
141.51.204.111 hrz-cs501.hrz.uni-kassel.de hrz-cs501
141.51.204.112 hrz-cs502.hrz.uni-kassel.de hrz-cs502
141.51.204.113 hrz-cs503.hrz.uni-kassel.de hrz-cs503
141.51.204.114 hrz-cs504.hrz.uni-kassel.de hrz-cs504
141.51.204.115 hrz-cs505.hrz.uni-kassel.de hrz-cs505
141.51.204.116 hrz-cs506.hrz.uni-kassel.de hrz-cs506
141.51.204.117 hrz-cs507.hrz.uni-kassel.de hrz-cs507
141.51.204.118 hrz-cs508.hrz.uni-kassel.de hrz-cs508
141.51.204.119 hrz-cs509.hrz.uni-kassel.de hrz-cs509
141.51.204.120 hrz-cs510.hrz.uni-kassel.de hrz-cs510
141.51.204.121 hrz-cs511.hrz.uni-kassel.de hrz-cs511
141.51.204.122 hrz-cs512.hrz.uni-kassel.de hrz-cs512
141.51.204.123 hrz-cs513.hrz.uni-kassel.de hrz-cs513
141.51.204.124 hrz-cs514.hrz.uni-kassel.de hrz-cs514
141.51.204.125 hrz-cs515.hrz.uni-kassel.de hrz-cs515
141.51.204.126 hrz-cs516.hrz.uni-kassel.de hrz-cs516
141.51.204.127 hrz-cs517.hrz.uni-kassel.de hrz-cs517
141.51.204.128 hrz-cs518.hrz.uni-kassel.de hrz-cs518
141.51.204.129 hrz-cs519.hrz.uni-kassel.de hrz-cs519
141.51.204.130 hrz-cs520.hrz.uni-kassel.de hrz-cs520
141.51.204.131 hrz-cs521.hrz.uni-kassel.de hrz-cs521
141.51.204.132 hrz-cs522.hrz.uni-kassel.de hrz-cs522
141.51.204.133 hrz-cs523.hrz.uni-kassel.de hrz-cs523
141.51.204.134 hrz-cs524.hrz.uni-kassel.de hrz-cs524
141.51.204.135 hrz-cs525.hrz.uni-kassel.de hrz-cs525
141.51.204.136 hrz-cs526.hrz.uni-kassel.de hrz-cs526
141.51.204.137 hrz-cs527.hrz.uni-kassel.de hrz-cs527
141.51.204.138 hrz-cs528.hrz.uni-kassel.de hrz-cs528
141.51.204.139 hrz-cs529.hrz.uni-kassel.de hrz-cs529
141.51.204.140 hrz-cs530.hrz.uni-kassel.de hrz-cs530
141.51.204.141 hrz-cs531.hrz.uni-kassel.de hrz-cs531
141.51.204.142 hrz-cs532.hrz.uni-kassel.de hrz-cs532
141.51.204.143 hrz-cs533.hrz.uni-kassel.de hrz-cs533
141.51.204.144 hrz-cs534.hrz.uni-kassel.de hrz-cs534
141.51.204.145 hrz-cs535.hrz.uni-kassel.de hrz-cs535
141.51.204.146 hrz-cs536.hrz.uni-kassel.de hrz-cs536
141.51.204.147 hrz-cs537.hrz.uni-kassel.de hrz-cs537
141.51.204.148 hrz-cs538.hrz.uni-kassel.de hrz-cs538
141.51.204.149 hrz-cs539.hrz.uni-kassel.de hrz-cs539
141.51.204.150 hrz-cs540.hrz.uni-kassel.de hrz-cs540
141.51.204.151 hrz-cs541.hrz.uni-kassel.de hrz-cs541
141.51.204.152 hrz-cs542.hrz.uni-kassel.de hrz-cs542
141.51.204.153 hrz-cs543.hrz.uni-kassel.de hrz-cs543
141.51.204.154 hrz-cs544.hrz.uni-kassel.de hrz-cs544
141.51.204.155 hrz-cs545.hrz.uni-kassel.de hrz-cs545
141.51.204.156 hrz-cs546.hrz.uni-kassel.de hrz-cs546
141.51.204.157 hrz-cs547.hrz.uni-kassel.de hrz-cs547
141.51.204.158 hrz-cs548.hrz.uni-kassel.de hrz-cs548
141.51.204.159 hrz-cs549.hrz.uni-kassel.de hrz-cs549
141.51.204.160 hrz-cs550.hrz.uni-kassel.de hrz-cs550
141.51.204.161 hrz-cs551.hrz.uni-kassel.de hrz-cs551
141.51.204.162 hrz-cs552.hrz.uni-kassel.de hrz-cs552
141.51.204.163 hrz-cs553.hrz.uni-kassel.de hrz-cs553
141.51.204.164 hrz-cs554.hrz.uni-kassel.de hrz-cs554
141.51.204.165 hrz-cs555.hrz.uni-kassel.de hrz-cs555
141.51.204.166 hrz-cs556.hrz.uni-kassel.de hrz-cs556
141.51.204.167 hrz-cs557.hrz.uni-kassel.de hrz-cs557
141.51.204.168 hrz-cs558.hrz.uni-kassel.de hrz-cs558
141.51.204.169 hrz-cs559.hrz.uni-kassel.de hrz-cs559
141.51.204.215 hrz-cs560.hrz.uni-kassel.de hrz-cs560
141.51.204.216 hrz-cs561.hrz.uni-kassel.de hrz-cs561
141.51.204.217 hrz-cs562.hrz.uni-kassel.de hrz-cs562
141.51.204.218 hrz-cs563.hrz.uni-kassel.de hrz-cs563
141.51.204.219 hrz-cs564.hrz.uni-kassel.de hrz-cs564
141.51.204.220 hrz-cs565.hrz.uni-kassel.de hrz-cs565
141.51.204.221 hrz-cs566.hrz.uni-kassel.de hrz-cs566
141.51.204.222 hrz-cs567.hrz.uni-kassel.de hrz-cs567
141.51.204.223 hrz-cs568.hrz.uni-kassel.de hrz-cs568
141.51.204.224 hrz-cs569.hrz.uni-kassel.de hrz-cs569
141.51.204.225 hrz-cs570.hrz.uni-kassel.de hrz-cs570
141.51.204.226 hrz-cs571.hrz.uni-kassel.de hrz-cs571
141.51.204.227 hrz-cs572.hrz.uni-kassel.de hrz-cs572
141.51.204.228 hrz-cs573.hrz.uni-kassel.de hrz-cs573
141.51.204.229 hrz-cs574.hrz.uni-kassel.de hrz-cs574
141.51.204.230 hrz-cs575.hrz.uni-kassel.de hrz-cs575
141.51.204.231 hrz-cs576.hrz.uni-kassel.de hrz-cs576
141.51.204.232 hrz-cs577.hrz.uni-kassel.de hrz-cs577
141.51.204.233 hrz-cs578.hrz.uni-kassel.de hrz-cs578
141.51.204.234 hrz-cs579.hrz.uni-kassel.de hrz-cs579
141.51.204.235 hrz-cs580.hrz.uni-kassel.de hrz-cs580
141.51.204.236 hrz-cs581.hrz.uni-kassel.de hrz-cs581
141.51.204.237 hrz-cs582.hrz.uni-kassel.de hrz-cs582
141.51.204.238 hrz-cs583.hrz.uni-kassel.de hrz-cs583
141.51.204.239 hrz-cs584.hrz.uni-kassel.de hrz-cs584
141.51.204.240 hrz-cs585.hrz.uni-kassel.de hrz-cs585
141.51.204.241 hrz-cs586.hrz.uni-kassel.de hrz-cs586
141.51.204.242 hrz-cs587.hrz.uni-kassel.de hrz-cs587
141.51.204.243 hrz-cs588.hrz.uni-kassel.de hrz-cs588
141.51.204.244 hrz-cs589.hrz.uni-kassel.de hrz-cs589
141.51.204.245 hrz-cs590.hrz.uni-kassel.de hrz-cs590
141.51.204.246 hrz-cs591.hrz.uni-kassel.de hrz-cs591
141.51.204.247 hrz-cs592.hrz.uni-kassel.de hrz-cs592
141.51.204.248 hrz-cs593.hrz.uni-kassel.de hrz-cs593
141.51.204.249 hrz-cs594.hrz.uni-kassel.de hrz-cs594
141.51.204.250 hrz-cs595.hrz.uni-kassel.de hrz-cs595
141.51.204.251 hrz-cs596.hrz.uni-kassel.de hrz-cs596
141.51.204.252 hrz-cs597.hrz.uni-kassel.de hrz-cs597
141.51.204.253 hrz-cs598.hrz.uni-kassel.de hrz-cs598
141.51.204.254 hrz-cs599.hrz.uni-kassel.de hrz-cs599


192.168.204.11 hrz-no401.hrz.uni-kassel.de hrz-no401
192.168.204.12 hrz-no402.hrz.uni-kassel.de hrz-no402
192.168.204.13 hrz-no403.hrz.uni-kassel.de hrz-no403
192.168.204.14 hrz-no404.hrz.uni-kassel.de hrz-no404
192.168.204.15 hrz-no405.hrz.uni-kassel.de hrz-no405
192.168.204.16 hrz-no406.hrz.uni-kassel.de hrz-no406
192.168.204.17 hrz-no407.hrz.uni-kassel.de hrz-no407
192.168.204.18 hrz-no408.hrz.uni-kassel.de hrz-no408
192.168.204.19 hrz-no409.hrz.uni-kassel.de hrz-no409
192.168.204.20 hrz-no410.hrz.uni-kassel.de hrz-no410
192.168.204.21 hrz-no411.hrz.uni-kassel.de hrz-no411
192.168.204.22 hrz-no412.hrz.uni-kassel.de hrz-no412
192.168.204.23 hrz-no413.hrz.uni-kassel.de hrz-no413
192.168.204.24 hrz-no414.hrz.uni-kassel.de hrz-no414
192.168.204.25 hrz-no415.hrz.uni-kassel.de hrz-no415
192.168.204.26 hrz-no416.hrz.uni-kassel.de hrz-no416
192.168.204.27 hrz-no417.hrz.uni-kassel.de hrz-no417
192.168.204.28 hrz-no418.hrz.uni-kassel.de hrz-no418
192.168.204.29 hrz-no419.hrz.uni-kassel.de hrz-no419
192.168.204.31 hrz-no421.hrz.uni-kassel.de hrz-no421
192.168.204.32 hrz-no422.hrz.uni-kassel.de hrz-no422
192.168.204.33 hrz-no423.hrz.uni-kassel.de hrz-no423
192.168.204.34 hrz-no424.hrz.uni-kassel.de hrz-no424
192.168.204.35 hrz-no425.hrz.uni-kassel.de hrz-no425
192.168.204.36 hrz-no426.hrz.uni-kassel.de hrz-no426
192.168.204.37 hrz-no427.hrz.uni-kassel.de hrz-no427
192.168.204.38 hrz-no428.hrz.uni-kassel.de hrz-no428
192.168.204.39 hrz-no429.hrz.uni-kassel.de hrz-no429
192.168.204.40 hrz-no430.hrz.uni-kassel.de hrz-no430
192.168.204.47 hrz-no437.hrz.uni-kassel.de hrz-no437
192.168.204.48 hrz-no438.hrz.uni-kassel.de hrz-no438
192.168.204.49 hrz-no439.hrz.uni-kassel.de hrz-no439
192.168.204.50 hrz-no440.hrz.uni-kassel.de hrz-no440
192.168.204.51 hrz-no441.hrz.uni-kassel.de hrz-no441
192.168.204.54 hrz-no444.hrz.uni-kassel.de hrz-no444
192.168.204.65 hrz-no455.hrz.uni-kassel.de hrz-no455
192.168.204.66 hrz-no456.hrz.uni-kassel.de hrz-no456
192.168.204.69 hrz-no459.hrz.uni-kassel.de hrz-no459
192.168.204.70 hrz-no460.hrz.uni-kassel.de hrz-no460
192.168.204.71 hrz-no461.hrz.uni-kassel.de hrz-no461
192.168.204.72 hrz-no462.hrz.uni-kassel.de hrz-no462
192.168.204.73 hrz-no463.hrz.uni-kassel.de hrz-no463
192.168.204.74 hrz-no464.hrz.uni-kassel.de hrz-no464
192.168.204.75 hrz-no465.hrz.uni-kassel.de hrz-no465
192.168.204.76 hrz-no466.hrz.uni-kassel.de hrz-no466
192.168.204.77 hrz-no467.hrz.uni-kassel.de hrz-no467
192.168.204.78 hrz-no468.hrz.uni-kassel.de hrz-no468
192.168.204.79 hrz-no469.hrz.uni-kassel.de hrz-no469
192.168.204.80 hrz-no470.hrz.uni-kassel.de hrz-no470
192.168.204.81 hrz-no471.hrz.uni-kassel.de hrz-no471
192.168.204.82 hrz-no472.hrz.uni-kassel.de hrz-no472
192.168.204.83 hrz-no473.hrz.uni-kassel.de hrz-no473
192.168.204.84 hrz-no474.hrz.uni-kassel.de hrz-no474
192.168.204.85 hrz-no475.hrz.uni-kassel.de hrz-no475
192.168.204.86 hrz-no476.hrz.uni-kassel.de hrz-no476
192.168.204.87 hrz-no477.hrz.uni-kassel.de hrz-no477
192.168.204.88 hrz-no478.hrz.uni-kassel.de hrz-no478
192.168.204.89 hrz-no479.hrz.uni-kassel.de hrz-no479
192.168.204.90 hrz-no480.hrz.uni-kassel.de hrz-no480
192.168.204.91 hrz-no481.hrz.uni-kassel.de hrz-no481
192.168.204.92 hrz-no482.hrz.uni-kassel.de hrz-no482
192.168.204.93 hrz-no483.hrz.uni-kassel.de hrz-no483
192.168.204.94 hrz-no484.hrz.uni-kassel.de hrz-no484
192.168.204.95 hrz-no485.hrz.uni-kassel.de hrz-no485
192.168.204.96 hrz-no486.hrz.uni-kassel.de hrz-no486
192.168.204.97 hrz-no487.hrz.uni-kassel.de hrz-no487
192.168.204.98 hrz-no488.hrz.uni-kassel.de hrz-no488
192.168.204.99 hrz-no489.hrz.uni-kassel.de hrz-no489
192.168.204.100 hrz-no490.hrz.uni-kassel.de hrz-no490
192.168.204.101 hrz-no491.hrz.uni-kassel.de hrz-no491
192.168.204.102 hrz-no492.hrz.uni-kassel.de hrz-no492
192.168.204.103 hrz-no493.hrz.uni-kassel.de hrz-no493
192.168.204.104 hrz-no494.hrz.uni-kassel.de hrz-no494
192.168.204.105 hrz-no495.hrz.uni-kassel.de hrz-no495
192.168.204.106 hrz-no496.hrz.uni-kassel.de hrz-no496
192.168.204.107 hrz-no497.hrz.uni-kassel.de hrz-no497
192.168.204.108 hrz-no498.hrz.uni-kassel.de hrz-no498
192.168.204.109 hrz-no499.hrz.uni-kassel.de hrz-no499
192.168.204.110 hrz-no500.hrz.uni-kassel.de hrz-no500
192.168.204.111 hrz-no501.hrz.uni-kassel.de hrz-no501
192.168.204.112 hrz-no502.hrz.uni-kassel.de hrz-no502
192.168.204.113 hrz-no503.hrz.uni-kassel.de hrz-no503
192.168.204.114 hrz-no504.hrz.uni-kassel.de hrz-no504
192.168.204.115 hrz-no505.hrz.uni-kassel.de hrz-no505
192.168.204.116 hrz-no506.hrz.uni-kassel.de hrz-no506
192.168.204.117 hrz-no507.hrz.uni-kassel.de hrz-no507
192.168.204.118 hrz-no508.hrz.uni-kassel.de hrz-no508
192.168.204.119 hrz-no509.hrz.uni-kassel.de hrz-no509
192.168.204.120 hrz-no510.hrz.uni-kassel.de hrz-no510
192.168.204.121 hrz-no511.hrz.uni-kassel.de hrz-no511
192.168.204.122 hrz-no512.hrz.uni-kassel.de hrz-no512
192.168.204.123 hrz-no513.hrz.uni-kassel.de hrz-no513
192.168.204.124 hrz-no514.hrz.uni-kassel.de hrz-no514
192.168.204.125 hrz-no515.hrz.uni-kassel.de hrz-no515
192.168.204.126 hrz-no516.hrz.uni-kassel.de hrz-no516
192.168.204.127 hrz-no517.hrz.uni-kassel.de hrz-no517
192.168.204.128 hrz-no518.hrz.uni-kassel.de hrz-no518
192.168.204.129 hrz-no519.hrz.uni-kassel.de hrz-no519
192.168.204.130 hrz-no520.hrz.uni-kassel.de hrz-no520
192.168.204.131 hrz-no521.hrz.uni-kassel.de hrz-no521
192.168.204.132 hrz-no522.hrz.uni-kassel.de hrz-no522
192.168.204.133 hrz-no523.hrz.uni-kassel.de hrz-no523
192.168.204.134 hrz-no524.hrz.uni-kassel.de hrz-no524
192.168.204.135 hrz-no525.hrz.uni-kassel.de hrz-no525
192.168.204.136 hrz-no526.hrz.uni-kassel.de hrz-no526
192.168.204.137 hrz-no527.hrz.uni-kassel.de hrz-no527
192.168.204.138 hrz-no528.hrz.uni-kassel.de hrz-no528
192.168.204.139 hrz-no529.hrz.uni-kassel.de hrz-no529
192.168.204.140 hrz-no530.hrz.uni-kassel.de hrz-no530
192.168.204.141 hrz-no531.hrz.uni-kassel.de hrz-no531
192.168.204.142 hrz-no532.hrz.uni-kassel.de hrz-no532
192.168.204.143 hrz-no533.hrz.uni-kassel.de hrz-no533
192.168.204.144 hrz-no534.hrz.uni-kassel.de hrz-no534
192.168.204.145 hrz-no535.hrz.uni-kassel.de hrz-no535
192.168.204.146 hrz-no536.hrz.uni-kassel.de hrz-no536
192.168.204.147 hrz-no537.hrz.uni-kassel.de hrz-no537
192.168.204.148 hrz-no538.hrz.uni-kassel.de hrz-no538
192.168.204.149 hrz-no539.hrz.uni-kassel.de hrz-no539
192.168.204.150 hrz-no540.hrz.uni-kassel.de hrz-no540
192.168.204.151 hrz-no541.hrz.uni-kassel.de hrz-no541
192.168.204.152 hrz-no542.hrz.uni-kassel.de hrz-no542
192.168.204.153 hrz-no543.hrz.uni-kassel.de hrz-no543
192.168.204.154 hrz-no544.hrz.uni-kassel.de hrz-no544
192.168.204.155 hrz-no545.hrz.uni-kassel.de hrz-no545
192.168.204.156 hrz-no546.hrz.uni-kassel.de hrz-no546
192.168.204.157 hrz-no547.hrz.uni-kassel.de hrz-no547
192.168.204.158 hrz-no548.hrz.uni-kassel.de hrz-no548
192.168.204.159 hrz-no549.hrz.uni-kassel.de hrz-no549
192.168.204.160 hrz-no550.hrz.uni-kassel.de hrz-no550
192.168.204.161 hrz-no551.hrz.uni-kassel.de hrz-no551
192.168.204.162 hrz-no552.hrz.uni-kassel.de hrz-no552
192.168.204.163 hrz-no553.hrz.uni-kassel.de hrz-no553
192.168.204.164 hrz-no554.hrz.uni-kassel.de hrz-no554
192.168.204.165 hrz-no555.hrz.uni-kassel.de hrz-no555
192.168.204.166 hrz-no556.hrz.uni-kassel.de hrz-no556
192.168.204.167 hrz-no557.hrz.uni-kassel.de hrz-no557
192.168.204.168 hrz-no558.hrz.uni-kassel.de hrz-no558
192.168.204.169 hrz-no559.hrz.uni-kassel.de hrz-no559
192.168.204.215 hrz-no560.hrz.uni-kassel.de hrz-no560
192.168.204.216 hrz-no561.hrz.uni-kassel.de hrz-no561
192.168.204.217 hrz-no562.hrz.uni-kassel.de hrz-no562
192.168.204.218 hrz-no563.hrz.uni-kassel.de hrz-no563
192.168.204.219 hrz-no564.hrz.uni-kassel.de hrz-no564
192.168.204.220 hrz-no565.hrz.uni-kassel.de hrz-no565
192.168.204.221 hrz-no566.hrz.uni-kassel.de hrz-no566
192.168.204.222 hrz-no567.hrz.uni-kassel.de hrz-no567
192.168.204.223 hrz-no568.hrz.uni-kassel.de hrz-no568
192.168.204.224 hrz-no569.hrz.uni-kassel.de hrz-no569
192.168.204.225 hrz-no570.hrz.uni-kassel.de hrz-no570
192.168.204.226 hrz-no571.hrz.uni-kassel.de hrz-no571
192.168.204.227 hrz-no572.hrz.uni-kassel.de hrz-no572
192.168.204.228 hrz-no573.hrz.uni-kassel.de hrz-no573
192.168.204.229 hrz-no574.hrz.uni-kassel.de hrz-no574
192.168.204.230 hrz-no575.hrz.uni-kassel.de hrz-no575
192.168.204.231 hrz-no576.hrz.uni-kassel.de hrz-no576
192.168.204.232 hrz-no577.hrz.uni-kassel.de hrz-no577
192.168.204.233 hrz-no578.hrz.uni-kassel.de hrz-no578
192.168.204.234 hrz-no579.hrz.uni-kassel.de hrz-no579
192.168.204.235 hrz-no580.hrz.uni-kassel.de hrz-no580
192.168.204.236 hrz-no581.hrz.uni-kassel.de hrz-no581
192.168.204.237 hrz-no582.hrz.uni-kassel.de hrz-no582
192.168.204.238 hrz-no583.hrz.uni-kassel.de hrz-no583
192.168.204.239 hrz-no584.hrz.uni-kassel.de hrz-no584
192.168.204.240 hrz-no585.hrz.uni-kassel.de hrz-no585
192.168.204.241 hrz-no586.hrz.uni-kassel.de hrz-no586
192.168.204.242 hrz-no587.hrz.uni-kassel.de hrz-no587
192.168.204.243 hrz-no588.hrz.uni-kassel.de hrz-no588
192.168.204.244 hrz-no589.hrz.uni-kassel.de hrz-no589
192.168.204.245 hrz-no590.hrz.uni-kassel.de hrz-no590
192.168.204.246 hrz-no591.hrz.uni-kassel.de hrz-no591
192.168.204.247 hrz-no592.hrz.uni-kassel.de hrz-no592
192.168.204.248 hrz-no593.hrz.uni-kassel.de hrz-no593
192.168.204.249 hrz-no594.hrz.uni-kassel.de hrz-no594
192.168.204.250 hrz-no595.hrz.uni-kassel.de hrz-no595
192.168.204.251 hrz-no596.hrz.uni-kassel.de hrz-no596
192.168.204.252 hrz-no597.hrz.uni-kassel.de hrz-no597
192.168.204.253 hrz-no598.hrz.uni-kassel.de hrz-no598
192.168.204.254 hrz-no599.hrz.uni-kassel.de hrz-no599

141.51.204.190    hrz-gc100 hrz-gc100.hrz.uni-kassel.de
141.51.204.191    hrz-gc101.hrz.uni-kassel.de hrz-gc101
141.51.204.192    hrz-gc102.hrz.uni-kassel.de hrz-gc102
141.51.204.193    hrz-gc103.hrz.uni-kassel.de hrz-gc103
141.51.204.194    hrz-gc104.hrz.uni-kassel.de hrz-gc104
141.51.204.195    hrz-gc105.hrz.uni-kassel.de hrz-gc105
141.51.204.196    hrz-gc106.hrz.uni-kassel.de hrz-gc106
141.51.204.197    hrz-gc107.hrz.uni-kassel.de hrz-gc107
141.51.204.198    hrz-gc108.hrz.uni-kassel.de hrz-gc108
141.51.204.199    hrz-gc109.hrz.uni-kassel.de hrz-gc109
141.51.204.200    hrz-gc110.hrz.uni-kassel.de hrz-gc110
141.51.204.201    hrz-gc111.hrz.uni-kassel.de hrz-gc111
141.51.204.202    hrz-gc112.hrz.uni-kassel.de hrz-gc112
141.51.204.203    hrz-gc113.hrz.uni-kassel.de hrz-gc113
141.51.204.204    hrz-gc114.hrz.uni-kassel.de hrz-gc114
141.51.204.205    hrz-gc115.hrz.uni-kassel.de hrz-gc115
141.51.204.206    hrz-gc116.hrz.uni-kassel.de hrz-gc116
141.51.204.207    hrz-gc117.hrz.uni-kassel.de hrz-gc117
141.51.204.208    hrz-gc118.hrz.uni-kassel.de hrz-gc118
141.51.204.209    hrz-gc119.hrz.uni-kassel.de hrz-gc119
141.51.204.210    hrz-gc120.hrz.uni-kassel.de hrz-gc120

# New cluster
141.51.204.30 its-cs1.its.uni-kassel.de its-cs1
141.51.204.170 its-cs10.its.uni-kassel.de its-cs10
141.51.204.171 its-cs11.its.uni-kassel.de its-cs11
141.51.204.172 its-cs12.its.uni-kassel.de its-cs12
141.51.204.173 its-cs13.its.uni-kassel.de its-cs13
141.51.204.174 its-cs14.its.uni-kassel.de its-cs14
141.51.204.175 its-cs15.its.uni-kassel.de its-cs15
141.51.204.176 its-cs16.its.uni-kassel.de its-cs16
141.51.204.177 its-cs17.its.uni-kassel.de its-cs17
141.51.204.178 its-cs18.its.uni-kassel.de its-cs18
141.51.204.179 its-cs19.its.uni-kassel.de its-cs19
141.51.205.10 its-cs100.its.uni-kassel.de its-cs100
141.51.205.11 its-cs101.its.uni-kassel.de its-cs101
141.51.205.12 its-cs102.its.uni-kassel.de its-cs102
141.51.205.13 its-cs103.its.uni-kassel.de its-cs103
141.51.205.14 its-cs104.its.uni-kassel.de its-cs104
141.51.205.15 its-cs105.its.uni-kassel.de its-cs105
141.51.205.16 its-cs106.its.uni-kassel.de its-cs106
141.51.205.17 its-cs107.its.uni-kassel.de its-cs107
141.51.205.18 its-cs108.its.uni-kassel.de its-cs108
141.51.205.19 its-cs109.its.uni-kassel.de its-cs109
141.51.205.20 its-cs110.its.uni-kassel.de its-cs110
141.51.205.21 its-cs111.its.uni-kassel.de its-cs111
141.51.205.22 its-cs112.its.uni-kassel.de its-cs112
141.51.205.23 its-cs113.its.uni-kassel.de its-cs113
141.51.205.24 its-cs114.its.uni-kassel.de its-cs114
141.51.205.25 its-cs115.its.uni-kassel.de its-cs115
141.51.205.26 its-cs116.its.uni-kassel.de its-cs116
141.51.205.27 its-cs117.its.uni-kassel.de its-cs117
141.51.205.28 its-cs118.its.uni-kassel.de its-cs118
141.51.205.29 its-cs119.its.uni-kassel.de its-cs119
141.51.205.30 its-cs120.its.uni-kassel.de its-cs120
141.51.205.31 its-cs121.its.uni-kassel.de its-cs121
141.51.205.32 its-cs122.its.uni-kassel.de its-cs122
141.51.205.33 its-cs123.its.uni-kassel.de its-cs123
141.51.205.34 its-cs124.its.uni-kassel.de its-cs124
141.51.205.35 its-cs125.its.uni-kassel.de its-cs125
141.51.205.36 its-cs126.its.uni-kassel.de its-cs126
141.51.205.37 its-cs127.its.uni-kassel.de its-cs127
141.51.205.38 its-cs128.its.uni-kassel.de its-cs128
141.51.205.39 its-cs129.its.uni-kassel.de its-cs129
141.51.205.40 its-cs130.its.uni-kassel.de its-cs130
141.51.205.41 its-cs131.its.uni-kassel.de its-cs131
141.51.205.42 its-cs132.its.uni-kassel.de its-cs132
141.51.205.43 its-cs133.its.uni-kassel.de its-cs133
141.51.205.44 its-cs134.its.uni-kassel.de its-cs134
141.51.205.45 its-cs135.its.uni-kassel.de its-cs135
141.51.205.46 its-cs136.its.uni-kassel.de its-cs136
141.51.205.47 its-cs137.its.uni-kassel.de its-cs137
141.51.205.48 its-cs138.its.uni-kassel.de its-cs138
141.51.205.49 its-cs139.its.uni-kassel.de its-cs139
141.51.205.50 its-cs140.its.uni-kassel.de its-cs140
141.51.205.51 its-cs141.its.uni-kassel.de its-cs141
141.51.205.52 its-cs142.its.uni-kassel.de its-cs142
141.51.205.53 its-cs143.its.uni-kassel.de its-cs143
141.51.205.54 its-cs144.its.uni-kassel.de its-cs144
141.51.205.55 its-cs145.its.uni-kassel.de its-cs145
141.51.205.56 its-cs146.its.uni-kassel.de its-cs146
141.51.205.57 its-cs147.its.uni-kassel.de its-cs147
141.51.205.58 its-cs148.its.uni-kassel.de its-cs148
141.51.205.59 its-cs149.its.uni-kassel.de its-cs149
141.51.205.60 its-cs150.its.uni-kassel.de its-cs150
141.51.205.61 its-cs151.its.uni-kassel.de its-cs151
141.51.205.62 its-cs152.its.uni-kassel.de its-cs152
141.51.205.63 its-cs153.its.uni-kassel.de its-cs153
141.51.205.64 its-cs154.its.uni-kassel.de its-cs154
141.51.205.65 its-cs155.its.uni-kassel.de its-cs155
141.51.205.66 its-cs156.its.uni-kassel.de its-cs156
141.51.205.67 its-cs157.its.uni-kassel.de its-cs157
141.51.205.68 its-cs158.its.uni-kassel.de its-cs158
141.51.205.69 its-cs159.its.uni-kassel.de its-cs159
141.51.205.70 its-cs160.its.uni-kassel.de its-cs160
141.51.205.71 its-cs161.its.uni-kassel.de its-cs161
141.51.205.72 its-cs162.its.uni-kassel.de its-cs162
141.51.205.73 its-cs163.its.uni-kassel.de its-cs163
141.51.205.74 its-cs164.its.uni-kassel.de its-cs164
141.51.205.75 its-cs165.its.uni-kassel.de its-cs165
141.51.205.76 its-cs166.its.uni-kassel.de its-cs166
141.51.205.77 its-cs167.its.uni-kassel.de its-cs167
141.51.205.78 its-cs168.its.uni-kassel.de its-cs168
141.51.205.79 its-cs169.its.uni-kassel.de its-cs169
141.51.205.80 its-cs170.its.uni-kassel.de its-cs170
141.51.205.81 its-cs171.its.uni-kassel.de its-cs171
141.51.205.82 its-cs172.its.uni-kassel.de its-cs172
141.51.205.83 its-cs173.its.uni-kassel.de its-cs173
141.51.205.84 its-cs174.its.uni-kassel.de its-cs174
141.51.205.85 its-cs175.its.uni-kassel.de its-cs175
141.51.205.86 its-cs176.its.uni-kassel.de its-cs176
141.51.205.87 its-cs177.its.uni-kassel.de its-cs177
141.51.205.88 its-cs178.its.uni-kassel.de its-cs178
141.51.205.89 its-cs179.its.uni-kassel.de its-cs179
141.51.205.90 its-cs180.its.uni-kassel.de its-cs180
141.51.205.91 its-cs181.its.uni-kassel.de its-cs181
141.51.205.92 its-cs182.its.uni-kassel.de its-cs182
141.51.205.93 its-cs183.its.uni-kassel.de its-cs183
141.51.205.94 its-cs184.its.uni-kassel.de its-cs184
141.51.205.95 its-cs185.its.uni-kassel.de its-cs185
141.51.205.96 its-cs186.its.uni-kassel.de its-cs186
141.51.205.97 its-cs187.its.uni-kassel.de its-cs187
141.51.205.98 its-cs188.its.uni-kassel.de its-cs188
141.51.205.99 its-cs189.its.uni-kassel.de its-cs189
141.51.205.100 its-cs190.its.uni-kassel.de its-cs190
141.51.205.101 its-cs191.its.uni-kassel.de its-cs191
141.51.205.102 its-cs192.its.uni-kassel.de its-cs192
141.51.205.103 its-cs193.its.uni-kassel.de its-cs193
141.51.205.104 its-cs194.its.uni-kassel.de its-cs194
141.51.205.105 its-cs195.its.uni-kassel.de its-cs195
141.51.205.106 its-cs196.its.uni-kassel.de its-cs196
141.51.205.107 its-cs197.its.uni-kassel.de its-cs197
141.51.205.108 its-cs198.its.uni-kassel.de its-cs198
141.51.205.109 its-cs199.its.uni-kassel.de its-cs199
141.51.205.110 its-cs200.its.uni-kassel.de its-cs200
141.51.205.111 its-cs201.its.uni-kassel.de its-cs201
141.51.205.112 its-cs202.its.uni-kassel.de its-cs202
141.51.205.113 its-cs203.its.uni-kassel.de its-cs203
141.51.205.114 its-cs204.its.uni-kassel.de its-cs204
141.51.205.115 its-cs205.its.uni-kassel.de its-cs205
141.51.205.116 its-cs206.its.uni-kassel.de its-cs206
141.51.205.117 its-cs207.its.uni-kassel.de its-cs207
141.51.205.118 its-cs208.its.uni-kassel.de its-cs208
141.51.205.119 its-cs209.its.uni-kassel.de its-cs209
141.51.205.120 its-cs210.its.uni-kassel.de its-cs210
141.51.205.121 its-cs211.its.uni-kassel.de its-cs211
141.51.205.122 its-cs212.its.uni-kassel.de its-cs212
141.51.205.123 its-cs213.its.uni-kassel.de its-cs213
141.51.205.124 its-cs214.its.uni-kassel.de its-cs214
141.51.205.125 its-cs215.its.uni-kassel.de its-cs215
141.51.205.126 its-cs216.its.uni-kassel.de its-cs216
141.51.205.127 its-cs217.its.uni-kassel.de its-cs217
141.51.205.128 its-cs218.its.uni-kassel.de its-cs218
141.51.205.129 its-cs219.its.uni-kassel.de its-cs219
141.51.205.130 its-cs220.its.uni-kassel.de its-cs220
141.51.205.131 its-cs221.its.uni-kassel.de its-cs221
141.51.205.132 its-cs222.its.uni-kassel.de its-cs222
141.51.205.133 its-cs223.its.uni-kassel.de its-cs223
141.51.205.134 its-cs224.its.uni-kassel.de its-cs224
141.51.205.135 its-cs225.its.uni-kassel.de its-cs225
141.51.205.136 its-cs226.its.uni-kassel.de its-cs226
141.51.205.137 its-cs227.its.uni-kassel.de its-cs227
141.51.205.138 its-cs228.its.uni-kassel.de its-cs228
141.51.205.139 its-cs229.its.uni-kassel.de its-cs229
141.51.205.140 its-cs230.its.uni-kassel.de its-cs230
141.51.205.141 its-cs231.its.uni-kassel.de its-cs231
141.51.205.142 its-cs232.its.uni-kassel.de its-cs232
141.51.205.143 its-cs233.its.uni-kassel.de its-cs233
141.51.205.144 its-cs234.its.uni-kassel.de its-cs234
141.51.205.145 its-cs235.its.uni-kassel.de its-cs235
141.51.205.146 its-cs236.its.uni-kassel.de its-cs236
141.51.205.147 its-cs237.its.uni-kassel.de its-cs237
141.51.205.148 its-cs238.its.uni-kassel.de its-cs238
141.51.205.149 its-cs239.its.uni-kassel.de its-cs239
141.51.205.150 its-cs240.its.uni-kassel.de its-cs240
141.51.205.151 its-cs241.its.uni-kassel.de its-cs241
141.51.205.152 its-cs242.its.uni-kassel.de its-cs242
141.51.205.153 its-cs243.its.uni-kassel.de its-cs243
141.51.205.154 its-cs244.its.uni-kassel.de its-cs244
141.51.205.155 its-cs245.its.uni-kassel.de its-cs245
141.51.205.156 its-cs246.its.uni-kassel.de its-cs246
141.51.205.157 its-cs247.its.uni-kassel.de its-cs247
141.51.205.158 its-cs248.its.uni-kassel.de its-cs248
141.51.205.159 its-cs249.its.uni-kassel.de its-cs249
141.51.205.160 its-cs250.its.uni-kassel.de its-cs250
141.51.205.161 its-cs251.its.uni-kassel.de its-cs251
141.51.205.162 its-cs252.its.uni-kassel.de its-cs252
141.51.205.163 its-cs253.its.uni-kassel.de its-cs253
141.51.205.164 its-cs254.its.uni-kassel.de its-cs254
141.51.205.165 its-cs255.its.uni-kassel.de its-cs255
141.51.205.166 its-cs256.its.uni-kassel.de its-cs256
141.51.205.167 its-cs257.its.uni-kassel.de its-cs257
141.51.205.168 its-cs258.its.uni-kassel.de its-cs258
141.51.205.169 its-cs259.its.uni-kassel.de its-cs259
141.51.205.170 its-cs260.its.uni-kassel.de its-cs260
141.51.205.171 its-cs261.its.uni-kassel.de its-cs261
141.51.205.172 its-cs262.its.uni-kassel.de its-cs262
141.51.205.173 its-cs263.its.uni-kassel.de its-cs263
141.51.205.174 its-cs264.its.uni-kassel.de its-cs264
141.51.205.175 its-cs265.its.uni-kassel.de its-cs265
141.51.205.176 its-cs266.its.uni-kassel.de its-cs266
141.51.205.177 its-cs267.its.uni-kassel.de its-cs267
141.51.205.178 its-cs268.its.uni-kassel.de its-cs268
141.51.205.179 its-cs269.its.uni-kassel.de its-cs269
141.51.205.180 its-cs270.its.uni-kassel.de its-cs270
141.51.205.181 its-cs271.its.uni-kassel.de its-cs271
141.51.205.182 its-cs272.its.uni-kassel.de its-cs272
141.51.205.183 its-cs273.its.uni-kassel.de its-cs273
141.51.205.184 its-cs274.its.uni-kassel.de its-cs274
141.51.205.185 its-cs275.its.uni-kassel.de its-cs275
141.51.205.186 its-cs276.its.uni-kassel.de its-cs276
141.51.205.187 its-cs277.its.uni-kassel.de its-cs277
141.51.205.188 its-cs278.its.uni-kassel.de its-cs278
141.51.205.189 its-cs279.its.uni-kassel.de its-cs279
141.51.205.190 its-cs280.its.uni-kassel.de its-cs280
141.51.205.191 its-cs281.its.uni-kassel.de its-cs281
141.51.205.192 its-cs282.its.uni-kassel.de its-cs282
141.51.205.193 its-cs283.its.uni-kassel.de its-cs283
141.51.205.194 its-cs284.its.uni-kassel.de its-cs284
141.51.205.195 its-cs285.its.uni-kassel.de its-cs285
141.51.205.196 its-cs286.its.uni-kassel.de its-cs286
141.51.205.197 its-cs287.its.uni-kassel.de its-cs287
141.51.205.198 its-cs288.its.uni-kassel.de its-cs288
141.51.205.199 its-cs289.its.uni-kassel.de its-cs289
141.51.205.200 its-cs290.its.uni-kassel.de its-cs290
141.51.205.201 its-cs291.its.uni-kassel.de its-cs291
141.51.205.202 its-cs292.its.uni-kassel.de its-cs292
141.51.205.203 its-cs293.its.uni-kassel.de its-cs293
141.51.205.204 its-cs294.its.uni-kassel.de its-cs294
141.51.205.205 its-cs295.its.uni-kassel.de its-cs295
141.51.205.206 its-cs296.its.uni-kassel.de its-cs296
141.51.205.207 its-cs297.its.uni-kassel.de its-cs297
141.51.205.208 its-cs298.its.uni-kassel.de its-cs298
141.51.205.209 its-cs299.its.uni-kassel.de its-cs299
192.168.204.30 its-no1.its.uni-kassel.de its-no1
192.168.204.170 its-no10.its.uni-kassel.de its-no10
192.168.204.171 its-no11.its.uni-kassel.de its-no11
192.168.204.172 its-no12.its.uni-kassel.de its-no12
192.168.204.173 its-no13.its.uni-kassel.de its-no13
192.168.204.174 its-no14.its.uni-kassel.de its-no14
192.168.204.175 its-no15.its.uni-kassel.de its-no15
192.168.204.176 its-no16.its.uni-kassel.de its-no16
192.168.204.177 its-no17.its.uni-kassel.de its-no17
192.168.204.178 its-no18.its.uni-kassel.de its-no18
192.168.204.179 its-no19.its.uni-kassel.de its-no19
192.168.205.10 its-no100.its.uni-kassel.de its-no100
192.168.205.11 its-no101.its.uni-kassel.de its-no101
192.168.205.12 its-no102.its.uni-kassel.de its-no102
192.168.205.13 its-no103.its.uni-kassel.de its-no103
192.168.205.14 its-no104.its.uni-kassel.de its-no104
192.168.205.15 its-no105.its.uni-kassel.de its-no105
192.168.205.16 its-no106.its.uni-kassel.de its-no106
192.168.205.17 its-no107.its.uni-kassel.de its-no107
192.168.205.18 its-no108.its.uni-kassel.de its-no108
192.168.205.19 its-no109.its.uni-kassel.de its-no109
192.168.205.20 its-no110.its.uni-kassel.de its-no110
192.168.205.21 its-no111.its.uni-kassel.de its-no111
192.168.205.22 its-no112.its.uni-kassel.de its-no112
192.168.205.23 its-no113.its.uni-kassel.de its-no113
192.168.205.24 its-no114.its.uni-kassel.de its-no114
192.168.205.25 its-no115.its.uni-kassel.de its-no115
192.168.205.26 its-no116.its.uni-kassel.de its-no116
192.168.205.27 its-no117.its.uni-kassel.de its-no117
192.168.205.28 its-no118.its.uni-kassel.de its-no118
192.168.205.29 its-no119.its.uni-kassel.de its-no119
192.168.205.30 its-no120.its.uni-kassel.de its-no120
192.168.205.31 its-no121.its.uni-kassel.de its-no121
192.168.205.32 its-no122.its.uni-kassel.de its-no122
192.168.205.33 its-no123.its.uni-kassel.de its-no123
192.168.205.34 its-no124.its.uni-kassel.de its-no124
192.168.205.35 its-no125.its.uni-kassel.de its-no125
192.168.205.36 its-no126.its.uni-kassel.de its-no126
192.168.205.37 its-no127.its.uni-kassel.de its-no127
192.168.205.38 its-no128.its.uni-kassel.de its-no128
192.168.205.39 its-no129.its.uni-kassel.de its-no129
192.168.205.40 its-no130.its.uni-kassel.de its-no130
192.168.205.41 its-no131.its.uni-kassel.de its-no131
192.168.205.42 its-no132.its.uni-kassel.de its-no132
192.168.205.43 its-no133.its.uni-kassel.de its-no133
192.168.205.44 its-no134.its.uni-kassel.de its-no134
192.168.205.45 its-no135.its.uni-kassel.de its-no135
192.168.205.46 its-no136.its.uni-kassel.de its-no136
192.168.205.47 its-no137.its.uni-kassel.de its-no137
192.168.205.48 its-no138.its.uni-kassel.de its-no138
192.168.205.49 its-no139.its.uni-kassel.de its-no139
192.168.205.50 its-no140.its.uni-kassel.de its-no140
192.168.205.51 its-no141.its.uni-kassel.de its-no141
192.168.205.52 its-no142.its.uni-kassel.de its-no142
192.168.205.53 its-no143.its.uni-kassel.de its-no143
192.168.205.54 its-no144.its.uni-kassel.de its-no144
192.168.205.55 its-no145.its.uni-kassel.de its-no145
192.168.205.56 its-no146.its.uni-kassel.de its-no146
192.168.205.57 its-no147.its.uni-kassel.de its-no147
192.168.205.58 its-no148.its.uni-kassel.de its-no148
192.168.205.59 its-no149.its.uni-kassel.de its-no149
192.168.205.60 its-no150.its.uni-kassel.de its-no150
192.168.205.61 its-no151.its.uni-kassel.de its-no151
192.168.205.62 its-no152.its.uni-kassel.de its-no152
192.168.205.63 its-no153.its.uni-kassel.de its-no153
192.168.205.64 its-no154.its.uni-kassel.de its-no154
192.168.205.65 its-no155.its.uni-kassel.de its-no155
192.168.205.66 its-no156.its.uni-kassel.de its-no156
192.168.205.67 its-no157.its.uni-kassel.de its-no157
192.168.205.68 its-no158.its.uni-kassel.de its-no158
192.168.205.69 its-no159.its.uni-kassel.de its-no159
192.168.205.70 its-no160.its.uni-kassel.de its-no160
192.168.205.71 its-no161.its.uni-kassel.de its-no161
192.168.205.72 its-no162.its.uni-kassel.de its-no162
192.168.205.73 its-no163.its.uni-kassel.de its-no163
192.168.205.74 its-no164.its.uni-kassel.de its-no164
192.168.205.75 its-no165.its.uni-kassel.de its-no165
192.168.205.76 its-no166.its.uni-kassel.de its-no166
192.168.205.77 its-no167.its.uni-kassel.de its-no167
192.168.205.78 its-no168.its.uni-kassel.de its-no168
192.168.205.79 its-no169.its.uni-kassel.de its-no169
192.168.205.80 its-no170.its.uni-kassel.de its-no170
192.168.205.81 its-no171.its.uni-kassel.de its-no171
192.168.205.82 its-no172.its.uni-kassel.de its-no172
192.168.205.83 its-no173.its.uni-kassel.de its-no173
192.168.205.84 its-no174.its.uni-kassel.de its-no174
192.168.205.85 its-no175.its.uni-kassel.de its-no175
192.168.205.86 its-no176.its.uni-kassel.de its-no176
192.168.205.87 its-no177.its.uni-kassel.de its-no177
192.168.205.88 its-no178.its.uni-kassel.de its-no178
192.168.205.89 its-no179.its.uni-kassel.de its-no179
192.168.205.90 its-no180.its.uni-kassel.de its-no180
192.168.205.91 its-no181.its.uni-kassel.de its-no181
192.168.205.92 its-no182.its.uni-kassel.de its-no182
192.168.205.93 its-no183.its.uni-kassel.de its-no183
192.168.205.94 its-no184.its.uni-kassel.de its-no184
192.168.205.95 its-no185.its.uni-kassel.de its-no185
192.168.205.96 its-no186.its.uni-kassel.de its-no186
192.168.205.97 its-no187.its.uni-kassel.de its-no187
192.168.205.98 its-no188.its.uni-kassel.de its-no188
192.168.205.99 its-no189.its.uni-kassel.de its-no189
192.168.205.100 its-no190.its.uni-kassel.de its-no190
192.168.205.101 its-no191.its.uni-kassel.de its-no191
192.168.205.102 its-no192.its.uni-kassel.de its-no192
192.168.205.103 its-no193.its.uni-kassel.de its-no193
192.168.205.104 its-no194.its.uni-kassel.de its-no194
192.168.205.105 its-no195.its.uni-kassel.de its-no195
192.168.205.106 its-no196.its.uni-kassel.de its-no196
192.168.205.107 its-no197.its.uni-kassel.de its-no197
192.168.205.108 its-no198.its.uni-kassel.de its-no198
192.168.205.109 its-no199.its.uni-kassel.de its-no199
192.168.205.110 its-no200.its.uni-kassel.de its-no200
192.168.205.111 its-no201.its.uni-kassel.de its-no201
192.168.205.112 its-no202.its.uni-kassel.de its-no202
192.168.205.113 its-no203.its.uni-kassel.de its-no203
192.168.205.114 its-no204.its.uni-kassel.de its-no204
192.168.205.115 its-no205.its.uni-kassel.de its-no205
192.168.205.116 its-no206.its.uni-kassel.de its-no206
192.168.205.117 its-no207.its.uni-kassel.de its-no207
192.168.205.118 its-no208.its.uni-kassel.de its-no208
192.168.205.119 its-no209.its.uni-kassel.de its-no209
192.168.205.120 its-no210.its.uni-kassel.de its-no210
192.168.205.121 its-no211.its.uni-kassel.de its-no211
192.168.205.122 its-no212.its.uni-kassel.de its-no212
192.168.205.123 its-no213.its.uni-kassel.de its-no213
192.168.205.124 its-no214.its.uni-kassel.de its-no214
192.168.205.125 its-no215.its.uni-kassel.de its-no215
192.168.205.126 its-no216.its.uni-kassel.de its-no216
192.168.205.127 its-no217.its.uni-kassel.de its-no217
192.168.205.128 its-no218.its.uni-kassel.de its-no218
192.168.205.129 its-no219.its.uni-kassel.de its-no219
192.168.205.130 its-no220.its.uni-kassel.de its-no220
192.168.205.131 its-no221.its.uni-kassel.de its-no221
192.168.205.132 its-no222.its.uni-kassel.de its-no222
192.168.205.133 its-no223.its.uni-kassel.de its-no223
192.168.205.134 its-no224.its.uni-kassel.de its-no224
192.168.205.135 its-no225.its.uni-kassel.de its-no225
192.168.205.136 its-no226.its.uni-kassel.de its-no226
192.168.205.137 its-no227.its.uni-kassel.de its-no227
192.168.205.138 its-no228.its.uni-kassel.de its-no228
192.168.205.139 its-no229.its.uni-kassel.de its-no229
192.168.205.140 its-no230.its.uni-kassel.de its-no230
192.168.205.141 its-no231.its.uni-kassel.de its-no231
192.168.205.142 its-no232.its.uni-kassel.de its-no232
192.168.205.143 its-no233.its.uni-kassel.de its-no233
192.168.205.144 its-no234.its.uni-kassel.de its-no234
192.168.205.145 its-no235.its.uni-kassel.de its-no235
192.168.205.146 its-no236.its.uni-kassel.de its-no236
192.168.205.147 its-no237.its.uni-kassel.de its-no237
192.168.205.148 its-no238.its.uni-kassel.de its-no238
192.168.205.149 its-no239.its.uni-kassel.de its-no239
192.168.205.150 its-no240.its.uni-kassel.de its-no240
192.168.205.151 its-no241.its.uni-kassel.de its-no241
192.168.205.152 its-no242.its.uni-kassel.de its-no242
192.168.205.153 its-no243.its.uni-kassel.de its-no243
192.168.205.154 its-no244.its.uni-kassel.de its-no244
192.168.205.155 its-no245.its.uni-kassel.de its-no245
192.168.205.156 its-no246.its.uni-kassel.de its-no246
192.168.205.157 its-no247.its.uni-kassel.de its-no247
192.168.205.158 its-no248.its.uni-kassel.de its-no248
192.168.205.159 its-no249.its.uni-kassel.de its-no249
192.168.205.160 its-no250.its.uni-kassel.de its-no250
192.168.205.161 its-no251.its.uni-kassel.de its-no251
192.168.205.162 its-no252.its.uni-kassel.de its-no252
192.168.205.163 its-no253.its.uni-kassel.de its-no253
192.168.205.164 its-no254.its.uni-kassel.de its-no254
192.168.205.165 its-no255.its.uni-kassel.de its-no255
192.168.205.166 its-no256.its.uni-kassel.de its-no256
192.168.205.167 its-no257.its.uni-kassel.de its-no257
192.168.205.168 its-no258.its.uni-kassel.de its-no258
192.168.205.169 its-no259.its.uni-kassel.de its-no259
192.168.205.170 its-no260.its.uni-kassel.de its-no260
192.168.205.171 its-no261.its.uni-kassel.de its-no261
192.168.205.172 its-no262.its.uni-kassel.de its-no262
192.168.205.173 its-no263.its.uni-kassel.de its-no263
192.168.205.174 its-no264.its.uni-kassel.de its-no264
192.168.205.175 its-no265.its.uni-kassel.de its-no265
192.168.205.176 its-no266.its.uni-kassel.de its-no266
192.168.205.177 its-no267.its.uni-kassel.de its-no267
192.168.205.178 its-no268.its.uni-kassel.de its-no268
192.168.205.179 its-no269.its.uni-kassel.de its-no269
192.168.205.180 its-no270.its.uni-kassel.de its-no270
192.168.205.181 its-no271.its.uni-kassel.de its-no271
192.168.205.182 its-no272.its.uni-kassel.de its-no272
192.168.205.183 its-no273.its.uni-kassel.de its-no273
192.168.205.184 its-no274.its.uni-kassel.de its-no274
192.168.205.185 its-no275.its.uni-kassel.de its-no275
192.168.205.186 its-no276.its.uni-kassel.de its-no276
192.168.205.187 its-no277.its.uni-kassel.de its-no277
192.168.205.188 its-no278.its.uni-kassel.de its-no278
192.168.205.189 its-no279.its.uni-kassel.de its-no279
192.168.205.190 its-no280.its.uni-kassel.de its-no280
192.168.205.191 its-no281.its.uni-kassel.de its-no281
192.168.205.192 its-no282.its.uni-kassel.de its-no282
192.168.205.193 its-no283.its.uni-kassel.de its-no283
192.168.205.194 its-no284.its.uni-kassel.de its-no284
192.168.205.195 its-no285.its.uni-kassel.de its-no285
192.168.205.196 its-no286.its.uni-kassel.de its-no286
192.168.205.197 its-no287.its.uni-kassel.de its-no287
192.168.205.198 its-no288.its.uni-kassel.de its-no288
192.168.205.199 its-no289.its.uni-kassel.de its-no289
192.168.205.200 its-no290.its.uni-kassel.de its-no290
192.168.205.201 its-no291.its.uni-kassel.de its-no291
192.168.205.202 its-no292.its.uni-kassel.de its-no292
192.168.205.203 its-no293.its.uni-kassel.de its-no293
192.168.205.204 its-no294.its.uni-kassel.de its-no294
192.168.205.205 its-no295.its.uni-kassel.de its-no295
192.168.205.206 its-no296.its.uni-kassel.de its-no296
192.168.205.207 its-no297.its.uni-kassel.de its-no297
192.168.205.208 its-no298.its.uni-kassel.de its-no298
192.168.205.209 its-no299.its.uni-kassel.de its-no299
192.168.168.30 its-ib1.its.uni-kassel.de its-ib1
192.168.168.170 its-ib10.its.uni-kassel.de its-ib10
192.168.168.171 its-ib11.its.uni-kassel.de its-ib11
192.168.168.172 its-ib12.its.uni-kassel.de its-ib12
192.168.168.173 its-ib13.its.uni-kassel.de its-ib13
192.168.168.174 its-ib14.its.uni-kassel.de its-ib14
192.168.168.175 its-ib15.its.uni-kassel.de its-ib15
192.168.168.176 its-ib16.its.uni-kassel.de its-ib16
192.168.168.177 its-ib17.its.uni-kassel.de its-ib17
192.168.168.178 its-ib18.its.uni-kassel.de its-ib18
192.168.168.179 its-ib19.its.uni-kassel.de its-ib19
192.168.169.10 its-ib100.its.uni-kassel.de its-ib100
192.168.169.11 its-ib101.its.uni-kassel.de its-ib101
192.168.169.12 its-ib102.its.uni-kassel.de its-ib102
192.168.169.13 its-ib103.its.uni-kassel.de its-ib103
192.168.169.14 its-ib104.its.uni-kassel.de its-ib104
192.168.169.15 its-ib105.its.uni-kassel.de its-ib105
192.168.169.16 its-ib106.its.uni-kassel.de its-ib106
192.168.169.17 its-ib107.its.uni-kassel.de its-ib107
192.168.169.18 its-ib108.its.uni-kassel.de its-ib108
192.168.169.19 its-ib109.its.uni-kassel.de its-ib109
192.168.169.20 its-ib110.its.uni-kassel.de its-ib110
192.168.169.21 its-ib111.its.uni-kassel.de its-ib111
192.168.169.22 its-ib112.its.uni-kassel.de its-ib112
192.168.169.23 its-ib113.its.uni-kassel.de its-ib113
192.168.169.24 its-ib114.its.uni-kassel.de its-ib114
192.168.169.25 its-ib115.its.uni-kassel.de its-ib115
192.168.169.26 its-ib116.its.uni-kassel.de its-ib116
192.168.169.27 its-ib117.its.uni-kassel.de its-ib117
192.168.169.28 its-ib118.its.uni-kassel.de its-ib118
192.168.169.29 its-ib119.its.uni-kassel.de its-ib119
192.168.169.30 its-ib120.its.uni-kassel.de its-ib120
192.168.169.31 its-ib121.its.uni-kassel.de its-ib121
192.168.169.32 its-ib122.its.uni-kassel.de its-ib122
192.168.169.33 its-ib123.its.uni-kassel.de its-ib123
192.168.169.34 its-ib124.its.uni-kassel.de its-ib124
192.168.169.35 its-ib125.its.uni-kassel.de its-ib125
192.168.169.36 its-ib126.its.uni-kassel.de its-ib126
192.168.169.37 its-ib127.its.uni-kassel.de its-ib127
192.168.169.38 its-ib128.its.uni-kassel.de its-ib128
192.168.169.39 its-ib129.its.uni-kassel.de its-ib129
192.168.169.40 its-ib130.its.uni-kassel.de its-ib130
192.168.169.41 its-ib131.its.uni-kassel.de its-ib131
192.168.169.42 its-ib132.its.uni-kassel.de its-ib132
192.168.169.43 its-ib133.its.uni-kassel.de its-ib133
192.168.169.44 its-ib134.its.uni-kassel.de its-ib134
192.168.169.45 its-ib135.its.uni-kassel.de its-ib135
192.168.169.46 its-ib136.its.uni-kassel.de its-ib136
192.168.169.47 its-ib137.its.uni-kassel.de its-ib137
192.168.169.48 its-ib138.its.uni-kassel.de its-ib138
192.168.169.49 its-ib139.its.uni-kassel.de its-ib139
192.168.169.50 its-ib140.its.uni-kassel.de its-ib140
192.168.169.51 its-ib141.its.uni-kassel.de its-ib141
192.168.169.52 its-ib142.its.uni-kassel.de its-ib142
192.168.169.53 its-ib143.its.uni-kassel.de its-ib143
192.168.169.54 its-ib144.its.uni-kassel.de its-ib144
192.168.169.55 its-ib145.its.uni-kassel.de its-ib145
192.168.169.56 its-ib146.its.uni-kassel.de its-ib146
192.168.169.57 its-ib147.its.uni-kassel.de its-ib147
192.168.169.58 its-ib148.its.uni-kassel.de its-ib148
192.168.169.59 its-ib149.its.uni-kassel.de its-ib149
192.168.169.60 its-ib150.its.uni-kassel.de its-ib150
192.168.169.61 its-ib151.its.uni-kassel.de its-ib151
192.168.169.62 its-ib152.its.uni-kassel.de its-ib152
192.168.169.63 its-ib153.its.uni-kassel.de its-ib153
192.168.169.64 its-ib154.its.uni-kassel.de its-ib154
192.168.169.65 its-ib155.its.uni-kassel.de its-ib155
192.168.169.66 its-ib156.its.uni-kassel.de its-ib156
192.168.169.67 its-ib157.its.uni-kassel.de its-ib157
192.168.169.68 its-ib158.its.uni-kassel.de its-ib158
192.168.169.69 its-ib159.its.uni-kassel.de its-ib159
192.168.169.70 its-ib160.its.uni-kassel.de its-ib160
192.168.169.71 its-ib161.its.uni-kassel.de its-ib161
192.168.169.72 its-ib162.its.uni-kassel.de its-ib162
192.168.169.73 its-ib163.its.uni-kassel.de its-ib163
192.168.169.74 its-ib164.its.uni-kassel.de its-ib164
192.168.169.75 its-ib165.its.uni-kassel.de its-ib165
192.168.169.76 its-ib166.its.uni-kassel.de its-ib166
192.168.169.77 its-ib167.its.uni-kassel.de its-ib167
192.168.169.78 its-ib168.its.uni-kassel.de its-ib168
192.168.169.79 its-ib169.its.uni-kassel.de its-ib169
192.168.169.80 its-ib170.its.uni-kassel.de its-ib170
192.168.169.81 its-ib171.its.uni-kassel.de its-ib171
192.168.169.82 its-ib172.its.uni-kassel.de its-ib172
192.168.169.83 its-ib173.its.uni-kassel.de its-ib173
192.168.169.84 its-ib174.its.uni-kassel.de its-ib174
192.168.169.85 its-ib175.its.uni-kassel.de its-ib175
192.168.169.86 its-ib176.its.uni-kassel.de its-ib176
192.168.169.87 its-ib177.its.uni-kassel.de its-ib177
192.168.169.88 its-ib178.its.uni-kassel.de its-ib178
192.168.169.89 its-ib179.its.uni-kassel.de its-ib179
192.168.169.90 its-ib180.its.uni-kassel.de its-ib180
192.168.169.91 its-ib181.its.uni-kassel.de its-ib181
192.168.169.92 its-ib182.its.uni-kassel.de its-ib182
192.168.169.93 its-ib183.its.uni-kassel.de its-ib183
192.168.169.94 its-ib184.its.uni-kassel.de its-ib184
192.168.169.95 its-ib185.its.uni-kassel.de its-ib185
192.168.169.96 its-ib186.its.uni-kassel.de its-ib186
192.168.169.97 its-ib187.its.uni-kassel.de its-ib187
192.168.169.98 its-ib188.its.uni-kassel.de its-ib188
192.168.169.99 its-ib189.its.uni-kassel.de its-ib189
192.168.169.100 its-ib190.its.uni-kassel.de its-ib190
192.168.169.101 its-ib191.its.uni-kassel.de its-ib191
192.168.169.102 its-ib192.its.uni-kassel.de its-ib192
192.168.169.103 its-ib193.its.uni-kassel.de its-ib193
192.168.169.104 its-ib194.its.uni-kassel.de its-ib194
192.168.169.105 its-ib195.its.uni-kassel.de its-ib195
192.168.169.106 its-ib196.its.uni-kassel.de its-ib196
192.168.169.107 its-ib197.its.uni-kassel.de its-ib197
192.168.169.108 its-ib198.its.uni-kassel.de its-ib198
192.168.169.109 its-ib199.its.uni-kassel.de its-ib199
192.168.169.110 its-ib200.its.uni-kassel.de its-ib200
192.168.169.111 its-ib201.its.uni-kassel.de its-ib201
192.168.169.112 its-ib202.its.uni-kassel.de its-ib202
192.168.169.113 its-ib203.its.uni-kassel.de its-ib203
192.168.169.114 its-ib204.its.uni-kassel.de its-ib204
192.168.169.115 its-ib205.its.uni-kassel.de its-ib205
192.168.169.116 its-ib206.its.uni-kassel.de its-ib206
192.168.169.117 its-ib207.its.uni-kassel.de its-ib207
192.168.169.118 its-ib208.its.uni-kassel.de its-ib208
192.168.169.119 its-ib209.its.uni-kassel.de its-ib209
192.168.169.120 its-ib210.its.uni-kassel.de its-ib210
192.168.169.121 its-ib211.its.uni-kassel.de its-ib211
192.168.169.122 its-ib212.its.uni-kassel.de its-ib212
192.168.169.123 its-ib213.its.uni-kassel.de its-ib213
192.168.169.124 its-ib214.its.uni-kassel.de its-ib214
192.168.169.125 its-ib215.its.uni-kassel.de its-ib215
192.168.169.126 its-ib216.its.uni-kassel.de its-ib216
192.168.169.127 its-ib217.its.uni-kassel.de its-ib217
192.168.169.128 its-ib218.its.uni-kassel.de its-ib218
192.168.169.129 its-ib219.its.uni-kassel.de its-ib219
192.168.169.130 its-ib220.its.uni-kassel.de its-ib220
192.168.169.131 its-ib221.its.uni-kassel.de its-ib221
192.168.169.132 its-ib222.its.uni-kassel.de its-ib222
192.168.169.133 its-ib223.its.uni-kassel.de its-ib223
192.168.169.134 its-ib224.its.uni-kassel.de its-ib224
192.168.169.135 its-ib225.its.uni-kassel.de its-ib225
192.168.169.136 its-ib226.its.uni-kassel.de its-ib226
192.168.169.137 its-ib227.its.uni-kassel.de its-ib227
192.168.169.138 its-ib228.its.uni-kassel.de its-ib228
192.168.169.139 its-ib229.its.uni-kassel.de its-ib229
192.168.169.140 its-ib230.its.uni-kassel.de its-ib230
192.168.169.141 its-ib231.its.uni-kassel.de its-ib231
192.168.169.142 its-ib232.its.uni-kassel.de its-ib232
192.168.169.143 its-ib233.its.uni-kassel.de its-ib233
192.168.169.144 its-ib234.its.uni-kassel.de its-ib234
192.168.169.145 its-ib235.its.uni-kassel.de its-ib235
192.168.169.146 its-ib236.its.uni-kassel.de its-ib236
192.168.169.147 its-ib237.its.uni-kassel.de its-ib237
192.168.169.148 its-ib238.its.uni-kassel.de its-ib238
192.168.169.149 its-ib239.its.uni-kassel.de its-ib239
192.168.169.150 its-ib240.its.uni-kassel.de its-ib240
192.168.169.151 its-ib241.its.uni-kassel.de its-ib241
192.168.169.152 its-ib242.its.uni-kassel.de its-ib242
192.168.169.153 its-ib243.its.uni-kassel.de its-ib243
192.168.169.154 its-ib244.its.uni-kassel.de its-ib244
192.168.169.155 its-ib245.its.uni-kassel.de its-ib245
192.168.169.156 its-ib246.its.uni-kassel.de its-ib246
192.168.169.157 its-ib247.its.uni-kassel.de its-ib247
192.168.169.158 its-ib248.its.uni-kassel.de its-ib248
192.168.169.159 its-ib249.its.uni-kassel.de its-ib249
192.168.169.160 its-ib250.its.uni-kassel.de its-ib250
192.168.169.161 its-ib251.its.uni-kassel.de its-ib251
192.168.169.162 its-ib252.its.uni-kassel.de its-ib252
192.168.169.163 its-ib253.its.uni-kassel.de its-ib253
192.168.169.164 its-ib254.its.uni-kassel.de its-ib254
192.168.169.165 its-ib255.its.uni-kassel.de its-ib255
192.168.169.166 its-ib256.its.uni-kassel.de its-ib256
192.168.169.167 its-ib257.its.uni-kassel.de its-ib257
192.168.169.168 its-ib258.its.uni-kassel.de its-ib258
192.168.169.169 its-ib259.its.uni-kassel.de its-ib259
192.168.169.170 its-ib260.its.uni-kassel.de its-ib260
192.168.169.171 its-ib261.its.uni-kassel.de its-ib261
192.168.169.172 its-ib262.its.uni-kassel.de its-ib262
192.168.169.173 its-ib263.its.uni-kassel.de its-ib263
192.168.169.174 its-ib264.its.uni-kassel.de its-ib264
192.168.169.175 its-ib265.its.uni-kassel.de its-ib265
192.168.169.176 its-ib266.its.uni-kassel.de its-ib266
192.168.169.177 its-ib267.its.uni-kassel.de its-ib267
192.168.169.178 its-ib268.its.uni-kassel.de its-ib268
192.168.169.179 its-ib269.its.uni-kassel.de its-ib269
192.168.169.180 its-ib270.its.uni-kassel.de its-ib270
192.168.169.181 its-ib271.its.uni-kassel.de its-ib271
192.168.169.182 its-ib272.its.uni-kassel.de its-ib272
192.168.169.183 its-ib273.its.uni-kassel.de its-ib273
192.168.169.184 its-ib274.its.uni-kassel.de its-ib274
192.168.169.185 its-ib275.its.uni-kassel.de its-ib275
192.168.169.186 its-ib276.its.uni-kassel.de its-ib276
192.168.169.187 its-ib277.its.uni-kassel.de its-ib277
192.168.169.188 its-ib278.its.uni-kassel.de its-ib278
192.168.169.189 its-ib279.its.uni-kassel.de its-ib279
192.168.169.190 its-ib280.its.uni-kassel.de its-ib280
192.168.169.191 its-ib281.its.uni-kassel.de its-ib281
192.168.169.192 its-ib282.its.uni-kassel.de its-ib282
192.168.169.193 its-ib283.its.uni-kassel.de its-ib283
192.168.169.194 its-ib284.its.uni-kassel.de its-ib284
192.168.169.195 its-ib285.its.uni-kassel.de its-ib285
192.168.169.196 its-ib286.its.uni-kassel.de its-ib286
192.168.169.197 its-ib287.its.uni-kassel.de its-ib287
192.168.169.198 its-ib288.its.uni-kassel.de its-ib288
192.168.169.199 its-ib289.its.uni-kassel.de its-ib289
192.168.169.200 its-ib290.its.uni-kassel.de its-ib290
192.168.169.201 its-ib291.its.uni-kassel.de its-ib291
192.168.169.202 its-ib292.its.uni-kassel.de its-ib292
192.168.169.203 its-ib293.its.uni-kassel.de its-ib293
192.168.169.204 its-ib294.its.uni-kassel.de its-ib294
192.168.169.205 its-ib295.its.uni-kassel.de its-ib295
192.168.169.206 its-ib296.its.uni-kassel.de its-ib296
192.168.169.207 its-ib297.its.uni-kassel.de its-ib297
192.168.169.208 its-ib298.its.uni-kassel.de its-ib298
192.168.169.209 its-ib299.its.uni-kassel.de its-ib299
141.51.205.210 its-cs300.its.uni-kassel.de its-cs300
141.51.205.211 its-cs301.its.uni-kassel.de its-cs301
141.51.205.212 its-cs302.its.uni-kassel.de its-cs302
141.51.205.213 its-cs303.its.uni-kassel.de its-cs303
141.51.205.214 its-cs304.its.uni-kassel.de its-cs304
141.51.205.215 its-cs305.its.uni-kassel.de its-cs305
141.51.205.216 its-cs306.its.uni-kassel.de its-cs306
141.51.205.217 its-cs307.its.uni-kassel.de its-cs307
141.51.205.218 its-cs308.its.uni-kassel.de its-cs308
141.51.205.219 its-cs309.its.uni-kassel.de its-cs309
141.51.205.220 its-cs310.its.uni-kassel.de its-cs310
141.51.205.221 its-cs311.its.uni-kassel.de its-cs311
141.51.205.222 its-cs312.its.uni-kassel.de its-cs312
141.51.205.223 its-cs313.its.uni-kassel.de its-cs313
141.51.205.224 its-cs314.its.uni-kassel.de its-cs314
141.51.205.225 its-cs315.its.uni-kassel.de its-cs315
141.51.205.226 its-cs316.its.uni-kassel.de its-cs316
141.51.205.227 its-cs317.its.uni-kassel.de its-cs317
141.51.205.228 its-cs318.its.uni-kassel.de its-cs318
141.51.205.229 its-cs319.its.uni-kassel.de its-cs319
141.51.205.230 its-cs320.its.uni-kassel.de its-cs320
141.51.205.231 its-cs321.its.uni-kassel.de its-cs321
141.51.205.232 its-cs322.its.uni-kassel.de its-cs322
141.51.205.233 its-cs323.its.uni-kassel.de its-cs323
141.51.205.234 its-cs324.its.uni-kassel.de its-cs324
141.51.205.235 its-cs325.its.uni-kassel.de its-cs325
141.51.205.236 its-cs326.its.uni-kassel.de its-cs326
141.51.205.237 its-cs327.its.uni-kassel.de its-cs327
141.51.205.238 its-cs328.its.uni-kassel.de its-cs328
141.51.205.239 its-cs329.its.uni-kassel.de its-cs329
141.51.205.240 its-cs330.its.uni-kassel.de its-cs330
141.51.205.241 its-cs331.its.uni-kassel.de its-cs331
141.51.205.242 its-cs332.its.uni-kassel.de its-cs332
141.51.205.243 its-cs333.its.uni-kassel.de its-cs333
141.51.205.244 its-cs334.its.uni-kassel.de its-cs334
141.51.205.245 its-cs335.its.uni-kassel.de its-cs335
141.51.205.246 its-cs336.its.uni-kassel.de its-cs336
141.51.205.247 its-cs337.its.uni-kassel.de its-cs337
141.51.205.248 its-cs338.its.uni-kassel.de its-cs338
141.51.205.249 its-cs339.its.uni-kassel.de its-cs339
141.51.205.250 its-cs340.its.uni-kassel.de its-cs340
141.51.205.251 its-cs341.its.uni-kassel.de its-cs341
141.51.205.252 its-cs342.its.uni-kassel.de its-cs342
141.51.205.253 its-cs343.its.uni-kassel.de its-cs343
141.51.205.254 its-cs344.its.uni-kassel.de its-cs344
192.168.205.210 its-no300.its.uni-kassel.de its-no300
192.168.205.211 its-no301.its.uni-kassel.de its-no301
192.168.205.212 its-no302.its.uni-kassel.de its-no302
192.168.205.213 its-no303.its.uni-kassel.de its-no303
192.168.205.214 its-no304.its.uni-kassel.de its-no304
192.168.205.215 its-no305.its.uni-kassel.de its-no305
192.168.205.216 its-no306.its.uni-kassel.de its-no306
192.168.205.217 its-no307.its.uni-kassel.de its-no307
192.168.205.218 its-no308.its.uni-kassel.de its-no308
192.168.205.219 its-no309.its.uni-kassel.de its-no309
192.168.205.220 its-no310.its.uni-kassel.de its-no310
192.168.205.221 its-no311.its.uni-kassel.de its-no311
192.168.205.222 its-no312.its.uni-kassel.de its-no312
192.168.205.223 its-no313.its.uni-kassel.de its-no313
192.168.205.224 its-no314.its.uni-kassel.de its-no314
192.168.205.225 its-no315.its.uni-kassel.de its-no315
192.168.205.226 its-no316.its.uni-kassel.de its-no316
192.168.205.227 its-no317.its.uni-kassel.de its-no317
192.168.205.228 its-no318.its.uni-kassel.de its-no318
192.168.205.229 its-no319.its.uni-kassel.de its-no319
192.168.205.230 its-no320.its.uni-kassel.de its-no320
192.168.205.231 its-no321.its.uni-kassel.de its-no321
192.168.205.232 its-no322.its.uni-kassel.de its-no322
192.168.205.233 its-no323.its.uni-kassel.de its-no323
192.168.205.234 its-no324.its.uni-kassel.de its-no324
192.168.205.235 its-no325.its.uni-kassel.de its-no325
192.168.205.236 its-no326.its.uni-kassel.de its-no326
192.168.205.237 its-no327.its.uni-kassel.de its-no327
192.168.205.238 its-no328.its.uni-kassel.de its-no328
192.168.205.239 its-no329.its.uni-kassel.de its-no329
192.168.205.240 its-no330.its.uni-kassel.de its-no330
192.168.205.241 its-no331.its.uni-kassel.de its-no331
192.168.205.242 its-no332.its.uni-kassel.de its-no332
192.168.205.243 its-no333.its.uni-kassel.de its-no333
192.168.205.244 its-no334.its.uni-kassel.de its-no334
192.168.205.245 its-no335.its.uni-kassel.de its-no335
192.168.205.246 its-no336.its.uni-kassel.de its-no336
192.168.205.247 its-no337.its.uni-kassel.de its-no337
192.168.205.248 its-no338.its.uni-kassel.de its-no338
192.168.205.249 its-no339.its.uni-kassel.de its-no339
192.168.205.250 its-no340.its.uni-kassel.de its-no340
192.168.205.251 its-no341.its.uni-kassel.de its-no341
192.168.205.252 its-no342.its.uni-kassel.de its-no342
192.168.205.253 its-no343.its.uni-kassel.de its-no343
192.168.205.254 its-no344.its.uni-kassel.de its-no344
192.168.169.210 its-ib300.its.uni-kassel.de its-ib300
192.168.169.211 its-ib301.its.uni-kassel.de its-ib301
192.168.169.212 its-ib302.its.uni-kassel.de its-ib302
192.168.169.213 its-ib303.its.uni-kassel.de its-ib303
192.168.169.214 its-ib304.its.uni-kassel.de its-ib304
192.168.169.215 its-ib305.its.uni-kassel.de its-ib305
192.168.169.216 its-ib306.its.uni-kassel.de its-ib306
192.168.169.217 its-ib307.its.uni-kassel.de its-ib307
192.168.169.218 its-ib308.its.uni-kassel.de its-ib308
192.168.169.219 its-ib309.its.uni-kassel.de its-ib309
192.168.169.220 its-ib310.its.uni-kassel.de its-ib310
192.168.169.221 its-ib311.its.uni-kassel.de its-ib311
192.168.169.222 its-ib312.its.uni-kassel.de its-ib312
192.168.169.223 its-ib313.its.uni-kassel.de its-ib313
192.168.169.224 its-ib314.its.uni-kassel.de its-ib314
192.168.169.225 its-ib315.its.uni-kassel.de its-ib315
192.168.169.226 its-ib316.its.uni-kassel.de its-ib316
192.168.169.227 its-ib317.its.uni-kassel.de its-ib317
192.168.169.228 its-ib318.its.uni-kassel.de its-ib318
192.168.169.229 its-ib319.its.uni-kassel.de its-ib319
192.168.169.230 its-ib320.its.uni-kassel.de its-ib320
192.168.169.231 its-ib321.its.uni-kassel.de its-ib321
192.168.169.232 its-ib322.its.uni-kassel.de its-ib322
192.168.169.233 its-ib323.its.uni-kassel.de its-ib323
192.168.169.234 its-ib324.its.uni-kassel.de its-ib324
192.168.169.235 its-ib325.its.uni-kassel.de its-ib325
192.168.169.236 its-ib326.its.uni-kassel.de its-ib326
192.168.169.237 its-ib327.its.uni-kassel.de its-ib327
192.168.169.238 its-ib328.its.uni-kassel.de its-ib328
192.168.169.239 its-ib329.its.uni-kassel.de its-ib329
192.168.169.240 its-ib330.its.uni-kassel.de its-ib330
192.168.169.241 its-ib331.its.uni-kassel.de its-ib331
192.168.169.242 its-ib332.its.uni-kassel.de its-ib332
192.168.169.243 its-ib333.its.uni-kassel.de its-ib333
192.168.169.244 its-ib334.its.uni-kassel.de its-ib334
192.168.169.245 its-ib335.its.uni-kassel.de its-ib335
192.168.169.246 its-ib336.its.uni-kassel.de its-ib336
192.168.169.247 its-ib337.its.uni-kassel.de its-ib337
192.168.169.248 its-ib338.its.uni-kassel.de its-ib338
192.168.169.249 its-ib339.its.uni-kassel.de its-ib339
192.168.169.250 its-ib340.its.uni-kassel.de its-ib340
192.168.169.251 its-ib341.its.uni-kassel.de its-ib341
192.168.169.252 its-ib342.its.uni-kassel.de its-ib342
192.168.169.253 its-ib343.its.uni-kassel.de its-ib343
192.168.169.254 its-ib344.its.uni-kassel.de its-ib344
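Since the question is how localhost and the master name actually resolve on each node, one generic way to check what the system resolver returns (a diagnostic sketch added for illustration, not part of the original mail; the name its-cs131 is taken from the logs below) is:

```shell
#!/usr/bin/env bash
# getent typically consults the same lookup order as the JVM's resolver on
# Linux (/etc/hosts first, then DNS, per /etc/nsswitch.conf), so it shows
# what Hadoop will most likely see for a given name.
resolves() {
  getent hosts "$1"
}

# Run on each node that shows connection problems:
resolves localhost || echo "localhost does not resolve on this node"
resolves its-cs131 || echo "its-cs131 does not resolve on this node"
```

If localhost resolves even though there is no entry in /etc/hosts, another source in the lookup chain (e.g. nsswitch/DNS) is supplying it, which matches what the cluster technician suspected.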


> Hello there,
>
>      Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek 
> <macek@cs.uni-kassel.de> wrote:
>
>     Hi,
>
>     I am currently trying to run my Hadoop program on a cluster.
>     Unfortunately, my DataNodes and TaskTrackers seem to have
>     difficulties communicating, as their logs show:
>     * Some DataNodes and TaskTrackers appear to have port problems
>     of some kind, as can be seen in the logs below. I wondered
>     whether this might be correlated with the localhost entry in
>     /etc/hosts, as many posts with similar errors suggest, but I
>     checked the file: neither localhost nor 127.0.0.1/127.0.1.1
>     is bound there. (You can ping localhost, though; the cluster's
>     technician said he would look into how localhost is resolved.)
>     * The other nodes cannot talk to the NameNode and JobTracker
>     (its-cs131), although it is not at all clear why this happens:
>     the "dfs -put" I do directly before the job runs fine, which
>     seems to imply that communication between those servers works
>     flawlessly.
>
>     Is there any reason why this might happen?
>
>
>     Regards,
>     Elmar
>
>     LOGS BELOW:
>
>     \____Datanodes
>
>     After successfully putting the data into HDFS (at this point I
>     assume the NameNode and DataNodes have to communicate), I get
>     the following errors when starting the job:
>
>     I found two kinds of logs: the first kind is big (about 12 MB)
>     and looks like this:
>     ############################### LOG TYPE 1
>     ############################################################
>     2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>     2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>     2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>     2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>     2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>     2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>     2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>     2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>     2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>     2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>     2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at $Proxy5.sendHeartbeat(Unknown Source)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>         at java.lang.Thread.run(Thread.java:619)
>     Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>         ... 5 more
>
>     ... (this continues until the end of the log)
>
>     The second kind is short:
>     ########################### LOG TYPE 2
>     ############################################################
>     2012-08-13 00:59:19,038 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>     /************************************************************
>     STARTUP_MSG: Starting DataNode
>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>     STARTUP_MSG:   args = []
>     STARTUP_MSG:   version = 1.0.2
>     STARTUP_MSG:   build =
>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>     ************************************************************/
>     2012-08-13 00:59:19,203 INFO
>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>     from hadoop-metrics2.properties
>     2012-08-13 00:59:19,216 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source MetricsSystem,sub=Stats registered.
>     2012-08-13 00:59:19,217 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>     snapshot period at 10 second(s).
>     2012-08-13 00:59:19,218 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode
>     metrics system started
>     2012-08-13 00:59:19,306 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ugi registered.
>     2012-08-13 00:59:19,346 INFO
>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>     library
>     2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>     2012-08-13 00:59:21,584 INFO
>     org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>     /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>     2012-08-13 00:59:21,584 INFO
>     org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>     2012-08-13 00:59:21,787 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>     FSDatasetStatusMBean
>     2012-08-13 00:59:21,897 INFO
>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>     Shutting down all async disk service threads...
>     2012-08-13 00:59:21,897 INFO
>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>     All async disk service threads have been shut down.
>     2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>         at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>     Caused by: java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>         ... 7 more
>
>     2012-08-13 00:59:21,899 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>     /************************************************************
>     SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
>     ************************************************************/
>
>
>
>
>
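The BindException above ("Address already in use" on 50010, the default DataNode transfer port) usually means a previous DataNode instance is still holding the port. A generic way to check for that, added here as an illustration and not part of the original thread, is to probe the port from the same machine:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if something is already listening on the
# given local TCP port, using bash's built-in /dev/tcp redirection.
port_in_use() {
  (echo > "/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# 50010 is the port from the DataNode BindException in the log above.
if port_in_use 50010; then
  echo "port 50010 is already in use -- an old DataNode may still be running"
else
  echo "port 50010 looks free"
fi
```

If the port is taken, stopping the leftover daemon (or finding the owning process with the cluster's usual tooling) before restarting the DataNode should clear this particular error.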
>     \_____TaskTracker
>     It is the same with the TaskTrackers: there are two kinds of logs.
>     ############################### LOG TYPE 1
>     ############################################################
>     2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>     Resending 'status' to 'its-cs131' with reponseId '879
>     2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>     2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>     2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>     2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>     2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>     2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>     2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>     2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>     2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>     2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>     2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>     Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>         ... 6 more
>
>
>     ########################### LOG TYPE 2
>     ############################################################
>     2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>     STARTUP_MSG:
>     /************************************************************
>     STARTUP_MSG: Starting TaskTracker
>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>     STARTUP_MSG:   args = []
>     STARTUP_MSG:   version = 1.0.2
>     STARTUP_MSG:   build =
>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>     ************************************************************/
>     2012-08-13 00:59:24,569 INFO
>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>     from hadoop-metrics2.properties
>     2012-08-13 00:59:24,626 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source MetricsSystem,sub=Stats registered.
>     2012-08-13 00:59:24,627 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
>     2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
>     2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
>     2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>     2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>     2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>     2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
>     2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>     2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
>     2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
>     2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
>     2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
>     2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
>     2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
>     2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>     2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>     2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>     2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
>     2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>     2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
>     2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
>     2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
>     2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>     2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>         at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>
>     2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>     /************************************************************
>     SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
>     ************************************************************/
>
>


Re: DataNode and Tasttracker communication

Posted by Michael Segel <mi...@hotmail.com>.
If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 

A more relevant question is whether he's running a firewall on each of these machines. 

A simple test... ssh to one node, ping other nodes and the control nodes at random to see if they can see one another. Then check to see if there is a firewall running which would limit the types of traffic between nodes. 

One other side note... are these machines multi-homed?
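A rough sketch of that test as a script, run from one of the worker nodes. This is only a diagnostic sketch, not a definitive procedure: the hostname and ports (its-cs131, 35554/35555 for the master RPC services, 50010/50060 for the local DataNode/TaskTracker) are taken from the logs in this thread and are assumptions about this particular cluster.

```shell
#!/usr/bin/env bash
# Diagnostic sketch for the symptoms above (assumed hostnames/ports).

# probe HOST PORT -> prints "open" or "closed", using bash's /dev/tcp device.
probe() {
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then echo open; else echo closed; fi
}

# 1. Can this node reach the master's RPC ports? "closed" here matches the
#    "Connection refused" retries in the logs.
probe its-cs131 35554   # namenode RPC (assumed port)
probe its-cs131 35555   # jobtracker RPC (assumed port)

# 2. Is something already bound to the local daemon ports? That would
#    explain the "Address already in use" BindException.
netstat -tln 2>/dev/null | grep -E ':(50010|50060) ' || echo "50010/50060 free"

# 3. Any firewall rules that could drop inter-node traffic? (needs root)
iptables -L -n 2>/dev/null | head -n 5

# 4. Multi-homed? More than one non-loopback address is a hint.
ip -o addr show 2>/dev/null | grep -v ' lo ' || true
```

If step 1 prints "closed" while plain ping works, the master process is not listening on that port or a firewall is filtering it; if step 2 finds a listener, a previous daemon instance is likely still holding the port, and stopping it before restarting should clear the BindException.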

On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello there,
> 
>      Could you please share your /etc/hosts file, if you don't mind.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
> Hi,
> 
> I am currently trying to run my Hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties communicating, as their logs show:
> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered whether this might be correlated with the localhost entry in /etc/hosts, as suggested in a lot of posts with similar errors, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the technician of the cluster said he'd look into the mechanism resolving localhost.)
> * The other nodes cannot talk to the namenode and jobtracker (its-cs131), although it is absolutely unclear why this happens: the "dfs -put" I do directly before the job runs fine, which seems to imply that communication between those servers works flawlessly.
> 
> Is there any reason why this might happen?
> 
> 
> Regards,
> Elmar
> 
> LOGS BELOW:
> 
> \____Datanodes
> 
> After successfully putting the data into HDFS (at this point I thought the namenode and datanodes have to communicate), I get the following errors when starting the job:
> 
> There are 2 kinds of logs I found: the first one is big (about 12 MB) and looks like this:
> ############################### LOG TYPE 1 ############################################################
> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at $Proxy5.sendHeartbeat(Unknown Source)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 5 more
> 
> ... (this continues until the end of the log)
> 
> The second is the short kind:
> ########################### LOG TYPE 2 ############################################################
> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ************************************************************/
> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>     ... 7 more
> 
> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
> 
> 
> 
> 
> 
> \_____TaskTracker
> With the TaskTrackers it is the same: there are 2 kinds.
> ############################### LOG TYPE 1 ############################################################
> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 6 more
> 
> 
> ########################### LOG TYPE 2 ############################################################
> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ************************************************************/
> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
> 
> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
> 


> ************************************************************/
> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
> 
> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
> 
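[Editor's note] Both "Address already in use" failures above (the DataNode on 50010 and the TaskTracker web UI on 50060) mean a previous daemon instance, or some other process, is still bound to the port. A minimal sketch of a pre-start check, assuming the default ports from the logs (this is illustrative, not part of Hadoop itself):

```python
import socket

def port_in_use(port, host=""):
    """Return True if something is already bound to the given TCP port.
    host="" binds all interfaces, matching Hadoop's default 0.0.0.0."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
    except OSError:
        # bind() fails with EADDRINUSE when another process holds the port
        return True
    finally:
        s.close()
    return False

# Hypothetical usage on a worker node before restarting the daemons:
# for p in (50010, 50060):
#     print(p, "in use" if port_in_use(p) else "free")
```

If a port reports in use, `netstat -tlnp` or `lsof -i :50010` will show the owning process; often it is a half-dead DataNode or TaskTracker left over from a previous start that needs to be killed before the daemon can come up again.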


Re: DataNode and TaskTracker communication

Posted by Michael Segel <mi...@hotmail.com>.
If the nodes can communicate and distribute data, then the odds are that the issue isn't going to be in his /etc/hosts. 

A more relevant question is whether he's running a firewall on each of these machines. 

A simple test... ssh to one node, then ping other nodes and the control nodes at random to see whether they can reach one another. Then check whether a firewall is running that would limit the types of traffic between nodes. 

One other side note... are these machines multi-homed?
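[Editor's note] The port part of this checklist can be scripted. A minimal sketch, assuming the master host (its-cs131) and the RPC ports (35554/35555) seen in the logs of this thread; substitute your own values:

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds,
    False on connection refused, timeout, or DNS failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical probe against the master from the logs:
# for port in (35554, 35555):
#     state = "open" if can_connect("its-cs131", port) else "refused/filtered"
#     print("its-cs131:%d %s" % (port, state))
```

A fast "connection refused" (rather than a timeout) usually means the packet reached the host but nothing is listening on that port, pointing at a stopped daemon or a port mismatch rather than a firewall; a timeout is more consistent with packets being dropped by a firewall.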

On Aug 13, 2012, at 7:51 AM, Mohammad Tariq <do...@gmail.com> wrote:

> Hello there,
> 
>      Could you please share your /etc/hosts file, if you don't mind.
> 
> Regards,
>     Mohammad Tariq
> 
> 
> 
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek <ma...@cs.uni-kassel.de> wrote:
> Hi,
> 
> I am currently trying to run my Hadoop program on a cluster. Sadly, my datanodes and tasktrackers seem to have difficulties communicating, as their logs show:
> * Some datanodes and tasktrackers seem to have port problems of some kind, as can be seen in the logs below. I wondered if this might be correlated with the localhost entry in /etc/hosts, as many posts with similar errors suggest, but I checked the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound there. (Although you can still ping localhost; the cluster's technician said he'd look into the mechanism resolving localhost.)
> * The other nodes cannot talk to the namenode and jobtracker (its-cs131), although it is not at all clear why this is happening: the "dfs -put" I run directly before the job works fine, which seems to imply that communication between those servers is working flawlessly.
> 
> Is there any reason why this might happen?
> 
> 
> Regards,
> Elmar
> 
> LOGS BELOW:
> 
> \____Datanodes
> 
> After successfully putting the data to HDFS (at this point I thought namenode and datanodes have to communicate), I get the following errors when starting the job:
> 
> There are 2 kinds of logs I found: the first one is big (about 12 MB) and looks like this:
> ############################### LOG TYPE 1 ############################################################
> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at $Proxy5.sendHeartbeat(Unknown Source)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 5 more
> 
> ... (this continues til the end of the log)
> 
> The second is short kind:
> ########################### LOG TYPE 2 ############################################################
> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ************************************************************/
> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>     ... 7 more
> 
> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
> 
> 
> 
> 
> 
> \_____TaskTracker
> With TaskTrackers it is the same: there are 2 kinds.
> ############################### LOG TYPE 1 ############################################################
> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 6 more
> 
> 
> ########################### LOG TYPE 2 ############################################################
> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ************************************************************/
> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
> 
> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
> 


Re: DataNode and TaskTracker communication

Posted by Björn-Elmar Macek <em...@cs.uni-kassel.de>.
Sure I can, but it is long since it is a cluster:


141.51.12.86  hrz-cs400.hrz.uni-kassel.de hrz-cs400

141.51.204.11 hrz-cs401.hrz.uni-kassel.de hrz-cs401
141.51.204.12 hrz-cs402.hrz.uni-kassel.de hrz-cs402
141.51.204.13 hrz-cs403.hrz.uni-kassel.de hrz-cs403
141.51.204.14 hrz-cs404.hrz.uni-kassel.de hrz-cs404
141.51.204.15 hrz-cs405.hrz.uni-kassel.de hrz-cs405
141.51.204.16 hrz-cs406.hrz.uni-kassel.de hrz-cs406
141.51.204.17 hrz-cs407.hrz.uni-kassel.de hrz-cs407
141.51.204.18 hrz-cs408.hrz.uni-kassel.de hrz-cs408
141.51.204.19 hrz-cs409.hrz.uni-kassel.de hrz-cs409
141.51.204.20 hrz-cs410.hrz.uni-kassel.de hrz-cs410
141.51.204.21 hrz-cs411.hrz.uni-kassel.de hrz-cs411
141.51.204.22 hrz-cs412.hrz.uni-kassel.de hrz-cs412
141.51.204.23 hrz-cs413.hrz.uni-kassel.de hrz-cs413
141.51.204.24 hrz-cs414.hrz.uni-kassel.de hrz-cs414
141.51.204.25 hrz-cs415.hrz.uni-kassel.de hrz-cs415
141.51.204.26 hrz-cs416.hrz.uni-kassel.de hrz-cs416
141.51.204.27 hrz-cs417.hrz.uni-kassel.de hrz-cs417
141.51.204.28 hrz-cs418.hrz.uni-kassel.de hrz-cs418
141.51.204.29 hrz-cs419.hrz.uni-kassel.de hrz-cs419
141.51.204.31 hrz-cs421.hrz.uni-kassel.de hrz-cs421
141.51.204.32 hrz-cs422.hrz.uni-kassel.de hrz-cs422
141.51.204.33 hrz-cs423.hrz.uni-kassel.de hrz-cs423
141.51.204.34 hrz-cs424.hrz.uni-kassel.de hrz-cs424
141.51.204.35 hrz-cs425.hrz.uni-kassel.de hrz-cs425
141.51.204.36 hrz-cs426.hrz.uni-kassel.de hrz-cs426
141.51.204.37 hrz-cs427.hrz.uni-kassel.de hrz-cs427
141.51.204.38 hrz-cs428.hrz.uni-kassel.de hrz-cs428
141.51.204.39 hrz-cs429.hrz.uni-kassel.de hrz-cs429
141.51.204.40 hrz-cs430.hrz.uni-kassel.de hrz-cs430
141.51.204.47 hrz-cs437.hrz.uni-kassel.de hrz-cs437
141.51.204.48 hrz-cs438.hrz.uni-kassel.de hrz-cs438
141.51.204.49 hrz-cs439.hrz.uni-kassel.de hrz-cs439
141.51.204.50 hrz-cs440.hrz.uni-kassel.de hrz-cs440
141.51.204.51 hrz-cs441.hrz.uni-kassel.de hrz-cs441
141.51.204.54 hrz-cs444.hrz.uni-kassel.de hrz-cs444
141.51.204.65 hrz-cs455.hrz.uni-kassel.de hrz-cs455
141.51.204.66 hrz-cs456.hrz.uni-kassel.de hrz-cs456
141.51.204.69 hrz-cs459.hrz.uni-kassel.de hrz-cs459
141.51.204.70 hrz-cs460.hrz.uni-kassel.de hrz-cs460
141.51.204.71 hrz-cs461.hrz.uni-kassel.de hrz-cs461
141.51.204.72 hrz-cs462.hrz.uni-kassel.de hrz-cs462
141.51.204.73 hrz-cs463.hrz.uni-kassel.de hrz-cs463
141.51.204.74 hrz-cs464.hrz.uni-kassel.de hrz-cs464
141.51.204.75 hrz-cs465.hrz.uni-kassel.de hrz-cs465
141.51.204.76 hrz-cs466.hrz.uni-kassel.de hrz-cs466
141.51.204.77 hrz-cs467.hrz.uni-kassel.de hrz-cs467
141.51.204.78 hrz-cs468.hrz.uni-kassel.de hrz-cs468
141.51.204.79 hrz-cs469.hrz.uni-kassel.de hrz-cs469
141.51.204.80 hrz-cs470.hrz.uni-kassel.de hrz-cs470
141.51.204.81 hrz-cs471.hrz.uni-kassel.de hrz-cs471
141.51.204.82 hrz-cs472.hrz.uni-kassel.de hrz-cs472
141.51.204.83 hrz-cs473.hrz.uni-kassel.de hrz-cs473
141.51.204.84 hrz-cs474.hrz.uni-kassel.de hrz-cs474
141.51.204.85 hrz-cs475.hrz.uni-kassel.de hrz-cs475
141.51.204.86 hrz-cs476.hrz.uni-kassel.de hrz-cs476
141.51.204.87 hrz-cs477.hrz.uni-kassel.de hrz-cs477
141.51.204.88 hrz-cs478.hrz.uni-kassel.de hrz-cs478
141.51.204.89 hrz-cs479.hrz.uni-kassel.de hrz-cs479
141.51.204.90 hrz-cs480.hrz.uni-kassel.de hrz-cs480
141.51.204.91 hrz-cs481.hrz.uni-kassel.de hrz-cs481
141.51.204.92 hrz-cs482.hrz.uni-kassel.de hrz-cs482
141.51.204.93 hrz-cs483.hrz.uni-kassel.de hrz-cs483
141.51.204.94 hrz-cs484.hrz.uni-kassel.de hrz-cs484
141.51.204.95 hrz-cs485.hrz.uni-kassel.de hrz-cs485
141.51.204.96 hrz-cs486.hrz.uni-kassel.de hrz-cs486
141.51.204.97 hrz-cs487.hrz.uni-kassel.de hrz-cs487
141.51.204.98 hrz-cs488.hrz.uni-kassel.de hrz-cs488
141.51.204.99 hrz-cs489.hrz.uni-kassel.de hrz-cs489
141.51.204.100 hrz-cs490.hrz.uni-kassel.de hrz-cs490
141.51.204.101 hrz-cs491.hrz.uni-kassel.de hrz-cs491
141.51.204.102 hrz-cs492.hrz.uni-kassel.de hrz-cs492
141.51.204.103 hrz-cs493.hrz.uni-kassel.de hrz-cs493
141.51.204.104 hrz-cs494.hrz.uni-kassel.de hrz-cs494
141.51.204.105 hrz-cs495.hrz.uni-kassel.de hrz-cs495
141.51.204.106 hrz-cs496.hrz.uni-kassel.de hrz-cs496
141.51.204.107 hrz-cs497.hrz.uni-kassel.de hrz-cs497
141.51.204.108 hrz-cs498.hrz.uni-kassel.de hrz-cs498
141.51.204.109 hrz-cs499.hrz.uni-kassel.de hrz-cs499
141.51.204.110 hrz-cs500.hrz.uni-kassel.de hrz-cs500
141.51.204.111 hrz-cs501.hrz.uni-kassel.de hrz-cs501
141.51.204.112 hrz-cs502.hrz.uni-kassel.de hrz-cs502
141.51.204.113 hrz-cs503.hrz.uni-kassel.de hrz-cs503
141.51.204.114 hrz-cs504.hrz.uni-kassel.de hrz-cs504
141.51.204.115 hrz-cs505.hrz.uni-kassel.de hrz-cs505
141.51.204.116 hrz-cs506.hrz.uni-kassel.de hrz-cs506
141.51.204.117 hrz-cs507.hrz.uni-kassel.de hrz-cs507
141.51.204.118 hrz-cs508.hrz.uni-kassel.de hrz-cs508
141.51.204.119 hrz-cs509.hrz.uni-kassel.de hrz-cs509
141.51.204.120 hrz-cs510.hrz.uni-kassel.de hrz-cs510
141.51.204.121 hrz-cs511.hrz.uni-kassel.de hrz-cs511
141.51.204.122 hrz-cs512.hrz.uni-kassel.de hrz-cs512
141.51.204.123 hrz-cs513.hrz.uni-kassel.de hrz-cs513
141.51.204.124 hrz-cs514.hrz.uni-kassel.de hrz-cs514
141.51.204.125 hrz-cs515.hrz.uni-kassel.de hrz-cs515
141.51.204.126 hrz-cs516.hrz.uni-kassel.de hrz-cs516
141.51.204.127 hrz-cs517.hrz.uni-kassel.de hrz-cs517
141.51.204.128 hrz-cs518.hrz.uni-kassel.de hrz-cs518
141.51.204.129 hrz-cs519.hrz.uni-kassel.de hrz-cs519
141.51.204.130 hrz-cs520.hrz.uni-kassel.de hrz-cs520
141.51.204.131 hrz-cs521.hrz.uni-kassel.de hrz-cs521
141.51.204.132 hrz-cs522.hrz.uni-kassel.de hrz-cs522
141.51.204.133 hrz-cs523.hrz.uni-kassel.de hrz-cs523
141.51.204.134 hrz-cs524.hrz.uni-kassel.de hrz-cs524
141.51.204.135 hrz-cs525.hrz.uni-kassel.de hrz-cs525
141.51.204.136 hrz-cs526.hrz.uni-kassel.de hrz-cs526
141.51.204.137 hrz-cs527.hrz.uni-kassel.de hrz-cs527
141.51.204.138 hrz-cs528.hrz.uni-kassel.de hrz-cs528
141.51.204.139 hrz-cs529.hrz.uni-kassel.de hrz-cs529
141.51.204.140 hrz-cs530.hrz.uni-kassel.de hrz-cs530
141.51.204.141 hrz-cs531.hrz.uni-kassel.de hrz-cs531
141.51.204.142 hrz-cs532.hrz.uni-kassel.de hrz-cs532
141.51.204.143 hrz-cs533.hrz.uni-kassel.de hrz-cs533
141.51.204.144 hrz-cs534.hrz.uni-kassel.de hrz-cs534
141.51.204.145 hrz-cs535.hrz.uni-kassel.de hrz-cs535
141.51.204.146 hrz-cs536.hrz.uni-kassel.de hrz-cs536
141.51.204.147 hrz-cs537.hrz.uni-kassel.de hrz-cs537
141.51.204.148 hrz-cs538.hrz.uni-kassel.de hrz-cs538
141.51.204.149 hrz-cs539.hrz.uni-kassel.de hrz-cs539
141.51.204.150 hrz-cs540.hrz.uni-kassel.de hrz-cs540
141.51.204.151 hrz-cs541.hrz.uni-kassel.de hrz-cs541
141.51.204.152 hrz-cs542.hrz.uni-kassel.de hrz-cs542
141.51.204.153 hrz-cs543.hrz.uni-kassel.de hrz-cs543
141.51.204.154 hrz-cs544.hrz.uni-kassel.de hrz-cs544
141.51.204.155 hrz-cs545.hrz.uni-kassel.de hrz-cs545
141.51.204.156 hrz-cs546.hrz.uni-kassel.de hrz-cs546
141.51.204.157 hrz-cs547.hrz.uni-kassel.de hrz-cs547
141.51.204.158 hrz-cs548.hrz.uni-kassel.de hrz-cs548
141.51.204.159 hrz-cs549.hrz.uni-kassel.de hrz-cs549
141.51.204.160 hrz-cs550.hrz.uni-kassel.de hrz-cs550
141.51.204.161 hrz-cs551.hrz.uni-kassel.de hrz-cs551
141.51.204.162 hrz-cs552.hrz.uni-kassel.de hrz-cs552
141.51.204.163 hrz-cs553.hrz.uni-kassel.de hrz-cs553
141.51.204.164 hrz-cs554.hrz.uni-kassel.de hrz-cs554
141.51.204.165 hrz-cs555.hrz.uni-kassel.de hrz-cs555
141.51.204.166 hrz-cs556.hrz.uni-kassel.de hrz-cs556
141.51.204.167 hrz-cs557.hrz.uni-kassel.de hrz-cs557
141.51.204.168 hrz-cs558.hrz.uni-kassel.de hrz-cs558
141.51.204.169 hrz-cs559.hrz.uni-kassel.de hrz-cs559
141.51.204.215 hrz-cs560.hrz.uni-kassel.de hrz-cs560
141.51.204.216 hrz-cs561.hrz.uni-kassel.de hrz-cs561
141.51.204.217 hrz-cs562.hrz.uni-kassel.de hrz-cs562
141.51.204.218 hrz-cs563.hrz.uni-kassel.de hrz-cs563
141.51.204.219 hrz-cs564.hrz.uni-kassel.de hrz-cs564
141.51.204.220 hrz-cs565.hrz.uni-kassel.de hrz-cs565
141.51.204.221 hrz-cs566.hrz.uni-kassel.de hrz-cs566
141.51.204.222 hrz-cs567.hrz.uni-kassel.de hrz-cs567
141.51.204.223 hrz-cs568.hrz.uni-kassel.de hrz-cs568
141.51.204.224 hrz-cs569.hrz.uni-kassel.de hrz-cs569
141.51.204.225 hrz-cs570.hrz.uni-kassel.de hrz-cs570
141.51.204.226 hrz-cs571.hrz.uni-kassel.de hrz-cs571
141.51.204.227 hrz-cs572.hrz.uni-kassel.de hrz-cs572
141.51.204.228 hrz-cs573.hrz.uni-kassel.de hrz-cs573
141.51.204.229 hrz-cs574.hrz.uni-kassel.de hrz-cs574
141.51.204.230 hrz-cs575.hrz.uni-kassel.de hrz-cs575
141.51.204.231 hrz-cs576.hrz.uni-kassel.de hrz-cs576
141.51.204.232 hrz-cs577.hrz.uni-kassel.de hrz-cs577
141.51.204.233 hrz-cs578.hrz.uni-kassel.de hrz-cs578
141.51.204.234 hrz-cs579.hrz.uni-kassel.de hrz-cs579
141.51.204.235 hrz-cs580.hrz.uni-kassel.de hrz-cs580
141.51.204.236 hrz-cs581.hrz.uni-kassel.de hrz-cs581
141.51.204.237 hrz-cs582.hrz.uni-kassel.de hrz-cs582
141.51.204.238 hrz-cs583.hrz.uni-kassel.de hrz-cs583
141.51.204.239 hrz-cs584.hrz.uni-kassel.de hrz-cs584
141.51.204.240 hrz-cs585.hrz.uni-kassel.de hrz-cs585
141.51.204.241 hrz-cs586.hrz.uni-kassel.de hrz-cs586
141.51.204.242 hrz-cs587.hrz.uni-kassel.de hrz-cs587
141.51.204.243 hrz-cs588.hrz.uni-kassel.de hrz-cs588
141.51.204.244 hrz-cs589.hrz.uni-kassel.de hrz-cs589
141.51.204.245 hrz-cs590.hrz.uni-kassel.de hrz-cs590
141.51.204.246 hrz-cs591.hrz.uni-kassel.de hrz-cs591
141.51.204.247 hrz-cs592.hrz.uni-kassel.de hrz-cs592
141.51.204.248 hrz-cs593.hrz.uni-kassel.de hrz-cs593
141.51.204.249 hrz-cs594.hrz.uni-kassel.de hrz-cs594
141.51.204.250 hrz-cs595.hrz.uni-kassel.de hrz-cs595
141.51.204.251 hrz-cs596.hrz.uni-kassel.de hrz-cs596
141.51.204.252 hrz-cs597.hrz.uni-kassel.de hrz-cs597
141.51.204.253 hrz-cs598.hrz.uni-kassel.de hrz-cs598
141.51.204.254 hrz-cs599.hrz.uni-kassel.de hrz-cs599


192.168.204.11 hrz-no401.hrz.uni-kassel.de hrz-no401
192.168.204.12 hrz-no402.hrz.uni-kassel.de hrz-no402
192.168.204.13 hrz-no403.hrz.uni-kassel.de hrz-no403
192.168.204.14 hrz-no404.hrz.uni-kassel.de hrz-no404
192.168.204.15 hrz-no405.hrz.uni-kassel.de hrz-no405
192.168.204.16 hrz-no406.hrz.uni-kassel.de hrz-no406
192.168.204.17 hrz-no407.hrz.uni-kassel.de hrz-no407
192.168.204.18 hrz-no408.hrz.uni-kassel.de hrz-no408
192.168.204.19 hrz-no409.hrz.uni-kassel.de hrz-no409
192.168.204.20 hrz-no410.hrz.uni-kassel.de hrz-no410
192.168.204.21 hrz-no411.hrz.uni-kassel.de hrz-no411
192.168.204.22 hrz-no412.hrz.uni-kassel.de hrz-no412
192.168.204.23 hrz-no413.hrz.uni-kassel.de hrz-no413
192.168.204.24 hrz-no414.hrz.uni-kassel.de hrz-no414
192.168.204.25 hrz-no415.hrz.uni-kassel.de hrz-no415
192.168.204.26 hrz-no416.hrz.uni-kassel.de hrz-no416
192.168.204.27 hrz-no417.hrz.uni-kassel.de hrz-no417
192.168.204.28 hrz-no418.hrz.uni-kassel.de hrz-no418
192.168.204.29 hrz-no419.hrz.uni-kassel.de hrz-no419
192.168.204.31 hrz-no421.hrz.uni-kassel.de hrz-no421
192.168.204.32 hrz-no422.hrz.uni-kassel.de hrz-no422
192.168.204.33 hrz-no423.hrz.uni-kassel.de hrz-no423
192.168.204.34 hrz-no424.hrz.uni-kassel.de hrz-no424
192.168.204.35 hrz-no425.hrz.uni-kassel.de hrz-no425
192.168.204.36 hrz-no426.hrz.uni-kassel.de hrz-no426
192.168.204.37 hrz-no427.hrz.uni-kassel.de hrz-no427
192.168.204.38 hrz-no428.hrz.uni-kassel.de hrz-no428
192.168.204.39 hrz-no429.hrz.uni-kassel.de hrz-no429
192.168.204.40 hrz-no430.hrz.uni-kassel.de hrz-no430
192.168.204.47 hrz-no437.hrz.uni-kassel.de hrz-no437
192.168.204.48 hrz-no438.hrz.uni-kassel.de hrz-no438
192.168.204.49 hrz-no439.hrz.uni-kassel.de hrz-no439
192.168.204.50 hrz-no440.hrz.uni-kassel.de hrz-no440
192.168.204.51 hrz-no441.hrz.uni-kassel.de hrz-no441
192.168.204.54 hrz-no444.hrz.uni-kassel.de hrz-no444
192.168.204.65 hrz-no455.hrz.uni-kassel.de hrz-no455
192.168.204.66 hrz-no456.hrz.uni-kassel.de hrz-no456
192.168.204.69 hrz-no459.hrz.uni-kassel.de hrz-no459
192.168.204.70 hrz-no460.hrz.uni-kassel.de hrz-no460
192.168.204.71 hrz-no461.hrz.uni-kassel.de hrz-no461
192.168.204.72 hrz-no462.hrz.uni-kassel.de hrz-no462
192.168.204.73 hrz-no463.hrz.uni-kassel.de hrz-no463
192.168.204.74 hrz-no464.hrz.uni-kassel.de hrz-no464
192.168.204.75 hrz-no465.hrz.uni-kassel.de hrz-no465
192.168.204.76 hrz-no466.hrz.uni-kassel.de hrz-no466
192.168.204.77 hrz-no467.hrz.uni-kassel.de hrz-no467
192.168.204.78 hrz-no468.hrz.uni-kassel.de hrz-no468
192.168.204.79 hrz-no469.hrz.uni-kassel.de hrz-no469
192.168.204.80 hrz-no470.hrz.uni-kassel.de hrz-no470
192.168.204.81 hrz-no471.hrz.uni-kassel.de hrz-no471
192.168.204.82 hrz-no472.hrz.uni-kassel.de hrz-no472
192.168.204.83 hrz-no473.hrz.uni-kassel.de hrz-no473
192.168.204.84 hrz-no474.hrz.uni-kassel.de hrz-no474
192.168.204.85 hrz-no475.hrz.uni-kassel.de hrz-no475
192.168.204.86 hrz-no476.hrz.uni-kassel.de hrz-no476
192.168.204.87 hrz-no477.hrz.uni-kassel.de hrz-no477
192.168.204.88 hrz-no478.hrz.uni-kassel.de hrz-no478
192.168.204.89 hrz-no479.hrz.uni-kassel.de hrz-no479
192.168.204.90 hrz-no480.hrz.uni-kassel.de hrz-no480
192.168.204.91 hrz-no481.hrz.uni-kassel.de hrz-no481
192.168.204.92 hrz-no482.hrz.uni-kassel.de hrz-no482
192.168.204.93 hrz-no483.hrz.uni-kassel.de hrz-no483
192.168.204.94 hrz-no484.hrz.uni-kassel.de hrz-no484
192.168.204.95 hrz-no485.hrz.uni-kassel.de hrz-no485
192.168.204.96 hrz-no486.hrz.uni-kassel.de hrz-no486
192.168.204.97 hrz-no487.hrz.uni-kassel.de hrz-no487
192.168.204.98 hrz-no488.hrz.uni-kassel.de hrz-no488
192.168.204.99 hrz-no489.hrz.uni-kassel.de hrz-no489
192.168.204.100 hrz-no490.hrz.uni-kassel.de hrz-no490
192.168.204.101 hrz-no491.hrz.uni-kassel.de hrz-no491
192.168.204.102 hrz-no492.hrz.uni-kassel.de hrz-no492
192.168.204.103 hrz-no493.hrz.uni-kassel.de hrz-no493
192.168.204.104 hrz-no494.hrz.uni-kassel.de hrz-no494
192.168.204.105 hrz-no495.hrz.uni-kassel.de hrz-no495
192.168.204.106 hrz-no496.hrz.uni-kassel.de hrz-no496
192.168.204.107 hrz-no497.hrz.uni-kassel.de hrz-no497
192.168.204.108 hrz-no498.hrz.uni-kassel.de hrz-no498
192.168.204.109 hrz-no499.hrz.uni-kassel.de hrz-no499
192.168.204.110 hrz-no500.hrz.uni-kassel.de hrz-no500
192.168.204.111 hrz-no501.hrz.uni-kassel.de hrz-no501
192.168.204.112 hrz-no502.hrz.uni-kassel.de hrz-no502
192.168.204.113 hrz-no503.hrz.uni-kassel.de hrz-no503
192.168.204.114 hrz-no504.hrz.uni-kassel.de hrz-no504
192.168.204.115 hrz-no505.hrz.uni-kassel.de hrz-no505
192.168.204.116 hrz-no506.hrz.uni-kassel.de hrz-no506
192.168.204.117 hrz-no507.hrz.uni-kassel.de hrz-no507
192.168.204.118 hrz-no508.hrz.uni-kassel.de hrz-no508
192.168.204.119 hrz-no509.hrz.uni-kassel.de hrz-no509
192.168.204.120 hrz-no510.hrz.uni-kassel.de hrz-no510
192.168.204.121 hrz-no511.hrz.uni-kassel.de hrz-no511
192.168.204.122 hrz-no512.hrz.uni-kassel.de hrz-no512
192.168.204.123 hrz-no513.hrz.uni-kassel.de hrz-no513
192.168.204.124 hrz-no514.hrz.uni-kassel.de hrz-no514
192.168.204.125 hrz-no515.hrz.uni-kassel.de hrz-no515
192.168.204.126 hrz-no516.hrz.uni-kassel.de hrz-no516
192.168.204.127 hrz-no517.hrz.uni-kassel.de hrz-no517
192.168.204.128 hrz-no518.hrz.uni-kassel.de hrz-no518
192.168.204.129 hrz-no519.hrz.uni-kassel.de hrz-no519
192.168.204.130 hrz-no520.hrz.uni-kassel.de hrz-no520
192.168.204.131 hrz-no521.hrz.uni-kassel.de hrz-no521
192.168.204.132 hrz-no522.hrz.uni-kassel.de hrz-no522
192.168.204.133 hrz-no523.hrz.uni-kassel.de hrz-no523
192.168.204.134 hrz-no524.hrz.uni-kassel.de hrz-no524
192.168.204.135 hrz-no525.hrz.uni-kassel.de hrz-no525
192.168.204.136 hrz-no526.hrz.uni-kassel.de hrz-no526
192.168.204.137 hrz-no527.hrz.uni-kassel.de hrz-no527
192.168.204.138 hrz-no528.hrz.uni-kassel.de hrz-no528
192.168.204.139 hrz-no529.hrz.uni-kassel.de hrz-no529
192.168.204.140 hrz-no530.hrz.uni-kassel.de hrz-no530
192.168.204.141 hrz-no531.hrz.uni-kassel.de hrz-no531
192.168.204.142 hrz-no532.hrz.uni-kassel.de hrz-no532
192.168.204.143 hrz-no533.hrz.uni-kassel.de hrz-no533
192.168.204.144 hrz-no534.hrz.uni-kassel.de hrz-no534
192.168.204.145 hrz-no535.hrz.uni-kassel.de hrz-no535
192.168.204.146 hrz-no536.hrz.uni-kassel.de hrz-no536
192.168.204.147 hrz-no537.hrz.uni-kassel.de hrz-no537
192.168.204.148 hrz-no538.hrz.uni-kassel.de hrz-no538
192.168.204.149 hrz-no539.hrz.uni-kassel.de hrz-no539
192.168.204.150 hrz-no540.hrz.uni-kassel.de hrz-no540
192.168.204.151 hrz-no541.hrz.uni-kassel.de hrz-no541
192.168.204.152 hrz-no542.hrz.uni-kassel.de hrz-no542
192.168.204.153 hrz-no543.hrz.uni-kassel.de hrz-no543
192.168.204.154 hrz-no544.hrz.uni-kassel.de hrz-no544
192.168.204.155 hrz-no545.hrz.uni-kassel.de hrz-no545
192.168.204.156 hrz-no546.hrz.uni-kassel.de hrz-no546
192.168.204.157 hrz-no547.hrz.uni-kassel.de hrz-no547
192.168.204.158 hrz-no548.hrz.uni-kassel.de hrz-no548
192.168.204.159 hrz-no549.hrz.uni-kassel.de hrz-no549
192.168.204.160 hrz-no550.hrz.uni-kassel.de hrz-no550
192.168.204.161 hrz-no551.hrz.uni-kassel.de hrz-no551
192.168.204.162 hrz-no552.hrz.uni-kassel.de hrz-no552
192.168.204.163 hrz-no553.hrz.uni-kassel.de hrz-no553
192.168.204.164 hrz-no554.hrz.uni-kassel.de hrz-no554
192.168.204.165 hrz-no555.hrz.uni-kassel.de hrz-no555
192.168.204.166 hrz-no556.hrz.uni-kassel.de hrz-no556
192.168.204.167 hrz-no557.hrz.uni-kassel.de hrz-no557
192.168.204.168 hrz-no558.hrz.uni-kassel.de hrz-no558
192.168.204.169 hrz-no559.hrz.uni-kassel.de hrz-no559
192.168.204.215 hrz-no560.hrz.uni-kassel.de hrz-no560
192.168.204.216 hrz-no561.hrz.uni-kassel.de hrz-no561
192.168.204.217 hrz-no562.hrz.uni-kassel.de hrz-no562
192.168.204.218 hrz-no563.hrz.uni-kassel.de hrz-no563
192.168.204.219 hrz-no564.hrz.uni-kassel.de hrz-no564
192.168.204.220 hrz-no565.hrz.uni-kassel.de hrz-no565
192.168.204.221 hrz-no566.hrz.uni-kassel.de hrz-no566
192.168.204.222 hrz-no567.hrz.uni-kassel.de hrz-no567
192.168.204.223 hrz-no568.hrz.uni-kassel.de hrz-no568
192.168.204.224 hrz-no569.hrz.uni-kassel.de hrz-no569
192.168.204.225 hrz-no570.hrz.uni-kassel.de hrz-no570
192.168.204.226 hrz-no571.hrz.uni-kassel.de hrz-no571
192.168.204.227 hrz-no572.hrz.uni-kassel.de hrz-no572
192.168.204.228 hrz-no573.hrz.uni-kassel.de hrz-no573
192.168.204.229 hrz-no574.hrz.uni-kassel.de hrz-no574
192.168.204.230 hrz-no575.hrz.uni-kassel.de hrz-no575
192.168.204.231 hrz-no576.hrz.uni-kassel.de hrz-no576
192.168.204.232 hrz-no577.hrz.uni-kassel.de hrz-no577
192.168.204.233 hrz-no578.hrz.uni-kassel.de hrz-no578
192.168.204.234 hrz-no579.hrz.uni-kassel.de hrz-no579
192.168.204.235 hrz-no580.hrz.uni-kassel.de hrz-no580
192.168.204.236 hrz-no581.hrz.uni-kassel.de hrz-no581
192.168.204.237 hrz-no582.hrz.uni-kassel.de hrz-no582
192.168.204.238 hrz-no583.hrz.uni-kassel.de hrz-no583
192.168.204.239 hrz-no584.hrz.uni-kassel.de hrz-no584
192.168.204.240 hrz-no585.hrz.uni-kassel.de hrz-no585
192.168.204.241 hrz-no586.hrz.uni-kassel.de hrz-no586
192.168.204.242 hrz-no587.hrz.uni-kassel.de hrz-no587
192.168.204.243 hrz-no588.hrz.uni-kassel.de hrz-no588
192.168.204.244 hrz-no589.hrz.uni-kassel.de hrz-no589
192.168.204.245 hrz-no590.hrz.uni-kassel.de hrz-no590
192.168.204.246 hrz-no591.hrz.uni-kassel.de hrz-no591
192.168.204.247 hrz-no592.hrz.uni-kassel.de hrz-no592
192.168.204.248 hrz-no593.hrz.uni-kassel.de hrz-no593
192.168.204.249 hrz-no594.hrz.uni-kassel.de hrz-no594
192.168.204.250 hrz-no595.hrz.uni-kassel.de hrz-no595
192.168.204.251 hrz-no596.hrz.uni-kassel.de hrz-no596
192.168.204.252 hrz-no597.hrz.uni-kassel.de hrz-no597
192.168.204.253 hrz-no598.hrz.uni-kassel.de hrz-no598
192.168.204.254 hrz-no599.hrz.uni-kassel.de hrz-no599

141.51.204.190 hrz-gc100.hrz.uni-kassel.de hrz-gc100
141.51.204.191 hrz-gc101.hrz.uni-kassel.de hrz-gc101
141.51.204.192 hrz-gc102.hrz.uni-kassel.de hrz-gc102
141.51.204.193 hrz-gc103.hrz.uni-kassel.de hrz-gc103
141.51.204.194 hrz-gc104.hrz.uni-kassel.de hrz-gc104
141.51.204.195 hrz-gc105.hrz.uni-kassel.de hrz-gc105
141.51.204.196 hrz-gc106.hrz.uni-kassel.de hrz-gc106
141.51.204.197 hrz-gc107.hrz.uni-kassel.de hrz-gc107
141.51.204.198 hrz-gc108.hrz.uni-kassel.de hrz-gc108
141.51.204.199 hrz-gc109.hrz.uni-kassel.de hrz-gc109
141.51.204.200 hrz-gc110.hrz.uni-kassel.de hrz-gc110
141.51.204.201 hrz-gc111.hrz.uni-kassel.de hrz-gc111
141.51.204.202 hrz-gc112.hrz.uni-kassel.de hrz-gc112
141.51.204.203 hrz-gc113.hrz.uni-kassel.de hrz-gc113
141.51.204.204 hrz-gc114.hrz.uni-kassel.de hrz-gc114
141.51.204.205 hrz-gc115.hrz.uni-kassel.de hrz-gc115
141.51.204.206 hrz-gc116.hrz.uni-kassel.de hrz-gc116
141.51.204.207 hrz-gc117.hrz.uni-kassel.de hrz-gc117
141.51.204.208 hrz-gc118.hrz.uni-kassel.de hrz-gc118
141.51.204.209 hrz-gc119.hrz.uni-kassel.de hrz-gc119
141.51.204.210 hrz-gc120.hrz.uni-kassel.de hrz-gc120

# Cluster neu
141.51.204.30 its-cs1.its.uni-kassel.de its-cs1
141.51.204.170 its-cs10.its.uni-kassel.de its-cs10
141.51.204.171 its-cs11.its.uni-kassel.de its-cs11
141.51.204.172 its-cs12.its.uni-kassel.de its-cs12
141.51.204.173 its-cs13.its.uni-kassel.de its-cs13
141.51.204.174 its-cs14.its.uni-kassel.de its-cs14
141.51.204.175 its-cs15.its.uni-kassel.de its-cs15
141.51.204.176 its-cs16.its.uni-kassel.de its-cs16
141.51.204.177 its-cs17.its.uni-kassel.de its-cs17
141.51.204.178 its-cs18.its.uni-kassel.de its-cs18
141.51.204.179 its-cs19.its.uni-kassel.de its-cs19
141.51.205.10 its-cs100.its.uni-kassel.de its-cs100
141.51.205.11 its-cs101.its.uni-kassel.de its-cs101
141.51.205.12 its-cs102.its.uni-kassel.de its-cs102
141.51.205.13 its-cs103.its.uni-kassel.de its-cs103
141.51.205.14 its-cs104.its.uni-kassel.de its-cs104
141.51.205.15 its-cs105.its.uni-kassel.de its-cs105
141.51.205.16 its-cs106.its.uni-kassel.de its-cs106
141.51.205.17 its-cs107.its.uni-kassel.de its-cs107
141.51.205.18 its-cs108.its.uni-kassel.de its-cs108
141.51.205.19 its-cs109.its.uni-kassel.de its-cs109
141.51.205.20 its-cs110.its.uni-kassel.de its-cs110
141.51.205.21 its-cs111.its.uni-kassel.de its-cs111
141.51.205.22 its-cs112.its.uni-kassel.de its-cs112
141.51.205.23 its-cs113.its.uni-kassel.de its-cs113
141.51.205.24 its-cs114.its.uni-kassel.de its-cs114
141.51.205.25 its-cs115.its.uni-kassel.de its-cs115
141.51.205.26 its-cs116.its.uni-kassel.de its-cs116
141.51.205.27 its-cs117.its.uni-kassel.de its-cs117
141.51.205.28 its-cs118.its.uni-kassel.de its-cs118
141.51.205.29 its-cs119.its.uni-kassel.de its-cs119
141.51.205.30 its-cs120.its.uni-kassel.de its-cs120
141.51.205.31 its-cs121.its.uni-kassel.de its-cs121
141.51.205.32 its-cs122.its.uni-kassel.de its-cs122
141.51.205.33 its-cs123.its.uni-kassel.de its-cs123
141.51.205.34 its-cs124.its.uni-kassel.de its-cs124
141.51.205.35 its-cs125.its.uni-kassel.de its-cs125
141.51.205.36 its-cs126.its.uni-kassel.de its-cs126
141.51.205.37 its-cs127.its.uni-kassel.de its-cs127
141.51.205.38 its-cs128.its.uni-kassel.de its-cs128
141.51.205.39 its-cs129.its.uni-kassel.de its-cs129
141.51.205.40 its-cs130.its.uni-kassel.de its-cs130
141.51.205.41 its-cs131.its.uni-kassel.de its-cs131
141.51.205.42 its-cs132.its.uni-kassel.de its-cs132
141.51.205.43 its-cs133.its.uni-kassel.de its-cs133
141.51.205.44 its-cs134.its.uni-kassel.de its-cs134
141.51.205.45 its-cs135.its.uni-kassel.de its-cs135
141.51.205.46 its-cs136.its.uni-kassel.de its-cs136
141.51.205.47 its-cs137.its.uni-kassel.de its-cs137
141.51.205.48 its-cs138.its.uni-kassel.de its-cs138
141.51.205.49 its-cs139.its.uni-kassel.de its-cs139
141.51.205.50 its-cs140.its.uni-kassel.de its-cs140
141.51.205.51 its-cs141.its.uni-kassel.de its-cs141
141.51.205.52 its-cs142.its.uni-kassel.de its-cs142
141.51.205.53 its-cs143.its.uni-kassel.de its-cs143
141.51.205.54 its-cs144.its.uni-kassel.de its-cs144
141.51.205.55 its-cs145.its.uni-kassel.de its-cs145
141.51.205.56 its-cs146.its.uni-kassel.de its-cs146
141.51.205.57 its-cs147.its.uni-kassel.de its-cs147
141.51.205.58 its-cs148.its.uni-kassel.de its-cs148
141.51.205.59 its-cs149.its.uni-kassel.de its-cs149
141.51.205.60 its-cs150.its.uni-kassel.de its-cs150
141.51.205.61 its-cs151.its.uni-kassel.de its-cs151
141.51.205.62 its-cs152.its.uni-kassel.de its-cs152
141.51.205.63 its-cs153.its.uni-kassel.de its-cs153
141.51.205.64 its-cs154.its.uni-kassel.de its-cs154
141.51.205.65 its-cs155.its.uni-kassel.de its-cs155
141.51.205.66 its-cs156.its.uni-kassel.de its-cs156
141.51.205.67 its-cs157.its.uni-kassel.de its-cs157
141.51.205.68 its-cs158.its.uni-kassel.de its-cs158
141.51.205.69 its-cs159.its.uni-kassel.de its-cs159
141.51.205.70 its-cs160.its.uni-kassel.de its-cs160
141.51.205.71 its-cs161.its.uni-kassel.de its-cs161
141.51.205.72 its-cs162.its.uni-kassel.de its-cs162
141.51.205.73 its-cs163.its.uni-kassel.de its-cs163
141.51.205.74 its-cs164.its.uni-kassel.de its-cs164
141.51.205.75 its-cs165.its.uni-kassel.de its-cs165
141.51.205.76 its-cs166.its.uni-kassel.de its-cs166
141.51.205.77 its-cs167.its.uni-kassel.de its-cs167
141.51.205.78 its-cs168.its.uni-kassel.de its-cs168
141.51.205.79 its-cs169.its.uni-kassel.de its-cs169
141.51.205.80 its-cs170.its.uni-kassel.de its-cs170
141.51.205.81 its-cs171.its.uni-kassel.de its-cs171
141.51.205.82 its-cs172.its.uni-kassel.de its-cs172
141.51.205.83 its-cs173.its.uni-kassel.de its-cs173
141.51.205.84 its-cs174.its.uni-kassel.de its-cs174
141.51.205.85 its-cs175.its.uni-kassel.de its-cs175
141.51.205.86 its-cs176.its.uni-kassel.de its-cs176
141.51.205.87 its-cs177.its.uni-kassel.de its-cs177
141.51.205.88 its-cs178.its.uni-kassel.de its-cs178
141.51.205.89 its-cs179.its.uni-kassel.de its-cs179
141.51.205.90 its-cs180.its.uni-kassel.de its-cs180
141.51.205.91 its-cs181.its.uni-kassel.de its-cs181
141.51.205.92 its-cs182.its.uni-kassel.de its-cs182
141.51.205.93 its-cs183.its.uni-kassel.de its-cs183
141.51.205.94 its-cs184.its.uni-kassel.de its-cs184
141.51.205.95 its-cs185.its.uni-kassel.de its-cs185
141.51.205.96 its-cs186.its.uni-kassel.de its-cs186
141.51.205.97 its-cs187.its.uni-kassel.de its-cs187
141.51.205.98 its-cs188.its.uni-kassel.de its-cs188
141.51.205.99 its-cs189.its.uni-kassel.de its-cs189
141.51.205.100 its-cs190.its.uni-kassel.de its-cs190
141.51.205.101 its-cs191.its.uni-kassel.de its-cs191
141.51.205.102 its-cs192.its.uni-kassel.de its-cs192
141.51.205.103 its-cs193.its.uni-kassel.de its-cs193
141.51.205.104 its-cs194.its.uni-kassel.de its-cs194
141.51.205.105 its-cs195.its.uni-kassel.de its-cs195
141.51.205.106 its-cs196.its.uni-kassel.de its-cs196
141.51.205.107 its-cs197.its.uni-kassel.de its-cs197
141.51.205.108 its-cs198.its.uni-kassel.de its-cs198
141.51.205.109 its-cs199.its.uni-kassel.de its-cs199
141.51.205.110 its-cs200.its.uni-kassel.de its-cs200
141.51.205.111 its-cs201.its.uni-kassel.de its-cs201
141.51.205.112 its-cs202.its.uni-kassel.de its-cs202
141.51.205.113 its-cs203.its.uni-kassel.de its-cs203
141.51.205.114 its-cs204.its.uni-kassel.de its-cs204
141.51.205.115 its-cs205.its.uni-kassel.de its-cs205
141.51.205.116 its-cs206.its.uni-kassel.de its-cs206
141.51.205.117 its-cs207.its.uni-kassel.de its-cs207
141.51.205.118 its-cs208.its.uni-kassel.de its-cs208
141.51.205.119 its-cs209.its.uni-kassel.de its-cs209
141.51.205.120 its-cs210.its.uni-kassel.de its-cs210
141.51.205.121 its-cs211.its.uni-kassel.de its-cs211
141.51.205.122 its-cs212.its.uni-kassel.de its-cs212
141.51.205.123 its-cs213.its.uni-kassel.de its-cs213
141.51.205.124 its-cs214.its.uni-kassel.de its-cs214
141.51.205.125 its-cs215.its.uni-kassel.de its-cs215
141.51.205.126 its-cs216.its.uni-kassel.de its-cs216
141.51.205.127 its-cs217.its.uni-kassel.de its-cs217
141.51.205.128 its-cs218.its.uni-kassel.de its-cs218
141.51.205.129 its-cs219.its.uni-kassel.de its-cs219
141.51.205.130 its-cs220.its.uni-kassel.de its-cs220
141.51.205.131 its-cs221.its.uni-kassel.de its-cs221
141.51.205.132 its-cs222.its.uni-kassel.de its-cs222
141.51.205.133 its-cs223.its.uni-kassel.de its-cs223
141.51.205.134 its-cs224.its.uni-kassel.de its-cs224
141.51.205.135 its-cs225.its.uni-kassel.de its-cs225
141.51.205.136 its-cs226.its.uni-kassel.de its-cs226
141.51.205.137 its-cs227.its.uni-kassel.de its-cs227
141.51.205.138 its-cs228.its.uni-kassel.de its-cs228
141.51.205.139 its-cs229.its.uni-kassel.de its-cs229
141.51.205.140 its-cs230.its.uni-kassel.de its-cs230
141.51.205.141 its-cs231.its.uni-kassel.de its-cs231
141.51.205.142 its-cs232.its.uni-kassel.de its-cs232
141.51.205.143 its-cs233.its.uni-kassel.de its-cs233
141.51.205.144 its-cs234.its.uni-kassel.de its-cs234
141.51.205.145 its-cs235.its.uni-kassel.de its-cs235
141.51.205.146 its-cs236.its.uni-kassel.de its-cs236
141.51.205.147 its-cs237.its.uni-kassel.de its-cs237
141.51.205.148 its-cs238.its.uni-kassel.de its-cs238
141.51.205.149 its-cs239.its.uni-kassel.de its-cs239
141.51.205.150 its-cs240.its.uni-kassel.de its-cs240
141.51.205.151 its-cs241.its.uni-kassel.de its-cs241
141.51.205.152 its-cs242.its.uni-kassel.de its-cs242
141.51.205.153 its-cs243.its.uni-kassel.de its-cs243
141.51.205.154 its-cs244.its.uni-kassel.de its-cs244
141.51.205.155 its-cs245.its.uni-kassel.de its-cs245
141.51.205.156 its-cs246.its.uni-kassel.de its-cs246
141.51.205.157 its-cs247.its.uni-kassel.de its-cs247
141.51.205.158 its-cs248.its.uni-kassel.de its-cs248
141.51.205.159 its-cs249.its.uni-kassel.de its-cs249
141.51.205.160 its-cs250.its.uni-kassel.de its-cs250
141.51.205.161 its-cs251.its.uni-kassel.de its-cs251
141.51.205.162 its-cs252.its.uni-kassel.de its-cs252
141.51.205.163 its-cs253.its.uni-kassel.de its-cs253
141.51.205.164 its-cs254.its.uni-kassel.de its-cs254
141.51.205.165 its-cs255.its.uni-kassel.de its-cs255
141.51.205.166 its-cs256.its.uni-kassel.de its-cs256
141.51.205.167 its-cs257.its.uni-kassel.de its-cs257
141.51.205.168 its-cs258.its.uni-kassel.de its-cs258
141.51.205.169 its-cs259.its.uni-kassel.de its-cs259
141.51.205.170 its-cs260.its.uni-kassel.de its-cs260
141.51.205.171 its-cs261.its.uni-kassel.de its-cs261
141.51.205.172 its-cs262.its.uni-kassel.de its-cs262
141.51.205.173 its-cs263.its.uni-kassel.de its-cs263
141.51.205.174 its-cs264.its.uni-kassel.de its-cs264
141.51.205.175 its-cs265.its.uni-kassel.de its-cs265
141.51.205.176 its-cs266.its.uni-kassel.de its-cs266
141.51.205.177 its-cs267.its.uni-kassel.de its-cs267
141.51.205.178 its-cs268.its.uni-kassel.de its-cs268
141.51.205.179 its-cs269.its.uni-kassel.de its-cs269
141.51.205.180 its-cs270.its.uni-kassel.de its-cs270
141.51.205.181 its-cs271.its.uni-kassel.de its-cs271
141.51.205.182 its-cs272.its.uni-kassel.de its-cs272
141.51.205.183 its-cs273.its.uni-kassel.de its-cs273
141.51.205.184 its-cs274.its.uni-kassel.de its-cs274
141.51.205.185 its-cs275.its.uni-kassel.de its-cs275
141.51.205.186 its-cs276.its.uni-kassel.de its-cs276
141.51.205.187 its-cs277.its.uni-kassel.de its-cs277
141.51.205.188 its-cs278.its.uni-kassel.de its-cs278
141.51.205.189 its-cs279.its.uni-kassel.de its-cs279
141.51.205.190 its-cs280.its.uni-kassel.de its-cs280
141.51.205.191 its-cs281.its.uni-kassel.de its-cs281
141.51.205.192 its-cs282.its.uni-kassel.de its-cs282
141.51.205.193 its-cs283.its.uni-kassel.de its-cs283
141.51.205.194 its-cs284.its.uni-kassel.de its-cs284
141.51.205.195 its-cs285.its.uni-kassel.de its-cs285
141.51.205.196 its-cs286.its.uni-kassel.de its-cs286
141.51.205.197 its-cs287.its.uni-kassel.de its-cs287
141.51.205.198 its-cs288.its.uni-kassel.de its-cs288
141.51.205.199 its-cs289.its.uni-kassel.de its-cs289
141.51.205.200 its-cs290.its.uni-kassel.de its-cs290
141.51.205.201 its-cs291.its.uni-kassel.de its-cs291
141.51.205.202 its-cs292.its.uni-kassel.de its-cs292
141.51.205.203 its-cs293.its.uni-kassel.de its-cs293
141.51.205.204 its-cs294.its.uni-kassel.de its-cs294
141.51.205.205 its-cs295.its.uni-kassel.de its-cs295
141.51.205.206 its-cs296.its.uni-kassel.de its-cs296
141.51.205.207 its-cs297.its.uni-kassel.de its-cs297
141.51.205.208 its-cs298.its.uni-kassel.de its-cs298
141.51.205.209 its-cs299.its.uni-kassel.de its-cs299
192.168.204.30 its-no1.its.uni-kassel.de its-no1
192.168.204.170 its-no10.its.uni-kassel.de its-no10
192.168.204.171 its-no11.its.uni-kassel.de its-no11
192.168.204.172 its-no12.its.uni-kassel.de its-no12
192.168.204.173 its-no13.its.uni-kassel.de its-no13
192.168.204.174 its-no14.its.uni-kassel.de its-no14
192.168.204.175 its-no15.its.uni-kassel.de its-no15
192.168.204.176 its-no16.its.uni-kassel.de its-no16
192.168.204.177 its-no17.its.uni-kassel.de its-no17
192.168.204.178 its-no18.its.uni-kassel.de its-no18
192.168.204.179 its-no19.its.uni-kassel.de its-no19
192.168.205.10 its-no100.its.uni-kassel.de its-no100
192.168.205.11 its-no101.its.uni-kassel.de its-no101
192.168.205.12 its-no102.its.uni-kassel.de its-no102
192.168.205.13 its-no103.its.uni-kassel.de its-no103
192.168.205.14 its-no104.its.uni-kassel.de its-no104
192.168.205.15 its-no105.its.uni-kassel.de its-no105
192.168.205.16 its-no106.its.uni-kassel.de its-no106
192.168.205.17 its-no107.its.uni-kassel.de its-no107
192.168.205.18 its-no108.its.uni-kassel.de its-no108
192.168.205.19 its-no109.its.uni-kassel.de its-no109
192.168.205.20 its-no110.its.uni-kassel.de its-no110
192.168.205.21 its-no111.its.uni-kassel.de its-no111
192.168.205.22 its-no112.its.uni-kassel.de its-no112
192.168.205.23 its-no113.its.uni-kassel.de its-no113
192.168.205.24 its-no114.its.uni-kassel.de its-no114
192.168.205.25 its-no115.its.uni-kassel.de its-no115
192.168.205.26 its-no116.its.uni-kassel.de its-no116
192.168.205.27 its-no117.its.uni-kassel.de its-no117
192.168.205.28 its-no118.its.uni-kassel.de its-no118
192.168.205.29 its-no119.its.uni-kassel.de its-no119
192.168.205.30 its-no120.its.uni-kassel.de its-no120
192.168.205.31 its-no121.its.uni-kassel.de its-no121
192.168.205.32 its-no122.its.uni-kassel.de its-no122
192.168.205.33 its-no123.its.uni-kassel.de its-no123
192.168.205.34 its-no124.its.uni-kassel.de its-no124
192.168.205.35 its-no125.its.uni-kassel.de its-no125
192.168.205.36 its-no126.its.uni-kassel.de its-no126
192.168.205.37 its-no127.its.uni-kassel.de its-no127
192.168.205.38 its-no128.its.uni-kassel.de its-no128
192.168.205.39 its-no129.its.uni-kassel.de its-no129
192.168.205.40 its-no130.its.uni-kassel.de its-no130
192.168.205.41 its-no131.its.uni-kassel.de its-no131
192.168.205.42 its-no132.its.uni-kassel.de its-no132
192.168.205.43 its-no133.its.uni-kassel.de its-no133
192.168.205.44 its-no134.its.uni-kassel.de its-no134
192.168.205.45 its-no135.its.uni-kassel.de its-no135
192.168.205.46 its-no136.its.uni-kassel.de its-no136
192.168.205.47 its-no137.its.uni-kassel.de its-no137
192.168.205.48 its-no138.its.uni-kassel.de its-no138
192.168.205.49 its-no139.its.uni-kassel.de its-no139
192.168.205.50 its-no140.its.uni-kassel.de its-no140
192.168.205.51 its-no141.its.uni-kassel.de its-no141
192.168.205.52 its-no142.its.uni-kassel.de its-no142
192.168.205.53 its-no143.its.uni-kassel.de its-no143
192.168.205.54 its-no144.its.uni-kassel.de its-no144
192.168.205.55 its-no145.its.uni-kassel.de its-no145
192.168.205.56 its-no146.its.uni-kassel.de its-no146
192.168.205.57 its-no147.its.uni-kassel.de its-no147
192.168.205.58 its-no148.its.uni-kassel.de its-no148
192.168.205.59 its-no149.its.uni-kassel.de its-no149
192.168.205.60 its-no150.its.uni-kassel.de its-no150
192.168.205.61 its-no151.its.uni-kassel.de its-no151
192.168.205.62 its-no152.its.uni-kassel.de its-no152
192.168.205.63 its-no153.its.uni-kassel.de its-no153
192.168.205.64 its-no154.its.uni-kassel.de its-no154
192.168.205.65 its-no155.its.uni-kassel.de its-no155
192.168.205.66 its-no156.its.uni-kassel.de its-no156
192.168.205.67 its-no157.its.uni-kassel.de its-no157
192.168.205.68 its-no158.its.uni-kassel.de its-no158
192.168.205.69 its-no159.its.uni-kassel.de its-no159
192.168.205.70 its-no160.its.uni-kassel.de its-no160
192.168.205.71 its-no161.its.uni-kassel.de its-no161
192.168.205.72 its-no162.its.uni-kassel.de its-no162
192.168.205.73 its-no163.its.uni-kassel.de its-no163
192.168.205.74 its-no164.its.uni-kassel.de its-no164
192.168.205.75 its-no165.its.uni-kassel.de its-no165
192.168.205.76 its-no166.its.uni-kassel.de its-no166
192.168.205.77 its-no167.its.uni-kassel.de its-no167
192.168.205.78 its-no168.its.uni-kassel.de its-no168
192.168.205.79 its-no169.its.uni-kassel.de its-no169
192.168.205.80 its-no170.its.uni-kassel.de its-no170
192.168.205.81 its-no171.its.uni-kassel.de its-no171
192.168.205.82 its-no172.its.uni-kassel.de its-no172
192.168.205.83 its-no173.its.uni-kassel.de its-no173
192.168.205.84 its-no174.its.uni-kassel.de its-no174
192.168.205.85 its-no175.its.uni-kassel.de its-no175
192.168.205.86 its-no176.its.uni-kassel.de its-no176
192.168.205.87 its-no177.its.uni-kassel.de its-no177
192.168.205.88 its-no178.its.uni-kassel.de its-no178
192.168.205.89 its-no179.its.uni-kassel.de its-no179
192.168.205.90 its-no180.its.uni-kassel.de its-no180
192.168.205.91 its-no181.its.uni-kassel.de its-no181
192.168.205.92 its-no182.its.uni-kassel.de its-no182
192.168.205.93 its-no183.its.uni-kassel.de its-no183
192.168.205.94 its-no184.its.uni-kassel.de its-no184
192.168.205.95 its-no185.its.uni-kassel.de its-no185
192.168.205.96 its-no186.its.uni-kassel.de its-no186
192.168.205.97 its-no187.its.uni-kassel.de its-no187
192.168.205.98 its-no188.its.uni-kassel.de its-no188
192.168.205.99 its-no189.its.uni-kassel.de its-no189
192.168.205.100 its-no190.its.uni-kassel.de its-no190
192.168.205.101 its-no191.its.uni-kassel.de its-no191
192.168.205.102 its-no192.its.uni-kassel.de its-no192
192.168.205.103 its-no193.its.uni-kassel.de its-no193
192.168.205.104 its-no194.its.uni-kassel.de its-no194
192.168.205.105 its-no195.its.uni-kassel.de its-no195
192.168.205.106 its-no196.its.uni-kassel.de its-no196
192.168.205.107 its-no197.its.uni-kassel.de its-no197
192.168.205.108 its-no198.its.uni-kassel.de its-no198
192.168.205.109 its-no199.its.uni-kassel.de its-no199
192.168.205.110 its-no200.its.uni-kassel.de its-no200
192.168.205.111 its-no201.its.uni-kassel.de its-no201
192.168.205.112 its-no202.its.uni-kassel.de its-no202
192.168.205.113 its-no203.its.uni-kassel.de its-no203
192.168.205.114 its-no204.its.uni-kassel.de its-no204
192.168.205.115 its-no205.its.uni-kassel.de its-no205
192.168.205.116 its-no206.its.uni-kassel.de its-no206
192.168.205.117 its-no207.its.uni-kassel.de its-no207
192.168.205.118 its-no208.its.uni-kassel.de its-no208
192.168.205.119 its-no209.its.uni-kassel.de its-no209
192.168.205.120 its-no210.its.uni-kassel.de its-no210
192.168.205.121 its-no211.its.uni-kassel.de its-no211
192.168.205.122 its-no212.its.uni-kassel.de its-no212
192.168.205.123 its-no213.its.uni-kassel.de its-no213
192.168.205.124 its-no214.its.uni-kassel.de its-no214
192.168.205.125 its-no215.its.uni-kassel.de its-no215
192.168.205.126 its-no216.its.uni-kassel.de its-no216
192.168.205.127 its-no217.its.uni-kassel.de its-no217
192.168.205.128 its-no218.its.uni-kassel.de its-no218
192.168.205.129 its-no219.its.uni-kassel.de its-no219
192.168.205.130 its-no220.its.uni-kassel.de its-no220
192.168.205.131 its-no221.its.uni-kassel.de its-no221
192.168.205.132 its-no222.its.uni-kassel.de its-no222
192.168.205.133 its-no223.its.uni-kassel.de its-no223
192.168.205.134 its-no224.its.uni-kassel.de its-no224
192.168.205.135 its-no225.its.uni-kassel.de its-no225
192.168.205.136 its-no226.its.uni-kassel.de its-no226
192.168.205.137 its-no227.its.uni-kassel.de its-no227
192.168.205.138 its-no228.its.uni-kassel.de its-no228
192.168.205.139 its-no229.its.uni-kassel.de its-no229
192.168.205.140 its-no230.its.uni-kassel.de its-no230
192.168.205.141 its-no231.its.uni-kassel.de its-no231
192.168.205.142 its-no232.its.uni-kassel.de its-no232
192.168.205.143 its-no233.its.uni-kassel.de its-no233
192.168.205.144 its-no234.its.uni-kassel.de its-no234
192.168.205.145 its-no235.its.uni-kassel.de its-no235
192.168.205.146 its-no236.its.uni-kassel.de its-no236
192.168.205.147 its-no237.its.uni-kassel.de its-no237
192.168.205.148 its-no238.its.uni-kassel.de its-no238
192.168.205.149 its-no239.its.uni-kassel.de its-no239
192.168.205.150 its-no240.its.uni-kassel.de its-no240
192.168.205.151 its-no241.its.uni-kassel.de its-no241
192.168.205.152 its-no242.its.uni-kassel.de its-no242
192.168.205.153 its-no243.its.uni-kassel.de its-no243
192.168.205.154 its-no244.its.uni-kassel.de its-no244
192.168.205.155 its-no245.its.uni-kassel.de its-no245
192.168.205.156 its-no246.its.uni-kassel.de its-no246
192.168.205.157 its-no247.its.uni-kassel.de its-no247
192.168.205.158 its-no248.its.uni-kassel.de its-no248
192.168.205.159 its-no249.its.uni-kassel.de its-no249
192.168.205.160 its-no250.its.uni-kassel.de its-no250
192.168.205.161 its-no251.its.uni-kassel.de its-no251
192.168.205.162 its-no252.its.uni-kassel.de its-no252
192.168.205.163 its-no253.its.uni-kassel.de its-no253
192.168.205.164 its-no254.its.uni-kassel.de its-no254
192.168.205.165 its-no255.its.uni-kassel.de its-no255
192.168.205.166 its-no256.its.uni-kassel.de its-no256
192.168.205.167 its-no257.its.uni-kassel.de its-no257
192.168.205.168 its-no258.its.uni-kassel.de its-no258
192.168.205.169 its-no259.its.uni-kassel.de its-no259
192.168.205.170 its-no260.its.uni-kassel.de its-no260
192.168.205.171 its-no261.its.uni-kassel.de its-no261
192.168.205.172 its-no262.its.uni-kassel.de its-no262
192.168.205.173 its-no263.its.uni-kassel.de its-no263
192.168.205.174 its-no264.its.uni-kassel.de its-no264
192.168.205.175 its-no265.its.uni-kassel.de its-no265
192.168.205.176 its-no266.its.uni-kassel.de its-no266
192.168.205.177 its-no267.its.uni-kassel.de its-no267
192.168.205.178 its-no268.its.uni-kassel.de its-no268
192.168.205.179 its-no269.its.uni-kassel.de its-no269
192.168.205.180 its-no270.its.uni-kassel.de its-no270
192.168.205.181 its-no271.its.uni-kassel.de its-no271
192.168.205.182 its-no272.its.uni-kassel.de its-no272
192.168.205.183 its-no273.its.uni-kassel.de its-no273
192.168.205.184 its-no274.its.uni-kassel.de its-no274
192.168.205.185 its-no275.its.uni-kassel.de its-no275
192.168.205.186 its-no276.its.uni-kassel.de its-no276
192.168.205.187 its-no277.its.uni-kassel.de its-no277
192.168.205.188 its-no278.its.uni-kassel.de its-no278
192.168.205.189 its-no279.its.uni-kassel.de its-no279
192.168.205.190 its-no280.its.uni-kassel.de its-no280
192.168.205.191 its-no281.its.uni-kassel.de its-no281
192.168.205.192 its-no282.its.uni-kassel.de its-no282
192.168.205.193 its-no283.its.uni-kassel.de its-no283
192.168.205.194 its-no284.its.uni-kassel.de its-no284
192.168.205.195 its-no285.its.uni-kassel.de its-no285
192.168.205.196 its-no286.its.uni-kassel.de its-no286
192.168.205.197 its-no287.its.uni-kassel.de its-no287
192.168.205.198 its-no288.its.uni-kassel.de its-no288
192.168.205.199 its-no289.its.uni-kassel.de its-no289
192.168.205.200 its-no290.its.uni-kassel.de its-no290
192.168.205.201 its-no291.its.uni-kassel.de its-no291
192.168.205.202 its-no292.its.uni-kassel.de its-no292
192.168.205.203 its-no293.its.uni-kassel.de its-no293
192.168.205.204 its-no294.its.uni-kassel.de its-no294
192.168.205.205 its-no295.its.uni-kassel.de its-no295
192.168.205.206 its-no296.its.uni-kassel.de its-no296
192.168.205.207 its-no297.its.uni-kassel.de its-no297
192.168.205.208 its-no298.its.uni-kassel.de its-no298
192.168.205.209 its-no299.its.uni-kassel.de its-no299
192.168.168.30 its-ib1.its.uni-kassel.de its-ib1
192.168.168.170 its-ib10.its.uni-kassel.de its-ib10
192.168.168.171 its-ib11.its.uni-kassel.de its-ib11
192.168.168.172 its-ib12.its.uni-kassel.de its-ib12
192.168.168.173 its-ib13.its.uni-kassel.de its-ib13
192.168.168.174 its-ib14.its.uni-kassel.de its-ib14
192.168.168.175 its-ib15.its.uni-kassel.de its-ib15
192.168.168.176 its-ib16.its.uni-kassel.de its-ib16
192.168.168.177 its-ib17.its.uni-kassel.de its-ib17
192.168.168.178 its-ib18.its.uni-kassel.de its-ib18
192.168.168.179 its-ib19.its.uni-kassel.de its-ib19
192.168.169.10 its-ib100.its.uni-kassel.de its-ib100
192.168.169.11 its-ib101.its.uni-kassel.de its-ib101
192.168.169.12 its-ib102.its.uni-kassel.de its-ib102
192.168.169.13 its-ib103.its.uni-kassel.de its-ib103
192.168.169.14 its-ib104.its.uni-kassel.de its-ib104
192.168.169.15 its-ib105.its.uni-kassel.de its-ib105
192.168.169.16 its-ib106.its.uni-kassel.de its-ib106
192.168.169.17 its-ib107.its.uni-kassel.de its-ib107
192.168.169.18 its-ib108.its.uni-kassel.de its-ib108
192.168.169.19 its-ib109.its.uni-kassel.de its-ib109
192.168.169.20 its-ib110.its.uni-kassel.de its-ib110
192.168.169.21 its-ib111.its.uni-kassel.de its-ib111
192.168.169.22 its-ib112.its.uni-kassel.de its-ib112
192.168.169.23 its-ib113.its.uni-kassel.de its-ib113
192.168.169.24 its-ib114.its.uni-kassel.de its-ib114
192.168.169.25 its-ib115.its.uni-kassel.de its-ib115
192.168.169.26 its-ib116.its.uni-kassel.de its-ib116
192.168.169.27 its-ib117.its.uni-kassel.de its-ib117
192.168.169.28 its-ib118.its.uni-kassel.de its-ib118
192.168.169.29 its-ib119.its.uni-kassel.de its-ib119
192.168.169.30 its-ib120.its.uni-kassel.de its-ib120
192.168.169.31 its-ib121.its.uni-kassel.de its-ib121
192.168.169.32 its-ib122.its.uni-kassel.de its-ib122
192.168.169.33 its-ib123.its.uni-kassel.de its-ib123
192.168.169.34 its-ib124.its.uni-kassel.de its-ib124
192.168.169.35 its-ib125.its.uni-kassel.de its-ib125
192.168.169.36 its-ib126.its.uni-kassel.de its-ib126
192.168.169.37 its-ib127.its.uni-kassel.de its-ib127
192.168.169.38 its-ib128.its.uni-kassel.de its-ib128
192.168.169.39 its-ib129.its.uni-kassel.de its-ib129
192.168.169.40 its-ib130.its.uni-kassel.de its-ib130
192.168.169.41 its-ib131.its.uni-kassel.de its-ib131
192.168.169.42 its-ib132.its.uni-kassel.de its-ib132
192.168.169.43 its-ib133.its.uni-kassel.de its-ib133
192.168.169.44 its-ib134.its.uni-kassel.de its-ib134
192.168.169.45 its-ib135.its.uni-kassel.de its-ib135
192.168.169.46 its-ib136.its.uni-kassel.de its-ib136
192.168.169.47 its-ib137.its.uni-kassel.de its-ib137
192.168.169.48 its-ib138.its.uni-kassel.de its-ib138
192.168.169.49 its-ib139.its.uni-kassel.de its-ib139
192.168.169.50 its-ib140.its.uni-kassel.de its-ib140
192.168.169.51 its-ib141.its.uni-kassel.de its-ib141
192.168.169.52 its-ib142.its.uni-kassel.de its-ib142
192.168.169.53 its-ib143.its.uni-kassel.de its-ib143
192.168.169.54 its-ib144.its.uni-kassel.de its-ib144
192.168.169.55 its-ib145.its.uni-kassel.de its-ib145
192.168.169.56 its-ib146.its.uni-kassel.de its-ib146
192.168.169.57 its-ib147.its.uni-kassel.de its-ib147
192.168.169.58 its-ib148.its.uni-kassel.de its-ib148
192.168.169.59 its-ib149.its.uni-kassel.de its-ib149
192.168.169.60 its-ib150.its.uni-kassel.de its-ib150
192.168.169.61 its-ib151.its.uni-kassel.de its-ib151
192.168.169.62 its-ib152.its.uni-kassel.de its-ib152
192.168.169.63 its-ib153.its.uni-kassel.de its-ib153
192.168.169.64 its-ib154.its.uni-kassel.de its-ib154
192.168.169.65 its-ib155.its.uni-kassel.de its-ib155
192.168.169.66 its-ib156.its.uni-kassel.de its-ib156
192.168.169.67 its-ib157.its.uni-kassel.de its-ib157
192.168.169.68 its-ib158.its.uni-kassel.de its-ib158
192.168.169.69 its-ib159.its.uni-kassel.de its-ib159
192.168.169.70 its-ib160.its.uni-kassel.de its-ib160
192.168.169.71 its-ib161.its.uni-kassel.de its-ib161
192.168.169.72 its-ib162.its.uni-kassel.de its-ib162
192.168.169.73 its-ib163.its.uni-kassel.de its-ib163
192.168.169.74 its-ib164.its.uni-kassel.de its-ib164
192.168.169.75 its-ib165.its.uni-kassel.de its-ib165
192.168.169.76 its-ib166.its.uni-kassel.de its-ib166
192.168.169.77 its-ib167.its.uni-kassel.de its-ib167
192.168.169.78 its-ib168.its.uni-kassel.de its-ib168
192.168.169.79 its-ib169.its.uni-kassel.de its-ib169
192.168.169.80 its-ib170.its.uni-kassel.de its-ib170
192.168.169.81 its-ib171.its.uni-kassel.de its-ib171
192.168.169.82 its-ib172.its.uni-kassel.de its-ib172
192.168.169.83 its-ib173.its.uni-kassel.de its-ib173
192.168.169.84 its-ib174.its.uni-kassel.de its-ib174
192.168.169.85 its-ib175.its.uni-kassel.de its-ib175
192.168.169.86 its-ib176.its.uni-kassel.de its-ib176
192.168.169.87 its-ib177.its.uni-kassel.de its-ib177
192.168.169.88 its-ib178.its.uni-kassel.de its-ib178
192.168.169.89 its-ib179.its.uni-kassel.de its-ib179
192.168.169.90 its-ib180.its.uni-kassel.de its-ib180
192.168.169.91 its-ib181.its.uni-kassel.de its-ib181
192.168.169.92 its-ib182.its.uni-kassel.de its-ib182
192.168.169.93 its-ib183.its.uni-kassel.de its-ib183
192.168.169.94 its-ib184.its.uni-kassel.de its-ib184
192.168.169.95 its-ib185.its.uni-kassel.de its-ib185
192.168.169.96 its-ib186.its.uni-kassel.de its-ib186
192.168.169.97 its-ib187.its.uni-kassel.de its-ib187
192.168.169.98 its-ib188.its.uni-kassel.de its-ib188
192.168.169.99 its-ib189.its.uni-kassel.de its-ib189
192.168.169.100 its-ib190.its.uni-kassel.de its-ib190
192.168.169.101 its-ib191.its.uni-kassel.de its-ib191
192.168.169.102 its-ib192.its.uni-kassel.de its-ib192
192.168.169.103 its-ib193.its.uni-kassel.de its-ib193
192.168.169.104 its-ib194.its.uni-kassel.de its-ib194
192.168.169.105 its-ib195.its.uni-kassel.de its-ib195
192.168.169.106 its-ib196.its.uni-kassel.de its-ib196
192.168.169.107 its-ib197.its.uni-kassel.de its-ib197
192.168.169.108 its-ib198.its.uni-kassel.de its-ib198
192.168.169.109 its-ib199.its.uni-kassel.de its-ib199
192.168.169.110 its-ib200.its.uni-kassel.de its-ib200
192.168.169.111 its-ib201.its.uni-kassel.de its-ib201
192.168.169.112 its-ib202.its.uni-kassel.de its-ib202
192.168.169.113 its-ib203.its.uni-kassel.de its-ib203
192.168.169.114 its-ib204.its.uni-kassel.de its-ib204
192.168.169.115 its-ib205.its.uni-kassel.de its-ib205
192.168.169.116 its-ib206.its.uni-kassel.de its-ib206
192.168.169.117 its-ib207.its.uni-kassel.de its-ib207
192.168.169.118 its-ib208.its.uni-kassel.de its-ib208
192.168.169.119 its-ib209.its.uni-kassel.de its-ib209
192.168.169.120 its-ib210.its.uni-kassel.de its-ib210
192.168.169.121 its-ib211.its.uni-kassel.de its-ib211
192.168.169.122 its-ib212.its.uni-kassel.de its-ib212
192.168.169.123 its-ib213.its.uni-kassel.de its-ib213
192.168.169.124 its-ib214.its.uni-kassel.de its-ib214
192.168.169.125 its-ib215.its.uni-kassel.de its-ib215
192.168.169.126 its-ib216.its.uni-kassel.de its-ib216
192.168.169.127 its-ib217.its.uni-kassel.de its-ib217
192.168.169.128 its-ib218.its.uni-kassel.de its-ib218
192.168.169.129 its-ib219.its.uni-kassel.de its-ib219
192.168.169.130 its-ib220.its.uni-kassel.de its-ib220
192.168.169.131 its-ib221.its.uni-kassel.de its-ib221
192.168.169.132 its-ib222.its.uni-kassel.de its-ib222
192.168.169.133 its-ib223.its.uni-kassel.de its-ib223
192.168.169.134 its-ib224.its.uni-kassel.de its-ib224
192.168.169.135 its-ib225.its.uni-kassel.de its-ib225
192.168.169.136 its-ib226.its.uni-kassel.de its-ib226
192.168.169.137 its-ib227.its.uni-kassel.de its-ib227
192.168.169.138 its-ib228.its.uni-kassel.de its-ib228
192.168.169.139 its-ib229.its.uni-kassel.de its-ib229
192.168.169.140 its-ib230.its.uni-kassel.de its-ib230
192.168.169.141 its-ib231.its.uni-kassel.de its-ib231
192.168.169.142 its-ib232.its.uni-kassel.de its-ib232
192.168.169.143 its-ib233.its.uni-kassel.de its-ib233
192.168.169.144 its-ib234.its.uni-kassel.de its-ib234
192.168.169.145 its-ib235.its.uni-kassel.de its-ib235
192.168.169.146 its-ib236.its.uni-kassel.de its-ib236
192.168.169.147 its-ib237.its.uni-kassel.de its-ib237
192.168.169.148 its-ib238.its.uni-kassel.de its-ib238
192.168.169.149 its-ib239.its.uni-kassel.de its-ib239
192.168.169.150 its-ib240.its.uni-kassel.de its-ib240
192.168.169.151 its-ib241.its.uni-kassel.de its-ib241
192.168.169.152 its-ib242.its.uni-kassel.de its-ib242
192.168.169.153 its-ib243.its.uni-kassel.de its-ib243
192.168.169.154 its-ib244.its.uni-kassel.de its-ib244
192.168.169.155 its-ib245.its.uni-kassel.de its-ib245
192.168.169.156 its-ib246.its.uni-kassel.de its-ib246
192.168.169.157 its-ib247.its.uni-kassel.de its-ib247
192.168.169.158 its-ib248.its.uni-kassel.de its-ib248
192.168.169.159 its-ib249.its.uni-kassel.de its-ib249
192.168.169.160 its-ib250.its.uni-kassel.de its-ib250
192.168.169.161 its-ib251.its.uni-kassel.de its-ib251
192.168.169.162 its-ib252.its.uni-kassel.de its-ib252
192.168.169.163 its-ib253.its.uni-kassel.de its-ib253
192.168.169.164 its-ib254.its.uni-kassel.de its-ib254
192.168.169.165 its-ib255.its.uni-kassel.de its-ib255
192.168.169.166 its-ib256.its.uni-kassel.de its-ib256
192.168.169.167 its-ib257.its.uni-kassel.de its-ib257
192.168.169.168 its-ib258.its.uni-kassel.de its-ib258
192.168.169.169 its-ib259.its.uni-kassel.de its-ib259
192.168.169.170 its-ib260.its.uni-kassel.de its-ib260
192.168.169.171 its-ib261.its.uni-kassel.de its-ib261
192.168.169.172 its-ib262.its.uni-kassel.de its-ib262
192.168.169.173 its-ib263.its.uni-kassel.de its-ib263
192.168.169.174 its-ib264.its.uni-kassel.de its-ib264
192.168.169.175 its-ib265.its.uni-kassel.de its-ib265
192.168.169.176 its-ib266.its.uni-kassel.de its-ib266
192.168.169.177 its-ib267.its.uni-kassel.de its-ib267
192.168.169.178 its-ib268.its.uni-kassel.de its-ib268
192.168.169.179 its-ib269.its.uni-kassel.de its-ib269
192.168.169.180 its-ib270.its.uni-kassel.de its-ib270
192.168.169.181 its-ib271.its.uni-kassel.de its-ib271
192.168.169.182 its-ib272.its.uni-kassel.de its-ib272
192.168.169.183 its-ib273.its.uni-kassel.de its-ib273
192.168.169.184 its-ib274.its.uni-kassel.de its-ib274
192.168.169.185 its-ib275.its.uni-kassel.de its-ib275
192.168.169.186 its-ib276.its.uni-kassel.de its-ib276
192.168.169.187 its-ib277.its.uni-kassel.de its-ib277
192.168.169.188 its-ib278.its.uni-kassel.de its-ib278
192.168.169.189 its-ib279.its.uni-kassel.de its-ib279
192.168.169.190 its-ib280.its.uni-kassel.de its-ib280
192.168.169.191 its-ib281.its.uni-kassel.de its-ib281
192.168.169.192 its-ib282.its.uni-kassel.de its-ib282
192.168.169.193 its-ib283.its.uni-kassel.de its-ib283
192.168.169.194 its-ib284.its.uni-kassel.de its-ib284
192.168.169.195 its-ib285.its.uni-kassel.de its-ib285
192.168.169.196 its-ib286.its.uni-kassel.de its-ib286
192.168.169.197 its-ib287.its.uni-kassel.de its-ib287
192.168.169.198 its-ib288.its.uni-kassel.de its-ib288
192.168.169.199 its-ib289.its.uni-kassel.de its-ib289
192.168.169.200 its-ib290.its.uni-kassel.de its-ib290
192.168.169.201 its-ib291.its.uni-kassel.de its-ib291
192.168.169.202 its-ib292.its.uni-kassel.de its-ib292
192.168.169.203 its-ib293.its.uni-kassel.de its-ib293
192.168.169.204 its-ib294.its.uni-kassel.de its-ib294
192.168.169.205 its-ib295.its.uni-kassel.de its-ib295
192.168.169.206 its-ib296.its.uni-kassel.de its-ib296
192.168.169.207 its-ib297.its.uni-kassel.de its-ib297
192.168.169.208 its-ib298.its.uni-kassel.de its-ib298
192.168.169.209 its-ib299.its.uni-kassel.de its-ib299
141.51.205.210 its-cs300.its.uni-kassel.de its-cs300
141.51.205.211 its-cs301.its.uni-kassel.de its-cs301
141.51.205.212 its-cs302.its.uni-kassel.de its-cs302
141.51.205.213 its-cs303.its.uni-kassel.de its-cs303
141.51.205.214 its-cs304.its.uni-kassel.de its-cs304
141.51.205.215 its-cs305.its.uni-kassel.de its-cs305
141.51.205.216 its-cs306.its.uni-kassel.de its-cs306
141.51.205.217 its-cs307.its.uni-kassel.de its-cs307
141.51.205.218 its-cs308.its.uni-kassel.de its-cs308
141.51.205.219 its-cs309.its.uni-kassel.de its-cs309
141.51.205.220 its-cs310.its.uni-kassel.de its-cs310
141.51.205.221 its-cs311.its.uni-kassel.de its-cs311
141.51.205.222 its-cs312.its.uni-kassel.de its-cs312
141.51.205.223 its-cs313.its.uni-kassel.de its-cs313
141.51.205.224 its-cs314.its.uni-kassel.de its-cs314
141.51.205.225 its-cs315.its.uni-kassel.de its-cs315
141.51.205.226 its-cs316.its.uni-kassel.de its-cs316
141.51.205.227 its-cs317.its.uni-kassel.de its-cs317
141.51.205.228 its-cs318.its.uni-kassel.de its-cs318
141.51.205.229 its-cs319.its.uni-kassel.de its-cs319
141.51.205.230 its-cs320.its.uni-kassel.de its-cs320
141.51.205.231 its-cs321.its.uni-kassel.de its-cs321
141.51.205.232 its-cs322.its.uni-kassel.de its-cs322
141.51.205.233 its-cs323.its.uni-kassel.de its-cs323
141.51.205.234 its-cs324.its.uni-kassel.de its-cs324
141.51.205.235 its-cs325.its.uni-kassel.de its-cs325
141.51.205.236 its-cs326.its.uni-kassel.de its-cs326
141.51.205.237 its-cs327.its.uni-kassel.de its-cs327
141.51.205.238 its-cs328.its.uni-kassel.de its-cs328
141.51.205.239 its-cs329.its.uni-kassel.de its-cs329
141.51.205.240 its-cs330.its.uni-kassel.de its-cs330
141.51.205.241 its-cs331.its.uni-kassel.de its-cs331
141.51.205.242 its-cs332.its.uni-kassel.de its-cs332
141.51.205.243 its-cs333.its.uni-kassel.de its-cs333
141.51.205.244 its-cs334.its.uni-kassel.de its-cs334
141.51.205.245 its-cs335.its.uni-kassel.de its-cs335
141.51.205.246 its-cs336.its.uni-kassel.de its-cs336
141.51.205.247 its-cs337.its.uni-kassel.de its-cs337
141.51.205.248 its-cs338.its.uni-kassel.de its-cs338
141.51.205.249 its-cs339.its.uni-kassel.de its-cs339
141.51.205.250 its-cs340.its.uni-kassel.de its-cs340
141.51.205.251 its-cs341.its.uni-kassel.de its-cs341
141.51.205.252 its-cs342.its.uni-kassel.de its-cs342
141.51.205.253 its-cs343.its.uni-kassel.de its-cs343
141.51.205.254 its-cs344.its.uni-kassel.de its-cs344
192.168.205.210 its-no300.its.uni-kassel.de its-no300
192.168.205.211 its-no301.its.uni-kassel.de its-no301
192.168.205.212 its-no302.its.uni-kassel.de its-no302
192.168.205.213 its-no303.its.uni-kassel.de its-no303
192.168.205.214 its-no304.its.uni-kassel.de its-no304
192.168.205.215 its-no305.its.uni-kassel.de its-no305
192.168.205.216 its-no306.its.uni-kassel.de its-no306
192.168.205.217 its-no307.its.uni-kassel.de its-no307
192.168.205.218 its-no308.its.uni-kassel.de its-no308
192.168.205.219 its-no309.its.uni-kassel.de its-no309
192.168.205.220 its-no310.its.uni-kassel.de its-no310
192.168.205.221 its-no311.its.uni-kassel.de its-no311
192.168.205.222 its-no312.its.uni-kassel.de its-no312
192.168.205.223 its-no313.its.uni-kassel.de its-no313
192.168.205.224 its-no314.its.uni-kassel.de its-no314
192.168.205.225 its-no315.its.uni-kassel.de its-no315
192.168.205.226 its-no316.its.uni-kassel.de its-no316
192.168.205.227 its-no317.its.uni-kassel.de its-no317
192.168.205.228 its-no318.its.uni-kassel.de its-no318
192.168.205.229 its-no319.its.uni-kassel.de its-no319
192.168.205.230 its-no320.its.uni-kassel.de its-no320
192.168.205.231 its-no321.its.uni-kassel.de its-no321
192.168.205.232 its-no322.its.uni-kassel.de its-no322
192.168.205.233 its-no323.its.uni-kassel.de its-no323
192.168.205.234 its-no324.its.uni-kassel.de its-no324
192.168.205.235 its-no325.its.uni-kassel.de its-no325
192.168.205.236 its-no326.its.uni-kassel.de its-no326
192.168.205.237 its-no327.its.uni-kassel.de its-no327
192.168.205.238 its-no328.its.uni-kassel.de its-no328
192.168.205.239 its-no329.its.uni-kassel.de its-no329
192.168.205.240 its-no330.its.uni-kassel.de its-no330
192.168.205.241 its-no331.its.uni-kassel.de its-no331
192.168.205.242 its-no332.its.uni-kassel.de its-no332
192.168.205.243 its-no333.its.uni-kassel.de its-no333
192.168.205.244 its-no334.its.uni-kassel.de its-no334
192.168.205.245 its-no335.its.uni-kassel.de its-no335
192.168.205.246 its-no336.its.uni-kassel.de its-no336
192.168.205.247 its-no337.its.uni-kassel.de its-no337
192.168.205.248 its-no338.its.uni-kassel.de its-no338
192.168.205.249 its-no339.its.uni-kassel.de its-no339
192.168.205.250 its-no340.its.uni-kassel.de its-no340
192.168.205.251 its-no341.its.uni-kassel.de its-no341
192.168.205.252 its-no342.its.uni-kassel.de its-no342
192.168.205.253 its-no343.its.uni-kassel.de its-no343
192.168.205.254 its-no344.its.uni-kassel.de its-no344
192.168.169.210 its-ib300.its.uni-kassel.de its-ib300
192.168.169.211 its-ib301.its.uni-kassel.de its-ib301
192.168.169.212 its-ib302.its.uni-kassel.de its-ib302
192.168.169.213 its-ib303.its.uni-kassel.de its-ib303
192.168.169.214 its-ib304.its.uni-kassel.de its-ib304
192.168.169.215 its-ib305.its.uni-kassel.de its-ib305
192.168.169.216 its-ib306.its.uni-kassel.de its-ib306
192.168.169.217 its-ib307.its.uni-kassel.de its-ib307
192.168.169.218 its-ib308.its.uni-kassel.de its-ib308
192.168.169.219 its-ib309.its.uni-kassel.de its-ib309
192.168.169.220 its-ib310.its.uni-kassel.de its-ib310
192.168.169.221 its-ib311.its.uni-kassel.de its-ib311
192.168.169.222 its-ib312.its.uni-kassel.de its-ib312
192.168.169.223 its-ib313.its.uni-kassel.de its-ib313
192.168.169.224 its-ib314.its.uni-kassel.de its-ib314
192.168.169.225 its-ib315.its.uni-kassel.de its-ib315
192.168.169.226 its-ib316.its.uni-kassel.de its-ib316
192.168.169.227 its-ib317.its.uni-kassel.de its-ib317
192.168.169.228 its-ib318.its.uni-kassel.de its-ib318
192.168.169.229 its-ib319.its.uni-kassel.de its-ib319
192.168.169.230 its-ib320.its.uni-kassel.de its-ib320
192.168.169.231 its-ib321.its.uni-kassel.de its-ib321
192.168.169.232 its-ib322.its.uni-kassel.de its-ib322
192.168.169.233 its-ib323.its.uni-kassel.de its-ib323
192.168.169.234 its-ib324.its.uni-kassel.de its-ib324
192.168.169.235 its-ib325.its.uni-kassel.de its-ib325
192.168.169.236 its-ib326.its.uni-kassel.de its-ib326
192.168.169.237 its-ib327.its.uni-kassel.de its-ib327
192.168.169.238 its-ib328.its.uni-kassel.de its-ib328
192.168.169.239 its-ib329.its.uni-kassel.de its-ib329
192.168.169.240 its-ib330.its.uni-kassel.de its-ib330
192.168.169.241 its-ib331.its.uni-kassel.de its-ib331
192.168.169.242 its-ib332.its.uni-kassel.de its-ib332
192.168.169.243 its-ib333.its.uni-kassel.de its-ib333
192.168.169.244 its-ib334.its.uni-kassel.de its-ib334
192.168.169.245 its-ib335.its.uni-kassel.de its-ib335
192.168.169.246 its-ib336.its.uni-kassel.de its-ib336
192.168.169.247 its-ib337.its.uni-kassel.de its-ib337
192.168.169.248 its-ib338.its.uni-kassel.de its-ib338
192.168.169.249 its-ib339.its.uni-kassel.de its-ib339
192.168.169.250 its-ib340.its.uni-kassel.de its-ib340
192.168.169.251 its-ib341.its.uni-kassel.de its-ib341
192.168.169.252 its-ib342.its.uni-kassel.de its-ib342
192.168.169.253 its-ib343.its.uni-kassel.de its-ib343
192.168.169.254 its-ib344.its.uni-kassel.de its-ib344
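Since the thread turns on what `localhost` actually resolves to when it is not bound in the file above, a quick per-node resolver check can settle it. This is only an illustrative sketch (the `its-cs131` name is taken from the logs further down; substitute any node you want to test):

```python
import socket

def resolve(name):
    """Return the sorted set of IPv4 addresses a hostname resolves to on this box."""
    try:
        return sorted({info[4][0] for info in socket.getaddrinfo(name, None, socket.AF_INET)})
    except socket.gaierror:
        return []

# localhost normally comes from /etc/hosts; if it is absent there but still
# resolves, the resolver stack (nsswitch/DNS) is supplying it some other way.
print("localhost ->", resolve("localhost"))
print("its-cs131 ->", resolve("its-cs131"))  # master node from the logs below
```

Running this on each datanode shows whether the nodes agree on what the master's name and `localhost` map to.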


> Hello there,
>
>      Could you please share your /etc/hosts file, if you don't mind.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
> <macek@cs.uni-kassel.de> wrote:
>
>     Hi,
>
>     I am currently trying to run my Hadoop program on a cluster.
>     Sadly, my datanodes and tasktrackers seem to have difficulties
>     with their communication, as their logs show:
>     * Some datanodes and tasktrackers seem to have port problems of
>     some kind, as can be seen in the logs below. I wondered if this
>     might be correlated with the localhost entry in /etc/hosts, as
>     suggested in a lot of posts with similar errors, but I checked
>     the file: neither localhost nor 127.0.0.1/127.0.1.1 is bound
>     there. (Although you can ping localhost... the technician of
>     the cluster said he'd be looking into the mechanism that
>     resolves localhost.)
>     * The other nodes cannot talk to the namenode and jobtracker
>     (its-cs131), although it is absolutely not clear why this is
>     happening: the "dfs -put" I do directly before the job runs
>     fine, which seems to imply that communication between those
>     servers works flawlessly.
>
>     Is there any reason why this might happen?
>
>
>     Regards,
>     Elmar
>
>     LOGS BELOW:
>
>     \____Datanodes
>
>     After successfully putting the data into HDFS (at this point I
>     thought the namenode and datanodes have to communicate), I get
>     the following errors when starting the job:
>     There are 2 kinds of logs I found; the first one is big (about
>     12 MB) and looks like this:
>     ############################### LOG TYPE 1
>     ############################################################
>     2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>     2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
>     2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
>     2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
>     2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
>     2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
>     2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
>     2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
>     2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
>     2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
>     2012-08-13 08:23:36,335 WARN
>     org.apache.hadoop.hdfs.server.datanode.DataNode:
>     java.net.ConnectException: Call to its-cs131/141.51.205.41:35554
>     failed on connection exception: java.net.ConnectException:
>     Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at $Proxy5.sendHeartbeat(Unknown Source)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>         at java.lang.Thread.run(Thread.java:619)
>     Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at
>     sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at
>     org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>         at
>     org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>         at
>     org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>         ... 5 more
>
>     ... (this continues until the end of the log)
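"Connection refused" in the retry loop above usually means the TCP handshake reached its-cs131 but nothing was listening (or a firewall actively rejected it) on port 35554, as opposed to a DNS or routing failure, which would surface differently. A minimal probe run from a datanode host can confirm this; the host and ports come from the logs, and the helper is just an illustrative sketch:

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 35554/35555 are the namenode and jobtracker RPC ports seen in the logs.
for port in (35554, 35555):
    print("its-cs131:%d reachable -> %s" % (port, can_connect("its-cs131", port)))
```

If these come back False while `dfs -put` worked earlier, it is worth checking whether the namenode/jobtracker processes were still up (or had been restarted on different ports) at the time the job started.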
>
>     The second is the short kind:
>     ########################### LOG TYPE 2
>     ############################################################
>     2012-08-13 00:59:19,038 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>     /************************************************************
>     STARTUP_MSG: Starting DataNode
>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>     STARTUP_MSG:   args = []
>     STARTUP_MSG:   version = 1.0.2
>     STARTUP_MSG:   build =
>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>     ************************************************************/
>     2012-08-13 00:59:19,203 INFO
>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>     from hadoop-metrics2.properties
>     2012-08-13 00:59:19,216 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source MetricsSystem,sub=Stats registered.
>     2012-08-13 00:59:19,217 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>     snapshot period at 10 second(s).
>     2012-08-13 00:59:19,218 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode
>     metrics system started
>     2012-08-13 00:59:19,306 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ugi registered.
>     2012-08-13 00:59:19,346 INFO
>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>     library
>     2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
>     2012-08-13 00:59:21,584 INFO
>     org.apache.hadoop.hdfs.server.common.Storage: Storage directory
>     /home/work/bmacek/hadoop/hdfs/slave is not formatted.
>     2012-08-13 00:59:21,584 INFO
>     org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
>     2012-08-13 00:59:21,787 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: Registered
>     FSDatasetStatusMBean
>     2012-08-13 00:59:21,897 INFO
>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>     Shutting down all async disk service threads...
>     2012-08-13 00:59:21,897 INFO
>     org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
>     All async disk service threads have been shut down.
>     2012-08-13 00:59:21,898 ERROR
>     org.apache.hadoop.hdfs.server.datanode.DataNode:
>     java.net.BindException: Problem binding to /0.0.0.0:50010 :
>     Address already in use
>         at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>         at
>     org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
>     Caused by: java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>         ... 7 more
>
>     2012-08-13 00:59:21,899 INFO
>     org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
>     /************************************************************
>     SHUTDOWN_MSG: Shutting down DataNode at
>     its-cs133.its.uni-kassel.de/141.51.205.43
>     ************************************************************/
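The `BindException: Address already in use` in the second log is a separate problem from the namenode retries: some process on its-cs133 (often a DataNode left over from an earlier start attempt) already holds 0.0.0.0:50010. The failure mode is easy to reproduce in isolation; this sketch binds a port twice and shows the second bind failing the same way:

```python
import errno
import socket

# Binding the same address twice reproduces the DataNode's failure mode:
# the second bind raises OSError with EADDRINUSE ("Address already in use").
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("0.0.0.0", 0))          # port 0 picks a free port; 50010 in the logs
first.listen(1)
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("0.0.0.0", port))  # same port -> EADDRINUSE
except OSError as e:
    in_use = (e.errno == errno.EADDRINUSE)
else:
    in_use = False
finally:
    second.close()
    first.close()

print("second bind failed with EADDRINUSE:", in_use)
```

On the node itself, something like `netstat -tlnp | grep 50010` (or `lsof -i :50010`) should show which process holds the port; killing a stale DataNode and restarting usually clears this.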
>
>
>
>
>
>     \_____TaskTracker
>     With TaskTrackers it is the same: there are 2 kinds.
>     ############################### LOG TYPE 1
>     ############################################################
>     2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
>     Resending 'status' to 'its-cs131' with reponseId '879
>     2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>     2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
>     2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
>     2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
>     2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
>     2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
>     2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
>     2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
>     2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
>     2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
>     2012-08-13 02:10:04,651 ERROR
>     org.apache.hadoop.mapred.TaskTracker: Caught exception:
>     java.net.ConnectException: Call to its-cs131/141.51.205.41:35555
>     failed on connection exception:
>     java.net.ConnectException: Connection refused
>         at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>         at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>         at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>         at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>         at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
>     Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>         ... 6 more
>
>
>     ########################### LOG TYPE 2
>     ############################################################
>     2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
>     STARTUP_MSG:
>     /************************************************************
>     STARTUP_MSG: Starting TaskTracker
>     STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
>     STARTUP_MSG:   args = []
>     STARTUP_MSG:   version = 1.0.2
>     STARTUP_MSG:   build =
>     https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
>     -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
>     ************************************************************/
>     2012-08-13 00:59:24,569 INFO
>     org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties
>     from hadoop-metrics2.properties
>     2012-08-13 00:59:24,626 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source MetricsSystem,sub=Stats registered.
>     2012-08-13 00:59:24,627 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled
>     snapshot period at 10 second(s).
>     2012-08-13 00:59:24,627 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker
>     metrics system started
>     2012-08-13 00:59:24,950 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ugi registered.
>     2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
>     org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>     org.mortbay.log.Slf4jLog
>     2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer:
>     Added global filtersafety
>     (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>     2012-08-13 00:59:25,232 INFO
>     org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
>     truncater with mapRetainSize=-1 and reduceRetainSize=-1
>     2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
>     Starting tasktracker with owner as bmacek
>     2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker:
>     Good mapred local directories are:
>     /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
>     2012-08-13 00:59:25,244 INFO
>     org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop
>     library
>     2012-08-13 00:59:25,255 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source jvm registered.
>     2012-08-13 00:59:25,256 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source TaskTrackerMetrics registered.
>     2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server:
>     Starting SocketReader
>     2012-08-13 00:59:25,282 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source RpcDetailedActivityForPort54850 registered.
>     2012-08-13 00:59:25,282 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source RpcActivityForPort54850 registered.
>     2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC
>     Server Responder: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server listener on 54850: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 0 on 54850: starting
>     2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 1 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>     TaskTracker up at: localhost/127.0.0.1:54850
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 3 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC
>     Server handler 2 on 54850: starting
>     2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
>     Starting tracker
>     tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>     2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client:
>     Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
>     2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
>     Starting thread: Map-events fetcher for all reduce tasks on
>     tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
>     2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree:
>     setsid exited with exit code 0
>     2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker:
>     Using ResourceCalculatorPlugin :
>     org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
>     2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
>     TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager
>     is disabled.
>     2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
>     IndexCache created with max memory = 10485760
>     2012-08-13 00:59:38,158 INFO
>     org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for
>     source ShuffleServerMetrics registered.
>     2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer:
>     Port returned by webServer.getConnectors()[0].getLocalPort()
>     before open() is -1. Opening the listener on 50060
>     2012-08-13 00:59:38,161 ERROR
>     org.apache.hadoop.mapred.TaskTracker: Can not start task tracker
>     because java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at
>     org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>         at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>         at
>     org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>         at
>     org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>
>     2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
>     SHUTDOWN_MSG:
>     /************************************************************
>     SHUTDOWN_MSG: Shutting down TaskTracker at
>     its-cs133.its.uni-kassel.de/141.51.205.43
>     ************************************************************/
>
>
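The repeated BindException above ("Address already in use" on ports 50010 and 50060) is the generic symptom of a second process binding a port that a still-running daemon already holds — typically a leftover DataNode/TaskTracker from an earlier run. A minimal sketch of that failure mode (Python for brevity; the OS picks a free port here instead of Hadoop's fixed ones):

```python
import errno
import socket

# Two sockets stand in for two daemon processes competing for one TCP port.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("0.0.0.0", 0))      # the daemon that is already running
first.listen(1)
port = first.getsockname()[1]   # whatever free port the OS handed us

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
raised = False
try:
    second.bind(("0.0.0.0", port))  # the restarted daemon, same port
except OSError as e:
    raised = (e.errno == errno.EADDRINUSE)
if raised:
    print("Address already in use")
second.close()
first.close()
```

On the cluster, the equivalent check before restarting the daemons would be looking for a stale DataNode or TaskTracker process still holding 50010/50060.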


Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Hello there,

     Could you please share your /etc/hosts file, if you don't mind.

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
<ma...@cs.uni-kassel.de> wrote:

> Hi,
>
> I am currently trying to run my Hadoop program on a cluster. Unfortunately,
> my datanodes and tasktrackers seem to have trouble communicating, as their
> logs show:
> * Some datanodes and tasktrackers appear to have port problems of some
> kind, as can be seen in the logs below. I wondered whether this might be
> caused by the localhost entry in /etc/hosts, as many posts with similar
> errors suggest, but I checked the file: neither localhost nor
> 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost...
> the cluster's technician said he would look into the mechanism that
> resolves localhost.)
> * The other nodes cannot talk to the namenode and jobtracker (its-cs131),
> although it is not at all clear why this happens: the "dfs -put" I run
> directly before the job completes fine, which seems to imply that
> communication between those servers works flawlessly.
>
> Is there any reason why this might happen?
>
>
> Regards,
> Elmar
>
> LOGS BELOW:
>
> \____Datanodes
>
> After successfully putting the data into HDFS (at this point, I assumed,
> the namenode and datanodes have to communicate), I get the following
> errors when starting the job:
>
> There are 2 kinds of logs I found: the first one is big (about 12 MB) and
> looks like this:
> ############################### LOG TYPE 1
> ############################################################
> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
> java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed
> on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at $Proxy5.sendHeartbeat(Unknown Source)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 5 more
>
> ... (this continues until the end of the log)
>
> The second is the short kind:
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
> STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
> -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ************************************************************/
> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig:
> loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
> Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
> DataNode metrics system started
> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source ugi registered.
> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader:
> Loaded the native-hadoop library
> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage:
> Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage:
> Formatting ...
> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
> Registered FSDatasetStatusMBean
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
> Shutting down all async disk service threads...
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService:
> All async disk service threads have been shut down.
> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
> java.net.BindException: Problem binding to /0.0.0.0:50010 : Address
> already in use
>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>     ... 7 more
>
> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
> SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
>
>
>
>
> \_____TaskTracker
> With TaskTrackers it is the same: there are 2 kinds.
> ############################### LOG TYPE 1
> ############################################################
> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker:
> Resending 'status' to 'its-cs131' with reponseId '879
> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker:
> Caught exception: java.net.ConnectException: Call to its-cs131/
> 141.51.205.41:35555 failed on connection exception:
> java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 6 more
>
>
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
> STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2
> -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ************************************************************/
> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig:
> loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
> Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
> TaskTracker metrics system started
> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source ugi registered.
> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added
> global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker:
> Starting tasktracker with owner as bmacek
> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good
> mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader:
> Loaded the native-hadoop library
> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source jvm registered.
> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source TaskTrackerMetrics registered.
> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting
> SocketReader
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source RpcDetailedActivityForPort54850 registered.
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source RpcActivityForPort54850 registered.
> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server
> Responder: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
> listener on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 0 on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
> TaskTracker up at: localhost/127.0.0.1:54850
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 3 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker:
> Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker:
> Starting thread: Map-events fetcher for all reduce tasks on
> tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid
> exited with exit code 0
> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker:
> Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker:
> TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is
> disabled.
> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache:
> IndexCache created with max memory = 10485760
> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter:
> MBean for source ShuffleServerMetrics registered.
> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port
> returned by webServer.getConnectors()[0].getLocalPort() before open()
> is -1. Opening the listener on 50060
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
> not start task tracker because java.net.BindException: Address already in
> use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>
> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker:
> SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
> ************************************************************/
>
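The "Connection refused" retries in the quoted logs, by contrast, mean the TCP handshake reached its-cs131 but nothing was listening on port 35554/35555 — which is why a working "dfs -put" (which talks to a different port) does not rule this out. A self-contained sketch of that error's semantics (Python; a deliberately closed local port stands in for the JobTracker's port):

```python
import socket

# Reserve a local port, then free it again so nothing is listening there.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
closed_port = probe.getsockname()[1]
probe.close()

# Connecting to a reachable host on a closed port yields exactly the
# "Connection refused" the TaskTracker logs: the host answers, the port
# actively rejects. (A firewall drop would instead produce a timeout.)
refused = False
try:
    socket.create_connection(("127.0.0.1", closed_port), timeout=2)
except ConnectionRefusedError:
    refused = True
if refused:
    print("Connection refused")
```

On the cluster, the same distinction applies: refused connections to its-cs131 point at the JobTracker/NameNode process not listening on those ports, not at a network outage.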

Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Hello there,

     Could you please share your /etc/hosts file, if you don't mind.

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
<ma...@cs.uni-kassel.de>wrote:

> Hi,
>
> i am currently trying to run my hadoop program on a cluster. Sadly though
> my datanodes and tasktrackers seem to have difficulties with their
> communication as their logs say:
> * Some datanodes and tasktrackers seem to have portproblems of some kind
> as it can be seen in the logs below. I wondered if this might be due to
> reasons correllated with the localhost entry in /etc/hosts as you can read
> in alot of posts with similar errors, but i checked the file neither
> localhost nor 127.0.0.1/127.0.1.1 is bound there. (although you can ping
> localhost... the technician of the cluster said he'd be looking for the
> mechanics resolving localhost)
> * The other nodes can not speak with the namenode and jobtracker
> (its-cs131). Although it is absolutely not clear, why this is happening:
> the "dfs -put" i do directly before the job is running fine, which seems to
> imply that communication between those servers is working flawlessly.
>
> Is there any reason why this might happen?
>
>
> Regards,
> Elmar
>
> LOGS BELOW:
>
> \____Datanodes
>
> After successfully putting the data to hdfs (at this point i thought
> namenode and datanodes have to communicate), i get the following errors
> when starting the job:
>
> There are 2 kinds of logs i found: the first one is big (about 12MB) and
> looks like this:
> ##############################**# LOG TYPE 1
> ##############################**##############################
> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at $Proxy5.sendHeartbeat(Unknown Source)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 5 more
>
> ... (this continues til the end of the log)
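[Editor's note: the repeated "Connection refused" retries above can be narrowed down by probing the NameNode RPC port directly from a worker node. The following is an illustrative sketch, not part of the original thread; the helper name `port_open` is the editor's, while the host `its-cs131` and port `35554` come from the log above.]

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the hostname and completes the
        # TCP handshake; "Connection refused" or a DNS failure lands
        # in the except branch.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from a DataNode host; False here corresponds to the
# "Connection refused" retries in the log above.
print(port_open("its-cs131", 35554))
```

If this returns False while the NameNode process is up, the service is listening on a different interface or port than the workers are configured to reach.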
>
> The second kind is shorter:
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /****************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ****************************************************************/
> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>     ... 7 more
>
> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /****************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
> ****************************************************************/
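[Editor's note: the `java.net.BindException: Address already in use` above typically means a stale DataNode (or another service) is still bound to the port. The failure mode itself is easy to reproduce outside Hadoop; the following is an illustrative sketch, not Hadoop code.]

```python
import errno
import socket

# Reproduce the BindException scenario: a second bind on a port that an
# earlier process still holds, as the DataNode's bind to 0.0.0.0:50010 did.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))        # let the OS pick a free port
port = first.getsockname()[1]
first.listen(1)

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))  # fails the same way the DataNode's did
except OSError as e:
    print(e.errno == errno.EADDRINUSE)
finally:
    second.close()
    first.close()
```

On the cluster itself, finding the process that holds the port (for example with `netstat -tlnp` or `jps`) and stopping the leftover daemon resolves this kind of startup failure.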
>
>
>
>
>
> \_____TaskTracker
> With TaskTrackers it is the same: there are 2 kinds.
> ############################### LOG TYPE 1
> ############################################################
> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 6 more
>
>
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
> /****************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ****************************************************************/
> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>
> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
> /****************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
> ****************************************************************/
>

Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Hello there,

     Could you please share your /etc/hosts file, if you don't mind.

Regards,
    Mohammad Tariq
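[Editor's note: since the thread centers on hostname resolution, this is the shape of /etc/hosts that Hadoop 1.x clusters generally expect: each node's FQDN resolves to its routable address, never to 127.0.0.1 or 127.0.1.1. The layout below is illustrative only; the two hostnames and IP addresses are taken from the logs above, and the alias column is an assumption.]

```
127.0.0.1       localhost

# Cluster nodes: FQDN first, short alias second.
# The master (NameNode/JobTracker) and each worker must resolve
# to their real interface address on every machine.
141.51.205.41   its-cs131.its.uni-kassel.de   its-cs131
141.51.205.43   its-cs133.its.uni-kassel.de   its-cs133
```

A worker whose own hostname maps to 127.0.1.1 will register itself as localhost (as the "TaskTracker up at: localhost/127.0.0.1:54850" line in the logs suggests happened here), which breaks cross-node communication.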



On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
<ma...@cs.uni-kassel.de>wrote:


Re: DataNode and Tasttracker communication

Posted by Mohammad Tariq <do...@gmail.com>.
Hello there,

     Could you please share your /etc/hosts file, if you don't mind.
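For comparison, a minimal /etc/hosts layout that generally works for a small Hadoop 1.x cluster looks like this (the hostnames and addresses are taken from the logs in this thread, purely as an illustration):

```
127.0.0.1       localhost
141.51.205.41   its-cs131.its.uni-kassel.de   its-cs131
141.51.205.43   its-cs133.its.uni-kassel.de   its-cs133
```

The important part is that each node's own hostname resolves to its LAN address rather than to 127.0.0.1 or 127.0.1.1.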

Regards,
    Mohammad Tariq



On Mon, Aug 13, 2012 at 6:01 PM, Björn-Elmar Macek
<ma...@cs.uni-kassel.de>wrote:

> Hi,
>
> I am currently trying to run my Hadoop program on a cluster. Unfortunately,
> my datanodes and tasktrackers seem to have difficulties communicating, as
> their logs show:
> * Some datanodes and tasktrackers appear to have port problems of some
> kind, as can be seen in the logs below. I wondered whether this might be
> related to the localhost entry in /etc/hosts, as many posts with similar
> errors suggest, but I checked the file and neither localhost nor
> 127.0.0.1/127.0.1.1 is bound there. (Although you can ping localhost... the
> technician of the cluster said he would look into how localhost is
> resolved.)
> * The other nodes cannot talk to the namenode and jobtracker (its-cs131).
> It is not at all clear why this happens: the "dfs -put" I run directly
> before the job completes fine, which seems to imply that communication
> between those servers works flawlessly.
>
> Is there any reason why this might happen?
>
>
> Regards,
> Elmar
>
> LOGS BELOW:
>
> \____Datanodes
>
> After successfully putting the data to HDFS (at this point I assumed the
> namenode and datanodes must have communicated), I get the following errors
> when starting the job:
>
> There are 2 kinds of logs I found: the first one is big (about 12 MB) and
> looks like this:
> ############################### LOG TYPE 1
> ############################################################
> 2012-08-13 08:23:27,331 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 08:23:28,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 1 time(s).
> 2012-08-13 08:23:29,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 2 time(s).
> 2012-08-13 08:23:30,332 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 3 time(s).
> 2012-08-13 08:23:31,333 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 4 time(s).
> 2012-08-13 08:23:32,333 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 5 time(s).
> 2012-08-13 08:23:33,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 6 time(s).
> 2012-08-13 08:23:34,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 7 time(s).
> 2012-08-13 08:23:35,334 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 8 time(s).
> 2012-08-13 08:23:36,335 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35554. Already tried 9 time(s).
> 2012-08-13 08:23:36,335 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to its-cs131/141.51.205.41:35554 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at $Proxy5.sendHeartbeat(Unknown Source)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:904)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 5 more
>
> ... (this continues until the end of the log)
>
> The second kind is short:
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:19,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /****************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ****************************************************************/
> 2012-08-13 00:59:19,203 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:19,216 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:19,217 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:19,218 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
> 2012-08-13 00:59:19,306 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:19,346 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:20,482 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35554. Already tried 0 time(s).
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/work/bmacek/hadoop/hdfs/slave is not formatted.
> 2012-08-13 00:59:21,584 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
> 2012-08-13 00:59:21,787 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
> 2012-08-13 00:59:21,897 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
> 2012-08-13 00:59:21,898 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException: Problem binding to /0.0.0.0:50010 : Address already in use
>     at org.apache.hadoop.ipc.Server.bind(Server.java:227)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:404)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
>     at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.apache.hadoop.ipc.Server.bind(Server.java:225)
>     ... 7 more
>
> 2012-08-13 00:59:21,899 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /****************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at its-cs133.its.uni-kassel.de/141.51.205.43
> ****************************************************************/
>
>
>
>
>
> \_____TaskTracker
> With the TaskTrackers it is the same: there are 2 kinds.
> ############################### LOG TYPE 1
> ############################################################
> 2012-08-13 02:09:54,645 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'its-cs131' with reponseId '879
> 2012-08-13 02:09:55,646 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 02:09:56,646 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 1 time(s).
> 2012-08-13 02:09:57,647 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 2 time(s).
> 2012-08-13 02:09:58,647 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 3 time(s).
> 2012-08-13 02:09:59,648 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 4 time(s).
> 2012-08-13 02:10:00,648 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 5 time(s).
> 2012-08-13 02:10:01,649 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 6 time(s).
> 2012-08-13 02:10:02,649 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 7 time(s).
> 2012-08-13 02:10:03,650 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 8 time(s).
> 2012-08-13 02:10:04,650 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: its-cs131/141.51.205.41:35555. Already tried 9 time(s).
> 2012-08-13 02:10:04,651 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.ConnectException: Call to its-cs131/141.51.205.41:35555 failed on connection exception: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1071)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at org.apache.hadoop.mapred.$Proxy5.heartbeat(Unknown Source)
>     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1857)
>     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1653)
>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2503)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3744)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
>     at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1046)
>     ... 6 more
>
>
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
> /****************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = its-cs133.its.uni-kassel.de/141.51.205.43
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 1.0.2
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
> ****************************************************************/
> 2012-08-13 00:59:24,569 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
> 2012-08-13 00:59:24,626 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2012-08-13 00:59:24,627 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
> 2012-08-13 00:59:24,950 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
> 2012-08-13 00:59:25,146 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
> 2012-08-13 00:59:25,206 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
> 2012-08-13 00:59:25,232 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-08-13 00:59:25,237 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as bmacek
> 2012-08-13 00:59:25,239 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /home/work/bmacek/hadoop/hdfs/tmp/mapred/local
> 2012-08-13 00:59:25,244 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2012-08-13 00:59:25,255 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
> 2012-08-13 00:59:25,256 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
> 2012-08-13 00:59:25,279 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort54850 registered.
> 2012-08-13 00:59:25,282 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort54850 registered.
> 2012-08-13 00:59:25,287 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54850: starting
> 2012-08-13 00:59:25,288 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:54850
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54850: starting
> 2012-08-13 00:59:25,289 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:26,321 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: its-cs131/141.51.205.41:35555. Already tried 0 time(s).
> 2012-08-13 00:59:38,104 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_its-cs133.its.uni-kassel.de:localhost/127.0.0.1:54850
> 2012-08-13 00:59:38,120 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
> 2012-08-13 00:59:38,134 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@445e228
> 2012-08-13 00:59:38,137 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
> 2012-08-13 00:59:38,145 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
> 2012-08-13 00:59:38,158 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
> 2012-08-13 00:59:38,161 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind(Native Method)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>     at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
>     at org.apache.hadoop.http.HttpServer.start(HttpServer.java:581)
>     at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1502)
>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)
>
> 2012-08-13 00:59:38,163 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
> /****************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at its-cs133.its.uni-kassel.de/141.51.205.43
> ****************************************************************/
>

Re: DataNode and Tasttracker communication

Posted by James Brown <jb...@syndicate.net>.
Hi Björn,

For the two errors quoted below, it is possible that datanodes and
tasktrackers from an earlier run are still up and bound to their ports.

This command will show processes bound to the datanode port:
netstat -putan | grep 50010

tasktracker port:
netstat -putan | grep 50060

If your netstat command does not support the -p option try lsof.
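The checks above can also be scripted in one go; a small sketch that only assumes netstat is on the PATH:

```shell
# port_in_use PORT -> prints "in use" if a TCP listener is bound to PORT,
# otherwise "free". Falls back to "free" if netstat is unavailable.
port_in_use() {
  if netstat -tln 2>/dev/null | grep -q "[:.]$1 "; then
    echo "in use"
  else
    echo "free"
  fi
}

# Default DataNode (50010) and TaskTracker HTTP (50060) ports from the logs.
for p in 50010 50060; do
  echo "port $p: $(port_in_use "$p")"
done
```

If a port shows up as in use, `jps` (shipped with the JDK) will tell you whether the process holding it is a leftover DataNode or TaskTracker.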


> \____Datanodes
...
> The second kind is short:
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:19,038 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
...
> 2012-08-13 00:59:21,898 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.BindException:
> Problem binding to /0.0.0.0:50010 : Address already in use

...

> \_____TaskTracker
...
> ########################### LOG TYPE 2
> ############################################################
> 2012-08-13 00:59:24,376 INFO org.apache.hadoop.mapred.TaskTracker:
> STARTUP_MSG:
...
> 2012-08-13 00:59:38,161 ERROR org.apache.hadoop.mapred.TaskTracker: Can
> not start task tracker because java.net.BindException: Address already
> in use
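Separately, for the "Retrying connect ... Connection refused" messages in
LOG TYPE 1, it is worth verifying from a worker node that the master's RPC
ports are reachable at all. A sketch using bash's built-in /dev/tcp, with
the host and ports taken from the logs above:

```shell
# can_connect HOST PORT -> exit 0 if a TCP connection succeeds within 3s.
# Uses bash's /dev/tcp so it works even where nc/telnet are not installed.
can_connect() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# NameNode (35554) and JobTracker (35555) RPC ports from the retry messages.
for port in 35554 35555; do
  if can_connect its-cs131 "$port"; then
    echo "its-cs131:$port reachable"
  else
    echo "its-cs131:$port NOT reachable"
  fi
done
```

If these fail while "dfs -put" works, compare the interface the daemons
actually bind to (netstat -tlnp on its-cs131) with the address the workers
resolve its-cs131 to.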




