Posted to common-user@hadoop.apache.org by shefali pawar <sh...@rediffmail.com> on 2009/02/04 18:34:01 UTC

Regarding "Hadoop multi cluster" set-up

Hi,

I am trying to set up a two-node cluster using Hadoop 0.19.0, with one master (which should also act as a slave) and one slave node.

But when I run bin/start-dfs.sh, the datanode does not start on the slave. I have read the previous mails on the list, but none of the suggestions work in this case. I get the following error in the hadoop-root-datanode-slave log file when running bin/start-dfs.sh:

2009-02-03 13:00:27,516 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = slave/172.16.0.32
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.19.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
************************************************************/
2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 0 time(s).
2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 1 time(s).
2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 2 time(s).
2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 3 time(s).
2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 4 time(s).
2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 5 time(s).
2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 6 time(s).
2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 7 time(s).
2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 8 time(s).
2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 9 time(s).
2009-02-03 13:00:37,738 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to master/172.16.0.46:54310 failed on local exception: No route to host
	at org.apache.hadoop.ipc.Client.call(Client.java:699)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
	at $Proxy4.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
Caused by: java.net.NoRouteToHostException: No route to host
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
	at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
	at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
	at org.apache.hadoop.ipc.Client.call(Client.java:685)
	... 12 more

2009-02-03 13:00:37,739 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
************************************************************/


Also, pseudo-distributed operation works on both machines, and I am able to ssh from 'master to master' and 'master to slave' with a password-less ssh login. I do not think there is any problem with the network, because pinging works in both directions.
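One thing worth noting here: ping uses ICMP, while the DataNode needs a TCP connection to the NameNode port, so a working ping does not rule out a firewall blocking that port. A minimal Python sketch for probing the port from the slave (hostname "master" and port 54310 are taken from the configuration below; the helper name is illustrative, not part of Hadoop):

```python
import errno
import socket

def probe(host, port, timeout=5.0):
    """Attempt a TCP connect and classify the outcome.

    "open"        - connect succeeded, something is listening
    "refused"     - host reachable but nothing listens on the port
                    (service not started, or bound to another interface)
    "unreachable" - timeout or no-route error; on a host that answers
                    ping, this usually points to a firewall
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "open"
    except socket.timeout:
        return "unreachable"
    except OSError as e:
        if e.errno == errno.ECONNREFUSED:
            return "refused"
        return "unreachable"
    finally:
        s.close()

# Run on the slave; host and port come from fs.default.name:
# print(probe("master", 54310))
```

A "refused" result would suggest the NameNode is not listening on that interface; "unreachable" matches the NoRouteToHostException in the log above.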

I am working on Linux (Fedora 8).

The following is the configuration I am using.

On master and slave, <HADOOP_INSTALL>/conf/masters looks like this: 

 master

On master and slave, <HADOOP_INSTALL>/conf/slaves looks like this: 

 master
 slave

On both machines, conf/hadoop-site.xml looks like this:

 <property>
   <name>fs.default.name</name>
   <value>hdfs://master:54310</value>
   <description>The name of the default file system.  A URI whose
   scheme and authority determine the FileSystem implementation.  The
   uri's scheme determines the config property (fs.SCHEME.impl) naming
   the FileSystem implementation class.  The uri's authority is used to
   determine the host, port, etc. for a filesystem.</description>
 </property>
 <property>
   <name>mapred.job.tracker</name>
   <value>master:54311</value>
   <description>The host and port that the MapReduce job tracker runs
   at.  If "local", then jobs are run in-process as a single map
   and reduce task.
   </description>
 </property>
 <property>
   <name>dfs.replication</name>
   <value>2</value>
   <description>Default block replication.
   The actual number of replications can be specified when the file is created.
   The default is used if replication is not specified in create time.
   </description>
 </property>
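For context, these <property> elements live inside the <configuration> root element of conf/hadoop-site.xml; a minimal skeleton of the full file (with the property bodies above elided) looks like:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
  <!-- remaining <property> elements as shown above -->
</configuration>
```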

The namenode was formatted successfully by running

"bin/hadoop namenode -format"

on the master node.

I am new to Hadoop and I do not know what is going wrong.

Any help will be appreciated.

Thanking you in advance

Shefali Pawar
Pune, India

Re: Regarding "Hadoop multi cluster" set-up

Posted by S D <sd...@gmail.com>.
Shefali,

Is your firewall blocking port 54310 on the master?

John
