Posted to hdfs-user@hadoop.apache.org by A Geek <dw...@live.com> on 2012/12/02 16:55:09 UTC

Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]


Hi All, I'm trying to set up a Hadoop cluster using 4 machines [4 x Ubuntu 12.04 x64], using the following doc:
                  1. http://titan.softnet.tuc.gr:8082/User:xenia/Page_Title/Hadoop_Cluster_Setup_Tutorial
I'm able to set up the Hadoop cluster with the required configurations, and I can see that all the required services on the master and slave nodes are running as required [please see the jps command output below]. The problem I'm facing is that the HDFS and MapReduce daemons run on the master and can be accessed from the master only, and not from the slave machines. Note that I've added these ports to the EC2 security group to open them, and I can browse the master machine's UI from a web browser using: http://<machine ip>:50070/dfshealth.jsp

Now, the problem I'm facing is that both HDFS and the JobTracker are accessible from the master machine [I'm using the master as both NameNode and DataNode], but the ports used for these two [HDFS: 54310 and MapReduce: 54320] are not accessible from the other slave nodes.
I ran netstat -puntl on the master machine and got this:
hadoop@nutchcluster1:~/hadoop$ netstat -puntl
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp6       0      0 :::50020                :::*                    LISTEN      6224/java
tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN      6040/java
tcp6       0      0 127.0.0.1:32776         :::*                    LISTEN      6723/java
tcp6       0      0 :::57065                :::*                    LISTEN      6040/java
tcp6       0      0 :::50090                :::*                    LISTEN      6401/java
tcp6       0      0 :::50060                :::*                    LISTEN      6723/java
tcp6       0      0 :::50030                :::*                    LISTEN      6540/java
tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN      6540/java
tcp6       0      0 :::45747                :::*                    LISTEN      6401/java
tcp6       0      0 :::33174                :::*                    LISTEN      6540/java
tcp6       0      0 :::50070                :::*                    LISTEN      6040/java
tcp6       0      0 :::22                   :::*                    LISTEN      -
tcp6       0      0 :::54424                :::*                    LISTEN      6224/java
tcp6       0      0 :::50010                :::*                    LISTEN      6224/java
tcp6       0      0 :::50075                :::*                    LISTEN      6224/java
udp        0      0 0.0.0.0:68              0.0.0.0:*                           -
hadoop@nutchcluster1:~/hadoop$

As can be seen in the output, both the HDFS and MapReduce daemons are listening, but only on 127.0.0.1 and not on 0.0.0.0 [i.e. not reachable from any other machine/slave machine]:
tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN      6040/java
tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN      6540/java
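
(A quick way to isolate just these two listeners on the master is to filter the same netstat output shown above, for example:
netstat -puntl | grep -E ':54310|:54320'
Both lines show 127.0.0.1 as the local address, which is the loopback-only binding described here.)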

To confirm the same, I did this on the master:
hadoop@nutchcluster1:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2012-12-02 12:53 /home
hadoop@nutchcluster1:~/hadoop$
But when I ran the same command on a slave, I got this:
hadoop@nutchcluster2:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
12/12/02 15:42:16 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 0 time(s).
12/12/02 15:42:17 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 1 time(s).
12/12/02 15:42:18 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 2 time(s).
12/12/02 15:42:19 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 3 time(s).
12/12/02 15:42:20 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 4 time(s).
12/12/02 15:42:21 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 5 time(s).
12/12/02 15:42:22 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 6 time(s).
12/12/02 15:42:23 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 7 time(s).
12/12/02 15:42:24 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 8 time(s).
12/12/02 15:42:25 INFO ipc.Client: Retrying connect to server: nutchcluster1/10.4.39.23:54310. Already tried 9 time(s).
Bad connection to FS. command aborted. exception: Call to nutchcluster1/10.4.39.23:54310 failed on connection exception: java.net.ConnectException: Connection refused
hadoop@nutchcluster2:~/hadoop$
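
(The same refusal can also be reproduced with a plain TCP check from a slave, independent of Hadoop, for example, assuming telnet is available on the slave:
telnet nutchcluster1 54310
which gets "Connection refused", while telnet nutchcluster1 22 connects fine, since sshd is bound to 0.0.0.0 in the netstat output above.)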


The configurations are as below:
--------------core-site.xml content is as below:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/hadoop/datastore/hadoop-${user.name}</value>
  <description>A base for other temporary directories</description>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://nutchcluster1:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>

-----------------hdfs-site.xml content is as below:
<configuration>
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
</configuration>


------------------mapred-site.xml content is as:
<configuration>
<property>
  <name>mapred.job.tracker</name>
  <value>nutchcluster1:54320</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>40</value>
  <description>As a rule of thumb, use 10x the number of slaves (i.e., number of tasktrackers).
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>40</value>
  <description>As a rule of thumb, use 2x the number of slave processors (i.e., number of tasktrackers).
  </description>
</property>
</configuration>

I replicated all the above on the other 3 slave machines [1 master + 3 slaves]. My /etc/hosts content on the master node is as below. Note that I've the same content on the slaves as well; the only difference is that each machine's own IP is set to 127.0.0.1 and the exact IPs are set for the others:
------------------------------/etc/hosts content:
127.0.0.1 localhost
127.0.0.1 nutchcluster1
10.111.59.96 nutchcluster2
10.201.223.79 nutchcluster3
10.190.117.68 nutchcluster4
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
-----------------/etc/hosts content ends here
File content for masters is:
nutchcluster1

and file content for slaves is:
nutchcluster1
nutchcluster2
nutchcluster3
nutchcluster4
Then I copied all the relevant config folder contents [*-site.xml, *.env files] to all the slaves.
As per the steps, I'm starting HDFS using: bin/start-dfs.sh and then starting MapReduce using: bin/start-mapred.sh. After running these two on my master machine [nutchcluster1] I can see the following jps output:
hadoop@nutchcluster1:~/hadoop$ jps
6401 SecondaryNameNode
6723 TaskTracker
6224 DataNode
6540 JobTracker
7354 Jps
6040 NameNode
hadoop@nutchcluster1:~/hadoop$

and on the slaves, the jps output is:
hadoop@nutchcluster2:~/hadoop$ jps
8952 DataNode
9104 TaskTracker
9388 Jps
hadoop@nutchcluster2:~/hadoop$

This clearly indicates that port 54310 is accessible from the master only and not from the slaves. This is the point I'm stuck at, and I would appreciate it if someone could point out what config is missing or what is wrong. Any comment/feedback in this regard would be highly appreciated. Thanks in advance.

Regards, DW

RE: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]

Posted by A Geek <dw...@live.com>.

Thanks Harsh. As per your comments, I removed the loopback address for the hostname and added the LAN IP, copied the same content to all 3 slave machines, and everything started working.
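
For reference, the working /etc/hosts now looks roughly like this on every node (assuming 10.4.39.23 is nutchcluster1's LAN IP, as the connection logs above suggest):
127.0.0.1 localhost
10.4.39.23 nutchcluster1
10.111.59.96 nutchcluster2
10.201.223.79 nutchcluster3
10.190.117.68 nutchcluster4
followed by a restart of the daemons as suggested, e.g. bin/stop-mapred.sh and bin/stop-dfs.sh, then bin/start-dfs.sh and bin/start-mapred.sh on the master.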
Thanks Nitin for pointing me to Whirr. I'd taken a quick look at Whirr earlier, but thought it might be complex to set things up and did everything manually. But now it looks like Whirr is quite a useful tool; I'll take a look. 
Thanks to the Hadoop community, my cluster is now up and running. 

Regards, DW
Date: Mon, 3 Dec 2012 00:18:48 +0530
Subject: Re: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]
From: nitinpawar432@gmail.com
To: user@hadoop.apache.org

also,
if you want to set up a hadoop cluster on AWS, just try using Whirr. Basically it does everything for you

On Sun, Dec 2, 2012 at 10:12 PM, Harsh J <ha...@cloudera.com> wrote:

Your problem is that your /etc/hosts file has the line:



127.0.0.1 nutchcluster1



Just delete that line, restart your services. You intend your hostname

"nutchcluster1" to be externally accessible, so aliasing it to the

loopback address (127.0.0.1) is not right.



On Sun, Dec 2, 2012 at 10:08 PM, A Geek <dw...@live.com> wrote:

> Hi,

> Just to add the version details: I'm running Apache Hadoop release 1.0.4

> with jdk1.6.0_37 . The underlying Ubuntu 12.04 machine has got 300GB disk

> space and has 1.7GB RAM and is a single core machine.

>

> Regards,

> DW

>

> ________________________________

> From: dw.90@live.com

> To: user@hadoop.apache.org

> Subject: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based

> machines]

> Date: Sun, 2 Dec 2012 15:55:09 +0000

>

>

> Hi All,

> I'm trying to setup Hadoop Cluster using 4 machines[4 x Ubuntu 12.04 x_64].

> Using the following doc:

>                   1.

> http://titan.softnet.tuc.gr:8082/User:xenia/Page_Title/Hadoop_Cluster_Setup_Tutorial

>

> I'm able to setup hadoop clusters with required configurations. I can see

> that all the required services on master and on slaves nodes are running as

> required[please see below JPS command output ]. The problem, I'm facing is

> that, the HDFS and Mapreduce daemons running on Master and can be accessed

> from Master only, and not from the slave machines. Note that, I've added

> these ports in the EC2 security group to open them. And I can browse the

> master machines UI from web browser, using: http://<machine

> ip>:50070/dfshealth.jsp

>

>

> Now, the problem which I'm facing is , the HDFS as well the jobtracker both

> are accessible from the master machine[I'm using master as both Namenode and

> Datanode] but both the ports[hdfs: 54310 and mapreduce: 54320] used for

> these two are not accessible from other slave nodes.

>

> I did: netstat -puntl on master machine and got this:

>

> hadoop@nutchcluster1:~/hadoop$ netstat -puntl

> (Not all processes could be identified, non-owned process info

>  will not be shown, you would have to be root to see it all.)

> Active Internet connections (only servers)

> Proto Recv-Q Send-Q Local Address           Foreign Address         State

> PID/Program name

> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN

> -

> tcp6       0      0 :::50020                :::*                    LISTEN

> 6224/java

> tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN

> 6040/java

> tcp6       0      0 127.0.0.1:32776         :::*                    LISTEN

> 6723/java

> tcp6       0      0 :::57065                :::*                    LISTEN

> 6040/java

> tcp6       0      0 :::50090                :::*                    LISTEN

> 6401/java

> tcp6       0      0 :::50060                :::*                    LISTEN

> 6723/java

> tcp6       0      0 :::50030                :::*                    LISTEN

> 6540/java

> tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN

> 6540/java

> tcp6       0      0 :::45747                :::*                    LISTEN

> 6401/java

> tcp6       0      0 :::33174                :::*                    LISTEN

> 6540/java

> tcp6       0      0 :::50070                :::*                    LISTEN

> 6040/java

> tcp6       0      0 :::22                   :::*                    LISTEN

> -

> tcp6       0      0 :::54424                :::*                    LISTEN

> 6224/java

> tcp6       0      0 :::50010                :::*                    LISTEN

> 6224/java

> tcp6       0      0 :::50075                :::*                    LISTEN

> 6224/java

> udp        0      0 0.0.0.0:68              0.0.0.0:*

> -

> hadoop@nutchcluster1:~/hadoop$

>

>

> As can be seen in the output, both the HDFS daemon and mapreduce daemons are

> accessible, but only from 127.0.0.1 and not from 0.0.0.0 [any machine/slave

> machines]

> tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN

> 6040/java

> tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN

> 6540/java

>

>

> To confirm the same, I did this on the master:

> hadoop@nutchcluster1:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/

> Found 1 items

> drwxr-xr-x   - hadoop supergroup          0 2012-12-02 12:53 /home

> hadoop@nutchcluster1:~/hadoop$

>

> But, when I ran the same command on slaves, I get this:

> hadoop@nutchcluster2:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/

> 12/12/02 15:42:16 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 0 time(s).

> 12/12/02 15:42:17 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 1 time(s).

> 12/12/02 15:42:18 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 2 time(s).

> 12/12/02 15:42:19 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 3 time(s).

> 12/12/02 15:42:20 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 4 time(s).

> 12/12/02 15:42:21 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 5 time(s).

> 12/12/02 15:42:22 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 6 time(s).

> 12/12/02 15:42:23 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 7 time(s).

> 12/12/02 15:42:24 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 8 time(s).

> 12/12/02 15:42:25 INFO ipc.Client: Retrying connect to server:

> nutchcluster1/10.4.39.23:54310. Already tried 9 time(s).

> Bad connection to FS. command aborted. exception: Call to

> nutchcluster1/10.4.39.23:54310 failed on connection exception:

> java.net.ConnectException: Connection refused

> hadoop@nutchcluster2:~/hadoop$

>

>

>

> The configurations are as below:

>

> --------------core-site.xml content is as below

> <property>

>   <name>hadoop.tmp.dir</name>

>   <value>/home/hadoop/hadoop/datastore/hadoop-${user.name}</value>

>   <description>A base for other temporary directories</description>

> </property>

>

> <property>

>   <name>fs.default.name</name>

>   <value>hdfs://nutchcluster1:54310</value>

>   <description>The name of the default file system.  A URI whose

>   scheme and authority determine the FileSystem implementation.  The

>   uri's scheme determines the config property (fs.SCHEME.impl) naming

>   the FileSystem implementation class.  The uri's authority is used to

>   determine the host, port, etc. for a filesystem.</description>

> </property>

>

>

> -----------------hdfs-site.xml content is as below:

> <configuration>

> <property>

>   <name>dfs.replication</name>

>   <value>2</value>

>   <description>Default block replication.

>   The actual number of replications can be specified when the file is

> created.

>   The default is used if replication is not specified in create time.

>   </description>

> </property>

> </configuration>

>

>

>

> ------------------mapred-site.xml content is as:

> <configuration>

> <property>

>   <name>mapred.job.tracker</name>

>   <value>nutchcluster1:54320</value>

>   <description>The host and port that the MapReduce job tracker runs

>   at.  If "local", then jobs are run in-process as a single map

>   and reduce task.

>   </description>

> </property>

> <property>

>   <name>mapred.map.tasks</name>

>   <value>40</value>

>   <description>As a rule of thumb, use 10x the number of slaves (i.e.,

> number of tasktrackers).

>   </description>

> </property>

>

> <property>

>   <name>mapred.reduce.tasks</name>

>   <value>40</value>

>   <description>As a rule of thumb, use 2x the number of slave processors

> (i.e., number of tasktrackers).

>   </description>

> </property>

> </configuration>

>

>

> I replicated all the above on all the other 3 slave machines[1 master + 3

> slaves]. My /etc/hosts content is as below on the master node. Note that,

> I've the same content on slaves as well, the only difference is, its own IP

> is set to 127.0.0.1 and for others the exact IP is set:

>

> ------------------------------/etc/hosts content:

> 127.0.0.1 localhost

> 127.0.0.1 nutchcluster1

> 10.111.59.96 nutchcluster2

> 10.201.223.79 nutchcluster3

> 10.190.117.68 nutchcluster4

>

> # The following lines are desirable for IPv6 capable hosts

> ::1 ip6-localhost ip6-loopback

> fe00::0 ip6-localnet

> ff00::0 ip6-mcastprefix

> ff02::1 ip6-allnodes

> ff02::2 ip6-allrouters

> ff02::3 ip6-allhosts

>

> -----------------/etc/hosts content ends here

>

> File content for masters is :

> nutchcluster1

>

> and file content for slaves is:

> nutchcluster1

> nutchcluster2

> nutchcluster3

> nutchcluster4

>

> Then, I copied all the contents relevant to config[*-site.xml, *.env files]

> folder on all the slaves.

>

> As per the steps, I'm starting the HDFS using: bin/start-dfs.sh  and then

> I'm starting the mapreduce as : bin/start-mapred.sh

> After running the above two on my master machine[nutchcluster1] I can see

> the following jps output:

> hadoop@nutchcluster1:~/hadoop$ jps

> 6401 SecondaryNameNode

> 6723 TaskTracker

> 6224 DataNode

> 6540 JobTracker

> 7354 Jps

> 6040 NameNode

> hadoop@nutchcluster1:~/hadoop$

>

>

> and on the slaves, the jps output is:

> hadoop@nutchcluster2:~/hadoop$ jps

> 8952 DataNode

> 9104 TaskTracker

> 9388 Jps

> hadoop@nutchcluster2:~/hadoop$

>

>

> This clearly indicates that the port 54310 is accessible from the master

> only and not from the slaves. This is the point I'm stuck at and would

> appreciate if someone could point me what config is missing or what is

> wrong. Any comment/feedback, in this regard would be highly appreciated.

> Thanks in advance.

>

>

> Regards,

> DW







--

Harsh J



-- 
Nitin Pawar


Re: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]

Posted by Nitin Pawar <ni...@gmail.com>.
also,

if you want to set up a hadoop cluster on AWS, just try using Whirr.
Basically it does everything for you


On Sun, Dec 2, 2012 at 10:12 PM, Harsh J <ha...@cloudera.com> wrote:

> Your problem is that your /etc/hosts file has the line:
>
> 127.0.0.1 nutchcluster1
>
> Just delete that line, restart your services. You intend your hostname
> "nutchcluster1" to be externally accessible, so aliasing it to the
> loopback address (127.0.0.1) is not right.
>
> On Sun, Dec 2, 2012 at 10:08 PM, A Geek <dw...@live.com> wrote:
> > Hi,
> > Just to add the version details: I'm running Apache Hadoop release 1.0.4
> > with jdk1.6.0_37 . The underlying Ubuntu 12.04 machine has got 300GB disk
> > space and has 1.7GB RAM and is a single core machine.
> >
> > Regards,
> > DW
> >
> > ________________________________
> > From: dw.90@live.com
> > To: user@hadoop.apache.org
> > Subject: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based
> > machines]
> > Date: Sun, 2 Dec 2012 15:55:09 +0000
> >
> >
> > Hi All,
> > I'm trying to setup Hadoop Cluster using 4 machines[4 x Ubuntu 12.04
> x_64].
> > Using the following doc:
> >                   1.
> >
> http://titan.softnet.tuc.gr:8082/User:xenia/Page_Title/Hadoop_Cluster_Setup_Tutorial
> >
> > I'm able to setup hadoop clusters with required configurations. I can see
> > that all the required services on master and on slaves nodes are running
> as
> > required[please see below JPS command output ]. The problem, I'm facing
> is
> > that, the HDFS and Mapreduce daemons running on Master and can be
> accessed
> > from Master only, and not from the slave machines. Note that, I've added
> > these ports in the EC2 security group to open them. And I can browse the
> > master machines UI from web browser, using: http://<machine
> > ip>:50070/dfshealth.jsp
> >
> >
> > Now, the problem which I'm facing is , the HDFS as well the jobtracker
> both
> > are accessible from the master machine[I'm using master as both Namenode
> and
> > Datanode] but both the ports[hdfs: 54310 and mapreduce: 54320] used for
> > these two are not accessible from other slave nodes.
> >
> > I did: netstat -puntl on master machine and got this:
> >
> > hadoop@nutchcluster1:~/hadoop$ netstat -puntl
> > (Not all processes could be identified, non-owned process info
> >  will not be shown, you would have to be root to see it all.)
> > Active Internet connections (only servers)
> > Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> > tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
> > tcp6       0      0 :::50020                :::*                    LISTEN      6224/java
> > tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN      6040/java
> > tcp6       0      0 127.0.0.1:32776         :::*                    LISTEN      6723/java
> > tcp6       0      0 :::57065                :::*                    LISTEN      6040/java
> > tcp6       0      0 :::50090                :::*                    LISTEN      6401/java
> > tcp6       0      0 :::50060                :::*                    LISTEN      6723/java
> > tcp6       0      0 :::50030                :::*                    LISTEN      6540/java
> > tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN      6540/java
> > tcp6       0      0 :::45747                :::*                    LISTEN      6401/java
> > tcp6       0      0 :::33174                :::*                    LISTEN      6540/java
> > tcp6       0      0 :::50070                :::*                    LISTEN      6040/java
> > tcp6       0      0 :::22                   :::*                    LISTEN      -
> > tcp6       0      0 :::54424                :::*                    LISTEN      6224/java
> > tcp6       0      0 :::50010                :::*                    LISTEN      6224/java
> > tcp6       0      0 :::50075                :::*                    LISTEN      6224/java
> > udp        0      0 0.0.0.0:68              0.0.0.0:*                           -
> > hadoop@nutchcluster1:~/hadoop$
> >
> >
> > As can be seen in the output, both the HDFS daemon and mapreduce daemons are
> > accessible, but only from 127.0.0.1 and not from 0.0.0.0 [any machine/slave machines]:
> > tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN      6040/java
> > tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN      6540/java
> >
> >
> > To confirm the same, I did this on master:
> > hadoop@nutchcluster1:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
> > Found 1 items
> > drwxr-xr-x   - hadoop supergroup          0 2012-12-02 12:53 /home
> > hadoop@nutchcluster1:~/hadoop$
> >
> > But, when I ran the same command on slaves, I get this:
> > hadoop@nutchcluster2:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
> > 12/12/02 15:42:16 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 0 time(s).
> > 12/12/02 15:42:17 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 1 time(s).
> > 12/12/02 15:42:18 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 2 time(s).
> > 12/12/02 15:42:19 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 3 time(s).
> > 12/12/02 15:42:20 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 4 time(s).
> > 12/12/02 15:42:21 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 5 time(s).
> > 12/12/02 15:42:22 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 6 time(s).
> > 12/12/02 15:42:23 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 7 time(s).
> > 12/12/02 15:42:24 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 8 time(s).
> > 12/12/02 15:42:25 INFO ipc.Client: Retrying connect to server:
> > nutchcluster1/10.4.39.23:54310. Already tried 9 time(s).
> > Bad connection to FS. command aborted. exception: Call to
> > nutchcluster1/10.4.39.23:54310 failed on connection exception:
> > java.net.ConnectException: Connection refused
> > hadoop@nutchcluster2:~/hadoop$
> >
> >
> >
> > The configurations are as below:
> >
> > --------------core-site.xml content is as below
> > <property>
> >   <name>hadoop.tmp.dir</name>
> >   <value>/home/hadoop/hadoop/datastore/hadoop-${user.name}</value>
> >   <description>A base for other temporary directories</description>
> > </property>
> >
> > <property>
> >   <name>fs.default.name</name>
> >   <value>hdfs://nutchcluster1:54310</value>
> >   <description>The name of the default file system.  A URI whose
> >   scheme and authority determine the FileSystem implementation.  The
> >   uri's scheme determines the config property (fs.SCHEME.impl) naming
> >   the FileSystem implementation class.  The uri's authority is used to
> >   determine the host, port, etc. for a filesystem.</description>
> > </property>
> >
> >
> > -----------------hdfs-site.xml content is as below:
> > <configuration>
> > <property>
> >   <name>dfs.replication</name>
> >   <value>2</value>
> >   <description>Default block replication.
> >   The actual number of replications can be specified when the file is
> > created.
> >   The default is used if replication is not specified in create time.
> >   </description>
> > </property>
> > </configuration>
> >
> >
> >
> > ------------------mapred-site.xml content is as:
> > <configuration>
> > <property>
> >   <name>mapred.job.tracker</name>
> >   <value>nutchcluster1:54320</value>
> >   <description>The host and port that the MapReduce job tracker runs
> >   at.  If "local", then jobs are run in-process as a single map
> >   and reduce task.
> >   </description>
> > </property>
> > <property>
> >   <name>mapred.map.tasks</name>
> >   <value>40</value>
> >   <description>As a rule of thumb, use 10x the number of slaves (i.e.,
> > number of tasktrackers).
> >   </description>
> > </property>
> >
> > <property>
> >   <name>mapred.reduce.tasks</name>
> >   <value>40</value>
> >   <description>As a rule of thumb, use 2x the number of slave processors
> > (i.e., number of tasktrackers).
> >   </description>
> > </property>
> > </configuration>
> >
> >
> > I replicated all the above on all the other 3 slave machines[1 master + 3
> > slaves]. My /etc/hosts content is as below on the master node. Note that,
> > I've the same content on slaves as well, the only difference is, its own IP
> > is set to 127.0.0.1 and for others the exact IP is set:
> >
> > ------------------------------/etc/hosts content:
> > 127.0.0.1 localhost
> > 127.0.0.1 nutchcluster1
> > 10.111.59.96 nutchcluster2
> > 10.201.223.79 nutchcluster3
> > 10.190.117.68 nutchcluster4
> >
> > # The following lines are desirable for IPv6 capable hosts
> > ::1 ip6-localhost ip6-loopback
> > fe00::0 ip6-localnet
> > ff00::0 ip6-mcastprefix
> > ff02::1 ip6-allnodes
> > ff02::2 ip6-allrouters
> > ff02::3 ip6-allhosts
> >
> > -----------------/etc/hosts content ends here
> >
> > File content for masters is :
> > nutchcluster1
> >
> > and file content for slaves is:
> > nutchcluster1
> > nutchcluster2
> > nutchcluster3
> > nutchcluster4
> >
> > Then, I copied all the contents of the relevant config [*-site.xml, *.env files]
> > folder to all the slaves.
> >
> > As per the steps, I'm starting the HDFS using: bin/start-dfs.sh  and then
> > I'm starting the mapreduce as: bin/start-mapred.sh
> > After running the above two on my master machine [nutchcluster1] I can see
> > the following jps output:
> > hadoop@nutchcluster1:~/hadoop$ jps
> > 6401 SecondaryNameNode
> > 6723 TaskTracker
> > 6224 DataNode
> > 6540 JobTracker
> > 7354 Jps
> > 6040 NameNode
> > hadoop@nutchcluster1:~/hadoop$
> >
> >
> > and on the slaves, the jps output is:
> > hadoop@nutchcluster2:~/hadoop$ jps
> > 8952 DataNode
> > 9104 TaskTracker
> > 9388 Jps
> > hadoop@nutchcluster2:~/hadoop$
> >
> >
> > This clearly indicates that the port 54310 is accessible from the master
> > only and not from the slaves. This is the point I'm stuck at and would
> > appreciate if someone could point me what config is missing or what is
> > wrong. Any comment/feedback, in this regard would be highly appreciated.
> > Thanks in advance.
> >
> >
> > Regards,
> > DW
>
>
>
> --
> Harsh J
>



-- 
Nitin Pawar

Re: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]

Posted by Harsh J <ha...@cloudera.com>.
Your problem is that your /etc/hosts file has the line:

127.0.0.1 nutchcluster1

Just delete that line, restart your services. You intend your hostname
"nutchcluster1" to be externally accessible, so aliasing it to the
loopback address (127.0.0.1) is not right.
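
As a concrete sketch of that fix: /etc/hosts on the master would end up looking like the lines below (10.4.39.23 is the master's private IP as it appears in the slave's connection log further down; substitute your own addresses, and keep the IPv6 lines as they are), after which the daemons need a restart so the NameNode and JobTracker re-bind to the externally reachable address:

127.0.0.1     localhost
10.4.39.23    nutchcluster1
10.111.59.96  nutchcluster2
10.201.223.79 nutchcluster3
10.190.117.68 nutchcluster4

# on the master, bounce the daemons (Hadoop 1.x scripts)
bin/stop-mapred.sh && bin/stop-dfs.sh
bin/start-dfs.sh && bin/start-mapred.sh

# ports 54310/54320 should now show the private address, not 127.0.0.1
netstat -ntl | grep -E '54310|54320'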

On Sun, Dec 2, 2012 at 10:08 PM, A Geek <dw...@live.com> wrote:
> Hi,
> Just to add the version details: I'm running Apache Hadoop release 1.0.4
> with jdk1.6.0_37 . The underlying Ubuntu 12.04 machine has got 300GB disk
> space and has 1.7GB RAM and is a single core machine.
>
> Regards,
> DW
>
> ________________________________
> From: dw.90@live.com
> To: user@hadoop.apache.org
> Subject: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based
> machines]
> Date: Sun, 2 Dec 2012 15:55:09 +0000
>
>
> Hi All,
> I'm trying to setup Hadoop Cluster using 4 machines [4 x Ubuntu 12.04 x64].
> Using the following doc:
>                   1.
> http://titan.softnet.tuc.gr:8082/User:xenia/Page_Title/Hadoop_Cluster_Setup_Tutorial
>
> I'm able to setup hadoop clusters with required configurations. I can see
> that all the required services on master and on slaves nodes are running as
> required [please see below JPS command output]. The problem I'm facing is
> that the HDFS and Mapreduce daemons are running on Master and can be accessed
> from Master only, and not from the slave machines. Note that I've added
> these ports in the EC2 security group to open them. And I can browse the
> master machines UI from web browser, using: http://<machine
> ip>:50070/dfshealth.jsp
>
>
> Now, the problem which I'm facing is, the HDFS as well as the jobtracker both
> are accessible from the master machine [I'm using master as both Namenode and
> Datanode], but both the ports [hdfs: 54310 and mapreduce: 54320] used for
> these two are not accessible from other slave nodes.
>
> I did: netstat -puntl on master machine and got this:
>
> hadoop@nutchcluster1:~/hadoop$ netstat -puntl
> (Not all processes could be identified, non-owned process info
>  will not be shown, you would have to be root to see it all.)
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
> tcp6       0      0 :::50020                :::*                    LISTEN      6224/java
> tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN      6040/java
> tcp6       0      0 127.0.0.1:32776         :::*                    LISTEN      6723/java
> tcp6       0      0 :::57065                :::*                    LISTEN      6040/java
> tcp6       0      0 :::50090                :::*                    LISTEN      6401/java
> tcp6       0      0 :::50060                :::*                    LISTEN      6723/java
> tcp6       0      0 :::50030                :::*                    LISTEN      6540/java
> tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN      6540/java
> tcp6       0      0 :::45747                :::*                    LISTEN      6401/java
> tcp6       0      0 :::33174                :::*                    LISTEN      6540/java
> tcp6       0      0 :::50070                :::*                    LISTEN      6040/java
> tcp6       0      0 :::22                   :::*                    LISTEN      -
> tcp6       0      0 :::54424                :::*                    LISTEN      6224/java
> tcp6       0      0 :::50010                :::*                    LISTEN      6224/java
> tcp6       0      0 :::50075                :::*                    LISTEN      6224/java
> udp        0      0 0.0.0.0:68              0.0.0.0:*                           -
> hadoop@nutchcluster1:~/hadoop$
>
>
> As can be seen in the output, both the HDFS daemon and mapreduce daemons are
> accessible, but only from 127.0.0.1 and not from 0.0.0.0 [any machine/slave
> machines]
> tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN      6040/java
> tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN      6540/java
>
>
> To confirm the same, I did this on master:
> hadoop@nutchcluster1:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
> Found 1 items
> drwxr-xr-x   - hadoop supergroup          0 2012-12-02 12:53 /home
> hadoop@nutchcluster1:~/hadoop$
>
> But, when I ran the same command on slaves, I get this:
> hadoop@nutchcluster2:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
> 12/12/02 15:42:16 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 0 time(s).
> 12/12/02 15:42:17 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 1 time(s).
> 12/12/02 15:42:18 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 2 time(s).
> 12/12/02 15:42:19 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 3 time(s).
> 12/12/02 15:42:20 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 4 time(s).
> 12/12/02 15:42:21 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 5 time(s).
> 12/12/02 15:42:22 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 6 time(s).
> 12/12/02 15:42:23 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 7 time(s).
> 12/12/02 15:42:24 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 8 time(s).
> 12/12/02 15:42:25 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 9 time(s).
> Bad connection to FS. command aborted. exception: Call to
> nutchcluster1/10.4.39.23:54310 failed on connection exception:
> java.net.ConnectException: Connection refused
> hadoop@nutchcluster2:~/hadoop$
>
>
>
> The configurations are as below:
>
> --------------core-site.xml content is as below
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/home/hadoop/hadoop/datastore/hadoop-${user.name}</value>
>   <description>A base for other temporary directories</description>
> </property>
>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://nutchcluster1:54310</value>
>   <description>The name of the default file system.  A URI whose
>   scheme and authority determine the FileSystem implementation.  The
>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>   the FileSystem implementation class.  The uri's authority is used to
>   determine the host, port, etc. for a filesystem.</description>
> </property>
>
>
> -----------------hdfs-site.xml content is as below:
> <configuration>
> <property>
>   <name>dfs.replication</name>
>   <value>2</value>
>   <description>Default block replication.
>   The actual number of replications can be specified when the file is
> created.
>   The default is used if replication is not specified in create time.
>   </description>
> </property>
> </configuration>
>
>
>
> ------------------mapred-site.xml content is as:
> <configuration>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>nutchcluster1:54320</value>
>   <description>The host and port that the MapReduce job tracker runs
>   at.  If "local", then jobs are run in-process as a single map
>   and reduce task.
>   </description>
> </property>
> <property>
>   <name>mapred.map.tasks</name>
>   <value>40</value>
>   <description>As a rule of thumb, use 10x the number of slaves (i.e.,
> number of tasktrackers).
>   </description>
> </property>
>
> <property>
>   <name>mapred.reduce.tasks</name>
>   <value>40</value>
>   <description>As a rule of thumb, use 2x the number of slave processors
> (i.e., number of tasktrackers).
>   </description>
> </property>
> </configuration>
>
>
> I replicated all the above on all the other 3 slave machines[1 master + 3
> slaves]. My /etc/hosts content is as below on the master node. Note that,
> I've the same content on slaves as well, the only difference is, its own IP
> is set to 127.0.0.1 and for others the exact IP is set:
>
> ------------------------------/etc/hosts content:
> 127.0.0.1 localhost
> 127.0.0.1 nutchcluster1
> 10.111.59.96 nutchcluster2
> 10.201.223.79 nutchcluster3
> 10.190.117.68 nutchcluster4
>
> # The following lines are desirable for IPv6 capable hosts
> ::1 ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> ff02::3 ip6-allhosts
>
> -----------------/etc/hosts content ends here
>
> File content for masters is :
> nutchcluster1
>
> and file content for slaves is:
> nutchcluster1
> nutchcluster2
> nutchcluster3
> nutchcluster4
>
> Then, I copied all the contents relevant to config[*-site.xml, *.env files]
> folder on all the slaves.
>
> As per the steps, I'm starting the HDFS using: bin/start-dfs.sh  and then
> I'm starting the mapreduce as: bin/start-mapred.sh
> After running the above two on my master machine [nutchcluster1] I can see
> the following jps output:
> hadoop@nutchcluster1:~/hadoop$ jps
> 6401 SecondaryNameNode
> 6723 TaskTracker
> 6224 DataNode
> 6540 JobTracker
> 7354 Jps
> 6040 NameNode
> hadoop@nutchcluster1:~/hadoop$
>
>
> and on the slaves, the jps output is:
> hadoop@nutchcluster2:~/hadoop$ jps
> 8952 DataNode
> 9104 TaskTracker
> 9388 Jps
> hadoop@nutchcluster2:~/hadoop$
>
>
> This clearly indicates that the port 54310 is accessible from the master
> only and not from the slaves. This is the point I'm stuck at and would
> appreciate if someone could point me what config is missing or what is
> wrong. Any comment/feedback, in this regard would be highly appreciated.
> Thanks in advance.
>
>
> Regards,
> DW



-- 
Harsh J

Re: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]

Posted by Harsh J <ha...@cloudera.com>.
Your problem is that your /etc/hosts file has the line:

127.0.0.1 nutchcluster1

Just delete that line, restart your services. You intend your hostname
"nutchcluster1" to be externally accessible, so aliasing it to the
loopback address (127.0.0.1) is not right.

On Sun, Dec 2, 2012 at 10:08 PM, A Geek <dw...@live.com> wrote:
> Hi,
> Just to add the version details: I'm running Apache Hadoop release 1.0.4
> with jdk1.6.0_37 . The underlying Ubuntu 12.04 machine has got 300GB disk
> space and has 1.7GB RAM and is a single core machine.
>
> Regards,
> DW
>
> ________________________________
> From: dw.90@live.com
> To: user@hadoop.apache.org
> Subject: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based
> machines]
> Date: Sun, 2 Dec 2012 15:55:09 +0000
>
>
> Hi All,
> I'm trying to setup Hadoop Cluster using 4 machines[4 x Ubuntu 12.04 x_64].
> Using the following doc:
>                   1.
> http://titan.softnet.tuc.gr:8082/User:xenia/Page_Title/Hadoop_Cluster_Setup_Tutorial
>
> I'm able to setup hadoop clusters with required configurations. I can see
> that all the required services on master and on slaves nodes are running as
> required[please see below JPS command output ]. The problem, I'm facing is
> that, the HDFS and Mapreduce daemons running on Master and can be accessed
> from Master only, and not from the slave machines. Note that, I've added
> these ports in the EC2 security group to open them. And I can browse the
> master machines UI from web browser, using: http://<machine
> ip>:50070/dfshealth.jsp
>
>
> Now, the problem which I'm facing is , the HDFS as well the jobtracker both
> are accessible from the master machine[I'm using master as both Namenode and
> Datanode] but both the ports[hdfs: 54310 and mapreduce: 54320] used for
> these two are not accessible from other slave nodes.
>
> I did: netstat -puntl on master machine and got this:
>
> hadoop@nutchcluster1:~/hadoop$ netstat -puntl
> (Not all processes could be identified, non-owned process info
>  will not be shown, you would have to be root to see it all.)
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State
> PID/Program name
> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
> -
> tcp6       0      0 :::50020                :::*                    LISTEN
> 6224/java
> tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN
> 6040/java
> tcp6       0      0 127.0.0.1:32776         :::*                    LISTEN
> 6723/java
> tcp6       0      0 :::57065                :::*                    LISTEN
> 6040/java
> tcp6       0      0 :::50090                :::*                    LISTEN
> 6401/java
> tcp6       0      0 :::50060                :::*                    LISTEN
> 6723/java
> tcp6       0      0 :::50030                :::*                    LISTEN
> 6540/java
> tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN
> 6540/java
> tcp6       0      0 :::45747                :::*                    LISTEN
> 6401/java
> tcp6       0      0 :::33174                :::*                    LISTEN
> 6540/java
> tcp6       0      0 :::50070                :::*                    LISTEN
> 6040/java
> tcp6       0      0 :::22                   :::*                    LISTEN
> -
> tcp6       0      0 :::54424                :::*                    LISTEN
> 6224/java
> tcp6       0      0 :::50010                :::*                    LISTEN
> 6224/java
> tcp6       0      0 :::50075                :::*                    LISTEN
> 6224/java
> udp        0      0 0.0.0.0:68              0.0.0.0:*
> -
> hadoop@nutchcluster1:~/hadoop$
>
>
> As can be seen in the output, both the HDFS daemon and mapreduce daemons are
> accessible, but only from 127.0.0.1 and not from 0.0.0.0 [any machine/slave
> machines]
> tcp6       0      0 127.0.0.1:54310         :::*                    LISTEN
> 6040/java
> tcp6       0      0 127.0.0.1:54320         :::*                    LISTEN
> 6540/java
>
>
> To confirm, the same Idid this on master:
> adoop@nutchcluster1:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
> Found 1 items
> drwxr-xr-x   - hadoop supergroup          0 2012-12-02 12:53 /home
> hadoop@nutchcluster1:~/hadoop$
>
> But, when I ran the same command on slaves, I get this:
> hadoop@nutchcluster2:~/hadoop$ bin/hadoop fs -ls hdfs://nutchcluster1:54310/
> 12/12/02 15:42:16 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 0 time(s).
> 12/12/02 15:42:17 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 1 time(s).
> 12/12/02 15:42:18 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 2 time(s).
> 12/12/02 15:42:19 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 3 time(s).
> 12/12/02 15:42:20 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 4 time(s).
> 12/12/02 15:42:21 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 5 time(s).
> 12/12/02 15:42:22 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 6 time(s).
> 12/12/02 15:42:23 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 7 time(s).
> 12/12/02 15:42:24 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 8 time(s).
> 12/12/02 15:42:25 INFO ipc.Client: Retrying connect to server:
> nutchcluster1/10.4.39.23:54310. Already tried 9 time(s).
> Bad connection to FS. command aborted. exception: Call to
> nutchcluster1/10.4.39.23:54310 failed on connection exception:
> java.net.ConnectException: Connection refused
> hadoop@nutchcluster2:~/hadoop$
>
>
>
> The configurations are as below:
>
> --------------core-site.xml content is as below
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/home/hadoop/hadoop/datastore/hadoop-${user.name}</value>
>   <description>A base for other temporary directories</description>
> </property>
>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://nutchcluster1:54310</value>
>   <description>The name of the default file system.  A URI whose
>   scheme and authority determine the FileSystem implementation.  The
>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>   the FileSystem implementation class.  The uri's authority is used to
>   determine the host, port, etc. for a filesystem.</description>
> </property>
>
>
> -----------------hdfs-site.xml content is as below:
> <configuration>
> <property>
>   <name>dfs.replication</name>
>   <value>2</value>
>   <description>Default block replication.
>   The actual number of replications can be specified when the file is
> created.
>   The default is used if replication is not specified in create time.
>   </description>
> </property>
> </configuration>
>
>
>
> ------------------mapred-site.xml content is as:
> <configuration>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>nutchcluster1:54320</value>
>   <description>The host and port that the MapReduce job tracker runs
>   at.  If "local", then jobs are run in-process as a single map
>   and reduce task.
>   </description>
> </property>
> <property>
>   <name>mapred.map.tasks</name>
>   <value>40</value>
>   <description>As a rule of thumb, use 10x the number of slaves (i.e.,
> number of tasktrackers).
>   </description>
> </property>
>
> <property>
>   <name>mapred.reduce.tasks</name>
>   <value>40</value>
>   <description>As a rule of thumb, use 2x the number of slave processors
> (i.e., number of tasktrackers).
>   </description>
> </property>
> </configuration>
>
>
> I replicated all the above on all the other 3 slave machines[1 master + 3
> slaves]. My /etc/hosts content is as below on the master node. Note that,
> I've thee same content on slaves as well, the only difference is, its own IP
> is set to 127.0.0.1 and for others the exact IP is set:
>
> ------------------------------/etc/hosts conent:
> 127.0.0.1 localhost
> 127.0.0.1 nutchcluster1
> 10.111.59.96 nutchcluster2
> 10.201.223.79 nutchcluster3
> 10.190.117.68 nutchcluster4
>
> # The following lines are desirable for IPv6 capable hosts
> ::1 ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> ff02::3 ip6-allhosts
>
> -----------------/etc/hosts content ends here
>
> File content for masters is :
> nutchcluster1
>
> and file content for slaves is:
> nutchcluster1
> nutchcluster2
> nutchcluster3
> nutchcluster4
>
> Then, I copied all the contents relevant to config[*-site.xml, *.env files]
> folder on all the slaves.
>
> As per the steps, I'm starting the HDFS using: bin/start-dfs.sh  and then
> I'm starting the mapreduce as : bin/start-mapred.sh
> After running the above two on my master machine[nuthccluster1] I can see
> the following jps output:
> adoop@nutchcluster1:~/hadoop$ jps
> 6401 SecondaryNameNode
> 6723 TaskTracker
> 6224 DataNode
> 6540 JobTracker
> 7354 Jps
> 6040 NameNode
> hadoop@nutchcluster1:~/hadoop$
>
>
> and on the slaves, the jps output is:
> hadoop@nutchcluster2:~/hadoop$ jps
> 8952 DataNode
> 9104 TaskTracker
> 9388 Jps
> hadoop@nutchcluster2:~/hadoop$
>
>
> This clearly indicates that port 54310 is accessible from the master
> only and not from the slaves. This is the point I'm stuck at, and I would
> appreciate it if someone could point out what config is missing or what is
> wrong. Any comment/feedback in this regard would be highly appreciated.
> Thanks in advance.
>
>
> Regards,
> DW



-- 
Harsh J
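
As a sketch of the suggested change, the master's /etc/hosts could look like
the following after the loopback alias is dropped; mapping nutchcluster1 to
10.4.39.23 is an assumption based on the address the slaves resolve it to in
the connection errors above, so substitute the master's actual private IP:

127.0.0.1 localhost
10.4.39.23 nutchcluster1
10.111.59.96 nutchcluster2
10.201.223.79 nutchcluster3
10.190.117.68 nutchcluster4

The daemons then need a restart (bin/stop-mapred.sh and bin/stop-dfs.sh,
followed by bin/start-dfs.sh and bin/start-mapred.sh) so that the NameNode
and JobTracker bind to the externally reachable address rather than
127.0.0.1.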


RE: Hadoop Cluster setup on EC2 instances [Ubuntu 12.04 x64 based machines]

Posted by A Geek <dw...@live.com>.
Hi,
Just to add the version details: I'm running Apache Hadoop release 1.0.4 with jdk1.6.0_37. The underlying Ubuntu 12.04 machine has 300GB of disk space, 1.7GB of RAM, and a single core.

Regards,
DW
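
Once the hosts file is corrected and the services restarted, a rough way to
confirm the fix (assuming the setup described in this thread) is to repeat
the checks already used above: on the master, the 54310 and 54320 listeners
should no longer be bound to 127.0.0.1, and the listing should now succeed
from a slave.

On the master:
netstat -puntl | grep -E '54310|54320'

On any slave, e.g. nutchcluster2:
bin/hadoop fs -ls hdfs://nutchcluster1:54310/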
