Posted to common-user@hadoop.apache.org by Sridhar Raman <sr...@gmail.com> on 2008/05/09 09:27:23 UTC

Problem in 2-node cluster

This is the setup I have: 2 machines - master, slave.
master is both a namenode and a datanode.
slave is just a datanode.

In the master machine, I have configured the conf/hadoop-site.xml files in
both machines so that fs.default.name = hdfs://master:54310, and
mapred.job.tracker = master:54311.
Also, the conf/masters has only master as the entry.  And conf/slaves has
master and slave as entries.
(NOTE: I have these exact same config in the slave machine as well)
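For reference, the relevant part of conf/hadoop-site.xml would look roughly like this (the property names and values are the ones quoted above; the surrounding XML is the standard Hadoop configuration-file layout of that era):

```xml
<configuration>
  <!-- NameNode address: all HDFS clients and datanodes connect here -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
  <!-- JobTracker address for MapReduce -->
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>
</configuration>
```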

I start off with bin/hadoop namenode -format.  This creates a hadoop-user
folder on the machine.
Then I do bin/start-dfs.sh.  According to this page, the output I should get
is the following: (EXPECTED OUTPUT)

> starting namenode ...
> slave: starting datanode ...
> master: starting datanode ...
> master: starting secondarynamenode
>

But what I actually get is the following: (ACTUAL OUTPUT)

> starting namenode ...
> slave: starting datanode ...
> : no address associated with name master
> master: starting secondarynamenode
>

I think the critical line is the "no address associated ..." one.  Because
of this, I end up with just 1 datanode.

When I try http://localhost:50030/, the Cluster Summary shows just 1 node.

Any idea why I might be getting this?

I checked the likely causes:
* Checked whether the slave has master in /etc/hosts.  Yes, it does.
* Checked whether the master has master in /etc/hosts.  Yes, it does.
* Checked whether I can ssh to master from slave, and vice versa.
Yes, I can.
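One more check worth doing is to resolve each hostname the same way the start scripts' ssh calls would.  A sketch (the sample file path is hypothetical; on the cluster you would read conf/slaves instead):

```shell
# Sketch: check that each hostname in a slaves file actually resolves.
# /tmp/slaves.sample is a stand-in for conf/slaves on a real cluster.
printf 'localhost\n' > /tmp/slaves.sample

while read -r host; do
  # 'getent hosts' consults /etc/hosts and DNS, much like ssh does
  if getent hosts "$host" > /dev/null; then
    echo "$host: resolves"
  else
    echo "$host: no address associated with name"
  fi
done < /tmp/slaves.sample
```

If a hostname that is clearly present in /etc/hosts still fails here, the name being looked up may not be the name you think it is, e.g. it may carry invisible trailing characters.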

So, I am stumped as of now.  Any ideas?

Thanks,
Sridhar

Re: Problem in 2-node cluster

Posted by Sridhar Raman <sr...@gmail.com>.
Woah!  I managed to fix it, but god!  It was a bug so ... I don't know ...
hard to even name.

The first thing I noticed was that if my conf/slaves file had,

> master
> slave
>

I would get "no address associated with name master", and if it was,

> slave
> master
>

I would get "no address associated with name slave".

Then we realised that it was probably down to incompatible line endings.  I
was editing my files in Textpad, which saves Windows-style CRLF line
endings, but the start scripts run under Cygwin expect Unix-style LF
endings, so the hostnames read from conf/slaves carried a stray carriage
return that the resolver could never match.  So I saved the conf/slaves
file in Unix mode (available in Textpad) ... and voila! it worked like a
dream!  Whew!
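For anyone hitting the same thing, the symptom can be made visible and fixed from the shell.  This is a sketch on a temporary file (conf/slaves would be the real target); od exposes the hidden carriage returns, and tr (or dos2unix, where installed) strips them:

```shell
# Simulate a slaves file saved with Windows CRLF line endings
printf 'master\r\nslave\r\n' > /tmp/slaves.crlf

# od -c shows the stray \r before each \n
od -c /tmp/slaves.crlf

# Strip the carriage returns to get Unix LF endings
tr -d '\r' < /tmp/slaves.crlf > /tmp/slaves.unix
od -c /tmp/slaves.unix
```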

On Fri, May 9, 2008 at 12:57 PM, Sridhar Raman <sr...@gmail.com>
wrote:
