Posted to common-user@hadoop.apache.org by Tiger Uppercut <ge...@gmail.com> on 2007/03/26 09:27:55 UTC

setting up hadoop on a single node, vanilla arguments

Hi,

[I tried googling and searching the mailing list for a similar
problem, but I couldn't find one this basic :]

I just tried to install hadoop on a single node, a 64-bit box running Ubuntu.

[tiger]$ uname -a
Linux  2.6.16-gentoo-r9  x86_64 Intel(R) Xeon(TM) CPU 3.60GHz
GenuineIntel GNU/Linux

I started start-all.sh, which appeared to work well:

[tiger]$ $HADOOP_INSTALL/bin/start-all.sh

However, when I tried to run the wordcount example
(http://wiki.apache.org/lucene-hadoop/WordCount), hadoop couldn't
connect to my localhost:

[tiger]$ bin/hadoop jar hadoop-0.12.2-examples.jar wordcount input_dir
output_dir
07/03/26 00:22:37 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 1 time(s).
07/03/26 00:22:37 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 2 time(s).
...
java.lang.RuntimeException: java.net.ConnectException: Connection refused

Please let me know if you have any thoughts!

Thanks,

-Tiger


See below for my hadoop-site.xml settings, derived from
(http://wiki.apache.org/nutch/NutchHadoopTutorial)

<configuration>

<property>
  <name>fs.default.name</name>
  <value>localhost:9000</value>
</property>

<!-- map/reduce properties -->

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>/some_dir/hadoop/hadoop_data</value>
</property>

</configuration>
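
(Note: one common cause of "Connection refused" on port 9000 is that the
namenode daemon never actually came up. A minimal sanity check, assuming
a 0.12.x layout where the logs end up under $HADOOP_INSTALL/logs, might
look like:

bin/hadoop namenode -format    # format the DFS once, before the first start
bin/start-all.sh               # start namenode, datanode, jobtracker, tasktracker
jps                            # the daemon JVMs should all be listed here
tail -50 $HADOOP_INSTALL/logs/*namenode*.log   # if NameNode is missing, the reason is usually here

The exact log file name is an assumption; anything under the logs
directory mentioning the namenode is worth a look.)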

Re: setting up hadoop on a single node, vanilla arguments

Posted by Arun C Murthy <ar...@yahoo-inc.com>.
Hi,

Tiger Uppercut wrote:
> Hi,
> 
> 
> I just tried to install hadoop on a single node, a 64-bit box running 
> Ubuntu.
> 

Could you ensure passphraseless-ssh works on your localhost as detailed in:
http://lucene.apache.org/hadoop/api/index.html -> 'Pseudo-distributed 
operation'

Some more info here:
http://wiki.apache.org/lucene-hadoop/GettingStartedWithHadoop
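
A minimal way to set that up and verify it, assuming OpenSSH, is
something like:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh localhost echo ok    # should print "ok" without prompting for a passphrase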

hth,
Arun

> [tiger]$ uname -a
> Linux  2.6.16-gentoo-r9  x86_64 Intel(R) Xeon(TM) CPU 3.60GHz
> GenuineIntel GNU/Linux
> 
> I started start-all.sh, which appeared to work well:
> 
> [tiger]$ $HADOOP_INSTALL/bin/start-all.sh
> 
> However, when I tried to run the wordcount example
> (http://wiki.apache.org/lucene-hadoop/WordCount), hadoop couldn't
> connect to my localhost:
> 
> [tiger]$ bin/hadoop jar hadoop-0.12.2-examples.jar wordcount input_dir
> output_dir
> 07/03/26 00:22:37 INFO ipc.Client: Retrying connect to server:
> localhost/127.0.0.1:9000. Already tried 1 time(s).
> 07/03/26 00:22:37 INFO ipc.Client: Retrying connect to server:
> localhost/127.0.0.1:9000. Already tried 2 time(s).
> ...
> java.lang.RuntimeException: java.net.ConnectException: Connection refused
> 
> Please let me know if you have any thoughts!
> 
> Thanks,
> 
> -Tiger
> 
> 
> See below for my hadoop-site.xml settings, derived from
> (http://wiki.apache.org/nutch/NutchHadoopTutorial)
> 
> <configuration>
> 
> <property>
>  <name>fs.default.name</name>
>  <value>localhost:9000</value>
> </property>
> 
> <!-- map/reduce properties -->
> 
> <property>
>  <name>mapred.job.tracker</name>
>  <value>localhost:9001</value>
> </property>
> 
> <property>
>  <name>dfs.name.dir</name>
>  <value>/some_dir/hadoop/hadoop_data</value>
> </property>
> 
> </configuration>


Re: setting up hadoop on a single node, vanilla arguments

Posted by Tiger Uppercut <ge...@gmail.com>.
Resending...I think my message got bounced earlier:

On 3/26/07, Tiger Uppercut <ge...@gmail.com> wrote:
> Thanks Philippe.
>
> Yeah, sorry, I should have mentioned that I tried using the hostname
> of my machine first, so I had the following hadoop-site.xml settings.
>
> <property>
>   <name>fs.default.name</name>
>   <value>tiger.stanford.edu:9000</value>
> </property>
>
> <!-- map/reduce properties -->
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>tiger.stanford.edu:9001</value>
> </property>
>
> But that still didn't work:
>
> tiger$ bin/hadoop jar hadoop-0.12.2-examples.jar wordcount input_dir output_dir
>
> 07/03/26 01:57:25 INFO ipc.Client: Retrying connect to server:
> tiger.stanford.edu/
> xx.yy.zz.aa:9000. Already tried 1 time(s).
> ...
> xx.yy.zz.aa:9000. Already tried 10 time(s).
> java.lang.RuntimeException: java.net.ConnectException: Connection refused
>
> Separately Arun - I did have passphrase-less ssh enabled on this machine.
>
> i.e., I executed:
>
> ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
> cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
>
> On 3/26/07, Philippe Gassmann <ph...@anyware-tech.com> wrote:
> > Hi,
> >
> > Tiger Uppercut wrote:
> > > <snip/>
> > >
> > > <property>
> > >  <name>fs.default.name</name>
> > >  <value>localhost:9000</value>
> > > </property>
> > >
> > > <!-- map/reduce properties -->
> > >
> > > <property>
> > >  <name>mapred.job.tracker</name>
> > >  <value>localhost:9001</value>
> > > </property>
> > >
> > For fs.default.name and mapred.job.tracker, try using the hostname
> > of your machine instead of localhost. When you use localhost:XXXX,
> > the hadoop servers listen only on the loopback interface. But the
> > mapreduce jobs (I do not know exactly where) see that connections to
> > the tasktrackers come from 127.0.0.1 and try to reverse-DNS that
> > address. Your system will not return localhost but the real name of
> > your machine. On most Linux systems that name is bound to an ethernet
> > interface, so the jobs will try to connect to that interface instead
> > of the loopback one.
> >
> >
> >
> > > <property>
> > >  <name>dfs.name.dir</name>
> > >  <value>/some_dir/hadoop/hadoop_data</value>
> > > </property>
> > >
> > > </configuration>
> >
> >
>

Re: setting up hadoop on a single node, vanilla arguments

Posted by Tiger Uppercut <ge...@gmail.com>.
Thanks Philippe.

Yeah, sorry, I should have mentioned that I tried using the hostname
of my machine first, so I had the following hadoop-site.xml settings.

<property>
  <name>fs.default.name</name>
  <value>tiger.stanford.edu:9000</value>
</property>

<!-- map/reduce properties -->

<property>
  <name>mapred.job.tracker</name>
  <value>tiger.stanford.edu:9001</value>
</property>

But that still didn't work:

tiger$ bin/hadoop jar hadoop-0.12.2-examples.jar wordcount input_dir output_dir

07/03/26 01:57:25 INFO ipc.Client: Retrying connect to server:
tiger.stanford.edu/
xx.yy.zz.aa:9000. Already tried 1 time(s).
...
xx.yy.zz.aa:9000. Already tried 10 time(s).
java.lang.RuntimeException: java.net.ConnectException: Connection refused

Separately Arun - I did have passphrase-less ssh enabled on this machine.

i.e., I executed:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
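
For what it's worth, a rough way to check whether the daemons were
restarted with the new config and are actually listening on 9000/9001
(assuming netstat from net-tools is installed) would be:

bin/stop-all.sh && bin/start-all.sh    # restart so the new hadoop-site.xml is read
netstat -tln | grep -E '9000|9001'     # shows which address each daemon is bound to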

On 3/26/07, Philippe Gassmann <ph...@anyware-tech.com> wrote:
> Hi,
>
> Tiger Uppercut wrote:
> > <snip/>
> >
> > <property>
> >  <name>fs.default.name</name>
> >  <value>localhost:9000</value>
> > </property>
> >
> > <!-- map/reduce properties -->
> >
> > <property>
> >  <name>mapred.job.tracker</name>
> >  <value>localhost:9001</value>
> > </property>
> >
> For fs.default.name and mapred.job.tracker, try using the hostname of
> your machine instead of localhost. When you use localhost:XXXX, the
> hadoop servers listen only on the loopback interface. But the mapreduce
> jobs (I do not know exactly where) see that connections to the
> tasktrackers come from 127.0.0.1 and try to reverse-DNS that address.
> Your system will not return localhost but the real name of your
> machine. On most Linux systems that name is bound to an ethernet
> interface, so the jobs will try to connect to that interface instead of
> the loopback one.
>
>
>
> > <property>
> >  <name>dfs.name.dir</name>
> >  <value>/some_dir/hadoop/hadoop_data</value>
> > </property>
> >
> > </configuration>
>
>

Re: setting up hadoop on a single node, vanilla arguments

Posted by Philippe Gassmann <ph...@anyware-tech.com>.
Hi,

Tiger Uppercut wrote:
> <snip/>
>
> <property>
>  <name>fs.default.name</name>
>  <value>localhost:9000</value>
> </property>
>
> <!-- map/reduce properties -->
>
> <property>
>  <name>mapred.job.tracker</name>
>  <value>localhost:9001</value>
> </property>
>
For fs.default.name and mapred.job.tracker, try using the hostname of
your machine instead of localhost. When you use localhost:XXXX, the
hadoop servers listen only on the loopback interface. But the mapreduce
jobs (I do not know exactly where) see that connections to the
tasktrackers come from 127.0.0.1 and try to reverse-DNS that address.
Your system will not return localhost but the real name of your machine.
On most Linux systems that name is bound to an ethernet interface, so
the jobs will try to connect to that interface instead of the loopback
one.
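
A quick way to see that resolution behaviour concretely, assuming a
glibc-based Linux system where getent is available, is:

hostname                   # the name the jobs will try to connect to
getent hosts 127.0.0.1     # what the loopback address resolves back to
getent hosts `hostname`    # which address the machine's name is bound to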



> <property>
>  <name>dfs.name.dir</name>
>  <value>/some_dir/hadoop/hadoop_data</value>
> </property>
>
> </configuration>