Posted to common-dev@hadoop.apache.org by Chris Mattmann <ch...@jpl.nasa.gov> on 2006/04/10 06:21:25 UTC

Problem starting hadoop

Hi Folks,

  I'm new to this list (but a familiar face on the Nutch one :-) ), and I
have a newbie question. When I start Hadoop DFS using the start-all.sh
script in the bin directory, I see that for DFS it starts a single namenode
using hadoop-daemon.sh, then uses slaves.sh to start all the slave
datanodes. One thing I noticed is that on my Linux cluster, the additional
ssh options "-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR" are not
available. I tried "man ssh_config", and the closest thing I see to
ConnectTimeout is:


     ConnectionAttempts
             Specifies the number of tries (one per second) to make before
             exiting.  The argument must be an integer.  This may be useful
             in scripts if the connection sometimes fails.  The default is 1.


Is this normal? Typing uname -a on my machine results in:

Linux <XXX> 2.4.21-37.XXX.ELsmp #1 SMP Tue Oct 18 11:43:19 PDT 2005 x86_64
x86_64 x86_64 GNU/Linux

When I remove those options from the slaves.sh script, Hadoop DFS starts
successfully. (I commented out the part of start-all.sh that starts the
MapReduce daemons, since I'm only trying to use the DFS side of things.)

Here is the output of ssh -V on my system:

[mattmann@XXX ~/hadoop]$ ssh -V
OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f

Any ideas? Additionally, would it make sense to put in options within the
startup script to only start DFS related daemons and slaves, and the same
goes for only starting MapRed daemons and slaves? If so, I can create a JIRA
issue about this.

Thanks!

Cheers,
  Chris



Re: Problem starting hadoop

Posted by Michael Stack <st...@archive.org>.
Chris Mattmann wrote:
> Hi Folks,
>
>   I'm new to this list (but a familiar face on the Nutch one :-) ). I had a
> newbie question. It seems that when I went to start Hadoop DFS using the
> start-all.sh script in the bin directory, I note that for DFS it starts a
> single namenode using hadoop-daemon.sh, then uses slaves.sh to start all the
> slave datanodes. One thing I noticed is that on my linux cluster, the
> additional options "-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR" are not
> available on my system. I tried typing "man ssh_config" and the closest
> thing I see to ConnectTimeout is:
>
>
>      ConnectionAttempts
>              Specifies the number of tries (one per second) to make before
>              exiting.  The argument must be an integer.  This may be useful
> in scripts if the connection sometimes fails.  The default is 1.
>
>
> Is this normal? Typing uname -a on my machine results in:
>
> Linux <XXX> 2.4.21-37.XXX.ELsmp #1 SMP Tue Oct 18 11:43:19 PDT 2005 x86_64
> x86_64 x86_64 GNU/Linux
>
> When I remove those options from the slaves.sh script, the starting of
> hadoop DFS succeeds (I commented out the part in start-all.sh that starts
> MapReduce stuff, because I'm only trying to use DFS stuff).
>
> Here is the output of ssh -V on my system:
>
> [mattmann@XXX ~/hadoop]$ ssh -V
> OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f
>   

Yeah. On older ssh, neither option is available. "OpenSSH_3.5p1n, SSH 
protocols 1.5/2.0, OpenSSL 0x0090609f" doesn't have either option, whereas 
"OpenSSH_4.2p1 Debian-6, OpenSSL 0.9.8a 11 Oct 2005" does. Looking over the 
release notes, I note that SendEnv shows up in release 3.9. There doesn't 
seem to be a note on when ConnectTimeout was added (I didn't spend long 
looking).

Here are the definitions from the ssh_config man page:

ConnectTimeout
Specifies the timeout (in seconds) used when connecting to the
ssh server, instead of using the default system TCP timeout.
This value is used only when the target is down or really
unreachable, not when it refuses the connection.

...

SendEnv
Specifies what variables from the local environ(7) should be sent
to the server. Note that environment passing is only supported
for protocol 2, the server must also support it, and the server
must be configured to accept these environment variables. Refer
to AcceptEnv in sshd_config(5) for how to configure the server.
Variables are specified by name, which may contain the wildcard
characters ‘*’ and ‘?’. Multiple environment variables may be
separated by whitespace or spread across multiple SendEnv directives.
The default is not to send any environment variables.
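
For what it's worth, wiring SendEnv up end-to-end needs a setting on both
sides. A rough illustration (the host pattern is just an example):

# ~/.ssh/config on the node running the start scripts (client side)
Host slave*
    SendEnv HADOOP_CONF_DIR

# /etc/ssh/sshd_config on each slave (server side)
AcceptEnv HADOOP_CONF_DIR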

If you haven't noticed already, you can blank out the ssh options in 
hadoop-env.sh by setting HADOOP_SSH_OPTS to the empty string. Everything 
should still work; you just won't have a timeout on your ssh attempts, and 
you won't be able to forward the head node's HADOOP_* environment variables 
out to the slaves.
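
Concretely, something like this in conf/hadoop-env.sh (the commented
alternative is only a sketch, and assumes your ssh understands
ConnectTimeout, which a 3.6-era OpenSSH may not):

# conf/hadoop-env.sh
# Clear the extra ssh options for older OpenSSH releases.
export HADOOP_SSH_OPTS=""
# Or, if your ssh supports ConnectTimeout but not SendEnv:
# export HADOOP_SSH_OPTS="-o ConnectTimeout=1"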

St.Ack


RE: Problem starting hadoop

Posted by Chris Mattmann <ch...@jpl.nasa.gov>.
Thanks Doug! :-)

Cheers,
  Chris


______________________________________________
Chris A. Mattmann
Chris.Mattmann@jpl.nasa.gov 
Staff Member
Modeling and Data Management Systems Section (387)
Data Management Systems and Technologies Group

_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                        Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.

> -----Original Message-----
> From: Doug Cutting [mailto:cutting@apache.org]
> Sent: Tuesday, April 11, 2006 10:02 AM
> To: hadoop-dev@lucene.apache.org
> Subject: Re: Problem starting hadoop
> 
> Chris Mattmann wrote:
> >>Is adding the following line to your conf/hadoop-env.sh not good enough?
> >>
> >>export HADOOP_SSH_OPTS=""
> >>
> >>Or are you arguing that we should make that the default?  I'd be okay
> >>with that.
> >
> > Yup, that's what I'm arguing. +1
> 
> Done.
> 
> Doug


Re: Problem starting hadoop

Posted by Doug Cutting <cu...@apache.org>.
Chris Mattmann wrote:
>>Is adding the following line to your conf/hadoop-env.sh not good enough?
>>
>>export HADOOP_SSH_OPTS=""
>>
>>Or are you arguing that we should make that the default?  I'd be okay
>>with that.
> 
> Yup, that's what I'm arguing. +1

Done.

Doug

RE: Problem starting hadoop

Posted by Chris Mattmann <ch...@jpl.nasa.gov>.
Hey Doug,
> 
> Is adding the following line to your conf/hadoop-env.sh not good enough?
> 
> export HADOOP_SSH_OPTS=""
> 
> Or are you arguing that we should make that the default?  I'd be okay
> with that.

Yup, that's what I'm arguing. +1

> 
> >>Instead of 'bin/start-all.sh' that's just 'bin/hadoop-daemon.sh start
> >>namenode; bin/hadoop-daemons.sh start datanode'.  Is that what you're
> >>after?  I guess we could add a bin/start-dfs.sh command that does this.
> >
> > Yeah, that's exactly what I'm after. [ ... ] Anyways, getting back to
> the
> > point, yeah, it would be great to have something to just start and stop
> DFS,
> > and start and stop MapReduce for that matter, even if it just amounts to
> the
> > simple command you mentioned above. I've attached a quick patch for
> Hadoop
> > that implements it.
> 
> Can you please attach this to a bug so that we don't lose track of it?
> 
> Also, we should probably change the -all scripts to be based on the -dfs
> and -mapred scripts, no?

Yup, agreed. I'll attach the patch to JIRA in five minutes. Thanks for the
quick turnaround on this.

Take care,
  Chris

> 
> Doug


Re: Problem starting hadoop

Posted by Doug Cutting <cu...@apache.org>.
Chris Mattmann wrote:
>>You can override this by editing conf/hadoop-env.sh.  Both are optional,
>>but convenient.  Perhaps we should avoid using them until they're in
>>wider distribution.
> 
> Yeah, I think it would be nice to find a workaround to explicitly using
> these options.

Is adding the following line to your conf/hadoop-env.sh not good enough?

export HADOOP_SSH_OPTS=""

Or are you arguing that we should make that the default?  I'd be okay 
with that.

>>Instead of 'bin/start-all.sh' that's just 'bin/hadoop-daemon.sh start
>>namenode; bin/hadoop-daemons.sh start datanode'.  Is that what you're
>>after?  I guess we could add a bin/start-dfs.sh command that does this.
> 
> Yeah, that's exactly what I'm after. [ ... ] Anyways, getting back to the
> point, yeah, it would be great to have something to just start and stop DFS,
> and start and stop MapReduce for that matter, even if it just amounts to the
> simple command you mentioned above. I've attached a quick patch for Hadoop
> that implements it.

Can you please attach this to a bug so that we don't lose track of it?

Also, we should probably change the -all scripts to be based on the -dfs 
and -mapred scripts, no?
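
I.e. start-all.sh would then reduce to roughly this (a sketch only; the
script names assume the -dfs/-mapred split we're talking about):

#!/usr/bin/env bash
# Sketch of a split start-all.sh: delegate to the per-subsystem scripts.
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
"$bin"/start-dfs.sh
"$bin"/start-mapred.sh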

Doug

RE: Problem starting hadoop

Posted by Chris Mattmann <ch...@jpl.nasa.gov>.
Hi Doug,

> Chris Mattmann wrote:
> > One thing I noticed is that on my linux cluster, the
> > additional options "-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR" are
> not
> > available on my system.
> 
> You can override this by editing conf/hadoop-env.sh.  Both are optional,
> but convenient.  Perhaps we should avoid using them until they're in
> wider distribution.

Yeah, I think it would be nice to find a workaround to explicitly using
these options. I'm running a 16-node/32-processor Linux cluster with 64-bit
processors, with CentOS Linux installed along with the ROCKS toolkit. I
think it's a pretty standard cluster distribution, and it doesn't come with
an SSH that supports those options. I don't want my environment to dictate
what Hadoop should do for everyone else, but I think it's definitely
something to consider...

> > Any ideas? Additionally, would it make sense to put in options within
> the
> > startup script to only start DFS related daemons and slaves, and the
> same
> > goes for only starting MapRed daemons and slaves? If so, I can create a
> JIRA
> > issue about this.
> 
> Instead of 'bin/start-all.sh' that's just 'bin/hadoop-daemon.sh start
> namenode; bin/hadoop-daemons.sh start datanode'.  Is that what you're
> after?  I guess we could add a bin/start-dfs.sh command that does this.

Yeah, that's exactly what I'm after: some non-technical, joe-user way of
just starting and stopping DFS, without anything to do with MapReduce. I'm
interested in Hadoop just for its DFS capabilities for now. We're looking
into HDFS within my group at JPL as a solution for a data movement issue
that will arise on our cluster when we begin to stage large (gigabyte-sized)
files between an NFS-mounted RAID disk and cluster nodes running jobs that
need non-NFS access to those files (thousands of jobs reading from a single
NFS-mounted RAID can become a bottleneck). So right now I'm benchmarking
HDFS along with PVFS (the Parallel Virtual File System) as potential
solutions to the data movement problem between the cluster nodes and our
NFS RAIDs. Anyways, getting back to the point, yeah, it would be great to
have something to just start and stop DFS, and start and stop MapReduce for
that matter, even if it just amounts to the simple command you mentioned
above. I've attached a quick patch for Hadoop that implements it.

Thanks, Doug!

Cheers,
  Chris

> 
> Doug

Re: Problem starting hadoop

Posted by Doug Cutting <cu...@apache.org>.
Chris Mattmann wrote:
> One thing I noticed is that on my linux cluster, the
> additional options "-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR" are not
> available on my system.

You can override this by editing conf/hadoop-env.sh.  Both are optional, 
but convenient.  Perhaps we should avoid using them until they're in 
wider distribution.

> Here is the output of ssh -V on my system:
> 
> [mattmann@XXX ~/hadoop]$ ssh -V
> OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f

I run Ubuntu Breezy and have OpenSSH 4.1.  I don't know what version of 
OpenSSH these features were added in.

> Any ideas? Additionally, would it make sense to put in options within the
> startup script to only start DFS related daemons and slaves, and the same
> goes for only starting MapRed daemons and slaves? If so, I can create a JIRA
> issue about this.

Instead of 'bin/start-all.sh' that's just 'bin/hadoop-daemon.sh start 
namenode; bin/hadoop-daemons.sh start datanode'.  Is that what you're 
after?  I guess we could add a bin/start-dfs.sh command that does this.
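
I.e., roughly (just a sketch of what such a bin/start-dfs.sh could look
like, not a committed script):

#!/usr/bin/env bash
# Sketch: start only the DFS daemons.
# Resolve the bin directory so the script works from any cwd.
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
# Start the namenode on this machine, then datanodes on all the slaves.
"$bin"/hadoop-daemon.sh start namenode
"$bin"/hadoop-daemons.sh start datanode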

Doug