Posted to common-user@hadoop.apache.org by Pankil Doshi <fo...@gmail.com> on 2009/05/21 01:07:54 UTC

Username in Hadoop cluster

Hello everyone,

Until now I was using the same username on all my Hadoop cluster machines.

But now I am building my new cluster and face a situation in which I have
different usernames on different machines. What changes will I have to make
when configuring Hadoop? With the same username, SSH was easy. Will it be a
problem now that I have different usernames?

Regards
Pankil

RE: Username in Hadoop cluster

Posted by Vishal Ghawate <vi...@persistent.co.in>.
Hi,
You can try the fair scheduler.
With Regards,
Vishal S. Ghawate | Intern | Persistent Systems

vishal_ghawate@persistent.co.in  | Cell: +91 9970231302| Tel: +91 (20) 3023 4224

Persistent Systems - Innovation in software product design, development and delivery -  www.persistentsys.com




________________________________________
From: Steve Loughran [stevel@apache.org]
Sent: Thursday, May 21, 2009 3:19 PM
To: core-user@hadoop.apache.org
Subject: Re: Username in Hadoop cluster

Pankil Doshi wrote:
> Hello everyone,
>
> Until now I was using the same username on all my Hadoop cluster machines.
>
> But now I am building my new cluster and face a situation in which I have
> different usernames on different machines. What changes will I have to make
> when configuring Hadoop? With the same username, SSH was easy. Will it be a
> problem now that I have different usernames?

Are you building these machines up by hand? How many? Why the different
usernames?

Can't you just create a new user and group "hadoop" on all the boxes?


Re: Username in Hadoop cluster

Posted by Steve Loughran <st...@apache.org>.
Pankil Doshi wrote:
> Hello everyone,
> 
> Until now I was using the same username on all my Hadoop cluster machines.
>
> But now I am building my new cluster and face a situation in which I have
> different usernames on different machines. What changes will I have to make
> when configuring Hadoop? With the same username, SSH was easy. Will it be a
> problem now that I have different usernames?

Are you building these machines up by hand? How many? Why the different 
usernames?

Can't you just create a new user and group "hadoop" on all the boxes?

Re: Username in Hadoop cluster

Posted by Alex Loddengaard <al...@cloudera.com>.
Ah ha!  Good point, Todd.  Pankil, with Todd's suggestion, you can ignore
the first option I proposed.

Thanks,

Alex

On Wed, May 20, 2009 at 4:30 PM, Todd Lipcon <to...@cloudera.com> wrote:

> On Wed, May 20, 2009 at 4:14 PM, Alex Loddengaard <al...@cloudera.com>
> wrote:
>
> > First of all, if you can get all machines to have the same user, that
> would
> > greatly simplify things.
> >
> > If, for whatever reason, you absolutely can't get the same user on all
> > machines, then you could do either of the following:
> >
> > 1) Change the *-all.sh scripts to read from a slaves file that has two
> > fields: a host and a user
>
>
> To add to what Alex said, you should actually already be able to do this
> with the existing scripts by simply using the format "username@hostname"
> for
> each entry in the slaves file.
>
> -Todd
>

Re: Username in Hadoop cluster

Posted by Aaron Kimball <aa...@cloudera.com>.
A slightly longer answer:

If you're starting daemons with bin/start-dfs.sh or start-all.sh, you'll
notice that these defer to hadoop-daemons.sh to do the heavy lifting. This
evaluates the string: cd "$HADOOP_HOME" \; "$bin/hadoop-daemon.sh" --config
$HADOOP_CONF_DIR "$@" and passes it to an underlying loop to execute on all
the slaves via ssh.

$bin and $HADOOP_HOME are thus expanded on the machine that launches the
command. The more problematic one here is $bin, which resolves to the
absolute path of the bin directory on the machine that is starting Hadoop.

You've got three basic options:
1) Install Hadoop in the exact same path on all nodes
2) Modify bin/hadoop-daemons.sh to do something more clever on your system
by deferring evaluation of HADOOP_HOME and the bin directory (probably
really hairy; you might have to escape the variable names more than once
since there's another script named slaves.sh that this goes through)
3) Start the slaves "manually" on each node by logging in yourself, and
doing a "cd $HADOOP_HOME && bin/hadoop-daemon.sh start datanode" (see the
sketch below)

As a shameless plug, Cloudera's distribution for Hadoop (
www.cloudera.com/hadoop) will also provide init.d scripts so that you can
start Hadoop daemons via the 'service' command. By default, the RPM
installation will also standardize on the "hadoop" username. But you can't
install this without being root.

- Aaron

On Tue, May 26, 2009 at 12:30 PM, Alex Loddengaard <al...@cloudera.com> wrote:

> It looks to me like you didn't install Hadoop consistently across all
> nodes.
>
> > xxx.xx.xx.251: bash:
> > /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh: No such file or
> > directory
>
> The above makes me suspect that xxx.xx.xx.251 has Hadoop installed
> differently.  Can you try and locate hadoop-daemon.sh on xxx.xx.xx.251 and
> adjust its location properly?
>
> Alex
>
> On Mon, May 25, 2009 at 10:25 PM, Pankil Doshi <fo...@gmail.com>
> wrote:
>
> > Hello,
> >
> > I tried adding "username@hostname" for each entry in the slaves file.
> >
> > My slaves file has 2 data nodes. It looks like below:
> >
> > localhost
> > utdhadoop1@xxx.xx.xx.229
> > utdhadoop@xxx.xx.xx.251
> >
> >
> > The error I get when I start dfs is below:
> >
> > starting namenode, logging to
> >
> >
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-namenode-opencirrus-992.hpl.hp.com.out
> > xxx.xx.xx.229: starting datanode, logging to
> >
> >
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-datanode-opencirrus-992.hpl.hp.com.out
> > xxx.xx.xx.251: bash: line 0: cd:
> > /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/..: No such file or directory
> > xxx.xx.xx.251: bash:
> > /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh: No such file
> or
> > directory
> > localhost: datanode running as process 25814. Stop it first.
> > xxx.xx.xx.229: starting secondarynamenode, logging to
> >
> >
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-secondarynamenode-opencirrus-992.hpl.hp.com.out
> > localhost: secondarynamenode running as process 25959. Stop it first.
> >
> >
> >
> > Basically it looks for
> > "/home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh"
> > but instead it should look for "/home/utdhadoop/Hadoop/...." as
> > xxx.xx.xx.251 has the username utdhadoop.
> >
> > Any inputs??
> >
> > Thanks
> > Pankil
> >
> > On Wed, May 20, 2009 at 6:30 PM, Todd Lipcon <to...@cloudera.com> wrote:
> >
> > > On Wed, May 20, 2009 at 4:14 PM, Alex Loddengaard <al...@cloudera.com>
> > > wrote:
> > >
> > > > First of all, if you can get all machines to have the same user, that
> > > would
> > > > greatly simplify things.
> > > >
> > > > If, for whatever reason, you absolutely can't get the same user on
> all
> > > > machines, then you could do either of the following:
> > > >
> > > > 1) Change the *-all.sh scripts to read from a slaves file that has
> two
> > > > fields: a host and a user
> > >
> > >
> > > To add to what Alex said, you should actually already be able to do
> this
> > > with the existing scripts by simply using the format "username@hostname
> "
> > > for
> > > each entry in the slaves file.
> > >
> > > -Todd
> > >
> >
>

Re: Username in Hadoop cluster

Posted by Alex Loddengaard <al...@cloudera.com>.
It looks to me like you didn't install Hadoop consistently across all nodes.

> xxx.xx.xx.251: bash:
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh: No such file or
> directory

The above makes me suspect that xxx.xx.xx.251 has Hadoop installed
differently.  Can you try and locate hadoop-daemon.sh on xxx.xx.xx.251 and
adjust its location properly?
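
One quick way to run that check, assuming the script sits somewhere under
that account's home directory, is:

  ssh utdhadoop@xxx.xx.xx.251 'find "$HOME" -name hadoop-daemon.sh 2>/dev/null'

Whatever path that prints is the install location xxx.xx.xx.251 actually
has; the start scripts need to point at that directory.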

Alex

On Mon, May 25, 2009 at 10:25 PM, Pankil Doshi <fo...@gmail.com> wrote:

> Hello,
>
> I tried adding "username@hostname" for each entry in the slaves file.
>
> My slaves file has 2 data nodes. It looks like below:
>
> localhost
> utdhadoop1@xxx.xx.xx.229
> utdhadoop@xxx.xx.xx.251
>
>
> The error I get when I start dfs is below:
>
> starting namenode, logging to
>
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-namenode-opencirrus-992.hpl.hp.com.out
> xxx.xx.xx.229: starting datanode, logging to
>
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-datanode-opencirrus-992.hpl.hp.com.out
> xxx.xx.xx.251: bash: line 0: cd:
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/..: No such file or directory
> xxx.xx.xx.251: bash:
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh: No such file or
> directory
> localhost: datanode running as process 25814. Stop it first.
> xxx.xx.xx.229: starting secondarynamenode, logging to
>
> /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-secondarynamenode-opencirrus-992.hpl.hp.com.out
> localhost: secondarynamenode running as process 25959. Stop it first.
>
>
>
> Basically it looks for
> "/home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh"
> but instead it should look for "/home/utdhadoop/Hadoop/...." as
> xxx.xx.xx.251 has the username utdhadoop.
>
> Any inputs??
>
> Thanks
> Pankil
>
> On Wed, May 20, 2009 at 6:30 PM, Todd Lipcon <to...@cloudera.com> wrote:
>
> > On Wed, May 20, 2009 at 4:14 PM, Alex Loddengaard <al...@cloudera.com>
> > wrote:
> >
> > > First of all, if you can get all machines to have the same user, that
> > would
> > > greatly simplify things.
> > >
> > > If, for whatever reason, you absolutely can't get the same user on all
> > > machines, then you could do either of the following:
> > >
> > > 1) Change the *-all.sh scripts to read from a slaves file that has two
> > > fields: a host and a user
> >
> >
> > To add to what Alex said, you should actually already be able to do this
> > with the existing scripts by simply using the format "username@hostname"
> > for
> > each entry in the slaves file.
> >
> > -Todd
> >
>

Re: Username in Hadoop cluster

Posted by Pankil Doshi <fo...@gmail.com>.
Hello,

I tried adding "username@hostname" for each entry in the slaves file.

My slaves file has 2 data nodes. It looks like below:

localhost
utdhadoop1@xxx.xx.xx.229
utdhadoop@xxx.xx.xx.251


The error I get when I start dfs is below:

starting namenode, logging to
/home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-namenode-opencirrus-992.hpl.hp.com.out
xxx.xx.xx.229: starting datanode, logging to
/home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-datanode-opencirrus-992.hpl.hp.com.out
xxx.xx.xx.251: bash: line 0: cd:
/home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/..: No such file or directory
xxx.xx.xx.251: bash:
/home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh: No such file or
directory
localhost: datanode running as process 25814. Stop it first.
xxx.xx.xx.229: starting secondarynamenode, logging to
/home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-secondarynamenode-opencirrus-992.hpl.hp.com.out
localhost: secondarynamenode running as process 25959. Stop it first.



Basically it looks for
"/home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh"
but instead it should look for "/home/utdhadoop/Hadoop/...." as
xxx.xx.xx.251 has the username utdhadoop.

Any inputs??

Thanks
Pankil

On Wed, May 20, 2009 at 6:30 PM, Todd Lipcon <to...@cloudera.com> wrote:

> On Wed, May 20, 2009 at 4:14 PM, Alex Loddengaard <al...@cloudera.com>
> wrote:
>
> > First of all, if you can get all machines to have the same user, that
> would
> > greatly simplify things.
> >
> > If, for whatever reason, you absolutely can't get the same user on all
> > machines, then you could do either of the following:
> >
> > 1) Change the *-all.sh scripts to read from a slaves file that has two
> > fields: a host and a user
>
>
> To add to what Alex said, you should actually already be able to do this
> with the existing scripts by simply using the format "username@hostname"
> for
> each entry in the slaves file.
>
> -Todd
>

Re: Username in Hadoop cluster

Posted by Todd Lipcon <to...@cloudera.com>.
On Wed, May 20, 2009 at 4:14 PM, Alex Loddengaard <al...@cloudera.com> wrote:

> First of all, if you can get all machines to have the same user, that would
> greatly simplify things.
>
> If, for whatever reason, you absolutely can't get the same user on all
> machines, then you could do either of the following:
>
> 1) Change the *-all.sh scripts to read from a slaves file that has two
> fields: a host and a user


To add to what Alex said, you should actually already be able to do this
with the existing scripts by simply using the format "username@hostname" for
each entry in the slaves file.

-Todd
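
For concreteness, a conf/slaves file in that format might look like the
following (the host names here are made-up placeholders):

  localhost
  hadoop@node1.example.com
  datauser@node2.example.com

slaves.sh simply passes each entry to ssh as the destination, which is why
the user@host form works without any script changes.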

Re: Username in Hadoop cluster

Posted by Alex Loddengaard <al...@cloudera.com>.
First of all, if you can get all machines to have the same user, that would
greatly simplify things.

If, for whatever reason, you absolutely can't get the same user on all
machines, then you could do either of the following:

1) Change the *-all.sh scripts to read from a slaves file that has two
fields: a host and a user
2) Move away from the *-all.sh scripts and do your own SSHing.

The *-all.sh scripts just SSH to other machines and run hadoop-daemon.sh,
which actually does the starting and stopping.  You could write your own
little script (bash, Python, whatever) that read from your own slaves files
and did the SSHing and hadoop-daemon.sh calling.
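
A minimal sketch of that approach, assuming a file named slaves.users with
one "host user" pair per line and Hadoop unpacked under each user's home
directory (both the file name and the remote path are assumptions):

  #!/usr/bin/env bash
  # Start a DataNode on every slave listed in slaves.users, logging in
  # as that slave's own user. Adjust the remote path for the real layout.
  while read host user; do
    # -n stops ssh from swallowing the rest of slaves.users from stdin
    ssh -n "$user@$host" \
      "cd /home/$user/hadoop-0.18.3 && bin/hadoop-daemon.sh start datanode"
  done < slaves.users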

Again, ideally you'll have one user.

Alex

On Wed, May 20, 2009 at 4:07 PM, Pankil Doshi <fo...@gmail.com> wrote:

> Hello everyone,
>
> Until now I was using the same username on all my Hadoop cluster machines.
>
> But now I am building my new cluster and face a situation in which I have
> different usernames on different machines. What changes will I have to make
> when configuring Hadoop? With the same username, SSH was easy. Will it be a
> problem now that I have different usernames?
>
> Regards
> Pankil
>