Posted to common-user@hadoop.apache.org by Jeff Zhang <zj...@gmail.com> on 2009/11/20 08:55:14 UTC

Re: to get hadoop working around with multiple users on the same instance

On the client machine, configure core-site.xml, hdfs-site.xml and
mapred-site.xml as you do on the hadoop cluster.

Then you can run dfs shell commands from the client machine.
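
For example (a sketch; the namenode host and ports below are
illustrative, not from the thread):

  <!-- core-site.xml on the client -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9000</value>
  </property>

  <!-- mapred-site.xml on the client -->
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode-host:9001</value>
  </property>

After that, something like bin/hadoop dfs -ls / should list the
cluster's root directory from the client.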


Jeff Zhang



On Fri, Nov 20, 2009 at 3:38 PM, Siddu <si...@gmail.com> wrote:

> Hello all,
>
> I am not sure if the question is framed right!
>
> Let's say user1 launches an instance of hadoop on a *single node*, and
> hence he has permission to create/delete files on hdfs and launch M/R
> jobs.
>
> Now what should I do if user2 wants to use the same instance of hadoop,
> which is launched by user1, and needs permission to create/delete files
> on hdfs or launch M/R jobs?
>
> I am using version 0.20 of hadoop and Ubuntu as the host machine.
>
> Thanks for any inputs
>
>
> --
> Regards,
> ~Sid~
> I have never met a man so ignorant that i couldn't learn something from him
>

absolute path or relative path

Posted by Gang Luo <lg...@yahoo.com.cn>.
Hi all,

When I run a mapreduce program, I noticed something tricky about the
input and output paths. The input path I give is absolute (e.g.
/user/user1/input/*, assuming I am user1) and it works well. The output
path I give is also absolute (e.g. /user/user1/output/), but the final
result was written to /user/user1/user/user1/output/part-00000. Does
that mean hadoop treats the input path as absolute but the output path
as relative?
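
I suspect hadoop resolves a path without a leading slash against the
user's HDFS home directory /user/<username>, so perhaps the output path
reached the job without its leading slash. A quick shell check I could
run (paths illustrative):

  $ bin/hadoop dfs -ls /user/user1/output   # absolute
  $ bin/hadoop dfs -ls user/user1/output    # relative; lists
                                            # /user/user1/user/user1/output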

Thanks!

 
Gang Luo
---------
Department of Computer Science
Duke University
(919)316-0993
gang.luo@duke.edu



Re: to get hadoop working around with multiple users on the same instance

Posted by Edward Capriolo <ed...@gmail.com>.


>>john@desktop:$bin/start-all.sh
>>jack@desktop:$bin/start-all.sh

This is a very odd use case. You (probably) should be starting the
cluster as the same user every time; otherwise the hadoop superuser
changes with each startup.

If you really need to support this type of usage you should look at
HOD (Hadoop On Demand), but unless you think you need that, make a
dedicated 'hadoop' posix user for starting the hadoop services.

If you need other users to be able to start and stop hadoop, use a
mechanism like sudo, as sketched below.
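
A minimal sketch of that setup (user, group, and install path are
illustrative):

  # create a dedicated service account that always starts the cluster
  $ sudo adduser --system --group hadoop

  # /etc/sudoers (edit with visudo): let john and jack control the
  # services as the hadoop user without a password prompt
  john,jack ALL=(hadoop) NOPASSWD: /usr/local/hadoop/bin/start-all.sh, /usr/local/hadoop/bin/stop-all.sh

  # then either user can run:
  john@desktop:$ sudo -u hadoop /usr/local/hadoop/bin/start-all.sh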

Re: to get hadoop working around with multiple users on the same instance

Posted by Jason Venner <ja...@gmail.com>.
In the case you describe, it depends on where the underlying host file
system files supporting the cluster were created, and on their host
operating system ownership.

Unless you arrange for the files and paths to be writable by all users,
or by a group shared by your users, you are unlikely to be successful.

Basically you will need HADOOP_LOG_DIR writable by both users, as well
as hadoop.tmp.dir, and hadoop.tmp.dir must not vary based on the user.
HADOOP_LOG_DIR is an environment variable and hadoop.tmp.dir is a
configuration parameter in the ...-default.xml file, which you would
override in the ...-site.xml file. The name in ... varies with your
hadoop version.
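
A sketch of both settings (paths are illustrative; in 0.20 the
overrides go in conf/hadoop-env.sh and conf/core-site.xml):

  # conf/hadoop-env.sh: a log directory writable by both users
  export HADOOP_LOG_DIR=/var/log/hadoop

  <!-- conf/core-site.xml: a fixed tmp dir instead of the per-user
       default /tmp/hadoop-${user.name} -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/lib/hadoop/tmp</value>
  </property>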

On Tue, Nov 24, 2009 at 5:44 AM, Siddu <si...@gmail.com> wrote:

> Hi all,
>
> my question is something like this:
>
> let's say there are two users (john & jack) on a single host,
>
> and say john starts the hadoop instance using the command
>
> john@desktop:$ bin/start-all.sh
>
> and uploads some of the files using
>
> john@desktop:$ bin/hadoop dfs -copyFromLocal <local file> <dst-path>
>
> later I bring down the instance with bin/stop-all.sh
>
> Now if user jack starts the hadoop instance on the same machine using
>
> jack@desktop:$ bin/start-all.sh
>
> and submits any mapreduce job on the files owned by john,
>
> he shouldn't be able to access them, should he?
>
> There is no client machine; it's all on one computer.
>
> Thanks for inputs



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals

Re: to get hadoop working around with multiple users on the same instance

Posted by Siddu <si...@gmail.com>.
On Sun, Nov 22, 2009 at 8:19 AM, Jason Venner <ja...@gmail.com> wrote:

> disable hdfs permission checking
>
Hi all,

my question is something like this:

let's say there are two users (john & jack) on a single host,

and say john starts the hadoop instance using the command

john@desktop:$ bin/start-all.sh

and uploads some of the files using

john@desktop:$ bin/hadoop dfs -copyFromLocal <local file> <dst-path>

later I bring down the instance with bin/stop-all.sh

Now if user jack starts the hadoop instance on the same machine using

jack@desktop:$ bin/start-all.sh

and submits any mapreduce job on the files owned by john,

he shouldn't be able to access them, should he?

There is no client machine; it's all on one computer.

Thanks for inputs
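
With permissions on, I would expect something like this (the path is
hypothetical) to fail with an
org.apache.hadoop.security.AccessControlException:

  jack@desktop:$ bin/hadoop dfs -rmr /user/john/data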






-- 
Regards,
~Sid~
I have never met a man so ignorant that i couldn't learn something from him

Re: to get hadoop working around with multiple users on the same instance

Posted by Jason Venner <ja...@gmail.com>.
Disable hdfs permission checking:

<property>
  <name>dfs.permissions</name>
  <value>false</value>
  <description>
    If "true", enable permission checking in HDFS.
    If "false", permission checking is turned off,
    but all other behavior is unchanged.
    Switching from one parameter value to the other does not change
    the mode, owner or group of files or directories.
  </description>
</property>
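
If you would rather keep permission checking on, a shared group over a
common directory is an alternative; a sketch, assuming a posix group
'hadoopusers' and an illustrative /user/shared path:

  $ bin/hadoop dfs -mkdir /user/shared
  $ bin/hadoop dfs -chgrp -R hadoopusers /user/shared
  $ bin/hadoop dfs -chmod -R 775 /user/shared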




-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals

Re: to get hadoop working around with multiple users on the same instance

Posted by Jeff Zhang <zj...@gmail.com>.
On Fri, Nov 20, 2009 at 4:39 PM, Siddu <si...@gmail.com> wrote:

> On Fri, Nov 20, 2009 at 1:25 PM, Jeff Zhang <zj...@gmail.com> wrote:
>
> > On the client machine, configure core-site.xml, hdfs-site.xml and
> > mapred-site.xml as you do on the hadoop cluster.
> >
>
> if you are referring to the client machine as the place from where you
> launch M/R jobs:
>
> there is no client machine as such. It is just one machine I am
> experimenting on (pseudo-distributed file system).
>

You can still connect to the machine even if it runs a
pseudo-distributed file system. But in the configuration files, use the
machine's real IP to refer to it rather than localhost.
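
For instance (a sketch; the IP and ports are illustrative):

  <!-- conf/core-site.xml -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.10:9000</value>  <!-- not hdfs://localhost:9000 -->
  </property>

  <!-- conf/mapred-site.xml -->
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.1.10:9001</value>
  </property>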




Re: to get hadoop working around with multiple users on the same instance

Posted by Siddu <si...@gmail.com>.
On Fri, Nov 20, 2009 at 1:25 PM, Jeff Zhang <zj...@gmail.com> wrote:

> On the client machine, configure core-site.xml, hdfs-site.xml and
> mapred-site.xml as you do on the hadoop cluster.
>

if you are referring to the client machine as the place from where you
launch M/R jobs:

there is no client machine as such. It is just one machine I am
experimenting on (pseudo-distributed file system).

> Then you can run dfs shell commands from the client machine.



-- 
Regards,
~Sid~
I have never met a man so ignorant that i couldn't learn something from him