Posted to common-user@hadoop.apache.org by Mark <st...@gmail.com> on 2010/08/30 01:39:09 UTC

Job in 0.21

  How should I create a new Job instance in 0.21? It looks like
Job(Configuration conf, String jobName) has been deprecated, and
Job(Cluster cluster) seems to be the new way, but I'm unsure how to get a
handle to the current cluster. Can someone advise? Thanks!

Re: accounts permission on hadoop

Posted by Gang Luo <lg...@yahoo.com.cn>.
It is clear now. Thanks Mike.

-Gang



----- Original Message ----
From: Michael Thomas <th...@hep.caltech.edu>
To: common-user@hadoop.apache.org
Sent: 2010/8/31 (Tue) 8:04:07 PM
Subject: Re: accounts permission on hadoop

On 08/31/2010 05:01 PM, Michael Thomas wrote:
> On 08/31/2010 03:12 PM, Gang Luo wrote:
>> Thanks Edward.
>>
>> I was wondering: does the home directory on HDFS also comply with the linux
>> system? Since I created '/usr/research/home/smith' on HDFS for user 'smith',
>> which is also the home directory for smith in linux, when he inputs 'bin/hadoop
>> fs -ls ~/' he will get this directory, instead of '/user/smith'.
>>
>> -Gang
> 
> The tilde character '~' is a shell metacharacter.  It is converted to the
> value of $HOME before the shell invokes the bin/hadoop command.  For
> example:
> 
> 1) User types "bin/hadoop fs -ls ~/"
> 2) Your shell changes this command to "bin/hadoop fs -ls /user/smith/"
> (or wherever the user's local linux home directory is located).
> 3) Your shell invokes the command and hadoop returns the contents of
> '/usr/research/home/smith' from hdfs (which probably doesn't exist), not
> from the local filesystem.

Of course, I meant to say "...returns the contents of '/user/smith' from
hdfs..."

--Mike

> Hadoop doesn't do anything special with the ~ character.
> 
> --Mike
> 



Re: accounts permission on hadoop

Posted by Michael Thomas <th...@hep.caltech.edu>.
On 08/31/2010 05:01 PM, Michael Thomas wrote:
> On 08/31/2010 03:12 PM, Gang Luo wrote:
>> Thanks Edward.
>>
>> I was wondering: does the home directory on HDFS also comply with the linux system?
>> Since I created '/usr/research/home/smith' on HDFS for user 'smith', which is
>> also the home directory for smith in linux, when he inputs 'bin/hadoop fs -ls
>> ~/' he will get this directory, instead of '/user/smith'.
>>
>> -Gang
> 
> The tilde character '~' is a shell metacharacter.  It is converted to the
> value of $HOME before the shell invokes the bin/hadoop command.  For
> example:
> 
> 1) User types "bin/hadoop fs -ls ~/"
> 2) Your shell changes this command to "bin/hadoop fs -ls /user/smith/"
> (or wherever the user's local linux home directory is located).
> 3) Your shell invokes the command and hadoop returns the contents of
> '/usr/research/home/smith' from hdfs (which probably doesn't exist), not
> from the local filesystem.

Of course, I meant to say "...returns the contents of '/user/smith' from
hdfs..."

--Mike

> Hadoop doesn't do anything special with the ~ character.
> 
> --Mike
> 



Re: accounts permission on hadoop

Posted by Michael Thomas <th...@hep.caltech.edu>.
On 08/31/2010 03:12 PM, Gang Luo wrote:
> Thanks Edward.
> 
> I was wondering: does the home directory on HDFS also comply with the linux system?
> Since I created '/usr/research/home/smith' on HDFS for user 'smith', which is
> also the home directory for smith in linux, when he inputs 'bin/hadoop fs -ls
> ~/' he will get this directory, instead of '/user/smith'.
> 
> -Gang

The tilde character '~' is a shell metacharacter.  It is converted to the
value of $HOME before the shell invokes the bin/hadoop command.  For
example:

1) User types "bin/hadoop fs -ls ~/"
2) Your shell changes this command to "bin/hadoop fs -ls /user/smith/"
(or wherever the user's local linux home directory is located).
3) Your shell invokes the command and hadoop returns the contents of
'/usr/research/home/smith' from hdfs (which probably doesn't exist), not
from the local filesystem.

Hadoop doesn't do anything special with the ~ character.

--Mike
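
Mike's explanation can be verified in any shell with no Hadoop involved at all; the sketch below only assumes a POSIX shell with $HOME set:

```shell
# Unquoted ~ is rewritten to $HOME by the shell before the invoked command runs;
# quoting suppresses the expansion, so the literal characters are passed through.
expanded=$(printf '%s' ~/)     # e.g. /home/smith/
literal=$(printf '%s' '~/')    # exactly the two characters ~/
echo "unquoted: $expanded"
echo "quoted:   $literal"
```

So by the time `bin/hadoop fs -ls ~/` starts, hadoop only ever sees the already-expanded local path and never gets a chance to apply its own /user/&lt;name&gt; convention.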


Re: accounts permission on hadoop

Posted by Gang Luo <lg...@yahoo.com.cn>.
Thanks Edward.

I was wondering: does the home directory on HDFS also comply with the linux system?
Since I created '/usr/research/home/smith' on HDFS for user 'smith', which is
also the home directory for smith in linux, when he inputs 'bin/hadoop fs -ls
~/' he will get this directory, instead of '/user/smith'.

-Gang




----- Original Message ----
From: Edward Capriolo <ed...@gmail.com>
To: common-user@hadoop.apache.org
Sent: 2010/8/31 (Tue) 5:43:22 PM
Subject: Re: accounts permission on hadoop

On Tue, Aug 31, 2010 at 5:07 PM, Gang Luo <lg...@yahoo.com.cn> wrote:
> Hi all,
> I am the administrator of a hadoop cluster. I want to know how to specify the
> group a user belongs to. Or does hadoop just use the group/user information from
> the linux system it runs on? For example, if a user 'smith' belongs to a group
> 'research' in the linux system, what are his account and group on HDFS?
>
>
> Also, I want to know how to specify the prefix of the home directory. If there
> are two directories on HDFS, '/usr/smith' and '/user/smith', when smith inputs
> 'bin/hadoop fs -ls ~/', which directory will he see?
>
> Thanks,
> -Gang
>
>
>
>
>

Currently hadoop gets its user groups from the posix user/groups.

Your default home directory would be /user/smith. Directories like
/usr/smith or /home/smith usually appear by accidental keystrokes,
copy operations, etc.

Regards,
Edward
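
Since (pre-security) hadoop takes the client's POSIX identity at face value, the commands below show what the cluster would see for the current user. The /user/&lt;name&gt; home directory is the default convention, not something these commands query from HDFS:

```shell
# The local POSIX user and group list become the HDFS user and groups.
user=$(id -un)
echo "HDFS user:    $user"
echo "HDFS groups:  $(id -Gn)"
echo "default home: /user/$user"
```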




Re: accounts permission on hadoop

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Sep 1, 2010, at 9:08 AM, Todd Lipcon wrote:
>>> 
>>> Currently hadoop gets its user groups from the posix user/groups.
>> 
>> ... based upon what the client sends, not what the server knows.
> 
> Not anymore in trunk or the security branch - now it's mapped on the
> server side with a configurable resolver class.


Yes, but only like 3 people use that stuff presently.

Trunk=unicorns and ponies.



Re: accounts permission on hadoop

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, Aug 31, 2010 at 5:28 PM, Allen Wittenauer
<aw...@linkedin.com> wrote:
>
> On Aug 31, 2010, at 2:43 PM, Edward Capriolo wrote:
>
>> On Tue, Aug 31, 2010 at 5:07 PM, Gang Luo <lg...@yahoo.com.cn> wrote:
>>> Hi all,
>>> I am the administrator of a hadoop cluster. I want to know how to specify the
>>> group a user belongs to. Or does hadoop just use the group/user information from
>>> the linux system it runs on? For example, if a user 'smith' belongs to a group
>>> 'research' in the linux system, what are his account and group on HDFS?
>>>
>>>
>>
>> Currently hadoop gets its user groups from the posix user/groups.
>
> ... based upon what the client sends, not what the server knows.

Not anymore in trunk or the security branch - now it's mapped on the
server side with a configurable resolver class.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera
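
For reference, the server-side resolver Todd describes is selected in core-site.xml. The property below exists on the security branch, and the shell-based mapping shown is its default implementation; treat this as a sketch to check against your version, not a guaranteed drop-in:

```xml
<!-- core-site.xml: how the server maps an authenticated user to groups. -->
<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value>
</property>
```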

Re: accounts permission on hadoop

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Aug 31, 2010, at 2:43 PM, Edward Capriolo wrote:

> On Tue, Aug 31, 2010 at 5:07 PM, Gang Luo <lg...@yahoo.com.cn> wrote:
>> Hi all,
>> I am the administrator of a hadoop cluster. I want to know how to specify the
>> group a user belongs to. Or does hadoop just use the group/user information from
>> the linux system it runs on? For example, if a user 'smith' belongs to a group
>> 'research' in the linux system, what are his account and group on HDFS?
>> 
>> 
> 
> Currently hadoop gets its user groups from the posix user/groups.

... based upon what the client sends, not what the server knows.

Re: accounts permission on hadoop

Posted by Edward Capriolo <ed...@gmail.com>.
On Tue, Aug 31, 2010 at 5:07 PM, Gang Luo <lg...@yahoo.com.cn> wrote:
> Hi all,
> I am the administrator of a hadoop cluster. I want to know how to specify the
> group a user belongs to. Or does hadoop just use the group/user information from
> the linux system it runs on? For example, if a user 'smith' belongs to a group
> 'research' in the linux system, what are his account and group on HDFS?
>
>
> Also, I want to know how to specify the prefix of the home directory. If there
> are two directories on HDFS, '/usr/smith' and '/user/smith', when smith inputs
> 'bin/hadoop fs -ls ~/', which directory will he see?
>
> Thanks,
> -Gang
>
>
>
>
>

Currently hadoop gets its user groups from the posix user/groups.

Your default home directory would be /user/smith. Directories like
/usr/smith or /home/smith usually appear by accidental keystrokes,
copy operations, etc.

Regards,
Edward

accounts permission on hadoop

Posted by Gang Luo <lg...@yahoo.com.cn>.
Hi all,
I am the administrator of a hadoop cluster. I want to know how to specify the
group a user belongs to. Or does hadoop just use the group/user information from
the linux system it runs on? For example, if a user 'smith' belongs to a group
'research' in the linux system, what are his account and group on HDFS?


Also, I want to know how to specify the prefix of the home directory. If there
are two directories on HDFS, '/usr/smith' and '/user/smith', when smith inputs
'bin/hadoop fs -ls ~/', which directory will he see?

Thanks,
-Gang




Re: cluster startup problem

Posted by Greg Roelofs <ro...@yahoo-inc.com>.
Hemanth Yamijala wrote:

> On Mon, Aug 30, 2010 at 8:19 AM, Gang Luo <lg...@yahoo.com.cn> wrote:
> >
> > 1. Can I share hadoop code and its configuration across nodes? Say I have a
> > distributed file system running in the cluster and all the nodes could see the
> > hadoop code and conf there. So all the nodes will use the same copy of code and
> > conf to run. Is it possible?

> If they are on the same path, technically it should be possible.
> However, I am not sure it is advisable at all. We've tried to do
> something like this using NFS and it fails in ways that make debugging
> extremely hard.

Read-only NFS?  I recently looked into an NFS-related unit-test bug (MR-2041),
but those failures were due to directory creation and/or permissions-setting
(i.e., writing), apparently timing-related.

Greg

Re: cluster startup problem

Posted by Hemanth Yamijala <yh...@gmail.com>.
Hi,

On Mon, Aug 30, 2010 at 8:19 AM, Gang Luo <lg...@yahoo.com.cn> wrote:
> Hi all,
> I am trying to configure and start a hadoop cluster on EC2. I got some problems
> here.
>
>
> 1. Can I share hadoop code and its configuration across nodes? Say I have a
> distributed file system running in the cluster and all the nodes could see the
> hadoop code and conf there. So all the nodes will use the same copy of code and
> conf to run. Is it possible?
>

If they are on the same path, technically it should be possible.
However, I am not sure it is advisable at all. We've tried to do
something like this using NFS and it fails in ways that make debugging
extremely hard. In short, having local copies on all nodes at the same
path is the recommended option.

> 2. if all the nodes could share hadoop and conf, does it mean I can launch
> hadoop (bin/start-dfs.sh, bin/start-mapred.sh) from any node (even slave node)?
>
> 3. I think I specified master and slaves correctly. When I launch hadoop from the
> master node, no tasktracker or datanode is launched on the slave nodes. The log on the
> slave nodes says:
>
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
> Incompatible namespaceIDs in /mnt/hadoop/dfs/data: namenode namespaceID =
> 1048149291; datanode namespaceID = 313740560
>
> what is the problem?
>
> Thanks,
> -Gang
>
>
>
>
>

Re: Does fair scheduler in Hadoop 0.20.2 support preemption or not?

Posted by Matei Zaharia <ma...@eecs.berkeley.edu>.
 The one in 0.20.2 doesn't support it. However, the Cloudera
Distribution of Hadoop has backported preemption (and the other fair
scheduler features in 0.21), so you could try that if you want
preemption on a 0.20 cluster.

Matei
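
For anyone on a build that does include preemption (CDH, or 0.21 once released), it is enabled in mapred-site.xml and tuned per pool in the allocations file. The property and element names below are from the fair scheduler documentation of that era; double-check them against the docs shipped with your version:

```xml
<!-- mapred-site.xml: turn preemption on (only in builds that support it). -->
<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>true</value>
</property>

<!-- fair-scheduler.xml allocations file: preempt on behalf of this pool if it
     has been below its minimum share for 60 seconds. -->
<pool name="research">
  <minMaps>10</minMaps>
  <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
</pool>
```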

On 8/29/2010 10:37 PM, xiujin yang wrote:
> Hadoop Version: 0.20.2
> Scheduler: Fair scheduler
>
> Now I use the fair scheduler to arrange jobs, but I found that the scheduler doesn't support preemption.
>
> Does 0.20.2 support preemption?
>
> I know 0.21.0 will support it,
> https://issues.apache.org/jira/browse/MAPREDUCE-551
>
>
> Thank you in advance. 
>
>
> Best,
>
> Xiujin Yang
>
>


Does fair scheduler in Hadoop 0.20.2 support preemption or not?

Posted by xiujin yang <xi...@hotmail.com>.
Hadoop Version: 0.20.2
Scheduler: Fair scheduler

Now I use the fair scheduler to arrange jobs, but I found that the scheduler doesn't support preemption.

Does 0.20.2 support preemption?

I know 0.21.0 will support it,
https://issues.apache.org/jira/browse/MAPREDUCE-551


Thank you in advance. 


Best,

Xiujin Yang



RE: cluster startup problem

Posted by xiujin yang <xi...@hotmail.com>.
> Date: Mon, 30 Aug 2010 10:49:50 +0800
> From: lgpublic@yahoo.com.cn
> Subject: cluster startup problem
> To: common-user@hadoop.apache.org
> 
> Hi all,
> I am trying to configure and start a hadoop cluster on EC2. I got some problems 
> here. 
> 
> 
> 1. Can I share hadoop code and its configuration across nodes? Say I have a 
> distributed file system running in the cluster and all the nodes could see the 
> hadoop code and conf there. So all the nodes will use the same copy of code and 
> conf to run. Is it possible?
Use rsync.
 

> 2. if all the nodes could share hadoop and conf, does it mean I can launch 
> hadoop (bin/start-dfs.sh, bin/start-mapred.sh) from any node (even slave node)?
>
Just give it a try and you will get your answer.

> 3. I think I specified master and slaves correctly. When I launch hadoop from the
> master node, no tasktracker or datanode is launched on the slave nodes. The log on the
> slave nodes says:
> 
> ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: 
> Incompatible namespaceIDs in /mnt/hadoop/dfs/data: namenode namespaceID = 
> 1048149291; datanode namespaceID = 313740560
> 
> what is the problem?
If the HDFS data is disposable, just delete the data directories on the datanodes one by one.
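
Why deleting works: the datanode stores its namespaceID in current/VERSION under its data directory and refuses to start when that ID disagrees with the namenode's; wiping the directory lets it re-register and pick up the namenode's current ID. Here is a miniature simulation of the check, with a temp directory standing in for /mnt/hadoop/dfs/data and the IDs taken from the error above:

```shell
# Simulate the datanode startup check against its stored VERSION file.
DATA_DIR=$(mktemp -d)                       # stand-in for /mnt/hadoop/dfs/data
mkdir -p "$DATA_DIR/current"
echo "namespaceID=313740560" > "$DATA_DIR/current/VERSION"
NAMENODE_ID=1048149291
DATANODE_ID=$(sed -n 's/^namespaceID=//p' "$DATA_DIR/current/VERSION")
if [ "$DATANODE_ID" != "$NAMENODE_ID" ]; then
  echo "Incompatible namespaceIDs: namenode=$NAMENODE_ID datanode=$DATANODE_ID"
fi
```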


> Thanks,
> -Gang
> 
> 
> 
>       

cluster startup problem

Posted by Gang Luo <lg...@yahoo.com.cn>.
Hi all,
I am trying to configure and start a hadoop cluster on EC2. I got some problems 
here. 


1. Can I share hadoop code and its configuration across nodes? Say I have a 
distributed file system running in the cluster and all the nodes could see the 
hadoop code and conf there. So all the nodes will use the same copy of code and 
conf to run. Is it possible?

2. if all the nodes could share hadoop and conf, does it mean I can launch 
hadoop (bin/start-dfs.sh, bin/start-mapred.sh) from any node (even slave node)?

3. I think I specified master and slaves correctly. When I launch hadoop from the
master node, no tasktracker or datanode is launched on the slave nodes. The log on the
slave nodes says:

ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: 
Incompatible namespaceIDs in /mnt/hadoop/dfs/data: namenode namespaceID = 
1048149291; datanode namespaceID = 313740560

what is the problem?

Thanks,
-Gang



      

Re: Job in 0.21

Posted by Owen O'Malley <om...@apache.org>.
On Sun, Aug 29, 2010 at 4:39 PM, Mark <st...@gmail.com> wrote:
>  How should I create a new Job instance in 0.21? It looks like
> Job(Configuration conf, String jobName) has been deprecated.

Go ahead and use that method. I have a jira open to undeprecate it.

-- Owen