Posted to hdfs-dev@hadoop.apache.org by Colin McCabe <cm...@alumni.cmu.edu> on 2012/06/12 00:57:41 UTC

validating user IDs

Hi all,

I recently pulled the latest source, and ran a full build.  The
command line was this:
mvn compile -Pnative

I was confronted with this:

[INFO] Requested user cmccabe has id 500, which is below the minimum
allowed 1000
[INFO] FAIL: test-container-executor
[INFO] ================================================
[INFO] 1 of 1 test failed
[INFO] Please report to mapreduce-dev@hadoop.apache.org
[INFO] ================================================
[INFO] make[1]: *** [check-TESTS] Error 1
[INFO] make[1]: Leaving directory
`/home/cmccabe/hadoop4/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/container-executor'

Needless to say, it didn't do much to improve my mood.  I was even
less happy when I discovered that -DskipTests has no effect on native
tests (they always run).  See HADOOP-8480.

Unfortunately, it seems like this problem is popping up more and more
in our native code.  It first appeared in test-task-controller (see
MAPREDUCE-2376) and then later in test-container-executor
(HADOOP-8499).  The basic problem seems to be the hardcoded assumption
that all user IDs below 1000 are system IDs.

It is true that there are configuration files that can be changed to
alter the minimum user ID, but unfortunately these configuration files
are not used by the unit tests.  So anyone developing on a platform
where the user IDs start at 500 is now a second-class citizen, unable
to run unit tests.  This includes anyone running Red Hat, MacOS,
Fedora, etc.

Personally, I can change my user ID.  It's a time-consuming process,
because I need to re-uid all of my files, but I can do it.  That
luxury may not be available to everyone, though: developers who don't
have root on their machines, or who use a pre-assigned user ID to
connect to NFS, come to mind.

It's true that we could hack around this with environment variables.
It might even be possible to have Maven set those variables
automatically from the current user ID.  The larger question, though,
is whether this UID validation scheme makes any sense in the first
place.  I have a user named "nobody" whose user ID is 65534.  Surely
I should not be able to run map-reduce jobs as that user?  Yet,
because the check enforces only a minimum, I can do exactly that.
The root of the problem seems to be that "automatic" (regular) user
IDs have both a default minimum and a default maximum, but we only
validate against the minimum.  On Linux, this configuration is stored
in /etc/login.defs.

On my system, it has:
SYSTEM_UID_MIN            100
SYSTEM_UID_MAX            499
UID_MIN                  500
UID_MAX                 60000

So anything over 60000 (like nobody) is not considered a valid user
ID for regular users.  We could potentially read this file (at least
on Linux) and get more sensible defaults.
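
Something like this is roughly what I have in mind (an untested
sketch; the helper name is made up, and real code would want better
error handling):

#include <stdio.h>
#include <string.h>

/* Sketch: scan /etc/login.defs for a numeric key such as UID_MIN.
 * Returns the value found, or 'def' if the key is absent. */
static long read_login_defs(const char *key, long def)
{
    FILE *fp = fopen("/etc/login.defs", "r");
    char line[256];
    long val = def;

    if (fp == NULL)
        return def;             /* no login.defs: keep the default */
    while (fgets(line, sizeof(line), fp) != NULL) {
        char name[64];
        long num;
        /* Entries look like "UID_MIN   500"; comment lines start
         * with '#' and simply fail to match the key. */
        if (sscanf(line, "%63s %ld", name, &num) == 2 &&
            strcmp(name, key) == 0) {
            val = num;
        }
    }
    fclose(fp);
    return val;
}

The tests (and the executor) could then default to something like
read_login_defs("UID_MIN", 500) rather than a hardcoded 1000.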

I am also curious if we could simply check whether the user we're
trying to run the job as has a valid login shell.  System users are
almost always set to have a login shell of /bin/false or
/sbin/nologin.
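
Roughly like this (again untested, and the list of "no login" shells
would need to be configurable or more complete):

#include <sys/types.h>
#include <pwd.h>
#include <string.h>

/* Sketch: treat a user as a system account if its login shell is
 * one of the usual "no login" shells.  Returns 1 if the user looks
 * like a real, interactive user; 0 otherwise. */
static int has_usable_shell(const char *username)
{
    struct passwd *pwd = getpwnam(username);

    if (pwd == NULL || pwd->pw_shell == NULL)
        return 0;               /* unknown user: reject */
    if (strcmp(pwd->pw_shell, "/bin/false") == 0 ||
        strcmp(pwd->pw_shell, "/sbin/nologin") == 0 ||
        strcmp(pwd->pw_shell, "/usr/sbin/nologin") == 0)
        return 0;
    return 1;
}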

Thoughts?
Colin

Re: validating user IDs

Posted by "Aaron T. Myers" <at...@cloudera.com>.
-hdfs-dev@
+mapreduce-dev@

Moving to the more-relevant mapreduce-dev.

--
Aaron T. Myers
Software Engineer, Cloudera

Re: validating user IDs

Posted by Colin McCabe <cm...@alumni.cmu.edu>.
Sure.  We could also find the current user ID and bake that into the
test as an "acceptable" UID, if that makes sense.
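
Something along these lines in the test setup, say (a sketch;
"min_user_id" here just stands in for however the test configures the
executor's floor):

#include <sys/types.h>
#include <unistd.h>

/* Sketch: before running the checks, lower the configured minimum
 * UID to whatever UID is actually running the test suite, so the
 * tests pass regardless of where the OS starts allocating users. */
static void relax_uid_floor_for_tests(long *min_user_id)
{
    uid_t me = getuid();

    if ((long) me < *min_user_id)
        *min_user_id = (long) me;
}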

Colin


On Mon, Jun 11, 2012 at 4:12 PM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
> Colin,
>
> Would it be possible to use some kind of cmake config magic to set a
> macro to the current OS limit? Even if this means detecting the OS
> version and assuming its default limit.
>
> thx

Re: validating user IDs

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Colin,

Would it be possible to use some kind of cmake config magic to set a
macro to the current OS limit? Even if this means detecting the OS
version and assuming its default limit.
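
For instance (just a sketch, and the macro names here are made up),
the build could probe the value at configure time and the C code
would fall back on it:

/* Sketch: cmake would add -DDETECTED_UID_MIN=<value> after probing
 * /etc/login.defs (or an OS-specific default) at configure time. */
#ifdef DETECTED_UID_MIN
#define DEFAULT_MIN_USER_ID DETECTED_UID_MIN
#else
#define DEFAULT_MIN_USER_ID 1000    /* historical fallback */
#endif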

thx

-- 
Alejandro