Posted to hdfs-dev@hadoop.apache.org by Todd Lipcon <to...@cloudera.com> on 2010/09/13 19:05:43 UTC

Re: hadoop.job.ugi backwards compatibility

On Mon, Sep 13, 2010 at 9:31 AM, Owen O'Malley <om...@apache.org> wrote:

> Moving the discussion over to the more appropriate mapreduce-dev.
>

This is not MR-specific, since the strangely named hadoop.job.ugi determines
HDFS permissions as well. +CC hdfs-dev... though I actually think this is an
issue that users will have interest in, which is why I posted to general
initially rather than a dev list.


> On Mon, Sep 13, 2010 at 9:08 AM, Todd Lipcon <to...@cloudera.com> wrote:
>
> > 1) Groups resolution happens on the server side, where it used to
> > happen on the client. Thus, all Hadoop users must exist on the NN/JT
> > machines in order for group mapping to succeed (or the user must write
> > a custom group mapper).
>
> There is a plugin that performs the group lookup. See HADOOP-4656.
> There is no requirement for having the user accounts on the NN/JT
> although that is the easiest approach. It is not recommended that the
> users be allowed to login.
>

"or the user must write a custom group mapper" above refers to this plugin
capability. But I think most users do not want to spend the time to write
(or even setup) such a plugin beyond the default shell-based mapping
service.
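
For reference, HADOOP-4656 makes the group lookup pluggable, and the
shell-based mapping is just the default provider. A minimal sketch of a
custom mapper, assuming the provider interface exposes a single
getGroups(String) method as it did when the plugin point was introduced
(details illustrative, not authoritative):

import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.security.GroupMappingServiceProvider;

// Illustrative only: maps every user to one static group instead of forking
// the shell on the NN/JT for each lookup. A real mapper might consult LDAP.
public class StaticGroupsMapping implements GroupMappingServiceProvider {
  public List<String> getGroups(String user) throws IOException {
    return Arrays.asList("users");
  }
}

It would be wired in by pointing hadoop.security.group.mapping at the class,
in place of the default ShellBasedUnixGroupsMapping.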


> I think it is important that turning security on and off doesn't
> drastically change the semantics or protocols. That will become much
> much harder to support downstream.
>
>
As someone who spends an awful lot of time doing downstream support of lots
of different clusters, I actually disagree. I believe the majority of users
do *not* plan on turning on security, so keeping things simpler for them is
worth a lot. In many of these clusters the users and the ops team and the
developers are all one and the same - it's not the multitenant "internal
service" model that we see at the larger installations like Yahoo or
Facebook.


> > 2) The hadoop.job.ugi parameter is ignored - instead the user has to
> > use the new UGI.createRemoteUser("foo").doAs() API, even in simple
> > security.
>
> User code that counts on hadoop.job.ugi working will be horribly
> broken once you turn on security. Turning on and off security should
> not involve testing all of your applications. It is unfortunate that
> we ever used the configuration value as the user, but continuing to
> support it will make our users' code much much more brittle.
>

The assumption above is "once you turn on security" - but many users will
not and probably never will turn on security. Providing a transition plan
for one version is our usual policy here - I agree that long term we would
like to do away with this hack of a configuration parameter. Since it's not
hard to provide a backwards compatibility path with a deprecation warning
for one version, are you against it? Or just saying that on your particular
clusters you will choose not to take advantage of it?
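
To make the before/after concrete for application code, the transition
looks roughly like the sketch below; "alice" and the group name are just
placeholders:

import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class ImpersonationExample {
  public static void main(String[] args) throws Exception {
    // Old 0.20 style: set the config knob and everything created from this
    // conf (FileSystem, JobClient, ...) acted as "alice".
    Configuration oldStyle = new Configuration();
    oldStyle.set("hadoop.job.ugi", "alice,users");

    // New style: the config value is ignored; wrap the work in doAs() on a
    // UGI created explicitly for the user you want to act as.
    UserGroupInformation alice = UserGroupInformation.createRemoteUser("alice");
    alice.doAs(new PrivilegedExceptionAction<Void>() {
      public Void run() throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        fs.mkdirs(new Path("/user/alice/scratch"));
        return null;
      }
    });
  }
}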

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera

Re: hadoop.job.ugi backwards compatibility

Posted by Allen Wittenauer <aw...@linkedin.com>.
On Sep 13, 2010, at 10:05 AM, Todd Lipcon wrote:

> On Mon, Sep 13, 2010 at 9:31 AM, Owen O'Malley <om...@apache.org> wrote:
> 
>> Moving the discussion over to the more appropriate mapreduce-dev.
>> 
> 
> This is not MR-specific, since the strangely named hadoop.job.ugi determines
> HDFS permissions as well. +CC hdfs-dev... though I actually think this is an
> issue that users will have interest in, which is why I posted to general
> initially rather than a dev list.


+1


Re: hadoop.job.ugi backwards compatibility

Posted by Owen O'Malley <om...@apache.org>.
On Sep 13, 2010, at 4:23 PM, Todd Lipcon wrote:

> I agree that keeping API compatibility for UGI was probably impossible,
> and respect that. But it would certainly be very easy to do a patch like
> the following:
>
> JobClient(Configuration conf) {
>  if (conf.get("hadoop.job.ugi") != null &&
> UserGroupInformation.isSecurityEnabled()) {
>    LOG.warn("Stop being evil. Don't use hadoop.job.ugi! RAAWR");
>    UserGroupInformation.createRemoteUser(...).doAs() { create proxy }
>  } else {
>    create normal RPC proxy;
>  }
> }

My problem is threefold:
   1. It isn't one or two spots. It is a *lot* of spots. Doing it
      inconsistently would be far worse than useless.
   2. Having two different authentication paths dramatically increases
      the chance for bugs.
   3. The previously mentioned badness where the API semantics
      dramatically change with the value of a config variable that isn't
      there to enable backwards compatibility.

Furthermore, the upside is really small, consisting only of the users who:
   1. have developed internal servers that handle multiple users,
   2. are on Hadoop 0.20,
   3. never plan on turning on security,
   4. are interested in moving to 0.21 or 0.22, and
   5. aren't willing to do the straightforward fixes to their code.

-- Owen

Re: hadoop.job.ugi backwards compatibility

Posted by Todd Lipcon <to...@cloudera.com>.
On Mon, Sep 13, 2010 at 1:04 PM, Owen O'Malley <om...@apache.org> wrote:

> On Mon, Sep 13, 2010 at 11:10 AM, Todd Lipcon <to...@cloudera.com> wrote:
> > Yep, but there are plenty of 10 node clusters out there that do important
> > work at small startups or single-use-case installations, too. We need to
> > provide scalability and security features that work for the 100+ node
> > clusters but also not leave the beginners in the dust.
>
> 10 node clusters are an important use case, but creating the user
> accounts on those clusters is very easy because of the few users.
> Furthermore, if the accounts aren't there it just means the users have
> no groups. Which for a single use system with security turned off
> isn't the end of the world.
>
> > But I think there are plenty of people out there who have built small
> > webapps, shell scripts, cron jobs, etc that use hadoop.job.ugi on some
> > shared account to impersonate other users.
>
> I'd be surprised. At Yahoo, the primary problem came with people
> screen scraping the jobtracker http. With security turned off that
> isn't an issue. Again, it isn't hard, just the evolving interface of
> UserGroupInformation changed. With security, we tried really hard to
> maintain backwards compatibility and succeeded for the vast (99%+)
> majority of the users.
>
> > Perhaps I am estimating
> > incorrectly - that's why I wanted this discussion on a user-facing list
> > rather than a dev-facing list.
>
> Obviously the pointer is there for them to follow into the rabbit hole
> of the dev lists. *grin*
>
> > Another example use case that I do a lot on non-secure clusters is:
> > hadoop fs -Dhadoop.job.ugi=hadoop,hadoop <something I want to do as a
> > superuser>.
> > The permissions model we have in 0.20 obviously isn't secure, but it's
> > nice to avoid accidental mistakes, and making it easy to "sudo" like
> > that is handy.
>
> It might make sense to add a new switch ( -user ?) to hadoop fs that
> does a doAs before doing the
> shell command. You could even make it fancy and try to be a proxy user
> if security is turned on.
>

Yep, I agree - I think either (ab)using proxyuser functionality or adding
some new "sudoers" like configuration would be very handy and we should do
it.
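
Just to make that idea concrete (nothing like this exists today; the class
and the command shape below are hypothetical), a simple-security version of
such a "sudo" wrapper could be little more than a doAs() around FsShell;
with security on it would presumably use createProxyUser() plus the
proxy-user configuration instead:

import java.security.PrivilegedExceptionAction;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical "sudo for hadoop fs": SudoFsShell <user> <fs args...>
public class SudoFsShell {
  public static void main(String[] args) throws Exception {
    final String user = args[0];
    final String[] shellArgs = Arrays.copyOfRange(args, 1, args.length);

    UserGroupInformation ugi = UserGroupInformation.createRemoteUser(user);
    int rc = ugi.doAs(new PrivilegedExceptionAction<Integer>() {
      public Integer run() throws Exception {
        // FsShell is a Tool, so ToolRunner handles the generic options.
        return ToolRunner.run(new Configuration(), new FsShell(), shellArgs);
      }
    });
    System.exit(rc);
  }
}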


>
> > Regardless of our particular opinions, isn't our policy that we cannot
> > break API compatibility between versions without a one-version
> > deprecation period?
>
> There wasn't a way to keep UGI stable. It was a broken design before
> the security work. It is marked evolving so we try to minimize
> breakage, but it isn't prohibited.
>
>
I agree that keeping API compatibility for UGI was probably impossible, and
respect that. But it would certainly be very easy to do a patch like the
following:

JobClient(Configuration conf) {
  if (conf.get("hadoop.job.ugi") != null &&
UserGroupInformation.isSecurityEnabled()) {
    LOG.warn("Stop being evil. Don't use hadoop.job.ugi! RAAWR");
    UserGroupInformation.createRemoteUser(...).doAs() { create proxy }
  } else {
    create normal RPC proxy;
  }
}

... and the same on the HDFS side.
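
Filled out slightly (and assuming the intent is that the shim only honors
the old property when security is *disabled*, i.e. the isSecurityEnabled()
check above negated), the user-resolution half of such a patch might look
like this; resolveUser() is a made-up helper, not existing code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class LegacyUgiCompat {
  // Hypothetical helper: honor hadoop.job.ugi with a deprecation warning in
  // simple security, otherwise return the real current user.
  public static UserGroupInformation resolveUser(Configuration conf)
      throws Exception {
    String legacy = conf.get("hadoop.job.ugi");
    if (legacy != null && !UserGroupInformation.isSecurityEnabled()) {
      System.err.println("WARN: hadoop.job.ugi is deprecated; use "
          + "UserGroupInformation.createRemoteUser(...).doAs() instead");
      // The value was historically "user,group1,group2,..."; only the user
      // part matters to createRemoteUser().
      return UserGroupInformation.createRemoteUser(legacy.split(",")[0]);
    }
    return UserGroupInformation.getCurrentUser();
  }
}

JobClient and DFSClient construction would then create their RPC proxies
inside resolveUser(conf).doAs(...), which is essentially what the
pseudocode above does.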

Would you -1 such a compatibility layer?

-Todd



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: hadoop.job.ugi backwards compatibility

Posted by Owen O'Malley <om...@apache.org>.
On Mon, Sep 13, 2010 at 11:10 AM, Todd Lipcon <to...@cloudera.com> wrote:
> Yep, but there are plenty of 10 node clusters out there that do important
> work at small startups or single-use-case installations, too. We need to
> provide scalability and security features that work for the 100+ node
> clusters but also not leave the beginners in the dust.

10 node clusters are an important use case, but creating the user
accounts on those clusters is very easy because of the few users.
Furthermore, if the accounts aren't there it just means the users have
no groups. Which for a single use system with security turned off
isn't the end of the world.

> But I think there are plenty of people out there who have built small
> webapps, shell scripts, cron jobs, etc that use hadoop.job.ugi on some
> shared account to impersonate other users.

I'd be surprised. At Yahoo, the primary problem came with people
screen scraping the jobtracker http. With security turned off that
isn't an issue. Again, it isn't hard, just the evolving interface of
UserGroupInformation changed. With security, we tried really hard to
maintain backwards compatibility and succeeded for the vast (99%+)
majority of the users.

> Perhaps I am estimating
> incorrectly - that's why I wanted this discussion on a user-facing list
> rather than a dev-facing list.

Obviously the pointer is there for them to follow into the rabbit hole
of the dev lists. *grin*

> Another example use case that I do a lot on non-secure clusters is: hadoop
> fs -Dhadoop.job.ugi=hadoop,hadoop <something I want to do as a superuser>.
> The permissions model we have in 0.20 obviously isn't secure, but it's nice
> to avoid accidental mistakes, and making it easy to "sudo" like that is
> handy.

It might make sense to add a new switch ( -user ?) to hadoop fs that
does a doAs before doing the
shell command. You could even make it fancy and try to be a proxy user
if security is turned on.

> Regardless of our particular opinions, isn't our policy that we cannot break
> API compatibility between versions without a one-version deprecation period?

There wasn't a way to keep UGI stable. It was a broken design before
the security work. It is marked evolving so we try to minimize
breakage, but it isn't prohibited.

-- Owen

Re: hadoop.job.ugi backwards compatibility

Posted by Todd Lipcon <to...@cloudera.com>.
On Mon, Sep 13, 2010 at 10:59 AM, Owen O'Malley <om...@apache.org> wrote:

> On Mon, Sep 13, 2010 at 10:05 AM, Todd Lipcon <to...@cloudera.com> wrote:
>
> > This is not MR-specific, since the strangely named hadoop.job.ugi
> > determines HDFS permissions as well.
>
> Yeah, after I hit send, I realized that I should have used common-dev.
> This is really a dev issue.
>
> > "or the user must write a custom group mapper" above refers to this
> plugin
> > capability. But I think most users do not want to spend the time to write
> > (or even setup) such a plugin beyond the default shell-based mapping
> > service.
>
> Sure, which is why it is easiest to just have the (hopefully disabled)
> user accounts on the jt/nn. Any installs > 100 nodes should be using
> HADOOP-6864 to avoid the fork in the JT/NN.
>

Yep, but there are plenty of 10 node clusters out there that do important
work at small startups or single-use-case installations, too. We need to
provide scalability and security features that work for the 100+ node
clusters but also not leave the beginners in the dust.


>
> > As someone who spends an awful lot of time doing downstream support of
> > lots of different clusters, I actually disagree.
>
> Normal applications never need to do doAs. They run as the default
> user. This only comes up in servers that deal with multiple users. In
> *that* context, it sucks having servers that only work in non-secure
> mode. If some server X only works without security that sucks. Doing
> doAs isn't harder, it is just different. Having two different
> semantics models *will* cause lots of grief.
>

I agree that all real (ie community) projects should support both security
and non-security and shouldn't be using hadoop.job.ugi to impersonate users.
But I think there are plenty of people out there who have built small
webapps, shell scripts, cron jobs, etc that use hadoop.job.ugi on some
shared account to impersonate other users. Perhaps I am estimating
incorrectly - that's why I wanted this discussion on a user-facing list
rather than a dev-facing list.

Another example use case that I do a lot on non-secure clusters is: hadoop
fs -Dhadoop.job.ugi=hadoop,hadoop <something I want to do as a superuser>.
The permissions model we have in 0.20 obviously isn't secure, but it's nice
to avoid accidental mistakes, and making it easy to "sudo" like that is
handy.

Regardless of our particular opinions, isn't our policy that we cannot break
API compatibility between versions without a one-version deprecation period?
I see this as an important API (even if it isn't one we like) and breaking
it without such a transition period is against our own rules. Like you said,
doAs() isn't any harder, but we need to give people a grace period to switch
over, and we probably need to write some command line tools to allow fs
operations as superuser, etc.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera

Re: hadoop.job.ugi backwards compatibility

Posted by Owen O'Malley <om...@apache.org>.
On Mon, Sep 13, 2010 at 10:05 AM, Todd Lipcon <to...@cloudera.com> wrote:

> This is not MR-specific, since the strangely named hadoop.job.ugi determines
> HDFS permissions as well.

Yeah, after I hit send, I realized that I should have used common-dev.
This is really a dev issue.

> "or the user must write a custom group mapper" above refers to this plugin
> capability. But I think most users do not want to spend the time to write
> (or even set up) such a plugin beyond the default shell-based mapping
> service.

Sure, which is why it is easiest to just have the (hopefully disabled)
user accounts on the jt/nn. Any installs > 100 nodes should be using
HADOOP-6864 to avoid the fork in the JT/NN.

> As someone who spends an awful lot of time doing downstream support of lots
> of different clusters, I actually disagree.

Normal applications never need to do doAs. They run as the default
user. This only comes up in servers that deal with multiple users. In
*that* context, it sucks having servers that only work in non-secure
mode. If some server X only works without security that sucks. Doing
doAs isn't harder, it is just different. Having two different
semantics models *will* cause lots of grief.

-- Owen
