Posted to dev@apex.apache.org by Pramod Immaneni <pr...@datatorrent.com> on 2017/05/18 22:44:15 UTC

impersonation and application path

The Apex CLI supports impersonation in secure mode. With impersonation, the
user running the CLI or the user authenticating with Hadoop (henceforth
referred to as the login user) can be different from the effective user
under which the actions are performed in Hadoop. For example, an
application can be launched by user A to run in Hadoop as user B. This is
similar to the sudo functionality in Unix. You can find more details in
the Impersonation section of the security documentation:
https://apex.apache.org/docs/apex/security/

What happens today when launching an application with impersonation, using
the above launch example, is that even though the application runs as user
B, it still uses user A's HDFS path for the application path. The
application path is where the artifacts necessary to run the application
and runtime files such as checkpoints are stored. This means that user B
needs read and write access to user A's application path folders.

This may not be allowed in certain environments, as it may be a policy
violation for the following reason. Because user A is able to impersonate
user B to launch the application, A is considered to be a higher
privileged user than B and is given the necessary privileges in Hadoop to
do so. But after launch, B needs to access folders belonging to A, which
could constitute a violation, as we are now asking for permissions for a
lower privilege user to access resources of a higher privilege user.

I would like to propose adding a configuration setting which, when set,
will use the application path in the impersonated user's home directory
(user B) as opposed to the impersonating user's home directory (user A). If
this setting is not specified, the behavior can default to what it is today
for backwards compatibility.
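
The proposed setting could be sketched roughly as follows. This is only an
illustration under stated assumptions: the class name, the property name
example.useImpersonatedUserPath, the /user/<name> home directory layout and
the apps sub-directory are all hypothetical and are not taken from Apex.

```java
import java.util.Properties;

// Hypothetical sketch: the class, the property name and the directory layout
// are assumptions made for illustration, not Apex APIs.
final class AppPathResolver {
    // Assumed property name; the thread does not name the setting.
    static final String USE_IMPERSONATED_USER_PATH = "example.useImpersonatedUserPath";

    // Assumes HDFS-style home directories of the form /user/<name>.
    static String homeDir(String user) {
        return "/user/" + user;
    }

    // Returns the application path under the home directory of the selected user.
    static String resolve(Properties conf, String impersonatingUser,
                          String impersonatedUser, String appId) {
        // Unset, or any value other than "true" (case-insensitive),
        // keeps today's behavior of using the impersonating user's directory.
        boolean useImpersonated =
            Boolean.parseBoolean(conf.getProperty(USE_IMPERSONATED_USER_PATH, "false"));
        String owner = useImpersonated ? impersonatedUser : impersonatingUser;
        return homeDir(owner) + "/apps/" + appId;
    }
}
```

With the property unset, a launch by user A as user B resolves to a path
under A's home directory, as today; with it set to true, the path moves
under B's home directory.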

Comments, suggestions, concerns?

Thanks

Re: impersonation and application path

Posted by Priyanka Gugale <pr...@apache.org>.
+1 for the proposal.

Can we make the new behaviour of writing to the user's own directory the
default? Most probably users will upgrade the gateway along with apex-core.
If not, they always have the option to set the flag and fall back to the
legacy behaviour.

-Priyanka

Re: impersonation and application path

Posted by Chinmay Kolhatkar <ch...@datatorrent.com>.
+1 for Pramod's proposal.

Re: impersonation and application path

Posted by Sanjay Pujare <sa...@datatorrent.com>.
+1 for Pramod's proposal for impersonation.

I have an issue with Sandesh's suggestion about making the new behavior
the default (or only) behavior. This will introduce an incompatibility with
other legacy tools (e.g. Datatorrent's dtGateway) that assume user A's HDFS
path as the application path. Because the legacy tools will continue to
assume the old path (user A's path), they will not work with an Apex core
that has this change.

The current behavior might also be preferable to certain users or their
administrators because they would not have to deal with multiple HDFS user
directories (for administration, logging, backup etc).

Re: impersonation and application path

Posted by Sandesh Hegde <sa...@datatorrent.com>.
My vote is to make the new proposal the default behavior. Is there a use
case for the current behavior? If not, then there is no need to add the
configuration setting.

On Thu, May 18, 2017 at 3:47 PM Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Sorry typo in sentence "as we are not asking for permissions for a lower
> privilege", please read as "as we are now asking for permissions for a
> lower privilege".

Re: impersonation and application path

Posted by Pramod Immaneni <pr...@datatorrent.com>.
Voting has completed and a JIRA has been created:

https://issues.apache.org/jira/browse/APEXCORE-733

Thanks

Re: impersonation and application path

Posted by Sanjay Pujare <sa...@datatorrent.com>.
Sounds good.

Also, I would like to work on this, so the JIRA can be assigned to me.

Re: impersonation and application path

Posted by Pramod Immaneni <pr...@datatorrent.com>.
I see the advantage in having a top-level setting that says use
"impersonating user" vs "impersonated user" resources. This can internally
switch the resources; there could be resources other than the application
path in the future, and they would also be covered. I would still leave the
default as the impersonating user, since not all setups fall into this
category, in many cases users' directories are not created or managed in
HDFS, and it would also be backward compatible.
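
A top-level setting of this kind might look something like the sketch
below. It is a rough illustration only: the enum, the class, the resource
names (including the stagingPath placeholder standing in for a possible
future resource) and the /user/<name> layout are hypothetical and do not
come from Apex.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical top-level choice: whose resources the launched application uses.
enum ResourceOwner { IMPERSONATING_USER, IMPERSONATED_USER }

// Hypothetical helper, not an Apex class.
final class LaunchResources {
    // Derives every per-user resource location from the single owner choice,
    // so resources added later follow the same switch.
    static Map<String, String> resolve(ResourceOwner owner,
                                       String impersonatingUser,
                                       String impersonatedUser) {
        String user = (owner == ResourceOwner.IMPERSONATED_USER)
            ? impersonatedUser
            : impersonatingUser;
        // Assumes HDFS-style home directories of the form /user/<name>.
        String home = "/user/" + user;
        Map<String, String> resources = new LinkedHashMap<>();
        resources.put("applicationPath", home + "/apps");
        // Placeholder for some future resource; the name is invented.
        resources.put("stagingPath", home + "/staging");
        return resources;
    }
}
```

Because every location is derived from the one owner value, a caller that
passes ResourceOwner.IMPERSONATING_USER keeps the existing layout, and a
resource added later does not need a setting of its own.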

I think everyone agrees on the functionality, but there is a difference of
opinion on the implementation. I will call a vote on the different
implementation options and we can move forward.

Thanks


Re: impersonation and application path

Posted by Sanjay Pujare <sa...@datatorrent.com>.
I agree. But how do we use APPLICATION_PATH for this purpose, since we need
a Yes/No flag to specify new vs old behavior?

So we would have to use a new setting for this (something like
USE_RUNTIME_USER_APPLICATION_PATH?)

On Fri, May 19, 2017 at 7:57 AM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> I wouldn't necessarily consider the current behavior a bug and the default
> is fine the way it is today, especially because the user launching the app
> is not the user. APPLICATION_PATH can be used as the setting.
>
> On Fri, May 19, 2017 at 7:43 AM, Vlad Rozov <v....@datatorrent.com>
> wrote:
>
> > Do I understand correctly that the question is regarding
> > DAGContext.APPLICATION_PATH attribute value in case it is not defined? In
> > this case, I would treat the current behavior as a bug and +1 the proposal
> > to set it to the impersonated user B DFS home directory. As
> > APPLICATION_PATH can be explicitly set I do not see a need to provide
> > another setting to preserve the current behavior.
> >
> > Thank you,
> >
> > Vlad
> >
> >
> > On 5/18/17 15:46, Pramod Immaneni wrote:
> >
> >> Sorry typo in sentence "as we are not asking for permissions for a lower
> >> privilege", please read as "as we are now asking for permissions for a
> >> lower privilege".
> >>
> >> On Thu, May 18, 2017 at 3:44 PM, Pramod Immaneni <
> pramod@datatorrent.com>
> >> wrote:
> >>
> >> Apex cli supports impersonation in secure mode. With impersonation, the
> >>> user running the cli or the user authenticating with hadoop (henceforth
> >>> referred to as login user) can be different from the effective user
> with
> >>> which the actions are performed under hadoop. An example for this is an
> >>> application can be launched by user A to run in hadoop as user B. This
> is
> >>> kind of like the sudo functionality in unix. You can find more details
> >>> about the functionalilty here https://apex.apache.org/docs/a
> >>> pex/security/ in
> >>> the Impersonation section.
> >>>
> >>> What happens today with launching an application with impersonation,
> >>> using
> >>> the above launch example, is that even though the application runs as
> >>> user
> >>> B it still uses user A's hdfs path for the application path. The
> >>> application path is where the artifacts necessary to run the
> application
> >>> are stored and where the runtime files like checkpoints are stored.
> This
> >>> means that user B needs to have read and write access to user A's
> >>> application path folders.
> >>>
> >>> This may not be allowed in certain environments as it may be a policy
> >>> violation for the following reason. Because user A is able to
> impersonate
> >>> as user B to launch the application, A is considered to be a higher
> >>> privileged user than B and is given necessary privileges in hadoop to
> do
> >>> so. But after launch B needs to access folders belonging to A which
> could
> >>> constitute a violation as we are not asking for permissions for a lower
> >>> privilege user to access resources of a higher privilege user.
> >>>
> >>> I would like to propose adding a configuration setting, which when set
> >>> will use the application path in the impersonated user's home directory
> >>> (user B) as opposed to impersonating user's home directory (user A). If
> >>> this setting is not specified then the behavior can default to what it
> is
> >>> today for backwards compatibility.
> >>>
> >>> Comments, suggestions, concerns?
> >>>
> >>> Thanks
> >>>
> >>>
> >
>

Re: impersonation and application path

Posted by Pramod Immaneni <pr...@datatorrent.com>.
I wouldn't necessarily consider the current behavior a bug and the default
is fine the way it is today, especially because the user launching the app
is not the user. APPLICATION_PATH can be used as the setting.

On Fri, May 19, 2017 at 7:43 AM, Vlad Rozov <v....@datatorrent.com> wrote:

> Do I understand correctly that the question is regarding
> DAGContext.APPLICATION_PATH attribute value in case it is not defined? In
> this case, I would treat the current behavior as a bug and +1 the proposal
> to set it to the impersonated user B DFS home directory. As
> APPLICATION_PATH can be explicitly set I do not see a need to provide
> another setting to preserve the current behavior.
>
> Thank you,
>
> Vlad

Re: impersonation and application path

Posted by Vlad Rozov <v....@datatorrent.com>.
Do I understand correctly that the question is about the value of the
DAGContext.APPLICATION_PATH attribute when it is not defined? In that
case, I would treat the current behavior as a bug and +1 the proposal to
set it to the impersonated user B's DFS home directory. As
APPLICATION_PATH can be set explicitly, I do not see a need to provide
another setting to preserve the current behavior.

Thank you,

Vlad

On 5/18/17 15:46, Pramod Immaneni wrote:
> Sorry typo in sentence "as we are not asking for permissions for a lower
> privilege", please read as "as we are now asking for permissions for a
> lower privilege".


Re: impersonation and application path

Posted by Pramod Immaneni <pr...@datatorrent.com>.
Sorry typo in sentence "as we are not asking for permissions for a lower
privilege", please read as "as we are now asking for permissions for a
lower privilege".

On Thu, May 18, 2017 at 3:44 PM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Apex cli supports impersonation in secure mode. With impersonation, the
> user running the cli or the user authenticating with hadoop (henceforth
> referred to as login user) can be different from the effective user with
> which the actions are performed under hadoop. An example for this is an
> application can be launched by user A to run in hadoop as user B. This is
> kind of like the sudo functionality in unix. You can find more details
> about the functionality here https://apex.apache.org/docs/apex/security/ in
> the Impersonation section.
>
> What happens today with launching an application with impersonation, using
> the above launch example, is that even though the application runs as user
> B it still uses user A's hdfs path for the application path. The
> application path is where the artifacts necessary to run the application
> are stored and where the runtime files like checkpoints are stored. This
> means that user B needs to have read and write access to user A's
> application path folders.
>
> This may not be allowed in certain environments as it may be a policy
> violation for the following reason. Because user A is able to impersonate
> as user B to launch the application, A is considered to be a higher
> privileged user than B and is given necessary privileges in hadoop to do
> so. But after launch B needs to access folders belonging to A which could
> constitute a violation as we are not asking for permissions for a lower
> privilege user to access resources of a higher privilege user.
>
> I would like to propose adding a configuration setting, which when set
> will use the application path in the impersonated user's home directory
> (user B) as opposed to impersonating user's home directory (user A). If
> this setting is not specified then the behavior can default to what it is
> today for backwards compatibility.
>
> Comments, suggestions, concerns?
>
> Thanks
>