Posted to user@flink.apache.org by "Chan, Regina" <Re...@gs.com> on 2017/10/23 17:52:05 UTC

Impersonation support in Flink

Hi folks,

Is Flink able to do impersonation using UserGroupInformation? How do we make all the tasks run with this in a way that we wouldn't have to do it per task?


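import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

// proxyUser is the name of the user to impersonate; action is the work to run as that user.
UserGroupInformation ugi =
        UserGroupInformation.createProxyUser(proxyUser, UserGroupInformation.getLoginUser());
PrivilegedExceptionAction<Void> iAction = new PrivilegedExceptionAction<Void>() {
    @Override
    public Void run() throws Exception {
        action.run();
        return null;
    }
};
// Everything inside run() executes with the proxy user's identity.
ugi.doAs(iAction);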
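
The anonymous class above can be factored into a small adapter so that each call site stays short. This is a minimal sketch, assuming the work to run is a Runnable; the class name PrivilegedActions and its method of are hypothetical and are not part of Hadoop or Flink. It only trims the boilerplate and does not change where the impersonation has to be applied.

```java
import java.security.PrivilegedExceptionAction;

// Hypothetical helper (not a Hadoop or Flink class): adapts a Runnable to the
// PrivilegedExceptionAction<Void> shape that UserGroupInformation#doAs accepts.
public final class PrivilegedActions {
    private PrivilegedActions() {}

    public static PrivilegedExceptionAction<Void> of(Runnable action) {
        // PrivilegedExceptionAction has a single method, so a lambda can implement it.
        return () -> {
            action.run();
            return null; // Void actions have nothing meaningful to return
        };
    }
}
```

With this in place, the last lines of the snippet above would reduce to ugi.doAs(PrivilegedActions.of(action)), again assuming action is a Runnable.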



Regina Chan
Goldman Sachs - Enterprise Platforms, Data Architecture
30 Hudson Street, 37th floor | Jersey City, NY 07302 | (212) 902-5697


RE: Impersonation support in Flink

Posted by "Newport, Billy" <Bi...@gs.com>.
Our scenario is to enable a specific Kerberos principal to impersonate any Kerberos principal in a specific group; this is enabled in the HDFS configuration. That principal does not need to be root, just a principal allowed to impersonate the users in that group.
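
As an illustration of the kind of configuration meant here, Hadoop's proxy-user settings generally take the following shape in core-site.xml. This is only a sketch: the principal short name svc_flink, the host name, and the group analysts are hypothetical placeholders, not values from this thread.

```xml
<!-- Sketch only: "svc_flink", the host, and "analysts" are hypothetical placeholders. -->
<property>
  <!-- Hosts from which svc_flink may submit impersonated requests -->
  <name>hadoop.proxyuser.svc_flink.hosts</name>
  <value>gateway1.example.com</value>
</property>
<property>
  <!-- svc_flink may impersonate only users that belong to these groups -->
  <name>hadoop.proxyuser.svc_flink.groups</name>
  <value>analysts</value>
</property>
```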

We want the job to access HDFS as the impersonated principal, not the one that launched it. We do this with our MR jobs by simply impersonating in the driver, and all the mappers/reducers run correctly and use the impersonated user that was active when the job was submitted. We expected Flink to work similarly and found the issue.

We do this without the keytab for that user; if we had it, we wouldn’t need to impersonate, if you see what I mean.

So, what kind of changes would be needed, and where, to implement this function? We are happy to do the patch to enable this behavior.

Billy


From: Eron Wright [mailto:eronwright@gmail.com]
Sent: Monday, October 23, 2017 4:53 PM
To: Chan, Regina [Tech]
Cc: user@flink.apache.org
Subject: Re: Impersonation support in Flink

Hello,
Flink does initialize the process-wide login user, using the UGI's Kerberos login method. It doesn't support proxy user at the moment. Let's dig into the scenario a bit to see how best to support it.

As you know, the proxy user functionality of Hadoop allows a process that has superuser credentials to impersonate a normal user when making remote calls to HDFS and other remote services. A possible scenario would be, the Flink cluster has a superuser account and accesses HDFS on behalf of someone. Keep in mind that job code runs with full trust within the JM/TM, and would have access to the superuser keytab. Does that sound like your scenario?

Proxy user support would not facilitate the scenario of running a user's job code such that the job accesses HDFS as that user. The only way to support that scenario is by launching the cluster using that user's keytab.

I hope this helps,
Eron




Re: Impersonation support in Flink

Posted by Eron Wright <er...@gmail.com>.
Hello,
Flink does initialize the process-wide login user, using the UGI's Kerberos
login method. It doesn't support proxy user at the moment. Let's dig
into the scenario a bit to see how best to support it.

As you know, the proxy user functionality of Hadoop allows a process that
has superuser credentials to impersonate a normal user when making remote
calls to HDFS and other remote services. A possible scenario would be,
the Flink cluster has a superuser account and accesses HDFS on behalf of
someone. Keep in mind that job code runs with full trust within the
JM/TM, and would have access to the superuser keytab. Does that sound
like your scenario?

Proxy user support would not facilitate the scenario of running a user's
job code such that the job accesses HDFS as that user. The only way to
support that scenario is by launching the cluster using that user's keytab.

I hope this helps,
Eron
