Posted to dev@spark.apache.org by Lars Francke <la...@gmail.com> on 2020/02/28 14:20:22 UTC

Keytab, Proxy User & Principal

Hi,

I understand that we forbid specifying "principal" & "proxy user" at the
same time because the current logic would just stage the keytab and the
proxy user could then use that to gain full access circumventing any
security.

But we have a use case for Livy where different semantics would be great:
Livy is supposed to submit a job for other users. It does so by specifying
"proxy user" and it relies on the local credential cache (outside of Java)
to contain the proper tickets (it runs kinit in a background thread).

This will only work if Livy runs in an environment where it's the only user
working with that credentials cache. Unfortunately that's not always the
case when multiple services share the same user.

(One thing we'll try is to use the KRB5CCNAME environment variable to point
to a different credential cache for Livy, but I'm not sure yet whether
that's passed on to the spawned Spark process.)
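
A minimal sketch of the inheritance question, assuming nothing about how Livy
actually launches Spark: a child process started with an explicit environment
sees whatever KRB5CCNAME the parent puts there. The cache path and the helper
name run_with_ccache are hypothetical, and a small Python child stands in for
spark-submit. Whether Livy's own launcher forwards the variable is not
established here.

```python
import os
import subprocess
import sys


def run_with_ccache(ccache_path, argv):
    """Run argv with KRB5CCNAME pointing at ccache_path and return its stdout."""
    # Copy the parent's environment and override only the cache location.
    env = dict(os.environ)
    env["KRB5CCNAME"] = "FILE:" + ccache_path
    result = subprocess.run(
        argv, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Stand-in for spark-submit: a child process that reports the value it sees.
child = [sys.executable, "-c", "import os; print(os.environ.get('KRB5CCNAME'))"]
seen = run_with_ccache("/tmp/krb5cc_livy_example", child)
print(seen)
```

If the launcher builds the child environment from scratch instead of copying
the parent's, the variable would have to be added explicitly.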

Could we allow specifying a keytab and principal together with a proxy user,
provided they are used only for the initial login to submit the job and are
not shipped to the cluster? That way jobs wouldn't need to rely on the
operating system.

Maybe I'm missing something as well?

Cheers,
Lars

Re: Keytab, Proxy User & Principal

Posted by Lars Francke <la...@gmail.com>.

I just wanted to bump this to see whether anyone has an opinion.

Re: Keytab, Proxy User & Principal

Posted by Marcelo Vanzin <va...@gmail.com>.

On Fri, Feb 28, 2020 at 6:21 AM Lars Francke <la...@gmail.com> wrote:

> Can we not allow specifying a keytab and principal together with proxy
> user but those are only used for the initial login to submit the job and
> are not shipped to the cluster? This way jobs wouldn't need to rely on the
> operating system.
>

I'm not sure I 100% understand your use case (even if multiple services are
using the credential cache, why would that be a problem?), but from Spark's
side, the only issue with this is making it clear to the user when things
are being submitted one way or another.

But frankly this feels more like something better taken care of in Livy
(e.g. by using KRB5CCNAME when running spark-submit).
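
A rough sketch of what that could look like on the submitting side, assuming a
private cache file per service. The principal, keytab path, cache path,
application jar, and helper names below are all hypothetical. The commands are
only constructed, not executed, and this is not Livy's actual implementation.

```python
import os


def kinit_command(principal, keytab, ccache):
    # -k: authenticate from a keytab, -t: keytab path, -c: cache to write to.
    return ["kinit", "-k", "-t", keytab, "-c", ccache, principal]


def spark_submit_command(proxy_user, app, app_args=()):
    # --proxy-user asks Spark to run the job on behalf of another user.
    return ["spark-submit", "--proxy-user", proxy_user, app, *app_args]


def submit_env(ccache, base_env=None):
    # Same environment as the parent, except for the private cache location.
    env = dict(os.environ if base_env is None else base_env)
    env["KRB5CCNAME"] = ccache
    return env


# Hypothetical values, for illustration only.
ccache = "FILE:/var/run/livy/krb5cc_livy"
login = kinit_command(
    "livy/host.example.com@EXAMPLE.COM", "/etc/security/livy.keytab", ccache
)
submit = spark_submit_command("alice", "app.jar", ["--input", "data"])
env = submit_env(ccache, base_env={"PATH": "/usr/bin"})
```

Both commands would then be run with env (for example through
subprocess.run(cmd, env=env)), so that the tickets are written to and read
from the same private cache and never touch the shared default one.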

-- 
Marcelo Vanzin
vanzin@gmail.com
"Life's too short to drink cheap beer"