Posted to user@spark.apache.org by Zsolt Tóth <to...@gmail.com> on 2016/11/03 14:22:53 UTC

Delegation Token renewal in yarn-cluster

Hi,

I ran some tests regarding Spark's Delegation Token renewal mechanism. As I
understand it, the concept is simple: if I give my keytab file and client
principal to Spark, it starts a token renewal thread and renews the
namenode delegation tokens after some time. This works fine.
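
For reference, passing the keytab and principal on the command line looks
roughly like this; the principal, paths and application jar below are just
placeholders:

  spark-submit \
    --master yarn --deploy-mode cluster \
    --principal user@EXAMPLE.COM \
    --keytab /path/to/user.keytab \
    --class com.example.LongRunningJob \
    long-running-job.jar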

Then I tried to run a long application (with an HDFS operation at the end)
without providing the keytab/principal to Spark, and I expected it to fail
after the token expired. It turned out that this is not the case: the
application finishes successfully without any delegation token renewal by
Spark.

My question is: how is that possible? Shouldn't a saveAsTextFile() fail
after the namenode delegation token has expired?

Regards,
Zsolt

Re: Delegation Token renewal in yarn-cluster

Posted by Zsolt Tóth <to...@gmail.com>.
Any ideas about this one? Am I missing something here?

2016-11-03 15:22 GMT+01:00 Zsolt Tóth <to...@gmail.com>:

> Hi,
>
> I ran some tests regarding Spark's Delegation Token renewal mechanism. As
> I see, the concept here is simple: if I give my keytab file and client
> principal to Spark, it starts a token renewal thread, and renews the
> namenode delegation tokens after some time. This works fine.
>
> Then I tried to run a long application (with HDFS operation in the end)
> without providing the keytab/principal to Spark, and I expected it to fail
> after the token expires. It turned out that this is not the case, the
> application finishes successfully without a delegation token renewal by
> Spark.
>
> My question is: how is that possible? Shouldn't a saveAsTextfile() fail
> after the namenode delegation token expired?
>
> Regards,
> Zsolt
>

Re: Delegation Token renewal in yarn-cluster

Posted by Steve Loughran <st...@hortonworks.com>.
On 4 Nov 2016, at 01:37, Marcelo Vanzin <va...@cloudera.com> wrote:

> On Thu, Nov 3, 2016 at 3:47 PM, Zsolt Tóth <to...@gmail.com> wrote:
> > What is the purpose of the delegation token renewal (the one that is done
> > automatically by Hadoop libraries, after 1 day by default)? It seems that it
> > always happens (every day) until the token expires, no matter what. I'd
> > probably find an answer to that in a basic Hadoop security description.



* DTs allow a long-lived job to outlast the Kerberos ticket lifetime of the submitter; usually 48-72h.
* Submitting jobs with DTs limits the access of the job to those services for which you have a DT; there is no need to acquire Kerberos tickets for every query being run. This keeps the load on Kerberos down, which is good because with Active Directory the KDC is usually shared with the rest of the organisation. Some Kerberos servers treat bulk access from a few thousand machines as a brute-force attack.
* Delegation tokens can also be revoked at the NN. After a process terminates, something (the YARN NM?) can chat with the NN and say "no longer valid". In contrast, Kerberos TGTs stay valid until they time out, with no revocation mechanism.
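
To make that concrete, here is a minimal sketch of how a client asks the NN
for an HDFS delegation token; the renewer name passed in is just a placeholder:

  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.fs.FileSystem
  import org.apache.hadoop.security.Credentials

  val conf = new Configuration()            // picks up core-site.xml / hdfs-site.xml
  val fs = FileSystem.get(conf)
  val creds = new Credentials()
  // Ask the NN for delegation tokens; "yarn" stands in for the renewer principal,
  // which in a real cluster would normally be the RM's principal.
  val tokens = fs.addDelegationTokens("yarn", creds)
  tokens.foreach(t => println(s"${t.getKind} for ${t.getService}"))

The resulting tokens are what gets shipped with the job instead of the
submitter's Kerberos credentials.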

> I'm not sure and I never really got a good answer to that (I had the
> same question in the past). My best guess is to limit how long an
> attacker can do bad things if he gets hold of a delegation token. But
> IMO if an attacker gets a delegation token, that's pretty bad
> regardless of how long he can use it...


correct: limits the damage. In contrast, if someone has your keytab, they have access until that KT expires.




> > I have a feeling that giving the keytab to Spark bypasses the concept behind
> > delegation tokens. As I understand, the NN basically says that "your
> > application can access hdfs with this delegation token, but only for 7
> > days".

> I'm not sure why there's a 7 day limit either, but let's assume
> there's a good reason. Basically the app, at that point, needs to
> prove to the NN it has a valid kerberos credential. Whether that's
> from someone typing their password into a terminal, or code using a
> keytab, it doesn't really matter. If someone was worried about that
> user being malicious they'd disable the user's login in the KDC.
>
> This feature is needed because there are apps that need to keep
> running, unattended, for longer than HDFS's max lifetime setting.


pretty much it. FWIW that's why enabling Kerberos on a midweek morning, rather than on a Friday evening, is wise: the 7-day timeout events will then start happening during working hours.

https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md


Re: Delegation Token renewal in yarn-cluster

Posted by Marcelo Vanzin <va...@cloudera.com>.
On Fri, Nov 4, 2016 at 1:57 AM, Zsolt Tóth <to...@gmail.com> wrote:
> This was what confused me in the first place. Why does Spark ask for new
> tokens based on the renew-interval instead of the max-lifetime?

It could be just a harmless bug, since tokens have a "getMaxDate()"
method which I assume returns the token's lifetime, although there's
no documentation. Or it could be that the max lifetime of the token is
not really available to the code. If you want to experiment with the
code, that should be a small change (if getMaxDate() returns the right
thing).
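
For anyone who wants to poke at it, reading that value off a token looks
roughly like this; it's only a sketch, and it assumes the token carries a
delegation token identifier:

  import org.apache.hadoop.security.token.{Token, TokenIdentifier}
  import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier

  // Returns the max lifetime (millis since epoch) if the identifier exposes it.
  def maxDateOf(token: Token[_ <: TokenIdentifier]): Option[Long] =
    token.decodeIdentifier() match {
      case id: AbstractDelegationTokenIdentifier => Some(id.getMaxDate)
      case _                                     => None
    }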

-- 
Marcelo


Re: Delegation Token renewal in yarn-cluster

Posted by Zsolt Tóth <to...@gmail.com>.
I checked the logs of my tests and found that Spark schedules the
token refresh based on the renew-interval property, not the max-lifetime.

The settings in my tests:
dfs.namenode.delegation.key.update-interval=520000
dfs.namenode.delegation.token.max-lifetime=1020000
dfs.namenode.delegation.token.renew-interval=520000

During the job submission, spark.yarn.token.renewal.interval is set:
2016-11-04 09:12:25 INFO  Client:59 - Renewal Interval set to 520036

Then, it takes ~0.75*spark.yarn.token.renewal.interval to schedule the
token refresh.

2016-11-04 09:12:37 INFO  ExecutorDelegationTokenUpdater:59 - Scheduling
token refresh from HDFS in 404251 millis.
...
2016-11-04 09:19:21 INFO  ExecutorDelegationTokenUpdater:59 - Reading new
delegation tokens from ...
...
2016-11-04 09:19:21 INFO  ExecutorDelegationTokenUpdater:59 - Scheduling
token refresh from HDFS in 390064 millis.
...
2016-11-04 09:25:52 INFO  ExecutorDelegationTokenUpdater:59 - Reading new
delegation tokens from ...
...
2016-11-04 09:25:52 INFO  ExecutorDelegationTokenUpdater:59 - Scheduling
token refresh from HDFS in 390022 millis.
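
(As a sanity check on that 0.75 factor: 0.75 * 520036 ms = 390027 ms, which
matches the ~390000 ms delays scheduled above.)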

This was what confused me in the first place. Why does Spark ask for new
tokens based on the renew-interval instead of the max-lifetime?


2016-11-04 2:37 GMT+01:00 Marcelo Vanzin <va...@cloudera.com>:

> On Thu, Nov 3, 2016 at 3:47 PM, Zsolt Tóth <to...@gmail.com>
> wrote:
> > What is the purpose of the delegation token renewal (the one that is done
> > automatically by Hadoop libraries, after 1 day by default)? It seems
> that it
> > always happens (every day) until the token expires, no matter what. I'd
> > probably find an answer to that in a basic Hadoop security description.
>
> I'm not sure and I never really got a good answer to that (I had the
> same question in the past). My best guess is to limit how long an
> attacker can do bad things if he gets hold of a delegation token. But
> IMO if an attacker gets a delegation token, that's pretty bad
> regardless of how long he can use it...
>
> > I have a feeling that giving the keytab to Spark bypasses the concept
> behind
> > delegation tokens. As I understand, the NN basically says that "your
> > application can access hdfs with this delegation token, but only for 7
> > days".
>
> I'm not sure why there's a 7 day limit either, but let's assume
> there's a good reason. Basically the app, at that point, needs to
> prove to the NN it has a valid kerberos credential. Whether that's
> from someone typing their password into a terminal, or code using a
> keytab, it doesn't really matter. If someone was worried about that
> user being malicious they'd disable the user's login in the KDC.
>
> This feature is needed because there are apps that need to keep
> running, unattended, for longer than HDFS's max lifetime setting.
>
> --
> Marcelo
>

Re: Delegation Token renewal in yarn-cluster

Posted by Marcelo Vanzin <va...@cloudera.com>.
On Thu, Nov 3, 2016 at 3:47 PM, Zsolt Tóth <to...@gmail.com> wrote:
> What is the purpose of the delegation token renewal (the one that is done
> automatically by Hadoop libraries, after 1 day by default)? It seems that it
> always happens (every day) until the token expires, no matter what. I'd
> probably find an answer to that in a basic Hadoop security description.

I'm not sure and I never really got a good answer to that (I had the
same question in the past). My best guess is to limit how long an
attacker can do bad things if he gets hold of a delegation token. But
IMO if an attacker gets a delegation token, that's pretty bad
regardless of how long he can use it...

> I have a feeling that giving the keytab to Spark bypasses the concept behind
> delegation tokens. As I understand, the NN basically says that "your
> application can access hdfs with this delegation token, but only for 7
> days".

I'm not sure why there's a 7 day limit either, but let's assume
there's a good reason. Basically the app, at that point, needs to
prove to the NN it has a valid kerberos credential. Whether that's
from someone typing their password into a terminal, or code using a
keytab, it doesn't really matter. If someone was worried about that
user being malicious they'd disable the user's login in the KDC.

This feature is needed because there are apps that need to keep
running, unattended, for longer than HDFS's max lifetime setting.
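
In case it helps, a keytab-based login in Hadoop code boils down to something
like the sketch below; the principal and keytab path are placeholders:

  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.security.UserGroupInformation
  import java.security.PrivilegedExceptionAction

  val conf = new Configuration()            // expects hadoop.security.authentication=kerberos
  UserGroupInformation.setConfiguration(conf)
  // Log in from a keytab instead of an interactive password prompt.
  val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
    "user@EXAMPLE.COM", "/path/to/user.keytab")
  // Hadoop calls made as this user can now obtain fresh delegation tokens.
  ugi.doAs(new PrivilegedExceptionAction[Unit] {
    override def run(): Unit = { /* talk to the NN here */ }
  })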

-- 
Marcelo


Re: Delegation Token renewal in yarn-cluster

Posted by Zsolt Tóth <to...@gmail.com>.
Thank you for the clarification, Marcelo; that makes sense.
I'm thinking about two questions here, somewhat unrelated to the original
problem.

What is the purpose of the delegation token renewal (the one that is done
automatically by Hadoop libraries, after 1 day by default)? It seems that
it always happens (every day) until the token expires, no matter what. I'd
probably find an answer to that in a basic Hadoop security description.

I have a feeling that giving the keytab to Spark bypasses the concept
behind delegation tokens. As I understand it, the NN basically says "your
application can access HDFS with this delegation token, but only for 7
days". After 7 days, the NN should *ideally* ask me something like "this app
has been running for a week now, do you want it to continue?" - then I'd
need to log in with my keytab and give the new delegation token to the
application. I know that this would be really difficult to handle, but now
Spark just "ignores" the whole token expiration mechanism and logs in again
whenever needed. Am I missing something?



2016-11-03 22:42 GMT+01:00 Marcelo Vanzin <va...@cloudera.com>:

> I think you're a little confused about what "renewal" means here, and
> this might be the fault of the documentation (I haven't read it in a
> while).
>
> The existing delegation tokens will always be "renewed", in the sense
> that Spark (actually Hadoop code invisible to Spark) will talk to the
> NN to extend its lifetime. The feature you're talking about is for
> creating *new* delegation tokens after the old ones expire and cannot
> be renewed anymore (i.e. the max-lifetime configuration).
>
> On Thu, Nov 3, 2016 at 2:02 PM, Zsolt Tóth <to...@gmail.com>
> wrote:
> > Yes, I did change dfs.namenode.delegation.key.update-interval and
> > dfs.namenode.delegation.token.renew-interval to 15 min, the
> max-lifetime to
> > 30min. In this case the application (without Spark having the keytab) did
> > not fail after 15 min, only after 30 min. Is it possible that the
> resource
> > manager somehow automatically renews the delegation tokens for my
> > application?
> >
> > 2016-11-03 21:34 GMT+01:00 Marcelo Vanzin <va...@cloudera.com>:
> >>
> >> Sounds like your test was set up incorrectly. The default TTL for
> >> tokens is 7 days. Did you change that in the HDFS config?
> >>
> >> The issue definitely exists and people definitely have run into it. So
> >> if you're not hitting it, it's most definitely an issue with your test
> >> configuration.
> >>
> >> On Thu, Nov 3, 2016 at 7:22 AM, Zsolt Tóth <to...@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > I ran some tests regarding Spark's Delegation Token renewal mechanism.
> >> > As I
> >> > see, the concept here is simple: if I give my keytab file and client
> >> > principal to Spark, it starts a token renewal thread, and renews the
> >> > namenode delegation tokens after some time. This works fine.
> >> >
> >> > Then I tried to run a long application (with HDFS operation in the
> end)
> >> > without providing the keytab/principal to Spark, and I expected it to
> >> > fail
> >> > after the token expires. It turned out that this is not the case, the
> >> > application finishes successfully without a delegation token renewal
> by
> >> > Spark.
> >> >
> >> > My question is: how is that possible? Shouldn't a saveAsTextfile()
> fail
> >> > after the namenode delegation token expired?
> >> >
> >> > Regards,
> >> > Zsolt
> >>
> >>
> >>
> >> --
> >> Marcelo
> >
> >
>
>
>
> --
> Marcelo
>

Re: Delegation Token renewal in yarn-cluster

Posted by Marcelo Vanzin <va...@cloudera.com>.
I think you're a little confused about what "renewal" means here, and
this might be the fault of the documentation (I haven't read it in a
while).

The existing delegation tokens will always be "renewed", in the sense
that Spark (actually Hadoop code invisible to Spark) will talk to the
NN to extend their lifetime. The feature you're talking about is for
creating *new* delegation tokens after the old ones expire and cannot
be renewed anymore (i.e. the max-lifetime configuration).
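
In code terms the difference is roughly the sketch below; the renewer name
is a placeholder:

  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.fs.FileSystem
  import org.apache.hadoop.security.Credentials
  import org.apache.hadoop.security.token.{Token, TokenIdentifier}

  val conf = new Configuration()

  // "Renewal": extend an existing token's lifetime. Only works until the
  // token's max-lifetime, and only for the token's designated renewer.
  def renew(token: Token[_ <: TokenIdentifier]): Long =
    token.renew(conf)                       // returns the new expiry time

  // Getting *new* tokens: log in with a keytab and ask the NN again. This is
  // what the keytab/principal feature is for once renewal is no longer possible.
  def obtainNew(): Credentials = {
    val creds = new Credentials()
    FileSystem.get(conf).addDelegationTokens("yarn" /* placeholder renewer */, creds)
    creds
  }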

On Thu, Nov 3, 2016 at 2:02 PM, Zsolt Tóth <to...@gmail.com> wrote:
> Yes, I did change dfs.namenode.delegation.key.update-interval and
> dfs.namenode.delegation.token.renew-interval to 15 min, the max-lifetime to
> 30min. In this case the application (without Spark having the keytab) did
> not fail after 15 min, only after 30 min. Is it possible that the resource
> manager somehow automatically renews the delegation tokens for my
> application?
>
> 2016-11-03 21:34 GMT+01:00 Marcelo Vanzin <va...@cloudera.com>:
>>
>> Sounds like your test was set up incorrectly. The default TTL for
>> tokens is 7 days. Did you change that in the HDFS config?
>>
>> The issue definitely exists and people definitely have run into it. So
>> if you're not hitting it, it's most definitely an issue with your test
>> configuration.
>>
>> On Thu, Nov 3, 2016 at 7:22 AM, Zsolt Tóth <to...@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > I ran some tests regarding Spark's Delegation Token renewal mechanism.
>> > As I
>> > see, the concept here is simple: if I give my keytab file and client
>> > principal to Spark, it starts a token renewal thread, and renews the
>> > namenode delegation tokens after some time. This works fine.
>> >
>> > Then I tried to run a long application (with HDFS operation in the end)
>> > without providing the keytab/principal to Spark, and I expected it to
>> > fail
>> > after the token expires. It turned out that this is not the case, the
>> > application finishes successfully without a delegation token renewal by
>> > Spark.
>> >
>> > My question is: how is that possible? Shouldn't a saveAsTextfile() fail
>> > after the namenode delegation token expired?
>> >
>> > Regards,
>> > Zsolt
>>
>>
>>
>> --
>> Marcelo
>
>



-- 
Marcelo


Re: Delegation Token renewal in yarn-cluster

Posted by Zsolt Tóth <to...@gmail.com>.
Yes, I did change dfs.namenode.delegation.key.update-interval
and dfs.namenode.delegation.token.renew-interval to 15 min, and the
max-lifetime to 30 min. In this case the application (without Spark having
the keytab) did not fail after 15 min, only after 30 min. Is it possible
that the resource manager somehow automatically renews the delegation
tokens for my application?
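
For reference, that change amounts to something like this in hdfs-site.xml
(values in milliseconds, so 15 min = 900000 and 30 min = 1800000):

  <property>
    <name>dfs.namenode.delegation.key.update-interval</name>
    <value>900000</value>
  </property>
  <property>
    <name>dfs.namenode.delegation.token.renew-interval</name>
    <value>900000</value>
  </property>
  <property>
    <name>dfs.namenode.delegation.token.max-lifetime</name>
    <value>1800000</value>
  </property>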

2016-11-03 21:34 GMT+01:00 Marcelo Vanzin <va...@cloudera.com>:

> Sounds like your test was set up incorrectly. The default TTL for
> tokens is 7 days. Did you change that in the HDFS config?
>
> The issue definitely exists and people definitely have run into it. So
> if you're not hitting it, it's most definitely an issue with your test
> configuration.
>
> On Thu, Nov 3, 2016 at 7:22 AM, Zsolt Tóth <to...@gmail.com>
> wrote:
> > Hi,
> >
> > I ran some tests regarding Spark's Delegation Token renewal mechanism.
> As I
> > see, the concept here is simple: if I give my keytab file and client
> > principal to Spark, it starts a token renewal thread, and renews the
> > namenode delegation tokens after some time. This works fine.
> >
> > Then I tried to run a long application (with HDFS operation in the end)
> > without providing the keytab/principal to Spark, and I expected it to
> fail
> > after the token expires. It turned out that this is not the case, the
> > application finishes successfully without a delegation token renewal by
> > Spark.
> >
> > My question is: how is that possible? Shouldn't a saveAsTextfile() fail
> > after the namenode delegation token expired?
> >
> > Regards,
> > Zsolt
>
>
>
> --
> Marcelo
>

Re: Delegation Token renewal in yarn-cluster

Posted by Marcelo Vanzin <va...@cloudera.com>.
Sounds like your test was set up incorrectly. The default TTL for
tokens is 7 days. Did you change that in the HDFS config?

The issue definitely exists and people definitely have run into it. So
if you're not hitting it, it's most definitely an issue with your test
configuration.

On Thu, Nov 3, 2016 at 7:22 AM, Zsolt Tóth <to...@gmail.com> wrote:
> Hi,
>
> I ran some tests regarding Spark's Delegation Token renewal mechanism. As I
> see, the concept here is simple: if I give my keytab file and client
> principal to Spark, it starts a token renewal thread, and renews the
> namenode delegation tokens after some time. This works fine.
>
> Then I tried to run a long application (with HDFS operation in the end)
> without providing the keytab/principal to Spark, and I expected it to fail
> after the token expires. It turned out that this is not the case, the
> application finishes successfully without a delegation token renewal by
> Spark.
>
> My question is: how is that possible? Shouldn't a saveAsTextfile() fail
> after the namenode delegation token expired?
>
> Regards,
> Zsolt



-- 
Marcelo
