Posted to issues-all@impala.apache.org by "Norbert Luksa (Jira)" <ji...@apache.org> on 2020/03/17 15:29:00 UTC

[jira] [Commented] (IMPALA-9359) Recover gracefully from corrupt kerberos credential cache

    [ https://issues.apache.org/jira/browse/IMPALA-9359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060995#comment-17060995 ] 

Norbert Luksa commented on IMPALA-9359:
---------------------------------------

It looks like the ASF Jira bot failed to copy the commit message, so here it is for reference:

IMPALA-9359: recover from corrupt kerberos ccache

This is a clean cherry-pick of KUDU-3050. The original commit
message is below.

KUDU-3050: recover from corrupt kerberos ccache

This handles two failure modes:
* krb5_cc_start_seq_get() can fail if the kerberos credential cache gets
  corrupted on disk, e.g. is truncated.
* the renewal can fail to find a credential in the credential cache,
  either if it is missing or the renewal thread hits an error while
  reading through credentials.

Also add some additional logging and limit the max backoff time
to make it easier to debug other kinds of renewal errors.

The test triggers a pre-existing memory leak bug in some older
Kerberos libraries. Added a suppression for leak sanitizer
to ClientNegotiation::CheckGSSAPI() to suppress it.

Test:
Add a test that exercises the recovery logic after truncating
the credential cache. The test failed before this change.

Change-Id: I86567f16816d1c6729679398ce56296744cb30c9
Reviewed-on: http://gerrit.cloudera.org:8080/15407
Reviewed-by: Thomas Tauber-Marshall <tm...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
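
The recovery behaviour described in the commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the actual C++ change from KUDU-3050: `CredentialCacheError`, `renew_once`, `run_attempts`, `next_backoff` and the 60-second cap are all hypothetical names and values chosen for the example. The idea is that a renewal which fails because the cache is unreadable or has no matching credential falls back to acquiring fresh credentials, and that repeated failures back off exponentially up to a fixed maximum.

```python
MAX_BACKOFF_SECS = 60.0  # assumed cap; the commit message does not state the real value


class CredentialCacheError(Exception):
    """Hypothetical stand-in for a credential cache read failure,
    e.g. a truncated cache file or a missing credential."""


def next_backoff(failures, base=1.0, cap=MAX_BACKOFF_SECS):
    """Exponential backoff in seconds that never exceeds `cap`."""
    return min(cap, base * 2 ** failures)


def renew_once(renew, reacquire):
    """Try to renew; if the cache is unusable, acquire fresh credentials.

    `renew` and `reacquire` are caller-supplied callables. An error raised
    by `reacquire` propagates to the caller.
    """
    try:
        renew()
        return "renewed"
    except CredentialCacheError:
        reacquire()
        return "reacquired"


def run_attempts(renew, reacquire, sleep, attempts):
    """Run a fixed number of renewal attempts, backing off after failures."""
    failures = 0
    outcomes = []
    for _ in range(attempts):
        try:
            outcomes.append(renew_once(renew, reacquire))
            failures = 0  # any success resets the backoff
        except CredentialCacheError:
            outcomes.append("failed")
            sleep(next_backoff(failures))
            failures += 1
    return outcomes
```

Passing `sleep` in as a parameter keeps the sketch testable without real delays; a real renewal thread would also wait between successful renewals, which is omitted here.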

> Recover gracefully from corrupt kerberos credential cache
> ---------------------------------------------------------
>
>                 Key: IMPALA-9359
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9359
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Security
>    Affects Versions: Impala 3.3.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: kerberos
>             Fix For: Impala 3.4.0
>
>
> # Start up a kerberized Impala cluster
> # Corrupt the kerberos ticket cache used by Impala (/tmp/krb5cc_impala_internal)
> # Observe queries fail. The details depend a lot on timing, etc. I have seen communication failures between impalads and with other systems, e.g. HDFS.
> # The system will stay wedged in this state indefinitely
> We have seen this happen once in production when /tmp filled up.
> I prototyped a fix that amounts to re-running Kinit() to blow away the broken credential cache. It needs more work to be production-ready.
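
For anyone wanting to experiment with the "truncated cache" condition outside a cluster, a minimal Python sketch of the truncation step is below. It works on a scratch temporary file filled with placeholder bytes rather than on a real credential cache, and `make_scratch_cache` and `truncate_file` are hypothetical helper names; it only illustrates cutting a file short the way a partial write on a full /tmp might, and says nothing about how the actual test in the patch is written.

```python
import os
import tempfile


def make_scratch_cache(payload):
    """Create a temporary file holding `payload` and return its path.

    The contents are placeholder bytes, not a valid Kerberos cache.
    """
    fd, path = tempfile.mkstemp(prefix="krb5cc_scratch_")
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return path


def truncate_file(path, keep_bytes):
    """Cut the file down to `keep_bytes` bytes and return its new size."""
    with open(path, "r+b") as f:
        f.truncate(keep_bytes)
    return os.path.getsize(path)
```

Pointing this at a copy of a cache file, rather than the live one, avoids wedging a running service while testing.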
