Posted to hdfs-dev@hadoop.apache.org by "Rushabh S Shah (JIRA)" <ji...@apache.org> on 2017/05/10 21:39:04 UTC

[jira] [Created] (HDFS-11804) KMS client needs retry logic

Rushabh S Shah created HDFS-11804:
-------------------------------------

             Summary: KMS client needs retry logic
                 Key: HDFS-11804
                 URL: https://issues.apache.org/jira/browse/HDFS-11804
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.6.0
            Reporter: Rushabh S Shah
            Assignee: Rushabh S Shah


The KMS client appears to have no retry logic – at all.  It's completely decoupled from the IPC retry logic.  This has major impacts if the KMS is unreachable for any reason, including but not limited to network connection issues, timeouts, or a +restart during an upgrade+.

This has some major ramifications:
# Jobs may fail to submit, although oozie resubmit logic should mask it
# Non-oozie launchers may experience higher failure rates if they do not already have retry logic.
# Tasks reading EZ files will fail, though this will probably be masked by framework reattempts
# EZ file creation fails after creating a 0-length file – client receives EDEK in the create response, then fails when decrypting the EDEK
# Bulk hadoop fs copies, and maybe distcp, will prematurely fail
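
To make the request concrete, here is a minimal sketch of the kind of client-side retry being asked for. It is not the existing KMS client code or a proposed patch: the class name KmsRetrySketch, the method callWithRetries, and its parameters are hypothetical, and treating every IOException as retriable with a doubling backoff is an assumption for illustration only.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Hadoop's KMS client. */
public class KmsRetrySketch {

  /**
   * Runs op, retrying on IOException for up to maxAttempts total attempts.
   * The sleep between attempts starts at baseDelayMillis and doubles each time.
   * Assumption: any IOException (connect failure, timeout) is safe to retry.
   */
  public static <T> T callWithRetries(Callable<T> op, int maxAttempts,
      long baseDelayMillis) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long delay = baseDelayMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return op.call();
      } catch (IOException e) {
        if (attempt >= maxAttempts) {
          throw e; // out of attempts: surface the last failure
        }
        Thread.sleep(delay);
        delay *= 2; // exponential backoff
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Simulated KMS call that fails twice (e.g. during a restart), then succeeds.
    final int[] calls = {0};
    String result = callWithRetries(() -> {
      if (++calls[0] < 3) {
        throw new IOException("KMS unreachable (simulated)");
      }
      return "decrypted-key";
    }, 5, 10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

With a wrapper along these lines around the EDEK decrypt call, a short KMS outage would cost the client a few delayed attempts instead of a failed job or a leftover 0-length file. A real fix would still need to decide which exceptions are retriable and how this interacts with failover across multiple KMS instances.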



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org