Posted to yarn-issues@hadoop.apache.org by "Daniel Templeton (JIRA)" <ji...@apache.org> on 2015/11/25 19:46:10 UTC

[jira] [Created] (YARN-4394) ClientServiceDelegate doesn't handle retries during AM restart as intended

Daniel Templeton created YARN-4394:
--------------------------------------

             Summary: ClientServiceDelegate doesn't handle retries during AM restart as intended
                 Key: YARN-4394
                 URL: https://issues.apache.org/jira/browse/YARN-4394
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Daniel Templeton
            Assignee: Daniel Templeton


In the {{invoke()}} method, I found the following code:

{code}
  private AtomicBoolean usingAMProxy = new AtomicBoolean(false);
...
        // if it's AM shut down, do not decrement maxClientRetry as we wait for
        // AM to be restarted.
        if (!usingAMProxy.get()) {
          maxClientRetry--;
        }
        usingAMProxy.set(false);
{code}

When we create the AM proxy, we set the flag to true.  If we then fail to connect, the only effect of the flag being true is that the first failure does not decrement {{maxClientRetry}}, so the code tries one extra time, giving it 400ms instead of just 300ms.  I can't imagine that's the intended behavior.  After any failure, the flag stays false forevermore, but fortunately (?!?) the flag is otherwise unused.
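
To make the extra attempt concrete, here is a minimal standalone sketch of the flag logic. It is not the actual {{ClientServiceDelegate}} code: the class {{RetryFlagSketch}}, the method {{countAttempts()}}, and the loop condition {{maxClientRetry > 0}} are hypothetical, and every call is assumed to fail. Only the decrement-and-reset block is taken from the snippet above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the retry loop; not the real ClientServiceDelegate.
public class RetryFlagSketch {

    // Returns how many attempts are made when every call fails.
    // proxyCreatedFirst models "we set the flag to true when we create the AM proxy".
    static int countAttempts(int maxClientRetry, boolean proxyCreatedFirst) {
        AtomicBoolean usingAMProxy = new AtomicBoolean(false);
        if (proxyCreatedFirst) {
            usingAMProxy.set(true);
        }

        int attempts = 0;
        while (maxClientRetry > 0) {   // assumed loop condition
            attempts++;                // assumption: this attempt fails

            // Same logic as the snippet quoted in this issue.
            if (!usingAMProxy.get()) {
                maxClientRetry--;
            }
            usingAMProxy.set(false);   // never set back to true in this sketch
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println("flag set once:  " + countAttempts(3, true));
        System.out.println("flag never set: " + countAttempts(3, false));
    }
}
```

With a starting budget of 3, the sketch makes 4 attempts when the flag was set once and 3 when it never was. If each failed attempt is followed by a 100ms pause (an assumption that matches the 400ms and 300ms figures above), that is the whole difference the flag makes.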

Looks like I need to do some archeology to figure out how we ended up here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)