You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "xieguiming (Commented) (JIRA)" <ji...@apache.org> on 2012/04/11 17:31:19 UTC
[jira] [Commented] (MAPREDUCE-4074) Client continuously retries to
RM When RM goes down before launching Application Master
[ https://issues.apache.org/jira/browse/MAPREDUCE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251674#comment-13251674 ]
xieguiming commented on MAPREDUCE-4074:
---------------------------------------
The reason is below function:
{code:title=ClientServiceDelegate.java|borderStyle=solid}
private synchronized Object invoke(String method, Class argClass,
Object args) throws YarnRemoteException {
Method methodOb = null;
try {
methodOb = MRClientProtocol.class.getMethod(method, argClass);
} catch (SecurityException e) {
throw new YarnException(e);
} catch (NoSuchMethodException e) {
throw new YarnException("Method name mismatch", e);
}
while (true) {
try {
return methodOb.invoke(getProxy(), args);
} catch (YarnRemoteException yre) {
LOG.warn("Exception thrown by remote end.", yre);
throw yre;
} catch (InvocationTargetException e) {
if (e.getTargetException() instanceof YarnRemoteException) {
LOG.warn("Error from remote end: " + e
.getTargetException().getLocalizedMessage());
LOG.debug("Tracing remote error ", e.getTargetException());
throw (YarnRemoteException) e.getTargetException();
}
LOG.debug("Failed to contact AM/History for job " + jobId +
" retrying..", e.getTargetException());
// Force reconnection by setting the proxy to null.
realProxy = null;
} catch (Exception e) {
LOG.debug("Failed to contact AM/History for job " + jobId
+ " Will retry..", e);
// Force reconnection by setting the proxy to null.
realProxy = null;
}
}
}
{code}
When RM goes down, and will throw the java.lang.reflect.UndeclaredThrowableException, and will continuously retry.
> Client continuously retries to RM When RM goes down before launching Application Master
> ---------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4074
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4074
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.23.1
> Reporter: Devaraj K
>
> Client continuously tries to RM and logs the below messages when the RM goes down before launching App Master.
> I feel exception should be thrown or break the loop after finite no of retries.
> {code:xml}
> 28/03/12 07:15:03 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 0 time(s).
> 28/03/12 07:15:04 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 1 time(s).
> 28/03/12 07:15:05 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 2 time(s).
> 28/03/12 07:15:06 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 3 time(s).
> 28/03/12 07:15:07 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 4 time(s).
> 28/03/12 07:15:08 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 5 time(s).
> 28/03/12 07:15:09 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 6 time(s).
> 28/03/12 07:15:10 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 7 time(s).
> 28/03/12 07:15:11 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 8 time(s).
> 28/03/12 07:15:12 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 9 time(s).
> 28/03/12 07:15:13 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 0 time(s).
> 28/03/12 07:15:14 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 1 time(s).
> 28/03/12 07:15:15 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 2 time(s).
> 28/03/12 07:15:16 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 3 time(s).
> 28/03/12 07:15:17 INFO ipc.Client: Retrying connect to server: linux-f330.site/10.18.40.182:8032. Already tried 4 time(s).
> {code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira