Posted to yarn-issues@hadoop.apache.org by "Wilfred Spiegelenburg (JIRA)" <ji...@apache.org> on 2016/08/30 03:12:20 UTC
[jira] [Reopened] (YARN-5567) Fix script exit code checking in
NodeHealthScriptRunner#reportHealthStatus
[ https://issues.apache.org/jira/browse/YARN-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wilfred Spiegelenburg reopened YARN-5567:
-----------------------------------------
The script handling code had a lot of comments in it explaining why the exit code was ignored and why a non-zero exit code should not change the health status:
{code}
 * The node is marked unhealthy if
 * <ol>
 * <li>The node health script times out</li>
 * <li>The node health scripts output has a line which begins with ERROR</li>
 * <li>An exception is thrown while executing the script</li>
 * </ol>
 * If the script throws {@link IOException} or {@link ExitCodeException} the
 * output is ignored and node is left remaining healthy, as script might
 * have syntax error.
{code}
What we have just done is break all of this. We no longer ignore the exit code, and we now mark the node as unhealthy when the script exits with a non-zero code. I assume the original behaviour was chosen for a reason, so we may have just introduced a backwards-incompatible behavioural change.
Looking at the underlying ShellCommandExecutor and tracing back to the {{Shell.runCommand()}} method: every non-zero exit code will throw an {{ExitCodeException}}.
If we are going to change the documented behaviour, we should not do it in release 2.8.1, and we should also update all related documentation.
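To make the difference concrete, here is a minimal sketch of the two behaviours as I read them from the comment quoted above and from the patch. It is not the NodeHealthScriptRunner code: the {{ScriptOutcome}} enum, the {{HealthStatusSketch}} class and both method names are hypothetical stand-ins, and only the {{FAILED_WITH_EXIT_CODE}} name is taken from the issue description.

```java
import java.util.List;

/** Hypothetical outcomes of one health-script run; not the Hadoop enum. */
enum ScriptOutcome { SUCCESS, TIMED_OUT, FAILED_WITH_EXCEPTION, FAILED_WITH_EXIT_CODE }

/** Illustrative model of the health decision, under the assumptions stated above. */
final class HealthStatusSketch {

    /** Behaviour as described by the quoted comment: the exit code is ignored. */
    static boolean isHealthyAsDocumented(ScriptOutcome outcome, List<String> outputLines) {
        switch (outcome) {
            case TIMED_OUT:
            case FAILED_WITH_EXCEPTION:
                // A timeout or an exception while executing marks the node unhealthy.
                return false;
            case FAILED_WITH_EXIT_CODE:
                // Output is ignored and the node stays healthy (script might have a syntax error).
                return true;
            default:
                // Normal completion: unhealthy only if some output line begins with ERROR.
                return outputLines.stream().noneMatch(line -> line.startsWith("ERROR"));
        }
    }

    /** Behaviour after the patch: a non-zero exit code marks the node unhealthy. */
    static boolean isHealthyAfterPatch(ScriptOutcome outcome, List<String> outputLines) {
        if (outcome == ScriptOutcome.FAILED_WITH_EXIT_CODE) {
            return false;
        }
        return isHealthyAsDocumented(outcome, outputLines);
    }

    public static void main(String[] args) {
        List<String> output = List.of("disk check passed");
        // Same script run, different verdict depending on which rule set applies.
        System.out.println("documented: "
            + isHealthyAsDocumented(ScriptOutcome.FAILED_WITH_EXIT_CODE, output));
        System.out.println("patched:    "
            + isHealthyAfterPatch(ScriptOutcome.FAILED_WITH_EXIT_CODE, output));
    }
}
```

Under these assumptions the two rule sets agree on every outcome except a non-zero exit code, which is exactly the case where existing health scripts could start taking nodes out of service after an upgrade.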
> Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus
> --------------------------------------------------------------------------
>
> Key: YARN-5567
> URL: https://issues.apache.org/jira/browse/YARN-5567
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.8.0, 3.0.0-alpha1
> Reporter: Yufei Gu
> Assignee: Yufei Gu
> Fix For: 2.8.1
>
> Attachments: YARN-5567.001.patch
>
>
> In case of FAILED_WITH_EXIT_CODE, health status should be false.
> {code}
> case FAILED_WITH_EXIT_CODE:
> setHealthStatus(true, "", now);
> break;
> {code}
> should be
> {code}
> case FAILED_WITH_EXIT_CODE:
> setHealthStatus(false, "", now);
> break;
> {code}