Posted to common-commits@hadoop.apache.org by sn...@apache.org on 2019/09/25 08:36:44 UTC

[hadoop] branch trunk updated: YARN-6715. Fix documentation about NodeHealthScriptRunner. Contributed by Peter Bacsko

This is an automated email from the ASF dual-hosted git repository.

snemeth pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/hadoop.git


The following commit(s) were added to refs/heads/trunk by this push:
     new c724577  YARN-6715. Fix documentation about NodeHealthScriptRunner. Contributed by Peter Bacsko
c724577 is described below

commit c72457787df33b44a853fceff0cfe180850c4960
Author: Szilard Nemeth <sn...@apache.org>
AuthorDate: Wed Sep 25 10:36:22 2019 +0200

    YARN-6715. Fix documentation about NodeHealthScriptRunner. Contributed by Peter Bacsko
---
 .../main/java/org/apache/hadoop/util/NodeHealthScriptRunner.java | 1 +
 .../java/org/apache/hadoop/util/TestNodeHealthScriptRunner.java  | 9 +++++++++
 .../hadoop-yarn-site/src/site/markdown/NodeManager.md            | 8 +++++++-
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/NodeHealthScriptRunner.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/NodeHealthScriptRunner.java
index 7c46c5b..f2a5b24 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/NodeHealthScriptRunner.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/NodeHealthScriptRunner.java
@@ -163,6 +163,7 @@ public class NodeHealthScriptRunner extends AbstractService {
         setHealthStatus(false, exceptionStackTrace);
         break;
       case FAILED_WITH_EXIT_CODE:
+        // see Javadoc above - we don't report bad health intentionally
         setHealthStatus(true, "", now);
         break;
       case FAILED:
diff --git a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestNodeHealthScriptRunner.java b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestNodeHealthScriptRunner.java
index 8fc64d1..2748c0b 100644
--- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestNodeHealthScriptRunner.java
+++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestNodeHealthScriptRunner.java
@@ -94,6 +94,8 @@ public class TestNodeHealthScriptRunner {
     String timeOutScript =
       Shell.WINDOWS ? "@echo off\nping -n 4 127.0.0.1 >nul\necho \"I am fine\""
       : "sleep 4\necho \"I am fine\"";
+    String exitCodeScript = "exit 127";
+
     Configuration conf = new Configuration();
     writeNodeHealthScriptFile(normalScript, true);
     NodeHealthScriptRunner nodeHealthScriptRunner = new NodeHealthScriptRunner(
@@ -132,5 +134,12 @@ public class TestNodeHealthScriptRunner {
     Assert.assertEquals(
             NodeHealthScriptRunner.NODE_HEALTH_SCRIPT_TIMED_OUT_MSG,
             nodeHealthScriptRunner.getHealthReport());
+
+    // Exit code 127
+    writeNodeHealthScriptFile(exitCodeScript, true);
+    timerTask.run();
+    Assert.assertTrue("Node health status reported unhealthy",
+        nodeHealthScriptRunner.isHealthy());
+    Assert.assertEquals("", nodeHealthScriptRunner.getHealthReport());
   }
 }
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md
index 5b9f0ef..e4ed57f 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManager.md
@@ -44,7 +44,13 @@ The following configuration parameters can be used to modify the disk checks:
 
 ###External Health Script
 
-Users may specify their own health checker script that will be invoked by the health checker service. Users may specify a timeout as well as options to be passed to the script. If the script exits with a non-zero exit code, times out or results in an exception being thrown, the node is marked as unhealthy. Please note that if the script cannot be executed due to permissions or an incorrect path, etc, then it counts as a failure and the node will be reported as unhealthy. Please note that [...]
+Users may specify their own health checker script that will be invoked by the health checker service. Users may specify a timeout as well as options to be passed to the script. If the script times out, results in an exception being thrown or outputs a line which begins with the string ERROR, the node is marked as unhealthy. Please note that:
+
+  * Exit code other than 0 is **not** considered to be a failure because it might have been caused by a syntax error. Therefore the node will **not** be marked as unhealthy.
+
+  * If the script cannot be executed due to permissions or an incorrect path, etc, then it counts as a failure and the node will be reported as unhealthy.
+
+  * Specifying a health check script is not mandatory. If no script is specified, only the disk checker status will be used to determine the health of the node.
 
 The following configuration parameters can be used to set the health script:
 

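For illustration, the behaviour documented in the NodeManager.md change above can be exercised with a small health script. This is only a sketch under stated assumptions: the function names, the "/" mount point and the 90% limit are hypothetical and are not part of the commit. It reports a problem by printing a line that begins with ERROR, since the updated text says a non-zero exit code by itself does not mark the node unhealthy.

```shell
#!/bin/sh
# Hypothetical health check script; the function names, the "/" mount point
# and the 90% limit are illustrative assumptions, not taken from the commit.

# Print the used percentage (digits only) of the file system holding $1.
used_percent() {
  # df -P keeps each file system on one line; field 5 looks like "42%".
  df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Print a line beginning with ERROR when usage reaches the limit.
# Arguments: mount point, used percentage, limit percentage.
report_disk_usage() {
  case "$2" in
    ''|*[!0-9]*) return 0 ;;  # usage unknown or not a number: report nothing
  esac
  if [ "$2" -ge "$3" ]; then
    echo "ERROR disk usage on $1 is $2% (limit $3%)"
  fi
}

# The output, not the exit status, is what signals a problem here.
report_disk_usage / "$(used_percent /)" 90
```

The script deliberately signals trouble through its output rather than its exit status, so that a typo in the script itself does not look like a node failure.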
