Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2021/03/11 04:11:34 UTC

[GitHub] [hadoop] dineshchitlangia commented on a change in pull request #2748: HDFS-15879. Exclude slow nodes when choose targets for blocks

dineshchitlangia commented on a change in pull request #2748:
URL: https://github.com/apache/hadoop/pull/2748#discussion_r592050376



##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##########
@@ -356,6 +382,44 @@
         DFSConfigKeys.DFS_NAMENODE_BLOCKS_PER_POSTPONEDBLOCKS_RESCAN_KEY_DEFAULT);
   }
 
+  private void startSlowPeerCollector() {
+    if (slowPeerCollectorDaemon != null) {
+      return;
+    }
+    slowPeerCollectorDaemon = new Daemon(new Runnable() {
+      @Override
+      public void run() {
+        while (true) {
+          try {
+            slowPeers = getSlowPeers();
+          } catch (Exception e) {
+            LOG.error("Slow peers collected failed", e);

Review comment:
```suggestion
            LOG.error("Failed to collect slow peers", e);
```
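
For reference, a minimal sketch of what the full collector loop could look like with the suggested message applied. The quoted hunk above is truncated, so the sleep handling and the field name slowPeerCollectionInterval below are assumptions for illustration, not the PR's exact code:

```java
// Sketch only -- slowPeerCollectorDaemon and slowPeers appear in the quoted hunk;
// slowPeerCollectionInterval is an assumed field holding the collection interval in ms.
private void startSlowPeerCollector() {
  if (slowPeerCollectorDaemon != null) {
    return;
  }
  slowPeerCollectorDaemon = new Daemon(new Runnable() {
    @Override
    public void run() {
      while (true) {
        try {
          // Refresh the cached set of slow peers reported by datanodes.
          slowPeers = getSlowPeers();
        } catch (Exception e) {
          LOG.error("Failed to collect slow peers", e);
        }
        try {
          // Wait for the configured collection interval before the next pass.
          Thread.sleep(slowPeerCollectionInterval);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }
  });
  slowPeerCollectorDaemon.start();
}
```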

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
##########
@@ -2368,6 +2368,37 @@
   </description>
 </property>
 
+<property>
+  <name>dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled</name>
+  <value>false</value>
+  <description>
+    If this is set to true, we will filter out slow nodes
+    when choosing targets for blocks.
+  </description>
+</property>
+
+<property>
+  <name>dfs.namenode.max.slowpeer.collect.nodes</name>
+  <value>5</value>
+  <description>
+    How many slow nodes we will collect for filtering out
+    when choosing targets for blocks.
+
+    It is ignored if dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is false.
+  </description>
+</property>
+
+<property>
+  <name>dfs.namenode.slowpeer.collect.interval</name>
+  <value>30m</value>
+  <description>
+    How offen slow nodes we will collect for filtering out
+    when choosing targets for blocks.

Review comment:
```suggestion
    Interval at which the slow peer tracker runs in the background to collect slow peers.
```
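
To tie the three new properties together, here is a hedged example of how they could be read through the Hadoop Configuration API; the variable names are illustrative only, and the defaults simply mirror the hdfs-default.xml values above:

```java
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class SlowNodeConfigExample {
  public static void main(String[] args) {
    Configuration conf = new HdfsConfiguration();

    // Whether slow datanodes are filtered out when choosing targets for blocks.
    boolean excludeSlowNodes = conf.getBoolean(
        "dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled", false);

    // Upper bound on how many slow nodes are collected for exclusion.
    int maxSlowNodes = conf.getInt("dfs.namenode.max.slowpeer.collect.nodes", 5);

    // "30m" style values are parsed as durations; converted to milliseconds here.
    long collectIntervalMs = conf.getTimeDuration(
        "dfs.namenode.slowpeer.collect.interval",
        TimeUnit.MINUTES.toMillis(30), TimeUnit.MILLISECONDS);

    System.out.println(excludeSlowNodes + " " + maxSlowNodes + " " + collectIntervalMs);
  }
}
```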

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SlowPeerTracker.java
##########
@@ -233,6 +234,20 @@ public String getSlowNode() {
     }
   }
 
+  /**
+   * Returns all tracked slow peers.
+   * @param numNodes the maximum number of slow nodes to return
+   * @return the list of datanodes currently reported as slow peers
+   */
+  public ArrayList<String> getSlowNodes(int numNodes) {
+    Collection<ReportForJson> jsonReports = getJsonReports(numNodes);
+    ArrayList<String> slowNodes = new ArrayList<>();
+    for (ReportForJson jsonReport : jsonReports) {
+      slowNodes.add(jsonReport.getSlowNode());
+    }
+    return slowNodes;
+  }

Review comment:
       Somewhere in this method, we should log the slow peers at WARN so that they show up in the Namenode logs.
   This will be useful for admins when debugging HDFS performance issues. If they see the list of slow nodes reported, they can quickly take a look at the affected nodes instead of digging through individual datanode logs for the various kinds of "Slow Block Receiver" log messages.
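   For illustration, one way the suggestion could be folded into getSlowNodes (a sketch under the assumption that SlowPeerTracker's existing LOG is used; the exact log wording is not from the PR):

```java
public ArrayList<String> getSlowNodes(int numNodes) {
  Collection<ReportForJson> jsonReports = getJsonReports(numNodes);
  ArrayList<String> slowNodes = new ArrayList<>();
  for (ReportForJson jsonReport : jsonReports) {
    slowNodes.add(jsonReport.getSlowNode());
  }
  if (!slowNodes.isEmpty()) {
    // Surface the slow peers in the NameNode log so admins do not have to grep
    // individual datanode logs for "Slow Block Receiver" style messages.
    LOG.warn("Slow nodes list: " + slowNodes);
  }
  return slowNodes;
}
```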



