Posted to common-issues@hadoop.apache.org by "zhangshuyan0 (via GitHub)" <gi...@apache.org> on 2023/05/05 03:02:44 UTC

[GitHub] [hadoop] zhangshuyan0 opened a new pull request, #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

zhangshuyan0 opened a new pull request, #5622:
URL: https://github.com/apache/hadoop/pull/5622

   ### Description of PR
   `processFirstBlockReport()` is used to process the first block report from a datanode. It does not calculate a `toRemove` list because it assumes the namenode has no metadata about that datanode yet. However, if a datanode is re-registered after restarting, its `blockReportCount` will be reset to 0. 
   https://github.com/apache/hadoop/blob/c7699d3dcd4f8feaf2c5ae5943b8a4cec738e95d/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java#L1007-L1012
   That is to say, the first block report after a datanode restarts will be processed by `processFirstBlockReport()`. 
   https://github.com/apache/hadoop/blob/c7699d3dcd4f8feaf2c5ae5943b8a4cec738e95d/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L2916-L2925
   This is unreasonable because the namenode already holds metadata for the datanode at this point; if stale replica metadata is not removed promptly, blocks with insufficient replicas cannot be reconstructed in time, which increases the risk of missing blocks. In summary, `processFirstBlockReport()` should only be used when the namenode restarts, not when a datanode restarts.
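   For illustration, here is a minimal, self-contained sketch of the routing problem described above. It is not the actual BlockManager code; the class and field names are simplified stand-ins, and the only behaviour it mirrors is the implication above that a zero `blockReportCount` sends the report down `processFirstBlockReport()`, which misfires once re-registration resets the counter for a datanode the namenode already knows about.
   ```java
   // Simplified model of the report-routing decision (NOT the real BlockManager).
   public class BlockReportRoutingSketch {

     /** Minimal stand-in for the per-storage state tracked by the NameNode. */
     static class StorageInfo {
       int blockReportCount = 0;            // reset to 0 when the DataNode re-registers
       boolean namenodeHasMetadata = false; // true once a report has been stored
     }

     /** Mirrors the routing decision: a zero count takes the fast path. */
     static String route(StorageInfo storage) {
       if (storage.blockReportCount == 0) {
         // Fast path: replicas are only added; no toRemove list is computed.
         return "processFirstBlockReport";
       }
       // Full path: the report is diffed against existing state (toAdd, toRemove, ...).
       return "processReport";
     }

     public static void main(String[] args) {
       StorageInfo storage = new StorageInfo();

       // Genuine first report after NameNode startup: the fast path is appropriate.
       System.out.println(route(storage));
       storage.blockReportCount++;
       storage.namenodeHasMetadata = true;

       // The DataNode restarts and re-registers: blockReportCount drops back to 0
       // even though the NameNode still holds metadata for this storage.
       storage.blockReportCount = 0;
       System.out.println(route(storage)
           + (storage.namenodeHasMetadata ? "   <-- stale replicas are never removed" : ""));
     }
   }
   ```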
   ### How was this patch tested?
   Add a new unit test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#issuecomment-1536130935

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 37s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  | 221m 11s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  2s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 28s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m  7s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 17s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 53s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/1/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 138 unchanged - 0 fixed = 139 total (was 138)  |
   | +1 :green_heart: |  mvnsite  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 32s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 59s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 218m 51s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 52s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 512m 43s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
   |   | hadoop.hdfs.TestRollingUpgrade |
   |   | hadoop.hdfs.TestDecommissionWithStriped |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5622 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux fa71be949238 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / c698567728b364df3260cbd5b56cd933b1f0628e |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/1/testReport/ |
   | Max. process+thread count | 3463 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] Hexiaoqiao commented on pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#issuecomment-1539287136

   Committed to trunk. Thanks @zhangshuyan0 for your contributions, and thanks @ayushtkn for your reviews!




[GitHub] [hadoop] zhangshuyan0 commented on a diff in pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "zhangshuyan0 (via GitHub)" <gi...@apache.org>.
zhangshuyan0 commented on code in PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#discussion_r1187085999


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java:
##########
@@ -2066,4 +2069,55 @@ public void testValidateReconstructionWorkAndRacksNotEnough() {
     // validateReconstructionWork return false, need to perform resetTargets().
     assertNull(work.getTargets());
   }
+
+  /**
+   * Test whether the first block report after DataNode restart is completely
+   * processed.
+   */
+  @Test
+  public void testBlockReportAfterDataNodeRestart() throws Exception {
+    Configuration conf = new HdfsConfiguration();
+    try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
+           .numDataNodes(3).storagesPerDatanode(1).build()) {
+      cluster.waitActive();
+      BlockManager blockManager = cluster.getNamesystem().getBlockManager();
+      DistributedFileSystem fs = cluster.getFileSystem();
+      final Path filePath = new Path("/tmp.txt");
+      final long fileLen = 1L;
+      DFSTestUtil.createFile(fs, filePath, fileLen, (short) 3, 1L);
+      DFSTestUtil.waitForReplication(fs, filePath, (short) 3, 60000);
+      ArrayList<DataNode> datanodes = cluster.getDataNodes();
+      assertEquals(datanodes.size(), 3);
+
+      // Stop RedundancyMonitor.
+      blockManager.setInitializedReplQueues(false);
+
+      // Delete the replica on the first datanode.
+      File dnDir =
+          datanodes.get(0).getFSDataset().getVolumeList().get(0)
+              .getCurrentDir();
+      String[] children = FileUtil.list(dnDir);
+      for (String s : children) {
+        if (!s.equals("VERSION")) {
+          FileUtil.fullyDeleteContents(new File(dnDir, s));
+        }
+      }
+
+      // The number of replicas is still 3 because the datanode has not sent
+      // a new block report.
+      FileStatus stat = fs.getFileStatus(filePath);
+      BlockLocation[] locs = fs.getFileBlockLocations(stat, 0, stat.getLen());
+      assertEquals(3, locs[0].getHosts().length);
+
+      // Restart the first datanode.
+      cluster.restartDataNode(0, true);
+
+      // Wait for the block report to be processed.
+      Thread.sleep(5000);

Review Comment:
   Thanks for your review, I'll check it.





[GitHub] [hadoop] hadoop-yetus commented on pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#issuecomment-1538780054

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 23s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m  3s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 52s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  4s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 38s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 203m 41s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 303m 40s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5622 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux baab2ee3e222 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / c6416bc2ebf0600ef153cd381ef6ee52372225b1 |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/4/testReport/ |
   | Max. process+thread count | 3451 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/4/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] hadoop-yetus commented on pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#issuecomment-1536736228

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m  9s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 27s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  5s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 26s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 207m  6s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 307m 40s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5622 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux bac6d7454899 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 32bf0759a5f22651b2b67c204ab12e9a6b689d52 |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/2/testReport/ |
   | Max. process+thread count | 3287 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5622/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] Hexiaoqiao merged pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao merged PR #5622:
URL: https://github.com/apache/hadoop/pull/5622




[GitHub] [hadoop] Hexiaoqiao commented on pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#issuecomment-1537031643

   > If the first block report is processed using processFirstBlockReport when a datanode restarts, the incorrect metadata in NameNode memory will not be fixed until the next block report. In my opinion, the processing of the first block report after a datanode restart should be the same as the processing of subsequent block reports.
   
   Makes sense to me. +1 from my side. 
   
   cc @jojochuang @ayushtkn @goiri Would you mind taking another review? Thanks.




[GitHub] [hadoop] ayushtkn commented on a diff in pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "ayushtkn (via GitHub)" <gi...@apache.org>.
ayushtkn commented on code in PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#discussion_r1187022192


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java:
##########
@@ -2066,4 +2069,55 @@ public void testValidateReconstructionWorkAndRacksNotEnough() {
     // validateReconstructionWork return false, need to perform resetTargets().
     assertNull(work.getTargets());
   }
+
+  /**
+   * Test whether the first block report after DataNode restart is completely
+   * processed.
+   */
+  @Test
+  public void testBlockReportAfterDataNodeRestart() throws Exception {
+    Configuration conf = new HdfsConfiguration();
+    try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
+           .numDataNodes(3).storagesPerDatanode(1).build()) {
+      cluster.waitActive();
+      BlockManager blockManager = cluster.getNamesystem().getBlockManager();
+      DistributedFileSystem fs = cluster.getFileSystem();
+      final Path filePath = new Path("/tmp.txt");
+      final long fileLen = 1L;
+      DFSTestUtil.createFile(fs, filePath, fileLen, (short) 3, 1L);
+      DFSTestUtil.waitForReplication(fs, filePath, (short) 3, 60000);
+      ArrayList<DataNode> datanodes = cluster.getDataNodes();
+      assertEquals(datanodes.size(), 3);
+
+      // Stop RedundancyMonitor.
+      blockManager.setInitializedReplQueues(false);
+
+      // Delete the replica on the first datanode.
+      File dnDir =
+          datanodes.get(0).getFSDataset().getVolumeList().get(0)
+              .getCurrentDir();
+      String[] children = FileUtil.list(dnDir);
+      for (String s : children) {
+        if (!s.equals("VERSION")) {
+          FileUtil.fullyDeleteContents(new File(dnDir, s));
+        }
+      }
+
+      // The number of replicas is still 3 because the datanode has not sent
+      // a new block report.
+      FileStatus stat = fs.getFileStatus(filePath);
+      BlockLocation[] locs = fs.getFileBlockLocations(stat, 0, stat.getLen());
+      assertEquals(3, locs[0].getHosts().length);
+
+      // Restart the first datanode.
+      cluster.restartDataNode(0, true);
+
+      // Wait for the block report to be processed.
+      Thread.sleep(5000);

Review Comment:
   Can you use specific wait methods, or at worst GenericTestUtils.waitFor, instead of a fixed sleep? MiniDfsCluster has ``waitDataNodeFullyStarted`` or ``waitFirstBRCompleted``; check if you can use one of those.
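   For example, a sketch of that suggestion in place of the `Thread.sleep(5000)`. This assumes `GenericTestUtils.waitFor(check, intervalMillis, timeoutMillis)` from `org.apache.hadoop.test`, reuses `fs` and `filePath` from the test above, and assumes the test expects the deleted replica to drop out of the locations, leaving 2 reported hosts:
   ```java
   // Poll for the post-restart block report to take effect instead of sleeping
   // a fixed 5 seconds. The expected host count (2) is an assumption here.
   GenericTestUtils.waitFor(() -> {
     try {
       FileStatus st = fs.getFileStatus(filePath);
       BlockLocation[] blockLocs = fs.getFileBlockLocations(st, 0, st.getLen());
       return blockLocs.length > 0 && blockLocs[0].getHosts().length == 2;
     } catch (IOException e) {
       return false;
     }
   }, 500, 60000);
   ```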





[GitHub] [hadoop] zhangshuyan0 commented on pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "zhangshuyan0 (via GitHub)" <gi...@apache.org>.
zhangshuyan0 commented on PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#issuecomment-1536398861

   @Hexiaoqiao Thanks for your review. I think missing blocks are a more serious problem than performance loss. If the first block report is processed using `processFirstBlockReport` when a datanode restarts, the incorrect metadata in NameNode memory will not be fixed until the next block report. In my opinion, the processing of the first block report after a datanode restart should be the same as the processing of subsequent block reports.
   Here are the answers to your questions.
   a. The purpose of `processFirstBlockReport` is to speed up NameNode startup. Adding extra logic to it would prolong the time the NameNode spends in safe mode.
   b. A new datanode has no replicas (or very few), so its `reports` (the parameter of the `blockReport` RPC) and its `DatanodeStorageInfo#BlockIterator` contain almost nothing. So this PR has no effect on adding new datanodes.
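   For intuition, here is a simplified, non-authoritative model of the diff the full report path performs (block IDs stand in for replicas; this is not the real `reportDiff`). The work is roughly proportional to the number of reported blocks plus the number of blocks already stored for that storage, and both are near zero for a brand-new datanode:
   ```java
   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.List;
   import java.util.Set;

   /** Illustrative model only, not HDFS's actual reportDiff implementation. */
   public class ReportDiffSketch {

     static void diff(Set<Long> reported, Set<Long> stored,
                      List<Long> toAdd, List<Long> toRemove) {
       for (long id : reported) {
         if (!stored.contains(id)) {
           toAdd.add(id);     // replica the NameNode did not know about
         }
       }
       for (long id : stored) {
         if (!reported.contains(id)) {
           toRemove.add(id);  // stale replica no longer present on the DataNode
         }
       }
     }

     public static void main(String[] args) {
       List<Long> toAdd = new ArrayList<>();
       List<Long> toRemove = new ArrayList<>();
       // Brand-new DataNode: both sides are empty, so the full diff does no work.
       diff(Collections.emptySet(), Collections.emptySet(), toAdd, toRemove);
       System.out.println("toAdd=" + toAdd + " toRemove=" + toRemove);
     }
   }
   ```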
   
   




[GitHub] [hadoop] Hexiaoqiao commented on pull request #5622: HDFS-16999. Fix wrong use of processFirstBlockReport().

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on PR #5622:
URL: https://github.com/apache/hadoop/pull/5622#issuecomment-1535670070

   @zhangshuyan0 Thanks for your proposal and for trying to fix this issue. Having just glanced at the PR, it proposes to switch from `processFirstBlockReport` to `processReport` only when a DataNode restarts, right? I am concerned about the performance impact if we do that.
   a. Is it possible to improve `processFirstBlockReport` to solve this issue?
   b. Will it affect the processing logic or performance when adding a new DataNode to the cluster?
   Thanks again.

