Posted to common-issues@hadoop.apache.org by "hfutatzhanghb (via GitHub)" <gi...@apache.org> on 2023/05/11 08:58:37 UTC

[GitHub] [hadoop] hfutatzhanghb opened a new pull request, #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

hfutatzhanghb opened a new pull request, #5643:
URL: https://github.com/apache/hadoop/pull/5643

   The description is in HDFS-17003.



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1556708182

   ![image](https://github.com/apache/hadoop/assets/25115709/a181d152-e642-4b41-a770-03b7f8d693f2)
   



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1549102000

   Hi @sodonnel @Hexiaoqiao @tomscut, could you please help review the code? Thanks a lot.



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1554053305

   @ayushtkn Sir, could you please also help review this PR? Thanks a lot.



[GitHub] [hadoop] sodonnel commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1549838520

   Have you validated that this change works correctly? I did not test the code when I wrote it.
   
   Also, do you think you could add a test that reproduces the issue and verifies that the fix corrects it?



[GitHub] [hadoop] Hexiaoqiao commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1579831422

   LGTM. +1 from my side. Let's wait to see whether @zhangshuyan0 has any concerns.



[GitHub] [hadoop] zhangshuyan0 commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "zhangshuyan0 (via GitHub)" <gi...@apache.org>.
zhangshuyan0 commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1577869018

   I think the added UT still has a problem. I'm not sure if you saw this comment from me:
   
   > Should we skip dnd and dnd2 when turning heartbeats back on here? Otherwise the corrupt replicas may be deleted from InvalidateBlocks if delete commands are sent.



[GitHub] [hadoop] zhangshuyan0 commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "zhangshuyan0 (via GitHub)" <gi...@apache.org>.
zhangshuyan0 commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1220714880


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,129 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  /**
+   * This unit test tries to cover the following situation:
+   * Suppose we have an EC file with RS(d,p) policy whose block group id
+   * is blk_-9223372036845119810_1920002.
+   * The first and second data blocks in this EC block group are corrupted
+   * while we read the file.
+   * This triggers the reportBadBlock RPC, which adds
+   * blk_-9223372036845119810_1920002
+   * and blk_-9223372036845119809_1920002 to corruptReplicas.
+   * It also reconstructs the two blocks and sends IBRs to the namenode,
+   * which then executes BlockManager#addStoredBlock and the
+   * invalidateCorruptReplicas method. Suppose we first receive the IBR of
+   * blk_-9223372036845119810_1920002; then invalidateCorruptReplicas
+   * will only invalidate blk_-9223372036845119809_1920002 on the two datanodes
+   * that contain the two corrupt blocks.
+   *
+   * @throws Exception
+   */
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // Find the first block file.
+    File storageDir = cluster.getInstanceStorageDir(dnIndex, 0);
+    File blkFile = MiniDFSCluster.getBlockFile(storageDir, blks[0].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile.exists());
+    // Corrupt the block file.
+    LOG.info("Deliberately corrupting file " + blkFile.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Find the second block file.
+    File storageDir2 = cluster.getInstanceStorageDir(dnIndex2, 0);
+    File blkFile2 = MiniDFSCluster.getBlockFile(storageDir2, blks[1].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile2.exists());
+    // Corrupt the second block file.
+    LOG.info("Deliberately corrupting file " + blkFile2.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile2)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Disable the heartbeat from DN so that the corrupted block record is kept
+    // in NameNode.
+    for (DataNode dataNode : cluster.getDataNodes()) {
+      DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, true);
+    }
+    try {
+      // Do stateful read.
+      StripedFileTestUtil.verifyStatefulRead(dfs, file, length, bytes,
+          ByteBuffer.allocate(1024));
+
+      // Check whether the corruption has been reported to the NameNode.
+      final FSNamesystem ns = cluster.getNamesystem();
+      final BlockManager bm = ns.getBlockManager();
+      BlockInfo blockInfo = (ns.getFSDirectory().getINode4Write(file.toString())
+          .asFile().getBlocks())[0];
+      GenericTestUtils.waitFor(() -> {
+        if (bm.getCorruptReplicas(blockInfo) == null) {
+          return false;
+        }
+        return bm.getCorruptReplicas(blockInfo).size() == 2;
+      }, 250, 60000);
+      // Double check.
+      Assert.assertEquals(2, bm.getCorruptReplicas(blockInfo).size());
+
+      DatanodeDescriptor dnd =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex).getDatanodeId());
+
+      DatanodeDescriptor dnd2 =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex2).getDatanodeId());
+
+      for (DataNode datanode : cluster.getDataNodes()) {
+        if (!datanode.getDatanodeUuid().equals(dnd.getDatanodeUuid()) ||

Review Comment:
   Should this be && instead of ||?
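   A minimal sketch of the difference, assuming the intent is to keep heartbeats disabled on both dnd and dnd2 (names as in the test above):
   
   ```java
   // With ||, the guard is true even for dnd itself (dnd's UUID still differs
   // from dnd2's UUID), so dnd's heartbeat would be re-enabled by mistake.
   // With &&, the body runs only for nodes that are neither dnd nor dnd2.
   if (!datanode.getDatanodeUuid().equals(dnd.getDatanodeUuid())
       && !datanode.getDatanodeUuid().equals(dnd2.getDatanodeUuid())) {
     DataNodeTestUtils.setHeartbeatsDisabledForTests(datanode, false);
   }
   ```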




[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1544059352

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 53s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 15s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 48s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 34s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 206m 45s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 50s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 311m 57s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 0beff46017b7 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f38a250b1c02fbe7d684855facbdf55c87b97eaf |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/1/testReport/ |
   | Max. process+thread count | 3215 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1552544798

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 56s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 39s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  3s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 17s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 203m 26s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 304m 31s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 96c13f317603 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 98af494799beba7ca8a35aaf7af7bf1ca28e6cd6 |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/6/testReport/ |
   | Max. process+thread count | 3476 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/6/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] Hexiaoqiao commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1551117346

   @hfutatzhanghb Thanks for involving me here. It makes sense to me. +1 to "add a test somehow to reproduce the issue and check if it is fixed".
   
   > In other words, if a replica is corrupt due to GENSTAMP_MISMATCH, can the check on the datanode pass successfully after receiving the command (Line 91 below)?
   
   To @zhangshuyan0, IMO this is another case (replication and EC may both have the issue); let's file another issue to track it and try to fix it.
   Thanks again.



[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1553160474

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 52s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 25s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  3s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 10s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 205m 58s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/7/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 305m 35s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux d45c95bf78fc 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ba94c96d99cd0ee72b106a3347b0b14cf0c312ae |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/7/testReport/ |
   | Max. process+thread count | 3206 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/7/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1576468725

   @Hexiaoqiao @ayushtkn @sodonnel @zhangshuyan0 Hi, sirs. Could you please help push this issue forward? Thanks again.



[GitHub] [hadoop] Hexiaoqiao commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1576849188

   Sorry for the late response. I will review it in the coming days. cc @zhangshuyan0, would you mind taking a review if you have bandwidth?



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1552291958

   Fixed checkstyle.
   
   hadoop.hdfs.TestErasureCodingPolicies and hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy passed in my local environment.
   
   
   



[GitHub] [hadoop] zhangshuyan0 commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "zhangshuyan0 (via GitHub)" <gi...@apache.org>.
zhangshuyan0 commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1197477766


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,108 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // find the first block file
+    File storageDir = cluster.getInstanceStorageDir(dnIndex, 0);
+    File blkFile = MiniDFSCluster.getBlockFile(storageDir, blks[0].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile.exists());
+    // corrupt the block file
+    LOG.info("Deliberately corrupting file " + blkFile.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // find the second block file
+    File storageDir2 = cluster.getInstanceStorageDir(dnIndex2, 0);
+    File blkFile2 = MiniDFSCluster.getBlockFile(storageDir2, blks[1].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile2.exists());
+    // corrupt the second block file
+    LOG.info("Deliberately corrupting file " + blkFile2.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile2)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // disable the heartbeat from DN so that the corrupted block record is kept
+    // in NameNode
+    for (DataNode dataNode : cluster.getDataNodes()) {
+      DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, true);
+    }
+    try {
+      // do stateful read
+      StripedFileTestUtil.verifyStatefulRead(dfs, file, length, bytes,
+          ByteBuffer.allocate(1024));
+
+      // check whether the corruption has been reported to the NameNode
+      final FSNamesystem ns = cluster.getNamesystem();
+      final BlockManager bm = ns.getBlockManager();
+      BlockInfo blockInfo = (ns.getFSDirectory().getINode4Write(file.toString())
+          .asFile().getBlocks())[0];
+      GenericTestUtils.waitFor(() -> {
+        if (bm.getCorruptReplicas(blockInfo) == null) {
+          return false;
+        }
+        return bm.getCorruptReplicas(blockInfo).size() == 2;
+      }, 250, 60000);
+      // double check
+      Assert.assertEquals(2, bm.getCorruptReplicas(blockInfo).size());
+
+      DatanodeDescriptor dnd =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex).getDatanodeId());
+
+      DatanodeDescriptor dnd2 =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex2).getDatanodeId());
+
+      for (DataNode datanode : cluster.getDataNodes()) {
+        DataNodeTestUtils.setHeartbeatsDisabledForTests(datanode, false);

Review Comment:
   Should we skip dnd and dnd2 when turning heartbeats back on here? Otherwise the corrupt replicas may be deleted from `InvalidateBlocks` if delete commands are sent.
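   A minimal sketch of that suggestion (illustrative only; dnd, dnd2 and cluster are the variables from the test above):
   
   ```java
   for (DataNode datanode : cluster.getDataNodes()) {
     // Keep heartbeats disabled on the two nodes hosting corrupt replicas, so
     // the NameNode cannot send them delete commands and drop the corrupt
     // replica records that the test asserts on.
     if (datanode.getDatanodeUuid().equals(dnd.getDatanodeUuid())
         || datanode.getDatanodeUuid().equals(dnd2.getDatanodeUuid())) {
       continue;
     }
     DataNodeTestUtils.setHeartbeatsDisabledForTests(datanode, false);
   }
   ```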




[GitHub] [hadoop] sodonnel commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1544680459

   If I understand correctly, for a replicated block with two corrupt replicas, the code in invalidateCorruptReplicas will be called once the block has been replicated correctly. At that point there will be 3 good replicas and 2 corrupt ones stored in the corruptReplicas map. The method will then iterate over those two and "invalidate" them on the datanodes they are stored on.
   
   For EC the same applies; however, we are sending the block ID + index of the last reported replica to both DNs. All we store in the corruptReplicas map is the block group ID (i.e. the block ID with the replica index stripped out) and the list of nodes hosting it. At this point in the code we don't know which index is on each of the nodes hosting a corrupt replica. Is this correct?
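   As an aside, a minimal, illustrative sketch of what "the block ID with the replica index stripped out" means for striped blocks; the mask value follows my reading of HdfsServerConstants.BLOCK_GROUP_INDEX_MASK, and the IDs below are hypothetical, not taken from this PR:
   
   ```java
   public class StripedBlockIdSketch {
     // Striped (EC) internal block IDs reserve the low 4 bits for the index
     // of the internal block within its group (at most 16 blocks per group).
     static final long BLOCK_GROUP_INDEX_MASK = 15;
   
     // Clear the index bits to recover the block group ID.
     static long groupId(long internalBlockId) {
       return internalBlockId & ~BLOCK_GROUP_INDEX_MASK;
     }
   
     // Extract the internal block (replica) index within the group.
     static int indexInGroup(long internalBlockId) {
       return (int) (internalBlockId & BLOCK_GROUP_INDEX_MASK);
     }
   
     public static void main(String[] args) {
       long group = -9223372036854775792L; // hypothetical group ID, low 4 bits zero
       long second = group + 1;            // internal block at index 1
       System.out.println(groupId(second) == group); // true
       System.out.println(indexInGroup(second));     // 1
     }
   }
   ```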
   
   It's not clear to me how the fix in this PR fixes the problem:
   
   ```
           if (blk.isStriped()) {
             DatanodeStorageInfo[] storages = getStorages(blk);
             for (DatanodeStorageInfo storage : storages) {
               final Block b = getBlockOnStorage(blk, storage);
               if (b != null) {
                 reported = b;
               }
             }
           }
   ```
   For each node stored, we get the storages for the block, which will be the nodes hosting it. Then we call getBlockOnStorage, and it is sure to return non-null for each of the storages, as they all host a block in the group, right?
   
   Do we not need to somehow find the replica index for the block on each of the nodes listed, and then set up the "reported block" with the correct block ID + index for that node, passing that to invalidate?
   
   Would something like this work - NOTE - I have not tested this at all:
   
   ```
   --- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
   +++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
   @@ -3536,9 +3536,25 @@ private void invalidateCorruptReplicas(BlockInfo blk, Block reported,
        // ConcurrentModificationException, when the block is removed from the node
        DatanodeDescriptor[] nodesCopy =
            nodes.toArray(new DatanodeDescriptor[nodes.size()]);
   +
   +    DatanodeStorageInfo[] storages = null;
   +    if (blk.isStriped()) {
   +      storages = getStorages(blk);
   +    }
   +
        for (DatanodeDescriptor node : nodesCopy) {
   +      Block blockToInvalidate = reported;
   +      if (blk.isStriped()) {
   +        for (DatanodeStorageInfo s : storages) {
   +          if (s.getDatanodeDescriptor() == node) {
   +            blockToInvalidate = getBlockOnStorage(blk, s);
   +            break;
   +          }
   +        }
   +      }
   +
          try {
   -        if (!invalidateBlock(new BlockToMarkCorrupt(reported, blk, null,
   +        if (!invalidateBlock(new BlockToMarkCorrupt(blockToInvalidate, blk, null,
                Reason.ANY), node, numberReplicas)) {
              removedFromBlocksMap = false;
            }
   ```
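   The idea in the sketch, in short: rather than reusing the single reported block for every corrupt node, each node is matched to its storage and getBlockOnStorage resolves that node's own internal block (group ID plus that node's replica index) before it is passed to invalidateBlock.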



[GitHub] [hadoop] hfutatzhanghb commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1220738668


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,108 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // find the first block file
+    File storageDir = cluster.getInstanceStorageDir(dnIndex, 0);
+    File blkFile = MiniDFSCluster.getBlockFile(storageDir, blks[0].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile.exists());
+    // corrupt the block file
+    LOG.info("Deliberately corrupting file " + blkFile.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // find the second block file
+    File storageDir2 = cluster.getInstanceStorageDir(dnIndex2, 0);
+    File blkFile2 = MiniDFSCluster.getBlockFile(storageDir2, blks[1].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile2.exists());
+    // corrupt the second block file
+    LOG.info("Deliberately corrupting file " + blkFile2.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile2)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // disable the heartbeat from DN so that the corrupted block record is kept
+    // in NameNode
+    for (DataNode dataNode : cluster.getDataNodes()) {
+      DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, true);
+    }
+    try {
+      // do stateful read
+      StripedFileTestUtil.verifyStatefulRead(dfs, file, length, bytes,
+          ByteBuffer.allocate(1024));
+
+      // check whether the corruption has been reported to the NameNode
+      final FSNamesystem ns = cluster.getNamesystem();
+      final BlockManager bm = ns.getBlockManager();
+      BlockInfo blockInfo = (ns.getFSDirectory().getINode4Write(file.toString())
+          .asFile().getBlocks())[0];
+      GenericTestUtils.waitFor(() -> {
+        if (bm.getCorruptReplicas(blockInfo) == null) {
+          return false;
+        }
+        return bm.getCorruptReplicas(blockInfo).size() == 2;
+      }, 250, 60000);
+      // double check
+      Assert.assertEquals(2, bm.getCorruptReplicas(blockInfo).size());
+
+      DatanodeDescriptor dnd =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex).getDatanodeId());
+
+      DatanodeDescriptor dnd2 =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex2).getDatanodeId());
+
+      for (DataNode datanode : cluster.getDataNodes()) {
+        DataNodeTestUtils.setHeartbeatsDisabledForTests(datanode, false);

Review Comment:
   Fixed.



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,129 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  /**
+   * This unit test tries to cover the following situation:
+   * Suppose we have an EC file with RS(d,p) policy whose block group id
+   * is blk_-9223372036845119810_1920002.
+   * The first and second data blocks in this EC block group are corrupted
+   * while we read the file.
+   * This triggers the reportBadBlock RPC, which adds
+   * blk_-9223372036845119810_1920002
+   * and blk_-9223372036845119809_1920002 to corruptReplicas.
+   * It also reconstructs the two blocks and sends IBRs to the namenode,
+   * which then executes BlockManager#addStoredBlock and the
+   * invalidateCorruptReplicas method. Suppose we first receive the IBR of
+   * blk_-9223372036845119810_1920002; then invalidateCorruptReplicas
+   * will only invalidate blk_-9223372036845119809_1920002 on the two datanodes
+   * that contain the two corrupt blocks.
+   *
+   * @throws Exception
+   */
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // Find the first block file.
+    File storageDir = cluster.getInstanceStorageDir(dnIndex, 0);
+    File blkFile = MiniDFSCluster.getBlockFile(storageDir, blks[0].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile.exists());
+    // Corrupt the block file.
+    LOG.info("Deliberately corrupting file " + blkFile.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Find the second block file.
+    File storageDir2 = cluster.getInstanceStorageDir(dnIndex2, 0);
+    File blkFile2 = MiniDFSCluster.getBlockFile(storageDir2, blks[1].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile2.exists());
+    // Corrupt the second block file.
+    LOG.info("Deliberately corrupting file " + blkFile2.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile2)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Disable the heartbeat from DN so that the corrupted block record is kept
+    // in NameNode.
+    for (DataNode dataNode : cluster.getDataNodes()) {
+      DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, true);
+    }
+    try {
+      // Do stateful read.
+      StripedFileTestUtil.verifyStatefulRead(dfs, file, length, bytes,
+          ByteBuffer.allocate(1024));
+
+      // Check whether the corruption has been reported to the NameNode.
+      final FSNamesystem ns = cluster.getNamesystem();
+      final BlockManager bm = ns.getBlockManager();
+      BlockInfo blockInfo = (ns.getFSDirectory().getINode4Write(file.toString())
+          .asFile().getBlocks())[0];
+      GenericTestUtils.waitFor(() -> {
+        if (bm.getCorruptReplicas(blockInfo) == null) {
+          return false;
+        }
+        return bm.getCorruptReplicas(blockInfo).size() == 2;
+      }, 250, 60000);
+      // Double check.
+      Assert.assertEquals(2, bm.getCorruptReplicas(blockInfo).size());
+
+      DatanodeDescriptor dnd =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex).getDatanodeId());
+
+      DatanodeDescriptor dnd2 =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex2).getDatanodeId());
+
+      for (DataNode datanode : cluster.getDataNodes()) {
+        if (!datanode.getDatanodeUuid().equals(dnd.getDatanodeUuid()) ||

Review Comment:
   @zhangshuyan0 Shuyan, I think we should use || here, because when the getBlocksToInvalidateByLimit method is called, it executes `blockSet.pollN(limit)`, which removes the blocks from nodeToECBlocks, and those blocks are then added to the DatanodeDescriptor's invalidateBlocks field.




[GitHub] [hadoop] hfutatzhanghb commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1221254591


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,129 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  /**
+   * This unit test tries to cover the following situation:
+   * Suppose we have an EC file with RS(d,p) policy whose block group id
+   * is blk_-9223372036845119810_1920002.
+   * The first and second data blocks in this EC block group are corrupted
+   * while we read the file.
+   * This triggers the reportBadBlock RPC, which adds
+   * blk_-9223372036845119810_1920002
+   * and blk_-9223372036845119809_1920002 to corruptReplicas.
+   * It also reconstructs the two blocks and sends IBRs to the namenode,
+   * which then executes BlockManager#addStoredBlock and the
+   * invalidateCorruptReplicas method. Suppose we first receive the IBR of
+   * blk_-9223372036845119810_1920002; then invalidateCorruptReplicas
+   * will only invalidate blk_-9223372036845119809_1920002 on the two datanodes
+   * that contain the two corrupt blocks.
+   *
+   * @throws Exception
+   */
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // Find the first block file.
+    File storageDir = cluster.getInstanceStorageDir(dnIndex, 0);
+    File blkFile = MiniDFSCluster.getBlockFile(storageDir, blks[0].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile.exists());
+    // Corrupt the block file.
+    LOG.info("Deliberately corrupting file " + blkFile.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Find the second block file.
+    File storageDir2 = cluster.getInstanceStorageDir(dnIndex2, 0);
+    File blkFile2 = MiniDFSCluster.getBlockFile(storageDir2, blks[1].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile2.exists());
+    // Corrupt the second block file.
+    LOG.info("Deliberately corrupting file " + blkFile2.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile2)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Disable the heartbeat from DN so that the corrupted block record is kept
+    // in NameNode.
+    for (DataNode dataNode : cluster.getDataNodes()) {
+      DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, true);
+    }
+    try {
+      // Do stateful read.
+      StripedFileTestUtil.verifyStatefulRead(dfs, file, length, bytes,
+          ByteBuffer.allocate(1024));
+
+      // Check whether the corruption has been reported to the NameNode.
+      final FSNamesystem ns = cluster.getNamesystem();
+      final BlockManager bm = ns.getBlockManager();
+      BlockInfo blockInfo = (ns.getFSDirectory().getINode4Write(file.toString())
+          .asFile().getBlocks())[0];
+      GenericTestUtils.waitFor(() -> {
+        if (bm.getCorruptReplicas(blockInfo) == null) {
+          return false;
+        }
+        return bm.getCorruptReplicas(blockInfo).size() == 2;
+      }, 250, 60000);
+      // Double check.
+      Assert.assertEquals(2, bm.getCorruptReplicas(blockInfo).size());
+
+      DatanodeDescriptor dnd =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex).getDatanodeId());
+
+      DatanodeDescriptor dnd2 =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex2).getDatanodeId());
+
+      for (DataNode datanode : cluster.getDataNodes()) {
+        if (!datanode.getDatanodeUuid().equals(dnd.getDatanodeUuid()) ||

Review Comment:
   @zhangshuyan0 Thanks, Shuyan, for the review; fixed.
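   As background for the block ids in the Javadoc above: the internal
   (per-DataNode) blocks of a striped group take consecutive ids starting
   from the block group id, which is why blk_-9223372036845119810_1920002
   and blk_-9223372036845119809_1920002 belong to the same group. A minimal
   self-contained sketch of that arithmetic (illustrative only; the class
   name below is made up):

       // Demonstrates the id convention assumed in the Javadoc:
       // internal block id = block group id + block index within the group.
       public class StripedIdDemo {
         public static void main(String[] args) {
           long blockGroupId = -9223372036845119810L;
           for (int blockIndex = 0; blockIndex < 2; blockIndex++) {
             long internalId = blockGroupId + blockIndex;
             System.out.println("internal block " + blockIndex + ": blk_"
                 + internalId + "_1920002");
           }
         }
       }

   Running it prints exactly the two ids quoted above.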
   




[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1578868127

   Thanks a lot, sir.
   
   
   
   



[GitHub] [hadoop] Hexiaoqiao merged pull request #5643: HDFS-17003. Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao merged PR #5643:
URL: https://github.com/apache/hadoop/pull/5643



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1556424290

   @Hexiaoqiao Sir, thanks for your suggestions.
   1. The failed UT passed in my local environment.
   2. I have fixed the checkstyle problems.
   Thanks again.



[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1556699136

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  28m  0s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 30s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  6s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 52s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  0s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m  7s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 205m  1s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/9/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 332m 15s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/9/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 2f3718ab399e 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / def09a293956c3609ceeb2d786c208a99af42a9a |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/9/testReport/ |
   | Max. process+thread count | 3637 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/9/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] sodonnel commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1578862483

   CI just came back green as I typed my review. I am happy with this now, but I will let others have a look too.



[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1580962363

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 37s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  1s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 15s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 33s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 41s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 52s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  7s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 29s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 200m 39s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 309m 53s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/12/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux ed32e42e0842 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9b31f7a19e59013b456429ad31213620575d02c1 |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/12/testReport/ |
   | Max. process+thread count | 3513 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/12/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1544110986

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 38s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m  7s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  5s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m  3s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  9s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 58s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 212m  0s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 317m 30s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 47fe60f099f4 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 11ac8ffa350cf7b60c5465f4464b0a921d78d7e2 |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/2/testReport/ |
   | Max. process+thread count | 3005 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1551205998

   > @hfutatzhanghb Thanks for involving me here. It makes sense to me. +1 to "add a test somehow to reproduce the issue and check if it is fixed".
   > 
   > > In other words, if a replica is corrupt due to GENSTAMP_MISMATCH, can the check on the datanode pass successfully after receiving the command (Line91 below)?
   > 
   > To @zhangshuyan0, IMO this is another case (maybe both replication and EC have the issue); let's file another issue to track it and try to fix it. Thanks again.
   
   @Hexiaoqiao Thanks for reviewing. I have added a UT to reproduce this case.
   @sodonnel, PTAL, thanks a lot.
   
   ![image](https://github.com/apache/hadoop/assets/25115709/8b2f6586-fd32-424f-814f-b22ffd775016)
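   For context, the gist of the fix under review is roughly the following:
   when invalidating corrupt replicas of a striped block group, derive the
   internal block each DataNode actually stores instead of reusing the
   single reported block id for every node. A hedged sketch in the style of
   BlockManager (illustrative only, not the committed patch; the helper
   names reflect the HDFS code discussed here):

       // Sketch only: assumed to live inside BlockManager, so that
       // corruptReplicas and addToInvalidates are in scope; assumed
       // imports: java.util.*, org.apache.hadoop.hdfs.protocol.Block,
       // and the server.blockmanagement classes.
       private void invalidateCorruptReplicasSketch(BlockInfo blk) {
         Collection<DatanodeDescriptor> nodes = corruptReplicas.getNodes(blk);
         if (nodes == null) {
           return;
         }
         // Iterate over a copy, since scheduling invalidations may
         // mutate the underlying collection.
         for (DatanodeDescriptor node : new ArrayList<>(nodes)) {
           Block toInvalidate = blk;
           if (blk.isStriped()) {
             // Each DataNode stores a different internal block of the
             // group; find the internal block this node actually holds.
             for (DatanodeStorageInfo storage : node.getStorageInfos()) {
               Block internal =
                   ((BlockInfoStriped) blk).getBlockOnStorage(storage);
               if (internal != null) {
                 toInvalidate = internal;
                 break;
               }
             }
           }
           addToInvalidates(toInvalidate, node);
         }
       }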
   



[GitHub] [hadoop] Hexiaoqiao commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1198901155


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,108 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // find the first block file

Review Comment:
   Please use a capital letter at the beginning of each sentence and a period at the end, for all comments.



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,108 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  @Test

Review Comment:
   I just suggest adding some Javadoc to the new unit test describing the case you want to cover.




[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1578857511

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 35s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 23s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  5s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 27s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 23s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 13s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 210m 36s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 312m 30s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/11/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 753a1c89b76d 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 79e018cbd41774be75f9b1dd3304f90aba3d16c5 |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/11/testReport/ |
   | Max. process+thread count | 3116 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/11/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1578249636

   > Should we skip dnd and dnd2 when turning the heartbeat back on here? Otherwise the corrupt replicas may be deleted from InvalidateBlocks if delete commands are sent.
   
   @zhangshuyan0 Hi, Shuyan. I cannot see your comment on this page, so sorry about that.
   Thanks for your suggestion; I agree with you and have fixed it (a minimal sketch of the change follows).
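   A minimal sketch of that change in the test, assuming the dnd/dnd2
   variables from the test excerpt above (not necessarily the exact
   committed lines): re-enable heartbeats on every DataNode except the two
   holding the corrupt replicas, so pending delete commands are not
   dispatched to them.

       for (DataNode dataNode : cluster.getDataNodes()) {
         boolean holdsCorruptReplica =
             dataNode.getDatanodeUuid().equals(dnd.getDatanodeUuid())
                 || dataNode.getDatanodeUuid().equals(dnd2.getDatanodeUuid());
         if (!holdsCorruptReplica) {
           // Turn heartbeats back on only for the unaffected DataNodes.
           DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, false);
         }
       }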
   



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1545343969

   Hi, @sodonnel. Sorry for the late reply, and thanks for your review. I totally agree with you. The current fix still has problems; I will keep looking into it. Thanks again for your advice.



[GitHub] [hadoop] hfutatzhanghb commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1198937622


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,108 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  @Test

Review Comment:
   @Hexiaoqiao, sir, I have added some Javadoc. Thanks for your suggestions.




[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1555054384

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 35s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 15s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 34s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  3s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 53s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/8/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 6 new + 79 unchanged - 0 fixed = 85 total (was 79)  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 16s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 204m 29s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/8/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 50s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 307m 40s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/8/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 1a8d1fdd7f16 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ccf3a53aa5050c4259ff85b6970b8f60b5410f50 |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/8/testReport/ |
   | Max. process+thread count | 3332 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/8/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] Hexiaoqiao commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1556147416

   > Please use a capital letter at the beginning of each sentence and a period at the end, for all comments.
   
   I just suggest improving all the added comments along these lines if necessary (including those in some other PRs you have submitted). Please also check the checkstyle warnings and the failed unit test to see whether they are related to these changes. Thanks.



[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1549538635

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 21s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 10s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  4s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 10s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 203m 45s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 303m 38s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 3ef13ed58f02 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 2fa0f203344c7b4dc45a96b4d5590c5ccae87b55 |
   | Default Java | Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/3/testReport/ |
   | Max. process+thread count | 3441 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1549846297

   > Have you validated this change works correctly, as I did not test the code when I wrote it?
   > 
   > Also, do you think you could add a test somehow to reproduce the issue and also that the fix corrects it?
   
   @sodonnel, sir, I tested this code in our cluster; everything looks good. Yes, it is better to add a UT to reproduce the issue.
   I will finish it soon.



[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1551764563

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  15m 46s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  1s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 10s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 39s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 27s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 51s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/5/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 79 unchanged - 0 fixed = 80 total (was 79)  |
   | +1 :green_heart: |  mvnsite  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  9s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 28s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 225m 10s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 52s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 340m 42s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
   |   | hadoop.hdfs.TestErasureCodingPolicies |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux a7f33bbb23b9 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0a33f491412c6dd0cac6ee79dae20c8a6fc4f78f |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/5/testReport/ |
   | Max. process+thread count | 3420 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/5/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] Hexiaoqiao commented on pull request #5643: HDFS-17003. Erasure Coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "Hexiaoqiao (via GitHub)" <gi...@apache.org>.
Hexiaoqiao commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1582282121

   Committed to trunk. Thanks @hfutatzhanghb for your contributions, and @zhangshuyan0 and @sodonnel for the reviews!



[GitHub] [hadoop] zhangshuyan0 commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "zhangshuyan0 (via GitHub)" <gi...@apache.org>.
zhangshuyan0 commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1221223343


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,129 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  /**
+   * This unit test covers the following situation:
+   * Suppose we have an EC file with an RS(d,p) policy whose block group
+   * id is blk_-9223372036845119810_1920002, and the first and second
+   * data blocks in this EC block group are corrupted while we read the
+   * file.
+   * Reading triggers the reportBadBlocks RPC, which adds
+   * blk_-9223372036845119810_1920002
+   * and blk_-9223372036845119809_1920002 to corruptReplicas.
+   * It also reconstructs the two blocks and sends IBRs to the namenode,
+   * which then executes BlockManager#addStoredBlock and the
+   * invalidateCorruptReplicas method. Suppose the IBR of
+   * blk_-9223372036845119810_1920002 is received first; then
+   * invalidateCorruptReplicas invalidates only
+   * blk_-9223372036845119809_1920002 on the two datanodes that contain
+   * the two corrupt blocks.
+   *
+   * @throws Exception
+   */
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // Find the first block file.
+    File storageDir = cluster.getInstanceStorageDir(dnIndex, 0);
+    File blkFile = MiniDFSCluster.getBlockFile(storageDir, blks[0].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile.exists());
+    // Corrupt the block file.
+    LOG.info("Deliberately corrupting file " + blkFile.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Find the second block file.
+    File storageDir2 = cluster.getInstanceStorageDir(dnIndex2, 0);
+    File blkFile2 = MiniDFSCluster.getBlockFile(storageDir2, blks[1].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile2.exists());
+    // Corrupt the second block file.
+    LOG.info("Deliberately corrupting file " + blkFile2.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile2)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Disable the heartbeat from DN so that the corrupted block record is kept
+    // in NameNode.
+    for (DataNode dataNode : cluster.getDataNodes()) {
+      DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, true);
+    }
+    try {
+      // Do stateful read.
+      StripedFileTestUtil.verifyStatefulRead(dfs, file, length, bytes,
+          ByteBuffer.allocate(1024));
+
+      // Check whether the corruption has been reported to the NameNode.
+      final FSNamesystem ns = cluster.getNamesystem();
+      final BlockManager bm = ns.getBlockManager();
+      BlockInfo blockInfo = (ns.getFSDirectory().getINode4Write(file.toString())
+          .asFile().getBlocks())[0];
+      GenericTestUtils.waitFor(() -> {
+        if (bm.getCorruptReplicas(blockInfo) == null) {
+          return false;
+        }
+        return bm.getCorruptReplicas(blockInfo).size() == 2;
+      }, 250, 60000);
+      // Double check.
+      Assert.assertEquals(2, bm.getCorruptReplicas(blockInfo).size());
+
+      DatanodeDescriptor dnd =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex).getDatanodeId());
+
+      DatanodeDescriptor dnd2 =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex2).getDatanodeId());
+
+      for (DataNode datanode : cluster.getDataNodes()) {
+        if (!datanode.getDatanodeUuid().equals(dnd.getDatanodeUuid()) ||

Review Comment:
   Sorry, I do not understand. Is this condition (lines 269-270) always true?





[GitHub] [hadoop] zhangshuyan0 commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "zhangshuyan0 (via GitHub)" <gi...@apache.org>.
zhangshuyan0 commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1553963327

   To @Hexiaoqiao: Thanks for your suggestion. I created a [JIRA](https://issues.apache.org/jira/browse/HDFS-17021) for this issue and will try to fix it later.




[GitHub] [hadoop] hfutatzhanghb commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1198906389


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,108 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // find the first block file

Review Comment:
   Fixed.  Thanks.





[GitHub] [hadoop] hfutatzhanghb commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1562543360

   @ayushtkn Sir, I am sorry for involving you here so late. Could you please also help review this PR? Thanks a lot.




[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1551757561

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |  17m 19s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 34s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 11s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 57s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  0s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  0s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 49s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/4/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 79 unchanged - 0 fixed = 80 total (was 79)  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m  5s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 50s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 221m 14s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 337m 58s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
   |   | hadoop.hdfs.TestErasureCodingPolicies |
   |   | hadoop.hdfs.qjournal.TestNNWithQJM |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 898259b5f727 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0a33f491412c6dd0cac6ee79dae20c8a6fc4f78f |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/4/testReport/ |
   | Max. process+thread count | 3667 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/4/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] hfutatzhanghb commented on a diff in pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hfutatzhanghb (via GitHub)" <gi...@apache.org>.
hfutatzhanghb commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1221253077


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,129 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
     }
   }
 
+  /**
+   * This unit test covers the following situation:
+   * Suppose we have an EC file with an RS(d,p) policy whose block group
+   * id is blk_-9223372036845119810_1920002, and the first and second
+   * data blocks in this EC block group are corrupted while we read the
+   * file.
+   * Reading triggers the reportBadBlocks RPC, which adds
+   * blk_-9223372036845119810_1920002
+   * and blk_-9223372036845119809_1920002 to corruptReplicas.
+   * It also reconstructs the two blocks and sends IBRs to the namenode,
+   * which then executes BlockManager#addStoredBlock and the
+   * invalidateCorruptReplicas method. Suppose the IBR of
+   * blk_-9223372036845119810_1920002 is received first; then
+   * invalidateCorruptReplicas invalidates only
+   * blk_-9223372036845119809_1920002 on the two datanodes that contain
+   * the two corrupt blocks.
+   *
+   * @throws Exception
+   */
+  @Test
+  public void testCorruptionECBlockInvalidate() throws Exception {
+
+    final Path file = new Path("/invalidate_corrupted");
+    final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+    final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+    DFSTestUtil.writeFile(dfs, file, bytes);
+
+    int dnIndex = findFirstDataNode(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS);
+    int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+        CELL_SIZE * NUM_DATA_UNITS, 2);
+    Assert.assertNotEquals(-1, dnIndex);
+    Assert.assertNotEquals(-1, dnIndex2);
+
+    LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+        .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+        .get(0);
+    final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+        CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+    final Block b = blks[0].getBlock().getLocalBlock();
+    final Block b2 = blks[1].getBlock().getLocalBlock();
+
+    // Find the first block file.
+    File storageDir = cluster.getInstanceStorageDir(dnIndex, 0);
+    File blkFile = MiniDFSCluster.getBlockFile(storageDir, blks[0].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile.exists());
+    // Corrupt the block file.
+    LOG.info("Deliberately corrupting file " + blkFile.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Find the second block file.
+    File storageDir2 = cluster.getInstanceStorageDir(dnIndex2, 0);
+    File blkFile2 = MiniDFSCluster.getBlockFile(storageDir2, blks[1].getBlock());
+    Assert.assertTrue("Block file does not exist", blkFile2.exists());
+    // Corrupt the second block file.
+    LOG.info("Deliberately corrupting file " + blkFile2.getName());
+    try (FileOutputStream out = new FileOutputStream(blkFile2)) {
+      out.write("corruption".getBytes());
+      out.flush();
+    }
+
+    // Disable the heartbeat from DN so that the corrupted block record is kept
+    // in NameNode.
+    for (DataNode dataNode : cluster.getDataNodes()) {
+      DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, true);
+    }
+    try {
+      // Do stateful read.
+      StripedFileTestUtil.verifyStatefulRead(dfs, file, length, bytes,
+          ByteBuffer.allocate(1024));
+
+      // Check whether the corruption has been reported to the NameNode.
+      final FSNamesystem ns = cluster.getNamesystem();
+      final BlockManager bm = ns.getBlockManager();
+      BlockInfo blockInfo = (ns.getFSDirectory().getINode4Write(file.toString())
+          .asFile().getBlocks())[0];
+      GenericTestUtils.waitFor(() -> {
+        if (bm.getCorruptReplicas(blockInfo) == null) {
+          return false;
+        }
+        return bm.getCorruptReplicas(blockInfo).size() == 2;
+      }, 250, 60000);
+      // Double check.
+      Assert.assertEquals(2, bm.getCorruptReplicas(blockInfo).size());
+
+      DatanodeDescriptor dnd =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex).getDatanodeId());
+
+      DatanodeDescriptor dnd2 =
+          NameNodeAdapter.getDatanode(ns, cluster.getDataNodes().get(dnIndex2).getDatanodeId());
+
+      for (DataNode datanode : cluster.getDataNodes()) {
+        if (!datanode.getDatanodeUuid().equals(dnd.getDatanodeUuid()) ||

Review Comment:
   OMG, I am sorry, I mistook your review as referring to lines 276-277. My bad.
   Yes, this should be &&. I will fix it.
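   
   A minimal standalone demonstration of the point, with made-up UUID strings standing in for the datanode UUIDs (a sketch, not test code from the patch): when the two reference nodes differ, the `||` form is true for every datanode, while `&&` expresses the intended "neither of the two corrupt nodes" check.
   
   ```java
   // Made-up UUIDs; dnd and dnd2 stand for the two datanodes holding the
   // corrupted internal blocks in the test above.
   public class ConditionSketch {
     public static void main(String[] args) {
       String dnd = "uuid-A";
       String dnd2 = "uuid-B";
       for (String uuid : new String[] {"uuid-A", "uuid-B", "uuid-C"}) {
         // Always true when dnd != dnd2: uuid would have to equal both
         // reference UUIDs at once to make it false (De Morgan).
         boolean orForm = !uuid.equals(dnd) || !uuid.equals(dnd2);
         // True only for datanodes that are neither dnd nor dnd2.
         boolean andForm = !uuid.equals(dnd) && !uuid.equals(dnd2);
         System.out.println(uuid + ": ||=" + orForm + ", &&=" + andForm);
       }
     }
   }
   ```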





[GitHub] [hadoop] hadoop-yetus commented on pull request #5643: HDFS-17003. Erasure coding: invalidate wrong block after reporting bad blocks from datanode

Posted by "hadoop-yetus (via GitHub)" <gi...@apache.org>.
hadoop-yetus commented on PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1578856600

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 35s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m  7s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   1m 14s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 23s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 46s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   3m 13s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 18s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 208m  8s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 312m 42s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/10/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5643 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 82caaaaa8397 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 79e018cbd41774be75f9b1dd3304f90aba3d16c5 |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/10/testReport/ |
   | Max. process+thread count | 3253 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5643/10/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

