Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2022/06/06 02:53:05 UTC

[GitHub] [hadoop] ZanderXu opened a new pull request, #4407: HDFS-16622. addRDBI in IncrementalBlockReportManager may remove the b…

ZanderXu opened a new pull request, #4407:
URL: https://github.com/apache/hadoop/pull/4407

   JIRA: [HDFS-16622](https://issues.apache.org/jira/browse/HDFS-16622).  addRDBI in IncrementalBlockReportManager may remove the block with bigger GS.
   I suspect there is a bug in `addRDBI(ReceivedDeletedBlockInfo rdbi, DatanodeStorage storage)` (line 250).
   The suspect code is the for loop:
     synchronized void addRDBI(ReceivedDeletedBlockInfo rdbi,
         DatanodeStorage storage) {
       // Make sure another entry for the same block is first removed.
       // There may only be one such entry.
       for (PerStorageIBR perStorage : pendingIBRs.values()) {
         if (perStorage.remove(rdbi.getBlock()) != null) {
           break;
         }
       }
       getPerStorageIBR(storage).put(rdbi);
     }
   The GS (generation stamp) of the block in the removed ReceivedDeletedBlockInfo may be greater than the GS of the block in rdbi. The NN will then invalidate the replica with the smaller GS when the block is completed.
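
The effect described above can be reproduced with a small standalone model. This is a minimal sketch, not HDFS code: `Blk`, `addUnconditionally`, and `pendingGs` are hypothetical names, and it assumes pending entries are matched by block ID regardless of GS, which is what the description implies.

```java
import java.util.HashMap;
import java.util.Map;

final class AddRdbiSketch {
  /** Hypothetical stand-in for a block: an ID plus a generation stamp (GS). */
  static final class Blk {
    final long id;
    final long gs;

    Blk(long id, long gs) {
      this.id = id;
      this.gs = gs;
    }
  }

  // Assumption: entries are keyed by block ID only, so two generation
  // stamps of the same block compete for a single slot.
  private final Map<Long, Blk> pending = new HashMap<>();

  /** Mirrors the quoted loop: drop any existing entry, then store the new one. */
  synchronized void addUnconditionally(Blk b) {
    pending.remove(b.id);
    pending.put(b.id, b);
  }

  /** Returns the GS currently queued for the given block ID. */
  synchronized long pendingGs(long id) {
    return pending.get(id).gs;
  }

  public static void main(String[] args) {
    AddRdbiSketch m = new AddRdbiSketch();
    m.addUnconditionally(new Blk(1L, 1002L)); // larger GS is queued first
    m.addUnconditionally(new Blk(1L, 1001L)); // smaller GS arrives later
    System.out.println(m.pendingGs(1L));      // prints 1001
  }
}
```

In this model the later call always wins, so the entry with GS 1002 is lost even though it is the newer replica state.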


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #4407: HDFS-16622. addRDBI in IncrementalBlockReportManager may remove the b…

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on PR #4407:
URL: https://github.com/apache/hadoop/pull/4407#issuecomment-1147344802

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  3s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m 18s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 44s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 41s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m  0s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 41s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 395m  0s |  |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  2s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 512m 44s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4407/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4407 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux e7745f582308 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 91f7ff3a9989a9a18398cf8c82b1e30492a86bad |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4407/1/testReport/ |
   | Max. process+thread count | 2066 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4407/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


[GitHub] [hadoop] Hexiaoqiao commented on a diff in pull request #4407: HDFS-16622. addRDBI in IncrementalBlockReportManager may remove the b…

Posted by GitBox <gi...@apache.org>.
Hexiaoqiao commented on code in PR #4407:
URL: https://github.com/apache/hadoop/pull/4407#discussion_r891245808


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/IncrementalBlockReportManager.java:
##########
@@ -251,12 +251,20 @@ synchronized void addRDBI(ReceivedDeletedBlockInfo rdbi,
       DatanodeStorage storage) {
     // Make sure another entry for the same block is first removed.
     // There may only be one such entry.
+    ReceivedDeletedBlockInfo removedInfo = null;
     for (PerStorageIBR perStorage : pendingIBRs.values()) {
-      if (perStorage.remove(rdbi.getBlock()) != null) {
+      removedInfo = perStorage.remove(rdbi.getBlock());
+      if (removedInfo != null) {
         break;
       }
     }
-    getPerStorageIBR(storage).put(rdbi);
+    if (removedInfo != null &&

Review Comment:
   My first impression is that `pendingIBRs` should hold the freshest set of `rdbis` to report to the NameNode. After this change, though, it may no longer hold the freshest data and may be inconsistent with the block data on storage, right?



[GitHub] [hadoop] ZanderXu commented on a diff in pull request #4407: HDFS-16622. addRDBI in IncrementalBlockReportManager may remove the b…

Posted by GitBox <gi...@apache.org>.
ZanderXu commented on code in PR #4407:
URL: https://github.com/apache/hadoop/pull/4407#discussion_r891298950


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/IncrementalBlockReportManager.java:
##########
@@ -251,12 +251,20 @@ synchronized void addRDBI(ReceivedDeletedBlockInfo rdbi,
       DatanodeStorage storage) {
     // Make sure another entry for the same block is first removed.
     // There may only be one such entry.
+    ReceivedDeletedBlockInfo removedInfo = null;
     for (PerStorageIBR perStorage : pendingIBRs.values()) {
-      if (perStorage.remove(rdbi.getBlock()) != null) {
+      removedInfo = perStorage.remove(rdbi.getBlock());
+      if (removedInfo != null) {
         break;
       }
     }
-    getPerStorageIBR(storage).put(rdbi);
+    if (removedInfo != null &&

Review Comment:
   We encountered a case of concurrent CloseRecovery. The CloseRecovery with the smaller GS processed the block on storage first but was added to pendingIBRs later, while the CloseRecovery with the larger GS processed the block on storage later but was added to pendingIBRs earlier. As a result, the block with the larger GS is stored on disk, but the block with the smaller GS is reported to the NameNode. Unfortunately, this was the block's only valid replica, which led to the block going missing.
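
One way to guard against this ordering is to compare generation stamps before replacing a pending entry. The following is a minimal sketch under stated assumptions, not the actual patch: the quoted diff is truncated, so the real condition is not known from this thread. `GsAwarePendingSketch`, `Entry`, `add`, and `pendingGs` are hypothetical names, and the sketch ignores the status carried by a real `ReceivedDeletedBlockInfo` (for example, a deletion), which a real change would also need to consider.

```java
import java.util.HashMap;
import java.util.Map;

final class GsAwarePendingSketch {
  /** Hypothetical pending-report entry: a block ID plus its generation stamp. */
  static final class Entry {
    final long blockId;
    final long gs;

    Entry(long blockId, long gs) {
      this.blockId = blockId;
      this.gs = gs;
    }
  }

  private final Map<Long, Entry> pending = new HashMap<>();

  /**
   * Stores the incoming entry unless an entry with a strictly larger GS is
   * already pending for the same block. Returns true if incoming was stored.
   */
  synchronized boolean add(Entry incoming) {
    Entry existing = pending.get(incoming.blockId);
    if (existing != null && existing.gs > incoming.gs) {
      return false; // incoming is stale; keep the larger GS
    }
    pending.put(incoming.blockId, incoming);
    return true;
  }

  /** Returns the GS currently queued for the given block ID. */
  synchronized long pendingGs(long blockId) {
    return pending.get(blockId).gs;
  }

  public static void main(String[] args) {
    GsAwarePendingSketch p = new GsAwarePendingSketch();
    p.add(new Entry(1L, 1002L)); // larger GS is queued first
    p.add(new Entry(1L, 1001L)); // smaller GS arrives later and is ignored
    System.out.println(p.pendingGs(1L)); // prints 1002
  }
}
```

With this guard, the out-of-order arrival described above leaves the entry with GS 1002 pending, which matches what is stored on disk in that scenario.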



[GitHub] [hadoop] Hexiaoqiao commented on a diff in pull request #4407: HDFS-16622. addRDBI in IncrementalBlockReportManager may remove the b…

Posted by GitBox <gi...@apache.org>.
Hexiaoqiao commented on code in PR #4407:
URL: https://github.com/apache/hadoop/pull/4407#discussion_r893514408


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/IncrementalBlockReportManager.java:
##########
@@ -251,12 +251,20 @@ synchronized void addRDBI(ReceivedDeletedBlockInfo rdbi,
       DatanodeStorage storage) {
     // Make sure another entry for the same block is first removed.
     // There may only be one such entry.
+    ReceivedDeletedBlockInfo removedInfo = null;
     for (PerStorageIBR perStorage : pendingIBRs.values()) {
-      if (perStorage.remove(rdbi.getBlock()) != null) {
+      removedInfo = perStorage.remove(rdbi.getBlock());
+      if (removedInfo != null) {
         break;
       }
     }
-    getPerStorageIBR(storage).put(rdbi);
+    if (removedInfo != null &&

Review Comment:
   @ZanderXu Thanks for the detailed information. It is an interesting case, and this improvement makes sense to me. Would you mind adding a unit test to cover this case?


