Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/12/01 10:34:59 UTC

[GitHub] [ozone] aryangupta1998 opened a new pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundExc…

aryangupta1998 opened a new pull request #1643:
URL: https://github.com/apache/ozone/pull/1643


   …eption.
   
   ## What changes were proposed in this pull request?
   
   SCMBlockDeletingService terminates its task when it receives a ContainerNotFoundException. As a result, SCM stops deleting blocks.
   
   > 2020-11-09 23:53:10,026 ERROR org.apache.hadoop.hdds.scm.block.SCMBlockDeletingService: Failed to get block deletion transactions from delTX log
   org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: Container with id #193 not found.
           at org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.checkIfContainerExist(ContainerStateMap.java:542)
           at org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.getContainerInfo(ContainerStateMap.java:188)
           at org.apache.hadoop.hdds.scm.container.ContainerStateManager.getContainer(ContainerStateManager.java:499)
           at org.apache.hadoop.hdds.scm.container.SCMContainerManager.getContainer(SCMContainerManager.java:212)
           at org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl.getTransactions(DeletedBlockLogImpl.java:364)
           at org.apache.hadoop.hdds.scm.block.SCMBlockDeletingService$DeletedBlockTransactionScanner.call(SCMBlockDeletingService.java:126)
           at org.apache.hadoop.hdds.scm.block.SCMBlockDeletingService$DeletedBlockTransactionScanner.call(SCMBlockDeletingService.java:106)
           at org.apache.hadoop.hdds.utils.BackgroundService$PeriodicalTask.lambda$run$0(BackgroundService.java:112)
           at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
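
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the Ozone code: `Txn`, `MissingContainerException`, `isOpen`, and `scan` are hypothetical stand-ins, and a plain map plays the role of the container manager. It shows one way to catch the exception per transaction, so that a single missing container does not abort the whole scan, while remembering the orphaned transactions for later purging.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; none of these types are the real Ozone classes.
final class SkipMissingContainerSketch {

  /** Stand-in for a checked "container not found" exception. */
  static final class MissingContainerException extends Exception {
    MissingContainerException(long id) {
      super("Container with id #" + id + " not found.");
    }
  }

  /** Stand-in for a deletion transaction: its ID and the container it targets. */
  static final class Txn {
    final long txId;
    final long containerId;

    Txn(long txId, long containerId) {
      this.txId = txId;
      this.containerId = containerId;
    }
  }

  /** Looks up a container's open flag; a missing key models an unknown container. */
  static boolean isOpen(Map<Long, Boolean> containers, long id)
      throws MissingContainerException {
    Boolean open = containers.get(id);
    if (open == null) {
      throw new MissingContainerException(id);
    }
    return open;
  }

  /**
   * Returns the IDs of transactions whose container exists and is not open.
   * Transactions that point at a missing container are added to toPurge
   * instead of aborting the loop.
   */
  static List<Long> scan(List<Txn> log, Map<Long, Boolean> containers,
      List<Long> toPurge) {
    List<Long> selected = new ArrayList<>();
    for (Txn txn : log) {
      try {
        if (!isOpen(containers, txn.containerId)) {
          selected.add(txn.txId);
        }
      } catch (MissingContainerException ex) {
        toPurge.add(txn.txId);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    Map<Long, Boolean> containers = new HashMap<>();
    containers.put(1L, false); // closed container
    containers.put(2L, true);  // open container

    List<Txn> log = new ArrayList<>();
    log.add(new Txn(10, 1));
    log.add(new Txn(11, 193)); // container 193 is unknown
    log.add(new Txn(12, 2));
    log.add(new Txn(13, 1));

    List<Long> toPurge = new ArrayList<>();
    List<Long> selected = scan(log, containers, toPurge);
    System.out.println("selected=" + selected + " toPurge=" + toPurge);
  }
}
```

In this sketch the transaction for the unknown container ends up in `toPurge`, and the transactions after it are still examined.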
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-4447
   
   ## How was this patch tested?
   
   Tested manually.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] aryangupta1998 commented on pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundException

Posted by GitBox <gi...@apache.org>.
aryangupta1998 commented on pull request #1643:
URL: https://github.com/apache/ozone/pull/1643#issuecomment-738838944


   @lokeshj1703 Thanks for reviewing the PR, I have made the required changes.
   
   
   




[GitHub] [ozone] lokeshj1703 closed pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundException

Posted by GitBox <gi...@apache.org>.
lokeshj1703 closed pull request #1643:
URL: https://github.com/apache/ozone/pull/1643


   




[GitHub] [ozone] aryangupta1998 commented on a change in pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundException

Posted by GitBox <gi...@apache.org>.
aryangupta1998 commented on a change in pull request #1643:
URL: https://github.com/apache/ozone/pull/1643#discussion_r535141055



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java
##########
@@ -354,26 +356,43 @@ public DatanodeDeletedBlockTransactions getTransactions(
           ? extends Table.KeyValue<Long, DeletedBlocksTransaction>> iter =
                scmMetadataStore.getDeletedBlocksTXTable().iterator()) {
         int numBlocksAdded = 0;
+        BatchOperation batch =
+            scmMetadataStore.getBatchHandler().initBatchOperation();
+        List<DeletedBlocksTransaction> delTx =
+            new ArrayList<>();
         while (iter.hasNext() && numBlocksAdded < blockDeletionLimit) {
-          Table.KeyValue<Long, DeletedBlocksTransaction> keyValue =
-              iter.next();
+          Table.KeyValue<Long, DeletedBlocksTransaction> keyValue = iter.next();
           DeletedBlocksTransaction txn = keyValue.getValue();
           final ContainerID id = ContainerID.valueof(txn.getContainerID());
-          if (txn.getCount() > -1 && txn.getCount() <= maxRetry
-              && !containerManager.getContainer(id).isOpen()) {
-            numBlocksAdded += txn.getLocalIDCount();
-            getTransaction(txn, transactions);
-            transactionToDNsCommitMap
-                .putIfAbsent(txn.getTxID(), new LinkedHashSet<>());
+          try {
+            if (txn.getCount() > -1 && txn.getCount() <= maxRetry
+                && !containerManager.getContainer(id).isOpen()) {
+              numBlocksAdded += txn.getLocalIDCount();
+              getTransaction(txn, transactions);
+              transactionToDNsCommitMap
+                  .putIfAbsent(txn.getTxID(), new LinkedHashSet<>());
+            }
+          } catch (ContainerNotFoundException ex) {
+            delTx.add(txn);
           }
         }
+        deleteTransaction(delTx, batch);
+        scmMetadataStore.getBatchHandler().commitBatchOperation(batch);
       }
       return transactions;
     } finally {
       lock.unlock();
     }
   }
 
+  public void deleteTransaction(List<DeletedBlocksTransaction> delTx,

Review comment:
       Created a batch operation inside the function and renamed the function to purgeTransactions.






[GitHub] [ozone] lokeshj1703 commented on pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundException

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on pull request #1643:
URL: https://github.com/apache/ozone/pull/1643#issuecomment-739813917


   @aryangupta1998 Thanks for the contribution! @runzhiwang Thanks for the review! I have committed the PR to the master branch.




[GitHub] [ozone] aryangupta1998 commented on a change in pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundException

Posted by GitBox <gi...@apache.org>.
aryangupta1998 commented on a change in pull request #1643:
URL: https://github.com/apache/ozone/pull/1643#discussion_r535141758



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java
##########
@@ -354,26 +356,43 @@ public DatanodeDeletedBlockTransactions getTransactions(
           ? extends Table.KeyValue<Long, DeletedBlocksTransaction>> iter =
                scmMetadataStore.getDeletedBlocksTXTable().iterator()) {
         int numBlocksAdded = 0;
+        BatchOperation batch =
+            scmMetadataStore.getBatchHandler().initBatchOperation();
+        List<DeletedBlocksTransaction> delTx =

Review comment:
       Renamed the list to txnsToBePurged.






[GitHub] [ozone] aryangupta1998 commented on a change in pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundException

Posted by GitBox <gi...@apache.org>.
aryangupta1998 commented on a change in pull request #1643:
URL: https://github.com/apache/ozone/pull/1643#discussion_r535139125



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java
##########
@@ -354,26 +356,43 @@ public DatanodeDeletedBlockTransactions getTransactions(
           ? extends Table.KeyValue<Long, DeletedBlocksTransaction>> iter =
                scmMetadataStore.getDeletedBlocksTXTable().iterator()) {
         int numBlocksAdded = 0;
+        BatchOperation batch =
+            scmMetadataStore.getBatchHandler().initBatchOperation();
+        List<DeletedBlocksTransaction> delTx =
+            new ArrayList<>();
         while (iter.hasNext() && numBlocksAdded < blockDeletionLimit) {
-          Table.KeyValue<Long, DeletedBlocksTransaction> keyValue =
-              iter.next();
+          Table.KeyValue<Long, DeletedBlocksTransaction> keyValue = iter.next();
           DeletedBlocksTransaction txn = keyValue.getValue();
           final ContainerID id = ContainerID.valueof(txn.getContainerID());
-          if (txn.getCount() > -1 && txn.getCount() <= maxRetry
-              && !containerManager.getContainer(id).isOpen()) {
-            numBlocksAdded += txn.getLocalIDCount();
-            getTransaction(txn, transactions);
-            transactionToDNsCommitMap
-                .putIfAbsent(txn.getTxID(), new LinkedHashSet<>());
+          try {
+            if (txn.getCount() > -1 && txn.getCount() <= maxRetry
+                && !containerManager.getContainer(id).isOpen()) {
+              numBlocksAdded += txn.getLocalIDCount();
+              getTransaction(txn, transactions);
+              transactionToDNsCommitMap
+                  .putIfAbsent(txn.getTxID(), new LinkedHashSet<>());
+            }
+          } catch (ContainerNotFoundException ex) {

Review comment:
       Added LOG.






[GitHub] [ozone] lokeshj1703 commented on a change in pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundException

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #1643:
URL: https://github.com/apache/ozone/pull/1643#discussion_r535076573



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java
##########
@@ -354,26 +356,43 @@ public DatanodeDeletedBlockTransactions getTransactions(
           ? extends Table.KeyValue<Long, DeletedBlocksTransaction>> iter =
                scmMetadataStore.getDeletedBlocksTXTable().iterator()) {
         int numBlocksAdded = 0;
+        BatchOperation batch =
+            scmMetadataStore.getBatchHandler().initBatchOperation();
+        List<DeletedBlocksTransaction> delTx =

Review comment:
       Let's rename the list to "txnsToBePurged" or something better.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java
##########
@@ -354,26 +356,43 @@ public DatanodeDeletedBlockTransactions getTransactions(
           ? extends Table.KeyValue<Long, DeletedBlocksTransaction>> iter =
                scmMetadataStore.getDeletedBlocksTXTable().iterator()) {
         int numBlocksAdded = 0;
+        BatchOperation batch =
+            scmMetadataStore.getBatchHandler().initBatchOperation();
+        List<DeletedBlocksTransaction> delTx =
+            new ArrayList<>();
         while (iter.hasNext() && numBlocksAdded < blockDeletionLimit) {
-          Table.KeyValue<Long, DeletedBlocksTransaction> keyValue =
-              iter.next();
+          Table.KeyValue<Long, DeletedBlocksTransaction> keyValue = iter.next();
           DeletedBlocksTransaction txn = keyValue.getValue();
           final ContainerID id = ContainerID.valueof(txn.getContainerID());
-          if (txn.getCount() > -1 && txn.getCount() <= maxRetry
-              && !containerManager.getContainer(id).isOpen()) {
-            numBlocksAdded += txn.getLocalIDCount();
-            getTransaction(txn, transactions);
-            transactionToDNsCommitMap
-                .putIfAbsent(txn.getTxID(), new LinkedHashSet<>());
+          try {
+            if (txn.getCount() > -1 && txn.getCount() <= maxRetry
+                && !containerManager.getContainer(id).isOpen()) {
+              numBlocksAdded += txn.getLocalIDCount();
+              getTransaction(txn, transactions);
+              transactionToDNsCommitMap
+                  .putIfAbsent(txn.getTxID(), new LinkedHashSet<>());
+            }
+          } catch (ContainerNotFoundException ex) {
+            delTx.add(txn);
           }
         }
+        deleteTransaction(delTx, batch);
+        scmMetadataStore.getBatchHandler().commitBatchOperation(batch);
       }
       return transactions;
     } finally {
       lock.unlock();
     }
   }
 
+  public void deleteTransaction(List<DeletedBlocksTransaction> delTx,

Review comment:
       Since we are passing a list, let's create a BatchOperation in the function itself. Let's rename the function to purgeTransactions.
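
A rough sketch of the shape this suggestion points at, under stated assumptions: `TxTable` and its inner `Batch` are hypothetical in-memory stand-ins (not the Ozone `Table` or `BatchOperation` types), and only the names `purgeTransactions` and `txnsToBePurged` come from this thread. The idea shown is that the caller hands over the list and the function owns the batch from creation to commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins only; not the real Ozone metadata store API.
final class PurgeTransactionsSketch {

  /** In-memory stand-in for the deleted-blocks table, keyed by transaction ID. */
  static final class TxTable {
    final Map<Long, String> rows = new TreeMap<>();

    /** Hypothetical batch: queues deletes and applies them together on commit. */
    final class Batch {
      private final List<Long> pendingDeletes = new ArrayList<>();

      void delete(long txId) {
        pendingDeletes.add(txId);
      }

      void commit() {
        for (long txId : pendingDeletes) {
          rows.remove(txId);
        }
        pendingDeletes.clear();
      }
    }

    Batch newBatch() {
      return new Batch();
    }
  }

  /** Creates the batch internally, so callers only pass the list to purge. */
  static void purgeTransactions(TxTable table, List<Long> txnsToBePurged) {
    TxTable.Batch batch = table.newBatch();
    for (long txId : txnsToBePurged) {
      batch.delete(txId);
    }
    batch.commit();
  }

  public static void main(String[] args) {
    TxTable table = new TxTable();
    table.rows.put(10L, "txn-10");
    table.rows.put(11L, "txn-11");
    table.rows.put(12L, "txn-12");

    purgeTransactions(table, Arrays.asList(11L));
    System.out.println(table.rows.keySet());
  }
}
```

Keeping the batch inside the function means a caller cannot forget to commit it or accidentally reuse it for unrelated writes.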






[GitHub] [ozone] lokeshj1703 commented on pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundException

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on pull request #1643:
URL: https://github.com/apache/ozone/pull/1643#issuecomment-738808816


   @aryangupta1998 I forgot to add that BatchOperation should be enclosed in a try block. Can you please make that change as well? Thanks!
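
One common way to enclose a batch in a try block is try-with-resources, sketched below. This assumes the batch type is `AutoCloseable`; the `Batch` and `Store` classes here are hypothetical stand-ins written for this example, not the Ozone classes, and the `failCommits` flag exists only to simulate an I/O error.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins only; the real batch and store types may differ.
final class BatchTrySketch {

  /** Stand-in batch that records whether it has been closed. */
  static final class Batch implements AutoCloseable {
    boolean closed;
    final List<Long> deletes = new ArrayList<>();

    void delete(long txId) {
      deletes.add(txId);
    }

    @Override
    public void close() {
      closed = true; // a real batch would release its underlying resources here
    }
  }

  /** Stand-in store; failCommits simulates an I/O failure during commit. */
  static final class Store {
    boolean failCommits;
    Batch lastBatch;
    final Set<Long> rows = new TreeSet<>();

    Batch initBatch() {
      lastBatch = new Batch();
      return lastBatch;
    }

    void commit(Batch batch) throws IOException {
      if (failCommits) {
        throw new IOException("simulated commit failure");
      }
      rows.removeAll(batch.deletes);
    }
  }

  /** The batch is closed on both the success path and the failure path. */
  static void purgeTransactions(Store store, List<Long> txnsToBePurged)
      throws IOException {
    try (Batch batch = store.initBatch()) {
      for (long txId : txnsToBePurged) {
        batch.delete(txId);
      }
      store.commit(batch);
    }
  }

  public static void main(String[] args) {
    Store store = new Store();
    store.rows.addAll(Arrays.asList(10L, 11L));
    store.failCommits = true;
    try {
      purgeTransactions(store, Arrays.asList(11L));
    } catch (IOException ex) {
      System.out.println("commit failed, batch closed: " + store.lastBatch.closed);
    }
  }
}
```

Without the try block, an exception thrown between creating and committing the batch would skip any cleanup written after the commit call.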




[GitHub] [ozone] runzhiwang commented on a change in pull request #1643: HDDS-4447. SCMBlockDeletingService should handle ContainerNotFoundExc…

Posted by GitBox <gi...@apache.org>.
runzhiwang commented on a change in pull request #1643:
URL: https://github.com/apache/ozone/pull/1643#discussion_r533395644



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/block/DeletedBlockLogImpl.java
##########
@@ -354,26 +356,43 @@ public DatanodeDeletedBlockTransactions getTransactions(
           ? extends Table.KeyValue<Long, DeletedBlocksTransaction>> iter =
                scmMetadataStore.getDeletedBlocksTXTable().iterator()) {
         int numBlocksAdded = 0;
+        BatchOperation batch =
+            scmMetadataStore.getBatchHandler().initBatchOperation();
+        List<DeletedBlocksTransaction> delTx =
+            new ArrayList<>();
         while (iter.hasNext() && numBlocksAdded < blockDeletionLimit) {
-          Table.KeyValue<Long, DeletedBlocksTransaction> keyValue =
-              iter.next();
+          Table.KeyValue<Long, DeletedBlocksTransaction> keyValue = iter.next();
           DeletedBlocksTransaction txn = keyValue.getValue();
           final ContainerID id = ContainerID.valueof(txn.getContainerID());
-          if (txn.getCount() > -1 && txn.getCount() <= maxRetry
-              && !containerManager.getContainer(id).isOpen()) {
-            numBlocksAdded += txn.getLocalIDCount();
-            getTransaction(txn, transactions);
-            transactionToDNsCommitMap
-                .putIfAbsent(txn.getTxID(), new LinkedHashSet<>());
+          try {
+            if (txn.getCount() > -1 && txn.getCount() <= maxRetry
+                && !containerManager.getContainer(id).isOpen()) {
+              numBlocksAdded += txn.getLocalIDCount();
+              getTransaction(txn, transactions);
+              transactionToDNsCommitMap
+                  .putIfAbsent(txn.getTxID(), new LinkedHashSet<>());
+            }
+          } catch (ContainerNotFoundException ex) {

Review comment:
       We need to LOG.warn when a ContainerNotFoundException happens.
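
A minimal sketch of what such a warning could look like. To stay self-contained it uses `java.util.logging` rather than the project's own logger, and the message wording, `MissingContainerException`, `lookup`, and `checkOrWarn` are all assumptions made for this example, not the code that was committed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical stand-ins only; the logger and message text are illustrative.
final class WarnOnMissingContainerSketch {

  static final Logger LOG =
      Logger.getLogger(WarnOnMissingContainerSketch.class.getName());

  /** Stand-in for a checked "container not found" exception. */
  static final class MissingContainerException extends Exception {
    MissingContainerException(long id) {
      super("Container with id #" + id + " not found.");
    }
  }

  /** Throws if the container ID is not in the known set. */
  static void lookup(Set<Long> knownContainers, long containerId)
      throws MissingContainerException {
    if (!knownContainers.contains(containerId)) {
      throw new MissingContainerException(containerId);
    }
  }

  /**
   * Returns true if the container exists. Otherwise logs a warning that names
   * both the container and the transaction, then returns false so the caller
   * can decide what to do with the transaction.
   */
  static boolean checkOrWarn(Set<Long> knownContainers, long txId,
      long containerId) {
    try {
      lookup(knownContainers, containerId);
      return true;
    } catch (MissingContainerException ex) {
      LOG.warning("Container #" + containerId + " not found for transaction "
          + txId + ": " + ex.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    Set<Long> known = new HashSet<>(Arrays.asList(1L, 2L));
    System.out.println(checkOrWarn(known, 10, 1));
    System.out.println(checkOrWarn(known, 11, 193));
  }
}
```

Including the transaction ID alongside the container ID makes it easier to correlate the warning with the entry that is later removed from the log.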



