You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by "duongkame (via GitHub)" <gi...@apache.org> on 2023/01/26 06:41:22 UTC

[GitHub] [ozone] duongkame opened a new pull request, #4212: HDDS-7829. SCM to reject adding container to closed pipeline

duongkame opened a new pull request, #4212:
URL: https://github.com/apache/ozone/pull/4212

   ## What changes were proposed in this pull request?
   
   This is a correction of the fix for "SCM terminates when adding containers to a closed pipeline" in [HDDS-7738](https://issues.apache.org/jira/browse/HDDS-7738).
   
   In general, due to the lack of atomicity and the fact that SCM status can change between transactions, SCM should just reject transactions gracefully if some logical condition is not met.
   
   For example, when processing a transaction adding a container, SCM expects the associated pipeline to be in an `OPEN` state and rejects the container creation if the pipeline is `CLOSED`.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-7829
   
   ## How was this patch tested?
   Unit test. 
   I could not find a definite way to create an integration test for this situation, because it involves some race-condition, i.e. pipeline is closed right after being allocated to a container. Yet, I manually injected errors and verified that SCM works normally after rejecting the failed transaction.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on a diff in pull request #4212: HDDS-7829. SCM to reject adding container to closed pipeline

Posted by "errose28 (via GitHub)" <gi...@apache.org>.
errose28 commented on code in PR #4212:
URL: https://github.com/apache/ozone/pull/4212#discussion_r1092614358


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java:
##########
@@ -136,7 +137,14 @@ public CompletableFuture<Message> applyTransaction(
     try {
       final SCMRatisRequest request = SCMRatisRequest.decode(
           Message.valueOf(trx.getStateMachineLogEntry().getLogData()));
-      applyTransactionFuture.complete(process(request));
+
+      try {
+        applyTransactionFuture.complete(process(request));
+      } catch (SCMException ex) {
+        // SCMException is considered as logical rejection and is returned to
+        // Ratis client.

Review Comment:
   From `SCMException#ResultCodes`, `IO_EXCEPTION` and `INTERNAL_ERROR` should probably still crash the SCM



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] errose28 commented on pull request #4212: HDDS-7829. SCM to reject adding container to closed pipeline

Posted by "errose28 (via GitHub)" <gi...@apache.org>.
errose28 commented on PR #4212:
URL: https://github.com/apache/ozone/pull/4212#issuecomment-1413196396

   Following up on the testing part of this patch, the unit test is good, but it does not test the changes to `SCMStateMachine#applyTransaction`. It looks like this could be done by modifying `TestPipelineClose#testPipelineCloseWithClosedContainer` to use 1 node Ratis for SCM, then calling `pipelineManager#addContainerToPipeline` on the closed pipeline and check that an exception is thrown.
   
   I don't see a way to test that generic IOExceptions or SCMExceptions with fatal result codes terminate the SCM without failing the test due to JVM exit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime commented on pull request #4212: HDDS-7829. SCM to reject adding container to closed pipeline

Posted by "kerneltime (via GitHub)" <gi...@apache.org>.
kerneltime commented on PR #4212:
URL: https://github.com/apache/ozone/pull/4212#issuecomment-1408995192

   @swamirishi can you please take a look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] duongkame commented on pull request #4212: HDDS-7829. SCM to reject adding container to closed pipeline

Posted by "duongkame (via GitHub)" <gi...@apache.org>.
duongkame commented on PR #4212:
URL: https://github.com/apache/ozone/pull/4212#issuecomment-1421505742

   > Following up on the testing part of this patch, the unit test is good, but it does not test the changes to `SCMStateMachine#applyTransaction`. It looks like this could be done by modifying `TestPipelineClose#testPipelineCloseWithClosedContainer` to use 1 node Ratis for SCM, then calling `pipelineManager#addContainerToPipeline` on the closed pipeline and check that an exception is thrown.
   
   @errose28, thanks to your suggestion, I was able to create a proper integration test for the failure scenario.
   
   > I don't see a way to test that generic IOExceptions or SCMExceptions with fatal result codes terminate the SCM without failing the test due to JVM exit.
   
   I believe that can be resorted by `ExitUtils.disableSystemExit`, which prevents actual JVM exit for testing purposes.
   Yet, I scanned the code and could not find a good way to produce internal errors, maybe we can somehow generate `TimeoutException` and SCM always translates it to internal errors. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] duongkame commented on a diff in pull request #4212: HDDS-7829. SCM to reject adding container to closed pipeline

Posted by "duongkame (via GitHub)" <gi...@apache.org>.
duongkame commented on code in PR #4212:
URL: https://github.com/apache/ozone/pull/4212#discussion_r1099272468


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/SCMStateMachine.java:
##########
@@ -136,7 +137,14 @@ public CompletableFuture<Message> applyTransaction(
     try {
       final SCMRatisRequest request = SCMRatisRequest.decode(
           Message.valueOf(trx.getStateMachineLogEntry().getLogData()));
-      applyTransactionFuture.complete(process(request));
+
+      try {
+        applyTransactionFuture.complete(process(request));
+      } catch (SCMException ex) {
+        // SCMException is considered as logical rejection and is returned to
+        // Ratis client.

Review Comment:
   Good point, updated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kerneltime merged pull request #4212: HDDS-7829. SCM to reject adding container to closed pipeline

Posted by "kerneltime (via GitHub)" <gi...@apache.org>.
kerneltime merged PR #4212:
URL: https://github.com/apache/ozone/pull/4212


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org