Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/04/13 08:47:19 UTC

[GitHub] [ozone] ChenSammi opened a new pull request #2152: HDDS-4986. Read failure because of unhealthy container.

ChenSammi opened a new pull request #2152:
URL: https://github.com/apache/ozone/pull/2152


   https://issues.apache.org/jira/browse/HDDS-4986


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] bshashikant commented on a change in pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2152:
URL: https://github.com/apache/ozone/pull/2152#discussion_r615637189



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
##########
@@ -472,18 +465,12 @@ ContainerCommandResponseProto handleGetBlock(
       return malformedRequest(request);
     }
 
-    // The container can become unhealthy after the lock is released.
-    // The operation will likely fail/timeout in that happens.
-    try {
-      checkContainerIsHealthy(kvContainer);
-    } catch (StorageContainerException sce) {
-      return ContainerUtils.logAndReturnError(LOG, sce, request);
-    }
-
     ContainerProtos.BlockData responseData;
     try {
       BlockID blockID = BlockID.getFromProtobuf(
           request.getGetBlock().getBlockID());
+      checkContainerIsHealthy(kvContainer, blockID, Type.GetBlock);
+      BlockUtils.verifyBCSId(kvContainer, blockID);

Review comment:
       BCSID verification against the container BCSID already happens implicitly inside the BlockManager.getBlock() call, so checking it a second time here is redundant.
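
The redundancy described in this comment can be illustrated with a minimal, self-contained sketch. It does not use the Ozone classes: `SketchContainer`, `verifyBcsId`, and the `bcsIdChecks` counter are hypothetical stand-ins, and the comparison rule (a requested BCSID must not exceed the container's committed BCSID) is an assumption about what the container-level check does.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a container replica; not the Ozone class. */
final class SketchContainer {
  private final long containerBcsId;
  private final Map<Long, Long> blockBcsIds = new HashMap<>();
  /** Counts how many times the container-level check ran. */
  int bcsIdChecks = 0;

  SketchContainer(long containerBcsId) {
    this.containerBcsId = containerBcsId;
  }

  void putBlock(long localId, long bcsId) {
    blockBcsIds.put(localId, bcsId);
  }

  /** Assumed rule: a request may not ask for a BCSID newer than the container's. */
  void verifyBcsId(long requestedBcsId) {
    bcsIdChecks++;
    if (requestedBcsId > containerBcsId) {
      throw new IllegalStateException("Unknown BCSID " + requestedBcsId);
    }
  }

  /** Read path that already performs the container-level check itself. */
  long getBlock(long localId, long requestedBcsId) {
    verifyBcsId(requestedBcsId);
    Long stored = blockBcsIds.get(localId);
    if (stored == null) {
      throw new IllegalArgumentException("No such block " + localId);
    }
    return stored;
  }

  public static void main(String[] args) {
    SketchContainer c = new SketchContainer(10L);
    c.putBlock(1L, 7L);
    // Calling verifyBcsId before getBlock would run the same check twice.
    c.getBlock(1L, 7L);
    System.out.println("container-level checks run: " + c.bcsIdChecks);
  }
}
```

In this sketch a handler that calls `verifyBcsId` and then `getBlock` performs the same comparison twice, which is the situation the comment asks to avoid.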






[GitHub] [ozone] ChenSammi closed pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
ChenSammi closed pull request #2152:
URL: https://github.com/apache/ozone/pull/2152


   




[GitHub] [ozone] bshashikant commented on a change in pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2152:
URL: https://github.com/apache/ozone/pull/2152#discussion_r615637761



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
##########
@@ -823,19 +797,13 @@ ContainerCommandResponseProto handleGetSmallFile(
       return malformedRequest(request);
     }
 
-    // The container can become unhealthy after the lock is released.
-    // The operation will likely fail/timeout in that happens.
-    try {
-      checkContainerIsHealthy(kvContainer);
-    } catch (StorageContainerException sce) {
-      return ContainerUtils.logAndReturnError(LOG, sce, request);
-    }
-
     GetSmallFileRequestProto getSmallFileReq = request.getGetSmallFile();
 
     try {
       BlockID blockID = BlockID.getFromProtobuf(getSmallFileReq.getBlock()
           .getBlockID());
+      checkContainerIsHealthy(kvContainer, blockID, Type.GetSmallFile);
+      BlockUtils.verifyBCSId(kvContainer, blockID);
       BlockData responseData = blockManager.getBlock(kvContainer, blockID);
 

Review comment:
       BCSID verification against the container BCSID already happens implicitly inside the BlockManager.getBlock() call, so checking it a second time here is redundant.






[GitHub] [ozone] bshashikant commented on a change in pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #2152:
URL: https://github.com/apache/ozone/pull/2152#discussion_r615598721



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
##########
@@ -635,17 +611,15 @@ ContainerCommandResponseProto handleReadChunk(
    * Throw an exception if the container is unhealthy.
    *
    * @throws StorageContainerException if the container is unhealthy.
-   * @param kvContainer
    */
   @VisibleForTesting
-  void checkContainerIsHealthy(KeyValueContainer kvContainer)
-      throws StorageContainerException {
+  void checkContainerIsHealthy(KeyValueContainer kvContainer, BlockID blockID,
+      Type cmd) {
     kvContainer.readLock();
     try {
       if (kvContainer.getContainerData().getState() == State.UNHEALTHY) {
-        throw new StorageContainerException(
-            "The container(" + kvContainer.getContainerData().getContainerID() +
-            ") replica is unhealthy.", CONTAINER_UNHEALTHY);
+        LOG.info("{} request {} for UNHEALTHY container {} replica", cmd,

Review comment:
       Change log level to WARN.
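
The behavior change in this hunk, from throwing on an UNHEALTHY replica to logging and letting the read proceed, can be sketched as follows. This is a minimal illustration under stated assumptions, not the merged code: `HealthCheckSketch`, `warnIfUnhealthy`, and `markUnhealthy` are hypothetical names, `java.util.logging` stands in for the project's logger, and the boolean return value exists only so the sketch can be exercised.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.logging.Logger;

/** Hypothetical stand-in for a container replica with a health state. */
final class HealthCheckSketch {
  enum State { OPEN, CLOSED, UNHEALTHY }

  private static final Logger LOG = Logger.getLogger("HealthCheckSketch");
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final long containerId;
  private State state;

  HealthCheckSketch(long containerId, State state) {
    this.containerId = containerId;
    this.state = state;
  }

  void markUnhealthy() {
    lock.writeLock().lock();
    try {
      state = State.UNHEALTHY;
    } finally {
      lock.writeLock().unlock();
    }
  }

  /**
   * Logs at WARN level when the replica is UNHEALTHY and never throws,
   * so the caller can still attempt the read. Returns true if it logged.
   */
  boolean warnIfUnhealthy(String cmd, String blockId) {
    lock.readLock().lock();
    try {
      if (state == State.UNHEALTHY) {
        LOG.warning(cmd + " request " + blockId
            + " for UNHEALTHY container " + containerId + " replica");
        return true;
      }
      return false;
    } finally {
      lock.readLock().unlock();
    }
  }

  public static void main(String[] args) {
    HealthCheckSketch c = new HealthCheckSketch(42L, State.CLOSED);
    c.markUnhealthy();
    // The read is not rejected; only a warning is recorded.
    c.warnIfUnhealthy("GetBlock", "blk-1");
  }
}
```

Logging at WARN rather than INFO keeps reads from a degraded replica visible to operators without failing the request.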






[GitHub] [ozone] ChenSammi commented on a change in pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2152:
URL: https://github.com/apache/ozone/pull/2152#discussion_r616485774



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
##########
@@ -472,18 +465,12 @@ ContainerCommandResponseProto handleGetBlock(
       return malformedRequest(request);
     }
 
-    // The container can become unhealthy after the lock is released.
-    // The operation will likely fail/timeout in that happens.
-    try {
-      checkContainerIsHealthy(kvContainer);
-    } catch (StorageContainerException sce) {
-      return ContainerUtils.logAndReturnError(LOG, sce, request);
-    }
-
     ContainerProtos.BlockData responseData;
     try {
       BlockID blockID = BlockID.getFromProtobuf(
           request.getGetBlock().getBlockID());
+      checkContainerIsHealthy(kvContainer, blockID, Type.GetBlock);
+      BlockUtils.verifyBCSId(kvContainer, blockID);

Review comment:
       Sure. 






[GitHub] [ozone] bshashikant merged pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
bshashikant merged pull request #2152:
URL: https://github.com/apache/ozone/pull/2152


   




[GitHub] [ozone] ChenSammi commented on pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on pull request #2152:
URL: https://github.com/apache/ozone/pull/2152#issuecomment-822227512


   The failed unit test cannot be reproduced locally.
   
   @bshashikant, could you help review this patch, which enables reading data from unhealthy containers?




[GitHub] [ozone] ChenSammi commented on a change in pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2152:
URL: https://github.com/apache/ozone/pull/2152#discussion_r616463453



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
##########
@@ -635,17 +611,15 @@ ContainerCommandResponseProto handleReadChunk(
    * Throw an exception if the container is unhealthy.
    *
    * @throws StorageContainerException if the container is unhealthy.
-   * @param kvContainer
    */
   @VisibleForTesting
-  void checkContainerIsHealthy(KeyValueContainer kvContainer)
-      throws StorageContainerException {
+  void checkContainerIsHealthy(KeyValueContainer kvContainer, BlockID blockID,
+      Type cmd) {
     kvContainer.readLock();
     try {
       if (kvContainer.getContainerData().getState() == State.UNHEALTHY) {
-        throw new StorageContainerException(
-            "The container(" + kvContainer.getContainerData().getContainerID() +
-            ") replica is unhealthy.", CONTAINER_UNHEALTHY);
+        LOG.info("{} request {} for UNHEALTHY container {} replica", cmd,

Review comment:
       Changed. 






[GitHub] [ozone] bshashikant commented on pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #2152:
URL: https://github.com/apache/ozone/pull/2152#issuecomment-823940237


   Thanks @ChenSammi for the fix.




[GitHub] [ozone] ChenSammi commented on a change in pull request #2152: HDDS-4986. Read failure because of unhealthy container.

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2152:
URL: https://github.com/apache/ozone/pull/2152#discussion_r616485931



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
##########
@@ -823,19 +797,13 @@ ContainerCommandResponseProto handleGetSmallFile(
       return malformedRequest(request);
     }
 
-    // The container can become unhealthy after the lock is released.
-    // The operation will likely fail/timeout in that happens.
-    try {
-      checkContainerIsHealthy(kvContainer);
-    } catch (StorageContainerException sce) {
-      return ContainerUtils.logAndReturnError(LOG, sce, request);
-    }
-
     GetSmallFileRequestProto getSmallFileReq = request.getGetSmallFile();
 
     try {
       BlockID blockID = BlockID.getFromProtobuf(getSmallFileReq.getBlock()
           .getBlockID());
+      checkContainerIsHealthy(kvContainer, blockID, Type.GetSmallFile);
+      BlockUtils.verifyBCSId(kvContainer, blockID);
       BlockData responseData = blockManager.getBlock(kvContainer, blockID);
 

Review comment:
       OK.



