Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/10/23 10:03:40 UTC

[GitHub] [hadoop-ozone] bshashikant opened a new pull request #1519: HDDS-4388. Make writeStateMachineTimeout retry count proportional to node failure timeout

bshashikant opened a new pull request #1519:
URL: https://github.com/apache/hadoop-ozone/pull/1519


   
   
   ## What changes were proposed in this pull request?
   Currently, in Ratis, the writeStateMachine call is retried indefinitely in the event of a timeout. When disks are slow or overloaded, or no chunk writer threads are available for a period of 10s, the writeStateMachine call times out after 10s. In such cases, the same write chunk keeps getting retried, causing the same chunk of data to be overwritten repeatedly. The idea here is to abort the request once the node failure timeout is reached.
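
   As a rough illustration of the idea (not the code in this patch), the retry budget can be derived by dividing the node failure timeout by the per-call writeStateMachine timeout. The class and method names below are hypothetical, and the 300s node failure timeout is an assumed value chosen only for the example; the 10s write timeout is the one mentioned above.

```java
import java.time.Duration;

// Hypothetical sketch: names are illustrative and are not Ozone or Ratis APIs.
public class WriteRetrySketch {

  // Number of writeStateMachine attempts that fit inside the node failure
  // timeout, so retries stop around the time the node would be declared failed.
  public static int retryCount(Duration nodeFailureTimeout, Duration writeTimeout) {
    long writeMillis = writeTimeout.toMillis();
    if (writeMillis <= 0) {
      throw new IllegalArgumentException("writeTimeout must be at least 1 ms");
    }
    long retries = nodeFailureTimeout.toMillis() / writeMillis;
    // Always allow at least one attempt and stay within the int range.
    return (int) Math.max(1L, Math.min(Integer.MAX_VALUE, retries));
  }

  public static void main(String[] args) {
    Duration nodeFailureTimeout = Duration.ofSeconds(300); // assumed value
    Duration writeTimeout = Duration.ofSeconds(10);        // from the description
    System.out.println(retryCount(nodeFailureTimeout, writeTimeout));
  }
}
```

   With a bounded count like this, a persistently slow disk causes the request to be aborted instead of rewriting the same chunk indefinitely.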
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-4388
   
   
   ## How was this patch tested?
   Verified by checking the config value in tests.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] lokeshj1703 closed pull request #1519: HDDS-4388. Make writeStateMachineTimeout retry count proportional to node failure timeout

Posted by GitBox <gi...@apache.org>.
lokeshj1703 closed pull request #1519:
URL: https://github.com/apache/hadoop-ozone/pull/1519


   




[GitHub] [hadoop-ozone] lokeshj1703 commented on pull request #1519: HDDS-4388. Make writeStateMachineTimeout retry count proportional to node failure timeout

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on pull request #1519:
URL: https://github.com/apache/hadoop-ozone/pull/1519#issuecomment-717039024


   @bshashikant Thanks for the contribution! I have merged the PR to the master branch.




[GitHub] [hadoop-ozone] lokeshj1703 commented on a change in pull request #1519: HDDS-4388. Make writeStateMachineTimeout retry count proportional to node failure timeout

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #1519:
URL: https://github.com/apache/hadoop-ozone/pull/1519#discussion_r510844138



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/XceiverServerRatis.java
##########
@@ -200,6 +200,9 @@ private RaftProperties newRaftProperties() {
     TimeUnit timeUnit;
     long duration;
 
+    // set the node failure timeout
+    setNodeFailureTimeout(properties);

Review comment:
       `setNodeFailureTimeout` initializes a field. I think we can initialize the field in the constructor itself and remove this function.
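
       A minimal sketch of what this suggestion could look like, assuming the field is read once from configuration. The class name, config key, and default value are hypothetical stand-ins, and a plain `Map` is used in place of the real configuration object; this is not the actual `XceiverServerRatis` code.

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not the real XceiverServerRatis class.
public class ServerSketch {
  // Assumed key and default, chosen only for illustration.
  static final String NODE_FAILURE_TIMEOUT_KEY = "sketch.node.failure.timeout.ms";
  static final long DEFAULT_TIMEOUT_MS = TimeUnit.SECONDS.toMillis(300);

  // Initialized in the constructor, so the field can be final and no
  // separate setNodeFailureTimeout-style method is needed.
  private final long nodeFailureTimeoutMs;

  public ServerSketch(Map<String, String> conf) {
    this.nodeFailureTimeoutMs = Long.parseLong(
        conf.getOrDefault(NODE_FAILURE_TIMEOUT_KEY, Long.toString(DEFAULT_TIMEOUT_MS)));
  }

  public long getNodeFailureTimeoutMs() {
    return nodeFailureTimeoutMs;
  }
}
```

       Setting the value in the constructor also guarantees it is available before any method that builds the Raft properties runs.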



