Posted to issues@ozone.apache.org by "sumitagrawl (via GitHub)" <gi...@apache.org> on 2023/08/08 12:24:49 UTC

[GitHub] [ozone] sumitagrawl opened a new pull request, #5161: HDDS-9134. Decommissioning does not complete even after 40 mins

sumitagrawl opened a new pull request, #5161:
URL: https://github.com/apache/ozone/pull/5161

   ## What changes were proposed in this pull request?
   
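   1. Wrapped the request stream observer. On error, the wrapper marks the observer as failed and returns true from isReady(), so that the next operation on the stream fails instead of waiting.
   2. Added a 1-minute timeout to the wait for readiness, so the wait ends even if the stream never becomes ready.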
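
A minimal sketch of the bounded wait in item 2, not the code from this patch. It assumes the readiness check can be modelled as a `BooleanSupplier` (standing in for `streamObserver.isReady()`); the class name `BoundedReadyWait`, the parameter names, and the choice to return a boolean are hypothetical and only illustrate polling with a retry budget instead of looping forever.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls a readiness check a bounded number of times.
final class BoundedReadyWait {

  private BoundedReadyWait() {
  }

  /**
   * Polls isReady until it returns true or maxRetries polls have failed.
   * Sleeps sleepMillis between polls.
   *
   * @return true if the stream reported ready, false on timeout or interrupt
   */
  static boolean waitUntilReady(BooleanSupplier isReady, int maxRetries,
      long sleepMillis) {
    for (int count = 0; count < maxRetries; count++) {
      if (isReady.getAsBoolean()) {
        return true;
      }
      try {
        TimeUnit.MILLISECONDS.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    // One last check after the retry budget is used up.
    return isReady.getAsBoolean();
  }
}
```

With, for example, 6000 retries and a 10 ms sleep (assumed values), the wait is capped at roughly one minute. What the caller does on a `false` result, such as failing the write so that replication can be retried, is left open here.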
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-9134
   
   ## How was this patch tested?
   
   Tested with debugging to verify that replication exits at the DN.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #5161: HDDS-9134. GRPC based replication can get stuck forever if the receiver is not available

Posted by "siddhantsangwan (via GitHub)" <gi...@apache.org>.
siddhantsangwan commented on PR #5161:
URL: https://github.com/apache/ozone/pull/5161#issuecomment-1671175809

   Merging to master. Thanks @sumitagrawl for working on this, and @kerneltime and @sodonnel for the reviews.




[GitHub] [ozone] siddhantsangwan merged pull request #5161: HDDS-9134. GRPC based replication can get stuck forever if the receiver is not available

Posted by "siddhantsangwan (via GitHub)" <gi...@apache.org>.
siddhantsangwan merged PR #5161:
URL: https://github.com/apache/ozone/pull/5161




[GitHub] [ozone] sodonnel commented on pull request #5161: HDDS-9134. Decommissioning does not complete even after 40 mins

Posted by "sodonnel (via GitHub)" <gi...@apache.org>.
sodonnel commented on PR #5161:
URL: https://github.com/apache/ozone/pull/5161#issuecomment-1670028916

   The change seems reasonable to me, but I don't have any prior knowledge of how this stuff works (gRPC, or the changes to make the sender back off). Ideally, someone more familiar with this should review.




[GitHub] [ozone] kerneltime commented on a diff in pull request #5161: HDDS-9134. GRPC based replication can get stuck forever if the receiver is not available

Posted by "kerneltime (via GitHub)" <gi...@apache.org>.
kerneltime commented on code in PR #5161:
URL: https://github.com/apache/ozone/pull/5161#discussion_r1287718789


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/replication/GrpcOutputStream.java:
##########
@@ -147,10 +149,12 @@ private void flushBuffer(boolean eof) {
    * the stream until it's ready.
    */
   private void waitUntilReady() {
-    while (!streamObserver.isReady()) {
+    int count = 0;
+    while (!streamObserver.isReady() && count < READY_RETRY_COUNT) {

Review Comment:
   Won't this lead to the same memory bloat? Retrying the whole container replication might be better if the channel is busy for a while.
   


