Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/04/13 23:25:42 UTC

[GitHub] [hadoop-ozone] bharatviswa504 opened a new pull request #815: HDDS-3291. Write operation when both OM followers are shutdown.

bharatviswa504 opened a new pull request #815: HDDS-3291. Write operation when both OM followers are shutdown.
URL: https://github.com/apache/hadoop-ozone/pull/815
 
 
   ## What changes were proposed in this pull request?
   
   Added a new parameter for the OM RPC client timeout. Because the parameter is specific to the OM RPC client, changing it does not affect other RPC clients.
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-3291
   
   ## How was this patch tested?
   
   Tested this on a docker cluster with the settings below. The leader election minimum timeout is raised to a larger value so that an OM keeps believing it is the leader for a longer period even though it no longer is. In that state the stale leader accepts the request, and without this patch the client retries forever.
   OZONE-SITE.XML_ozone.om.client.rpc.timeout=30s
   OZONE-SITE.XML_ozone.om.leader.election.minimum.timeout.duration=1m
   
   With this patch, the request fails after 15 retries. For the OM server that still thinks it is the leader, the client gets a SocketTimeoutException and moves on to the next OM.
   
   Logs:
   ```
   2020-04-13 21:59:44,667 [main] INFO  RetryInvocationHandler:411 - com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "om3":9862; java.net.UnknownHostException; For more details see:  http://wiki.apache.org/hadoop/UnknownHost, while invoking $Proxy20.submitRequest over nodeId=om3,nodeAddress=om3:9862 after 13 failover attempts. Trying to failover immediately.
   2020-04-13 21:59:44,667 [main] INFO  RetryInvocationHandler:411 - com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "om1":9862; java.net.UnknownHostException; For more details see:  http://wiki.apache.org/hadoop/UnknownHost, while invoking $Proxy20.submitRequest over nodeId=om1,nodeAddress=om1:9862 after 14 failover attempts. Trying to failover immediately.
   2020-04-13 22:00:14,677 [main] INFO  RetryInvocationHandler:411 - com.google.protobuf.ServiceException: java.net.SocketTimeoutException: Call From 531e9bfac0d9/172.24.0.4 to om2:9862 failed on socket timeout exception: java.net.SocketTimeoutException: 30000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.24.0.4:47798 remote=om2/172.24.0.7:9862]; For more details see:  http://wiki.apache.org/hadoop/SocketTimeout, while invoking $Proxy20.submitRequest over nodeId=om2,nodeAddress=om2:9862 after 15 failover attempts. Trying to failover immediately.
   2020-04-13 22:00:14,678 [main] ERROR OMFailoverProxyProvider:286 - Failed to connect to OMs: [nodeId=om1,nodeAddress=om1:9862, nodeId=om3,nodeAddress=om3:9862, nodeId=om2,nodeAddress=om2:9862]. Attempted 15 failover
   ```
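
For readers who want to see the behaviour described above in isolation, here is a minimal Java sketch of bounded client-side failover. It is not the code from this patch or from Ozone: the class `BoundedFailoverSketch`, the interface `OmCall`, and the method `submitWithFailover` are hypothetical names made up for illustration, and the simulated call only throws the exception types seen in the logs rather than performing real RPCs or waiting for a real timeout.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import java.util.List;

/** Hypothetical sketch of bounded failover across OM nodes; not Ozone's actual code. */
public class BoundedFailoverSketch {

  /** One RPC attempt against a single OM node (hypothetical interface). */
  interface OmCall {
    String submit(String nodeId, long rpcTimeoutMillis) throws IOException;
  }

  /**
   * Tries the nodes in round-robin order and gives up once maxFailovers
   * attempts have failed, instead of retrying forever.
   */
  static String submitWithFailover(List<String> nodeIds, int maxFailovers,
      long rpcTimeoutMillis, OmCall call) throws IOException {
    IOException last = null;
    for (int failed = 0; failed < maxFailovers; failed++) {
      String nodeId = nodeIds.get(failed % nodeIds.size());
      try {
        // The per-call timeout is only passed through here; a real client
        // would apply it to the underlying socket read.
        return call.submit(nodeId, rpcTimeoutMillis);
      } catch (IOException e) {
        // Remember the failure and fail over to the next node immediately.
        last = e;
      }
    }
    throw new IOException("Failed to connect to OMs: " + nodeIds
        + ". Attempted " + maxFailovers + " failover", last);
  }

  public static void main(String[] args) {
    List<String> oms = List.of("om1", "om3", "om2");
    // Simulated cluster: om1 and om3 are shut down (unresolvable), while om2
    // still believes it is the leader and never answers within the timeout.
    OmCall call = (nodeId, timeoutMillis) -> {
      if (nodeId.equals("om2")) {
        throw new SocketTimeoutException(timeoutMillis + " millis timeout");
      }
      throw new UnknownHostException(nodeId);
    };
    try {
      submitWithFailover(oms, 15, 30_000L, call);
    } catch (IOException e) {
      System.out.println(e.getMessage() + " (last cause: " + e.getCause() + ")");
    }
  }
}
```

The design point the sketch tries to capture is that a per-call timeout turns a hung request to a stale leader into an ordinary failure, so the attempt counts toward the failover limit and the client eventually reports an error.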
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] bharatviswa504 merged pull request #815: HDDS-3291. Write operation when both OM followers are shutdown.

Posted by GitBox <gi...@apache.org>.
bharatviswa504 merged pull request #815: HDDS-3291. Write operation when both OM followers are shutdown.
URL: https://github.com/apache/hadoop-ozone/pull/815
 
 
   



[GitHub] [hadoop-ozone] bharatviswa504 commented on issue #815: HDDS-3291. Write operation when both OM followers are shutdown.

Posted by GitBox <gi...@apache.org>.
bharatviswa504 commented on issue #815: HDDS-3291. Write operation when both OM followers are shutdown.
URL: https://github.com/apache/hadoop-ozone/pull/815#issuecomment-614785352
 
 
   Thank you @arp7 for the review.
