Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2021/03/06 16:53:51 UTC

[GitHub] [zookeeper] functioner edited a comment on pull request #1596: ZOOKEEPER-4203: Leader swallows the ZooKeeperServer.State.ERROR from Leader.LearnerCnxAcceptor in some concurrency condition

functioner edited a comment on pull request #1596:
URL: https://github.com/apache/zookeeper/pull/1596#issuecomment-791988099


   > One thing I am wondering (which was not introduced by your code) is why we catch and continue on `SaslException`s, but "crash" on other `IOException`s? Not a major point as the server will now recover, but perhaps we could just accept that `accept` can fail? @eolivelli, @symat?
   
   @ztzg I think it can be explained with:
   https://github.com/apache/zookeeper/blob/0c98d1d3252e645a5c25bfecaba5ca1f5cd3258e/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L523-L536
   In the case of a `SaslException`, the exception is not rethrown, so it is never caught by:
   https://github.com/apache/zookeeper/blob/0c98d1d3252e645a5c25bfecaba5ca1f5cd3258e/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L498-L504
   In this case, `ZooKeeperServer.State.ERROR` is not set, everything keeps working for the time being, and the critical code
   https://github.com/apache/zookeeper/blob/0c98d1d3252e645a5c25bfecaba5ca1f5cd3258e/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L688
   finishes, so if any error occurs later, it can be detected by:
   https://github.com/apache/zookeeper/blob/0c98d1d3252e645a5c25bfecaba5ca1f5cd3258e/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L754
   The quorum can then handle it properly.
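
To make the two paths concrete, here is a minimal sketch of the nested handling described in this comment. It is not the actual `Leader.java` code: `AcceptErrorSketch`, `Acceptor`, `acceptConnection`, and `runOnce` are hypothetical names, and the outer handler is reduced to returning the resulting state.

```java
import java.io.IOException;
import javax.security.sasl.SaslException;

public class AcceptErrorSketch {
    // Simplified stand-in for ZooKeeperServer.State (assumption).
    public enum State { INITIAL, RUNNING, ERROR }

    // Hypothetical abstraction over one accept attempt.
    public interface Acceptor {
        void acceptOnce() throws IOException;
    }

    // Inner handler: swallow SaslException, rethrow any other IOException.
    public static void acceptConnection(Acceptor acceptor) throws IOException {
        try {
            acceptor.acceptOnce();
        } catch (SaslException e) {
            // Authentication failed for one learner: keep accepting others.
        } catch (IOException e) {
            throw e; // Propagates to the outer handler.
        }
    }

    // Outer handler: only an exception that escapes the inner handler
    // moves the server to ERROR.
    public static State runOnce(Acceptor acceptor, State current) {
        try {
            acceptConnection(acceptor);
            return current;
        } catch (IOException e) {
            return State.ERROR;
        }
    }
}
```

With this shape, an acceptor that throws a `SaslException` leaves the state unchanged, while any other `IOException` results in `State.ERROR`.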
   
   However, if the `IOException` is not a `SaslException`, it is rethrown by:
   https://github.com/apache/zookeeper/blob/0c98d1d3252e645a5c25bfecaba5ca1f5cd3258e/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L523-L536
   In this case, `ZooKeeperServer.State.ERROR` may be set before the critical code runs:
   https://github.com/apache/zookeeper/blob/0c98d1d3252e645a5c25bfecaba5ca1f5cd3258e/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L688
   The error state is then overwritten by the `RUNNING` state, and the symptom described in this issue occurs.
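
The ordering problem can be reproduced deterministically in a small sketch. The class below is hypothetical (`StateOverwriteSketch` and its methods are not ZooKeeper APIs), and it replays the problematic interleaving sequentially instead of using real threads. The guarded variant is only one conceivable way to avoid the overwrite, shown for contrast; it is not necessarily what this pull request does.

```java
import java.util.concurrent.atomic.AtomicReference;

public class StateOverwriteSketch {
    // Simplified stand-in for ZooKeeperServer.State (assumption).
    public enum State { INITIAL, RUNNING, ERROR }

    private final AtomicReference<State> state =
        new AtomicReference<>(State.INITIAL);

    // Acceptor thread: accept failed with a non-SASL IOException.
    public void reportError() {
        state.set(State.ERROR);
    }

    // Unconditional transition: replaces whatever is stored, including ERROR.
    public void startUnguarded() {
        state.set(State.RUNNING);
    }

    // Guarded transition (illustrative only): succeeds only from INITIAL,
    // so an earlier ERROR is preserved.
    public boolean startGuarded() {
        return state.compareAndSet(State.INITIAL, State.RUNNING);
    }

    public State current() {
        return state.get();
    }
}
```

Calling `reportError()` followed by `startUnguarded()` leaves the state at `RUNNING`, which is the swallowed error described above; with `startGuarded()` the same sequence leaves it at `ERROR`.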


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org