Posted to dev@helix.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2013/08/06 22:58:47 UTC

[jira] [Commented] (HELIX-195) Race condition between FINALIZE callbacks and Zk Callbacks

    [ https://issues.apache.org/jira/browse/HELIX-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731255#comment-13731255 ] 

Hudson commented on HELIX-195:
------------------------------

SUCCESS: Integrated in helix #1117 (See [https://builds.apache.org/job/helix/1117/])
HELIX-195: fix race condition between FINALIZE callbacks and zk callbacks, rb=13345 (zzhang: rev 6d5397990d9629009e304ae6eea4a046d5429871)
* helix-core/src/main/java/org/apache/helix/manager/zk/CallbackHandler.java
* helix-core/src/main/java/org/apache/helix/manager/zk/DistributedLeaderElection.java
* helix-core/src/test/java/org/apache/helix/integration/manager/ClusterDistributedController.java
* helix-core/src/main/java/org/apache/helix/manager/zk/ParticipantManagerHelper.java
* helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
* helix-core/src/test/java/org/apache/helix/integration/manager/TestZkCallbackHandlerLeak.java
* helix-core/src/test/java/org/apache/helix/TestHelper.java
* helix-core/src/test/java/org/apache/helix/integration/manager/TestConsecutiveZkSessionExpiry.java
* helix-core/src/test/java/org/apache/helix/integration/manager/TestControllerManager.java
* helix-core/src/main/java/org/apache/helix/manager/zk/DistributedControllerManager.java
* helix-core/src/test/java/org/apache/helix/integration/manager/TestParticipantManager.java
* helix-core/src/test/java/org/apache/helix/integration/manager/TestDistributedControllerManager.java
* helix-core/src/main/java/org/apache/helix/manager/zk/ParticipantManager.java
* helix-core/src/main/java/org/apache/helix/manager/zk/ControllerManager.java
* helix-core/src/main/java/org/apache/helix/manager/zk/AbstractManager.java

                
> Race condition between FINALIZE callbacks and Zk Callbacks
> ----------------------------------------------------------
>
>                 Key: HELIX-195
>                 URL: https://issues.apache.org/jira/browse/HELIX-195
>             Project: Apache Helix
>          Issue Type: Sub-task
>            Reporter: dafu
>            Assignee: dafu
>
> FINALIZE callbacks are sent asynchronously via CallbackHandler#reset(), while Zk callbacks are queued in ZkEventThread. It is therefore possible to handle a FINALIZE callback before all Zk callbacks have been cleaned up. This creates race conditions. For example, on zk session expiry, when a GenericController gets a FINALIZE callback, it cleans up all listeners using ZkClient#unsubscribe(), but Zk callbacks left over in ZkEventThread arrive later and re-subscribe all listeners, leaking zk watchers.
> This was observed by setting up two controllers and expiring the leader (by simulating a long gc). The second controller takes the leadership and adds all listeners, but when the former leader recovers from gc, it gets the leftover Zk callbacks and re-subscribes the live-instance listener. It then reacts to all live-instance changes even though it has not acquired the leadership.
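
The ordering problem described above can be reproduced in a small single-threaded simulation. The sketch below is not Helix code and does not show the actual fix committed for HELIX-195. The class FinalizeRaceSketch, its Handler, the leakedWatches method and the guardAfterFinalize flag are all hypothetical names. It assumes that a Zk callback re-subscribes its watch when handled, and that one possible mitigation is to ignore callbacks that arrive after FINALIZE.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

class FinalizeRaceSketch {

    /** Hypothetical stand-in for a callback handler; not Helix's CallbackHandler. */
    static class Handler {
        final Set<String> watches = new HashSet<>();
        private final boolean guardAfterFinalize;
        private boolean finalized = false;

        Handler(boolean guardAfterFinalize) {
            this.guardAfterFinalize = guardAfterFinalize;
        }

        /** Simulates a queued Zk callback: handling it re-subscribes the watch. */
        void onZkCallback(String path) {
            if (guardAfterFinalize && finalized) {
                return; // assumed mitigation: drop callbacks that arrive after FINALIZE
            }
            watches.add(path);
        }

        /** Simulates FINALIZE: unsubscribe everything. */
        void onFinalize() {
            finalized = true;
            watches.clear();
        }
    }

    /** Returns the number of watches still registered after the race plays out. */
    static int leakedWatches(boolean guard) {
        Handler handler = new Handler(guard);
        handler.watches.add("/LIVEINSTANCES");

        // A Zk callback is already sitting in the event queue.
        Queue<Runnable> eventQueue = new ArrayDeque<>();
        eventQueue.add(() -> handler.onZkCallback("/LIVEINSTANCES"));

        // FINALIZE is delivered outside the queue, so it runs first.
        handler.onFinalize();

        // The leftover callback is drained afterwards.
        while (!eventQueue.isEmpty()) {
            eventQueue.poll().run();
        }
        return handler.watches.size();
    }

    public static void main(String[] args) {
        System.out.println("leaked watches without guard: " + leakedWatches(false));
        System.out.println("leaked watches with guard: " + leakedWatches(true));
    }
}
```

Without the guard, the leftover callback re-adds the watch after FINALIZE has cleared it, which mirrors the watcher leak reported here. With the guard, the late callback is ignored and no watch survives.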
