You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/01/10 02:52:15 UTC

[GitHub] [helix] kaisun2000 commented on issue #677: Add method to wait and return established session's ID

kaisun2000 commented on issue #677: Add method to wait and return established session's ID
URL: https://github.com/apache/helix/pull/677#issuecomment-572849996
 
 
   Overall, let me summarize the history, the invariant which should lead to a working solution and where can be further improved. 
   
   The history was about auto-reconnect. Here, ZkClient contains a connection object and connection object wrap a native Zookeeper instance (client object representing a session). ZkClient also has its own eventthread (Note this is different from Zookeeper instance eventthread. ZkClient pass its process() to native Zookeeper and this process() would run on native Zookeeper event thread processing state change and data change notification, and put the event of state change or data change to ZkClient event thread for processing. ZkClient further provides auto-reconnect functionality as Zookeeper client can have disconnected state and reconnect later automatically. However, if Zookeeper client can't reconnect in time, it would be expired by the server. This is why ZkClient would do auto-reconnect. 
   
   Here, the invariants are the following:
   1/ Each native Zookeeper object conceptually denotes one session, would have only one session ID, it would not change over time.
   2/ The Zookeeper object would only get its session ID the first time connected (SynConnected state change notification arrives) Otherwise, you would see the sessionID as 0.
   
   This invariant would be important for us to reason the correctness of the code. 
   
   So reconnect only happens when sessionExpired event notification received and a new Zookeeper object would be created.
   
   Note before this fix, the bug lies in the fact that an event thrown by zookeeper native thread would be later processed by ZkClient event thread. By the time ZkClient event thread process the event, they can happen when new session (thus new Zookeeper object) is on. 
   
   The general idea is to add the sessionID information with the event running on ZKClient event thread. Thus they can detect stale session and act accordingly. (Mainly for ephemeral node creation case)
   
   Before this batch of change, the handleNewSession() callback from ZKHelixManager would be called when sessionExpired notification delivered to the ZKClient. The thinking is that because auto-connect would happen anyway. 
   In this batch of change, we need to pass the sesssionID to handleNewSession(). Thus, we can not call handleNewSession() immediately for the reason that before the first synconnect notification to the new session arrives, there is no valid sessionID. 
   Thus, we made the change to deliver handleNewSession() only when the synconnected is delivered for the new Zookeeper object when reconnect happens. 
   
   All the above are done before this change. 
   
   However, there is still an issue. The first time when zkClient is created in the ZKHelixManager, handleNewSession would only be subscribed a little bit later. There is a possibility that this handleNewSession would be later than the first SynConnect get delivered. That is why waitForEstablishedSession() is added in this diff. 
   
   Actually, the alternative approach can be that creating ZkClient and add the subscriber (handleNewSession in this case) can be an atomic operation. That way, it would miss the first handleNewSession. This can be a potential better alternative. 
    
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org