You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "Neha Narkhede (Updated) (JIRA)" <ji...@apache.org> on 2012/02/15 21:31:00 UTC
[jira] [Updated] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neha Narkhede updated ZOOKEEPER-1382:
-------------------------------------

    Attachment: ZOOKEEPER-1382_3.3.4.patch

Found this bug on 3.3.4 and hence, am uploading a draft patch with a possible fix for 3.3.4 first. If it looks good, I'll try doing the same on trunk.

Here is what I think is causing this bug. Please correct me if I'm wrong -

Say a client connects to a > 1 node ZK cluster and registers watch on some path. After that, if the client goes into a GC pause, the leader expires the session and proposes closing the session to the followers. Meanwhile, the client wakes up and tries to renew its session with one of the followers. Now, when the follower detects that the session is invalid, it sends a close session response to the client, and enables receive on the selector key for that connection. So it receives the setWatches request during the renew session. There is a race condition since the request processors on the follower merely "pass-through" the non-transactional setWatches request while the NIOServerCnxn has sent the close session request to the client. Due to the above, sometimes, the watches are set right after the NIOServerCnxn is removed from the ZooKeeperServer. Hence, the watches are left on the servers, for expired sessions, effectively causing a memory leak.  

The fix is to enable reads on the connection only if the session is found to be valid.

For testing, I wrote a test and added it as part of the DisconnectedWatcherTest. However, it uses the in-memory ZK ensemble class from curator, with a few modifications to test watches. In addition to this, have added a timeoutSession() API to TestableZooKeeper to simulate the client side GC pause. However, it involves suspending and resuming the event thread. 

I have 2 concerns about tests for this patch -
1. Method used to simulate a client side GC pause
2. Longer test duration, since the test runs multiple iterations to reproduce this corner case race condition

For 1, this is hacky, works, but it will be good to know other alternatives for writing this test.
For 2, since this is more of a system test, wonder what is the best way to include it in zookeeper. Also, the new helper classes are from curator. 

Lastly, I left the DEBUG/TRACE logging, that I found useful to understand the server code, in the patch. Maybe, not all of it follows logging conventions.

                
> Zookeeper server holds onto dead/expired session ids in the watch data structures
> ---------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1382
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.4
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>         Attachments: ZOOKEEPER-1382_3.3.4.patch
>
>
> I've observed that zookeeper server holds onto expired session ids in the watcher data structures. The result is the wchp command reports session ids that cannot be found through cons/dump and those expired session ids sit there maybe until the server is restarted. Here are snippets from the client and the server logs that lead to this state, for one particular session id 0x134485fd7bcb26f -
> There are 4 servers in the zookeeper cluster - 223, 224, 225 (leader), 226 and I'm using ZkClient to connect to the cluster
> From the application log -
> application.log.2012-01-26-325.gz:2012/01/26 04:56:36.177 INFO [ClientCnxn] [main-SendThread(223.prod:12913)] [application Session establishment complete on server 223.prod/172.17.135.38:12913, sessionid = 0x134485fd7bcb26f, negotiated timeout = 6000
> application.log.2012-01-27.gz:2012/01/27 09:52:37.714 INFO [ClientCnxn] [main-SendThread(223.prod:12913)] [application] Client session timed out, have not heard from server in 9827ms for sessionid 0x134485fd7bcb26f, closing socket connection and attempting reconnect
> application.log.2012-01-27.gz:2012/01/27 09:52:38.191 INFO [ClientCnxn] [main-SendThread(226.prod:12913)] [application] Unable to reconnect to ZooKeeper service, session 0x134485fd7bcb26f has expired, closing socket connection
> On the leader zk, 225 -
> zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  [SessionTracker:ZooKeeperServer@314] - Expiring session 0x134485fd7bcb26f, timeout of 6000ms exceeded
> zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  [ProcessThread:-1:PrepRequestProcessor@391] - Processed session termination for sessionid: 0x134485fd7bcb26f
> On the server, the client was initially connected to, 223 -
> zookeeper.log.2012-01-26-223.gz:2012-01-26 04:56:36,173 - INFO  [CommitProcessor:1:NIOServerCnxn@1580] - Established session 0x134485fd7bcb26f with negotiated timeout 6000 for client /172.17.136.82:45020
> zookeeper.log.2012-01-27-223.gz:2012-01-27 09:52:34,018 - INFO  [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client /172.17.136.82:45020 which had sessionid 0x134485fd7bcb26f
> Here are the log snippets from 226, which is the server, the client reconnected to, before getting session expired event -
> 2012-01-27 09:52:38,190 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client attempting to renew session 0x134485fd7bcb26f at /172.17.136.82:49367
> 2012-01-27 09:52:38,191 - INFO  [QuorumPeer:/0.0.0.0:12913:NIOServerCnxn@1573] - Invalid session 0x134485fd7bcb26f for client /172.17.136.82:49367, probably expired
> 2012-01-27 09:52:38,191 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed socket connection for client /172.17.136.82:49367 which had sessionid 0x134485fd7bcb26f
> wchp output from 226, taken on 01/30 -
> nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f *226.*wchp* | wc -l
> 3
> wchp output from 223, taken on 01/30 -
> nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f *223.*wchp* | wc -l
> 0
> cons output from 223 and 226, taken on 01/30 -
> nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f *226.*cons* | wc -l
> 0
> nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f *223.*cons* | wc -l
> 0
> So, what seems to have happened is that the client was able to re-register the watches on the new server (226), after it got disconnected from 223, inspite of having an expired session id. 
> In NIOServerCnxn, I saw that after suspecting that a session is expired, a server removes the cnxn and its watches from its internal data structures. But before that it allows more requests to be processed even if the session is expired -
>             // Now that the session is ready we can start receiving packets
>             synchronized (this.factory) {
>                 sk.selector().wakeup();
>                 enableRecv();
>             }
>         } catch (Exception e) {
>             LOG.warn("Exception while establishing session, closing", e);
>             close();
>         }
> I wonder if the client somehow sneaked in the set watches, right after the server removed the connection through removeCnxn() API ?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira