Posted to commits@helix.apache.org by "Kanak Biscuitwala (JIRA)" <ji...@apache.org> on 2013/11/22 19:02:36 UTC

[jira] [Updated] (HELIX-321) Controller forgets that it's the leader

     [ https://issues.apache.org/jira/browse/HELIX-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kanak Biscuitwala updated HELIX-321:
------------------------------------

    Attachment: leader_election.txt

> Controller forgets that it's the leader
> ---------------------------------------
>
>                 Key: HELIX-321
>                 URL: https://issues.apache.org/jira/browse/HELIX-321
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Kanak Biscuitwala
>         Attachments: leader_election.txt
>
>
> 1. ZooKeeper session times out and expires; see log messages:
> INFO [2013-11-22 17:34:11,919] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 20171ms for sessionid 0x142016175c10856, closing socket connection and attempting reconnect
> INFO [2013-11-22 17:34:22,051] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn - Opening socket connection to server eat1-app87.corp/172.18.158.133:2181
> INFO [2013-11-22 17:34:22,052] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn - Socket connection established to eat1-app87.corp/172.18.158.133:2181, initiating session
> INFO [2013-11-22 17:34:22,055] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x142016175c10856 has expired, closing socket connection
> INFO [2013-11-22 17:34:22,055] main-EventThread - org.I0Itec.zkclient.ZkClient - zookeeper state changed (Expired)
> INFO [2013-11-22 17:34:22,055] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixConnection - KeeperState:Expired, expiredSessionId: 142016175c10856
> 2. Controller reconnects, removes all callbacks
> INFO [2013-11-22 17:34:22,068] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn - Socket connection established to eat1-app87.corp/172.18.158.133:2181, initiating session
> INFO [2013-11-22 17:34:22,126] main-SendThread(eat1-app87.corp:2181) - org.apache.zookeeper.ClientCnxn - Session establishment complete on server eat1-app87.corp/172.18.158.133:2181, sessionid = 0x142016175c1085c, negotiated timeout = 30000
> INFO [2013-11-22 17:34:22,126] main-EventThread - org.I0Itec.zkclient.ZkClient - zookeeper state changed (SyncConnected)
> 3. Callbacks ignored; not leader, relinquishes leadership
> ERROR [2013-11-22 17:34:22,187] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.controller.GenericHelixController - Cluster manager: controller1 is not leader. Pipeline will not be invoked
> INFO [2013-11-22 17:34:22,200] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixLeaderElection - controller1 reqlinquishes leadership of cluster: perf-test-cluster
> 4. Controller reacquires leadership
> INFO [2013-11-22 17:34:22,204] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixLeaderElection - controller1 is trying to acquire leadership for cluster: perf-test-cluster
> INFO [2013-11-22 17:34:22,215] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixLeaderElection - controller1 acquires leadership of cluster: perf-test-cluster
> 5. Controller thinks it's not leader even though the LEADER node is in place and correct
> ERROR [2013-11-22 17:34:22,294] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.controller.GenericHelixController - Cluster manager: controller1 is not leader. Pipeline will not be invoked
> 6. Controller tries to become leader even though it already is
> INFO [2013-11-22 17:34:22,335] ZkClient-EventThread-10-eat1-app87.corp:2181 - org.apache.helix.manager.zk.ZkHelixLeaderElection - controller1 is trying to acquire leadership for cluster: perf-test-cluster
> Logs attached
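
The sequence described above can be traced in a log mechanically. The following is a rough sketch only, not part of Helix: it assumes the log uses the same "LEVEL [timestamp] thread - logger - message" layout as the excerpts above, and the event names and both function names are made up for illustration. It flags any "is not leader" or "trying to acquire leadership" message that follows "acquires leadership" with no relinquish or session expiry in between, which is the symptom in steps 5 and 6.

```python
import re
from datetime import datetime

# Layout assumed from the excerpts above; an optional "> " quote prefix is tolerated.
LINE_RE = re.compile(
    r"^(?:>\s*)?(?P<level>[A-Z]+)\s+\[(?P<ts>[^\]]+)\]\s+"
    r"(?P<thread>.+?) - (?P<logger>\S+) - (?P<msg>.*)$"
)

# (lower-case substring, hypothetical event name). The first match wins.
# "linquishes leadership" also matches the "reqlinquishes" spelling in the log.
EVENTS = [
    ("session timed out", "SESSION_TIMEOUT"),
    ("expired", "SESSION_EXPIRED"),
    ("session establishment complete", "SESSION_ESTABLISHED"),
    ("is not leader", "NOT_LEADER"),
    ("linquishes leadership", "RELINQUISHED"),
    ("trying to acquire leadership", "ACQUIRE_ATTEMPT"),
    ("acquires leadership", "ACQUIRED"),
]


def leadership_timeline(lines):
    """Return (timestamp, event) pairs for session and leadership messages."""
    timeline = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a log line (e.g. a step description)
        msg = match.group("msg").lower()
        for needle, event in EVENTS:
            if needle in msg:
                ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
                timeline.append((ts, event))
                break
    return timeline


def find_anomalies(timeline):
    """Return events where the controller acts as a non-leader after acquiring."""
    holds_leadership = False
    anomalies = []
    for ts, event in timeline:
        if event == "ACQUIRED":
            holds_leadership = True
        elif event in ("RELINQUISHED", "SESSION_EXPIRED"):
            holds_leadership = False
        elif holds_leadership and event in ("NOT_LEADER", "ACQUIRE_ATTEMPT"):
            anomalies.append((ts, event))
    return anomalies
```

Assuming the attached leader_election.txt follows the same layout, passing its lines to leadership_timeline() and the result to find_anomalies() should list the 17:34:22,294 and 17:34:22,335 entries quoted above.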


