Posted to issues@zookeeper.apache.org by "Antonio Antonucci (Jira)" <ji...@apache.org> on 2022/03/06 14:00:00 UTC

[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501964#comment-17501964 ] 

Antonio Antonucci commented on ZOOKEEPER-832:
---------------------------------------------

Hi everybody, I am having a similar issue with kafka_2.13-3.1.0, which includes ZooKeeper 3.6.3 (revision 6401e4ad2087061bc6b9f80dec2d69f2e3c8660a). The OS is Ubuntu 21.10.

Two things to highlight:
 * 1) The issue I am having is:

INFO Refusing session request for client /192.168.1.163:42230 as it has seen zxid 0xa25 our last zxid is 0x23 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)

>> Initially the zookeeper datadir was configured as per default under /tmp.

>> After this error I moved the datadir to /opt/zookeeper, deleted the contents of the logs directory and rebooted the server.

>> Soon after rebooting, the error message was:

INFO Refusing session request for client /192.168.1.163:41738 as it has seen zxid 0xa25 our last zxid is 0x0 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)

>> However, once I started Kafka, the error went back to the original "... it has seen zxid 0xa25 our last zxid is 0x23 ..."

>> I don't know how to sort this out.
 * 2) I tried to run the ZkWorkarounderMultiThreaded Java file, but I am getting the following compilation errors:

ZkWorkarounderMultiThreaded.java
ZkWorkarounderMultiThreaded.java:40: error: cannot access ACL
    zk.create(path, data, Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
      ^
  class file for org.apache.zookeeper.data.ACL not found
ZkWorkarounderMultiThreaded.java:45: error: cannot access Stat
    zk.setData(path, data, zk.exists(path, true).getVersion());
                                    ^
  class file for org.apache.zookeeper.data.Stat not found
ZkWorkarounderMultiThreaded.java:58: error: cannot find symbol
    ExecutorService es = Executors.newCachedThreadPool();
    ^
  symbol:   class ExecutorService
  location: class ZkWorkarounderMultiThreaded
ZkWorkarounderMultiThreaded.java:58: error: cannot find symbol
    ExecutorService es = Executors.newCachedThreadPool();
                         ^
  symbol:   variable Executors
  location: class ZkWorkarounderMultiThreaded
4 errors
error: compilation failed

>> I have only included the following import lines in the Java code; I guess things have changed with ZooKeeper 3.6.3?

package core.framework.zookeeper;

import java.io.IOException;
import java.util.Date;
import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

import org.apache.zookeeper.KeeperException;

>> I really don't know how to sort this out.

>> Can somebody help me please?
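
The two "cannot find symbol" errors above name ExecutorService and Executors, which live in java.util.concurrent and do not appear in the import list quoted here. A minimal sketch of the imports those two errors seem to call for follows; the class name ExecutorImportSketch and the trivial task are hypothetical and are not part of ZkWorkarounderMultiThreaded. The "class file for org.apache.zookeeper.data.ACL not found" and "... Stat not found" errors look like a separate classpath problem (in recent ZooKeeper releases those generated classes are, as far as is known here, shipped in the separate zookeeper-jute jar), which this sketch does not address.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-alone class: it only shows that the two symbols the
// compiler could not find resolve once java.util.concurrent is imported.
class ExecutorImportSketch {

    // Submits one trivial task to a cached thread pool and returns its result.
    static int runTask() {
        ExecutorService es = Executors.newCachedThreadPool();
        try {
            Future<Integer> result = es.submit(() -> 42);
            return result.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException("task did not complete", e);
        } finally {
            // Let the JVM exit once the submitted task is done.
            es.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("task result: " + runTask());
    }
}
```

If the missing org.apache.zookeeper.data classes are indeed a classpath issue, compiling with every jar from the Kafka libs directory on the classpath would be one way to check that assumption.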

> Invalid session id causes infinite loop during automatic reconnect
> ------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-832
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.5, 3.5.0, 3.4.11
>         Environment: All
>            Reporter: Ryan Holmes
>            Assignee: Mohammad Arshad
>            Priority: Critical
>         Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch
>
>
> Steps to reproduce:
> 1.) Connect to a standalone server using the Java client.
> 2.) Stop the server.
> 3.) Delete the contents of the data directory (i.e. the persisted session data).
> 4.) Start the server.
> The client now automatically tries to reconnect but the server refuses the connection because the session id is invalid. The client and server are now in an infinite loop of attempted and rejected connections. While this situation represents a catastrophic failure and the current behavior is not incorrect, it appears that there is no way to detect this situation on the client and therefore no way to recover.
> The suggested improvement is to send an event to the default watcher indicating that the current state is "session invalid", similar to how the "session expired" state is handled.
> Server log output (repeats indefinitely):
> 2010-08-05 11:48:08,283 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@250] - Accepted socket connection from /127.0.0.1:63292
> 2010-08-05 11:48:08,284 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@751] - Refusing session request for client /127.0.0.1:63292 as it has seen zxid 0x44 our last zxid is 0x0 client must try another server
> 2010-08-05 11:48:08,284 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1434] - Closed socket connection for client /127.0.0.1:63292 (no session established for client)
> Client log output (repeats indefinitely):
> 11:47:17 org.apache.zookeeper.ClientCnxn startConnect INFO line 1000 - Opening socket connection to server localhost/127.0.0.1:2181
> 11:47:17 org.apache.zookeeper.ClientCnxn run WARN line 1120 - Session 0x12a3ae4e893000a for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
> 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1167 - Ignoring exception during shutdown input
> java.nio.channels.ClosedChannelException
> 	at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
> 	at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1164)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)
> 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1174 - Ignoring exception during shutdown output
> java.nio.channels.ClosedChannelException
> 	at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
> 	at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1171)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)
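
The "Refusing session request" lines in the server logs above can be read as a comparison between the highest zxid the client reports having seen and the last zxid the server has processed. The following is a minimal sketch of that comparison in plain Java, not ZooKeeper's actual code; the class and method names are hypothetical, and only the zxid values are taken from the log lines in this thread.

```java
// Hypothetical model of the check behind the "Refusing session request" log line.
class ZxidCheckSketch {

    // True when the client has seen a newer zxid than the server has processed,
    // which is the situation the log message describes.
    static boolean refusesSession(long clientLastZxidSeen, long serverLastZxid) {
        return clientLastZxidSeen > serverLastZxid;
    }

    public static void main(String[] args) {
        // Original report: data directory deleted, so the server restarts at 0x0.
        System.out.println(refusesSession(0x44L, 0x0L));
        // Values from the comment: client at 0xa25, server at 0x23.
        System.out.println(refusesSession(0xa25L, 0x23L));
        // A client that is not ahead of the server passes this check.
        System.out.println(refusesSession(0x23L, 0x23L));
    }
}
```

Under this reading, a client that keeps an old session state will be refused on every reconnect for as long as the server's zxid stays below the client's, which matches the repeating log output.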


