Posted to dev@geode.apache.org by Leon Finker <le...@gmail.com> on 2021/03/04 19:19:36 UTC

Client Acceptor pool hangs

Hi,

We are hitting the following deadlock fairly often on Geode 1.13.1:

1. Client (bridge) acceptor thread is locked up in this stack

"Handshaker 0.0.0.0/0.0.0.0:40011 Thread 2" #219 daemon prio=5 os_prio=0 tid=0x00007f755c007000 nid=0x44a2 runnable [0x00007f75847c7000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:170)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.net.SocketInputStream.read(SocketInputStream.java:223)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.getCommunicationModeForNonSelector(AcceptorImpl.java:1559)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.handleNewClientConnection(AcceptorImpl.java:1430)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$handOffNewClientConnection$4(AcceptorImpl.java:1341)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$407/2146094985.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

2. The 4 Handshaker threads for that pool are stuck in this stack
"Handshaker 0.0.0.0/0.0.0.0:40011 Thread 2" #219 daemon prio=5 os_prio=0 tid=0x00007f755c007000 nid=0x44a2 runnable [0x00007f75847c7000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:170)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at java.net.SocketInputStream.read(SocketInputStream.java:223)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.getCommunicationModeForNonSelector(AcceptorImpl.java:1559)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.handleNewClientConnection(AcceptorImpl.java:1430)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl.lambda$handOffNewClientConnection$4(AcceptorImpl.java:1341)
        at org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$$Lambda$407/2146094985.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Is there a reason why no socket read timeout is set here?
private CommunicationMode getCommunicationModeForNonSelector(Socket socket) throws IOException {
    socket.setSoTimeout(0);
    socketCreator.forCluster().handshakeIfSocketIsSSL(socket, acceptTimeout);
    byte communicationModeByte = (byte) socket.getInputStream().read();

Once all 4 Handshaker threads are blocked in this read, no new client
connections to the server can be processed. Why not set a read timeout?
For some reason it is explicitly set to 0 (infinite).
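
To illustrate what I mean, here is a minimal standalone sketch. It is not Geode code: the class ReadTimeoutSketch, the method readFirstByte, the TIMED_OUT sentinel, and the 200 ms value are all made up for the example. It only shows that a nonzero SO_TIMEOUT bounds the first read when a client connects and then never sends anything. What the right timeout and error handling would be inside AcceptorImpl is a separate question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Standalone illustration only; none of these names exist in Geode.
class ReadTimeoutSketch {
    // Hypothetical sentinel meaning "the client sent nothing in time".
    static final int TIMED_OUT = -2;

    // Reads the first byte from the socket, waiting at most timeoutMillis.
    // With setSoTimeout(0) this read would block until the peer sends
    // data or closes the connection.
    static int readFirstByte(Socket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);
        try {
            return socket.getInputStream().read();
        } catch (SocketTimeoutException e) {
            return TIMED_OUT;
        }
    }

    // Connects a client that never writes, then tries to read its first byte.
    static int demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket silentClient = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            return readFirstByte(accepted, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo() == TIMED_OUT ? "read timed out" : "got a byte");
    }
}
```

With a bounded read like this, the worker thread is released after the timeout and can pick up the next connection, so a silent client cannot hold a pool thread indefinitely.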

Thank you