You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@river.apache.org by Christopher Dolan <ch...@avid.com> on 2011/01/17 18:16:33 UTC

Jeri mux livelock

We've captured a stacktrace from the field where it looks like the Jeri
ConnectionManager has locked up trying to start a new Mux connection.
I'm wondering 1) if anyone else has seen something like this (is this
the same as RIVER-254?); 2) if my analysis looks correct; and 3) if
anyone sees a possible solution.

The excerpt below comes from jstacktrace on River 2.1.0.  My analysis is
that:

 * UMMDTableNotifs-thread-1 is blocked waiting for the ConnectionManager
 * ConnectionManager is blocked waiting for the outbound connection
 * UMMDTableNotifs-thread-2 is trying to establish the outbound
connection and is waiting on the network.  Notably the code where this
thread is waiting says in Mux.start():

		while (!muxDown && !clientConnectionReady) {
		    try {
			muxLock.wait();		// REMIND: timeout?
		    } catch (InterruptedException e) {
			setDown("interrupt waiting for connection
header", e);
		    }
		}

 * the mux reader is blocked reading from the network socket.  (this
uses StreamConnectionIO instead of SocketChannelConnectionIO)


This appears to be similar to
https://issues.apache.org/jira/browse/RIVER-254 but some of the details
are different, so I'm not sure if it's the same bug.


Open lock chains
================
Chain 1:
"UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 waiting for
net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8 held by:
"(JSK)
ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 waiting for
net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
0 held by:
"UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 (waiting on
notification)


"(JSK)
ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 prio=5 alive, in native,
blocked, daemon
    -- Blocked trying to get lock:
net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
0[thin lock]
    at jrockit/vm/Threads.sleep(I)V(Native Method)
    at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
    at
jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
    at
jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
    at jrockit/vm/Locks.monitorEnter(Locks.java:2439)[optimized]
    at
net/jini/jeri/connection/ConnectionManager$OutboundMux.checkIdle(Connect
ionManager.java:377)
    at
net/jini/jeri/connection/ConnectionManager.checkIdle(ConnectionManager.j
ava:256)
    ^-- Holding lock:
net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[recursive]
    at
net/jini/jeri/connection/ConnectionManager$Reaper.run(ConnectionManager.
java:572)
    ^-- Holding lock:
net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
    at
com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
]
    at java/lang/Thread.run(Thread.java:619)[optimized]
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace


"UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 prio=5 alive, in
native, blocked, daemon
    -- Blocked trying to get lock:
net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
    at jrockit/vm/Threads.sleep(I)V(Native Method)
    at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
    at
jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
    at
jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
    at
net/jini/jeri/connection/ConnectionManager.connectPending(ConnectionMana
ger.java:176)[inlined]
    at
net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
nager.java:629)[optimized]
    ^-- Holding lock:
net/jini/jeri/connection/ConnectionManager$ReqIterator@0x000000020CBA800
0[biased lock]
    at
net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
    at
net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
ionHandler.java:708)[optimized]
    at
net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
andler.java:659)[optimized]
    at
net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
528)
    at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
Source)
    at
com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
MMDRemoteEvent;)V(Unknown Source)
    at
com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
    at
com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
)V(Unknown Source)
    at
java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
r.java:885)[optimized]
    at
java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
va:907)
    at java/lang/Thread.run(Thread.java:619)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace


"UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 prio=5 alive, in
native, waiting, daemon
    -- Waiting for notification on:
java/lang/Object@0x0000000204146898[fat lock]
    at
jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native
Method)
    at jrockit/vm/Locks.wait(Locks.java:2229)[inlined]
    at java/lang/Object.wait(Object.java:485)[optimized]
    at com/sun/jini/jeri/internal/mux/Mux.start(Mux.java:207)
    ^-- Lock released while waiting:
java/lang/Object@0x0000000204146898[fat lock]
    at
net/jini/jeri/connection/ConnectionManager$OutboundMux.newRequest(Connec
tionManager.java:356)[inlined]
    at
net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
nager.java:631)[optimized]
    ^-- Holding lock:
net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
0[thin lock]
    ^-- Holding lock:
net/jini/jeri/connection/ConnectionManager$ReqIterator@0x0000000204147BF
8[biased lock]
    at
net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
    at
net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
ionHandler.java:708)[optimized]
    at
net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
andler.java:659)[optimized]
    at
net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
528)
    at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
Source)
    at
com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
MMDRemoteEvent;)V(Unknown Source)
    at
com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
    at
com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
)V(Unknown Source)
    at
java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
r.java:885)[optimized]
    at
java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
va:907)
    at java/lang/Thread.run(Thread.java:619)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace

"(JSK) mux reader" id=175467 idx=0x470 tid=4700 prio=5 alive, in native,
daemon
    at
jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII
)I(Native Method)
    at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:46)
    at
java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(S
ocketInputStream.java)
    at java/net/SocketInputStream.read(SocketInputStream.java:129)
    at
com/sun/jini/jeri/internal/mux/StreamConnectionIO$1.read(StreamConnectio
nIO.java:356)[optimized]
    at
com/sun/jini/jeri/internal/mux/StreamConnectionIO$Reader.run(StreamConne
ctionIO.java:263)[optimized]
    at
com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
]
    at java/lang/Thread.run(Thread.java:619)[optimized]
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace


Re: Jeri mux livelock

Posted by Gregg Wonderly <ge...@cox.net>.
In particular, on windows, there are limits on the number of live socket 
connections that can be established on desktop OSes.  The behavior of windows is 
to send TCP RST back immediately so you see a "connection refused" message 
instead of the "tcp queue length exceeded" behavior of never replying.  Not sure 
that this is what you are seeing, but I just like to remind people on the list 
of this issue when windows problems come up to help people understand why it can 
appear that an endpoint has disappeared, when in reality it may still be there 
when you do a netstat.

Gregg Wonderly

On 1/19/2011 2:50 AM, Dan Creswell wrote:
> Hey,
>
> Could you tell us what JVM and OS you're running on? It looks like JRockit,
> I'm guessing it's on Windows?
>
> Don't suppose you've reproduced the problem on any other JDK as well?
>
> If it's possible, could you say a little about what goes on in the class:
> com/avid/ummd/table/UMMDTableEventListener
>
> Seems like it's getting a remote event from something and then immediately
> invoking on some other service or is it the same service as generated the
> event or?
>
> And is there any chance you could run a netstat so we can see what was
> connected to what, how many connections there are etc?
>
> Cheers,
>
> Dan.
>
> On 17 January 2011 17:16, Christopher Dolan<ch...@avid.com>wrote:
>
>> We've captured a stacktrace from the field where it looks like the Jeri
>> ConnectionManager has locked up trying to start a new Mux connection.
>> I'm wondering 1) if anyone else has seen something like this (is this
>> the same as RIVER-254?); 2) if my analysis looks correct; and 3) if
>> anyone sees a possible solution.
>>
>> The excerpt below comes from jstacktrace on River 2.1.0.  My analysis is
>> that:
>>
>>   * UMMDTableNotifs-thread-1 is blocked waiting for the ConnectionManager
>>   * ConnectionManager is blocked waiting for the outbound connection
>>   * UMMDTableNotifs-thread-2 is trying to establish the outbound
>> connection and is waiting on the network.  Notably the code where this
>> thread is waiting says in Mux.start():
>>
>>                 while (!muxDown&&  !clientConnectionReady) {
>>                     try {
>>                         muxLock.wait();         // REMIND: timeout?
>>                     } catch (InterruptedException e) {
>>                         setDown("interrupt waiting for connection
>> header", e);
>>                     }
>>                 }
>>
>>   * the mux reader is blocked reading from the network socket.  (this
>> uses StreamConnectionIO instead of SocketChannelConnectionIO)
>>
>>
>> This appears to be similar to
>> https://issues.apache.org/jira/browse/RIVER-254 but some of the details
>> are different, so I'm not sure if it's the same bug.
>>
>>
>> Open lock chains
>> ================
>> Chain 1:
>> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 waiting for
>> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8 held by:
>> "(JSK)
>> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
>> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 waiting for
>> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
>> 0 held by:
>> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 (waiting on
>> notification)
>>
>>
>> "(JSK)
>> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
>> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 prio=5 alive, in native,
>> blocked, daemon
>>     -- Blocked trying to get lock:
>> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
>> 0[thin lock]
>>     at jrockit/vm/Threads.sleep(I)V(Native Method)
>>     at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>>     at
>> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>>     at
>> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>>     at jrockit/vm/Locks.monitorEnter(Locks.java:2439)[optimized]
>>     at
>> net/jini/jeri/connection/ConnectionManager$OutboundMux.checkIdle(Connect
>> ionManager.java:377)
>>     at
>> net/jini/jeri/connection/ConnectionManager.checkIdle(ConnectionManager.j
>> ava:256)
>>     ^-- Holding lock:
>> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[recursive]
>>     at
>> net/jini/jeri/connection/ConnectionManager$Reaper.run(ConnectionManager.
>> java:572)
>>     ^-- Holding lock:
>> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>>     at
>> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
>> ]
>>     at java/lang/Thread.run(Thread.java:619)[optimized]
>>     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>>     -- end of trace
>>
>>
>> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 prio=5 alive, in
>> native, blocked, daemon
>>     -- Blocked trying to get lock:
>> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>>     at jrockit/vm/Threads.sleep(I)V(Native Method)
>>     at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>>     at
>> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>>     at
>> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>>     at
>> net/jini/jeri/connection/ConnectionManager.connectPending(ConnectionMana
>> ger.java:176)[inlined]
>>     at
>> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
>> nager.java:629)[optimized]
>>     ^-- Holding lock:
>> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x000000020CBA800
>> 0[biased lock]
>>     at
>> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>>     at
>> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
>> ionHandler.java:708)[optimized]
>>     at
>> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
>> andler.java:659)[optimized]
>>     at
>> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
>> 528)
>>     at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
>> Source)
>>     at
>> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
>> MMDRemoteEvent;)V(Unknown Source)
>>     at
>> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
>> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>>     at
>> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
>> )V(Unknown Source)
>>     at
>> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
>> r.java:885)[optimized]
>>     at
>> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
>> va:907)
>>     at java/lang/Thread.run(Thread.java:619)
>>     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>>     -- end of trace
>>
>>
>> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 prio=5 alive, in
>> native, waiting, daemon
>>     -- Waiting for notification on:
>> java/lang/Object@0x0000000204146898[fat lock]
>>     at
>> jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native
>> Method)
>>     at jrockit/vm/Locks.wait(Locks.java:2229)[inlined]
>>     at java/lang/Object.wait(Object.java:485)[optimized]
>>     at com/sun/jini/jeri/internal/mux/Mux.start(Mux.java:207)
>>     ^-- Lock released while waiting:
>> java/lang/Object@0x0000000204146898[fat lock]
>>     at
>> net/jini/jeri/connection/ConnectionManager$OutboundMux.newRequest(Connec
>> tionManager.java:356)[inlined]
>>     at
>> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
>> nager.java:631)[optimized]
>>     ^-- Holding lock:
>> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
>> 0[thin lock]
>>     ^-- Holding lock:
>> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x0000000204147BF
>> 8[biased lock]
>>     at
>> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>>     at
>> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
>> ionHandler.java:708)[optimized]
>>     at
>> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
>> andler.java:659)[optimized]
>>     at
>> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
>> 528)
>>     at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
>> Source)
>>     at
>> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
>> MMDRemoteEvent;)V(Unknown Source)
>>     at
>> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
>> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>>     at
>> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
>> )V(Unknown Source)
>>     at
>> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
>> r.java:885)[optimized]
>>     at
>> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
>> va:907)
>>     at java/lang/Thread.run(Thread.java:619)
>>     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>>     -- end of trace
>>
>> "(JSK) mux reader" id=175467 idx=0x470 tid=4700 prio=5 alive, in native,
>> daemon
>>     at
>> jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII
>> )I(Native Method)
>>     at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:46)
>>     at
>> java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(S
>> ocketInputStream.java)
>>     at java/net/SocketInputStream.read(SocketInputStream.java:129)
>>     at
>> com/sun/jini/jeri/internal/mux/StreamConnectionIO$1.read(StreamConnectio
>> nIO.java:356)[optimized]
>>     at
>> com/sun/jini/jeri/internal/mux/StreamConnectionIO$Reader.run(StreamConne
>> ctionIO.java:263)[optimized]
>>     at
>> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
>> ]
>>     at java/lang/Thread.run(Thread.java:619)[optimized]
>>     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>>     -- end of trace
>>
>>
>


Re: Jeri mux livelock

Posted by Dan Creswell <da...@gmail.com>.
Hey,

Could you tell us what JVM and OS you're running on? It looks like JRockit,
I'm guessing it's on Windows?

Don't suppose you've reproduced the problem on any other JDK as well?

If it's possible, could you say a little about what goes on in the class:
com/avid/ummd/table/UMMDTableEventListener

Seems like it's getting a remote event from something and then immediately
invoking on some other service or is it the same service as generated the
event or?

And is there any chance you could run a netstat so we can see what was
connected to what, how many connections there are etc?

Cheers,

Dan.

On 17 January 2011 17:16, Christopher Dolan <ch...@avid.com>wrote:

> We've captured a stacktrace from the field where it looks like the Jeri
> ConnectionManager has locked up trying to start a new Mux connection.
> I'm wondering 1) if anyone else has seen something like this (is this
> the same as RIVER-254?); 2) if my analysis looks correct; and 3) if
> anyone sees a possible solution.
>
> The excerpt below comes from jstacktrace on River 2.1.0.  My analysis is
> that:
>
>  * UMMDTableNotifs-thread-1 is blocked waiting for the ConnectionManager
>  * ConnectionManager is blocked waiting for the outbound connection
>  * UMMDTableNotifs-thread-2 is trying to establish the outbound
> connection and is waiting on the network.  Notably the code where this
> thread is waiting says in Mux.start():
>
>                while (!muxDown && !clientConnectionReady) {
>                    try {
>                        muxLock.wait();         // REMIND: timeout?
>                    } catch (InterruptedException e) {
>                        setDown("interrupt waiting for connection
> header", e);
>                    }
>                }
>
>  * the mux reader is blocked reading from the network socket.  (this
> uses StreamConnectionIO instead of SocketChannelConnectionIO)
>
>
> This appears to be similar to
> https://issues.apache.org/jira/browse/RIVER-254 but some of the details
> are different, so I'm not sure if it's the same bug.
>
>
> Open lock chains
> ================
> Chain 1:
> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 waiting for
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8 held by:
> "(JSK)
> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 waiting for
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0 held by:
> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 (waiting on
> notification)
>
>
> "(JSK)
> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 prio=5 alive, in native,
> blocked, daemon
>    -- Blocked trying to get lock:
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0[thin lock]
>    at jrockit/vm/Threads.sleep(I)V(Native Method)
>    at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>    at jrockit/vm/Locks.monitorEnter(Locks.java:2439)[optimized]
>    at
> net/jini/jeri/connection/ConnectionManager$OutboundMux.checkIdle(Connect
> ionManager.java:377)
>    at
> net/jini/jeri/connection/ConnectionManager.checkIdle(ConnectionManager.j
> ava:256)
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[recursive]
>    at
> net/jini/jeri/connection/ConnectionManager$Reaper.run(ConnectionManager.
> java:572)
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>    at
> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
> ]
>    at java/lang/Thread.run(Thread.java:619)[optimized]
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>
> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 prio=5 alive, in
> native, blocked, daemon
>    -- Blocked trying to get lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>    at jrockit/vm/Threads.sleep(I)V(Native Method)
>    at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>    at
> net/jini/jeri/connection/ConnectionManager.connectPending(ConnectionMana
> ger.java:176)[inlined]
>    at
> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
> nager.java:629)[optimized]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x000000020CBA800
> 0[biased lock]
>    at
> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
> ionHandler.java:708)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
> andler.java:659)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
> 528)
>    at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
> Source)
>    at
> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
> MMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
> )V(Unknown Source)
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
> r.java:885)[optimized]
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:907)
>    at java/lang/Thread.run(Thread.java:619)
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>
> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 prio=5 alive, in
> native, waiting, daemon
>    -- Waiting for notification on:
> java/lang/Object@0x0000000204146898[fat lock]
>    at
> jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native
> Method)
>    at jrockit/vm/Locks.wait(Locks.java:2229)[inlined]
>    at java/lang/Object.wait(Object.java:485)[optimized]
>    at com/sun/jini/jeri/internal/mux/Mux.start(Mux.java:207)
>    ^-- Lock released while waiting:
> java/lang/Object@0x0000000204146898[fat lock]
>    at
> net/jini/jeri/connection/ConnectionManager$OutboundMux.newRequest(Connec
> tionManager.java:356)[inlined]
>    at
> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
> nager.java:631)[optimized]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0[thin lock]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x0000000204147BF
> 8[biased lock]
>    at
> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
> ionHandler.java:708)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
> andler.java:659)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
> 528)
>    at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
> Source)
>    at
> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
> MMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
> )V(Unknown Source)
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
> r.java:885)[optimized]
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:907)
>    at java/lang/Thread.run(Thread.java:619)
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
> "(JSK) mux reader" id=175467 idx=0x470 tid=4700 prio=5 alive, in native,
> daemon
>    at
> jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII
> )I(Native Method)
>    at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:46)
>    at
> java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(S
> ocketInputStream.java)
>    at java/net/SocketInputStream.read(SocketInputStream.java:129)
>    at
> com/sun/jini/jeri/internal/mux/StreamConnectionIO$1.read(StreamConnectio
> nIO.java:356)[optimized]
>    at
> com/sun/jini/jeri/internal/mux/StreamConnectionIO$Reader.run(StreamConne
> ctionIO.java:263)[optimized]
>    at
> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
> ]
>    at java/lang/Thread.run(Thread.java:619)[optimized]
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>

Re: Jeri mux livelock

Posted by Patricia Shanahan <pa...@acm.org>.
Thanks for the input. I need to finish off a previous bug fix, but once 
I've done that I'll look into this one, unless someone else picks it up 
first.

Meanwhile, I'd like to get as many clues as possible. I don't suppose 
you can reproduce it, or extract a test case, but any information about 
the type of workload and other conditions might be useful.

Patricia


Christopher Dolan wrote:
> We've captured a stacktrace from the field where it looks like the Jeri
> ConnectionManager has locked up trying to start a new Mux connection.
> I'm wondering 1) if anyone else has seen something like this (is this
> the same as RIVER-254?); 2) if my analysis looks correct; and 3) if
> anyone sees a possible solution.
> 
> The excerpt below comes from jstacktrace on River 2.1.0.  My analysis is
> that:
> 
>  * UMMDTableNotifs-thread-1 is blocked waiting for the ConnectionManager
>  * ConnectionManager is blocked waiting for the outbound connection
>  * UMMDTableNotifs-thread-2 is trying to establish the outbound
> connection and is waiting on the network.  Notably the code where this
> thread is waiting says in Mux.start():
> 
> 		while (!muxDown && !clientConnectionReady) {
> 		    try {
> 			muxLock.wait();		// REMIND: timeout?
> 		    } catch (InterruptedException e) {
> 			setDown("interrupt waiting for connection
> header", e);
> 		    }
> 		}
> 
>  * the mux reader is blocked reading from the network socket.  (this
> uses StreamConnectionIO instead of SocketChannelConnectionIO)
> 
> 
> This appears to be similar to
> https://issues.apache.org/jira/browse/RIVER-254 but some of the details
> are different, so I'm not sure if it's the same bug.
> 
> 
> Open lock chains
> ================
> Chain 1:
> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 waiting for
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8 held by:
> "(JSK)
> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 waiting for
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0 held by:
> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 (waiting on
> notification)
> 
> 
> "(JSK)
> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 prio=5 alive, in native,
> blocked, daemon
>     -- Blocked trying to get lock:
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0[thin lock]
>     at jrockit/vm/Threads.sleep(I)V(Native Method)
>     at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>     at
> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>     at
> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>     at jrockit/vm/Locks.monitorEnter(Locks.java:2439)[optimized]
>     at
> net/jini/jeri/connection/ConnectionManager$OutboundMux.checkIdle(Connect
> ionManager.java:377)
>     at
> net/jini/jeri/connection/ConnectionManager.checkIdle(ConnectionManager.j
> ava:256)
>     ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[recursive]
>     at
> net/jini/jeri/connection/ConnectionManager$Reaper.run(ConnectionManager.
> java:572)
>     ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>     at
> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
> ]
>     at java/lang/Thread.run(Thread.java:619)[optimized]
>     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>     -- end of trace
> 
> 
> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 prio=5 alive, in
> native, blocked, daemon
>     -- Blocked trying to get lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>     at jrockit/vm/Threads.sleep(I)V(Native Method)
>     at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>     at
> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>     at
> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>     at
> net/jini/jeri/connection/ConnectionManager.connectPending(ConnectionMana
> ger.java:176)[inlined]
>     at
> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
> nager.java:629)[optimized]
>     ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x000000020CBA800
> 0[biased lock]
>     at
> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>     at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
> ionHandler.java:708)[optimized]
>     at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
> andler.java:659)[optimized]
>     at
> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
> 528)
>     at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
> Source)
>     at
> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
> MMDRemoteEvent;)V(Unknown Source)
>     at
> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>     at
> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
> )V(Unknown Source)
>     at
> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
> r.java:885)[optimized]
>     at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:907)
>     at java/lang/Thread.run(Thread.java:619)
>     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>     -- end of trace
> 
> 
> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 prio=5 alive, in
> native, waiting, daemon
>     -- Waiting for notification on:
> java/lang/Object@0x0000000204146898[fat lock]
>     at
> jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native
> Method)
>     at jrockit/vm/Locks.wait(Locks.java:2229)[inlined]
>     at java/lang/Object.wait(Object.java:485)[optimized]
>     at com/sun/jini/jeri/internal/mux/Mux.start(Mux.java:207)
>     ^-- Lock released while waiting:
> java/lang/Object@0x0000000204146898[fat lock]
>     at
> net/jini/jeri/connection/ConnectionManager$OutboundMux.newRequest(Connec
> tionManager.java:356)[inlined]
>     at
> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
> nager.java:631)[optimized]
>     ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0[thin lock]
>     ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x0000000204147BF
> 8[biased lock]
>     at
> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>     at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
> ionHandler.java:708)[optimized]
>     at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
> andler.java:659)[optimized]
>     at
> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
> 528)
>     at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
> Source)
>     at
> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
> MMDRemoteEvent;)V(Unknown Source)
>     at
> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>     at
> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
> )V(Unknown Source)
>     at
> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
> r.java:885)[optimized]
>     at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:907)
>     at java/lang/Thread.run(Thread.java:619)
>     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>     -- end of trace
> 
> "(JSK) mux reader" id=175467 idx=0x470 tid=4700 prio=5 alive, in native,
> daemon
>     at
> jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII
> )I(Native Method)
>     at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:46)
>     at
> java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(S
> ocketInputStream.java)
>     at java/net/SocketInputStream.read(SocketInputStream.java:129)
>     at
> com/sun/jini/jeri/internal/mux/StreamConnectionIO$1.read(StreamConnectio
> nIO.java:356)[optimized]
>     at
> com/sun/jini/jeri/internal/mux/StreamConnectionIO$Reader.run(StreamConne
> ctionIO.java:263)[optimized]
>     at
> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
> ]
>     at java/lang/Thread.run(Thread.java:619)[optimized]
>     at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>     -- end of trace
> 
> 


RE: Jeri mux livelock

Posted by Christopher Dolan <ch...@avid.com>.
Thanks, Peter.  This is exactly the answer I was hoping to see.  :-)  I won't bother the list again until I can get that product  upgraded to a more recent River and/or I can find a way to reproduce the problem in-house.

Chris

-----Original Message-----
From: Peter Jones [mailto:pcj@roundroom.net] 
Sent: Thursday, January 20, 2011 11:29 PM
To: river-user@incubator.apache.org
Subject: Re: Jeri mux livelock

This particular blocked thread dependency graph should not occur with
the RIVER-254 fix, because ConnectionManager.OutboundMux.checkIdle no
longer synchronizes on the OutboundMux instance.  Also note that the
fix for RIVER-229/242 changed the ConnectionManager.Reaper code a bit.

But you would still probably see thread "UMMDTableNotifs-thread-1"
waiting for thread "UMMDTableNotifs-thread-2" to unblock, however,
because they both want a connection to the same endpoint, so the
former is waiting on the latter's success (to share the connection) or
failure (to try connecting again) in doing so.

So I think that the most important question is why
"UMMDTableNotifs-thread-2" is blocking.  It's waiting for the server
side to respond with its header for the initial connection handshake
(the underlying TCP connection has been successfully established).
There should really be a configurable timeout for this wait (which
occurs in com.sun.jini.jeri.internal.mux.Mux.start)-- I can't remember
if there is already a bug filed for that.

-- Peter


On Mon, Jan 17, 2011 at 12:16 PM, Christopher Dolan
<ch...@avid.com> wrote:
> We've captured a stacktrace from the field where it looks like the Jeri
> ConnectionManager has locked up trying to start a new Mux connection.
> I'm wondering 1) if anyone else has seen something like this (is this
> the same as RIVER-254?); 2) if my analysis looks correct; and 3) if
> anyone sees a possible solution.
>
> The excerpt below comes from jstacktrace on River 2.1.0.  My analysis is
> that:
>
>  * UMMDTableNotifs-thread-1 is blocked waiting for the ConnectionManager
>  * ConnectionManager is blocked waiting for the outbound connection
>  * UMMDTableNotifs-thread-2 is trying to establish the outbound
> connection and is waiting on the network.  Notably the code where this
> thread is waiting says in Mux.start():
>
>                while (!muxDown && !clientConnectionReady) {
>                    try {
>                        muxLock.wait();         // REMIND: timeout?
>                    } catch (InterruptedException e) {
>                        setDown("interrupt waiting for connection
> header", e);
>                    }
>                }
>
>  * the mux reader is blocked reading from the network socket.  (this
> uses StreamConnectionIO instead of SocketChannelConnectionIO)
>
>
> This appears to be similar to
> https://issues.apache.org/jira/browse/RIVER-254 but some of the details
> are different, so I'm not sure if it's the same bug.
>
>
> Open lock chains
> ================
> Chain 1:
> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 waiting for
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8 held by:
> "(JSK)
> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 waiting for
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0 held by:
> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 (waiting on
> notification)
>
>
> "(JSK)
> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 prio=5 alive, in native,
> blocked, daemon
>    -- Blocked trying to get lock:
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0[thin lock]
>    at jrockit/vm/Threads.sleep(I)V(Native Method)
>    at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>    at jrockit/vm/Locks.monitorEnter(Locks.java:2439)[optimized]
>    at
> net/jini/jeri/connection/ConnectionManager$OutboundMux.checkIdle(Connect
> ionManager.java:377)
>    at
> net/jini/jeri/connection/ConnectionManager.checkIdle(ConnectionManager.j
> ava:256)
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[recursive]
>    at
> net/jini/jeri/connection/ConnectionManager$Reaper.run(ConnectionManager.
> java:572)
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>    at
> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
> ]
>    at java/lang/Thread.run(Thread.java:619)[optimized]
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>
> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 prio=5 alive, in
> native, blocked, daemon
>    -- Blocked trying to get lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>    at jrockit/vm/Threads.sleep(I)V(Native Method)
>    at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>    at
> net/jini/jeri/connection/ConnectionManager.connectPending(ConnectionMana
> ger.java:176)[inlined]
>    at
> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
> nager.java:629)[optimized]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x000000020CBA800
> 0[biased lock]
>    at
> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
> ionHandler.java:708)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
> andler.java:659)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
> 528)
>    at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
> Source)
>    at
> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
> MMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
> )V(Unknown Source)
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
> r.java:885)[optimized]
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:907)
>    at java/lang/Thread.run(Thread.java:619)
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>
> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 prio=5 alive, in
> native, waiting, daemon
>    -- Waiting for notification on:
> java/lang/Object@0x0000000204146898[fat lock]
>    at
> jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native
> Method)
>    at jrockit/vm/Locks.wait(Locks.java:2229)[inlined]
>    at java/lang/Object.wait(Object.java:485)[optimized]
>    at com/sun/jini/jeri/internal/mux/Mux.start(Mux.java:207)
>    ^-- Lock released while waiting:
> java/lang/Object@0x0000000204146898[fat lock]
>    at
> net/jini/jeri/connection/ConnectionManager$OutboundMux.newRequest(Connec
> tionManager.java:356)[inlined]
>    at
> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
> nager.java:631)[optimized]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0[thin lock]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x0000000204147BF
> 8[biased lock]
>    at
> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
> ionHandler.java:708)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
> andler.java:659)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
> 528)
>    at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
> Source)
>    at
> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
> MMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
> )V(Unknown Source)
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
> r.java:885)[optimized]
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:907)
>    at java/lang/Thread.run(Thread.java:619)
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
> "(JSK) mux reader" id=175467 idx=0x470 tid=4700 prio=5 alive, in native,
> daemon
>    at
> jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII
> )I(Native Method)
>    at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:46)
>    at
> java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(S
> ocketInputStream.java)
>    at java/net/SocketInputStream.read(SocketInputStream.java:129)
>    at
> com/sun/jini/jeri/internal/mux/StreamConnectionIO$1.read(StreamConnectio
> nIO.java:356)[optimized]
>    at
> com/sun/jini/jeri/internal/mux/StreamConnectionIO$Reader.run(StreamConne
> ctionIO.java:263)[optimized]
>    at
> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
> ]
>    at java/lang/Thread.run(Thread.java:619)[optimized]
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>

Re: Jeri mux livelock

Posted by Peter Jones <pc...@roundroom.net>.
This particular blocked thread dependency graph should not occur with
the RIVER-254 fix, because ConnectionManager.OutboundMux.checkIdle no
longer synchronizes on the OutboundMux instance.  Also note that the
fix for RIVER-229/242 changed the ConnectionManager.Reaper code a bit.

But you would still probably see thread "UMMDTableNotifs-thread-1"
waiting for thread "UMMDTableNotifs-thread-2" to unblock, however,
because they both want a connection to the same endpoint, so the
former is waiting on the latter's success (to share the connection) or
failure (to try connecting again) in doing so.

So I think that the most important question is why
"UMMDTableNotifs-thread-2" is blocking.  It's waiting for the server
side to respond with its header for the initial connection handshake
(the underlying TCP connection has been successfully established).
There should really be a configurable timeout for this wait (which
occurs in com.sun.jini.jeri.internal.mux.Mux.start)-- I can't remember
if there is already a bug filed for that.

-- Peter


On Mon, Jan 17, 2011 at 12:16 PM, Christopher Dolan
<ch...@avid.com> wrote:
> We've captured a stacktrace from the field where it looks like the Jeri
> ConnectionManager has locked up trying to start a new Mux connection.
> I'm wondering 1) if anyone else has seen something like this (is this
> the same as RIVER-254?); 2) if my analysis looks correct; and 3) if
> anyone sees a possible solution.
>
> The excerpt below comes from jstacktrace on River 2.1.0.  My analysis is
> that:
>
>  * UMMDTableNotifs-thread-1 is blocked waiting for the ConnectionManager
>  * ConnectionManager is blocked waiting for the outbound connection
>  * UMMDTableNotifs-thread-2 is trying to establish the outbound
> connection and is waiting on the network.  Notably the code where this
> thread is waiting says in Mux.start():
>
>                while (!muxDown && !clientConnectionReady) {
>                    try {
>                        muxLock.wait();         // REMIND: timeout?
>                    } catch (InterruptedException e) {
>                        setDown("interrupt waiting for connection
> header", e);
>                    }
>                }
>
>  * the mux reader is blocked reading from the network socket.  (this
> uses StreamConnectionIO instead of SocketChannelConnectionIO)
>
>
> This appears to be similar to
> https://issues.apache.org/jira/browse/RIVER-254 but some of the details
> are different, so I'm not sure if it's the same bug.
>
>
> Open lock chains
> ================
> Chain 1:
> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 waiting for
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8 held by:
> "(JSK)
> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 waiting for
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0 held by:
> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 (waiting on
> notification)
>
>
> "(JSK)
> ConnectionManager[net.jini.jeri.tcp.TcpEndpoint$ConnectionEndpointImpl@3
> a9c1504].Reaper" id=168278 idx=0x6a4 tid=4152 prio=5 alive, in native,
> blocked, daemon
>    -- Blocked trying to get lock:
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0[thin lock]
>    at jrockit/vm/Threads.sleep(I)V(Native Method)
>    at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>    at jrockit/vm/Locks.monitorEnter(Locks.java:2439)[optimized]
>    at
> net/jini/jeri/connection/ConnectionManager$OutboundMux.checkIdle(Connect
> ionManager.java:377)
>    at
> net/jini/jeri/connection/ConnectionManager.checkIdle(ConnectionManager.j
> ava:256)
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[recursive]
>    at
> net/jini/jeri/connection/ConnectionManager$Reaper.run(ConnectionManager.
> java:572)
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>    at
> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
> ]
>    at java/lang/Thread.run(Thread.java:619)[optimized]
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>
> "UMMDTableNotifs-thread-1" id=1270 idx=0x650 tid=340 prio=5 alive, in
> native, blocked, daemon
>    -- Blocked trying to get lock:
> net/jini/jeri/connection/ConnectionManager@0x00000001D4E0A7D8[thin lock]
>    at jrockit/vm/Threads.sleep(I)V(Native Method)
>    at jrockit/vm/Locks.waitForThinRelease(Locks.java:1209)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStageHard(Locks.java:1342)[optimized]
>    at
> jrockit/vm/Locks.monitorEnterSecondStage(Locks.java:1259)[optimized]
>    at
> net/jini/jeri/connection/ConnectionManager.connectPending(ConnectionMana
> ger.java:176)[inlined]
>    at
> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
> nager.java:629)[optimized]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x000000020CBA800
> 0[biased lock]
>    at
> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
> ionHandler.java:708)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
> andler.java:659)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
> 528)
>    at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
> Source)
>    at
> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
> MMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
> )V(Unknown Source)
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
> r.java:885)[optimized]
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:907)
>    at java/lang/Thread.run(Thread.java:619)
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>
> "UMMDTableNotifs-thread-2" id=1271 idx=0x658 tid=3276 prio=5 alive, in
> native, waiting, daemon
>    -- Waiting for notification on:
> java/lang/Object@0x0000000204146898[fat lock]
>    at
> jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native
> Method)
>    at jrockit/vm/Locks.wait(Locks.java:2229)[inlined]
>    at java/lang/Object.wait(Object.java:485)[optimized]
>    at com/sun/jini/jeri/internal/mux/Mux.start(Mux.java:207)
>    ^-- Lock released while waiting:
> java/lang/Object@0x0000000204146898[fat lock]
>    at
> net/jini/jeri/connection/ConnectionManager$OutboundMux.newRequest(Connec
> tionManager.java:356)[inlined]
>    at
> net/jini/jeri/connection/ConnectionManager$ReqIterator.next(ConnectionMa
> nager.java:631)[optimized]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$OutboundMux@0x00000002041468A
> 0[thin lock]
>    ^-- Holding lock:
> net/jini/jeri/connection/ConnectionManager$ReqIterator@0x0000000204147BF
> 8[biased lock]
>    at
> net/jini/jeri/BasicObjectEndpoint$1.next(BasicObjectEndpoint.java:371)
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethodOnce(BasicInvocat
> ionHandler.java:708)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invokeRemoteMethod(BasicInvocationH
> andler.java:659)[optimized]
>    at
> net/jini/jeri/BasicInvocationHandler.invoke(BasicInvocationHandler.java:
> 528)
>    at $Proxy203.notify(Lnet/jini/core/event/RemoteEvent;)V(Unknown
> Source)
>    at
> com/avid/ummd/table/UMMDTableEventListener.notify(Lcom/avid/ummd/event/U
> MMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableAdaptersManager$UMMDTableAdapterListener.no
> tify(Lcom/avid/ummd/event/UMMDRemoteEvent;)V(Unknown Source)
>    at
> com/avid/ummd/table/UMMDTableEventListenersManager$ListenerSendTask.run(
> )V(Unknown Source)
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto
> r.java:885)[optimized]
>    at
> java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja
> va:907)
>    at java/lang/Thread.run(Thread.java:619)
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
> "(JSK) mux reader" id=175467 idx=0x470 tid=4700 prio=5 alive, in native,
> daemon
>    at
> jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII
> )I(Native Method)
>    at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:46)
>    at
> java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(S
> ocketInputStream.java)
>    at java/net/SocketInputStream.read(SocketInputStream.java:129)
>    at
> com/sun/jini/jeri/internal/mux/StreamConnectionIO$1.read(StreamConnectio
> nIO.java:356)[optimized]
>    at
> com/sun/jini/jeri/internal/mux/StreamConnectionIO$Reader.run(StreamConne
> ctionIO.java:263)[optimized]
>    at
> com/sun/jini/thread/ThreadPool$Worker.run(ThreadPool.java:136)[optimized
> ]
>    at java/lang/Thread.run(Thread.java:619)[optimized]
>    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
>    -- end of trace
>
>