Posted to dev@mina.apache.org by "Mike Pomraning (JIRA)" <ji...@apache.org> on 2015/11/06 18:51:10 UTC

[jira] [Comment Edited] (SSHD-557) MINA SSHD deadlocks upon multiple concurrent HTTP downloads over locally-forwarded ports

    [ https://issues.apache.org/jira/browse/SSHD-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994061#comment-14994061 ] 

Mike Pomraning edited comment on SSHD-557 at 11/6/15 5:50 PM:
--------------------------------------------------------------

I have tested the SSHD-565 timeout, but it's no help by itself as the workers re-enter the {{wait}} too quickly.  An indefinite wait becomes a busier indefinite wait.  We can reproduce this with just a few sessions.
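
To make the "busier indefinite wait" concrete, here is a minimal sketch. It is not the SSHD-565 code: {{StuckWindow}}, {{retryLoop}} and {{failedWaits}} are hypothetical names, and it assumes the caller simply retries whenever the timed wait expires. If the peer never enlarges the window, the worker thread is still occupied for the whole loop; it just wakes up more often.

```java
final class TimedWaitDemo {
    /** Hypothetical stand-in for a channel window whose peer never sends an adjust. */
    static final class StuckWindow {
        private long size = 0;        // no space available, and nothing ever adds any
        private int failedWaits = 0;  // waits that ended with the window still empty

        /** Timed variant of a blocking wait; returns false if there is still no space. */
        synchronized boolean waitForSpace(long timeoutMillis) throws InterruptedException {
            if (size == 0) {
                wait(timeoutMillis);
            }
            if (size == 0) {
                failedWaits++;
                return false;
            }
            return true;
        }

        synchronized int failedWaits() {
            return failedWaits;
        }
    }

    /** Assumed caller behaviour: retry after every timeout, keeping the worker busy. */
    static int retryLoop(StuckWindow window, int attempts, long timeoutMillis) {
        try {
            for (int i = 0; i < attempts; i++) {
                if (window.waitForSpace(timeoutMillis)) {
                    break; // space became available
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return window.failedWaits();
    }
}
```

The loop is bounded by {{attempts}} only so the sketch terminates; with an unbounded retry, the thread never returns to the pool.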

Now, I'd like to persuade you that increasing the number of workers is an inappropriate fix, too.  (As an aside, I think [~sgardner] was suggesting a dedicated thread that was "read only" on the sessions, and thus could not itself get stuck in {{waitForSpace}}.)

Essentially, we have a multiplexing multithreaded model where it is possible for every worker thread to enter an indefinitely blocking wait and thus starve all clients, even those with "healthy" sessions.  I am sympathetic to your suggestion of capacity tuning, but I see this as an issue of _correctness_ under any load rather than _performance_ under heavy load.

In a single event-driven thread multiplexing many clients (as in, say, Netty, Twisted Python, or Perl POE), it's not correct to block indefinitely on any one client.  Saying that we need N+1 threads, where N is the number of clients, defeats the purpose of event-driven multiplexed client sessions in the first place.
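
The starvation argument can be illustrated with a small, self-contained sketch. It uses a plain {{java.util.concurrent}} fixed pool rather than the SSHD NIO2 worker group, and {{StarvationDemo}}, {{healthyTaskStarves}} and the latch standing in for a window adjust are all hypothetical. Once there are as many blocked sessions as workers, a task for a healthy client sits in the queue no matter how little work it needs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class StarvationDemo {
    /** Returns true if a healthy task was starved while every worker was blocked. */
    static boolean healthyTaskStarves(int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        // Stands in for the peer's window adjust that the stuck sessions wait for.
        CountDownLatch windowAdjust = new CountDownLatch(1);
        try {
            // One "stuck" session per worker: each blocks indefinitely, like waitForSpace.
            for (int i = 0; i < workers; i++) {
                pool.submit(() -> {
                    windowAdjust.await();
                    return null;
                });
            }
            // A healthy client's work arrives after all workers are occupied.
            Future<String> healthy = pool.submit(() -> "served");
            try {
                healthy.get(200, TimeUnit.MILLISECONDS);
                return false; // a worker was free after all
            } catch (TimeoutException starved) {
                windowAdjust.countDown();          // unblock the stuck sessions
                healthy.get(5, TimeUnit.SECONDS);  // only now is the healthy client served
                return true;
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            windowAdjust.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy client starved: " + healthyTaskStarves(4));
    }
}
```

Raising {{workers}} only raises the number of stuck sessions needed to reach the same state, which is why I call this a correctness issue rather than a tuning issue.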

A fix might be to put the "stuck" sessions in their own set, off to the side, and leave them there until incoming data suggest their window can be resized.
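
A rough sketch of that idea follows, under stated assumptions: {{ParkingChannel}} is a hypothetical class, not the SSHD {{Window}} or {{ChannelOutputStream}}, the {{sent}} list stands in for packets put on the wire, and writes are never split to fit a partially open window. The point is only that the worker returns immediately and the parked data is flushed by whichever thread later processes the window adjust.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical non-blocking channel: writes that do not fit are parked, not waited on. */
final class ParkingChannel {
    private final Deque<byte[]> parked = new ArrayDeque<>(); // writes waiting for window space
    private final List<byte[]> sent = new ArrayList<>();     // stands in for the wire
    private long windowSize;

    ParkingChannel(long initialWindow) {
        this.windowSize = initialWindow;
    }

    /** Called by a worker thread. Never blocks: sends now, or parks the data and returns false. */
    synchronized boolean write(byte[] data) {
        if (parked.isEmpty() && data.length <= windowSize) {
            windowSize -= data.length;
            sent.add(data);
            return true;
        }
        parked.addLast(data); // keep ordering behind earlier parked writes
        return false;
    }

    /** Called when the peer grants more window space; returns how many parked writes were flushed. */
    synchronized int onWindowAdjust(long bytesToAdd) {
        windowSize += bytesToAdd;
        int flushed = 0;
        while (!parked.isEmpty() && parked.peekFirst().length <= windowSize) {
            byte[] next = parked.pollFirst();
            windowSize -= next.length;
            sent.add(next);
            flushed++;
        }
        return flushed;
    }

    synchronized boolean isParked() {
        return !parked.isEmpty();
    }

    synchronized int sentCount() {
        return sent.size();
    }
}
```

In a real forwarder, a parked channel would also have to stop reading from the forwarded socket until it is unparked; otherwise the parked queue grows without bound.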


> MINA SSHD deadlocks upon multiple concurrent HTTP downloads over locally-forwarded ports
> ----------------------------------------------------------------------------------------
>
>                 Key: SSHD-557
>                 URL: https://issues.apache.org/jira/browse/SSHD-557
>             Project: MINA SSHD
>          Issue Type: Bug
>    Affects Versions: 0.14.0, 1.0.0
>         Environment: Fedora 21, kernel 3.17.4-301, OpenJDK 1.8.0_51-b16
>            Reporter: Sam Gardner
>            Assignee: Goldstein Lyor
>              Labels: sshd
>             Fix For: 1.1.0
>
>         Attachments: AbstractUserAuth.java, setup-tunnels-remote.sh, setup-tunnels.sh, ssh-load-remote.sh, ssh-load.sh, sshd-core-0.14.0-jar-with-dependencies.jar, sshd-core-1.0.0-jar-with-dependencies.jar
>
>
> SSHD-Core 1.0.0 deadlocks when multiple concurrent downloads are in progress over ChannelOutputStreams
> In both Mina-SSHD 0.14.0 and 1.0.0 the server will become deadlocked if multiple concurrent downloads are started over a locally-forwarded TCP socket. In this state the process is still alive and client connections stay open, but no data can be transmitted and new connections will not be accepted - when inspected via telnet, the SSH port will not even return a banner.
> Inspection of the hung process using the {{jstack}} tool shows that each NIO worker thread is in the following state: {noformat}
> "sshd-SshServer[36f6e879]-nio2-thread-4" #14 daemon prio=5 os_prio=0 tid=0x00007f219c009000 nid=0x34c6 in Object.wait() [0x00007f21c8439000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000d799f9b0> (a org.apache.sshd.common.channel.Window)
>         at java.lang.Object.wait(Object.java:502)
>         at org.apache.sshd.common.channel.Window.waitForSpace(Window.java:175)
>         - locked <0x00000000d799f9b0> (a org.apache.sshd.common.channel.Window)
>         at org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:126)
>         - locked <0x00000000d799fa68> (a org.apache.sshd.common.channel.ChannelOutputStream)
>         at org.apache.sshd.server.forward.TcpipServerChannel$1.messageReceived(TcpipServerChannel.java:156)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:220)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker.invokeDirect(Invoker.java:157)
>         at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implRead(UnixAsynchronousSocketChannelImpl.java:553)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:276)
>         at sun.nio.ch.AsynchronousSocketChannelImpl.read(AsynchronousSocketChannelImpl.java:297)
>         at java.nio.channels.AsynchronousSocketChannel.read(AsynchronousSocketChannel.java:420)
>         at org.apache.sshd.common.io.nio2.Nio2Session.doReadCycle(Nio2Session.java:247)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:224)
>         at org.apache.sshd.common.io.nio2.Nio2Session$2.onCompleted(Nio2Session.java:212)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler$1.run(Nio2CompletionHandler.java:34)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at org.apache.sshd.common.io.nio2.Nio2CompletionHandler.completed(Nio2CompletionHandler.java:31)
>         at sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:126)
>         at sun.nio.ch.Invoker$2.run(Invoker.java:218)
>         at sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> These are the steps to reproduce:
> # Start up approximately 200 separate SSH sessions with one local forward to an HTTP server:
> {{ssh -N -LXXXXX:<HTTP_SERVER>:80 account@<MINA_SSHD_SERVER>}}
> ## To simplify this process, I modified [org.apache.sshd.server.auth.AbstractUserAuth|^AbstractUserAuth.java] to always accept requests - this makes it possible to create the sessions on the same client (though I've also reproduced this on separate clients).
> ## This [script|^setup-tunnels.sh] will set up the client sessions as background processes.
> # With the above sessions alive and backgrounded, fetch large files concurrently from the HTTP server over multiple forwarded ports:
> {{wget http://localhost:XXXXX/hugefile}}
> ## This [script|^ssh-load.sh] will start {{wget}} downloads in the above manner.
> # After a time, the {{wget}} downloads will stop progressing, and inspection of the SSH server process via {{jstack <PID>}} will show each of the {{nio2-thread-[0-9]}} threads in the deadlocked state.
> Note: I previously filed SSHD-539 on this, and have since reproduced it on vanilla MINA SSHD-Core without our proprietary modifications and running outside of Tomcat.
> The JAR files I used to reproduce are attached to this ticket as well.
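
For anyone repeating the {{jstack}} inspection described above, the check can be automated with a short helper. This is a sketch only: {{ThreadDumpCheck}} and {{countStuckWorkers}} are hypothetical names, and it assumes raw {{jstack}} text in which each thread entry starts with the quoted thread name, as in the trace quoted in this ticket.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ThreadDumpCheck {
    // In jstack output, a thread entry begins with the thread name in double quotes.
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\"");

    /** Counts NIO2 worker threads whose stack contains a Window.waitForSpace frame. */
    static int countStuckWorkers(String dump) {
        int stuck = 0;
        boolean inWorker = false; // current entry is an nio2 worker thread
        boolean counted = false;  // current entry has already been counted
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            Matcher header = HEADER.matcher(line);
            if (header.find()) {
                inWorker = header.group(1).contains("-nio2-thread-");
                counted = false;
            } else if (inWorker && !counted
                    && line.startsWith("at ")
                    && line.contains("Window.waitForSpace(")) {
                stuck++;
                counted = true;
            }
        }
        return stuck;
    }
}
```

If the returned count equals the number of {{nio2-thread}} workers in the dump, every worker is parked in {{waitForSpace}} and the server is in the state this ticket describes.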


