You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mina.apache.org by "Carl Wicklow (JIRA)" <ji...@apache.org> on 2016/07/22 14:25:21 UTC

[jira] [Comment Edited] (DIRMINA-1021) MINA-CORE does not remove sessions if exceptions occur while closing

    [ https://issues.apache.org/jira/browse/DIRMINA-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389586#comment-15389586 ] 

Carl Wicklow edited comment on DIRMINA-1021 at 7/22/16 2:25 PM:
----------------------------------------------------------------


Thanks Emmanuel.

I wasn't clear when I said "no exception is thrown". I meant that only in the context of what happens in setInterestedInWrite().

In terms of the writeBuffer() call, an exception certainly is caught ....

{code:title=AbstractPollingIoProcessor.writeBuffer()|borderStyle=solid}
           try {
                localWrittenBytes = write(session, buf, length);
            } catch (IOException ioe) {
                // IOException("Broken Pipe") is caught ...
                buf.free();
                session.close(true);
                destroy(session); 
                return 0;
            }
{code}

The session.close(true) passes up the filter chain (as you showed in an earlier post), and the session gets scheduled for removal ...

{code:title=AbstractPollingIoProcessor.remove()|borderStyle=solid}
    public final void remove(S session) {
        scheduleRemove(session);
        startupProcessor();
    }

    private void scheduleRemove(S session) {
        removingSessions.add(session);
    }
{code}

The destroy(session) called from writeBuffer() then cancels the selection key for this session (session state is now CLOSING).

{code:title=NioProcessor|borderStyle=solid}
    @Override
    protected void destroy(NioSession session) throws Exception {
        ByteChannel ch = session.getChannel();
        SelectionKey key = session.getSelectionKey();
        if (key != null) {
            key.cancel();
        }
        ch.close();
    }
{code}

The writeBuffer() returns 0, the flushNow returns false from here ...

{code:title=AbstractPollingIoProcessor.flushNow()|borderStyle=solid}
                if (localWrittenBytes == 0) {
                    setInterestedInWrite(session, true);
                    return false;
                }
{code}

Which pops us back into the processor run loop ...

{code:title=AbstractPollingIoProcessor.Processor.run()|borderStyle=solid}
                   // Write the pending requests
                    long currentTime = System.currentTimeMillis();
                    flush(currentTime);

                    // And manage removed sessions
                    nSessions -= removeSessions();
{code}

... ready to remove the sessions. Our session is CLOSING now, though, so the removal is not propagated ...

{code:title=AbstractPollingIoProcessor.flushNow()|borderStyle=solid}
                case CLOSING:
                    // Skip if channel is already closed
                    // In any case, remove the session from the queue
                    removedSessions++;
                    break;

{code}

... and that means the CloseFuture never completes, nor does the WriteFuture for the write that threw the exception, and no exception has been propogated to the IoHandler.

Am still reviewing the rest (there's a lot to go through !), but I think this may be a path thru the code where the IoHandler doesn't get told about the exception.



was (Author: cwicklow):


Thanks Emmanuel.

I wasn't clear when I said "no exception is thrown". I meant that only in the context of what happens in setInterestedInWrite().

In terms of the writeBuffer() call, an exception certainly is caught ....

{code:title=AbstractPollingIoProcessor.writeBuffer()|borderStyle=solid}
           try {
                localWrittenBytes = write(session, buf, length);
            } catch (IOException ioe) {
                // IOException("Broken Pipe") is caught ...
                buf.free();
                session.close(true);
                destroy(session); 
                return 0;
            }
{code}

The session.close(true) passes up the filter chain (as you showed in an earlier post), and the session gets scheduled for removal ...

{code:title=AbstractPollingIoProcessor.remove()|borderStyle=solid}
    public final void remove(S session) {
        scheduleRemove(session);
        startupProcessor();
    }

    private void scheduleRemove(S session) {
        removingSessions.add(session);
    }
{code}

The destroy(session) called from writeBuffer() then cancels the selection key for this session (session state is now CLOSING).

{code:title=NioProcessor|borderStyle=solid}
    @Override
    protected void destroy(NioSession session) throws Exception {
        ByteChannel ch = session.getChannel();
        SelectionKey key = session.getSelectionKey();
        if (key != null) {
            key.cancel();
        }
        ch.close();
    }

The writeBuffer() returns 0, the flushNow returns false from here ...

{code:title=AbstractPollingIoProcessor.flushNow()|borderStyle=solid}
                if (localWrittenBytes == 0) {
                    setInterestedInWrite(session, true);
                    return false;
                }
{code}

Which pops us back into the processor run loop ...

{code:title=AbstractPollingIoProcessor.Processor.run()|borderStyle=solid}
                   // Write the pending requests
                    long currentTime = System.currentTimeMillis();
                    flush(currentTime);

                    // And manage removed sessions
                    nSessions -= removeSessions();
{code}

... ready to remove the sessions. Our session is CLOSING now, though, so the removal is not propagated ...

{code:title=AbstractPollingIoProcessor.flushNow()|borderStyle=solid}
                case CLOSING:
                    // Skip if channel is already closed
                    // In any case, remove the session from the queue
                    removedSessions++;
                    break;

{code}

... and that means the CloseFuture never completes, nor does the WriteFuture for the write that threw the exception, and no exception has been propogated to the IoHandler.

Am still reviewing the rest (there's a lot to go through !), but I think this may be a path thru the code where the IoHandler doesn't get told about the exception.


> MINA-CORE does not remove sessions if exceptions occur while closing
> --------------------------------------------------------------------
>
>                 Key: DIRMINA-1021
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-1021
>             Project: MINA
>          Issue Type: Bug
>    Affects Versions: 2.0.8
>         Environment: mina-ssh 0.14.0 
> mina-core 2.0.8 
> Multiple OSes / Java configurations: 
> * Mac OS X El Capitan on Java 8 (1.8.0_60) 
> * CentOS 6.4 on Java 8 (1.8.0_60) 
> * CentOS 6.5 on Java 8 (1.8.0_20-b26).
>            Reporter: Doug Kelly
>         Attachments: attempt-removing-sessions-closing.patch
>
>
> MINA SSHD isn't removing sessions when using the MINA/NIO backend if an exception as received as the session is closing (such as a connection reset is received with data still in the write buffer). When this case happens, it seems that {{NioProcessor.getState}} returns the state as {{CLOSING}} (I'm assuming since the underlying channel is now closed), which means that the {{AbstractPollingIoProcessor.removeSessions()}} won't ever prune the session, since a {{CLOSING}} state is simply ignored. The result is a resource leak over time, since these sessions are never pruned (it's a slow leak, since entering this condition is racy – on my workstation, I can produce it through randomly interrupting connections anywhere from 1/6 to 1/10th of the time). (This may either be major or critical; reprioritize as necessary.)
> I specifically see this error with Gerrit 2.10.4 and Gerrit 2.11.5 (using mina-sshd 0.14.0 / mina-core 2.0.8), and it looks like the code path is unchanged in mina-sshd 1.0.0 / mina-core 2.0.9. I was unsure if this is specifically a bug in mina-core or, if it's something unique to mina-sshd. My local development system runs Mac OS X El Capitan on Java 8 (1.8.0_60), but I've also seen this on Linux (CentOS 6.4, again Java 1.8.0_60 and CentOS 6.5 on Java 1.8.0_20-b26).
> The fix may be as simple as attempting to remove the session if {{OPENED}} or {{CLOSING}}, but I'm unsure what side-effects this may have with other backends. I'll be happy to test it locally, but I'm fairly ignorant when it comes to MINA's code.
> The attached patch (to mina-core) seems to resolve the issue by following the reproduction case I have on the [Gerrit issue tracker|https://code.google.com/p/gerrit/issues/detail?id=3685].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)