Posted to dev@mina.apache.org by "John K Peterson (JIRA)" <ji...@apache.org> on 2006/04/05 08:56:52 UTC

[jira] Created: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Race condition in SocketAcceptorDelegate.unbind
-----------------------------------------------

         Key: DIRMINA-202
         URL: http://issues.apache.org/jira/browse/DIRMINA-202
     Project: Directory MINA
        Type: Bug

    Versions: 1.0    
    Reporter: John K Peterson


On my system (Linux 2.4.22/JamVM 1.4.2/Classpath 0.90), ApacheDS gets stuck in SocketAcceptorDelegate.unbind.

SocketAcceptorDelegate.unbind does the following:

1) creates a cancellation request
2) starts up a worker thread
3) puts the cancellation request on the cancelQueue
4) wakes up the worker thread's selector
5) waits for the cancellation request to be done

The problem is that 4) assumes that 2) has gotten the worker thread to the point where it has called selector.select(). However, there's no guarantee that the worker thread has gotten that far, in which case the wakeup occurs before the select and then the select hangs.
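
The five steps above can be sketched in plain java.nio as follows. This is an illustration only: apart from cancelQueue, every name (UnbindSequenceSketch, CancellationRequest, runWorker) is hypothetical and is not taken from the MINA source. The worker may reach select() before or after wakeup() is called; on a JVM whose Selector follows the documented wakeup() contract, unbind is expected to complete in either case.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative only: names other than cancelQueue are hypothetical, not MINA's.
class UnbindSequenceSketch {
    // A cancellation request the caller can wait on.
    static final class CancellationRequest {
        final CountDownLatch done = new CountDownLatch(1);
    }

    private final Selector selector;
    private final Queue<CancellationRequest> cancelQueue = new ConcurrentLinkedQueue<>();

    UnbindSequenceSketch() throws IOException {
        selector = Selector.open();
    }

    // Returns true if the worker handled the request within the timeout.
    boolean unbind(long timeoutMillis) throws InterruptedException {
        CancellationRequest request = new CancellationRequest();   // step 1
        Thread worker = new Thread(this::runWorker, "worker");
        worker.setDaemon(true);
        worker.start();                                            // step 2
        cancelQueue.add(request);                                  // step 3
        selector.wakeup();                                         // step 4
        return request.done.await(timeoutMillis, TimeUnit.MILLISECONDS); // step 5
    }

    private void runWorker() {
        try {
            for (;;) {
                selector.select(); // may be reached before or after wakeup()
                boolean handled = false;
                CancellationRequest request;
                while ((request = cancelQueue.poll()) != null) {
                    request.done.countDown();
                    handled = true;
                }
                if (handled) {
                    selector.close();
                    return;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("completed: " + new UnbindSequenceSketch().unbind(5000));
    }
}
```

Because step 5 uses a bounded wait, the sketch returns false instead of blocking forever if the worker never leaves select(), which makes it usable as a quick probe on a suspect JVM.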

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Commented: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "John K Peterson (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/DIRMINA-202?page=comments#action_12373503 ] 

John K Peterson commented on DIRMINA-202:
-----------------------------------------

OK, with println statements added, I've verified that selector.wakeup is not causing selector.select to return as it should.  I still don't know what special conditions put it into this state, since the simple case doesn't fail, but it doesn't seem to be a MINA issue.  I will take it up with the Classpath folks...

Thanks for your help in any case.



[jira] Commented: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "John K Peterson (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/DIRMINA-202?page=comments#action_12373356 ] 

John K Peterson commented on DIRMINA-202:
-----------------------------------------

I don't see any 'select(1000)' in SocketAcceptorDelegate.Worker at:

http://svn.apache.org/viewcvs.cgi/directory/trunks/mina/core/src/main/java/org/apache/mina/transport/socket/nio/support/SocketAcceptorDelegate.java?rev=389042&view=markup

...I only see:

        public void run()
        {
            for( ;; )
            {
                try
                {
                    int nKeys = selector.select();

                    registerNew();
                    cancelKeys();

[...]



[jira] Commented: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "Trustin Lee (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/DIRMINA-202?page=comments#action_12373236 ] 

Trustin Lee commented on DIRMINA-202:
-------------------------------------

Moreover, select(1000) never hangs because we set a 1-second timeout.



[jira] Commented: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "John K Peterson (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/DIRMINA-202?page=comments#action_12373355 ] 

John K Peterson commented on DIRMINA-202:
-----------------------------------------

I apologize.  I'm not very familiar with NIO yet and I only read the summary of the selector.wakeup function.  You're correct that it sounds like it could be a problem with Classpath.  (I've had several other issues with it running ApacheDS that I've reported to them already.)  I'll do a separate test with selector and wakeup to see if Classpath is the problem.



[jira] Resolved: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "Niklas Therning (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/DIRMINA-202?page=all ]
     
Niklas Therning resolved DIRMINA-202:
-------------------------------------

    Resolution: Cannot Reproduce
     Assign To: Niklas Therning

This wasn't a bug in MINA but in JamVM when used together with GNU classpath. John, please close this issue.



[jira] Updated: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "Trustin Lee (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/DIRMINA-202?page=all ]

Trustin Lee updated DIRMINA-202:
--------------------------------

    Version:     (was: 1.0)

MINA 1.0 is not released yet.  Removed the version number from the 'Affects version' field.



[jira] Commented: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "John K Peterson (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/DIRMINA-202?page=comments#action_12373448 ] 

John K Peterson commented on DIRMINA-202:
-----------------------------------------

Turns out the version I'm working with is 0.9.2.  Sorry for the confusion.

I've tried to strip down the class to remove all the MINA-specific stuff; however, when I do that it doesn't hang.  So, the basic wakeup before select is not the problem. I'm wondering if maybe it somehow gets back around to 'select' a second time.  Now I've got the right source and can throw in some print statements to find out the order and frequency of events...



[jira] Commented: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "Niklas Therning (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/DIRMINA-202?page=comments#action_12373235 ] 

Niklas Therning commented on DIRMINA-202:
-----------------------------------------

Hmmm, I'm not sure this is a MINA problem actually. From the Javadoc of Selector.wakeup():

"... If no selection operation is currently in progress then the next invocation of one of these methods [select() or select(long)] will return immediately unless the selectNow() method is invoked in the meantime. ..."

It could be a problem with GNU classpath. Would it be possible for you to test this using GNU classpath's NIO directly? It should be quite easy to verify that classpath's wakeup() functions as expected.
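
A check along the lines suggested here might look like the sketch below. It uses only java.nio; the class and method names are made up for illustration. It calls wakeup() while no selection is in progress and then measures how long the following select() blocks.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Illustrative check; class and method names are hypothetical.
class WakeupBeforeSelect {
    // Calls wakeup() first, then select(); returns how long select() blocked, in ms.
    static long selectAfterWakeup() throws IOException {
        try (Selector selector = Selector.open()) {
            selector.wakeup();               // no selection operation in progress yet
            long start = System.nanoTime();
            selector.select();               // should return at once per the Javadoc
            return (System.nanoTime() - start) / 1_000_000;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("select() returned after " + selectAfterWakeup() + " ms");
    }
}
```

On an implementation where the wakeup is not remembered, select() would block indefinitely here, so it is worth running the check under an external timeout.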



[jira] Commented: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "John K Peterson (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/DIRMINA-202?page=comments#action_12373553 ] 

John K Peterson commented on DIRMINA-202:
-----------------------------------------

I found the problem.  Classpath utilizes Thread.interrupt in its Selector.wakeup call.  However, the way JamVM implements Thread.interrupt, it only interrupts waits on monitors, not system calls like select.  I've modified JamVM so that it sends SIGUSR2 on interrupt and installed a null signal handler for it.  That's enough to interrupt the select system call and allow Selector.select to notice that the thread is interrupted.
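
The JamVM change itself is native code and is not shown in this thread. What can be checked from the Java side is the behaviour the fix is meant to restore: interrupting a thread that is blocked in Selector.select() should make select() return with the thread's interrupt status set. The sketch below is one way to probe that, with hypothetical class and method names; the 200 ms sleep is only a best-effort attempt to let the thread block first.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

// Illustrative probe; class and method names are hypothetical.
class InterruptSelectSketch {
    // Returns true if the selecting thread left select() after interrupt()
    // and observed its own interrupt status as set.
    static boolean interruptUnblocksSelect(long joinTimeoutMillis) throws Exception {
        final boolean[] sawInterrupt = new boolean[1];
        Thread selecting = new Thread(() -> {
            try (Selector selector = Selector.open()) {
                selector.select(); // no keys, no timeout: blocks until woken
                sawInterrupt[0] = Thread.currentThread().isInterrupted();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "selecting");
        selecting.setDaemon(true);
        selecting.start();

        Thread.sleep(200);                  // best effort: let the thread reach select()
        selecting.interrupt();
        selecting.join(joinTimeoutMillis);  // bounded wait so a broken JVM cannot hang us

        return !selecting.isAlive() && sawInterrupt[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("interrupt unblocked select(): " + interruptUnblocksSelect(5000));
    }
}
```

A false result means the thread was still stuck in select() after the join timeout, which is the symptom described in this comment.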



[jira] Commented: (DIRMINA-202) Race condition in SocketAcceptorDelegate.unbind

Posted by "Trustin Lee (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/DIRMINA-202?page=comments#action_12373453 ] 

Trustin Lee commented on DIRMINA-202:
-------------------------------------

You're right.  It is not select(1000) but just select().  Call me an idiot! :D
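
The distinction matters because only the timed variant bounds the wait. A small sketch, with hypothetical names, of select(long) returning on its own when no channel is ready and nobody calls wakeup():

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Illustrative only; class and method names are hypothetical.
class SelectTimeoutSketch {
    // Blocks in select(timeoutMillis) with no registered keys; returns elapsed ms.
    static long timedSelect(long timeoutMillis) throws IOException {
        try (Selector selector = Selector.open()) {
            long start = System.nanoTime();
            selector.select(timeoutMillis);  // gives up after roughly timeoutMillis
            return (System.nanoTime() - start) / 1_000_000;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("select(1000) returned after " + timedSelect(1000) + " ms");
    }
}
```

With the untimed select() there is no such fallback, so a lost wakeup leaves the worker blocked until some channel becomes ready.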

OK, I feel like I need to test this case.  Could you please attach a JUnit TestCase for this issue?


