Posted to dev@felix.apache.org by "Matt Banner (JIRA)" <ji...@apache.org> on 2011/07/26 20:17:09 UTC

[jira] [Created] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Event Admin deadlocks when sendEvent is called from within a handleEvent method
-------------------------------------------------------------------------------

Key: FELIX-3055
URL: https://issues.apache.org/jira/browse/FELIX-3055
Project: Felix
Issue Type: Bug
Components: Event Admin
Affects Versions: eventadmin-1.2.12
Reporter: Matt Banner

The Felix Event Admin service doesn't correctly handle the case where EventAdmin.sendEvent(Event) is called from within an EventHandler.handleEvent(Event) implementation. This happens whether the original event (the one being handled by the EventHandler, not the one it is dispatching) was dispatched using EventAdmin.sendEvent(Event) or EventAdmin.postEvent(Event).

I have attached some sample code which should make this easy to reproduce. Every time you invoke the "dispatch foo" command from the Gogo shell, it will post a "foo" event. The handler for this event then attempts to send a new "bar" event. Every time this happens, a thread in the event admin pool will become deadlocked. If you run the dispatch command more times than the minimum number of threads in the pool (I think 20, by default), they will all be deadlocked and the event admin service will stop invoking event handlers. (It seemed strange to me that this happens when you hit the minimum number of threads in the pool rather than the maximum, but I haven't had time to investigate that.)
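
The call pattern described above can be sketched without OSGi on the classpath. This is only a minimal sketch: it is not the attached TestClass.java (which is not reproduced in this thread) and not the Felix implementation. MiniEventAdmin and Handler are hypothetical stand-ins for EventAdmin and EventHandler, and delivery simply happens on the caller's thread. It shows the nested call shape that triggers the problem, and the ordering a caller would expect from a synchronous send.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in types only: NOT the OSGi or Felix API. */
class NestedSendSketch {

    /** Hypothetical stand-in for an event handler callback. */
    interface Handler {
        void handle(String topic);
    }

    /** Hypothetical dispatcher that delivers synchronously on the caller's thread. */
    static class MiniEventAdmin {
        private final Map<String, List<Handler>> handlers = new HashMap<>();

        void register(String topic, Handler handler) {
            handlers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
        }

        /** Returns only after every handler registered for the topic has returned. */
        void sendEvent(String topic) {
            List<Handler> targets = handlers.getOrDefault(topic, new ArrayList<>());
            // Iterate over a copy so a handler may register others during delivery.
            for (Handler h : new ArrayList<>(targets)) {
                h.handle(topic);
            }
        }
    }

    /** Runs the foo -> bar cascade and returns the order in which things happened. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        MiniEventAdmin admin = new MiniEventAdmin();
        admin.register("bar", topic -> log.add("bar handled"));
        admin.register("foo", topic -> {
            log.add("foo handler start");
            admin.sendEvent("bar"); // nested synchronous send from inside a handler
            log.add("foo handler end");
        });
        admin.sendEvent("foo");
        return log;
    }

    public static void main(String[] args) {
        // prints [foo handler start, bar handled, foo handler end]
        System.out.println(run());
    }
}
```

In this single-threaded model the nested send trivially completes before the outer handler returns. The report is about what happens when the real service hands delivery to pool threads and applies timeout handling.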

Also attached is a thread dump from when all 20 threads have become deadlocked. Note that they are all stuck on line 240 of SyncDeliverTasks.java waiting on a rendezvous with the timer barrier (line numbers refer to the current revision on trunk, r1074155). I suspect that this is because the attempt to rendezvous on line 266 is met by the attempt on line 208 in the case where the event handler recursively calls sendEvent, leaving line 240 with no corresponding call to meet.
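
As a loose analogy for the suspicion above, the sketch below shows how a two-party meeting point leaves a third caller stranded once the first two have paired up. It uses java.util.concurrent.Exchanger as a stand-in rendezvous. It is not the code in SyncDeliverTasks.java, and the party names are assumptions chosen only to mirror the three line numbers mentioned (266, 208 and 240). A timeout is used so that the demonstration terminates instead of hanging.

```java
import java.util.concurrent.Exchanger;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Analogy only: an Exchanger standing in for a two-party rendezvous. */
class UnmatchedRendezvousSketch {

    /** Returns true if the third party finds no partner within timeoutMillis. */
    static boolean thirdPartyTimesOut(long timeoutMillis) {
        Exchanger<String> rendezvous = new Exchanger<>();

        // The first two parties meet each other and both return.
        Thread timer = new Thread(() -> meet(rendezvous, "timer"));
        Thread nested = new Thread(() -> meet(rendezvous, "nested send"));
        timer.start();
        nested.start();
        try {
            timer.join();
            nested.join();
            // The outer caller now arrives, but nobody is left to meet it.
            rendezvous.exchange("outer send", timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void meet(Exchanger<String> rendezvous, String who) {
        try {
            rendezvous.exchange(who);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("outer caller stranded: " + thirdPartyTimesOut(200));
    }
}
```

Without the timeout, the last caller would wait forever, which is what a thread dump of stuck pool threads would look like.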

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Posted by "Carsten Ziegeler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/FELIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carsten Ziegeler resolved FELIX-3055.
-------------------------------------

       Resolution: Fixed
    Fix Version/s: eventadmin-1.2.14

Many thanks for reporting this, Matt!

Actually we had two problems. One was this deadlock, but after fixing it I ran into starvation problems if more threads than the pool offers send cascading events: while the whole pool is processing the "outer" events, inner event sending is blocked because the pool has no free thread.
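
The starvation case described above is a general property of bounded thread pools and can be reproduced with plain java.util.concurrent. The sketch below is an illustration under assumptions, not the Event Admin code and not the fix from the revision mentioned in this message: the class and method names are made up, a fixed-size ExecutorService stands in for the event admin pool, and a timeout is added so the demonstration ends instead of hanging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustration only: "outer" tasks waiting on "inner" tasks in the same pool. */
class PoolStarvationSketch {

    /** Returns how many outer tasks gave up waiting for their inner task. */
    static int countStarvedOuterTasks(int poolSize, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allBusy = new CountDownLatch(poolSize);
        CountDownLatch allWaited = new CountDownLatch(poolSize);
        try {
            List<Future<Boolean>> outers = new ArrayList<>();
            for (int i = 0; i < poolSize; i++) {
                outers.add(pool.submit(() -> {
                    // Wait until every pool thread is running an outer task.
                    allBusy.countDown();
                    allBusy.await();
                    // The "inner event": queued, but no thread is free to run it.
                    Future<?> inner = pool.submit(() -> { });
                    boolean gaveUp;
                    try {
                        inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                        gaveUp = false;
                    } catch (TimeoutException e) {
                        gaveUp = true;
                    }
                    // Hold the thread until all outer tasks have finished waiting,
                    // so that no outer task frees a thread for another's inner task.
                    allWaited.countDown();
                    allWaited.await();
                    return gaveUp;
                }));
            }
            int count = 0;
            for (Future<Boolean> outer : outers) {
                if (outer.get()) {
                    count++;
                }
            }
            return count;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("starved outer tasks: " + countStarvedOuterTasks(4, 200));
    }
}
```

Every outer task times out because the only threads that could run the inner tasks are the ones waiting for them. Without the timeout the pool would stay blocked indefinitely.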

I've fixed both in revision 1151755 and ran some new tests on this version, including the TCK. I couldn't detect any problems.

Would be great, Matt, if you could run your tests as well to verify.

Thanks!

[jira] [Assigned] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Posted by "Carsten Ziegeler (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/FELIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carsten Ziegeler reassigned FELIX-3055:
---------------------------------------

    Assignee: Carsten Ziegeler

[jira] [Updated] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Posted by "Matt Banner (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/FELIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Banner updated FELIX-3055:
-------------------------------

    Environment: Win2K8 R2, Java 1.6.0_17

Sorry, I didn't think about this being platform dependent.  I am seeing the issue on Win2k8 R2 with Java 1.6.0_17 and on Windows XP with Java 1.6.0_16, but NOT on RHEL 4 with Java 1.6.0_17.

[jira] [Commented] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Posted by "Carsten Ziegeler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/FELIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072001#comment-13072001 ] 

Carsten Ziegeler commented on FELIX-3055:
-----------------------------------------

Yeah, actually, I'm the one who should say sorry... I was too dumb to test this correctly: my event handler was configured to be ignored for timeout handling. Running a correct test reveals the exact same problem you describe :(
I'll have a look.

[jira] [Updated] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Posted by "Matt Banner (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/FELIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Banner updated FELIX-3055:
-------------------------------

    Attachment: threaddump.txt

Attaching thread dump.

[jira] [Commented] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Posted by "Carsten Ziegeler (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/FELIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071849#comment-13071849 ] 

Carsten Ziegeler commented on FELIX-3055:
-----------------------------------------

I can't reproduce this right now, and the implementation passes the TCK, which tests this as well. I even wrote a massive test for this that sends 2000 events, each using the pattern from your test. So maybe there is a difference in the environments we're using?
What OS / Java version are you using?

[jira] [Commented] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Posted by "Matt Banner (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/FELIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072393#comment-13072393 ] 

Matt Banner commented on FELIX-3055:
------------------------------------

Everything looks good to me.  I tried my original test code as well as a new test that recursively calls sendEvent() a greater number of times than there are threads in the pool (to check for the starvation issue), and I also tried it out in our actual application as a sanity check.

[jira] [Updated] (FELIX-3055) Event Admin deadlocks when sendEvent is called from within a handleEvent method

Posted by "Matt Banner (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/FELIX-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Banner updated FELIX-3055:
-------------------------------

    Attachment: TestClass.java

Class to help with reproducing issue.
