You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by "James Birdsall (JIRA)" <ji...@apache.org> on 2016/08/10 00:58:20 UTC

[jira] [Commented] (PROTON-1276) Reactor: Race between CollectorImpl peek and pop can result in dispatching freed events

    [ https://issues.apache.org/jira/browse/PROTON-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414534#comment-15414534 ] 

James Birdsall commented on PROTON-1276:
----------------------------------------

It looks like this issue was caused by our code accidentally multi-threading the reactor in ways it is not intended to handle. This issue does not seem to occur anymore after I got a fix for the multi-threading.

> Reactor: Race between CollectorImpl peek and pop can result in dispatching freed events
> ---------------------------------------------------------------------------------------
>
>                 Key: PROTON-1276
>                 URL: https://issues.apache.org/jira/browse/PROTON-1276
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-j
>    Affects Versions: 0.13.0
>         Environment: Windows 10
>            Reporter: James Birdsall
>            Priority: Minor
>
> Last week I started trying to track down some unhandled NullPointerExceptions coming out of the Proton-J reactor:
> {quote}
> org.apache.qpid.proton.reactor.impl.IOHandler.onUnhandled(IOHandler.java:354)
> org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:236)
> org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108)
> org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:307)
> org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:276)
> com.microsoft.azure.servicebus.MessagingFactory$RunReactor.run(MessagingFactory.java:355)
> {quote}
> In IOHandler.onUnhandled(), the call to reactor.getSelector() is throwing the NPE because reactor is null. It's null because that's what Event.getReactor() returns when it doesn't recognize the type of EventImpl.context. It doesn't recognize the type because EventImpl.context is null, as is EventImpl.type.
> The event that's being dispatched was retrieved by the collector.peek() call in ReactorImpl.process(). Looking at CollectorImpl, the code specifically guards against putting an event with null type into the collector. It looks like the only time that an event can be completely null inside is in CollectorImpl.pop(), when the event that was just removed from the collector is clear()ed and then put on the free list. For grins, I added code that sets the context to the string "POISON" after the clear(), and modified CollectorImpl.peek() to detect when it was about to return a poisoned event, and peek() detected a poisoned event before every NPE.
> Looking at the code, there's a race between pop() and peek() because pop() cleans up the event while head is still pointing at it. I rewrote the code to remove the event from the chain before doing anything else, and so far it looks like that solves the problem:
> {code}
> @Override
>     public void pop()
>     {
>         if (head != null) {
>         	EventImpl freeing = head;
>         	head = head.next;
>         	freeing.clear();
>         	freeing.next = free;
>         	free = freeing;
>         }
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org