You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@activemq.apache.org by franz1981 <gi...@git.apache.org> on 2017/11/03 13:42:02 UTC

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

GitHub user franz1981 opened a pull request:

    https://github.com/apache/activemq-artemis/pull/1638

    ARTEMIS-1499 ArtemisExecutor and ProcessorBase needs support of recursive blocking executions

    Submitting a CountDownLatch::countDown task and CountDownLatch::await right after in the event loop will block ArtemisExecutor.
    This fix provide:
     - a way to validate blocking actions (ie failing flushes on event loop)
     - thread local sequential execution for task submissions while on event loop (ie execute).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/franz1981/activemq-artemis recursive_ordered_executor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/activemq-artemis/pull/1638.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1638
    
----
commit ee7c5db9c1f7a72fd29b791dfce4c7fba3b17543
Author: Francesco Nigro <ni...@gmail.com>
Date:   2017-11-03T13:40:25Z

    ARTEMIS-1499 ArtemisExecutor and ProcessorBase needs support of recursive blocking executions
    
    Submitting a CountDownLatch::countDown task and CountDownLatch::await right after in the event loop will block ArtemisExecutor.
    This fix provide:
     - a way to validate blocking actions (ie failing flushes on event loop)
     - thread local sequential execution for task submissions while on event loop (ie execute).

----


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ProcessorBase warns if a flush is...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @franz1981 yeah.. I think we should close this... because:
    
    It's not actually even loop. A ProcessorBase is just the context of a fake executor. Each OrderedExecutor will have its own instance.. This thing may confuse things a bit.
    
    Say you have 10 running sessions doing deliveries. .you will end up with 10 threads.. each an instance of the OrderedExecutor.. the only difference is that this will take a thread from the pool and it may schedule tasks on that.


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @clebertsuconic Re `ArtemisExecutor::flush` I've provided the most defensive/conservative way to manage it, but I'm sure that it can be solved (like `execute`) in a more permissive and simpler way if called inside the `event loop`.


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @clebertsuconic 
    > Instead of changing the semantic of flush, lets just avoid the bug... this PR should be closed IMO.
    
    I'm not sure about it TBH: help to understand why we can't check if `flush` is being called for within the event loop? 
    It is a wrong use of it as you said and we can check/validate it with ease (or at least we could log it as a warning/error) and without any perf impact. 


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @clebertsuconic Why a thread local is not ok? It is not allocating nothing and it costs about like a MOV assembly instruction as it is implemented on the last JDK: having such logs (in debug mode at least)  could be really of some help for a developer if some weird execution paths could happen...


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    throwing an executor call, and waiting for a .wait is clearly a bug... 
    
    Adding this complex code on such a hot path has more risks in itself than eventually fixing issues.
    
    I wouldn't merge this PR


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @andytaylor @clebertsuconic Got it :+1: 
    I will refactor the current PR handling only the fix on `flush` while leaving the improvement for recursive executions to be proposed in a different PR :)


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @franz1981 changing flush is still re-engineering..
    
    if there's any code calling flush within the OrderedExecutor, that's a classical starvation problem.. it's a bug that needs to be removed.
    
    Instead of changing the semantic of flush, lets just avoid the bug... this PR should be closed IMO.


---

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148852871
  
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -111,7 +141,10 @@ public final boolean isFlushed() {
     
        protected void task(T command) {
           tasks.add(command);
    -      startPoller();
    +      //there is no need to alert the poller if already in the event loop
    --- End diff --
    
    This is the improvement I've mentioned: I've left it here just to understand if it is safe but I can push in another PR apart (in the future) too, if it seems safe.


---

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148852944
  
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -111,7 +141,10 @@ public final boolean isFlushed() {
     
        protected void task(T command) {
           tasks.add(command);
    -      startPoller();
    +      //there is no need to alert the poller if already in the event loop
    --- End diff --
    
    @clebertsuconic This is the improvement I've mentioned: I've left it here just to understand if it is safe but I can push in another PR apart (in the future) too, if it seems safe.


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    i understand it and I agree, but please reconsider it with tests and a longer review: as I've said we can't prohibit that kind of behaviours with ease and if it works (as I've tested/expected) it could make everything more stable/solid.


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @gemmellr @clebertsuconic I'm not very used to "even loop" patterns hence I call for help :)
    
    I've found that our `ArtemisExecutor` deadlock with the (fresh new) test `shouldSubmitBlockingTasksFromEventLoopMaintainingOrder` TL;DR if a `CountDownLatch::await` is blocking the event loop while waiting a submitted `CoundDownLatch::countDown`.
    Enforcing any code rule to avoid these patterns to be used it not simple, hence I've provided a fix to it.
    Just as a note, the same pattern (verified) blocks the Netty's `EventLoop` too.
    This fix come at the cost of a potential bigger calls stack, but providing a good boost for "in event loop" executions too, but I'm not 100% to have messed up with others things.
    Wdyt?



---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @franz1981 I disagree.. it would hide bugs.. which could end up in being more dangerous.


---

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148800347
  
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -109,12 +125,49 @@ public final boolean isFlushed() {
           return stateUpdater.get(this) == STATE_NOT_RUNNING;
        }
     
    -   protected void task(T command) {
    -      tasks.add(command);
    --- End diff --
    
    It is not only the flush that has issues but the execute too: take a look at the tests I've added :)


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by andytaylor <gi...@git.apache.org>.
Github user andytaylor commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    Clebert is correct, lets fix rather than re engineer


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @franz1981  shouldSubmitBlockingTasksFromEventLoopMaintainingOrder is a buggy code... it's an anti-pattern we shouldn't have in our code. 
    
    Sometimes we must do such thing, but then we must make sure we are using proper executions.. or proper callbacks (even better)


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    I don't think we should have any thread locals on these executors... they are quite hot on the path.. I would leave it alone.. if there's anything broken.. just change the caller.


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @franz1981 ProcessorBase has no knowledge of Netty. just make the fix simple.. flush is only called in 3 places.. fix those places instead of rewriting the whole thing.


---

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148796005
  
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -109,12 +125,49 @@ public final boolean isFlushed() {
           return stateUpdater.get(this) == STATE_NOT_RUNNING;
        }
     
    -   protected void task(T command) {
    --- End diff --
    
    What about just fix the offending flush than add this complex piece?


---

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148796392
  
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -109,12 +125,49 @@ public final boolean isFlushed() {
           return stateUpdater.get(this) == STATE_NOT_RUNNING;
        }
     
    -   protected void task(T command) {
    -      tasks.add(command);
    --- End diff --
    
    -1.... lets just fix the offending part?


---

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ArtemisExecutor and Proces...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1638#discussion_r148801546
  
    --- Diff: artemis-commons/src/main/java/org/apache/activemq/artemis/utils/actors/ProcessorBase.java ---
    @@ -109,12 +125,49 @@ public final boolean isFlushed() {
           return stateUpdater.get(this) == STATE_NOT_RUNNING;
        }
     
    -   protected void task(T command) {
    -      tasks.add(command);
    --- End diff --
    
    you've added tests that are simulating bugs at the code.. I would rather fix bugs than protect the executor from them.


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @clebertsuconic You're right, but I will run some tests with the broker to see if the warning will be printed...just in case.
    I've added a lil improvement on task submission but for the reasons you've said I can remove it and let it being pushed as an improvement in another PR (in the future). 


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @clebertsuconic Just a couple questions (so I can add some doc/tests): 
    
    1. it is allowed/legal to submit new tasks` from the event loop thread?
    2. it is allowed call `flush` from the event loop thread? What is the expected behaviour while doing it?  


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @franz1981 what fix on flush? where do you see it being used.
    
    
    1. it is legal to submit tasks...
    2. you shouldn't call flush within a thread runnable. That's broken behaviour.
    
    Instead of fixing what's not broken... just remove any calls to flush within executors.. I don't think there's any now.


---

[GitHub] activemq-artemis pull request #1638: ARTEMIS-1499 ProcessorBase warns if a f...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 closed the pull request at:

    https://github.com/apache/activemq-artemis/pull/1638


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ProcessorBase warns if a flush is...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @clebertsuconic after several tests I haven't found any call to 'flush' from within the event loop thread: I can close this if isn't needed :)
    Thanks for the reviews guys 


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by clebertsuconic <gi...@git.apache.org>.
Github user clebertsuconic commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @franz1981  is there actually an issue? an actual deadlock?
    
    My problem is adding complexity to the code for no reason.


---

[GitHub] activemq-artemis issue #1638: ARTEMIS-1499 ArtemisExecutor and ProcessorBase...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1638
  
    @clebertsuconic I've mentioned Netty just as an example of a event loop framework that doesn't handle well recursive submits into the event loop/blocking call issued from without the event loop.
    The fix (if i'll remove the other changes) is simple: i'm just checking if the call will happen inside the event loop: let me push it and you can see it by yourself :+1: 


---