You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@activemq.apache.org by qihongxu <gi...@git.apache.org> on 2018/12/26 07:00:51 UTC

[GitHub] activemq-artemis pull request #2482: ARTEMIS-2214 Cache durable&priority in ...

GitHub user qihongxu opened a pull request:

    https://github.com/apache/activemq-artemis/pull/2482

    ARTEMIS-2214 Cache durable&priority in PagedReference

    We recently performed a test on artemis broker and found a severe performance issue.
    
    When paged messages are being consumed, decrementMetrics in QueuePendingMessageMetrics will try to ‘getMessage’ to check whether they are durable or not. In this way queue will be locked for a long time because page may be GCed and need to be reload entirely. Other operations rely on queue will be blocked at this time, which cause a significant TPS drop. Detailed stacks are attached below.
    
    This also happens when consumer is closed and messages are pushed back to the queue, artemis will check priority on return if these messages are paged.
    
    To solve the issue, durable and priority need to be cached in PagedReference just like messageID, transactionID and so on. I have applied a patch to fix the issue. Any review is appreciated.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/qihongxu/activemq-artemis modify_pagedReference

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/activemq-artemis/pull/2482.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2482
    
----
commit a49ad880c2372afdb88bd805fb6e20fdae1de784
Author: Qihong Xu <qi...@...>
Date:   2018-12-26T03:11:10Z

    ARTEMIS-2214 Cache durable&priority in PagedReference

----


---

[GitHub] activemq-artemis pull request #2482: ARTEMIS-2214 Cache durable&priority in ...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/2482#discussion_r244150155
  
    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/paging/cursor/PagedReferenceImpl.java ---
    @@ -363,4 +371,20 @@ public long getPersistentSize() {
           return messageSize;
        }
     
    +   @Override
    +   public byte getPriority() {
    +      if (priority == -1) {
    +         priority = getMessage().getPriority();
    +      }
    +      return priority;
    +   }
    +
    +   @Override
    +   public boolean isDurable() {
    +      if (messageID < 0) {
    --- End diff --
    
    should use durable field or additional flag to see if its been cached, messageID maybe cached already but not durable.


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    Nice catch!!!
    If you care about checking perf of paging please try to profile the broker with https://github.com/jvm-profiling-tools/async-profiler using lock/cpu events too: it will allows to get profile data without affecting performance, helping to find issues like this one :+1


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by qihongxu <gi...@git.apache.org>.
Github user qihongxu commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    > Im very cautious of optimising for one case
    > Its very unusual for consumers to go away as it is good design in most mom a consumer is long lived. Like wise upgrade is a non normal event.
    > 
    > So without priority removed im -1 this. As i stated in an ideal world everything in paging would be off heap, we shouldnt just end up with every value in page ref, else we lose the reason for paging originals intent which is to remove the message from heap at penalty that it will be slower. If you want faster add more ram so more can be kept on heap.
    
    Not adding priority is acceptable since it's rare. I will later remove the priority part.
    And how about deliveryTime? It's already in PageRef and might need to be set in the constructor.


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&deliveryTime in Pag...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    @franz1981 im away in another country without my main computer with my apache git ssh cert key. Could you merge this?


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    @qihongxu nudge


---

[GitHub] activemq-artemis pull request #2482: ARTEMIS-2214 Cache durable&priority in ...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/2482#discussion_r244149886
  
    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/paging/cursor/PagedReferenceImpl.java ---
    @@ -74,6 +74,10 @@
     
        private long messageSize = -1;
     
    +   private byte priority;
    --- End diff --
    
    default this to -1


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by qihongxu <gi...@git.apache.org>.
Github user qihongxu commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    [stacks.txt](https://github.com/apache/activemq-artemis/files/2709657/stacks.txt)



---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by qihongxu <gi...@git.apache.org>.
Github user qihongxu commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    @michaelandrepearce Thanks for the review. Now I have changed the code at your suggestion.


---

[GitHub] activemq-artemis pull request #2482: ARTEMIS-2214 Cache durable&priority in ...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/2482#discussion_r244150563
  
    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/paging/cursor/PagedReferenceImpl.java ---
    @@ -74,6 +74,10 @@
     
        private long messageSize = -1;
     
    +   private byte priority;
    +
    +   private boolean durable;
    --- End diff --
    
    need to default this not set somehow, possible use a byte to represent the boolean with 0 = false, 1 = true, -1 not set


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    Yes you could add this to set in the section within the constructor where message != null, seems sensible.


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by qihongxu <gi...@git.apache.org>.
Github user qihongxu commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    > Nice catch!!!
    > If you care about checking perf of paging please try to profile the broker with https://github.com/jvm-profiling-tools/async-profiler using lock/cpu events too: it will allows to get profile data without affecting performance, helping to find issues like this one :)
    
    Thanks for your advice! We used to record performance from producer/consumer side. Later on we will try to dig into broker side by using tools as recommended :)


---

[GitHub] activemq-artemis pull request #2482: ARTEMIS-2214 Cache durable&deliveryTime...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/2482#discussion_r245233815
  
    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/paging/cursor/PagedReferenceImpl.java ---
    @@ -120,14 +126,16 @@ public PagedReferenceImpl(final PagePosition position,
              this.largeMessage = message.getMessage().isLargeMessage() ? IS_LARGE_MESSAGE : IS_NOT_LARGE_MESSAGE;
              this.transactionID = message.getTransactionID();
              this.messageID = message.getMessage().getMessageID();
    -
    +         this.durable = message.getMessage().isDurable() ? IS_DURABLE : IS_NOT_DURABLE;
    +         this.deliveryTime = message.getMessage().getScheduledDeliveryTime();
              //pre-cache the message size so we don't have to reload the message later if it is GC'd
              getPersistentSize();
           } else {
              this.largeMessage = UNDEFINED_IS_LARGE_MESSAGE;
              this.transactionID = -2;
              this.messageID = -1;
              this.messageSize = -1;
    +         this.durable = UNDEFINED_IS_DURABLE;
    --- End diff --
    
    for completeness (its a nit) set deliveryTime to its undefined value here.


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    Just looking over this a bit more. As the page ref needs to be a small as possible. The whole point of paging is to remove it from memory to be able to scale.
    
    With regards to priority i dont see this being accessed or affecting a hot path its only on access to message references. In regards to a consumer closing and having unhandled messages. This isnt on hot path imo, and shouldnt be tuning for that.


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by qihongxu <gi...@git.apache.org>.
Github user qihongxu commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    @michaelandrepearce 
    There are some cases will perform lots of rollbacks in a short period of time. For example if we would like to upgrade our server while thousands of consumers are receiving message, the close of server causes massive rollbacks to original address, thus queue might blocked on reading GC'ed pages. Under this circumstance the upgrade will take more than 5-10 minutes(e.g 2000 consumers) and make a negative impact on downstream systems.
    
    > deliveryTime can be set in the constructor like transactionID , messageID , etc :)
    
    @wy96f 
    Yes we do find similar block in PagedReferenceImpl::getScheduledDeliveryTime() since if deliveryTime is not set it will call getMessage() during rollback. 
    
    To these two situations, detailed stacks are shown in attachment.
    
    Considering priority only occupy one byte, it might be worthwhile to add it in PageRef to improve stability:) As for deliveryTime, since it is already in PageRef, we can simply add `this.deliveryTime = message.getMessage().getScheduledDeliveryTime();` in constructor to avoid block on rollback.


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&priority in PagedRe...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    Im very cautious of optimising for one case
     Its very unusual for consumers to go away as it is good design in most mom a consumer is long lived.
    
    So without it removed im -1. As i stated in an ideal world everything in paging would be off heap. 


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&deliveryTime in Pag...

Posted by franz1981 <gi...@git.apache.org>.
Github user franz1981 commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    I will probably do it on the weekend or on Monday ;)


---

[GitHub] activemq-artemis issue #2482: ARTEMIS-2214 Cache durable&deliveryTime in Pag...

Posted by michaelandrepearce <gi...@git.apache.org>.
Github user michaelandrepearce commented on the issue:

    https://github.com/apache/activemq-artemis/pull/2482
  
    @qihongxu looks good to me now.


---

[GitHub] activemq-artemis pull request #2482: ARTEMIS-2214 Cache durable&priority in ...

Posted by wy96f <gi...@git.apache.org>.
Github user wy96f commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/2482#discussion_r245195927
  
    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/paging/cursor/PagedReferenceImpl.java ---
    @@ -120,6 +128,8 @@ public PagedReferenceImpl(final PagePosition position,
              this.largeMessage = message.getMessage().isLargeMessage() ? IS_LARGE_MESSAGE : IS_NOT_LARGE_MESSAGE;
              this.transactionID = message.getTransactionID();
              this.messageID = message.getMessage().getMessageID();
    +         this.priority = message.getMessage().getPriority();
    --- End diff --
    
    deliveryTime can be set in the constructor like transactionID , messageID , etc :)


---