Posted to gitbox@activemq.apache.org by GitBox <gi...@apache.org> on 2021/05/04 16:19:37 UTC

[GitHub] [activemq-artemis] franz1981 opened a new pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

franz1981 opened a new pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566


   https://issues.apache.org/jira/browse/ARTEMIS-3282


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-artemis] franz1981 commented on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-832470852


   I believe this can be further improved too, just need to think twice how :)
   





[GitHub] [activemq-artemis] franz1981 commented on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-845190747


   I think this one is in good shape now and it's fixing the bug of sync'ing on the replica too @clebertsuconic 








[GitHub] [activemq-artemis] gtully commented on a change in pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
gtully commented on a change in pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#discussion_r636791950



##########
File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/replication/ReplicationEndpoint.java
##########
@@ -221,6 +221,8 @@ public void handlePacket(final Packet packet) {
                handleCommitRollback((ReplicationCommitMessage) packet);
                break;
             case PacketImpl.REPLICATION_PAGE_WRITE:
+               // potential blocking I/O operation! flush existing packets to save long tail latency
+               endOfBatch();

Review comment:
    OK, I understand better: the batching is limited by the consumed buffers, so before the next read there will always be a batch end.







[GitHub] [activemq-artemis] franz1981 commented on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-833834162


   Preliminary tests using paging don't show such a regression anyway:
   no batching:
   ```
   **************
   RUN 1	EndToEnd Throughput: 10626 ops/sec
   **************
   EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
   mean               1522.66
   min                 129.54
   50.00%             1376.26
   90.00%             1507.33
   99.00%             5734.40
   99.90%            10747.90
   99.99%            18350.08
   max               26214.40
   count               160000
   ```
   batching:
   ```
   **************
   RUN 1	EndToEnd Throughput: 11202 ops/sec
   **************
   EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
   mean               1446.31
   min                 111.10
   50.00%             1376.26
   90.00%             1482.75
   99.00%             3784.70
   99.90%             8224.77
   99.99%            14024.70
   max               17563.65
   count               160000
   ```
   Same scenario: 16 producers/consumers sending 100-byte persistent messages to 16 queues with replication, against a broker that's already paging due to
   ```
   ./artemis producer --message-count 22000 --message-size 100000
   ```
   During the whole load test the broker is paging over the 16 queues != TEST, and batching still seems beneficial
   
   






[GitHub] [activemq-artemis] franz1981 commented on a change in pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on a change in pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#discussion_r630973519



##########
File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/replication/ReplicationEndpoint.java
##########
@@ -221,6 +221,8 @@ public void handlePacket(final Packet packet) {
                handleCommitRollback((ReplicationCommitMessage) packet);
                break;
             case PacketImpl.REPLICATION_PAGE_WRITE:
+               // potential blocking I/O operation! flush existing packets to save long tail latency
+               endOfBatch();

Review comment:
       > I really don't see how a user can choose a value for maxReplicaResponseBatchBytes
   
   It can be 0 for users that care about 99.XXX percentile latencies and have configured kernel bypass drivers.
   It could be the MTU size for users that know it, and it should be -1 for "common" users (right now I've chosen 1500, the typical MTU size).
   I think it's a very low-level config really, and it's yet to be seen how useful it can be: I'm still in the process of validating it before dropping the "draft" status of the PR.
   
   >  I think it has to be automatic or automagical, based on some limit on what can be read.
   
   From the point of view of network utilization and memory usage, just using -1 or 1500 is already a huge step forward compared with the "previous" (pre-https://issues.apache.org/jira/browse/ARTEMIS-2877) behaviour.
   
   Size of Ethernet frame - 24 Bytes
   Size of IPv4 Header (without any options) - 20 bytes
   Size of TCP Header (without any options) - 20 Bytes
   
   Total size of an Ethernet Frame carrying an IP Packet with an empty TCP Segment - 24 + 20 + 20 = 64 bytes
   
   When the packet size is > MTU, the TCP packets are going to be fragmented, but that's fine because it amortizes the syscall cost while maximizing network usage too.
   Just sending responses one by one, by contrast, means a ~3X overhead of data for each response sent, which will hurt both latencies and CPU/network usage.
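   To sanity-check the arithmetic above, here's a tiny sketch; the 30-byte response payload below is an assumption for illustration only, not the actual Artemis response size:
   
   ```java
   // Back-of-envelope check of the framing overhead quoted above.
   // Header sizes come from the comment; the payload size is hypothetical.
   public class FramingOverhead {
      static final int ETH = 24, IPV4 = 20, TCP = 20;  // bytes, no options
      static final int PER_SEGMENT = ETH + IPV4 + TCP; // = 64 bytes
   
      // Wire bytes when each response travels in its own TCP segment.
      static long unbatched(int responses, int payload) {
         return (long) responses * (PER_SEGMENT + payload);
      }
   
      // Wire bytes when all responses share one segment (ignoring MTU splits).
      static long batched(int responses, int payload) {
         return PER_SEGMENT + (long) responses * payload;
      }
   
      public static void main(String[] args) {
         System.out.println(unbatched(100, 30)); // 9400
         System.out.println(batched(100, 30));   // 3064
      }
   }
   ```
   
   With a ~30-byte payload the per-response framing alone is roughly 2X the payload, i.e. ~3X total bytes on the wire, which matches the ~3X figure above.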
   
   
   >  I wonder if something similar based on confirmation-window could work here
   
   No idea; IIRC the replication flow of packets doesn't obey any of the other cluster connection channel packet flow rules (no duplicate checks, no confirmation window or anything similar). @clebertsuconic can you confirm? 







[GitHub] [activemq-artemis] franz1981 commented on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-832632624


   Results show that with a long enough stream of packet types, `switch` performs slightly better than the `if` cascade:
   ```
   Benchmark                              (mostSeenProb)  (size)  Mode  Cnt   Score   Error  Units
   SwitchVsIfCascadeBenchmark.testIf                  40    8192  avgt   16  21.344 ± 1.075  ns/op
   SwitchVsIfCascadeBenchmark.testIf                  60    8192  avgt   16  17.265 ± 0.091  ns/op
   SwitchVsIfCascadeBenchmark.testIf                  80    8192  avgt   16  11.571 ± 0.046  ns/op
   SwitchVsIfCascadeBenchmark.testSwitch              40    8192  avgt   16  19.416 ± 0.166  ns/op
   SwitchVsIfCascadeBenchmark.testSwitch              60    8192  avgt   16  15.342 ± 0.074  ns/op
   SwitchVsIfCascadeBenchmark.testSwitch              80    8192  avgt   16  10.942 ± 0.013  ns/op
   ```
   So there's no point in using the `if` version, given that it's a bit less readable too.
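   For context, the two dispatch styles being compared look roughly like this (a sketch with made-up packet-type constants, not the real `PacketImpl` values or the actual JMH harness):
   
   ```java
   public class Dispatch {
      // Hypothetical packet-type constants, standing in for PacketImpl's.
      static final byte REPLICATION_COMMIT = 1;
      static final byte REPLICATION_PAGE_WRITE = 2;
      static final byte REPLICATION_APPEND = 3;
   
      // switch-based dispatch (the variant kept in the PR)
      static int handleSwitch(byte type) {
         switch (type) {
            case REPLICATION_COMMIT:     return 10;
            case REPLICATION_PAGE_WRITE: return 20;
            case REPLICATION_APPEND:     return 30;
            default:                     return -1;
         }
      }
   
      // equivalent if-cascade (slightly slower in the JMH numbers above)
      static int handleIf(byte type) {
         if (type == REPLICATION_COMMIT)     return 10;
         if (type == REPLICATION_PAGE_WRITE) return 20;
         if (type == REPLICATION_APPEND)     return 30;
         return -1;
      }
   }
   ```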





[GitHub] [activemq-artemis] franz1981 closed pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 closed pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566


   





[GitHub] [activemq-artemis] franz1981 commented on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-832072034


   This can be relevant for you, @michaelandrepearce, because with super fast kernel bypass drivers batching can cause some tail latency outliers; this tuning can prevent that (by setting the batch size to 0) and can be used to improve performance too.
   
   I still need to perform some tests and see if the new default (~ the TCP MTU of 1500 bytes) is good enough for most users.






[GitHub] [activemq-artemis] gtully commented on a change in pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
gtully commented on a change in pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#discussion_r631215271



##########
File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/replication/ReplicationEndpoint.java
##########
@@ -221,6 +221,8 @@ public void handlePacket(final Packet packet) {
                handleCommitRollback((ReplicationCommitMessage) packet);
                break;
             case PacketImpl.REPLICATION_PAGE_WRITE:
+               // potential blocking I/O operation! flush existing packets to save long tail latency
+               endOfBatch();

Review comment:
    For sure batching is good; what I am struggling with is the use case for setting a limit on accumulating. Because the other end is waiting, there also needs to be some timeout in case the limit is not reached. Batching on low utilisation seems unfair on the sender in this case, and the sender is blocking. I need to peek more to see what -1 does :-) thanks for the detail.







[GitHub] [activemq-artemis] franz1981 commented on a change in pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on a change in pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#discussion_r630803743



##########
File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/replication/ReplicationEndpoint.java
##########
@@ -221,6 +221,8 @@ public void handlePacket(final Packet packet) {
                handleCommitRollback((ReplicationCommitMessage) packet);
                break;
             case PacketImpl.REPLICATION_PAGE_WRITE:
+               // potential blocking I/O operation! flush existing packets to save long tail latency
+               endOfBatch();

Review comment:
       @clebertsuconic @gtully 
   I'm not yet sure about it: a page write can be a very fast operation depending on the amount of data to be written, i.e. it's really just the cost of copying some buffer X times. Flushing any existing batch is just to ensure no further delay affects the overall response time of replication, but if the batch is too small, poor network utilization will still hurt scalability and average latencies.
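   The batch-then-flush trade-off being discussed can be sketched like this (a minimal illustration, not the Artemis implementation; the byte budget plays the role of the `maxReplicaResponseBatchBytes` config, and the names are illustrative):
   
   ```java
   import java.util.ArrayDeque;
   import java.util.Deque;
   
   class ResponseBatcher {
      private final int maxBatchBytes;   // e.g. ~MTU (1500); 0 flushes every response
      private final Deque<byte[]> pending = new ArrayDeque<>();
      private int pendingBytes;
      int flushes;                       // exposed for illustration
   
      ResponseBatcher(int maxBatchBytes) {
         this.maxBatchBytes = maxBatchBytes;
      }
   
      void add(byte[] response) {
         pending.add(response);
         pendingBytes += response.length;
         if (pendingBytes >= maxBatchBytes) {
            endOfBatch();                // byte budget reached: send what we have
         }
      }
   
      // Flush everything queued so far, e.g. before a potentially blocking
      // page write, so already-accumulated responses aren't delayed by it.
      void endOfBatch() {
         if (pending.isEmpty()) {
            return;
         }
         pending.clear();                // stand-in for writing the batch to the channel
         pendingBytes = 0;
         flushes++;
      }
   }
   ```
   
   A small budget keeps tail latency low at the cost of more (smaller) writes; a larger one improves network utilization but lets responses sit, which is exactly the tension described above.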








[GitHub] [activemq-artemis] franz1981 commented on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-857460000


   I've decided NOT to expose the response batch size, but instead to prevent the sync on paging/large messages from happening; going to create a separate PR to address this.





[GitHub] [activemq-artemis] franz1981 commented on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 commented on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-833881055


   The last commit isn't giving the nice speedup I was expecting:
   ```
   **************
   RUN 1	EndToEnd Throughput: 11155 ops/sec
   **************
   EndToEnd SERVICE-TIME Latencies distribution in MICROSECONDS
   mean               1438.52
   min                 116.74
   50.00%             1376.26
   90.00%             1482.75
   99.00%             3424.26
   99.90%             7143.42
   99.99%            15400.96
   max               17825.79
   count               160000
   ```
   Latencies are on par with normal batching + fdatasync, although the box used has a very fast disk and the amount of paged data wasn't enough to make fdatasync take long enough.






[GitHub] [activemq-artemis] gtully commented on a change in pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
gtully commented on a change in pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#discussion_r630953529



##########
File path: artemis-server/src/main/java/org/apache/activemq/artemis/core/replication/ReplicationEndpoint.java
##########
@@ -221,6 +221,8 @@ public void handlePacket(final Packet packet) {
                handleCommitRollback((ReplicationCommitMessage) packet);
                break;
             case PacketImpl.REPLICATION_PAGE_WRITE:
+               // potential blocking I/O operation! flush existing packets to save long tail latency
+               endOfBatch();

Review comment:
    I think if blocking ops are removed, there is no need to eagerly flush the batch here. However, I really don't see how a user can choose a value for maxReplicaResponseBatchBytes. I think it has to be automatic or automagical, based on some limit on what can be read.
   5.x has optimiseAck on by default, at 70% of the prefetch for openwire consumers. That covers the auto-ack case and only sends an actual ack every X non-persistent messages. I wonder if something similar based on confirmation-window could work here. Though the endpoint probably has no way of knowing what is set on the other end, I guess. Is there a relationship between the confirmation window and responses already?









[GitHub] [activemq-artemis] franz1981 edited a comment on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 edited a comment on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-833811477


   I was expecting that exposing the batching size would give a good speedup with kernel bypass drivers, but it seems not: I see instead that the current implementation can hide a subtle perf regression with paging/large messages.
   I'm going to perform some tests to verify it, but it seems related to the fact that, on the backup, large message writes and page writes are still happening on the Netty thread, and given that they perform synchronous operations on file channels (some form of FileChannel::write), they can delay for some time any already batched responses to be sent back to the master broker.
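   One possible shape of that fix, sketched with illustrative names (this is not the Artemis code; it only shows the idea of moving the blocking file work off the event loop so batched responses aren't held up):
   
   ```java
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   import java.util.concurrent.TimeUnit;
   
   class PageWriteOffload {
      // A single thread keeps page writes ordered while freeing the Netty thread.
      private final ExecutorService ioExecutor = Executors.newSingleThreadExecutor();
   
      // Called from the Netty thread: schedule the blocking write and return
      // immediately, so already-batched responses can be flushed without waiting.
      void handlePageWrite(Runnable blockingWrite, Runnable sendResponse) {
         ioExecutor.execute(() -> {
            blockingWrite.run();   // FileChannel::write / force happens off-loop
            sendResponse.run();    // respond once the write has completed
         });
      }
   
      void shutdownAndWait() {
         ioExecutor.shutdown();
         try {
            ioExecutor.awaitTermination(5, TimeUnit.SECONDS);
         } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
         }
      }
   }
   ```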
   
   
   





[GitHub] [activemq-artemis] franz1981 edited a comment on pull request #3566: ARTEMIS-3282 Expose replication response batching tuning

Posted by GitBox <gi...@apache.org>.
franz1981 edited a comment on pull request #3566:
URL: https://github.com/apache/activemq-artemis/pull/3566#issuecomment-833839211


   Tests have shown something interesting going on on the backup side, although it's not a regression:
   ![image](https://user-images.githubusercontent.com/13125299/117361311-2aca8e80-aeba-11eb-83fd-9000719ba939.png)
   In violet there are calls to `FileChannel::write` due to `handlePageWriteEvent` and `FileChannelImpl::force` due to `handlePageEvent`. The latter doesn't seem right, considering what we deliver with common journal writes, i.e. writes are performed asynchronously and we send the response back before hitting the journal. @clebertsuconic wdyt?
   I'm thinking of preventing the `fdatasync` due to paging from happening in the Netty event loop....
   
   
   
   

