Posted to issues@activemq.apache.org by "Francesco Nigro (Jira)" <ji...@apache.org> on 2020/08/12 17:45:00 UTC

[jira] [Created] (ARTEMIS-2877) Improve journal replication scalability

Francesco Nigro created ARTEMIS-2877:
----------------------------------------

             Summary: Improve journal replication scalability 
                 Key: ARTEMIS-2877
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2877
             Project: ActiveMQ Artemis
          Issue Type: Task
          Components: Broker
    Affects Versions: 2.14.0, 2.13.0, 2.12.0, 2.11.0, 2.10.1, 2.10.0, 2.9.0, 2.8.1, 2.7.0
            Reporter: Francesco Nigro
             Fix For: 2.15.0


Journal scalability with a replicated pair has degraded due to:
* a semantic change on journal sync that caused the Netty event loop on the backup to wait for every journal operation to hit the disk - see [ARTEMIS-2837 Bursts of open files under high load|https://issues.apache.org/jira/browse/ARTEMIS-2837]
* a semantic change in NettyConnection::write when called from within the Netty event loop: it now writes the buffer immediately, while it previously deferred the write by offering it again to the event loop (see the sketch after this list) - see [ARTEMIS-2205 Make AMQP Processing Single Threaded and other AMQP perf improvements|https://issues.apache.org/jira/browse/ARTEMIS-2205] (in particular https://github.com/apache/activemq-artemis/commit/a40a459f8c536a10a0dccae6e522ec38f09dd544#diff-3477fe0d8138d589ef33feeea2ecd28eL377-L392)
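For clarity, a minimal sketch of the two write semantics, assuming a plain Netty Channel; the class and method names are illustrative and are not taken from the actual NettyConnection code:

{code:java}
import io.netty.buffer.ByteBuf;
import io.netty.channel.Channel;

// Illustrative sketch only: not the actual NettyConnection code.
final class WriteSemanticsSketch {

    private final Channel channel;

    WriteSemanticsSketch(Channel channel) {
        this.channel = channel;
    }

    // Pre-2.7.0 semantics: when called from within the event loop, the write
    // is deferred by re-offering it to the event loop, so it executes only
    // after the current batch of inbound packets has been processed.
    void deferredWrite(ByteBuf buffer) {
        if (channel.eventLoop().inEventLoop()) {
            channel.eventLoop().execute(() -> channel.writeAndFlush(buffer));
        } else {
            channel.writeAndFlush(buffer);
        }
    }

    // Post-2.7.0 semantics: the buffer is written immediately, even when
    // called from within the event loop.
    void immediateWrite(ByteBuf buffer) {
        channel.writeAndFlush(buffer);
    }
}
{code}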


The former issue has been solved by reverting the changes and reimplementing the new semantics behind a flag that switches between the two behaviors.
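A minimal sketch of the flag-based switch, assuming a hypothetical blockOnSync flag; the names and structure are illustrative, not the actual journal code:

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only: the flag name and structure are hypothetical.
final class SyncSemanticsSketch {

    // true  -> new semantics: the caller (e.g. the Netty event loop on the
    //          backup) blocks until the append has hit the disk
    // false -> original semantics: the append completes asynchronously and
    //          the caller returns immediately
    private final boolean blockOnSync;
    private final ExecutorService journalThread = Executors.newSingleThreadExecutor();

    SyncSemanticsSketch(boolean blockOnSync) {
        this.blockOnSync = blockOnSync;
    }

    void append(Runnable diskAppend) throws InterruptedException {
        if (blockOnSync) {
            CountDownLatch done = new CountDownLatch(1);
            journalThread.execute(() -> {
                diskAppend.run();
                done.countDown();
            });
            done.await(); // wait for the record to hit the disk
        } else {
            journalThread.execute(diskAppend); // fire-and-forget
        }
    }
}
{code}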

The latter needs some more explanation to be understood:
# ReplicationEndpoint is responsible for handling packets coming from the live broker
# Netty processes incoming packets in batches, depending on how the live broker sends them
# after each processed packet coming from the live (each likely ending with an append to the journal), a replication response packet is sent from the backup to the live: in the original behavior (before 2.7.0), delaying the response by offering it back into the event loop meant it was sent only after the whole incoming batch of packets had been handled, making the journal process bursts of appends and amortizing the full cost of waking up the I/O thread responsible for appending data to the journal
# Netty can capture the ChannelInboundHandler::channelReadComplete event to flush batched packet responses, allowing the journal to handle bursts of appends as before (see the first sketch after this list)
# a new executor on the journal further reduces the cost of waking up the appending thread, lowering the per-append cost (see the second sketch after this list)
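A minimal sketch of point 4, using the standard Netty ChannelInboundHandlerAdapter callbacks; the handler and its message handling are illustrative, not the actual ReplicationEndpoint code:

{code:java}
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

// Illustrative sketch only: not the actual ReplicationEndpoint code.
final class BatchingResponseHandler extends ChannelInboundHandlerAdapter {

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object packet) {
        Object response = process(packet); // e.g. append the packet's data to the journal
        // Queue the response in the outbound buffer without flushing:
        // no syscall and no response sent per individual packet.
        ctx.write(response);
    }

    @Override
    public void channelReadComplete(ChannelHandlerContext ctx) {
        // Netty fires this once the current batch of inbound packets has been
        // drained: flush all queued responses back to the live in one go.
        ctx.flush();
    }

    private Object process(Object packet) {
        return packet; // placeholder for the real replication handling
    }
}
{code}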


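And a minimal sketch of point 5: a dedicated single-threaded executor that wakes the appending thread at most once per burst of appends; the names are illustrative, not the actual journal code:

{code:java}
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch only: not the actual Artemis journal code.
final class JournalAppendExecutorSketch {

    private final Queue<Runnable> pendingAppends = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean drainScheduled = new AtomicBoolean();
    private final ExecutorService appendExecutor = Executors.newSingleThreadExecutor();

    void submitAppend(Runnable append) {
        pendingAppends.offer(append);
        // Wake the appending thread at most once per burst: if a drain is
        // already scheduled, later appends piggyback on it at no extra cost.
        if (drainScheduled.compareAndSet(false, true)) {
            appendExecutor.execute(this::drain);
        }
    }

    private void drain() {
        // Clear the flag before draining so that an append arriving during
        // the drain can always schedule a new drain and is never lost.
        drainScheduled.set(false);
        Runnable append;
        while ((append = pendingAppends.poll()) != null) {
            append.run(); // perform the actual journal append
        }
    }
}
{code}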