Posted to issues@activemq.apache.org by "Francesco Nigro (Jira)" <ji...@apache.org> on 2020/08/12 21:23:00 UTC

[jira] [Updated] (ARTEMIS-2877) Fix journal replication scalability

     [ https://issues.apache.org/jira/browse/ARTEMIS-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francesco Nigro updated ARTEMIS-2877:
-------------------------------------
    Summary: Fix journal replication scalability   (was: Improve journal replication scalability )

> Fix journal replication scalability 
> ------------------------------------
>
>                 Key: ARTEMIS-2877
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2877
>             Project: ActiveMQ Artemis
>          Issue Type: Task
>          Components: Broker
>    Affects Versions: 2.7.0, 2.8.1, 2.9.0, 2.10.0, 2.10.1, 2.11.0, 2.12.0, 2.13.0, 2.14.0
>            Reporter: Francesco Nigro
>            Assignee: Francesco Nigro
>            Priority: Major
>             Fix For: 2.15.0
>
>
> Journal scalability with a replicated broker pair has degraded due to:
> * a semantic change in journal sync that caused the Netty event loop on the backup to await every journal operation hitting the disk - see https://issues.apache.org/jira/browse/ARTEMIS-2837
> * a semantic change in NettyConnection::write when called from within the Netty event loop: it now writes and flushes buffers immediately, whereas it previously delayed them by re-offering the write to the event loop - see https://issues.apache.org/jira/browse/ARTEMIS-2205 (in particular https://github.com/apache/activemq-artemis/commit/a40a459f8c536a10a0dccae6e522ec38f09dd544#diff-3477fe0d8138d589ef33feeea2ecd28eL377-L392); a sketch of the two semantics follows this list
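> A minimal sketch of the two write semantics (the class and method names are hypothetical, not the actual NettyConnection code):
> {code:java}
> import io.netty.buffer.ByteBuf;
> import io.netty.channel.Channel;
> import io.netty.channel.EventLoop;
>
> final class ConnectionWriteSketch {
>
>    private final Channel channel;
>
>    ConnectionWriteSketch(Channel channel) {
>       this.channel = channel;
>    }
>
>    // pre-ARTEMIS-2205 semantics: a write issued from within the event loop
>    // is re-offered to the loop, so it is delayed until the already queued
>    // tasks (including pending reads) are drained
>    void deferredWrite(ByteBuf buffer) {
>       final EventLoop loop = channel.eventLoop();
>       if (loop.inEventLoop()) {
>          loop.execute(() -> channel.writeAndFlush(buffer, channel.voidPromise()));
>       } else {
>          channel.writeAndFlush(buffer, channel.voidPromise());
>       }
>    }
>
>    // post-ARTEMIS-2205 semantics: write and flush immediately, even from
>    // within the event loop, causing one socket flush per packet
>    void immediateWrite(ByteBuf buffer) {
>       channel.writeAndFlush(buffer, channel.voidPromise());
>    }
> }
> {code}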
> The former issue has been solved by reverting the change and reimplementing the new semantics behind a flag that switches between the two, as in the sketch below.
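> A minimal sketch of the flag-based switch (the names are hypothetical and the real journal append path is far more involved):
> {code:java}
> import java.util.concurrent.CountDownLatch;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
>
> final class JournalSyncSketch {
>
>    private final ExecutorService journalIO = Executors.newSingleThreadExecutor();
>    // false restores the pre-ARTEMIS-2837 semantics: the caller (e.g. the
>    // backup's Netty event loop) does not wait for the record to hit the disk
>    private final boolean blockOnSync;
>
>    JournalSyncSketch(boolean blockOnSync) {
>       this.blockOnSync = blockOnSync;
>    }
>
>    void appendRecord(Runnable diskAppend) throws InterruptedException {
>       final CountDownLatch onDisk = new CountDownLatch(1);
>       journalIO.execute(() -> {
>          diskAppend.run(); // write + fsync of the record
>          onDisk.countDown();
>       });
>       if (blockOnSync) {
>          onDisk.await(); // ARTEMIS-2837 semantics: block until on disk
>       }
>    }
> }
> {code}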
> The latter needs some more explanation:
> # ReplicationEndpoint is responsible for handling the packets coming from the live broker
> # Netty delivers incoming packets to ReplicationEndpoint in batches
> # after each processed packet from the live broker (which typically ends up appending something to the journal), a replication response packet has to be sent back from the backup to the live broker: in the original behavior (< 2.7.0) the responses were delayed until the end of the processed batch of packets, making the journal append records in bursts and amortizing the full cost of waking the I/O thread responsible for appending data to the journal (see the sketch after this list)
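> A sketch of the amortization argument (not Artemis code): an appending thread that drains whatever burst of records has accumulated is woken up once per burst instead of once per record:
> {code:java}
> import java.util.ArrayList;
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.LinkedBlockingQueue;
>
> final class BurstyAppenderSketch implements Runnable {
>
>    private final BlockingQueue<byte[]> pending = new LinkedBlockingQueue<>();
>
>    void offerRecord(byte[] record) {
>       pending.add(record); // wakes the appending thread only if it is idle
>    }
>
>    @Override
>    public void run() {
>       final ArrayList<byte[]> burst = new ArrayList<>();
>       while (true) {
>          try {
>             burst.add(pending.take()); // block waiting for the first record
>          } catch (InterruptedException e) {
>             return;
>          }
>          pending.drainTo(burst);  // grab the rest of the burst, if any
>          appendAndSync(burst);    // one wake-up pays for many records
>          burst.clear();
>       }
>    }
>
>    private void appendAndSync(ArrayList<byte[]> records) {
>       // placeholder: append the records to the journal file and sync it
>    }
> }
> {code}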
> To emulate the original "bursty" behavior, while making the batching more explicit (and tunable), the fix:
> # uses Netty's ChannelInboundHandler::channelReadComplete event to flush each batch of packet responses, as before (see the sketch after this list)
> # [OPTIONAL] implements a new append executor on the journal to further amortize the cost of waking the appending thread
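> A sketch of fix #1 (assuming a hypothetical encodeResponse helper; the real logic lives in the replication channel handler): responses are written, but not flushed, per packet and flushed once per batch, when Netty signals that the current read batch is complete:
> {code:java}
> import io.netty.channel.ChannelHandlerContext;
> import io.netty.channel.ChannelInboundHandlerAdapter;
>
> public class BatchedResponseHandlerSketch extends ChannelInboundHandlerAdapter {
>
>    @Override
>    public void channelRead(ChannelHandlerContext ctx, Object packet) {
>       // process the replication packet (likely appending to the journal),
>       // then queue the response in the outbound buffer without flushing it
>       ctx.write(encodeResponse(packet), ctx.voidPromise());
>    }
>
>    @Override
>    public void channelReadComplete(ChannelHandlerContext ctx) {
>       // a single flush per batch of packets restores the original
>       // "bursty" behavior on the wire
>       ctx.flush();
>    }
>
>    private Object encodeResponse(Object packet) {
>       // hypothetical: build the replication response for the processed packet
>       return packet;
>    }
> }
> {code}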



--
This message was sent by Atlassian Jira
(v8.3.4#803005)