Posted to dev@community.apache.org by "Rongtong Jin (Jira)" <ji...@apache.org> on 2023/03/13 11:53:00 UTC

[jira] [Created] (COMDEV-513) RocketMQ TieredStore Integration with High Availability Architecture

Rongtong Jin created COMDEV-513:
-----------------------------------

             Summary: RocketMQ TieredStore Integration with High Availability Architecture
                 Key: COMDEV-513
                 URL: https://issues.apache.org/jira/browse/COMDEV-513
             Project: Community Development
          Issue Type: Task
          Components: Comdev, GSoC/Mentoring ideas
            Reporter: Rongtong Jin


*Apache RocketMQ*

Apache RocketMQ is a distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability.

Page: [https://rocketmq.apache.org/]

 

*Background*

With the official release of RocketMQ 5.1.0, tiered storage arrived as a new independent module at the Technical Preview milestone. It lets users offload messages from local disks to cheaper storage, extending message retention time at a lower cost.

Reference RIP-57: [https://github.com/apache/rocketmq/wiki/RIP-57-Tiered-storage-for-RocketMQ]

In addition, RocketMQ introduced a new high availability architecture in version 5.0.

Reference RIP-44: [https://github.com/apache/rocketmq/wiki/RIP-44-Support-DLedger-Controller]

However, RocketMQ tiered storage currently supports only a single replica.

 

*Task*

Because tiered storage currently supports only a single replica, the following issues remain in integrating it with the high availability architecture:
 * Metadata synchronization: how to reliably synchronize metadata between master and slave nodes.
 * Disallowing message uploads beyond the confirm offset: to avoid message rollback, the maximum uploaded offset must not exceed the confirm offset.
 * Starting tiered storage upload when a slave is promoted to master, and stopping it when a master is demoted to slave: only the master node has write and delete permissions, and a newly promoted slave needs to quickly resume uploading from the breakpoint where the previous master stopped.
 * Design of the slave pull protocol: how a newly launched, empty slave can correctly synchronize data through the tiered storage architecture. (If synchronization starts from the first or last file, resuming from the breakpoint may not be possible after another switch.)
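As a starting point for discussion, the second and third issues could be sketched roughly as follows. This is only an illustrative sketch, not RocketMQ code: the class TieredUploadSketch, its methods, and the batch-size parameter are all hypothetical, and it assumes every offset is an exclusive upper bound and that the slave learns the master's upload progress through some metadata synchronization mechanism that is not shown.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names are existing RocketMQ APIs. */
class TieredUploadSketch {
    enum Role { MASTER, SLAVE }

    // A broker is assumed to start as a slave until it is told otherwise.
    private volatile Role role = Role.SLAVE;

    // Exclusive end of the data already stored in tiered storage.
    // Assumption: slaves keep this up to date from synchronized metadata.
    private final AtomicLong uploadedOffset = new AtomicLong(0);

    /** Called when the HA layer promotes or demotes this broker. */
    void onRoleChange(Role newRole) {
        this.role = newRole;
    }

    /** Slave side: record upload progress reported by the master. Never moves backwards. */
    void syncUploadedOffset(long masterUploadedOffset) {
        uploadedOffset.accumulateAndGet(masterUploadedOffset, Math::max);
    }

    /**
     * Returns the exclusive end offset of the next batch to upload,
     * or -1 if this broker must not upload anything right now.
     */
    long nextUploadEnd(long localMaxOffset, long confirmOffset, long maxBatch) {
        if (role != Role.MASTER) {
            return -1; // only the master has write and delete permissions
        }
        long start = uploadedOffset.get();
        // Never upload past the confirm offset, so uploaded data is never rolled back.
        long limit = Math.min(localMaxOffset, confirmOffset);
        if (limit <= start) {
            return -1; // nothing confirmed beyond what is already uploaded
        }
        return Math.min(limit, start + maxBatch);
    }

    /** Master side: record that data up to endOffset (exclusive) is now in tiered storage. */
    void markUploaded(long endOffset) {
        uploadedOffset.accumulateAndGet(endOffset, Math::max);
    }

    long uploadedOffset() {
        return uploadedOffset.get();
    }
}
```

In this sketch a promoted slave that has synchronized an uploaded offset of 300 continues from 300 rather than from the first or last file, and with a confirm offset of 800 it never uploads past 800 even if its local log extends further. A real design would still have to decide how the metadata is replicated and what happens when the synchronized offset is stale.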

You therefore need to provide a complete plan that solves the issues above and ultimately completes the integration of tiered storage with the high availability architecture, verifying it against the existing file-based version of tiered storage and with OpenChaos testing.

 

*Relevant Skills*
 * Interest in messaging middleware and distributed storage systems
 * Java development skills
 * Good understanding of RocketMQ tiered storage and high availability architecture


