Posted to dev@community.apache.org by "Zhimin Li (Jira)" <ji...@apache.org> on 2023/03/08 06:12:00 UTC

[jira] [Created] (COMDEV-506) [GSoC] RocketMQ TieredStore Integration with HDFS

Zhimin Li created COMDEV-506:
--------------------------------

             Summary: [GSoC] RocketMQ TieredStore Integration with HDFS
                 Key: COMDEV-506
                 URL: https://issues.apache.org/jira/browse/COMDEV-506
             Project: Community Development
          Issue Type: New Feature
         Environment: rocketmq,hdfs
            Reporter: Zhimin Li


h2. [GSoC] RocketMQ TieredStore Integration with HDFS

GitHub Issue: [https://github.com/apache/rocketmq/issues/6282]
h3. Apache RocketMQ and HDFS
 *  Apache RocketMQ is a cloud native messaging and streaming platform, making it simple to build event-driven applications. 

 *  Hadoop Distributed File System (HDFS) is a distributed file system designed to store and manage large data sets across multiple servers or clusters. It provides a reliable, scalable, and fault-tolerant platform on which a variety of applications running on the Hadoop cluster can store and access data.

h3. Background

High-speed storage media, such as solid-state drives (SSDs), are typically more expensive than traditional hard disk drives (HDDs). To minimize storage costs, the local data disk of a RocketMQ broker is often limited in size. HDFS can store large amounts of data at a lower cost, and it is better suited to sequential reads and writes than to random access. To preserve message data over a long period and to facilitate message export, the RocketMQ project previously introduced a tiered storage plugin. What is needed now is a storage plugin that saves data on HDFS.
h3. Relevant Skills
 * Interest in messaging middleware and distributed storage systems

 * Java development skills

 * A good understanding of the RocketMQ and HDFS models

Above all, the most important skill is motivation and readiness to learn during the project!
h3. Tasks
 * understand the basic concepts and principles in distributed systems

 * provide related design documents

 * develop a tiered storage plugin that uses HDFS as the backend to store RocketMQ message data

 * write effective unit test code

 * suggest improvements to the tiered storage interface

 * anything else that comes to mind; further ideas are always welcome
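
To make the plugin task more concrete, here is a minimal sketch of what an append-only segment store might look like. Everything in it is an assumption made for illustration: {{SegmentStore}}, {{LocalFileSegmentStore}} and {{TieredSegmentSketch}} are hypothetical names and do not reproduce the actual RocketMQ tiered storage interface, and a local file stands in for HDFS so that the example runs without a Hadoop cluster. An HDFS-backed implementation would presumably fulfil the same contract through Hadoop's {{org.apache.hadoop.fs.FileSystem}} API instead of {{FileChannel}}.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Small demo: append two messages, then read the second one back by offset.
final class TieredSegmentSketch {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        try (SegmentStore store = new LocalFileSegmentStore(file)) {
            long first = store.append("hello".getBytes(StandardCharsets.UTF_8));
            long second = store.append("rocketmq".getBytes(StandardCharsets.UTF_8));
            String msg = new String(store.read(second, 8), StandardCharsets.UTF_8);
            System.out.println(first + " " + second + " " + msg); // prints "0 5 rocketmq"
        } finally {
            Files.deleteIfExists(file);
        }
    }
}

// Hypothetical contract for illustration only; NOT the real RocketMQ interface.
interface SegmentStore extends AutoCloseable {
    // Appends data at the end of the segment and returns its start offset.
    long append(byte[] data) throws IOException;

    // Reads exactly `length` bytes starting at `offset`.
    byte[] read(long offset, int length) throws IOException;

    @Override
    void close() throws IOException;
}

// Local-file stand-in for an HDFS-backed store (assumption for this sketch).
final class LocalFileSegmentStore implements SegmentStore {
    private final FileChannel channel;

    LocalFileSegmentStore(Path file) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
    }

    @Override
    public long append(byte[] data) throws IOException {
        long offset = channel.size();          // writes always go to the tail
        ByteBuffer buf = ByteBuffer.wrap(data);
        long pos = offset;
        while (buf.hasRemaining()) {
            pos += channel.write(buf, pos);    // positional write, may be partial
        }
        return offset;
    }

    @Override
    public byte[] read(long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = offset;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                throw new IOException("read past end of segment");
            }
            pos += n;
        }
        return buf.array();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

The sketch only ever writes at the tail of the file, which matches the sequential access pattern that HDFS handles well; reads by offset are what a broker would need to fetch older messages from the tiered store.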

h3. Learning Material
 * RocketMQ HomePage ([https://rocketmq.apache.org/]) and GitHub repository ([https://github.com/apache/rocketmq])

 * RocketMQ Tiered Storage Design ([https://github.com/apache/rocketmq/wiki/RIP-57-Tiered-storage-for-RocketMQ])

 * HDFS User Guide ([https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html])

h3. Name and contact information
 * Mentor: Zhimin Li, Apache RocketMQ Committer, [lizhimin@apache.org|mailto:lizhimin@apache.org]

 * Mailing List: [dev@rocketmq.apache.org|mailto:dev@rocketmq.apache.org]

 * Website: [https://rocketmq.apache.org/] and [https://hadoop.apache.org/]


