Posted to dev@kafka.apache.org by hzh0425 <hz...@163.com> on 2023/01/21 05:34:52 UTC

[DISCUSS] Tiered-Storage: Implement RocksdbBasedMetadataCache for TopicBased RLMM

Background:
KIP-405 (Kafka Tiered Storage) introduced tiered storage to Kafka. Within it, the RemoteLogMetadataManager (RLMM) is responsible for storing the metadata of remote log segments.


KAFKA-9555 (Topic-based implementation for the RemoteLogMetadataManager) implements the default RLMM, the topic-based RLMM ('TopicBased-RLMM').




Problem:
The topic-based RLMM keeps all metadata for the partitions it subscribes to in memory.

In our practice, we found that as the metadata grows, it places a heavy burden on the broker's memory (on the order of gigabytes). At the same time, saving a snapshot of the full metadata set to disk becomes very time-consuming.




Solution:

We propose introducing RocksDB to solve this problem:
- Implement a RocksdbBasedMetadataCache.
- All metadata is stored on disk; only a small RocksDB in-memory cache is required.
- There is no need to pay the cost of saving a full metadata snapshot to disk, because RocksDB persists writes incrementally.
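
To make the idea more concrete, here is a minimal sketch of how segment metadata could be keyed in a sorted key-value store so that each new segment is a single small write and a lookup by offset is a single floor search. Everything in it is an assumption for illustration and not part of the proposal: the class name SegmentMetadataIndexSketch, the key layout (topic id, partition, start offset), and the value layout are hypothetical, and a TreeMap ordered by unsigned bytes stands in for RocksDB so that the sketch runs without the rocksdbjni dependency. It is not the existing RLMM code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.UUID;

/** Hypothetical sketch: remote segment metadata keyed for a sorted KV store. */
public class SegmentMetadataIndexSketch {
    // Fixed-width key prefix: 16-byte topic id followed by a 4-byte partition number.
    private static final int PREFIX_LEN = 2 * Long.BYTES + Integer.BYTES;

    // Keys are ordered as unsigned bytes, like a bytewise comparator in a sorted KV store.
    private static final Comparator<byte[]> UNSIGNED_BYTES = Arrays::compareUnsigned;

    // In-memory stand-in for the disk-backed store (assumption: RocksDB would replace this).
    private final TreeMap<byte[], byte[]> store = new TreeMap<>(UNSIGNED_BYTES);

    // Key = topic id + partition + big-endian start offset. For non-negative
    // offsets, byte order matches numeric order within one partition.
    static byte[] key(UUID topicId, int partition, long startOffset) {
        return ByteBuffer.allocate(PREFIX_LEN + Long.BYTES)
                .putLong(topicId.getMostSignificantBits())
                .putLong(topicId.getLeastSignificantBits())
                .putInt(partition)
                .putLong(startOffset)
                .array();
    }

    // Adding a segment is one small write; no full snapshot is rewritten.
    public void addSegment(UUID topicId, int partition, long startOffset,
                           long endOffset, String segmentId) {
        byte[] id = segmentId.getBytes(StandardCharsets.UTF_8);
        byte[] value = ByteBuffer.allocate(Long.BYTES + id.length)
                .putLong(endOffset)
                .put(id)
                .array();
        store.put(key(topicId, partition, startOffset), value);
    }

    // Finds the segment whose [startOffset, endOffset] range contains the offset.
    public Optional<String> segmentFor(UUID topicId, int partition, long offset) {
        if (offset < 0) {
            return Optional.empty();
        }
        byte[] probe = key(topicId, partition, offset);
        // Greatest key <= probe, i.e. the last segment starting at or before the offset.
        Map.Entry<byte[], byte[]> floor = store.floorEntry(probe);
        if (floor == null
                || !Arrays.equals(floor.getKey(), 0, PREFIX_LEN, probe, 0, PREFIX_LEN)) {
            return Optional.empty(); // nothing stored for this topic partition
        }
        ByteBuffer value = ByteBuffer.wrap(floor.getValue());
        long endOffset = value.getLong();
        if (offset > endOffset) {
            return Optional.empty(); // offset falls after the last matching segment
        }
        byte[] id = new byte[value.remaining()];
        value.get(id);
        return Optional.of(new String(id, StandardCharsets.UTF_8));
    }
}
```

With a layout like this, only the keys being read or written need to be resident in memory, and durability would come from the store's own write path rather than from a periodic full snapshot.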


You are welcome to discuss this improvement by replying to this email.


Thanks,
Hzh0425


hzhkafka
hzhkafka@163.com

Re: [DISCUSS] Tiered-Storage: Implement RocksdbBasedMetadataCache for TopicBased RLMM

Posted by Alexandre Dupriez <al...@gmail.com>.
Hi, hzh0425,

Thank you for raising the question. There have been discussions in the
community about using RocksDB as an RLMM, one of which can be found on
the dev list [1].
What is the size of the metadata set in your use case? Have you
considered using an external data store to abstract metadata
persistence away from the local storage?

[1] https://lists.apache.org/thread/8lcslnwrnj1s7mk2c3g3fw0zqjwrogds

Thanks,
Alexandre
