Posted to dev@rocketmq.apache.org by Zhanhui Li <li...@gmail.com> on 2022/10/09 13:51:50 UTC

[DISCUSSION] RIP-47

Hi,

RocketMQ has stuck to its data format for many years and we are
experiencing many blocking problems originating from this design.

RIP-47 https://github.com/apache/rocketmq/wiki/RIP-47-Data-Layout-V2 is
designed to address these issues.

Please review the design. Constructive feedback and comments are
welcome.

This is a major RIP for RocketMQ, and multiple critical data paths are
involved. You are welcome to join this RIP if you are interested.

Zhanhui Li

Re: [DISCUSSION] RIP-47

Posted by Zhanhui Li <li...@gmail.com>.
Hi fuyou,

Thanks for your feedback. Basically, if both the broker and the clients
support v2 serialization, we can greatly reduce the cost of serialization
and deserialization. If clients remained on v1 while brokers were upgraded,
the extra overhead would be minimal, within an acceptable bound.

Here are a few blog posts discussing the pros and cons of this style of
serde across implementations.

https://engineering.fb.com/2015/07/31/android/improving-facebook-s-performance-on-android-with-flatbuffers/
https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html
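
For readers unfamiliar with this style of serde, here is a minimal Java
sketch of the idea: fields sit at fixed offsets in the frame, so a reader
fetches them in place instead of first decoding the whole message into
objects. The class name, the field set, and the offsets are assumptions made
up for illustration; they are not the RIP-47 layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical fixed-offset frame; NOT the RIP-47 layout. */
public final class FlatHeaderSketch {
    // Assumed offsets, chosen only for this illustration.
    public static final int OFF_TOPIC_ID = 0;      // int32
    public static final int OFF_QUEUE_OFFSET = 4;  // int64
    public static final int OFF_BODY_LEN = 12;     // int32
    public static final int HEADER_SIZE = 16;

    /** Writes header fields followed by the body into one buffer. */
    public static ByteBuffer encode(int topicId, long queueOffset, byte[] body) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE + body.length)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(topicId).putLong(queueOffset).putInt(body.length).put(body);
        buf.flip();
        return buf;
    }

    // Accessors use absolute reads: no intermediate message object is built.
    public static int topicId(ByteBuffer frame) {
        return frame.getInt(OFF_TOPIC_ID);
    }

    public static long queueOffset(ByteBuffer frame) {
        return frame.getLong(OFF_QUEUE_OFFSET);
    }

    /** Returns a view over the body bytes that shares the frame's storage. */
    public static ByteBuffer body(ByteBuffer frame) {
        int len = frame.getInt(OFF_BODY_LEN);
        ByteBuffer view = frame.duplicate();
        view.position(HEADER_SIZE).limit(HEADER_SIZE + len);
        return view.slice();
    }
}
```

A caller that only needs the topic ID can call
FlatHeaderSketch.topicId(frame) and never touch the body bytes.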

Zhanhui Li

On Tue, Oct 11, 2022 at 10:02 AM fuyou <fu...@gmail.com> wrote:

> Great RIP,
> one question:
> we have been thinking about zero copy when consumers pull messages; if we
> support v2, zero copy can no longer be supported.

Re: [DISCUSSION] RIP-47

Posted by fuyou <fu...@gmail.com>.
Great RIP,
one question:
we have been thinking about zero copy when consumers pull messages; if we
support v2, zero copy can no longer be supported.

On Mon, Oct 10, 2022 at 15:57, Zhanhui Li <li...@gmail.com> wrote:

> Hi zhimin,
>
> >> 1. About the length of Topic, 128 can meet most scenarios. Even if the
> >> storage module supports storing long topic names,
> >>    I still recommend that the length of a non-system topic name
> >> should not exceed the 128 limit.
> >>    For system topics, such as the retry topic name of pop consumption,
> >> the upper limit can be increased.
>
> The storage layer, IMO, is not the best place to apply such a limit. The
> broker module might be a better option. Normally, we should store a
> logical, fixed-width number in the storage system and let the upper layer
> interpret the actual name.
>
>
> >>  The system properties in the message should have their own
> >> namespace to distinguish them from the user's properties and protect
> >> them from being modified by the user.
>
> Yes! This is what this RIP is going to do.
>
> >> Is it necessary to store the server IP in every message? This
> >> design will waste more storage space.
>
> Yes, this is indeed an issue. Actually, two questions need to be answered:
> 1. How does the store IP help? 2. How do we define the store IP: as the
> first node that saves the message, or as the node that serves the message
> pull request?
>
> >> The "topic remapping" needs to consider the compatibility with the
> >> existing design. It is recommended to discuss it as a separate
> >> improvement.
>
> As pointed out in the previous section, most storage systems store a
> logical, fixed-width number and let the business layer interpret the actual
> topic name. This matters in this RIP because it decides what to store in
> the storage tier: a logical fixed-width number, a string name, or both
> during migration.
>
> >> We should consider compression and columnar storage in the new
> >> storage format.
>
> Compression can be made transparent, similar to what RocksDB/LevelDB do
> for SST files at the bottommost level.
>
>


-- 
   =============================================

  fuyou001
Best Regards

Re: [DISCUSSION] RIP-47

Posted by Zhanhui Li <li...@gmail.com>.
Hi zhimin,

>> 1. About the length of Topic, 128 can meet most scenarios. Even if the
>> storage module supports storing long topic names,
>>    I still recommend that the length of a non-system topic name
>> should not exceed the 128 limit.
>>    For system topics, such as the retry topic name of pop consumption,
>> the upper limit can be increased.

The storage layer, IMO, is not the best place to apply such a limit. The
broker module might be a better option. Normally, we should store a
logical, fixed-width number in the storage system and let the upper layer
interpret the actual name.
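
A rough Java sketch of that split, under assumptions of my own: the class
name, the in-memory maps, and the ID allocation scheme are hypothetical and
only illustrate the idea. The upper layer checks the name length and hands
out a fixed-width ID, so the storage layer only ever sees a long.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical broker-side registry; storage would only see the long ID. */
public final class TopicRegistrySketch {
    // The 128 limit from the discussion, enforced above the storage layer.
    public static final int MAX_USER_TOPIC_LENGTH = 128;

    private final Map<String, Long> idByName = new ConcurrentHashMap<>();
    private final Map<Long, String> nameById = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    /** Returns the existing ID for a name, or allocates a new one. */
    public long register(String name, boolean systemTopic) {
        if (!systemTopic && name.length() > MAX_USER_TOPIC_LENGTH) {
            throw new IllegalArgumentException("topic name too long: " + name.length());
        }
        return idByName.computeIfAbsent(name, n -> {
            long id = nextId.getAndIncrement();
            nameById.put(id, n);
            return id;
        });
    }

    /** Resolves an ID back to the name; returns null for unknown IDs. */
    public String nameOf(long id) {
        return nameById.get(id);
    }
}
```

In a real cluster the mapping would have to be persisted and shared rather
than kept in process memory; that part is left out here.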


>>  The system properties in the message should have their own
>> namespace to distinguish them from the user's properties and protect
>> them from being modified by the user.

Yes! This is what this RIP is going to do.

>> Is it necessary to store the server IP in every message? This
>> design will waste more storage space.

Yes, this is indeed an issue. Actually, two questions need to be answered:
1. How does the store IP help? 2. How do we define the store IP: as the
first node that saves the message, or as the node that serves the message
pull request?

>> The "topic remapping" needs to consider the compatibility with the
>> existing design. It is recommended to discuss it as a separate
>> improvement.

As pointed out in the previous section, most storage systems store a
logical, fixed-width number and let the business layer interpret the actual
topic name. This matters in this RIP because it decides what to store in
the storage tier: a logical fixed-width number, a string name, or both
during migration.

>> We should consider compression and columnar storage in the new
>> storage format.

Compression can be made transparent, similar to what RocksDB/LevelDB do
for SST files at the bottommost level.
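
To make "transparent" concrete, here is a small Java sketch using the JDK's
java.util.zip classes. The one-byte marker and the class name are my own
assumptions for illustration and are not part of the RIP: each block is
stored compressed only when that saves space, and the reader gets the
original bytes back either way.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical block wrapper: callers never see whether a block is compressed. */
public final class TransparentBlockSketch {
    public static final byte RAW = 0;
    public static final byte DEFLATE = 1;

    /** Prefixes the block with a marker and compresses it when that is smaller. */
    public static byte[] seal(byte[] block) {
        Deflater deflater = new Deflater();
        deflater.setInput(block);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (!deflater.finished()) {
            out.write(chunk, 0, deflater.deflate(chunk));
        }
        deflater.end();
        byte[] packed = out.toByteArray();
        boolean useCompressed = packed.length < block.length;
        byte[] payload = useCompressed ? packed : block;
        byte[] sealed = new byte[1 + payload.length];
        sealed[0] = useCompressed ? DEFLATE : RAW;
        System.arraycopy(payload, 0, sealed, 1, payload.length);
        return sealed;
    }

    /** Reverses seal(), returning the original block bytes. */
    public static byte[] open(byte[] sealed) {
        if (sealed[0] == RAW) {
            return Arrays.copyOfRange(sealed, 1, sealed.length);
        }
        Inflater inflater = new Inflater();
        inflater.setInput(sealed, 1, sealed.length - 1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported block");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```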


On Mon, Oct 10, 2022 at 2:50 PM zhimin li <li...@gmail.com> wrote:

> This is a great improvement plan. I have the following thoughts:
>
> 1. About the length of Topic, 128 can meet most scenarios. Even if the
> storage module supports storing long topic names,
>     I still recommend that the length of a non-system topic name
> should not exceed the 128 limit.
>     For system topics, such as the retry topic name of pop consumption,
> the upper limit can be increased.
>
> 2. The system properties in the message should have their own
> namespace to distinguish them from the user's properties and protect
> them from being modified by the user.
>
> 3. Is it necessary to store the server IP in every message? This
> design will waste more storage space.
>
> 4. The "topic remapping" needs to consider the compatibility with the
> existing design. It is recommended to discuss it as a separate
> improvement.
>
> 5. We should consider compression and columnar storage in the new
> storage format.
>

Re: [DISCUSSION] RIP-47

Posted by zhimin li <li...@gmail.com>.
This is a great improvement plan. I have the following thoughts:

1. About the length of Topic, 128 can meet most scenarios. Even if the
storage module supports storing long topic names,
    I still recommend that the length of a non-system topic name
should not exceed the 128 limit.
    For system topics, such as the retry topic name of pop consumption,
the upper limit can be increased.

2. The system properties in the message should have their own
namespace to distinguish them from the user's properties and protect
them from being modified by the user.

3. Is it necessary to store the server IP in every message? This
design will waste more storage space.

4. The "topic remapping" needs to consider the compatibility with the
existing design. It is recommended to discuss it as a separate
improvement.

5. We should consider compression and columnar storage in the new
storage format.

Re: [DISCUSSION] RIP-47

Posted by Zhanhui Li <li...@gmail.com>.
Hi Zhendong,

Nice to see your feedback.

Regarding batched messages, I believe we can solve it this way:

In the message header IDL, we have an enum flagging whether the data layout
is single, batch/composite, or some other format. If the enum says single,
then the body is user data. If the enum says composite, then the body of the
frame should be treated as several nested messages. The format of the nested
messages may be some compact representation. We may add one more task to
tackle this issue.
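
As a rough illustration of the enum-driven layout, the following Java sketch
encodes the layout flag as one byte and, for the composite case, stores each
nested message with a 4-byte length prefix. The class name, the one-byte
flag, and the length-prefixed nesting are all assumptions for the example;
the actual IDL and the compact nested format are still to be defined.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical frame: [1-byte layout flag][body]; not the final format. */
public final class FrameLayoutSketch {
    public enum Layout { SINGLE, COMPOSITE }

    /** Body is plain user data. */
    public static ByteBuffer single(byte[] body) {
        ByteBuffer buf = ByteBuffer.allocate(1 + body.length);
        buf.put((byte) Layout.SINGLE.ordinal()).put(body);
        buf.flip();
        return buf;
    }

    /** Body is a sequence of length-prefixed nested messages. */
    public static ByteBuffer composite(List<byte[]> messages) {
        int size = 1;
        for (byte[] m : messages) {
            size += 4 + m.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.put((byte) Layout.COMPOSITE.ordinal());
        for (byte[] m : messages) {
            buf.putInt(m.length).put(m);
        }
        buf.flip();
        return buf;
    }

    /** Returns the user payloads, whichever layout the frame uses. */
    public static List<byte[]> decode(ByteBuffer frame) {
        ByteBuffer in = frame.duplicate(); // leave the caller's position untouched
        Layout layout = Layout.values()[in.get()];
        List<byte[]> out = new ArrayList<>();
        if (layout == Layout.SINGLE) {
            byte[] body = new byte[in.remaining()];
            in.get(body);
            out.add(body);
            return out;
        }
        while (in.hasRemaining()) {
            byte[] nested = new byte[in.getInt()];
            in.get(nested);
            out.add(nested);
        }
        return out;
    }
}
```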

Zhanhui Li


On Mon, Oct 10, 2022 at 11:13 AM dongeforever <do...@apache.org>
wrote:

> Nice try.
> Looking forward to the detailed layout.
>
> By the way, the new data format may need to give more consideration to
> batch records, that is, several messages in one record, end to end (from
> producers to consumers).

Re: [DISCUSSION] RIP-47

Posted by dongeforever <do...@apache.org>.
Nice try.
Looking forward to the detailed layout.

By the way, the new data format may need to give more consideration to
batch records, that is, several messages in one record, end to end (from
producers to consumers).



Re: [DISCUSSION] RIP-47

Posted by Zhanhui Li <li...@gmail.com>.
> IMO, the main reason for storing broker ip is that messages are bound to
broker, not topic.

I am not sure whether this is the main reason or not... Binding messages to
a node rather than a topic is obviously a design flaw.

> We must find out which broker the message is stored on before we fetch
the message.  And there is no way to migrate messages from their original
broker to another.

It is indeed something we need to improve. But it deserves a dedicated RIP.


> The new storage layer design should consider this problem, at least not
to add new obstacles.

Definitely... The design of this RIP is to grant maximum freedom to the
business layer and make the storage flexible and adaptable enough.

> Eg. Where to store the mapping of topic name and topic identity? Is this
mapping consistent across brokers?

Ideally, the mapping should be accessible to the whole cluster. The
mapping, at least, needs to be consistent among brokers of the same group.
From the perspective of the storage layer, it expects to receive the topic
ID from the broker module on topic creation.


On Sun, Oct 16, 2022 at 11:57 AM SSpirits <ad...@lv5.moe> wrote:

> Hi Zhanhui,
>
> IMO, the main reason for storing the broker IP is that messages are bound
> to a broker, not a topic. We must find out which broker a message is
> stored on before we fetch it, and there is no way to migrate messages from
> their original broker to another.
>
> The new storage layer design should consider this problem, or at least not
> add new obstacles. E.g., where to store the mapping of topic name and topic
> identity? Is this mapping consistent across brokers?

Re: [DISCUSSION] RIP-47

Posted by SSpirits <ad...@lv5.moe>.
Hi Zhanhui,

IMO, the main reason for storing the broker IP is that messages are bound to a broker, not a topic. We must find out which broker a message is stored on before we fetch it, and there is no way to migrate messages from their original broker to another.

The new storage layer design should consider this problem, or at least not add new obstacles. E.g., where to store the mapping of topic name and topic identity? Is this mapping consistent across brokers?

