Posted to users@rocketmq.apache.org by 金融通 <ji...@mails.ucas.ac.cn> on 2020/04/30 07:28:44 UTC

[DISSCUSS] RIP-18 Metadata management architecture upgrade

Hi RocketMQ Community,

I think a good place to start the architecture evolution of RocketMQ is an upgrade of the metadata management architecture.

Currently, RocketMQ maintains metadata consistency through full connection. For example, each broker registers with every nameserver so that all nameservers see the same routing information, and each consumer instance sends heartbeats (carrying subscription information) to the brokers so that all brokers see the same subscription information. However, consistency maintained this way is weak. An unreliable network or delays can leave nodes with inconsistent views, which has caused a lot of issues.
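
To make the failure mode concrete, here is a minimal Java sketch of full-connection registration. All class and method names are hypothetical and nothing here is RocketMQ code; the set of "unreachable" nameserver indexes simply models a lost or delayed registration request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a nameserver: it only knows the brokers that
// managed to register with it directly.
class SketchNameServer {
    private final Set<String> brokers = new TreeSet<>();

    void register(String brokerName) {
        brokers.add(brokerName);
    }

    Set<String> routingView() {
        return new TreeSet<>(brokers);
    }
}

class FullConnectionSketch {
    // Registers the broker with every nameserver except those whose index is
    // listed in unreachable, which models a lost or delayed request.
    static void registerEverywhere(String broker, List<SketchNameServer> nameServers,
                                   Set<Integer> unreachable) {
        for (int i = 0; i < nameServers.size(); i++) {
            if (!unreachable.contains(i)) {
                nameServers.get(i).register(broker);
            }
        }
    }

    public static void main(String[] args) {
        List<SketchNameServer> nameServers = new ArrayList<>();
        nameServers.add(new SketchNameServer());
        nameServers.add(new SketchNameServer());

        registerEverywhere("broker-a", nameServers, Set.of());
        // The request from broker-b to nameserver 1 is lost.
        registerEverywhere("broker-b", nameServers, Set.of(1));

        System.out.println("nameserver 0 sees " + nameServers.get(0).routingView());
        System.out.println("nameserver 1 sees " + nameServers.get(1).routingView());
    }
}
```

After the lost request, nameserver 0 knows both brokers while nameserver 1 knows only broker-a, and nothing in this scheme reconciles the two views until broker-b registers again.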

On the other hand, since RocketMQ 4.5.0 we have used DLedger, a Raft-based log storage library, to solve the consistency problem of log replication. From the beginning of its design, we hoped to apply it to consistent metadata storage as well. If the metadata of RocketMQ is stored as a log whose consistency is guaranteed by the Raft protocol (DLedger), the issue of metadata consistency will be solved.
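
As a rough illustration of "metadata stored as a log", the Java sketch below replays a committed log into a metadata view. It does not use the DLedger API and does not implement Raft; the list named committedLog is an assumed stand-in for a log that the consensus layer has already replicated in the same order to every node, and all class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// One metadata change, recorded as a log entry (hypothetical format).
// A null value means "delete this key".
class MetadataEntry {
    final String key;
    final String value;

    MetadataEntry(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

// Each node rebuilds its metadata view by applying committed entries in log order.
class MetadataStateMachine {
    private final Map<String, String> view = new TreeMap<>();
    private int appliedIndex = 0;

    // Applies every committed entry that has not been applied yet.
    void catchUp(List<MetadataEntry> committedLog) {
        while (appliedIndex < committedLog.size()) {
            MetadataEntry e = committedLog.get(appliedIndex++);
            if (e.value == null) {
                view.remove(e.key);
            } else {
                view.put(e.key, e.value);
            }
        }
    }

    Map<String, String> view() {
        return new TreeMap<>(view);
    }
}

class MetadataLogSketch {
    public static void main(String[] args) {
        // Stand-in for the log that the consensus layer has already replicated.
        List<MetadataEntry> committedLog = new ArrayList<>();
        committedLog.add(new MetadataEntry("topic/orders", "broker-a"));
        committedLog.add(new MetadataEntry("topic/payments", "broker-b"));

        MetadataStateMachine fast = new MetadataStateMachine();
        MetadataStateMachine slow = new MetadataStateMachine();
        fast.catchUp(committedLog);

        committedLog.add(new MetadataEntry("topic/orders", "broker-c"));
        committedLog.add(new MetadataEntry("topic/payments", null));
        fast.catchUp(committedLog);
        slow.catchUp(committedLog); // a lagging node replays the whole log at once

        System.out.println("views equal: " + fast.view().equals(slow.view()));
        System.out.println("view: " + slow.view());
    }
}
```

The point of the sketch is that a node that falls behind does not end up with a different view; it ends up with an older prefix of the same history and converges once it catches up.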

So I have submitted RIP-18 Metadata management architecture upgrade, which describes the specific plan in more detail. I hope to hear more voices from the community, so please tell me your thoughts by replying to this email or commenting on the Google Doc.

Best Regards!
Rongtong Jin

RIP-18 Metadata management architecture upgrade
https://docs.google.com/document/d/1hQxlbtlMDwNxyVDGsIIUpDNWwfS6hP0PGKY9-A2KUOA/edit?usp=sharing

> -----Original Message-----
> From: "Gosling Von" <fe...@gmail.com>
> Sent: 2019-01-30 17:54:47 (Wednesday)
> To: dev <de...@rocketmq.apache.org>
> Cc:
> Subject: [DISCUSS] Thought of The Evolution of The Next Decade Architecture for RocketMQ
>
> Hi,
>
> I would like to wish everyone a happy new year, especially those in the eastern hemisphere. I think that when you see this topic, you already know what I want to say :-)
>
> After more than 6 years of scrutiny from the community and the market, Apache RocketMQ has been widely used in financial and e-commerce online transactions. Known data shows that, in China alone, RocketMQ covers more than 40% of traditional messaging scenarios. With the globalization of the community over the past two years, this adoption has spread around the world. However, through continuous community activities, including technical exchanges with experts from Microsoft, Berkeley, and others, coupled with the emergence of IoT, AI, blockchain, and other scenarios worldwide, I began to think about the architecture evolution of RocketMQ. I hope we can make it the data infrastructure of the cloud computing era, so that it can serve us better over the next decade.
>
> First of all, the overall architecture will adopt the separation of storage and computing, together with a pluggable design. I know that the separation of storage and computing is a controversial topic in the industry. You may have seen that Twitter gave up its messaging solution EventBus, in which the serving and storage layers are decoupled; one of the important reasons given was that it "introduces an additional hop". That is right: usually you do not need that much. But what I want to express here is that the value of separating storage and computing is like the single responsibility principle in design: each part can stay focused. For example, if the messaging engine is deployed at the edge, we could deploy computing nodes on demand. Because this is a computationally intensive task, we can focus on improving computing power and response speed without worrying about the machine, operation, and maintenance costs brought by storage.
>
> As another case, RocketMQ storage can be regarded as a kind of time series storage. It provides storage for single records as well as bulk storage, but in either case it is sequential, append-only storage that is independent of the data type. Under this architecture, realizing the current transaction capability still involves some complications, especially if you want to make RocketMQ a one-stop microservice transaction solution. We have already tried this. Known feedback from banks is that they have modified the storage in their financial systems, for example by replacing the file storage with relational, NoSQL, or NewSQL storage; the benefits are enhanced maintainability and enhanced transaction processing capabilities. In this sense, we could make storage pluggable in RocketMQ 5.0; by default we will provide storage with the best possible sequential append capability, which is also the best storage implementation with respect to disk seeks.
>
> But this brings another question: how to improve the query and processing ability of the data. Here I want to share another preliminary design idea: we could continue to use a data structure such as Commitlog to store the original data, and then build indexes or intermediate aggregation results on top of Commitlog. At present, our index structure is not well integrated or utilized, so we could continue to modify and optimize the index from there. In addition, we can use DPDK/SPDK and an atomically incremented write position to achieve the best lock-free design. Considering the data that has already been committed, this whole line of work needs extensive exploration, possibly including cooperation with other communities and universities. So at this level, I think we could give RocketMQ the ability to deploy storage and computing separately, while the storage is pluggable and can be replaced as needed.
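
For readers following the thread, here is a minimal Java sketch of the "atomically incremented write position" idea mentioned above. The class names are hypothetical and this is not RocketMQ's Commitlog code; it leaves out DPDK/SPDK entirely and only shows how one atomic add can hand each writer a private byte range, so no lock is needed around the copy.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory stand-in for one append-only log segment.
class AppendOnlySegment {
    private final byte[] buffer;
    private final AtomicLong writePos = new AtomicLong(0);

    AppendOnlySegment(int capacity) {
        this.buffer = new byte[capacity];
    }

    // Reserves a private byte range with one atomic add, then copies into it.
    // Returns the start offset, or -1 if the record does not fit.
    long append(byte[] record) {
        long start = writePos.getAndAdd(record.length);
        if (start + record.length > buffer.length) {
            return -1;
        }
        System.arraycopy(record, 0, buffer, (int) start, record.length);
        return start;
    }

    // Total bytes reserved so far; may exceed the capacity after a failed append.
    long reservedBytes() {
        return writePos.get();
    }

    // Safe to call once the writer threads have been joined. A real design
    // would track a separate "committed" position for concurrent readers.
    byte byteAt(int offset) {
        return buffer[offset];
    }
}

class LockFreeAppendSketch {
    public static void main(String[] args) throws InterruptedException {
        AppendOnlySegment segment = new AppendOnlySegment(4 * 1000 * 8);
        Thread[] writers = new Thread[4];
        for (int t = 0; t < writers.length; t++) {
            final byte id = (byte) (t + 1);
            writers[t] = new Thread(() -> {
                byte[] record = new byte[8];
                Arrays.fill(record, id);
                for (int i = 0; i < 1000; i++) {
                    segment.append(record);
                }
            });
            writers[t].start();
        }
        for (Thread w : writers) {
            w.join();
        }
        System.out.println("bytes reserved: " + segment.reservedBytes());
    }
}
```

The design choice is that contention is limited to a single atomic add per record; the copy itself touches a range no other writer can obtain.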
>
> Second, support the OpenMessaging standard. I think many of you have already noticed the new messaging standard drafted by Alibaba, Yahoo, and other companies. I am also the chair of this project. The blueprint of the standard solves a very important problem: interoperability, not only between different messaging vendors but also between the upstream and downstream of messaging. To the user, this interoperability shows up as consistency of the API or the protocol. Although we consider the API to be a kind of protocol too, I want to emphasize that protocol consistency has been attempted in countless scenarios. But so far I personally have not seen a particularly versatile and simple solution; whether it is AMQP, MQTT, or even RSocket, which has recently gained wide recognition, there has not been much innovation at this level. And we want to avoid reinventing the wheel. For now, an API-layer standard is particularly important, so RocketMQ 5.0 will focus on supporting OpenMessaging standardization in API testing. As for multi-language support in the future, we hope that this set of APIs can completely solve the problems you currently encounter with RocketMQ in multiple languages.
>
> Natural support for multiple protocols is, I think, also very important. So in 5.0 we could restructure the remoting module to provide pluggable transport-layer protocol support in the computing node. HTTP/2 may be our default protocol. On top of that, we are also considering integrating TCP-based MQTT and UDP-based CoAP. Of course, we also clearly see that with the gradual rollout of 5G networks, we may have to actively follow the needs of the market. In any case, we could provide a flexible wire protocol extension for when we want to support more concrete domain protocols. This is something we must consider carefully.
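
One possible shape for such a pluggable wire protocol extension, sketched in Java. The interface, registry, and the toy "line" protocol are all hypothetical names invented for this illustration; none of them comes from the RocketMQ remoting module, and the toy encoding is not MQTT, CoAP, or HTTP/2.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical extension point: one implementation per wire protocol.
interface WireProtocol {
    String name();

    byte[] encode(String topic, String body);
}

// Toy protocol used only to exercise the registry: "<topic>\n<body>" in UTF-8.
class LineProtocol implements WireProtocol {
    public String name() {
        return "line";
    }

    public byte[] encode(String topic, String body) {
        return (topic + "\n" + body).getBytes(StandardCharsets.UTF_8);
    }
}

// The computing node would look protocols up by name instead of hard-coding one.
class ProtocolRegistry {
    private final Map<String, WireProtocol> protocols = new ConcurrentHashMap<>();

    void register(WireProtocol protocol) {
        protocols.put(protocol.name(), protocol);
    }

    WireProtocol lookup(String name) {
        WireProtocol protocol = protocols.get(name);
        if (protocol == null) {
            throw new IllegalArgumentException("no such protocol: " + name);
        }
        return protocol;
    }
}

class PluggableProtocolSketch {
    public static void main(String[] args) {
        ProtocolRegistry registry = new ProtocolRegistry();
        registry.register(new LineProtocol());

        byte[] frame = registry.lookup("line").encode("orders", "hello");
        System.out.println("encoded " + frame.length + " bytes");
    }
}
```

Adding another protocol then means registering one more implementation, with no change to the code that performs the lookup.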
>
> A lightweight streaming engine based on messaging is a very natural idea. I was an early explorer of streaming myself, but the so-called streaming we built in previous years was, strictly speaking, a pseudo-scenario. Why? Because we did not actually need to deploy a streaming engine; in most cases we could achieve the same effect using messaging alone. In addition, messaging and storage are very important in stream computing scenarios, so why not let messaging naturally support the scheduling and computation of task nodes, with our built-in storage helping further? We only need to provide a library package that makes it easy for messaging to gain streaming capabilities. As for subsequent SQL processing, CEP, FaaS, and so on, I believe they are the natural evolution of this programming model.
>
> We have said this before: RocketMQ is a unified messaging platform integrating computing, storage, and scheduling. Today I have shared my rough thoughts on the evolution of the overall architecture of RocketMQ 5.0. I also hope to hear the opinions of the community, including those of other PMC members and committers. Next, we could call for RIP discussions on the details. I hope more PMC members or committers can act as shepherds of the RIPs, making delivery more reliable in 2019.
>
>
> Best Regards,
> Von Gosling

Re: [DISSCUSS] RIP-18 Metadata management architecture upgrade

Posted by 胡宗棠 <zo...@hotmail.com>.

Hi @heng du <ma...@gmail.com> @Jianhai

I think it is important to unify the message model with the KV model to store metadata. Implementing RIP-18 Metadata management architecture upgrade will resolve some data inconsistency problems, and we would be pleased to participate in this feature.

________________________________
From: heng du <du...@apache.org>
Sent: August 21, 2020 10:09
To: dev <de...@rocketmq.apache.org>
Subject: Re: [DISSCUSS] RIP-18 Metadata management architecture upgrade

@Jianhai

It would be better to unify the message model with the KV model to store
metadata (e.g., a compact topic). Would you like to submit a proposal on this
point?

On Wed, May 6, 2020 at 3:04 PM Xu Jianhai <sn...@gmail.com> wrote:

> Maybe we could build openmessaging-kv on top of DLedger, and then have the
> RocketMQ nameserver use openmessaging-kv to achieve Raft consensus.

Re: [DISSCUSS] RIP-18 Metadata management architecture upgrade

Posted by heng du <du...@apache.org>.

@Jianhai

It would be better to unify the message model with the KV model to store
metadata (e.g., a compact topic). Would you like to submit a proposal on this
point?
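
To illustrate how a message log can double as a KV store, here is a minimal Java sketch of compaction. It assumes "compact topic" means a topic that keeps only the newest message per key, with a null value acting as a delete marker; that reading, and every class name below, is an assumption made for illustration and not an existing RocketMQ API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A message that carries a key, so the topic can be read as a KV table.
class KeyedMessage {
    final String key;
    final String value; // null marks the key as deleted

    KeyedMessage(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

class CompactionSketch {
    // Keeps only the newest message for each key and drops deleted keys.
    // The result is ordered by the position of each surviving message.
    static List<KeyedMessage> compact(List<KeyedMessage> log) {
        Map<String, KeyedMessage> latest = new LinkedHashMap<>();
        for (KeyedMessage m : log) {
            latest.remove(m.key); // re-insert so order follows the newest write
            if (m.value != null) {
                latest.put(m.key, m);
            }
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<KeyedMessage> log = new ArrayList<>();
        log.add(new KeyedMessage("topic/a", "broker-1"));
        log.add(new KeyedMessage("topic/b", "broker-2"));
        log.add(new KeyedMessage("topic/a", "broker-3"));
        log.add(new KeyedMessage("topic/b", null));
        log.add(new KeyedMessage("topic/c", "broker-4"));

        for (KeyedMessage m : compact(log)) {
            System.out.println(m.key + " = " + m.value);
        }
    }
}
```

Run on the five messages in main, compaction leaves two: topic/a with broker-3 and topic/c with broker-4. Reading the compacted topic from the start therefore yields the current metadata table.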

On Wed, May 6, 2020 at 3:04 PM Xu Jianhai <sn...@gmail.com> wrote:

> Maybe we could build openmessaging-kv on top of DLedger, and then have the
> RocketMQ nameserver use openmessaging-kv to achieve Raft consensus.

Re: [DISSCUSS] RIP-18 Metadata management architecture upgrade

Posted by Xu Jianhai <sn...@gmail.com>.

Maybe we could build openmessaging-kv on top of DLedger, and then have the
RocketMQ nameserver use openmessaging-kv to achieve Raft consensus.
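
A rough Java sketch of that layering, to make the suggestion easier to discuss. The ReplicatedLog interface is a hypothetical placeholder for whatever a Raft-based library would expose (it is not the DLedger API), InMemoryLog is a single-process stand-in so the example runs without a cluster, and LogBackedKv is an invented name for the KV layer a nameserver would read and write.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical log abstraction; a Raft-based library would sit behind it.
interface ReplicatedLog {
    long append(byte[] entry); // returns the index of the appended entry

    byte[] read(long index);

    long size();
}

// Single-process stand-in so the sketch runs without a cluster.
class InMemoryLog implements ReplicatedLog {
    private final List<byte[]> entries = new ArrayList<>();

    public synchronized long append(byte[] entry) {
        entries.add(entry.clone());
        return entries.size() - 1;
    }

    public synchronized byte[] read(long index) {
        return entries.get((int) index).clone();
    }

    public synchronized long size() {
        return entries.size();
    }
}

// KV layer: writes go through the log; reads are served from a table that is
// rebuilt by replaying the log. Keys must not contain '=' in this toy encoding.
class LogBackedKv {
    private final ReplicatedLog log;
    private final Map<String, String> table = new HashMap<>();
    private long applied = 0;

    LogBackedKv(ReplicatedLog log) {
        this.log = log;
    }

    void put(String key, String value) {
        log.append((key + "=" + value).getBytes(StandardCharsets.UTF_8));
    }

    String get(String key) {
        replay();
        return table.get(key);
    }

    private void replay() {
        while (applied < log.size()) {
            String entry = new String(log.read(applied++), StandardCharsets.UTF_8);
            int sep = entry.indexOf('=');
            table.put(entry.substring(0, sep), entry.substring(sep + 1));
        }
    }
}

class NameServerKvSketch {
    public static void main(String[] args) {
        ReplicatedLog sharedLog = new InMemoryLog();
        LogBackedKv nameServer1 = new LogBackedKv(sharedLog);
        LogBackedKv nameServer2 = new LogBackedKv(sharedLog);

        nameServer1.put("topic/orders", "broker-a");
        // The second nameserver never received the write directly, but it
        // reads the same log and therefore answers with the same route.
        System.out.println(nameServer2.get("topic/orders"));
    }
}
```

With this layering the nameservers no longer depend on each broker reaching each of them; they only need to agree on the log.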

On Thu, Apr 30, 2020 at 3:29 PM 金融通 <ji...@mails.ucas.ac.cn> wrote:

> Hi RocketMQ Community,
>
> I think it is a good choice to start the evolution of architecture for
> RocketMQ with Metadata management architecture upgrade.
>
> Currently, the metadata consistency of RocketMQ is maintained by full
> connection. For example, each broker registers with each nameserver to
> ensure that the view of routing information seen between nameservers is the
> same, and each consumer instance sends heartbeats (carrying subscription
> information) to broker to ensure that the view of subscription information
> seen between brokers is the same. However, such consistency maintenance is
> weak. Unreliable network and delay may cause inconsistent views, which has
> caused a lot of&nbsp;issues.
>
> On the other hand, after RocketMQ 4.5.0, we have used the Raft protocol
> (DLedger) to solve the consistency problem of log replication. DLedger is a
> raft-based log storage library. At the beginning of the design, we hoped to
> apply it to consistent metadata storage. If the metadata of RocketMQ is
> stored as log and the consistency is guaranteed by using the raft protocol
> (DLedger), the issue of metadata consistency will be solved.
>
> So I submitted RIP-18 Metadata management architecture upgrade, which
> describes the specific plan in more detail. I hope to hear more voices from
> the community. So please tell me your thoughts by replying to this email or
> commenting on Google Docs.
>
> Best Regards!
> Rongtong Jin
>
> RIP-18 Metadata management architecture upgrade
>
> https://docs.google.com/document/d/1hQxlbtlMDwNxyVDGsIIUpDNWwfS6hP0PGKY9-A2KUOA/edit?usp=sharing
>
>
> > -----Original Message-----
> > From: "Gosling Von" <fe...@gmail.com>
> > Sent: 2019-01-30 17:54:47 (Wednesday)
> > To: dev <de...@rocketmq.apache.org>
> > Cc:
> > Subject: [DISCUSS] Thought of The Evolution of The Next Decade
> > Architecture for RocketMQ
> >
> > Hi,
> >
> > I would like to say happy new year to everyone, especially to the guys
> > from the eastern hemisphere. I think that when you see this topic, you
> > already know what I want to say :-)
> >
> > After more than 6 years of scrutiny from the community and the market,
> > Apache RocketMQ has been widely used in financial and e-commerce online
> > transactions. Known data shows that, in China alone, RocketMQ covers more
> > than 40% of traditional messaging scenarios. With the globalization of
> > the community over the past two years, this adoption has spread around
> > the world. Meanwhile, through continuous community activities, including
> > technical exchanges with experts from Microsoft, Berkeley, etc., coupled
> > with the emergence of IoT, AI, blockchain and other scenarios around the
> > world, I began to think about the architecture evolution of RocketMQ. I
> > hope we can make it the data infrastructure of the cloud computing era,
> > so that it can serve us better in the next decade.
> >
> >
> > First of all, the overall architecture will adopt the separation of
> > storage and computing, together with a pluggable design. Regarding the
> > separation of storage and computing, I know that this is a controversial
> > topic in the industry. You may have seen that Twitter gave up its
> > messaging solution EventBus, in which the serving and storage layers are
> > decoupled; one of the important reasons given was that it "introduces an
> > additional hop". That's right, usually you don't need that much. But what
> > I want to express here is that the value of separating storage and
> > computing is just like the single responsibility principle in software
> > design: each layer can stay focused. For example, if the messaging engine
> > is deployed at the edge, we could arrange for computing nodes to be
> > deployed on demand. Because it is a computationally intensive task, we
> > can focus on how to improve computing power and response speed without
> > being concerned about the machine, operation and maintenance costs
> > brought by storage. As another case, RocketMQ storage can be regarded as
> > a kind of time series storage. It provides not only the capacity to store
> > single records but also bulk storage, but in any case it is a
> > data-type-independent, sequential append-only storage. Under this
> > architecture, if you want to realize the current transaction capability,
> > there are still some complications, especially when you want to make
> > RocketMQ a one-stop microservice transaction solution. We have already
> > tried this. Known feedback from the bank is that they have made some
> > modifications to the storage in the financial system. For example, when
> > the file storage is replaced with relational, NoSQL or NewSQL storage,
> > the benefits are enhanced maintainability and enhanced transaction
> > processing capabilities. In this sense, we could make a pluggable design
> > in RocketMQ 5.0; by default we will provide storage with the ultimate
> > sequential append capability, which is also the best storage
> > implementation of the disk seek algorithm. But it also raises another
> > question: how to improve the ability to query and process data. Here I
> > want to share another preliminary design idea: we could continue to use a
> > data structure such as Commitlog to store the original data, and then
> > build the index or intermediate aggregation results based on Commitlog.
> > At present, our index structure is not well integrated and utilized; from
> > this we could continue to modify and optimize the index. In addition, we
> > can use DPDK/SPDK and atomic increments of the write position (write Pos)
> > to achieve the best lock-free design. Considering the data that has been
> > committed, this series needs to be explored to a large extent, even
> > including cooperation with some other communities and universities. So at
> > this level, I think we could give RocketMQ the capability of separated
> > deployment, while the storage is pluggable and can be replaced as needed.
> >
> > Second, support the OpenMessaging standard. I think many of you have
> > already noticed the new messaging standard drafted by Alibaba, Yahoo and
> > other companies. I am also the chair of this project. In the blueprint of
> > this standard, a very important problem is solved, namely
> > interoperability. This interoperability is not only between different
> > messaging vendors, but also between the upstream and downstream of the
> > messaging system. To the user, this interoperability shows up as
> > consistency of the API or the protocol. Although we think that the API is
> > also a kind of protocol, I want to emphasize that protocol consistency
> > has been attempted in countless scenarios. But so far, I personally have
> > not seen a particularly versatile and simple solution; whether it is
> > AMQP, MQTT, or RSocket, which has recently been recognized by everyone,
> > there is not much innovation at this level. And we want to avoid
> > repetitive innovation. At this point, an API-layer standard is
> > particularly important, so RocketMQ 5.0 will focus on supporting
> > OpenMessaging standardization in API testing. For the multi-language
> > future, we hope that this set of APIs can completely solve all the
> > problems that you currently encounter with RocketMQ multi-language
> > support.
> >
> > Natural support for multiple protocols is, I think, also very important.
> > So in 5.0, we could reconstruct the remoting module to provide pluggable
> > transport layer protocol support in the computing node. HTTP/2 may be our
> > default protocol. On this basis, we are also considering integrating
> > TCP-based MQTT and UDP-based CoAP. Of course, we also clearly see that
> > with the gradual popularization of 5G networks, we may have to actively
> > follow the needs of the market. In any case, we could provide a flexible
> > wire protocol extension when we want to support more concrete domain
> > protocols. This is something we must consider carefully.
> >
> >
> > A lightweight streaming engine based on messaging is a very natural
> > thought. I am also an early explorer of streaming, but the so-called
> > streaming we did in previous years was, strictly speaking, a
> > pseudo-scenario. Why is it a pseudo-scenario? Actually, we don't need to
> > deploy a streaming engine. Instead, we could use messaging alone to
> > achieve the same effect in most cases. In addition, in the stream
> > computing scenario, messaging and storage are very important, so why
> > don't we let messaging naturally support the scheduling and computation
> > of task nodes, while our built-in storage helps us further? We only need
> > to provide a lib package, which makes it easy for messaging to have
> > streaming capabilities. As for the subsequent SQL processing, CEP, FaaS,
> > etc., I believe that these are the evolution of this programming model.
> >
> > We have talked about this before: RocketMQ is a unified messaging
> > platform integrating computing, storage and scheduling. Today I am
> > sharing my rough thoughts on the evolution of the overall architecture of
> > RocketMQ 5.0. I also hope to hear the opinions of the community,
> > including the thoughts of other PMC members and committers. Next, we
> > could call for RIP discussions on the details. I hope more PMC members or
> > committers could act as shepherds of the RIPs, making landing more
> > reliable in 2019.
> >
> >
> > Best Regards,
> > Von Gosling
>
>