Posted to dev@rocketmq.apache.org by jinrongtong <ji...@apache.org> on 2022/02/07 02:43:38 UTC

[DISCUSS][RIP-34] Support quorum write and adaptive degradation in master-slave architecture

Hi, RocketMQ Community:

First of all, Happy Chinese New Year! I would like to start a RIP to support quorum write, which lets users specify at the broker the minimum number of replicas that must ack a message before the write returns. In this RIP I also want to provide an adaptive degradation mode that degrades automatically according to the number of surviving replicas and the commit log gap between master and slave.

I have written my proposal and you can click on the link below:

https://docs.google.com/document/d/1ENCJl83OGnmtPMAbxTqtEs9LOOIkMNkz3A9HZJHyI2o/edit?usp=sharing

Chinese version:

https://shimo.im/docs/WJptDyXgjc3h6C8R




If you have any questions or suggestions, please reply to this email or comment on the proposal.




Thanks
RongtongJin

Re: Re: [DISCUSS][RIP-34] Support quorum write and adaptive degradation in master-slave architecture

Posted by dongeforever <do...@apache.org>.
It makes sense in light of RIP-32.

In master-slave mode, if the old master wants to come back online, it needs to
make sure the slave is online and exchange data with the slave.

The totalReplicas parameter helps the master check whether there is a slave.

Re: Re: [DISCUSS][RIP-34] Support quorum write and adaptive degradation in master-slave architecture

Posted by Rongtong Jin <ji...@mails.ucas.ac.cn>.
Hi @dongeforever, thank you for your careful review and reply.

On the first question: I was worried that if haSlaveFallbehindMax were reused with its default value of 256MB, degradation would not happen as expected when users upgrade without modifying this parameter. So I introduced haSlaveFallBehindMax with a default value of 256K, but its name is indeed confusing for users. I will take your advice; it may be better to name it haMaxGapNotInSync.
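
To make the concern about the default value concrete, here is a minimal sketch of how a gap threshold could drive degradation. It is not the implementation from the proposal. The class and method names are hypothetical, the threshold is assumed to be a byte gap between commit log offsets, and inSyncReplicas is assumed to count the master itself.

```java
import java.util.Arrays;
import java.util.List;

public class InSyncSketch {
    // Assumption: the 256K default from this thread, read as a gap in bytes.
    public static final long GAP_256K = 256L * 1024;
    // Assumption: the old 256MB default, read the same way.
    public static final long GAP_256MB = 256L * 1024 * 1024;

    // A slave is treated as in sync when its commit log lags the master
    // by no more than maxGap bytes.
    public static boolean inSync(long masterMaxOffset, long slaveMaxOffset, long maxGap) {
        return masterMaxOffset - slaveMaxOffset <= maxGap;
    }

    // Acks required for a write: the configured inSyncReplicas, degraded to the
    // number of replicas currently in sync (master included), and never below 1.
    public static int acksNeeded(int inSyncReplicas, long masterMaxOffset,
                                 List<Long> slaveMaxOffsets, long maxGap) {
        int current = 1; // the master always counts as in sync with itself
        for (long slaveOffset : slaveMaxOffsets) {
            if (inSync(masterMaxOffset, slaveOffset, maxGap)) {
                current++;
            }
        }
        return Math.max(1, Math.min(inSyncReplicas, current));
    }

    public static void main(String[] args) {
        long master = 1_000_000L;
        // One slave lags by 1000 bytes, the other by 300K.
        List<Long> slaves = Arrays.asList(master - 1000L, master - 300L * 1024);
        System.out.println("256K threshold:  " + acksNeeded(3, master, slaves, GAP_256K));
        System.out.println("256MB threshold: " + acksNeeded(3, master, slaves, GAP_256MB));
    }
}
```

With the smaller threshold the slave that lags by 300K drops out of the in-sync set and the required ack count degrades. With the 256MB threshold the same slave still counts as in sync, so nothing degrades, which is the behavior described above for users who upgrade without changing the parameter.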

Secondly, I'm sorry I didn't describe the totalReplicas parameter clearly in the proposal. The totalReplicas parameter does not affect the number of replicas that need to ack. Its main functions are as follows:

1. In RIP-32, lock quorum (refer to https://shimo.im/docs/6CqVccXgtWgXCYwv#anchor-kLQL) calculates the number of replicas to be locked from the value of totalReplicas. For example, if totalReplicas = 3, it needs to lock 2 replicas to succeed.
2. It serves as a verification parameter. For example, when totalReplicas = 1, calls to getMinOffset and getMaxOffset only read the local data. The pre-online process is also skipped when totalReplicas = 1.

Therefore, if the real number of replicas is not equal to the configured totalReplicas, normal replication will not be affected, but lock quorum will not behave as expected in the ordered message scenario.
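
The two roles of totalReplicas can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the strict-majority formula is an assumption that happens to match the example above (totalReplicas = 3 needs 2 locks). The RIP-32 document remains the authority on how lock quorum is actually computed.

```java
public class LockQuorumSketch {
    // Assumption: lock quorum is a strict majority of the configured totalReplicas.
    public static int lockQuorum(int totalReplicas) {
        if (totalReplicas < 1) {
            throw new IllegalArgumentException("totalReplicas must be >= 1");
        }
        return totalReplicas / 2 + 1;
    }

    // Hypothetical check for the verification role: with a single configured
    // replica there is no slave to consult, so only local data is read and
    // the pre-online process is skipped.
    public static boolean localOnly(int totalReplicas) {
        return totalReplicas == 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println("totalReplicas=" + n
                    + " lockQuorum=" + lockQuorum(n)
                    + " localOnly=" + localOnly(n));
        }
    }
}
```

Under this assumption, a broker group configured with totalReplicas = 3 but actually running 2 replicas still needs 2 locks, so every live replica must be lockable. That is one way the mismatch could make lock quorum differ from what the operator expects.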

I will revise the content of the proposal ASAP.

Re: [DISCUSS][RIP-34] Support quorum write and adaptive degradation in master-slave architecture

Posted by dongeforever <do...@apache.org>.
This RIP is nice.
I have read the doc and found some trivial problems:
1. The new haSlaveFallBehindMax is easily confused with the old
haSlaveFallbehindMax. I suggest just keeping the old one. If you insist on
a new parameter, it is better to use another name.
2. What is the property "totalReplicas" for? What will happen if the real
number of replicas is not equal to the configured "totalReplicas"? IMO,
inSyncReplicas is enough.

