You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by David Jacot <dj...@confluent.io.INVALID> on 2022/06/01 09:04:24 UTC

Re: [VOTE] KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft

Hi Colin,

Thanks for your feedback! Please find my answers below.

> However, I wonder if this will be feasible for ZK-based brokers to do. We still support pre-topic-id IBP versions there, right? This might end up being more complex than you were hoping.

That's right. We support pre-topic-id IBP versions for ZK-based
brokers. We will only use version 2 if all the topics have an ID when
the request is constructed. Version 1 is used otherwise. I have
already implemented this part in the draft PR if you want to see how
it looks. It brings a little more complexity in the request
building/handling but that works. I think that it is definitely worth
it.

> We should add a comment in AlterPartitionResponse about the new error code INELIGIBLE_REPLICA. This is very important for error codes so we can track which ones are returned in which RPC version.

Totally. I have those comments in the PR but I forgot to add them in
the KIP. I will fix this.

> I also wonder if we should add a special error code for the case where the AlterPartitions call completes a reassignment AND the completed reassignment no longer includes the previous leader. We have been overloading FENCED_LEADER_EPOCH for this case, but this is coonfusing to operators (especially since this RPC does not support error messages, as opposed to codes)

Interesting. I was not aware of this one. We don't do this in the ZK
controller, right? I do agree that using FENCED_LEADER_EPOCH is
confusing here. We could introduce a NEW_LEADER_ELECTED error code for
this purpose.

> One thing, though, is that we should define how this interacts with replica placement. It seems to me that replicas should not be able to be placed on these inControlledShutdown nodes (unless done manually via the explicit placement API).

Yeah, I agree that we need to clarify this. If not mistaken, replicas
can be placed on fenced nodes so I don't see why it should be
different for inControlledShutdown nodes. Otherwise we will again have
that create topic issue when the cluster has only 3 nodes. However I
think that we have to guarantee two other invariants:
1) an inControlledShutdown node should not be added to the ISR when a
partition is created.
2) an inControlledShutdown node should not be picked as a leader for a
partition. (e.g. KAFKA-13944)

Let me add this part to the KIP.

> I also think we should spell out the fact that once you go into controlled shutdown, you don't come out except by creating a new broker instance. (new incarnation ID). This also makes me wonder if we need to support the shutting down -> not shutting down transition in BrokerRegistrationChangeRecord, since we don't plan on using that transition.

That makes sense. I will add this and remove that transition from
BrokerRegistrationChangeRecord.

> Finally, RegisterBrokerRecord / BrokerRegistrationChangeRecord should be bumped to the next RPC version since we have added a new field. You will also need to assign yourself a new IBP / MetadataVersion. (For BrokerRegistrationChangeRecord it would be possible to avoid the version bump, since we're using tagged fields, but it's better to have it for consistency, I think.)

Noted.

Let me update the KIP to incorporate all your feedback.

Cheers,
David

On Tue, May 31, 2022 at 9:42 PM Colin McCabe <cm...@apache.org> wrote:
>
> > We should add a comment in AlterPartitionResponse about the new error
> > code INELIGIBLE_REPLICA. This is very important for error codes so we
> > can track which ones are returned in which RPC version. I also wonder
>
> Here I'm referring to AlterPartitionResponse.json
>
> cheers,
> Colin
>
> >
> >
> > On Tue, May 31, 2022, at 08:36, David Jacot wrote:
> >> Hi folks,
> >>
> >> I'd like to start a vote for KIP-841:
> >> https://cwiki.apache.org/confluence/x/phmhD.
> >>
> >> Thanks,
> >> David

Re: [VOTE] KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft

Posted by David Jacot <dj...@confluent.io.INVALID>.
Thanks all!

The vote passes with binding +1 votes from Colin, David, José and myself.

David

On Fri, Jun 3, 2022 at 2:14 AM José Armando García Sancio
<js...@confluent.io.invalid> wrote:
>
> Thanks for proposing this improvement David Jacot. I think it is going
> to make the graceful shutdown process much more efficient.
>
> +1 (binding) from me.

Re: [VOTE] KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft

Posted by José Armando García Sancio <js...@confluent.io.INVALID>.
Thanks for proposing this improvement David Jacot. I think it is going
to make the graceful shutdown process much more efficient.

+1 (binding) from me.

Re: [VOTE] KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft

Posted by David Arthur <mu...@gmail.com>.
Thanks, David. I think the KIP looks good. +1 (binding) from me

-David

On Wed, Jun 1, 2022 at 12:54 PM Colin McCabe <cm...@apache.org> wrote:

> Thanks, David. With these changes, I am +1 (binding)
>
> Great to see this KIP moving forward!
>
> Colin
>
>
> On Wed, Jun 1, 2022, at 02:04, David Jacot wrote:
> > Hi Colin,
> >
> > Thanks for your feedback! Please find my answers below.
> >
> >> However, I wonder if this will be feasible for ZK-based brokers to do.
> We still support pre-topic-id IBP versions there, right? This might end up
> being more complex than you were hoping.
> >
> > That's right. We support pre-topic-id IBP versions for ZK-based
> > brokers. We will only use version 2 if all the topics have an ID when
> > the request is constructed. Version 1 is used otherwise. I have
> > already implemented this part in the draft PR if you want to see how
> > it looks. It brings a little more complexity in the request
> > building/handling but that works. I think that it is definitely worth
> > it.
> >
> >> We should add a comment in AlterPartitionResponse about the new error
> code INELIGIBLE_REPLICA. This is very important for error codes so we can
> track which ones are returned in which RPC version.
> >
> > Totally. I have those comments in the PR but I forgot to add them in
> > the KIP. I will fix this.
> >
> >> I also wonder if we should add a special error code for the case where
> the AlterPartitions call completes a reassignment AND the completed
> reassignment no longer includes the previous leader. We have been
> overloading FENCED_LEADER_EPOCH for this case, but this is coonfusing to
> operators (especially since this RPC does not support error messages, as
> opposed to codes)
> >
> > Interesting. I was not aware of this one. We don't do this in the ZK
> > controller, right? I do agree that using FENCED_LEADER_EPOCH is
> > confusing here. We could introduce a NEW_LEADER_ELECTED error code for
> > this purpose.
> >
> >> One thing, though, is that we should define how this interacts with
> replica placement. It seems to me that replicas should not be able to be
> placed on these inControlledShutdown nodes (unless done manually via the
> explicit placement API).
> >
> > Yeah, I agree that we need to clarify this. If not mistaken, replicas
> > can be placed on fenced nodes so I don't see why it should be
> > different for inControlledShutdown nodes. Otherwise we will again have
> > that create topic issue when the cluster has only 3 nodes. However I
> > think that we have to guarantee two other invariants:
> > 1) an inControlledShutdown node should not be added to the ISR when a
> > partition is created.
> > 2) an inControlledShutdown node should not be picked as a leader for a
> > partition. (e.g. KAFKA-13944)
> >
> > Let me add this part to the KIP.
> >
> >> I also think we should spell out the fact that once you go into
> controlled shutdown, you don't come out except by creating a new broker
> instance. (new incarnation ID). This also makes me wonder if we need to
> support the shutting down -> not shutting down transition in
> BrokerRegistrationChangeRecord, since we don't plan on using that
> transition.
> >
> > That makes sense. I will add this and remove that transition from
> > BrokerRegistrationChangeRecord.
> >
> >> Finally, RegisterBrokerRecord / BrokerRegistrationChangeRecord should
> be bumped to the next RPC version since we have added a new field. You will
> also need to assign yourself a new IBP / MetadataVersion. (For
> BrokerRegistrationChangeRecord it would be possible to avoid the version
> bump, since we're using tagged fields, but it's better to have it for
> consistency, I think.)
> >
> > Noted.
> >
> > Let me update the KIP to incorporate all your feedback.
> >
> > Cheers,
> > David
> >
> > On Tue, May 31, 2022 at 9:42 PM Colin McCabe <cm...@apache.org> wrote:
> >>
> >> > We should add a comment in AlterPartitionResponse about the new error
> >> > code INELIGIBLE_REPLICA. This is very important for error codes so we
> >> > can track which ones are returned in which RPC version. I also wonder
> >>
> >> Here I'm referring to AlterPartitionResponse.json
> >>
> >> cheers,
> >> Colin
> >>
> >> >
> >> >
> >> > On Tue, May 31, 2022, at 08:36, David Jacot wrote:
> >> >> Hi folks,
> >> >>
> >> >> I'd like to start a vote for KIP-841:
> >> >> https://cwiki.apache.org/confluence/x/phmhD.
> >> >>
> >> >> Thanks,
> >> >> David
>


-- 
David Arthur

Re: [VOTE] KIP-841: Fenced replicas should not be allowed to join the ISR in KRaft

Posted by Colin McCabe <cm...@apache.org>.
Thanks, David. With these changes, I am +1 (binding)

Great to see this KIP moving forward!

Colin


On Wed, Jun 1, 2022, at 02:04, David Jacot wrote:
> Hi Colin,
>
> Thanks for your feedback! Please find my answers below.
>
>> However, I wonder if this will be feasible for ZK-based brokers to do. We still support pre-topic-id IBP versions there, right? This might end up being more complex than you were hoping.
>
> That's right. We support pre-topic-id IBP versions for ZK-based
> brokers. We will only use version 2 if all the topics have an ID when
> the request is constructed. Version 1 is used otherwise. I have
> already implemented this part in the draft PR if you want to see how
> it looks. It brings a little more complexity in the request
> building/handling but that works. I think that it is definitely worth
> it.
>
>> We should add a comment in AlterPartitionResponse about the new error code INELIGIBLE_REPLICA. This is very important for error codes so we can track which ones are returned in which RPC version.
>
> Totally. I have those comments in the PR but I forgot to add them in
> the KIP. I will fix this.
>
>> I also wonder if we should add a special error code for the case where the AlterPartitions call completes a reassignment AND the completed reassignment no longer includes the previous leader. We have been overloading FENCED_LEADER_EPOCH for this case, but this is coonfusing to operators (especially since this RPC does not support error messages, as opposed to codes)
>
> Interesting. I was not aware of this one. We don't do this in the ZK
> controller, right? I do agree that using FENCED_LEADER_EPOCH is
> confusing here. We could introduce a NEW_LEADER_ELECTED error code for
> this purpose.
>
>> One thing, though, is that we should define how this interacts with replica placement. It seems to me that replicas should not be able to be placed on these inControlledShutdown nodes (unless done manually via the explicit placement API).
>
> Yeah, I agree that we need to clarify this. If not mistaken, replicas
> can be placed on fenced nodes so I don't see why it should be
> different for inControlledShutdown nodes. Otherwise we will again have
> that create topic issue when the cluster has only 3 nodes. However I
> think that we have to guarantee two other invariants:
> 1) an inControlledShutdown node should not be added to the ISR when a
> partition is created.
> 2) an inControlledShutdown node should not be picked as a leader for a
> partition. (e.g. KAFKA-13944)
>
> Let me add this part to the KIP.
>
>> I also think we should spell out the fact that once you go into controlled shutdown, you don't come out except by creating a new broker instance. (new incarnation ID). This also makes me wonder if we need to support the shutting down -> not shutting down transition in BrokerRegistrationChangeRecord, since we don't plan on using that transition.
>
> That makes sense. I will add this and remove that transition from
> BrokerRegistrationChangeRecord.
>
>> Finally, RegisterBrokerRecord / BrokerRegistrationChangeRecord should be bumped to the next RPC version since we have added a new field. You will also need to assign yourself a new IBP / MetadataVersion. (For BrokerRegistrationChangeRecord it would be possible to avoid the version bump, since we're using tagged fields, but it's better to have it for consistency, I think.)
>
> Noted.
>
> Let me update the KIP to incorporate all your feedback.
>
> Cheers,
> David
>
> On Tue, May 31, 2022 at 9:42 PM Colin McCabe <cm...@apache.org> wrote:
>>
>> > We should add a comment in AlterPartitionResponse about the new error
>> > code INELIGIBLE_REPLICA. This is very important for error codes so we
>> > can track which ones are returned in which RPC version. I also wonder
>>
>> Here I'm referring to AlterPartitionResponse.json
>>
>> cheers,
>> Colin
>>
>> >
>> >
>> > On Tue, May 31, 2022, at 08:36, David Jacot wrote:
>> >> Hi folks,
>> >>
>> >> I'd like to start a vote for KIP-841:
>> >> https://cwiki.apache.org/confluence/x/phmhD.
>> >>
>> >> Thanks,
>> >> David