Posted to dev@kafka.apache.org by "Matthias J. Sax" <mj...@apache.org> on 2021/02/17 01:07:45 UTC

Re: [DISCUSS] Kafka 3.0

Hi,

given that we passed 2.8 feature freeze, I wanted to restart this
thread. Currently, `trunk` is at `2.9.0-SNAPSHOT` and I am wondering if
the decision for the 3.0 release is final and if we should bump the
version number?

I am asking particularly because there are many Jiras with a 3.0 target
release version for breaking changes and we should ensure that we have
enough time to work on those tickets. -- As long as we don't agree that
the next release will indeed be 3.0, those tickets are effectively
blocked/pending.

Thoughts?


-Matthias


On 10/15/20 4:28 PM, Matthias J. Sax wrote:
> Thanks for clarifying Colin. Works for me. Overall, 3.0 should be guided
> by the ZK removal progress and if we are not there yet, it's better to
> have a 2.8 first.
> 
> 
> -Matthias
> 
> 
> On 10/15/20 2:41 PM, Colin McCabe wrote:
>> Hi all,
>>
>> Just to follow up on this... since we're not quite ready for 3.0 yet, it's probably best if we release a 2.8 next, and then go to 3.0 after that.  Sorry for any confusion.
>>
>> best,
>> Colin
>>
>>
>> On Mon, Jul 20, 2020, at 12:52, Matthias J. Sax wrote:
>>> Did we reach any conclusion on the subject?
>>>
>>> It seems we are aiming for 2.7 after 2.6 and plan the major version bump
>>> to 3.0 after 2.7 (assuming we make progress on ZK removal as planned?)
>>>
>>>
>>> -Matthias
>>>
>>>
>>> On 5/18/20 1:11 PM, Boyang Chen wrote:
>>>> One more thing I would like to see deprecated (hopefully no one mentioned
>>>> before) is the zk based consumer offset support.
>>>>
>>>> On Mon, May 11, 2020 at 2:15 PM Colin McCabe <cm...@apache.org> wrote:
>>>>
>>>>> Hi Michael,
>>>>>
>>>>> It would be better to discuss the background behind KIP-500 in a separate
>>>>> thread, since this thread is about the Kafka 3.0 release.  As others have
>>>>> said, your questions are answered in the KIP.  For example, "what is the
>>>>> actual goal?" is addressed in the motivation section.
>>>>>
>>>>> I agree that Kafka's usage of Apache ZooKeeper could be optimized.  But
>>>>> there are fundamental limitations to this approach compared to storing our
>>>>> metadata internally.  For example, having to contact a remote server to
>>>>> reload all your metadata on a controller failover simply doesn't scale past
>>>>> a certain point.
>>>>>
>>>>> Apache Curator is a nice API, and if we were starting again today we would
>>>>> certainly consider using it.  But it doesn't allow us to do anything more
>>>>> efficiently than ZooKeeper could already do it.
>>>>>
>>>>> Finally, Kafka's core competence is logs.  While our replication protocol
>>>>> is not Raft, it shares many similarities with that protocol.  So I think
>>>>> it's a bit unfair to say that it is "catastrophic hubris" to believe we can
>>>>> implement the protocol.
>>>>>
>>>>> best,
>>>>> Colin
>>>>>
>>>>>
>>>>> On Sun, May 10, 2020, at 11:02, Michael K. Edwards wrote:
>>>>>> Yes, I've read the KIP.  But all it really says to me is "we have never
>>>>>> gotten around to using ZooKeeper properly."  To the extent that any of
>>>>> the
>>>>>> distributed-state-maintenance problems discussed in "Metadata as an Event
>>>>>> Log" can be solved — and some of them intrinsically can't, because CAP
>>>>>> theorem — most of them are already implemented very effectively in
>>>>> Curator
>>>>>> recipes.  (For instance, Curator's Tree Cache
>>>>>> https://curator.apache.org/curator-recipes/tree-cache.html is a good
>>>>> fit to
>>>>>> some of the state-maintenance needs.)
>>>>>>
>>>>>> Kafka does have some usage patterns that don't map neatly onto existing
>>>>>> Curator recipes.  For instance, neither LeaderSelector nor LeaderLatch
>>>>>> implements leader preference in the way that the existing Kafka partition
>>>>>> leadership election procedure does.  But why not handle that by improving
>>>>>> and extending Curator?  That way, other Curator users benefit, and we get
>>>>>> additional highly experienced reviewers' eyes on the distributed
>>>>>> algorithms, which are very very tricky to get right.
>>>>>>
>>>>>>
>>>>>> On Sun, May 10, 2020 at 10:47 AM Ron Dagostino <rn...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>> Hi Michael.  This is discussed in the KIP.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum#KIP-500:ReplaceZooKeeperwithaSelf-ManagedMetadataQuorum-Motivation
>>>>>>>
>>>>>>> Ron
>>>>>>>
>>>>>>>> On May 10, 2020, at 1:35 PM, Michael K. Edwards <
>>>>> m.k.edwards@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> What is the actual goal of removing the ZooKeeper dependency?  In my
>>>>>>>> experience, if ZooKeeper is properly provisioned and deployed, it's
>>>>>>> largely
>>>>>>>> trouble-free.  (You do need to know how to use observers properly.)
>>>>>>> There
>>>>>>>> are some subtleties about timeouts and leadership changes, but
>>>>> they're
>>>>>>>> pretty small stuff.  Why go to all the trouble of building a new
>>>>>>>> distributed-consensus system that's going to have catastrophic bugs
>>>>> for
>>>>>>>> years to come?  It seems like such an act of hubris to me, as well
>>>>> as a
>>>>>>>> massive waste of engineering effort.  What is there to be gained?
>>>>>>>>
>>>>>>>>> On Fri, May 8, 2020 at 4:11 PM Matthias J. Sax <mj...@apache.org>
>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Sure, we can compile a list for Kafka Streams. But the KIP would be
>>>>> for
>>>>>>>>> 3.0, so I don't think it's urgent to do it now?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -Matthias
>>>>>>>>>
>>>>>>>>>> On 5/8/20 3:47 PM, Colin McCabe wrote:
>>>>>>>>>> Thanks, Guozhang-- sounds like a good plan.
>>>>>>>>>>
>>>>>>>>>> I think it would be good to have a list of deprecated streams APIs
>>>>> that
>>>>>>>>> we want to remove in 3.0.  Maybe it's easiest to do that as its own
>>>>> KIP?
>>>>>>>>>>
>>>>>>>>>> For MirrorMaker 1, we should have a KIP to deprecate its use in
>>>>> 2.6 if
>>>>>>>>> we want to remove it in 3.0.  I don't have a good sense of how
>>>>>>> practical it
>>>>>>>>> is to deprecate this now, so I will defer to others here.  But the
>>>>> KIP
>>>>>>>>> freeze for 2.6 is coming soon, so if we want to make the case, now
>>>>> is
>>>>>>> the
>>>>>>>>> time.
>>>>>>>>>>
>>>>>>>>>> best,
>>>>>>>>>> Colin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On Thu, May 7, 2020, at 16:28, Guozhang Wang wrote:
>>>>>>>>>>> Hey folks,
>>>>>>>>>>>
>>>>>>>>>>> Sorry for stating that the bridge release would not break any
>>>>>>>>> compatibility
>>>>>>>>>>> before, which is incorrect and confused many people.
>>>>>>>>>>>
>>>>>>>>>>> I think one way to think about the versioning is that:
>>>>>>>>>>>
>>>>>>>>>>> 0) In a 2.x version moving ahead we would deprecate the
>>>>> ZK-dependent
>>>>>>>>> tools
>>>>>>>>>>> such as --zookeeper flags from various scripts (KIP-555)
>>>>>>>>>>>
>>>>>>>>>>> 1) In 3.0 we would at least make one incompatible change for
>>>>> example
>>>>>>> to
>>>>>>>>>>> remove the deprecated ZK flags.
>>>>>>>>>>>
>>>>>>>>>>> 2) In a future major version (e.g. 4.0) we would drop ZK entirely,
>>>>>>>>>>> including usages such as security credentials / broker
>>>>> registration /
>>>>>>>>> etc
>>>>>>>>>>> which are via ZK today as well.
>>>>>>>>>>>
>>>>>>>>>>> Then for the bridge release(s), it can be any or all of 3.x.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> For 1), I'd love to add a few more incompatibility changes in 3.0
>>>>> from
>>>>>>>>>>> Kafka Streams: we evolve Streams public APIs by deprecating and
>>>>> then
>>>>>>>>> removing
>>>>>>>>>>> in major releases, and since 2.0 we've accumulated quite a few
>>>>>>>>> deprecated
>>>>>>>>>>> APIs, and I can compile a list of KIPs that contain those if
>>>>> people
>>>>>>> are
>>>>>>>>>>> interested.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Guozhang
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On Thu, May 7, 2020 at 3:53 PM Colin McCabe <cm...@apache.org>
>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 6, 2020, at 21:33, Ryanne Dolan wrote:
>>>>>>>>>>>>>> In fact, we know that the bridge release will involve at least
>>>>> one
>>>>>>>>>>>>>> incompatible change.  We will need to drop support for the
>>>>>>>>> --zookeeper
>>>>>>>>>>>>>> flags in the command-line tools.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If the bridge release(s) and the subsequent post-ZK release are
>>>>>>> _both_
>>>>>>>>>>>>> breaking changes, I think we only have one option: the 3.x line
>>>>> are
>>>>>>>>> the
>>>>>>>>>>>>> bridge release(s), and ZK is removed in 4.0, as suggested by
>>>>> Andrew
>>>>>>>>>>>>> Schofield.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Specifically:
>>>>>>>>>>>>> - in order to _remove_ (not merely deprecate) the --zookeeper
>>>>> args,
>>>>>>> we
>>>>>>>>>>>> will
>>>>>>>>>>>>> need a major release.
>>>>>>>>>>>>> - in order to drop support for ZK entirely (e.g. break a bunch of
>>>>>>>>> external
>>>>>>>>>>>>> tooling like Cruise Control), we will need a major release.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I count two major releases.
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Ryanne,
>>>>>>>>>>>>
>>>>>>>>>>>> I agree that dropping ZK completely will need a new major release
>>>>>>> after
>>>>>>>>>>>> 3.0.  I think that's OK and in keeping with how we've handled
>>>>>>>>> deprecation
>>>>>>>>>>>> and removal in the past.  It's important for users to have a
>>>>> smooth
>>>>>>>>> upgrade
>>>>>>>>>>>> path.
>>>>>>>>>>>>
>>>>>>>>>>>> best,
>>>>>>>>>>>> Colin
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ryanne
>>>>>>>>>>>>>
>>>>>>>>>>>>> -
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 6, 2020 at 10:52 PM Colin McCabe <
>>>>> cmccabe@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, May 4, 2020, at 17:12, Ryanne Dolan wrote:
>>>>>>>>>>>>>>> Hey Colin, I think we should wait until after KIP-500's
>>>>> "bridge
>>>>>>>>>>>>>>> release" so there is a clean break from Zookeeper after 3.0.
>>>>> The
>>>>>>>>>>>>>>> bridge release by definition is an attempt to not break
>>>>> anything,
>>>>>>> so
>>>>>>>>>>>>>>> it theoretically doesn't warrant a major release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Ryanne,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think it's important to clarify this a little bit.  The
>>>>> bridge
>>>>>>>>>>>> release
>>>>>>>>>>>>>> (really, releases, plural) allow you to upgrade from a cluster
>>>>> that
>>>>>>>>> is
>>>>>>>>>>>>>> using ZooKeeper to one that is not using ZooKeeper.  But, that
>>>>>>>>> doesn't
>>>>>>>>>>>>>> imply that the bridge release itself doesn't break anything.
>>>>>>>>> Upgrading
>>>>>>>>>>>>>> to the bridge release itself might involve some minor
>>>>>>>>> incompatibility.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kafka does occasionally have incompatible changes.  In those
>>>>> cases,
>>>>>>>>> we
>>>>>>>>>>>>>> bump the major version number.  One example is that when we
>>>>> went
>>>>>>> from
>>>>>>>>>>>>>> Kafka 1.x to Kafka 2.0, we dropped support for JDK7.  This is
>>>>> an
>>>>>>>>>>>>>> incompatible change.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In fact, we know that the bridge release will involve at least
>>>>> one
>>>>>>>>>>>>>> incompatible change.  We will need to drop support for the
>>>>>>>>> --zookeeper
>>>>>>>>>>>>>> flags in the command-line tools.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We've been preparing for this change for a long time.  People
>>>>> have
>>>>>>>>>>>> spent
>>>>>>>>>>>>>> a lot of effort designing new APIs that can be used instead of
>>>>> the
>>>>>>>>> old
>>>>>>>>>>>>>> zookeeper-based code that some of the command-line tools
>>>>> used.  We
>>>>>>>>> have
>>>>>>>>>>>>>> also deprecated the old ZK-based flags.  But at the end of the
>>>>> day,
>>>>>>>>> it
>>>>>>>>>>>>>> is still an incompatible change.  So it's unfortunately not
>>>>>>> possible
>>>>>>>>>>>> for
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> bridge release to be a 2.x release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If that's not the case (i.e. if a single "bridge release"
>>>>> turns
>>>>>>> out
>>>>>>>>>>>> to
>>>>>>>>>>>>>>> be impractical), we should consider forking 3.0 while
>>>>> maintaining
>>>>>>> a
>>>>>>>>>>>>>>> line of Zookeeper-dependent Kafka in 2.x. That way 3.x can
>>>>> evolve
>>>>>>>>>>>>>>> dramatically without breaking the 2.x line. In particular,
>>>>>>> anything
>>>>>>>>>>>>>>> related to removing Zookeeper could land in pre-3.0 while
>>>>> every
>>>>>>>>> other
>>>>>>>>>>>>>>> feature targets 2.6.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just to be super clear about this, what we want to do here is
>>>>>>> support
>>>>>>>>>>>>>> operating in __either__ KIP-500 mode or legacy mode for a
>>>>> while.
>>>>>>> So
>>>>>>>>>>>> the
>>>>>>>>>>>>>> same branch will have support for both the old way and the new
>>>>> way
>>>>>>> of
>>>>>>>>>>>>>> managing metadata.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This will allow us to get an "alpha" version of the KIP-500
>>>>> mode
>>>>>>> out
>>>>>>>>>>>> early
>>>>>>>>>>>>>> for people to experiment with.  It also greatly reduces the
>>>>> number
>>>>>>> of
>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>> releases we have to make, and the amount of backporting we
>>>>> have to
>>>>>>>>> do.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you are proposing 2.6 should be the "bridge release", I
>>>>> think
>>>>>>>>> this
>>>>>>>>>>>>>>> is premature given Kafka's time-based release schedule. If the
>>>>>>>>> bridge
>>>>>>>>>>>>>>> features happen to be merged before 2.6's feature freeze, then
>>>>>>> sure
>>>>>>>>>>>> --
>>>>>>>>>>>>>>> let's make that the bridge release in retrospect. And if we
>>>>> get
>>>>>>> all
>>>>>>>>>>>>>>> the post-Zookeeper features merged before 2.7, I'm onboard
>>>>> with
>>>>>>>>>>>> naming
>>>>>>>>>>>>>>> it "3.0" instead.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That said, we should aim to remove legacy MirrorMaker before
>>>>> 3.0
>>>>>>> as
>>>>>>>>>>>>>>> well. I'm happy to drive that additional breaking change.
>>>>> Maybe
>>>>>>> 2.6
>>>>>>>>>>>>>>> can be the "bridge" for MM2 as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't have a strong opinion either way about this, but if we
>>>>> want
>>>>>>>>> to
>>>>>>>>>>>>>> remove the original MirrorMaker, we have to deprecate it first,
>>>>>>>>>>>> right?  Are
>>>>>>>>>>>>>> we ready to do that?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> best,
>>>>>>>>>>>>>> Colin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ryanne
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, May 4, 2020, 5:05 PM Colin McCabe <cmccabe@apache.org
>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We've had a few proposals recently for incompatible
>>>>> changes.  One
>>>>>>>>>>>> of
>>>>>>>>>>>>>>>> them is my KIP-604: Remove ZooKeeper Flags from the
>>>>>>> Administrative
>>>>>>>>>>>>>>>> Tools.  The other is Boyang's KIP-590: Redirect ZK Mutation
>>>>>>>>>>>>>>>> Protocols to the Controller.  I think it's time to start
>>>>> thinking
>>>>>>>>>>>>>>>> about Kafka 3.0. Specifically, I think we should move to 3.0
>>>>>>> after
>>>>>>>>>>>>>>>> the 2.6 release.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> From the perspective of KIP-500, in Kafka 3.x we'd like to
>>>>> make
>>>>>>>>>>>>>>>> running in a ZooKeeper-less mode possible (but not yet the
>>>>>>>>>>>> default.)
>>>>>>>>>>>>>>>> This is the motivation behind KIP-590 and KIP-604, as well as
>>>>>>> some
>>>>>>>>>>>>>>>> of the other KIPs we've done recently.  Since it will take
>>>>> some
>>>>>>>>>>>> time
>>>>>>>>>>>>>>>> to stabilize the new ZooKeeper-free Kafka code, we will hide
>>>>> it
>>>>>>>>>>>>>>>> behind an option initially. (We'll have a KIP describing
>>>>> this all
>>>>>>>>>>>> in
>>>>>>>>>>>>>>>> detail soon.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What does everyone think about having Kafka 3.0 come up next
>>>>>>> after
>>>>>>>>>>>>>>>> 2.6? Are there any other things we should change in the 2.6
>>>>> ->
>>>>>>> 3.0
>>>>>>>>>>>>>>>> transition?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> best, Colin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>

Re: [DISCUSS] Kafka 3.0

Posted by Guozhang Wang <wa...@gmail.com>.
+1 on getting to 3.0 for the June release this year too.

Guozhang

On Mon, Feb 22, 2021 at 6:54 PM Matthias J. Sax <mj...@apache.org> wrote:

> To move this forward, I took the liberty to create a PR to bump the
> version to 3.0.0-SNAPSHOT
>
> https://github.com/apache/kafka/pull/10186
>
> Please let us know if there are any concerns.
>
>
> -Matthias
>
> On 2/16/21 5:18 PM, Ismael Juma wrote:
> > I'm +1 on 3.0 for the mid year release.
> >>>>>>>>>>>>>> time
> >>>>>>>>>>>>>>>>>> to stabilize the new ZooKeeper-free Kafka code, we will
> >> hide
> >>>>>>> it
> >>>>>>>>>>>>>>>>>> behind an option initially. (We'll have a KIP describing
> >>>>>>> this all
> >>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>> detail soon.)
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> What does everyone think about having Kafka 3.0 come up
> >> next
> >>>>>>>>> after
> >>>>>>>>>>>>>>>>>> 2.6? Are there any other things we should change in the
> >> 2.6
> >>>>>>> ->
> >>>>>>>>> 3.0
> >>>>>>>>>>>>>>>>>> transition?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> best, Colin
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> Attachments:
> >>>>> * signature.asc
> >>
> >
>


-- 
-- Guozhang

Re: [DISCUSS] Kafka 3.0

Posted by "Matthias J. Sax" <mj...@apache.org>.
To move this forward, I took the liberty of creating a PR to bump the
version to 3.0.0-SNAPSHOT:

https://github.com/apache/kafka/pull/10186

Please let us know if there are any concerns.


-Matthias

On 2/16/21 5:18 PM, Ismael Juma wrote:
> I'm +1 on 3.0 for the mid-year release.
> 
> On Tue, Feb 16, 2021 at 5:08 PM Matthias J. Sax <mj...@apache.org> wrote:
> 
>> Hi,
>>
>> given that we passed 2.8 feature freeze, I wanted to restart this
>> thread. Currently, `trunk` is at `2.9.0-SNAPSHOT` and I am wondering if
>> the decision for the 3.0 release is final and if we should bump the
>> version number?
>>
>> I am asking particularly because there are many Jiras with a 3.0 target
>> release version for breaking changes and we should ensure that we have
>> enough time to work on those tickets. -- As long as we don't agree that
>> the next release will indeed be 3.0, those tickets are effectively
>> blocked/pending.
>>
>> Thoughts?
>>
>>
>> -Matthias
>>
>>
>> On 10/15/20 4:28 PM, Matthias J. Sax wrote:
>>> Thanks for clarifying Colin. Works for me. Overall, 3.0 should be guided
>>> by the ZK removal progress and if we are not there yet, it's better to
>>> have a 2.8 first.
>>>
>>>
>>> -Matthias
>>>
>>>
>>> On 10/15/20 2:41 PM, Colin McCabe wrote:
>>>> Hi all,
>>>>
>>>> Just to follow up on this... since we're not quite ready for 3.0 yet,
>>>> it's probably best if we release a 2.8 next, and then go to 3.0 after
>>>> that.  Sorry for any confusion.
>>>>
>>>> best,
>>>> Colin
>>>>
>>>>
>>>> On Mon, Jul 20, 2020, at 12:52, Matthias J. Sax wrote:
>>>>> Did we reach any conclusion on the subject?
>>>>>
>>>>> It seems we are aiming for 2.7 after 2.6 and plan the major version bump
>>>>> to 3.0 after 2.7 (assuming we make progress on ZK removal as planned?)
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>>
>>>>> On 5/18/20 1:11 PM, Boyang Chen wrote:
>>>>>> One more thing I would like to see deprecated (hopefully no one mentioned
>>>>>> before) is the zk based consumer offset support.
>>>>>>
>>>>>> On Mon, May 11, 2020 at 2:15 PM Colin McCabe <cm...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Michael,
>>>>>>>
>>>>>>> It would be better to discuss the background behind KIP-500 in a separate
>>>>>>> thread, since this thread is about the Kafka 3.0 release.  As others have
>>>>>>> said, your questions are answered in the KIP.  For example, "what is the
>>>>>>> actual goal?" is addressed in the motivation section.
>>>>>>>
>>>>>>> I agree that Kafka's usage of Apache ZooKeeper could be optimized.  But
>>>>>>> there are fundamental limitations to this approach compared to storing our
>>>>>>> metadata internally.  For example, having to contact a remote server to
>>>>>>> reload all your metadata on a controller failover simply doesn't scale past
>>>>>>> a certain point.
>>>>>>>
>>>>>>> Apache Curator is a nice API, and if we were starting again today we would
>>>>>>> certainly consider using it.  But it doesn't allow us to do anything more
>>>>>>> efficiently than ZooKeeper could already do it.
>>>>>>>
>>>>>>> Finally, Kafka's core competence is logs.  While our replication protocol
>>>>>>> is not Raft, it shares many similarities with that protocol.  So I think
>>>>>>> it's a bit unfair to say that it is "catastrophic hubris" to believe we can
>>>>>>> implement the protocol.
>>>>>>>
>>>>>>> best,
>>>>>>> Colin
>>>>>>>
>>>>>>>
>>>>>>> On Sun, May 10, 2020, at 11:02, Michael K. Edwards wrote:
>>>>>>>> Yes, I've read the KIP.  But all it really says to me is "we have
>> never
>>>>>>>> gotten around to using ZooKeeper properly."  To the extent that any
>> of
>>>>>>> the
>>>>>>>> distributed-state-maintenance problems discussed in "Metadata as an
>> Event
>>>>>>>> Log" can be solved — and some of them intrinsically can't, because
>> CAP
>>>>>>>> theorem — most of them are already implemented very effectively in
>>>>>>> Curator
>>>>>>>> recipes.  (For instance, Curator's Tree Cache
>>>>>>>> https://curator.apache.org/curator-recipes/tree-cache.html is a
>> good
>>>>>>> fit to
>>>>>>>> some of the state-maintenance needs.)
>>>>>>>>
>>>>>>>> Kafka does have some usage patterns that don't map neatly onto existing
>>>>>>>> Curator recipes.  For instance, neither LeaderSelector nor LeaderLatch
>>>>>>>> implements leader preference in the way that the existing Kafka partition
>>>>>>>> leadership election procedure does.  But why not handle that by improving
>>>>>>>> and extending Curator?  That way, other Curator users benefit, and we get
>>>>>>>> additional highly experienced reviewers' eyes on the distributed
>>>>>>>> algorithms, which are very very tricky to get right.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, May 10, 2020 at 10:47 AM Ron Dagostino <rn...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Michael.  This is discussed in the KIP.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum#KIP-500:ReplaceZooKeeperwithaSelf-ManagedMetadataQuorum-Motivation
>>>>>>>>>
>>>>>>>>> Ron
>>>>>>>>>
>>>>>>>>>> On May 10, 2020, at 1:35 PM, Michael K. Edwards <m.k.edwards@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> What is the actual goal of removing the ZooKeeper dependency?  In my
>>>>>>>>>> experience, if ZooKeeper is properly provisioned and deployed, it's largely
>>>>>>>>>> trouble-free.  (You do need to know how to use observers properly.)  There
>>>>>>>>>> are some subtleties about timeouts and leadership changes, but they're
>>>>>>>>>> pretty small stuff.  Why go to all the trouble of building a new
>>>>>>>>>> distributed-consensus system that's going to have catastrophic bugs for
>>>>>>>>>> years to come?  It seems like such an act of hubris to me, as well as a
>>>>>>>>>> massive waste of engineering effort.  What is there to be gained?
>>>>>>>>>>
>>>>>>>>>>> On Fri, May 8, 2020 at 4:11 PM Matthias J. Sax <mjsax@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Sure, we can compile a list for Kafka Streams. But the KIP would be for
>>>>>>>>>>> 3.0, so I don't think it's urgent to do it now?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -Matthias
>>>>>>>>>>>
>>>>>>>>>>>> On 5/8/20 3:47 PM, Colin McCabe wrote:
>>>>>>>>>>>> Thanks, Guozhang-- sounds like a good plan.
>>>>>>>>>>>>
>>>>>>>>>>>> I think it would be good to have a list of deprecated streams APIs that
>>>>>>>>>>>> we want to remove in 3.0.  Maybe it's easiest to do that as its own KIP?
>>>>>>>>>>>>
>>>>>>>>>>>> For MirrorMaker 1, we should have a KIP to deprecate its use in 2.6 if
>>>>>>>>>>>> we want to remove it in 3.0.  I don't have a good sense of how practical it
>>>>>>>>>>>> is to deprecate this now, so I will defer to others here.  But the KIP
>>>>>>>>>>>> freeze for 2.6 is coming soon, so if we want to make the case, now is the
>>>>>>>>>>>> time.
>>>>>>>>>>>>
>>>>>>>>>>>> best,
>>>>>>>>>>>> Colin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, May 7, 2020, at 16:28, Guozhang Wang wrote:
>>>>>>>>>>>>> Hey folks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry for stating that the bridge release would not break any compatibility
>>>>>>>>>>>>> before, which is incorrect and confused many people.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think one way to think about the versioning is that:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 0) In a 2.x version moving ahead we would deprecate the ZK-dependent tools
>>>>>>>>>>>>> such as --zookeeper flags from various scripts (KIP-555)
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) In 3.0 we would at least make one incompatible change for example to
>>>>>>>>>>>>> remove the deprecated ZK flags.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2) In a future major version (e.g. 4.0) we would drop ZK entirely,
>>>>>>>>>>>>> including usages such as security credentials / broker registration / etc
>>>>>>>>>>>>> which are via ZK today as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then for the bridge release(s), it can be any or all of 3.x.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> For 1), I'd love to add a few more incompatibility changes in 3.0 from
>>>>>>>>>>>>> Kafka Streams: we evolve Streams public APIs by deprecating and then remove
>>>>>>>>>>>>> in major releases, and since 2.0 we've accumulated quite a few deprecated
>>>>>>>>>>>>> APIs, and I can compile a list of KIPs that contain those if people are
>>>>>>>>>>>>> interested.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, May 7, 2020 at 3:53 PM Colin McCabe <cmccabe@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, May 6, 2020, at 21:33, Ryanne Dolan wrote:
>>>>>>>>>>>>>>>> In fact, we know that the bridge release will involve at least one
>>>>>>>>>>>>>>>> incompatible change.  We will need to drop support for the --zookeeper
>>>>>>>>>>>>>>>> flags in the command-line tools.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If the bridge release(s) and the subsequent post-ZK release are _both_
>>>>>>>>>>>>>>> breaking changes, I think we only have one option: the 3.x line is the
>>>>>>>>>>>>>>> bridge release(s), and ZK is removed in 4.0, as suggested by Andrew
>>>>>>>>>>>>>>> Schofield.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Specifically:
>>>>>>>>>>>>>>> - in order to _remove_ (not merely deprecate) the --zookeeper args, we will
>>>>>>>>>>>>>>> need a major release.
>>>>>>>>>>>>>>> - in order to drop support for ZK entirely (e.g. break a bunch of external
>>>>>>>>>>>>>>> tooling like Cruise Control), we will need a major release.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I count two major releases.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Ryanne,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree that dropping ZK completely will need a new major release after
>>>>>>>>>>>>>> 3.0.  I think that's OK and in keeping with how we've handled deprecation
>>>>>>>>>>>>>> and removal in the past.  It's important for users to have a smooth upgrade
>>>>>>>>>>>>>> path.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> best,
>>>>>>>>>>>>>> Colin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ryanne
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, May 6, 2020 at 10:52 PM Colin McCabe <cmccabe@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, May 4, 2020, at 17:12, Ryanne Dolan wrote:
>>>>>>>>>>>>>>>>> Hey Colin, I think we should wait until after KIP-500's "bridge
>>>>>>>>>>>>>>>>> release" so there is a clean break from Zookeeper after 3.0. The
>>>>>>>>>>>>>>>>> bridge release by definition is an attempt to not break anything, so
>>>>>>>>>>>>>>>>> it theoretically doesn't warrant a major release.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Ryanne,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think it's important to clarify this a little bit.  The bridge release
>>>>>>>>>>>>>>>> (really, releases, plural) allow you to upgrade from a cluster that is
>>>>>>>>>>>>>>>> using ZooKeeper to one that is not using ZooKeeper.  But, that doesn't
>>>>>>>>>>>>>>>> imply that the bridge release itself doesn't break anything.  Upgrading
>>>>>>>>>>>>>>>> to the bridge release itself might involve some minor incompatibility.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kafka does occasionally have incompatible changes.  In those cases, we
>>>>>>>>>>>>>>>> bump the major version number.  One example is that when we went from
>>>>>>>>>>>>>>>> Kafka 1.x to Kafka 2.0, we dropped support for JDK7.  This is an
>>>>>>>>>>>>>>>> incompatible change.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In fact, we know that the bridge release will involve at least one
>>>>>>>>>>>>>>>> incompatible change.  We will need to drop support for the --zookeeper
>>>>>>>>>>>>>>>> flags in the command-line tools.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We've been preparing for this change for a long time.  People have spent
>>>>>>>>>>>>>>>> a lot of effort designing new APIs that can be used instead of the old
>>>>>>>>>>>>>>>> zookeeper-based code that some of the command-line tools used.  We have
>>>>>>>>>>>>>>>> also deprecated the old ZK-based flags.  But at the end of the day, it
>>>>>>>>>>>>>>>> is still an incompatible change.  So it's unfortunately not possible for
>>>>>>>>>>>>>>>> the bridge release to be a 2.x release.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If that's not the case (i.e. if a single "bridge release" turns out to
>>>>>>>>>>>>>>>>> be impractical), we should consider forking 3.0 while maintaining a
>>>>>>>>>>>>>>>>> line of Zookeeper-dependent Kafka in 2.x. That way 3.x can evolve
>>>>>>>>>>>>>>>>> dramatically without breaking the 2.x line. In particular, anything
>>>>>>>>>>>>>>>>> related to removing Zookeeper could land in pre-3.0 while every other
>>>>>>>>>>>>>>>>> feature targets 2.6.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Just to be super clear about this, what we want to do here is support
>>>>>>>>>>>>>>>> operating in __either__ KIP-500 mode or legacy mode for a while.  So the
>>>>>>>>>>>>>>>> same branch will have support for both the old way and the new way of
>>>>>>>>>>>>>>>> managing metadata.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This will allow us to get an "alpha" version of the KIP-500 mode out early
>>>>>>>>>>>>>>>> for people to experiment with.  It also greatly reduces the number of Kafka
>>>>>>>>>>>>>>>> releases we have to make, and the amount of backporting we have to do.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If you are proposing 2.6 should be the "bridge release", I think this
>>>>>>>>>>>>>>>>> is premature given Kafka's time-based release schedule. If the bridge
>>>>>>>>>>>>>>>>> features happen to be merged before 2.6's feature freeze, then sure --
>>>>>>>>>>>>>>>>> let's make that the bridge release in retrospect. And if we get all
>>>>>>>>>>>>>>>>> the post-Zookeeper features merged before 2.7, I'm onboard with naming
>>>>>>>>>>>>>>>>> it "3.0" instead.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That said, we should aim to remove legacy MirrorMaker before 3.0 as
>>>>>>>>>>>>>>>>> well. I'm happy to drive that additional breaking change. Maybe 2.6
>>>>>>>>>>>>>>>>> can be the "bridge" for MM2 as well.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I don't have a strong opinion either way about this, but if we want to
>>>>>>>>>>>>>>>> remove the original MirrorMaker, we have to deprecate it first, right?  Are
>>>>>>>>>>>>>>>> we ready to do that?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> best,
>>>>>>>>>>>>>>>> Colin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ryanne
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, May 4, 2020, 5:05 PM Colin McCabe <cmccabe@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We've had a few proposals recently for incompatible changes.  One of
>>>>>>>>>>>>>>>>>> them is my KIP-604: Remove ZooKeeper Flags from the Administrative
>>>>>>>>>>>>>>>>>> Tools.  The other is Boyang's KIP-590: Redirect ZK Mutation
>>>>>>>>>>>>>>>>>> Protocols to the Controller.  I think it's time to start thinking
>>>>>>>>>>>>>>>>>> about Kafka 3.0. Specifically, I think we should move to 3.0 after
>>>>>>>>>>>>>>>>>> the 2.6 release.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> From the perspective of KIP-500, in Kafka 3.x we'd like to make
>>>>>>>>>>>>>>>>>> running in a ZooKeeper-less mode possible (but not yet the default.)
>>>>>>>>>>>>>>>>>> This is the motivation behind KIP-590 and KIP-604, as well as some
>>>>>>>>>>>>>>>>>> of the other KIPs we've done recently.  Since it will take some time
>>>>>>>>>>>>>>>>>> to stabilize the new ZooKeeper-free Kafka code, we will hide it
>>>>>>>>>>>>>>>>>> behind an option initially. (We'll have a KIP describing this all in
>>>>>>>>>>>>>>>>>> detail soon.)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What does everyone think about having Kafka 3.0 come up next after
>>>>>>>>>>>>>>>>>> 2.6? Are there any other things we should change in the 2.6 -> 3.0
>>>>>>>>>>>>>>>>>> transition?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> best, Colin
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> Attachments:
>>>>> * signature.asc
>>
> 

Re: [DISCUSS] Kafka 3.0

Posted by Ismael Juma <is...@juma.me.uk>.
I'm +1 on 3.0 for the mid-year release.

On Tue, Feb 16, 2021 at 5:08 PM Matthias J. Sax <mj...@apache.org> wrote:

> Hi,
>
> given that we passed 2.8 feature freeze, I wanted to restart this
> thread. Currently, `trunk` is at `2.9.0-SNAPSHOT` and I am wondering if
> the decision for the 3.0 release is final and if we should bump the
> version number?
>
> I am asking particularly because there are many Jiras with a 3.0 target
> release version for breaking changes and we should ensure that we have
> enough time to work on those tickets. -- As long as we don't agree that
> the next release will indeed be 3.0, those tickets are effectively
> blocked/pending.
>
> Thoughts?
>
>
> -Matthias
>
>
> On 10/15/20 4:28 PM, Matthias J. Sax wrote:
> > Thanks for clarifying Colin. Works for me. Overall, 3.0 should be guided
> > by the ZK removal progress and if we are not there yet, it's better to
> > have a 2.8 first.
> >
> >
> > -Matthias
> >
> >
> > On 10/15/20 2:41 PM, Colin McCabe wrote:
> >> Hi all,
> >>
> >> Just to follow up on this... since we're not quite ready for 3.0 yet,
> >> it's probably best if we release a 2.8 next, and then go to 3.0 after
> >> that.  Sorry for any confusion.
> >>
> >> best,
> >> Colin
> >>
> >>
> >> On Mon, Jul 20, 2020, at 12:52, Matthias J. Sax wrote:
> >>> Did we reach any conclusion on the subject?
> >>>
> >>> It seems we are aiming for 2.7 after 2.6 and plan the major version bump
> >>> to 3.0 after 2.7 (assuming we make progress on ZK removal as planned?)
> >>>
> >>>
> >>> -Matthias
> >>>
> >>>
> >>> On 5/18/20 1:11 PM, Boyang Chen wrote:
> >>>> One more thing I would like to see deprecated (hopefully no one mentioned
> >>>> before) is the zk based consumer offset support.
> >>>>
> >>>> On Mon, May 11, 2020 at 2:15 PM Colin McCabe <cm...@apache.org> wrote:
> >>>>
> >>>>> Hi Michael,
> >>>>>
> >>>>> It would be better to discuss the background behind KIP-500 in a separate
> >>>>> thread, since this thread is about the Kafka 3.0 release.  As others have
> >>>>> said, your questions are answered in the KIP.  For example, "what is the
> >>>>> actual goal?" is addressed in the motivation section.
> >>>>>
> >>>>> I agree that Kafka's usage of Apache ZooKeeper could be optimized.  But
> >>>>> there are fundamental limitations to this approach compared to storing our
> >>>>> metadata internally.  For example, having to contact a remote server to
> >>>>> reload all your metadata on a controller failover simply doesn't scale past
> >>>>> a certain point.
> >>>>>
> >>>>> Apache Curator is a nice API, and if we were starting again today we would
> >>>>> certainly consider using it.  But it doesn't allow us to do anything more
> >>>>> efficiently than ZooKeeper could already do it.
> >>>>>
> >>>>> Finally, Kafka's core competence is logs.  While our replication protocol
> >>>>> is not Raft, it shares many similarities with that protocol.  So I think
> >>>>> it's a bit unfair to say that it is "catastrophic hubris" to believe we can
> >>>>> implement the protocol.
> >>>>>
> >>>>> best,
> >>>>> Colin
> >>>>>
> >>>>>
> >>>>> On Sun, May 10, 2020, at 11:02, Michael K. Edwards wrote:
> >>>>>> Yes, I've read the KIP.  But all it really says to me is "we have
> never
> >>>>>> gotten around to using ZooKeeper properly."  To the extent that any
> of
> >>>>> the
> >>>>>> distributed-state-maintenance problems discussed in "Metadata as an
> Event
> >>>>>> Log" can be solved — and some of them intrinsically can't, because
> CAP
> >>>>>> theorem — most of them are already implemented very effectively in
> >>>>> Curator
> >>>>>> recipes.  (For instance, Curator's Tree Cache
> >>>>>> https://curator.apache.org/curator-recipes/tree-cache.html is a
> good
> >>>>> fit to
> >>>>>> some of the state-maintenance needs.)
> >>>>>>
> >>>>>> Kafka does have some usage patterns that don't map neatly onto
> existing
> >>>>>> Curator recipes.  For instance, neither LeaderSelector nor
> LeaderLatch
> >>>>>> implements leader preference in the way that the existing Kafka
> partition
> >>>>>> leadership election procedure does.  But why not handle that by
> improving
> >>>>>> and extending Curator?  That way, other Curator users benefit, and
> we get
> >>>>>> additional highly experienced reviewers' eyes on the distributed
> >>>>>> algorithms, which are very very tricky to get right.
> >>>>>>
> >>>>>>
> >>>>>> On Sun, May 10, 2020 at 10:47 AM Ron Dagostino <rn...@gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>>> Hi Michael.  This is discussed in the KIP.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum#KIP-500:ReplaceZooKeeperwithaSelf-ManagedMetadataQuorum-Motivation
> >>>>>>>
> >>>>>>> Ron
> >>>>>>>
> >>>>>>>> On May 10, 2020, at 1:35 PM, Michael K. Edwards <
> >>>>> m.k.edwards@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> What is the actual goal of removing the ZooKeeper dependency?  In
> my
> >>>>>>>> experience, if ZooKeeper is properly provisioned and deployed,
> it's
> >>>>>>> largely
> >>>>>>>> trouble-free.  (You do need to know how to use observers
> properly.)
> >>>>>>> There
> >>>>>>>> are some subtleties about timeouts and leadership changes, but
> >>>>> they're
> >>>>>>>> pretty small stuff.  Why go to all the trouble of building a new
> >>>>>>>> distributed-consensus system that's going to have catastrophic
> bugs
> >>>>> for
> >>>>>>>> years to come?  It seems like such an act of hubris to me, as well
> >>>>> as a
> >>>>>>>> massive waste of engineering effort.  What is there to be gained?
> >>>>>>>>
> >>>>>>>>> On Fri, May 8, 2020 at 4:11 PM Matthias J. Sax <mjsax@apache.org
> >
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Sure, we can compile a list for Kafka Streams. But the KIP would
> be
> >>>>> for
> >>>>>>>>> 3.0, so I don't think it's urgent to do it now?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -Matthias
> >>>>>>>>>
> >>>>>>>>>> On 5/8/20 3:47 PM, Colin McCabe wrote:
> >>>>>>>>>> Thanks, Guozhang-- sounds like a good plan.
> >>>>>>>>>>
> >>>>>>>>>> I think it would be good to have a list of deprecated streams
> APIs
> >>>>> that
> >>>>>>>>> we want to remove in 3.0.  Maybe it's easiest to do that as its
> own
> >>>>> KIP?
> >>>>>>>>>>
> >>>>>>>>>> For MirrorMaker 1, we should have a KIP to deprecate its use in
> >>>>> 2.6 if
> >>>>>>>>> we want to remove it in 3.0.  I don't have a good sense of how
> >>>>>>> practical it
> >>>>>>>>> is to deprecate this now, so I will defer to others here.  But
> the
> >>>>> KIP
> >>>>>>>>> freeze for 2.6 is coming soon, so if we want to make the case,
> now
> >>>>> is
> >>>>>>> the
> >>>>>>>>> time.
> >>>>>>>>>>
> >>>>>>>>>> best,
> >>>>>>>>>> Colin
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Thu, May 7, 2020, at 16:28, Guozhang Wang wrote:
> >>>>>>>>>>> Hey folks,
> >>>>>>>>>>>
> >>>>>>>>>>> Sorry for stating that the bridge release would not break any
> >>>>>>>>> compatibility
> >>>>>>>>>>> before, which is incorrect and confused many people.
> >>>>>>>>>>>
> >>>>>>>>>>> I think one way to think about the versioning is that:
> >>>>>>>>>>>
> >>>>>>>>>>> 0) In a 2.x version moving ahead we would deprecate the
> >>>>> ZK-dependent
> >>>>>>>>> tools
> >>>>>>>>>>> such as --zookeeper flags from various scripts (KIP-555)
> >>>>>>>>>>>
> >>>>>>>>>>> 1) In 3.0 we would at least make one incompatible change for
> >>>>> example
> >>>>>>> to
> >>>>>>>>>>> remove the deprecated ZK flags.
> >>>>>>>>>>>
> >>>>>>>>>>> 2) In a future major version (e.g. 4.0) we would drop ZK
> entirely,
> >>>>>>>>>>> including usages such as security credentials / broker
> >>>>> registration /
> >>>>>>>>> etc
> >>>>>>>>>>> which are via ZK today as well.
> >>>>>>>>>>>
> >>>>>>>>>>> Then for the bridge release(s), it can be any or all of 3.x.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> For 1), I'd love to add a few more incompatibility changes in
> 3.0
> >>>>> from
> >>>>>>>>>>> Kafka Streams: we evolve Streams public APIs by deprecating and
> >>>>> then
> >>>>>>>>> remove
> >>>>>>>>>>> in major releases, and since 2.0 we've accumulated quite a few
> >>>>>>>>> deprecated
> >>>>>>>>>>> APIs, and I can compile a list of KIPs that contain those if
> >>>>> people
> >>>>>>> are
> >>>>>>>>>>> interested.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Guozhang
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> On Thu, May 7, 2020 at 3:53 PM Colin McCabe <
> cmccabe@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, May 6, 2020, at 21:33, Ryanne Dolan wrote:
> >>>>>>>>>>>>>> In fact, we know that the bridge release will involve at
> least
> >>>>> one
> >>>>>>>>>>>>>> incompatible change.  We will need to drop support for the
> >>>>>>>>> --zookeeper
> >>>>>>>>>>>>>> flags in the command-line tools.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If the bridge release(s) and the subsequent post-ZK release
> are
> >>>>>>> _both_
> >>>>>>>>>>>>> breaking changes, I think we only have one option: the 3.x
> line
> >>>>> are
> >>>>>>>>> the
> >>>>>>>>>>>>> bridge release(s), and ZK is removed in 4.0, as suggested by
> >>>>> Andrew
> >>>>>>>>>>>>> Schofield.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Specifically:
> >>>>>>>>>>>>> - in order to _remove_ (not merely deprecate) the --zookeeper
> >>>>> args,
> >>>>>>> we
> >>>>>>>>>>>> will
> >>>>>>>>>>>>> need a major release.
> >>>>>>>>>>>>> - in order to drop support for ZK entirely (e.g. break a
> bunch of
> >>>>>>>>> external
> >>>>>>>>>>>>> tooling like Cruise Control), we will need a major release.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I count two major releases.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Ryanne,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I agree that dropping ZK completely will need a new major
> release
> >>>>>>> after
> >>>>>>>>>>>> 3.0.  I think that's OK and in keeping with how we've handled
> >>>>>>>>> deprecation
> >>>>>>>>>>>> and removal in the past.  It's important for users to have a
> >>>>> smooth
> >>>>>>>>> upgrade
> >>>>>>>>>>>> path.
> >>>>>>>>>>>>
> >>>>>>>>>>>> best,
> >>>>>>>>>>>> Colin
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Ryanne
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, May 6, 2020 at 10:52 PM Colin McCabe <
> >>>>> cmccabe@apache.org>
> >>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Mon, May 4, 2020, at 17:12, Ryanne Dolan wrote:
> >>>>>>>>>>>>>>> Hey Colin, I think we should wait until after KIP-500's
> >>>>> "bridge
> >>>>>>>>>>>>>>> release" so there is a clean break from Zookeeper after
> 3.0.
> >>>>> The
> >>>>>>>>>>>>>>> bridge release by definition is an attempt to not break
> >>>>> anything,
> >>>>>>> so
> >>>>>>>>>>>>>>> it theoretically doesn't warrant a major release.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Ryanne,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I think it's important to clarify this a little bit.  The
> >>>>> bridge
> >>>>>>>>>>>> release
> >>>>>>>>>>>>>> (really, releases, plural) allow you to upgrade from a
> cluster
> >>>>> that
> >>>>>>>>> is
> >>>>>>>>>>>>>> using ZooKeeper to one that is not using ZooKeeper.  But,
> that
> >>>>>>>>> doesn't
> >>>>>>>>>>>>>> imply that the bridge release itself doesn't break anything.
> >>>>>>>>> Upgrading
> >>>>>>>>>>>>>> to the bridge release itself might involve some minor
> >>>>>>>>> incompatibility.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Kafka does occasionally have incompatible changes.  In those cases, we
> >>>>>>>>>>>>>> bump the major version number.  One example is that when we went from
> >>>>>>>>>>>>>> Kafka 1.x to Kafka 2.0, we dropped support for JDK7.  This is an
> >>>>>>>>>>>>>> incompatible change.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In fact, we know that the bridge release will involve at least one
> >>>>>>>>>>>>>> incompatible change.  We will need to drop support for the --zookeeper
> >>>>>>>>>>>>>> flags in the command-line tools.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> We've been preparing for this change for a long time.  People have spent
> >>>>>>>>>>>>>> a lot of effort designing new APIs that can be used instead of the old
> >>>>>>>>>>>>>> zookeeper-based code that some of the command-line tools used.  We have
> >>>>>>>>>>>>>> also deprecated the old ZK-based flags.  But at the end of the day, it
> >>>>>>>>>>>>>> is still an incompatible change.  So it's unfortunately not possible for
> >>>>>>>>>>>>>> the bridge release to be a 2.x release.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If that's not the case (i.e. if a single "bridge release" turns out to
> >>>>>>>>>>>>>>> be impractical), we should consider forking 3.0 while maintaining a
> >>>>>>>>>>>>>>> line of Zookeeper-dependent Kafka in 2.x. That way 3.x can evolve
> >>>>>>>>>>>>>>> dramatically without breaking the 2.x line. In particular, anything
> >>>>>>>>>>>>>>> related to removing Zookeeper could land in pre-3.0 while every other
> >>>>>>>>>>>>>>> feature targets 2.6.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Just to be super clear about this, what we want to do here is support
> >>>>>>>>>>>>>> operating in __either__ KIP-500 mode or legacy mode for a while.  So the
> >>>>>>>>>>>>>> same branch will have support for both the old way and the new way of
> >>>>>>>>>>>>>> managing metadata.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This will allow us to get an "alpha" version of the KIP-500 mode out early
> >>>>>>>>>>>>>> for people to experiment with.  It also greatly reduces the number of Kafka
> >>>>>>>>>>>>>> releases we have to make, and the amount of backporting we have to do.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If you are proposing 2.6 should be the "bridge release", I think this
> >>>>>>>>>>>>>>> is premature given Kafka's time-based release schedule. If the bridge
> >>>>>>>>>>>>>>> features happen to be merged before 2.6's feature freeze, then sure --
> >>>>>>>>>>>>>>> let's make that the bridge release in retrospect. And if we get all
> >>>>>>>>>>>>>>> the post-Zookeeper features merged before 2.7, I'm onboard with naming
> >>>>>>>>>>>>>>> it "3.0" instead.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> That said, we should aim to remove legacy MirrorMaker before 3.0 as
> >>>>>>>>>>>>>>> well. I'm happy to drive that additional breaking change. Maybe 2.6
> >>>>>>>>>>>>>>> can be the "bridge" for MM2 as well.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I don't have a strong opinion either way about this, but if we want to
> >>>>>>>>>>>>>> remove the original MirrorMaker, we have to deprecate it first, right?  Are
> >>>>>>>>>>>>>> we ready to do that?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> best,
> >>>>>>>>>>>>>> Colin
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Ryanne
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Mon, May 4, 2020, 5:05 PM Colin McCabe <cmccabe@apache.org> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> We've had a few proposals recently for incompatible changes.  One of
> >>>>>>>>>>>>>>>> them is my KIP-604: Remove ZooKeeper Flags from the Administrative
> >>>>>>>>>>>>>>>> Tools.  The other is Boyang's KIP-590: Redirect ZK Mutation
> >>>>>>>>>>>>>>>> Protocols to the Controller.  I think it's time to start thinking
> >>>>>>>>>>>>>>>> about Kafka 3.0. Specifically, I think we should move to 3.0 after
> >>>>>>>>>>>>>>>> the 2.6 release.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> From the perspective of KIP-500, in Kafka 3.x we'd like to make
> >>>>>>>>>>>>>>>> running in a ZooKeeper-less mode possible (but not yet the default.)
> >>>>>>>>>>>>>>>> This is the motivation behind KIP-590 and KIP-604, as well as some
> >>>>>>>>>>>>>>>> of the other KIPs we've done recently.  Since it will take some time
> >>>>>>>>>>>>>>>> to stabilize the new ZooKeeper-free Kafka code, we will hide it
> >>>>>>>>>>>>>>>> behind an option initially. (We'll have a KIP describing this all in
> >>>>>>>>>>>>>>>> detail soon.)
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> What does everyone think about having Kafka 3.0 come up next after
> >>>>>>>>>>>>>>>> 2.6? Are there any other things we should change in the 2.6 -> 3.0
> >>>>>>>>>>>>>>>> transition?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> best, Colin
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> -- Guozhang
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
>