Posted to dev@kafka.apache.org by Jeff Widman <je...@jeffwidman.com> on 2020/04/11 04:56:39 UTC

[DISCUSS] KIP-592: MirrorMaker should replicate topics from earliest

https://cwiki.apache.org/confluence/display/KAFKA/KIP-592%3A+MirrorMaker+should+replicate+topics+from+earliest

It's a relatively minor change, only one line of code. :-D



-- 

*Jeff Widman*
jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265)
<><

Re: [DISCUSS] KIP-592: MirrorMaker should replicate topics from earliest

Posted by Jeff Widman <je...@jeffwidman.com>.
Thanks for chiming in Ryanne.

Personally, I'd rather fix MM1 to have a default of not-skipping-messages,
as that seems more correct, rather than fix the MM2 "legacy" switch to
match the bad behavior.

Especially since this won't affect currently-running mirrormakers, so
there's not that much breaking change... it would only affect
newly-started mirrormakers.

Anyone else want to chime in on this? I'd like to drive to a decision soon,
so I will call for a vote in the next day or two.



On Mon, Apr 27, 2020 at 10:49 AM Ryanne Dolan <ry...@gmail.com> wrote:

> Conversely, we could consider making MM2 use "latest" in "legacy mode", and
> leave MM1 as it is? (Just thinking out loud.)
>
> Ryanne


-- 

*Jeff Widman*
jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265)
<><

Re: [DISCUSS] KIP-592: MirrorMaker should replicate topics from earliest

Posted by Gwen Shapira <gw...@confluent.io>.
Same.

There is a backward compatibility risk: we are changing the behavior, and
the assessment that it won't affect anyone may be correct, but there is a
risk that people depend on existing behavior in ways we didn't consider.

Remember that users yell on Twitter even when we make small and very
reasonable changes:
https://twitter.com/jessetanderson/status/1250095104779378690

Since we want to encourage users to move to a significantly better and
safer alternative, I just don't see how the compatibility risk is
justified.

Gwen


On Fri, May 8, 2020 at 4:08 PM Matthias J. Sax <mj...@apache.org> wrote:

> With the discussion about a 3.0 release and deprecating the old MM, I am
> wondering if it's worth doing anything.
>
> People should just switch to MM2, which has a better default.
>
> Thoughts?
>

-- 
Gwen Shapira
Engineering Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

Re: [DISCUSS] KIP-592: MirrorMaker should replicate topics from earliest

Posted by "Matthias J. Sax" <mj...@apache.org>.
With the discussion about a 3.0 release and deprecating the old MM, I am
wondering if it's worth doing anything.

People should just switch to MM2, which has a better default.

Thoughts?


On 4/27/20 10:48 AM, Ryanne Dolan wrote:
> Conversely, we could consider making MM2 use "latest" in "legacy mode", and
> leave MM1 as it is? (Just thinking out loud.)
> 
> Ryanne


Re: [DISCUSS] KIP-592: MirrorMaker should replicate topics from earliest

Posted by Ryanne Dolan <ry...@gmail.com>.
Conversely, we could consider making MM2 use "latest" in "legacy mode", and
leave MM1 as it is? (Just thinking out loud.)

Ryanne

On Mon, Apr 27, 2020 at 12:39 PM Jeff Widman <je...@jeffwidman.com> wrote:

> Good questions:
>
>
> *I agree that `auto.offset.reset="earliest"` would be a better default.
> However, I am a little worried about backward compatibility.*
>
> Keep in mind that existing mirrormaker instances will *not* be affected for
> topics they are currently consuming because they will already have saved
> offsets. This will only affect mirrormakers that start consuming new
> topics, for which they don't have a saved offset. In those cases, they will
> stop seeing data loss when they first start consuming. My guess is the
> majority of those new topics are going to be newly-created topics anyway,
> so most of the time starting from the earliest simply prevents skipping the
> first few seconds/minutes of data written to the topic.
>
> *What I am also wondering, though, is: does this only affect MirrorMaker or
> also MirrorMaker 2?*
>
> I checked and MM2 already sets `auto.offset.reset = 'earliest'`:
> <https://github.com/apache/kafka/blob/d63eaaaa0181bb7b9b4f5ed088abc00d7b32aeb0/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorConnectorConfig.java#L233>
>
> *Also, is it worth changing MirrorMaker now that MirrorMaker 2 is
> available?*
>
> Given that it's one line of code, doesn't affect existing instances, and
> prevents data loss on new regex subscriptions, I think it's worth
> setting... I basically view it as a bugfix rather than a feature change.
>
> I realize MM1 is deprecated, but there's still a lot of old mirrormakers
> running, so flipping this now will ease the future transition to MM2
> because it brings the behavior of MM1 in line with MM2.
>
> Thoughts?
>
>
> --
>
> *Jeff Widman*
> jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265)
> <><
>

Re: [DISCUSS] KIP-592: MirrorMaker should replicate topics from earliest

Posted by Jeff Widman <je...@jeffwidman.com>.
Good questions:


*I agree that `auto.offset.reset="earliest"` would be a better default.
However, I am a little worried about backward compatibility.*

Keep in mind that existing mirrormaker instances will *not* be affected for
topics they are currently consuming because they will already have saved
offsets. This will only affect mirrormakers that start consuming new
topics, for which they don't have a saved offset. In those cases, they will
stop seeing data loss when they first start consuming. My guess is the
majority of those new topics are going to be newly-created topics anyway,
so most of the time starting from the earliest simply prevents skipping the
first few seconds/minutes of data written to the topic.
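
The distinction above can be sketched in a few lines of Python. This is an illustrative model only, not MirrorMaker or Kafka client code: the function name `starting_offset` and its parameters are assumptions made for the example, and it deliberately ignores the case where a saved offset has fallen out of the log's range.

```python
from typing import Optional


def starting_offset(
    committed: Optional[int],
    log_start: int,
    log_end: int,
    reset_policy: str,
) -> int:
    """Return the offset a consumer would begin reading from (simplified model).

    committed    -- the group's saved offset for the partition, or None if absent
    log_start    -- offset of the oldest record still in the partition
    log_end      -- offset the next produced record will receive
    reset_policy -- value of auto.offset.reset: "earliest" or "latest"
    """
    # A saved offset always wins; the reset policy is not consulted.
    if committed is not None:
        return committed
    # No saved offset: the reset policy decides where to start.
    if reset_policy == "earliest":
        return log_start
    if reset_policy == "latest":
        return log_end
    raise ValueError(f"unsupported reset policy: {reset_policy!r}")


if __name__ == "__main__":
    # Topic the mirror already consumes: identical result under both policies.
    print(starting_offset(57, 0, 120, "latest"), starting_offset(57, 0, 120, "earliest"))
    # Newly matched topic with 120 records already written and no saved offset.
    print(starting_offset(None, 0, 120, "latest"), starting_offset(None, 0, 120, "earliest"))
```

In this model the policy only matters when `committed` is `None`, which is the new-topic case described above: with "latest" the records already in the log are never read, while with "earliest" they are.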

*What I am also wondering, though, is: does this only affect MirrorMaker or
also MirrorMaker 2?*

I checked and MM2 already sets `auto.offset.reset = 'earliest'`:
<https://github.com/apache/kafka/blob/d63eaaaa0181bb7b9b4f5ed088abc00d7b32aeb0/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorConnectorConfig.java#L233>

*Also, is it worth changing MirrorMaker now that MirrorMaker 2 is
available?*

Given that it's one line of code, doesn't affect existing instances, and
prevents data loss on new regex subscriptions, I think it's worth
setting... I basically view it as a bugfix rather than a feature change.

I realize MM1 is deprecated, but there's still a lot of old mirrormakers
running, so flipping this now will ease the future transition to MM2
because it brings the behavior of MM1 in line with MM2.

Thoughts?



On Sat, Apr 11, 2020 at 11:59 AM Matthias J. Sax <mj...@apache.org> wrote:

> Jeff,
>
> thanks for the KIP. I agree that `auto.offset.reset="earliest"` would be
> a better default. However, I am a little worried about backward
> compatibility. And even if the current default is not ideal, users can
> still change it.
>
> What I am also wondering, though, is: does this only affect MirrorMaker
> or also MirrorMaker 2? Also, is it worth changing MirrorMaker now that
> MirrorMaker 2 is available?
>
>
> -Matthias

-- 

*Jeff Widman*
jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265)
<><

Re: [DISCUSS] KIP-592: MirrorMaker should replicate topics from earliest

Posted by "Matthias J. Sax" <mj...@apache.org>.
Jeff,

thanks for the KIP. I agree that `auto.offset.reset="earliest"` would be
a better default. However, I am a little worried about backward
compatibility. And even if the current default is not ideal, users can
still change it.

What I am also wondering, though, is: does this only affect MirrorMaker
or also MirrorMaker 2? Also, is it worth changing MirrorMaker now that
MirrorMaker 2 is available?


-Matthias


On 4/10/20 9:56 PM, Jeff Widman wrote:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-592%3A+MirrorMaker+should+replicate+topics+from+earliest
> 
> It's a relatively minor change, only one line of code. :-D
> 
> 
>