You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by John Roesler <vv...@apache.org> on 2020/08/02 00:30:16 UTC

Re: [DISCUSS] KIP-645: Replace abstract class Windows with a proper interface

Thanks for taking a look, Boyang!

Sure thing. JoinWindows is really just a way to specify how much of a buffer to maintain in order to support the API, namely how much of a timestamp delta should be joinable between the two streams. It’s not quite chopping the input streams up into Windows the way the windowed aggregations are. 

It probably seemed like a good choice at the time because we’re reusing the same internal state store as Windows. Initially, a lot of the configuration for that store was specified in Windows, which itself was a problem we have corrected by deprecating segments and retention. Now that Windows is just for configuring the window semantics and all store configuration is in the store builder, there’s much less overlap between Windows and JoinWindows. 

In fact, Windows has just three members now, windowsFor, size, and grace. Grace isn’t really related to the main semantic of Windows. To me, the fact that JoinWindows throws UnsupportedOperationException on one of the two primary methods of Windows is what made me feel so confident in stating that JoinWindows shouldn’t be in the Windows hierarchy. 

Does that seem reasonable?

Thanks,
John

On Fri, Jul 31, 2020, at 11:55, Boyang Chen wrote:
> Hey John,
> 
> in the Public interface section, you mentioned `Do not add the new
> interface to JoinWindows, which should not be part of this hierarchy`,
> could you explain a bit why and what's the plan with JoinWindows?
> 
> On Tue, Jul 28, 2020 at 12:50 PM Sophie Blee-Goldman <so...@confluent.io>
> wrote:
> 
> > Awesome, thanks for looking into the Window improvements as well!
> > (And I'm sorry I brought this upon you). I hope it's not too painful to get
> > everything in the Windows ecosystem looking good and reasonable,
> > and the benefits are pretty clear.
> >
> > Haven't had a careful look through the POC yet but the proposal and
> > public changes you described sound great. Thanks for the KIP!
> >
> > On Tue, Jul 28, 2020 at 10:27 AM John Roesler <vv...@apache.org> wrote:
> >
> > > Hi Sophie,
> > >
> > > A quick update: I've pushed a commit to the POC PR
> > > that includes the migration of Window to become a
> > > data class instead of an abstract class. It's quite a bit
> > > of code, but it does look like there is a clean
> > > deprecation/migration path.
> > >
> > > The basic idea is that we drop the "abstract" modifier
> > > from Window an deprecate its constructor. We add
> > > a public static `Window.withBounds(start,end)` to
> > > replace the constructor.
> > >
> > > Because the constructor is deprecated, existing
> > > subclasses will continue to compile, but receive
> > > a deprecation warning.
> > >
> > > We'd also slightly modify EnumerableWindowDefinition
> > > to _not_ be a parameterized class and instead
> > > update windowsFor like this:
> > > <W extends Window> Map<Long, W> windowsFor(final long timestamp)
> > >
> > > Then, any existing caller that expects to get back
> > > a subclass of windows during the deprecation period
> > > would still get a valid return. For example, if you
> > > had a custom window definition, and you
> > > invoke this method in your tests, assigning it to a
> > > subclass of Window, all your code would still compile,
> > > but you would get deprecation warnings.
> > >
> > > After the deprecation period, we could migrate Window
> > > to be a final class with a private constructor safely.
> > >
> > > If that sounds reasonable to you, I can update the
> > > KIP accordingly.
> > >
> > > Thanks,
> > > -John
> > >
> > > On Mon, Jul 27, 2020, at 22:11, John Roesler wrote:
> > > > Thanks for the reply, Sophie.
> > > >
> > > > Yes, I'd neglected to specify that Windows will implement maxSize()
> > > > by delegating to size(). It's updated now. I'd also neglected to say
> > that
> > > > I plan to alter both windowBy methods to use the new interface now.
> > > > Because Windows will implement the new interface, all implementations
> > > > will continue to work with windowBy.
> > > > So, yes, there are public methods changed, but no compatibility issues
> > > > arise. Existing implementations will only get a deprecation warning.
> > > >
> > > > The Window type is interesting. It actually seems to serve as just a
> > data
> > > > container. It almost doesn't need to be subclassed at all, since all
> > > > implementations would just need to store the start and end bounds.
> > > > As far as I can tell, the only purpose to subclass it is to implement
> > > > "overlap(Window other)", to tell if the window is both the same type
> > > > as and overlaps with the other window. But weirdly, this is unused
> > > > in the codebase.
> > > >
> > > > So one way we could go is to try and migrate it to just a final class,
> > > > effectively a tuple of `(start, end)`.
> > > >
> > > > However, another opportunity is to let it be a witness of the actual
> > type
> > > > of the window, so that you wouldn't be able to join a TimeWindow
> > > > with a SessionWindow, for example.
> > > >
> > > > However, because of covariance, it's more painful to change Window
> > > > than Windows, so it might not be worth it right now. If anything, it
> > > > would be more feasible to migrate toward the "simple data container"
> > > > approach. What are your thoughts?
> > > >
> > > > Thanks,
> > > > -John
> > > >
> > > >
> > > > On Mon, Jul 27, 2020, at 18:19, Sophie Blee-Goldman wrote:
> > > > > Thanks for taking the time to really fill in the background details
> > for
> > > > > this KIP.
> > > > > The Motivation section is very informative. Hopefully this will also
> > > serve
> > > > > as a
> > > > > warning against making similar such mistakes in the future :P
> > > > >
> > > > > I notice that the `Window` class that
> > > > > parametrizes EnumerableWindowDefinition
> > > > > is also abstract. Did you consider migrating that to an interface as
> > > well?
> > > > >
> > > > > Also, pretty awesome that we can solve the issue with varying fixed
> > > sized
> > > > > windows
> > > > > (eg calendar months) on the side. For users who may have already
> > > extended
> > > > > Windows,
> > > > > do you plan to just have Windows implement the new #maxSize method
> > and
> > > > > return the existing
> > > > > size until Windows gets removed?
> > > > >
> > > > > One final note: this seems to be implicitly implied throughout the
> > KIP
> > > but
> > > > > just to be clear,
> > > > > you will be replacing any DSL methods that accept Windows with
> > > identical
> > > > > DSL methods
> > > > > that take the new EnumerableWindowDefinition as argument. So there is
> > > > > nothing deprecated
> > > > > and nothing added, but there are public methods changed. Is that
> > right?
> > > > >
> > > > > On Sun, Jul 26, 2020 at 1:23 PM John Roesler <vv...@apache.org>
> > > wrote:
> > > > >
> > > > > > Thanks Sophie and Boyang for asking for specifics.
> > > > > >
> > > > > > As far as I can tell, if we were to _remove_ all the
> > > non-public-method
> > > > > > members from Windows, including any constructors, we are left with
> > > > > > effectively an interface. I think this was my plan before. I don't
> > > think
> > > > > > I realized at the time that it's possible to replace the entire
> > > class with
> > > > > > an interface. Now I realize it is possible, hence the motivation to
> > > do it.
> > > > > >
> > > > > > We can choose either to maintain forever the tech debt of having to
> > > > > > enforce that Windows look, sound, and act just like an interface,
> > or
> > > we
> > > > > > can just replace it with an interface and be done with it. I.e.,
> > the
> > > > > > motivation is less maintenence burden for us and for users.
> > > > > >
> > > > > > Coincidentally, I had an interesting conversation with Matthias
> > about
> > > > > > this interface, and he made me realize that "fixed size" isn't the
> > > > > > essential
> > > > > > nature of this entity. Instead being enumerable is. Reframing the
> > > interface
> > > > > > in this way will enable us to implement variable sized windows like
> > > > > > MonthlyWindows.
> > > > > >
> > > > > > So, now there are two good reasons to vote for this KIP :)
> > > > > >
> > > > > > Anyway, I've updated the proposed interface and the motivation.
> > > > > >
> > > > > > To Sophie latter question, all of the necessary deprecation is
> > listed
> > > > > > in the KIP. We do not have to deprecate any windowBy methods.
> > > > > >
> > > > > > Thanks,
> > > > > > -John
> > > > > >
> > > > > > On Sat, Jul 25, 2020, at 00:52, Boyang Chen wrote:
> > > > > > > Thanks for the KIP John. I share a similar concern with the
> > > motivation,
> > > > > > it
> > > > > > > would be good if you could cast light on the actual downside of
> > > using a
> > > > > > > base class vs the interface, is it making the code fragile, or
> > > requiring
> > > > > > > redundant implementation, etc.
> > > > > > >
> > > > > > > Boyang
> > > > > > >
> > > > > > > On Tue, Jul 21, 2020 at 2:19 PM Sophie Blee-Goldman <
> > > sophie@confluent.io
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hey John,
> > > > > > > >
> > > > > > > > Thanks for the KIP. I know this has been bugging you :)
> > > > > > > >
> > > > > > > > That said, I think the KIP is missing some elaboration in the
> > > > > > Motivation
> > > > > > > > section.
> > > > > > > > You mention a number of problems we've had and lived with in
> > the
> > > past
> > > > > > --
> > > > > > > > could
> > > > > > > > you give an example of one, and how it would be solved by your
> > > > > > proposal?
> > > > > > > >
> > > > > > > > By the way, I assume we would also need to deprecate all APIs
> > > that
> > > > > > accept a
> > > > > > > > Windows
> > > > > > > > parameter in favor of new ones that accept a
> > > > > > FixedSizeWindowDefinition? Off
> > > > > > > > the
> > > > > > > > top of my head that would be the windowedBy methods in
> > > KGroupedStream
> > > > > > and
> > > > > > > > CogroupedKStream
> > > > > > > >
> > > > > > > > On Tue, Jul 21, 2020 at 1:46 PM John Roesler <
> > > vvcephei@apache.org>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello all,
> > > > > > > > >
> > > > > > > > > I'd like to propose KIP-645, to correct a small API mistake
> > in
> > > > > > Streams.
> > > > > > > > > Fixing this now allows us to avoid perpetuating the mistake
> > in
> > > new
> > > > > > work.
> > > > > > > > > For example, it will allow us to implement KIP-450 cleanly.
> > > > > > > > >
> > > > > > > > > The change itself should be seamless for users.
> > > > > > > > >
> > > > > > > > > Please see https://cwiki.apache.org/confluence/x/6SN4CQ for
> > > details.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > -John
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
>