You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Patrick Stuedi <ps...@confluent.io.INVALID> on 2021/07/19 11:22:12 UTC

[DISCUSS] KIP-763: Range queries with open endpoints

Hi everyone,

I would like to start the discussion for KIP-763: Range queries with open
endpoints.

The KIP can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-763%3A+Range+queries+with+open+endpoints

Any feedback will be highly appreciated.

Many thanks,
 Patrick

Re: [DISCUSS] KIP-763: Range queries with open endpoints

Posted by Patrick Stuedi <ps...@confluent.io.INVALID>.
Boyang,

Thanks for your comment. The concern about whether a null-based approach
increases the chances of hidden bugs is a valid concern. In the end, the
decision to go with this approach was guided by considering the effort and
implications of changing the API towards a new range interface, versus the
likelihood of such bugs happening. Changing an API is always a
major effort. On other hand, I think the risk of hidden bugs with the
null-based API are low, for two reasons. First, there should not be any
existing application developed for the old API passing null, as null
parameters did throw an exception in the old API. New applications passing
null without the intention to do an open endpoint range query are, I would
say, unlikely, and if they happen then it's clearly the application
developer's mistake who did not read the API documentation.

-Patrick

On Sat, Aug 28, 2021 at 8:24 AM Boyang Chen <re...@gmail.com>
wrote:

> Thanks Patrick for the KIP and sorry for being late to the party here. One
> thing I would like to understand is, what's the expected behavior when a
> user specifies a null key in previous versions? In the new version which
> uses null key for open-ended range queries, could it potentially hide bugs
> in the user's code when their intention was to do a bounded range query but
> accidentally set the key to null?
>
> Boyang
>
> On Wed, Jul 21, 2021 at 2:59 AM Patrick Stuedi
> <ps...@confluent.io.invalid>
> wrote:
>
> > Thanks John, Bruno for the valuable feedback!
> >
> > John: I had a quick look at the SessionStore and WindowStore interface.
> > While it looks like we should be able to apply similar ideas to those
> > stores, the actual interfaces are slightly different and expanding them
> for
> > open ranges may need a bit more thinking. In that sense, and to make sure
> > we will be converging with this KIP, I'd prefer to not expand the scope
> of
> > the KIP . As Bruno suggested we can always propose changes to the
> > SessionStore and WindowStore in a separate KIP. I'll add a subsection in
> > the rejected alternatives.
> >
> > @Bruno: good point, I'll add an example. Yes, all() will be equivalent to
> > range(null, null) and the implementation of all() will be a call to
> > range(null, null).
> >
> > -Patrick
> >
> > On Wed, Jul 21, 2021 at 9:12 AM Bruno Cadonna <ca...@apache.org>
> wrote:
> >
> > > Thanks for the KIP, Patrick!
> > >
> > > I agree with John that your KIP is well motivated.
> > >
> > > I have just one minor feedback. Could you add store.range(null, null)
> to
> > > the example snippets with the comment that this is equivalent to all()
> > > (it is equivalent, right?)? This question about equivalence between
> > > range(null, null) and all() came up in my mind when I read the KIP and
> I
> > > think I am not the only one.
> > >
> > > Regarding expanding the KIP to SessionStore and WindowStore, I think
> you
> > > should not expand scope of the KIP. We can do the changes to the
> > > SessionStore and WindowStore in a separate KIP. But it is your call!
> > >
> > >
> > > Best,
> > > Bruno
> > >
> > > On 21.07.21 00:18, John Roesler wrote:
> > > > Thanks for the KIP, Patrick!
> > > >
> > > > I think your KIP is well motivated. It's definitely a bummer
> > > > to have to iterate over the full store as a workaround for
> > > > open-ended ranges.
> > > >
> > > > I agree with your pragmatic approach here. We have recently
> > > > had a couple of other contributions to the store APIs
> > > > (prefix and reverseRange), and the complexity of adding
> > > > those new methods far exceeded what anyone expected. I'd be
> > > > in favor of being conservative with new store APIs until we
> > > > are able to reign in that complexity somehow. Since your
> > > > proposed semantics seem reasonable, I'm in favor.
> > > >
> > > > While reviewing your KIP, it struck me that we have several
> > > > range-like APIs on:
> > > > * SessionStore
> > > > (
> > >
> >
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlySessionStore.java
> > > )
> > > > * WindowStore
> > > > (
> > >
> >
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlyWindowStore.java
> > > )
> > > > as well.
> > > >
> > > > Do you think it would be a good idea to also propose a
> > > > similar change on those APIs? It might be nice (for exactly
> > > > the same reasons you documented), but it also increases the
> > > > scope of your work. There is one extra wrinkle I can see:
> > > > SessionStore has versions of the range methods that take a
> > > > `long` instead of an `Instant`, so `null` wouldn't work for
> > > > them.
> > > >
> > > > If you prefer to keep your KIP narrow in scope to just the
> > > > KeyValueStore, I'd also support you. In that case, it might
> > > > be a good idea to simply mention in the "Rejected
> > > > Alternatives" that you decided not to address those other
> > > > store APIs at this time. That way, people later on won't
> > > > have to wonder why those other methods didn't get updated.
> > > >
> > > > Other than that, I have no concerns!
> > > >
> > > > Thank you,
> > > > John
> > > >
> > > > On Mon, 2021-07-19 at 13:22 +0200, Patrick Stuedi wrote:
> > > >> Hi everyone,
> > > >>
> > > >> I would like to start the discussion for KIP-763: Range queries with
> > > open
> > > >> endpoints.
> > > >>
> > > >> The KIP can be found here:
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-763%3A+Range+queries+with+open+endpoints
> > > >>
> > > >> Any feedback will be highly appreciated.
> > > >>
> > > >> Many thanks,
> > > >>   Patrick
> > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-763: Range queries with open endpoints

Posted by Boyang Chen <re...@gmail.com>.
Thanks Patrick for the KIP and sorry for being late to the party here. One
thing I would like to understand is, what's the expected behavior when a
user specifies a null key in previous versions? In the new version which
uses null key for open-ended range queries, could it potentially hide bugs
in the user's code when their intention was to do a bounded range query but
accidentally set the key to null?

Boyang

On Wed, Jul 21, 2021 at 2:59 AM Patrick Stuedi <ps...@confluent.io.invalid>
wrote:

> Thanks John, Bruno for the valuable feedback!
>
> John: I had a quick look at the SessionStore and WindowStore interface.
> While it looks like we should be able to apply similar ideas to those
> stores, the actual interfaces are slightly different and expanding them for
> open ranges may need a bit more thinking. In that sense, and to make sure
> we will be converging with this KIP, I'd prefer to not expand the scope of
> the KIP . As Bruno suggested we can always propose changes to the
> SessionStore and WindowStore in a separate KIP. I'll add a subsection in
> the rejected alternatives.
>
> @Bruno: good point, I'll add an example. Yes, all() will be equivalent to
> range(null, null) and the implementation of all() will be a call to
> range(null, null).
>
> -Patrick
>
> On Wed, Jul 21, 2021 at 9:12 AM Bruno Cadonna <ca...@apache.org> wrote:
>
> > Thanks for the KIP, Patrick!
> >
> > I agree with John that your KIP is well motivated.
> >
> > I have just one minor feedback. Could you add store.range(null, null) to
> > the example snippets with the comment that this is equivalent to all()
> > (it is equivalent, right?)? This question about equivalence between
> > range(null, null) and all() came up in my mind when I read the KIP and I
> > think I am not the only one.
> >
> > Regarding expanding the KIP to SessionStore and WindowStore, I think you
> > should not expand scope of the KIP. We can do the changes to the
> > SessionStore and WindowStore in a separate KIP. But it is your call!
> >
> >
> > Best,
> > Bruno
> >
> > On 21.07.21 00:18, John Roesler wrote:
> > > Thanks for the KIP, Patrick!
> > >
> > > I think your KIP is well motivated. It's definitely a bummer
> > > to have to iterate over the full store as a workaround for
> > > open-ended ranges.
> > >
> > > I agree with your pragmatic approach here. We have recently
> > > had a couple of other contributions to the store APIs
> > > (prefix and reverseRange), and the complexity of adding
> > > those new methods far exceeded what anyone expected. I'd be
> > > in favor of being conservative with new store APIs until we
> > > are able to reign in that complexity somehow. Since your
> > > proposed semantics seem reasonable, I'm in favor.
> > >
> > > While reviewing your KIP, it struck me that we have several
> > > range-like APIs on:
> > > * SessionStore
> > > (
> >
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlySessionStore.java
> > )
> > > * WindowStore
> > > (
> >
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlyWindowStore.java
> > )
> > > as well.
> > >
> > > Do you think it would be a good idea to also propose a
> > > similar change on those APIs? It might be nice (for exactly
> > > the same reasons you documented), but it also increases the
> > > scope of your work. There is one extra wrinkle I can see:
> > > SessionStore has versions of the range methods that take a
> > > `long` instead of an `Instant`, so `null` wouldn't work for
> > > them.
> > >
> > > If you prefer to keep your KIP narrow in scope to just the
> > > KeyValueStore, I'd also support you. In that case, it might
> > > be a good idea to simply mention in the "Rejected
> > > Alternatives" that you decided not to address those other
> > > store APIs at this time. That way, people later on won't
> > > have to wonder why those other methods didn't get updated.
> > >
> > > Other than that, I have no concerns!
> > >
> > > Thank you,
> > > John
> > >
> > > On Mon, 2021-07-19 at 13:22 +0200, Patrick Stuedi wrote:
> > >> Hi everyone,
> > >>
> > >> I would like to start the discussion for KIP-763: Range queries with
> > open
> > >> endpoints.
> > >>
> > >> The KIP can be found here:
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-763%3A+Range+queries+with+open+endpoints
> > >>
> > >> Any feedback will be highly appreciated.
> > >>
> > >> Many thanks,
> > >>   Patrick
> > >
> > >
> >
>

Re: [DISCUSS] KIP-763: Range queries with open endpoints

Posted by Patrick Stuedi <ps...@confluent.io.INVALID>.
Thanks John, Bruno for the valuable feedback!

John: I had a quick look at the SessionStore and WindowStore interface.
While it looks like we should be able to apply similar ideas to those
stores, the actual interfaces are slightly different and expanding them for
open ranges may need a bit more thinking. In that sense, and to make sure
we will be converging with this KIP, I'd prefer to not expand the scope of
the KIP . As Bruno suggested we can always propose changes to the
SessionStore and WindowStore in a separate KIP. I'll add a subsection in
the rejected alternatives.

@Bruno: good point, I'll add an example. Yes, all() will be equivalent to
range(null, null) and the implementation of all() will be a call to
range(null, null).

-Patrick

On Wed, Jul 21, 2021 at 9:12 AM Bruno Cadonna <ca...@apache.org> wrote:

> Thanks for the KIP, Patrick!
>
> I agree with John that your KIP is well motivated.
>
> I have just one minor feedback. Could you add store.range(null, null) to
> the example snippets with the comment that this is equivalent to all()
> (it is equivalent, right?)? This question about equivalence between
> range(null, null) and all() came up in my mind when I read the KIP and I
> think I am not the only one.
>
> Regarding expanding the KIP to SessionStore and WindowStore, I think you
> should not expand scope of the KIP. We can do the changes to the
> SessionStore and WindowStore in a separate KIP. But it is your call!
>
>
> Best,
> Bruno
>
> On 21.07.21 00:18, John Roesler wrote:
> > Thanks for the KIP, Patrick!
> >
> > I think your KIP is well motivated. It's definitely a bummer
> > to have to iterate over the full store as a workaround for
> > open-ended ranges.
> >
> > I agree with your pragmatic approach here. We have recently
> > had a couple of other contributions to the store APIs
> > (prefix and reverseRange), and the complexity of adding
> > those new methods far exceeded what anyone expected. I'd be
> > in favor of being conservative with new store APIs until we
> > are able to reign in that complexity somehow. Since your
> > proposed semantics seem reasonable, I'm in favor.
> >
> > While reviewing your KIP, it struck me that we have several
> > range-like APIs on:
> > * SessionStore
> > (
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlySessionStore.java
> )
> > * WindowStore
> > (
> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlyWindowStore.java
> )
> > as well.
> >
> > Do you think it would be a good idea to also propose a
> > similar change on those APIs? It might be nice (for exactly
> > the same reasons you documented), but it also increases the
> > scope of your work. There is one extra wrinkle I can see:
> > SessionStore has versions of the range methods that take a
> > `long` instead of an `Instant`, so `null` wouldn't work for
> > them.
> >
> > If you prefer to keep your KIP narrow in scope to just the
> > KeyValueStore, I'd also support you. In that case, it might
> > be a good idea to simply mention in the "Rejected
> > Alternatives" that you decided not to address those other
> > store APIs at this time. That way, people later on won't
> > have to wonder why those other methods didn't get updated.
> >
> > Other than that, I have no concerns!
> >
> > Thank you,
> > John
> >
> > On Mon, 2021-07-19 at 13:22 +0200, Patrick Stuedi wrote:
> >> Hi everyone,
> >>
> >> I would like to start the discussion for KIP-763: Range queries with
> open
> >> endpoints.
> >>
> >> The KIP can be found here:
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-763%3A+Range+queries+with+open+endpoints
> >>
> >> Any feedback will be highly appreciated.
> >>
> >> Many thanks,
> >>   Patrick
> >
> >
>

Re: [DISCUSS] KIP-763: Range queries with open endpoints

Posted by Bruno Cadonna <ca...@apache.org>.
Thanks for the KIP, Patrick!

I agree with John that your KIP is well motivated.

I have just one minor feedback. Could you add store.range(null, null) to 
the example snippets with the comment that this is equivalent to all() 
(it is equivalent, right?)? This question about equivalence between 
range(null, null) and all() came up in my mind when I read the KIP and I 
think I am not the only one.

Regarding expanding the KIP to SessionStore and WindowStore, I think you 
should not expand scope of the KIP. We can do the changes to the 
SessionStore and WindowStore in a separate KIP. But it is your call!


Best,
Bruno

On 21.07.21 00:18, John Roesler wrote:
> Thanks for the KIP, Patrick!
> 
> I think your KIP is well motivated. It's definitely a bummer
> to have to iterate over the full store as a workaround for
> open-ended ranges.
> 
> I agree with your pragmatic approach here. We have recently
> had a couple of other contributions to the store APIs
> (prefix and reverseRange), and the complexity of adding
> those new methods far exceeded what anyone expected. I'd be
> in favor of being conservative with new store APIs until we
> are able to reign in that complexity somehow. Since your
> proposed semantics seem reasonable, I'm in favor.
> 
> While reviewing your KIP, it struck me that we have several
> range-like APIs on:
> * SessionStore
> (https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlySessionStore.java)
> * WindowStore
> (https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlyWindowStore.java)
> as well.
> 
> Do you think it would be a good idea to also propose a
> similar change on those APIs? It might be nice (for exactly
> the same reasons you documented), but it also increases the
> scope of your work. There is one extra wrinkle I can see:
> SessionStore has versions of the range methods that take a
> `long` instead of an `Instant`, so `null` wouldn't work for
> them.
> 
> If you prefer to keep your KIP narrow in scope to just the
> KeyValueStore, I'd also support you. In that case, it might
> be a good idea to simply mention in the "Rejected
> Alternatives" that you decided not to address those other
> store APIs at this time. That way, people later on won't
> have to wonder why those other methods didn't get updated.
> 
> Other than that, I have no concerns!
> 
> Thank you,
> John
> 
> On Mon, 2021-07-19 at 13:22 +0200, Patrick Stuedi wrote:
>> Hi everyone,
>>
>> I would like to start the discussion for KIP-763: Range queries with open
>> endpoints.
>>
>> The KIP can be found here:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-763%3A+Range+queries+with+open+endpoints
>>
>> Any feedback will be highly appreciated.
>>
>> Many thanks,
>>   Patrick
> 
> 

Re: [DISCUSS] KIP-763: Range queries with open endpoints

Posted by John Roesler <vv...@apache.org>.
Thanks for the KIP, Patrick!

I think your KIP is well motivated. It's definitely a bummer
to have to iterate over the full store as a workaround for
open-ended ranges.

I agree with your pragmatic approach here. We have recently
had a couple of other contributions to the store APIs
(prefix and reverseRange), and the complexity of adding
those new methods far exceeded what anyone expected. I'd be
in favor of being conservative with new store APIs until we
are able to reign in that complexity somehow. Since your
proposed semantics seem reasonable, I'm in favor.

While reviewing your KIP, it struck me that we have several
range-like APIs on:
* SessionStore
(https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlySessionStore.java)
* WindowStore
(https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/ReadOnlyWindowStore.java)
as well.

Do you think it would be a good idea to also propose a
similar change on those APIs? It might be nice (for exactly
the same reasons you documented), but it also increases the
scope of your work. There is one extra wrinkle I can see:
SessionStore has versions of the range methods that take a
`long` instead of an `Instant`, so `null` wouldn't work for
them.

If you prefer to keep your KIP narrow in scope to just the
KeyValueStore, I'd also support you. In that case, it might
be a good idea to simply mention in the "Rejected
Alternatives" that you decided not to address those other
store APIs at this time. That way, people later on won't
have to wonder why those other methods didn't get updated.

Other than that, I have no concerns!

Thank you,
John

On Mon, 2021-07-19 at 13:22 +0200, Patrick Stuedi wrote:
> Hi everyone,
> 
> I would like to start the discussion for KIP-763: Range queries with open
> endpoints.
> 
> The KIP can be found here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-763%3A+Range+queries+with+open+endpoints
> 
> Any feedback will be highly appreciated.
> 
> Many thanks,
>  Patrick