You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by Luke Chen <sh...@gmail.com> on 2021/12/02 02:48:07 UTC

[DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Hi devs,

I'd like to propose a KIP to allow users to set default store
implementation class (built-in RocksDB/InMemory, or custom one), and
default to RocksDB state store, to keep backward compatibility.

Detailed description can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store

Any feedback and comments are welcome.

Thank you.
Luke

Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Posted by Luke Chen <sh...@gmail.com>.
Hi all,

If there are no other comments, I'll start to vote tomorrow.

Thank you.
Luke

On Sat, Jan 15, 2022 at 10:47 AM Luke Chen <sh...@gmail.com> wrote:

> Hi Matthias,
>
> Thanks for the comments.
>
> 1. The config name: `default.dsl.store` looks good to me. Updated!
> 2. the KIP still contains a view case of the originally proposed
> config name `default.dsl.store.type` which should be updated.
> --> Updated. Thanks.
> 3. About `TopologyConfig`: we should add all public methods of this class,
> including constructors.
> --> Updated
>
> Thank you.
> Luke
>
>
> On Thu, Jan 13, 2022 at 11:48 AM Matthias J. Sax <mj...@apache.org> wrote:
>
>> Thanks for the KIP!
>>
>> I think it's a good step forward for the DSL and it makes sense to
>> exclude the PAPI and custom stores for now.
>>
>> About the config name, it seems to be overly complicated. Guozhang's
>> argument about "store type" that is used to refer to kv, windowed,
>> session stores make sense. But having "type" and "impl" in the name
>> sounds clumsy to me. What about a simplified name:
>>
>>    default.dsl.store.impl
>>
>> or maybe even better
>>
>>    default.dsl.store
>>
>> for the config?
>>
>> Btw: the KIP still contains a view case of the originally proposed
>> config name `default.dsl.store.type` which should be updated.
>>
>>
>> About `TopologyConfig`: we should add all public methods of this class,
>> including constructors.
>>
>>
>> -Matthias
>>
>> On 12/22/21 4:54 AM, Luke Chen wrote:
>> > Hi Guozhang,
>> >
>> > Thanks for the comments.
>> > And I agree it's better to rename it to `default.dsl.store.impl.type`
>> for
>> > differentiation.
>> > I've updated the KIP.
>> >
>> > Thank you.
>> > Luke
>> >
>> >
>> > On Wed, Dec 22, 2021 at 3:12 AM Guozhang Wang <wa...@gmail.com>
>> wrote:
>> >
>> >> Thanks Luke, I do not have any major comments on the wiki any more. BTW
>> >> thanks for making the "public StreamsBuilder(final TopologyConfig
>> >> topologyConfigs)" API public now, I think it will benefit a lot of
>> future
>> >> works!
>> >>
>> >> I think with the new API, we can deprecate the `build(props)` function
>> in
>> >> StreamsBuilder now, and will file a separate JIRA for it.
>> >>
>> >> Just a few nits:
>> >>
>> >> 1) There seems to be a typo at the end: "ROCK_DB"
>> >> 2) Sometimes people refer to "store type" as kv-store, window-store
>> etc;
>> >> maybe we can differentiate them a bit by calling the new API names
>> >> `StoreImplType`,
>> >> `default.dsl.store.impl.type` and `The default store implementation
>> type
>> >> used by DSL operators`.
>> >>
>> >> On Thu, Dec 16, 2021 at 2:29 AM Luke Chen <sh...@gmail.com> wrote:
>> >>
>> >>> Hi Guozhang,
>> >>>
>> >>> I've updated the KIP to use `enum`, instead of store implementation,
>> and
>> >>> some content accordingly.
>> >>> Please let me know if there's other comments.
>> >>>
>> >>>
>> >>> Thank you.
>> >>> Luke
>> >>>
>> >>> On Sun, Dec 12, 2021 at 3:55 PM Luke Chen <sh...@gmail.com> wrote:
>> >>>
>> >>>> Hi Guozhang,
>> >>>>
>> >>>> Thanks for your comments.
>> >>>> I agree that in the KIP, there's a trade-off regarding the API
>> >>> complexity.
>> >>>> With the store impl, we can support default custom stores, but
>> >> introduce
>> >>>> more complexity for users, while with the enum types, users can
>> >> configure
>> >>>> default built-in store types easily, but it can't work for custom
>> >> stores.
>> >>>>
>> >>>> For me, I'm OK to narrow down the scope and introduce the default
>> >>> built-in
>> >>>> enum store types first.
>> >>>> And if there's further request, we can consider a better way to
>> support
>> >>>> default store impl.
>> >>>>
>> >>>> I'll update the KIP next week, unless there are other opinions from
>> >> other
>> >>>> members.
>> >>>>
>> >>>> Thank you.
>> >>>> Luke
>> >>>>
>> >>>> On Fri, Dec 10, 2021 at 6:33 AM Guozhang Wang <wa...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>>> Thanks Luke for the updated KIP.
>> >>>>>
>> >>>>> One major change the new proposal has it to change the original enum
>> >>> store
>> >>>>> type with a new interface. Where in the enum approach our internal
>> >>>>> implementations would be something like:
>> >>>>>
>> >>>>> "
>> >>>>> Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
>> >>>>> Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
>> >>>>> Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
>> >>>>> "
>> >>>>>
>> >>>>> And inside the impl classes like here we would could directly do:
>> >>>>>
>> >>>>> "
>> >>>>> if ((supplier = materialized.storeSupplier) == null) {
>> >>>>>      supplier =
>> >>>>> Stores.windowBytesStoreSupplier(materialized.storeImplType())
>> >>>>> }
>> >>>>> "
>> >>>>>
>> >>>>> While I understand the benefit of having an interface such that user
>> >>>>> customized stores could be used as the default store types as well,
>> >>>>> there's
>> >>>>> a trade-off I feel regarding the API complexity. Since with this
>> >>> approach,
>> >>>>> our API complexity granularity is in the order of "number of impl
>> >>> types" *
>> >>>>> "number of store types". This means that whenever we add new store
>> >> types
>> >>>>> in
>> >>>>> the future, this API needs to be augmented and customized impl needs
>> >> to
>> >>> be
>> >>>>> updated to support the new store types, in addition, not all custom
>> >> impl
>> >>>>> types may support all store types, but with this interface they are
>> >>> forced
>> >>>>> to either support all or explicitly throw un-supported exceptions.
>> >>>>>
>> >>>>> The way I see a default impl type is that, they would be safe to use
>> >> for
>> >>>>> any store types, and since store types are evolved by the library
>> >>> itself,
>> >>>>> the default impls would better be controlled by the library as well.
>> >>>>> Custom
>> >>>>> impl classes can choose to replace some of the stores explicitly,
>> but
>> >>> may
>> >>>>> not be a best fit as the default impl classes --- if there are in
>> the
>> >>>>> future, one way we can consider is to make Stores class extensible
>> >> along
>> >>>>> with the enum so that advanced users can add more default impls,
>> >>> assuming
>> >>>>> such scenarios are not very common.
>> >>>>>
>> >>>>> So I'm personally still a bit learning towards the enum approach
>> with
>> >> a
>> >>>>> narrower scope, for its simplicity as an API and also its low
>> >>> maintenance
>> >>>>> cost in the future. Let me know what do you think?
>> >>>>>
>> >>>>>
>> >>>>> Guozhang
>> >>>>>
>> >>>>>
>> >>>>> On Wed, Dec 1, 2021 at 6:48 PM Luke Chen <sh...@gmail.com> wrote:
>> >>>>>
>> >>>>>> Hi devs,
>> >>>>>>
>> >>>>>> I'd like to propose a KIP to allow users to set default store
>> >>>>>> implementation class (built-in RocksDB/InMemory, or custom one),
>> and
>> >>>>>> default to RocksDB state store, to keep backward compatibility.
>> >>>>>>
>> >>>>>> Detailed description can be found here:
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>
>> >>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
>> >>>>>>
>> >>>>>> Any feedback and comments are welcome.
>> >>>>>>
>> >>>>>> Thank you.
>> >>>>>> Luke
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> -- Guozhang
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >>
>> >> --
>> >> -- Guozhang
>> >>
>> >
>>
>

Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Posted by Luke Chen <sh...@gmail.com>.
Hi Matthias,

Thanks for the comments.

1. The config name: `default.dsl.store` looks good to me. Updated!
2. the KIP still contains a view case of the originally proposed
config name `default.dsl.store.type` which should be updated.
--> Updated. Thanks.
3. About `TopologyConfig`: we should add all public methods of this class,
including constructors.
--> Updated

Thank you.
Luke


On Thu, Jan 13, 2022 at 11:48 AM Matthias J. Sax <mj...@apache.org> wrote:

> Thanks for the KIP!
>
> I think it's a good step forward for the DSL and it makes sense to
> exclude the PAPI and custom stores for now.
>
> About the config name, it seems to be overly complicated. Guozhang's
> argument about "store type" that is used to refer to kv, windowed,
> session stores make sense. But having "type" and "impl" in the name
> sounds clumsy to me. What about a simplified name:
>
>    default.dsl.store.impl
>
> or maybe even better
>
>    default.dsl.store
>
> for the config?
>
> Btw: the KIP still contains a view case of the originally proposed
> config name `default.dsl.store.type` which should be updated.
>
>
> About `TopologyConfig`: we should add all public methods of this class,
> including constructors.
>
>
> -Matthias
>
> On 12/22/21 4:54 AM, Luke Chen wrote:
> > Hi Guozhang,
> >
> > Thanks for the comments.
> > And I agree it's better to rename it to `default.dsl.store.impl.type` for
> > differentiation.
> > I've updated the KIP.
> >
> > Thank you.
> > Luke
> >
> >
> > On Wed, Dec 22, 2021 at 3:12 AM Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> >> Thanks Luke, I do not have any major comments on the wiki any more. BTW
> >> thanks for making the "public StreamsBuilder(final TopologyConfig
> >> topologyConfigs)" API public now, I think it will benefit a lot of
> future
> >> works!
> >>
> >> I think with the new API, we can deprecate the `build(props)` function
> in
> >> StreamsBuilder now, and will file a separate JIRA for it.
> >>
> >> Just a few nits:
> >>
> >> 1) There seems to be a typo at the end: "ROCK_DB"
> >> 2) Sometimes people refer to "store type" as kv-store, window-store etc;
> >> maybe we can differentiate them a bit by calling the new API names
> >> `StoreImplType`,
> >> `default.dsl.store.impl.type` and `The default store implementation type
> >> used by DSL operators`.
> >>
> >> On Thu, Dec 16, 2021 at 2:29 AM Luke Chen <sh...@gmail.com> wrote:
> >>
> >>> Hi Guozhang,
> >>>
> >>> I've updated the KIP to use `enum`, instead of store implementation,
> and
> >>> some content accordingly.
> >>> Please let me know if there's other comments.
> >>>
> >>>
> >>> Thank you.
> >>> Luke
> >>>
> >>> On Sun, Dec 12, 2021 at 3:55 PM Luke Chen <sh...@gmail.com> wrote:
> >>>
> >>>> Hi Guozhang,
> >>>>
> >>>> Thanks for your comments.
> >>>> I agree that in the KIP, there's a trade-off regarding the API
> >>> complexity.
> >>>> With the store impl, we can support default custom stores, but
> >> introduce
> >>>> more complexity for users, while with the enum types, users can
> >> configure
> >>>> default built-in store types easily, but it can't work for custom
> >> stores.
> >>>>
> >>>> For me, I'm OK to narrow down the scope and introduce the default
> >>> built-in
> >>>> enum store types first.
> >>>> And if there's further request, we can consider a better way to
> support
> >>>> default store impl.
> >>>>
> >>>> I'll update the KIP next week, unless there are other opinions from
> >> other
> >>>> members.
> >>>>
> >>>> Thank you.
> >>>> Luke
> >>>>
> >>>> On Fri, Dec 10, 2021 at 6:33 AM Guozhang Wang <wa...@gmail.com>
> >>> wrote:
> >>>>
> >>>>> Thanks Luke for the updated KIP.
> >>>>>
> >>>>> One major change the new proposal has it to change the original enum
> >>> store
> >>>>> type with a new interface. Where in the enum approach our internal
> >>>>> implementations would be something like:
> >>>>>
> >>>>> "
> >>>>> Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
> >>>>> Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
> >>>>> Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
> >>>>> "
> >>>>>
> >>>>> And inside the impl classes like here we would could directly do:
> >>>>>
> >>>>> "
> >>>>> if ((supplier = materialized.storeSupplier) == null) {
> >>>>>      supplier =
> >>>>> Stores.windowBytesStoreSupplier(materialized.storeImplType())
> >>>>> }
> >>>>> "
> >>>>>
> >>>>> While I understand the benefit of having an interface such that user
> >>>>> customized stores could be used as the default store types as well,
> >>>>> there's
> >>>>> a trade-off I feel regarding the API complexity. Since with this
> >>> approach,
> >>>>> our API complexity granularity is in the order of "number of impl
> >>> types" *
> >>>>> "number of store types". This means that whenever we add new store
> >> types
> >>>>> in
> >>>>> the future, this API needs to be augmented and customized impl needs
> >> to
> >>> be
> >>>>> updated to support the new store types, in addition, not all custom
> >> impl
> >>>>> types may support all store types, but with this interface they are
> >>> forced
> >>>>> to either support all or explicitly throw un-supported exceptions.
> >>>>>
> >>>>> The way I see a default impl type is that, they would be safe to use
> >> for
> >>>>> any store types, and since store types are evolved by the library
> >>> itself,
> >>>>> the default impls would better be controlled by the library as well.
> >>>>> Custom
> >>>>> impl classes can choose to replace some of the stores explicitly, but
> >>> may
> >>>>> not be a best fit as the default impl classes --- if there are in the
> >>>>> future, one way we can consider is to make Stores class extensible
> >> along
> >>>>> with the enum so that advanced users can add more default impls,
> >>> assuming
> >>>>> such scenarios are not very common.
> >>>>>
> >>>>> So I'm personally still a bit learning towards the enum approach with
> >> a
> >>>>> narrower scope, for its simplicity as an API and also its low
> >>> maintenance
> >>>>> cost in the future. Let me know what do you think?
> >>>>>
> >>>>>
> >>>>> Guozhang
> >>>>>
> >>>>>
> >>>>> On Wed, Dec 1, 2021 at 6:48 PM Luke Chen <sh...@gmail.com> wrote:
> >>>>>
> >>>>>> Hi devs,
> >>>>>>
> >>>>>> I'd like to propose a KIP to allow users to set default store
> >>>>>> implementation class (built-in RocksDB/InMemory, or custom one), and
> >>>>>> default to RocksDB state store, to keep backward compatibility.
> >>>>>>
> >>>>>> Detailed description can be found here:
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
> >>>>>>
> >>>>>> Any feedback and comments are welcome.
> >>>>>>
> >>>>>> Thank you.
> >>>>>> Luke
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> -- Guozhang
> >>>>>
> >>>>
> >>>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>

Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Posted by "Matthias J. Sax" <mj...@apache.org>.
Thanks for the KIP!

I think it's a good step forward for the DSL and it makes sense to 
exclude the PAPI and custom stores for now.

About the config name, it seems to be overly complicated. Guozhang's 
argument about "store type" that is used to refer to kv, windowed, 
session stores make sense. But having "type" and "impl" in the name 
sounds clumsy to me. What about a simplified name:

   default.dsl.store.impl

or maybe even better

   default.dsl.store

for the config?

Btw: the KIP still contains a view case of the originally proposed 
config name `default.dsl.store.type` which should be updated.


About `TopologyConfig`: we should add all public methods of this class, 
including constructors.


-Matthias

On 12/22/21 4:54 AM, Luke Chen wrote:
> Hi Guozhang,
> 
> Thanks for the comments.
> And I agree it's better to rename it to `default.dsl.store.impl.type` for
> differentiation.
> I've updated the KIP.
> 
> Thank you.
> Luke
> 
> 
> On Wed, Dec 22, 2021 at 3:12 AM Guozhang Wang <wa...@gmail.com> wrote:
> 
>> Thanks Luke, I do not have any major comments on the wiki any more. BTW
>> thanks for making the "public StreamsBuilder(final TopologyConfig
>> topologyConfigs)" API public now, I think it will benefit a lot of future
>> works!
>>
>> I think with the new API, we can deprecate the `build(props)` function in
>> StreamsBuilder now, and will file a separate JIRA for it.
>>
>> Just a few nits:
>>
>> 1) There seems to be a typo at the end: "ROCK_DB"
>> 2) Sometimes people refer to "store type" as kv-store, window-store etc;
>> maybe we can differentiate them a bit by calling the new API names
>> `StoreImplType`,
>> `default.dsl.store.impl.type` and `The default store implementation type
>> used by DSL operators`.
>>
>> On Thu, Dec 16, 2021 at 2:29 AM Luke Chen <sh...@gmail.com> wrote:
>>
>>> Hi Guozhang,
>>>
>>> I've updated the KIP to use `enum`, instead of store implementation, and
>>> some content accordingly.
>>> Please let me know if there's other comments.
>>>
>>>
>>> Thank you.
>>> Luke
>>>
>>> On Sun, Dec 12, 2021 at 3:55 PM Luke Chen <sh...@gmail.com> wrote:
>>>
>>>> Hi Guozhang,
>>>>
>>>> Thanks for your comments.
>>>> I agree that in the KIP, there's a trade-off regarding the API
>>> complexity.
>>>> With the store impl, we can support default custom stores, but
>> introduce
>>>> more complexity for users, while with the enum types, users can
>> configure
>>>> default built-in store types easily, but it can't work for custom
>> stores.
>>>>
>>>> For me, I'm OK to narrow down the scope and introduce the default
>>> built-in
>>>> enum store types first.
>>>> And if there's further request, we can consider a better way to support
>>>> default store impl.
>>>>
>>>> I'll update the KIP next week, unless there are other opinions from
>> other
>>>> members.
>>>>
>>>> Thank you.
>>>> Luke
>>>>
>>>> On Fri, Dec 10, 2021 at 6:33 AM Guozhang Wang <wa...@gmail.com>
>>> wrote:
>>>>
>>>>> Thanks Luke for the updated KIP.
>>>>>
>>>>> One major change the new proposal has it to change the original enum
>>> store
>>>>> type with a new interface. Where in the enum approach our internal
>>>>> implementations would be something like:
>>>>>
>>>>> "
>>>>> Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
>>>>> Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
>>>>> Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
>>>>> "
>>>>>
>>>>> And inside the impl classes like here we would could directly do:
>>>>>
>>>>> "
>>>>> if ((supplier = materialized.storeSupplier) == null) {
>>>>>      supplier =
>>>>> Stores.windowBytesStoreSupplier(materialized.storeImplType())
>>>>> }
>>>>> "
>>>>>
>>>>> While I understand the benefit of having an interface such that user
>>>>> customized stores could be used as the default store types as well,
>>>>> there's
>>>>> a trade-off I feel regarding the API complexity. Since with this
>>> approach,
>>>>> our API complexity granularity is in the order of "number of impl
>>> types" *
>>>>> "number of store types". This means that whenever we add new store
>> types
>>>>> in
>>>>> the future, this API needs to be augmented and customized impl needs
>> to
>>> be
>>>>> updated to support the new store types, in addition, not all custom
>> impl
>>>>> types may support all store types, but with this interface they are
>>> forced
>>>>> to either support all or explicitly throw un-supported exceptions.
>>>>>
>>>>> The way I see a default impl type is that, they would be safe to use
>> for
>>>>> any store types, and since store types are evolved by the library
>>> itself,
>>>>> the default impls would better be controlled by the library as well.
>>>>> Custom
>>>>> impl classes can choose to replace some of the stores explicitly, but
>>> may
>>>>> not be a best fit as the default impl classes --- if there are in the
>>>>> future, one way we can consider is to make Stores class extensible
>> along
>>>>> with the enum so that advanced users can add more default impls,
>>> assuming
>>>>> such scenarios are not very common.
>>>>>
>>>>> So I'm personally still a bit learning towards the enum approach with
>> a
>>>>> narrower scope, for its simplicity as an API and also its low
>>> maintenance
>>>>> cost in the future. Let me know what do you think?
>>>>>
>>>>>
>>>>> Guozhang
>>>>>
>>>>>
>>>>> On Wed, Dec 1, 2021 at 6:48 PM Luke Chen <sh...@gmail.com> wrote:
>>>>>
>>>>>> Hi devs,
>>>>>>
>>>>>> I'd like to propose a KIP to allow users to set default store
>>>>>> implementation class (built-in RocksDB/InMemory, or custom one), and
>>>>>> default to RocksDB state store, to keep backward compatibility.
>>>>>>
>>>>>> Detailed description can be found here:
>>>>>>
>>>>>>
>>>>>
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
>>>>>>
>>>>>> Any feedback and comments are welcome.
>>>>>>
>>>>>> Thank you.
>>>>>> Luke
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -- Guozhang
>>>>>
>>>>
>>>
>>
>>
>> --
>> -- Guozhang
>>
> 

Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Posted by Luke Chen <sh...@gmail.com>.
Hi Guozhang,

Thanks for the comments.
And I agree it's better to rename it to `default.dsl.store.impl.type` for
differentiation.
I've updated the KIP.

Thank you.
Luke


On Wed, Dec 22, 2021 at 3:12 AM Guozhang Wang <wa...@gmail.com> wrote:

> Thanks Luke, I do not have any major comments on the wiki any more. BTW
> thanks for making the "public StreamsBuilder(final TopologyConfig
> topologyConfigs)" API public now, I think it will benefit a lot of future
> works!
>
> I think with the new API, we can deprecate the `build(props)` function in
> StreamsBuilder now, and will file a separate JIRA for it.
>
> Just a few nits:
>
> 1) There seems to be a typo at the end: "ROCK_DB"
> 2) Sometimes people refer to "store type" as kv-store, window-store etc;
> maybe we can differentiate them a bit by calling the new API names
> `StoreImplType`,
> `default.dsl.store.impl.type` and `The default store implementation type
> used by DSL operators`.
>
> On Thu, Dec 16, 2021 at 2:29 AM Luke Chen <sh...@gmail.com> wrote:
>
> > Hi Guozhang,
> >
> > I've updated the KIP to use `enum`, instead of store implementation, and
> > some content accordingly.
> > Please let me know if there's other comments.
> >
> >
> > Thank you.
> > Luke
> >
> > On Sun, Dec 12, 2021 at 3:55 PM Luke Chen <sh...@gmail.com> wrote:
> >
> > > Hi Guozhang,
> > >
> > > Thanks for your comments.
> > > I agree that in the KIP, there's a trade-off regarding the API
> > complexity.
> > > With the store impl, we can support default custom stores, but
> introduce
> > > more complexity for users, while with the enum types, users can
> configure
> > > default built-in store types easily, but it can't work for custom
> stores.
> > >
> > > For me, I'm OK to narrow down the scope and introduce the default
> > built-in
> > > enum store types first.
> > > And if there's further request, we can consider a better way to support
> > > default store impl.
> > >
> > > I'll update the KIP next week, unless there are other opinions from
> other
> > > members.
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Fri, Dec 10, 2021 at 6:33 AM Guozhang Wang <wa...@gmail.com>
> > wrote:
> > >
> > >> Thanks Luke for the updated KIP.
> > >>
> > >> One major change the new proposal has it to change the original enum
> > store
> > >> type with a new interface. Where in the enum approach our internal
> > >> implementations would be something like:
> > >>
> > >> "
> > >> Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
> > >> Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
> > >> Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
> > >> "
> > >>
> > >> And inside the impl classes like here we would could directly do:
> > >>
> > >> "
> > >> if ((supplier = materialized.storeSupplier) == null) {
> > >>     supplier =
> > >> Stores.windowBytesStoreSupplier(materialized.storeImplType())
> > >> }
> > >> "
> > >>
> > >> While I understand the benefit of having an interface such that user
> > >> customized stores could be used as the default store types as well,
> > >> there's
> > >> a trade-off I feel regarding the API complexity. Since with this
> > approach,
> > >> our API complexity granularity is in the order of "number of impl
> > types" *
> > >> "number of store types". This means that whenever we add new store
> types
> > >> in
> > >> the future, this API needs to be augmented and customized impl needs
> to
> > be
> > >> updated to support the new store types, in addition, not all custom
> impl
> > >> types may support all store types, but with this interface they are
> > forced
> > >> to either support all or explicitly throw un-supported exceptions.
> > >>
> > >> The way I see a default impl type is that, they would be safe to use
> for
> > >> any store types, and since store types are evolved by the library
> > itself,
> > >> the default impls would better be controlled by the library as well.
> > >> Custom
> > >> impl classes can choose to replace some of the stores explicitly, but
> > may
> > >> not be a best fit as the default impl classes --- if there are in the
> > >> future, one way we can consider is to make Stores class extensible
> along
> > >> with the enum so that advanced users can add more default impls,
> > assuming
> > >> such scenarios are not very common.
> > >>
> > >> So I'm personally still a bit learning towards the enum approach with
> a
> > >> narrower scope, for its simplicity as an API and also its low
> > maintenance
> > >> cost in the future. Let me know what do you think?
> > >>
> > >>
> > >> Guozhang
> > >>
> > >>
> > >> On Wed, Dec 1, 2021 at 6:48 PM Luke Chen <sh...@gmail.com> wrote:
> > >>
> > >> > Hi devs,
> > >> >
> > >> > I'd like to propose a KIP to allow users to set default store
> > >> > implementation class (built-in RocksDB/InMemory, or custom one), and
> > >> > default to RocksDB state store, to keep backward compatibility.
> > >> >
> > >> > Detailed description can be found here:
> > >> >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
> > >> >
> > >> > Any feedback and comments are welcome.
> > >> >
> > >> > Thank you.
> > >> > Luke
> > >> >
> > >>
> > >>
> > >> --
> > >> -- Guozhang
> > >>
> > >
> >
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Posted by Guozhang Wang <wa...@gmail.com>.
Thanks Luke, I do not have any major comments on the wiki any more. BTW
thanks for making the "public StreamsBuilder(final TopologyConfig
topologyConfigs)" API public now, I think it will benefit a lot of future
works!

I think with the new API, we can deprecate the `build(props)` function in
StreamsBuilder now, and will file a separate JIRA for it.

Just a few nits:

1) There seems to be a typo at the end: "ROCK_DB"
2) Sometimes people refer to "store type" as kv-store, window-store etc;
maybe we can differentiate them a bit by calling the new API names
`StoreImplType`,
`default.dsl.store.impl.type` and `The default store implementation type
used by DSL operators`.

On Thu, Dec 16, 2021 at 2:29 AM Luke Chen <sh...@gmail.com> wrote:

> Hi Guozhang,
>
> I've updated the KIP to use `enum`, instead of store implementation, and
> some content accordingly.
> Please let me know if there's other comments.
>
>
> Thank you.
> Luke
>
> On Sun, Dec 12, 2021 at 3:55 PM Luke Chen <sh...@gmail.com> wrote:
>
> > Hi Guozhang,
> >
> > Thanks for your comments.
> > I agree that in the KIP, there's a trade-off regarding the API
> complexity.
> > With the store impl, we can support default custom stores, but introduce
> > more complexity for users, while with the enum types, users can configure
> > default built-in store types easily, but it can't work for custom stores.
> >
> > For me, I'm OK to narrow down the scope and introduce the default
> built-in
> > enum store types first.
> > And if there's further request, we can consider a better way to support
> > default store impl.
> >
> > I'll update the KIP next week, unless there are other opinions from other
> > members.
> >
> > Thank you.
> > Luke
> >
> > On Fri, Dec 10, 2021 at 6:33 AM Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> >> Thanks Luke for the updated KIP.
> >>
> >> One major change the new proposal has it to change the original enum
> store
> >> type with a new interface. Where in the enum approach our internal
> >> implementations would be something like:
> >>
> >> "
> >> Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
> >> Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
> >> Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
> >> "
> >>
> >> And inside the impl classes like here we would could directly do:
> >>
> >> "
> >> if ((supplier = materialized.storeSupplier) == null) {
> >>     supplier =
> >> Stores.windowBytesStoreSupplier(materialized.storeImplType())
> >> }
> >> "
> >>
> >> While I understand the benefit of having an interface such that user
> >> customized stores could be used as the default store types as well,
> >> there's
> >> a trade-off I feel regarding the API complexity. Since with this
> approach,
> >> our API complexity granularity is in the order of "number of impl
> types" *
> >> "number of store types". This means that whenever we add new store types
> >> in
> >> the future, this API needs to be augmented and customized impl needs to
> be
> >> updated to support the new store types, in addition, not all custom impl
> >> types may support all store types, but with this interface they are
> forced
> >> to either support all or explicitly throw un-supported exceptions.
> >>
> >> The way I see a default impl type is that, they would be safe to use for
> >> any store types, and since store types are evolved by the library
> itself,
> >> the default impls would better be controlled by the library as well.
> >> Custom
> >> impl classes can choose to replace some of the stores explicitly, but
> may
> >> not be a best fit as the default impl classes --- if there are in the
> >> future, one way we can consider is to make Stores class extensible along
> >> with the enum so that advanced users can add more default impls,
> assuming
> >> such scenarios are not very common.
> >>
> >> So I'm personally still a bit learning towards the enum approach with a
> >> narrower scope, for its simplicity as an API and also its low
> maintenance
> >> cost in the future. Let me know what do you think?
> >>
> >>
> >> Guozhang
> >>
> >>
> >> On Wed, Dec 1, 2021 at 6:48 PM Luke Chen <sh...@gmail.com> wrote:
> >>
> >> > Hi devs,
> >> >
> >> > I'd like to propose a KIP to allow users to set default store
> >> > implementation class (built-in RocksDB/InMemory, or custom one), and
> >> > default to RocksDB state store, to keep backward compatibility.
> >> >
> >> > Detailed description can be found here:
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
> >> >
> >> > Any feedback and comments are welcome.
> >> >
> >> > Thank you.
> >> > Luke
> >> >
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>


-- 
-- Guozhang

Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Posted by Luke Chen <sh...@gmail.com>.
Hi Guozhang,

I've updated the KIP to use `enum`, instead of store implementation, and
some content accordingly.
Please let me know if there's other comments.


Thank you.
Luke

On Sun, Dec 12, 2021 at 3:55 PM Luke Chen <sh...@gmail.com> wrote:

> Hi Guozhang,
>
> Thanks for your comments.
> I agree that in the KIP, there's a trade-off regarding the API complexity.
> With the store impl, we can support default custom stores, but introduce
> more complexity for users, while with the enum types, users can configure
> default built-in store types easily, but it can't work for custom stores.
>
> For me, I'm OK to narrow down the scope and introduce the default built-in
> enum store types first.
> And if there's further request, we can consider a better way to support
> default store impl.
>
> I'll update the KIP next week, unless there are other opinions from other
> members.
>
> Thank you.
> Luke
>
> On Fri, Dec 10, 2021 at 6:33 AM Guozhang Wang <wa...@gmail.com> wrote:
>
>> Thanks Luke for the updated KIP.
>>
>> One major change the new proposal has it to change the original enum store
>> type with a new interface. Where in the enum approach our internal
>> implementations would be something like:
>>
>> "
>> Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
>> Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
>> Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
>> "
>>
>> And inside the impl classes like here we would could directly do:
>>
>> "
>> if ((supplier = materialized.storeSupplier) == null) {
>>     supplier =
>> Stores.windowBytesStoreSupplier(materialized.storeImplType())
>> }
>> "
>>
>> While I understand the benefit of having an interface such that user
>> customized stores could be used as the default store types as well,
>> there's
>> a trade-off I feel regarding the API complexity. Since with this approach,
>> our API complexity granularity is in the order of "number of impl types" *
>> "number of store types". This means that whenever we add new store types
>> in
>> the future, this API needs to be augmented and customized impl needs to be
>> updated to support the new store types, in addition, not all custom impl
>> types may support all store types, but with this interface they are forced
>> to either support all or explicitly throw un-supported exceptions.
>>
>> The way I see a default impl type is that, they would be safe to use for
>> any store types, and since store types are evolved by the library itself,
>> the default impls would better be controlled by the library as well.
>> Custom
>> impl classes can choose to replace some of the stores explicitly, but may
>> not be a best fit as the default impl classes --- if there are in the
>> future, one way we can consider is to make Stores class extensible along
>> with the enum so that advanced users can add more default impls, assuming
>> such scenarios are not very common.
>>
>> So I'm personally still a bit learning towards the enum approach with a
>> narrower scope, for its simplicity as an API and also its low maintenance
>> cost in the future. Let me know what do you think?
>>
>>
>> Guozhang
>>
>>
>> On Wed, Dec 1, 2021 at 6:48 PM Luke Chen <sh...@gmail.com> wrote:
>>
>> > Hi devs,
>> >
>> > I'd like to propose a KIP to allow users to set default store
>> > implementation class (built-in RocksDB/InMemory, or custom one), and
>> > default to RocksDB state store, to keep backward compatibility.
>> >
>> > Detailed description can be found here:
>> >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
>> >
>> > Any feedback and comments are welcome.
>> >
>> > Thank you.
>> > Luke
>> >
>>
>>
>> --
>> -- Guozhang
>>
>

Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Posted by Luke Chen <sh...@gmail.com>.
Hi Guozhang,

Thanks for your comments.
I agree that in the KIP, there's a trade-off regarding the API complexity.
With the store impl, we can support default custom stores, but introduce
more complexity for users, while with the enum types, users can configure
default built-in store types easily, but it can't work for custom stores.

For me, I'm OK to narrow down the scope and introduce the default built-in
enum store types first.
And if there's further request, we can consider a better way to support
default store impl.

I'll update the KIP next week, unless there are other opinions from other
members.

Thank you.
Luke

On Fri, Dec 10, 2021 at 6:33 AM Guozhang Wang <wa...@gmail.com> wrote:

> Thanks Luke for the updated KIP.
>
> One major change the new proposal has it to change the original enum store
> type with a new interface. Where in the enum approach our internal
> implementations would be something like:
>
> "
> Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
> Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
> Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
> "
>
> And inside the impl classes like here we would could directly do:
>
> "
> if ((supplier = materialized.storeSupplier) == null) {
>     supplier =
> Stores.windowBytesStoreSupplier(materialized.storeImplType())
> }
> "
>
> While I understand the benefit of having an interface such that user
> customized stores could be used as the default store types as well, there's
> a trade-off I feel regarding the API complexity. Since with this approach,
> our API complexity granularity is in the order of "number of impl types" *
> "number of store types". This means that whenever we add new store types in
> the future, this API needs to be augmented and customized impl needs to be
> updated to support the new store types, in addition, not all custom impl
> types may support all store types, but with this interface they are forced
> to either support all or explicitly throw un-supported exceptions.
>
> The way I see a default impl type is that, they would be safe to use for
> any store types, and since store types are evolved by the library itself,
> the default impls would better be controlled by the library as well. Custom
> impl classes can choose to replace some of the stores explicitly, but may
> not be a best fit as the default impl classes --- if there are in the
> future, one way we can consider is to make Stores class extensible along
> with the enum so that advanced users can add more default impls, assuming
> such scenarios are not very common.
>
> So I'm personally still a bit learning towards the enum approach with a
> narrower scope, for its simplicity as an API and also its low maintenance
> cost in the future. Let me know what do you think?
>
>
> Guozhang
>
>
> On Wed, Dec 1, 2021 at 6:48 PM Luke Chen <sh...@gmail.com> wrote:
>
> > Hi devs,
> >
> > I'd like to propose a KIP to allow users to set default store
> > implementation class (built-in RocksDB/InMemory, or custom one), and
> > default to RocksDB state store, to keep backward compatibility.
> >
> > Detailed description can be found here:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
> >
> > Any feedback and comments are welcome.
> >
> > Thank you.
> > Luke
> >
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-591: Add Kafka Streams config to set default state store

Posted by Guozhang Wang <wa...@gmail.com>.
Thanks Luke for the updated KIP.

One major change the new proposal has it to change the original enum store
type with a new interface. Where in the enum approach our internal
implementations would be something like:

"
Stores#keyValueBytesStoreSupplier(storeImplTypes, storeName, ...)
Stores#windowBytesStoreSupplier(storeImplTypes, storeName, ...)
Stores#sessionBytesStoreSupplier(storeImplTypes, storeName, ...)
"

And inside the impl classes like here we would could directly do:

"
if ((supplier = materialized.storeSupplier) == null) {
    supplier = Stores.windowBytesStoreSupplier(materialized.storeImplType())
}
"

While I understand the benefit of having an interface such that user
customized stores could be used as the default store types as well, there's
a trade-off I feel regarding the API complexity. Since with this approach,
our API complexity granularity is in the order of "number of impl types" *
"number of store types". This means that whenever we add new store types in
the future, this API needs to be augmented and customized impl needs to be
updated to support the new store types, in addition, not all custom impl
types may support all store types, but with this interface they are forced
to either support all or explicitly throw un-supported exceptions.

The way I see a default impl type is that, they would be safe to use for
any store types, and since store types are evolved by the library itself,
the default impls would better be controlled by the library as well. Custom
impl classes can choose to replace some of the stores explicitly, but may
not be a best fit as the default impl classes --- if there are in the
future, one way we can consider is to make Stores class extensible along
with the enum so that advanced users can add more default impls, assuming
such scenarios are not very common.

So I'm personally still a bit learning towards the enum approach with a
narrower scope, for its simplicity as an API and also its low maintenance
cost in the future. Let me know what do you think?


Guozhang


On Wed, Dec 1, 2021 at 6:48 PM Luke Chen <sh...@gmail.com> wrote:

> Hi devs,
>
> I'd like to propose a KIP to allow users to set default store
> implementation class (built-in RocksDB/InMemory, or custom one), and
> default to RocksDB state store, to keep backward compatibility.
>
> Detailed description can be found here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-591%3A+Add+Kafka+Streams+config+to+set+default+state+store
>
> Any feedback and comments are welcome.
>
> Thank you.
> Luke
>


-- 
-- Guozhang