Posted to users@kafka.apache.org by Debasish Ghosh <gh...@gmail.com> on 2017/08/01 07:34:15 UTC

Kafka streams store migration - best practices

Hi -

I have a Kafka Streams application that needs to run on multiple instances.
It fetches metadata from all local stores and has an http query layer for
interactive queries. In some cases when I have new instances deployed,
store migration takes place making the current metadata invalid. Here are
my questions regarding some of the best practices to be followed to handle
this issue of store migration -

   - While the migration is in progress, a query for the metadata may result
   in InvalidStateStoreException - is it good practice to always query the
   metadata with retry semantics?
   - Should I check KafkaStreams.state() and only assume that I have got
   the correct metadata when the state() call returns RUNNING? If it
   returns REBALANCING, then I should re-query. Is this the correct approach?

regards.

-- 
Debasish Ghosh
http://manning.com/ghosh2
http://manning.com/ghosh

Twttr: @debasishg
Blog: http://debasishg.blogspot.com
Code: http://github.com/debasishg

Re: Kafka streams store migration - best practices

Posted by Debasish Ghosh <gh...@gmail.com>.
Thanks for the confirmation. I guess a listener would make sense if I did
some caching of the store and needed to refresh the cache on every change
in the underlying store.

On Tue, Aug 1, 2017 at 6:10 PM, Damian Guy <da...@gmail.com> wrote:

> No, you don't need to set a listener. I was just mentioning it as an option
> if you want to know that the metadata needs refreshing.

Re: Kafka streams store migration - best practices

Posted by Damian Guy <da...@gmail.com>.
No, you don't need to set a listener. I was just mentioning it as an option
if you want to know that the metadata needs refreshing.

On Tue, 1 Aug 2017 at 13:25 Debasish Ghosh <gh...@gmail.com> wrote:

> Regarding the last point, do I need to set up the listener ?
>
> All I want is to do a query from the store. For that I need to invoke streams.store()
> first, which can potentially throw an InvalidStateStoreException during
> rebalancing / migration of stores. If I call streams.store() with retries
> till the rebalancing is done or I exceed some max retry count, then I think
> I should be good.
>
> Or am I missing something ?
>
> regards.

Re: Kafka streams store migration - best practices

Posted by Debasish Ghosh <gh...@gmail.com>.
Regarding the last point, do I need to set up the listener ?

All I want is to do a query from the store. For that I need to invoke
streams.store() first, which can potentially throw an
InvalidStateStoreException during rebalancing / migration of stores. If I
call streams.store() with retries till the rebalancing is done or I exceed
some max retry count, then I think I should be good.

Or am I missing something ?

regards.

On Tue, Aug 1, 2017 at 1:10 PM, Damian Guy <da...@gmail.com> wrote:

> Hi,
>
> On Tue, 1 Aug 2017 at 08:34 Debasish Ghosh <gh...@gmail.com>
> wrote:
>
>> Hi -
>>
>> I have a Kafka Streams application that needs to run on multiple
>> instances.
>> It fetches metadata from all local stores and has an http query layer for
>> interactive queries. In some cases when I have new instances deployed,
>> store migration takes place making the current metadata invalid. Here are
>> my questions regarding some of the best practices to be followed to handle
>> this issue of store migration -
>>
>>    - While the migration is in progress, a query for the metadata may result
>>    in InvalidStateStoreException - is it good practice to always query the
>>    metadata with retry semantics?
>>
>
> Yes. Whenever the application is rebalancing, the stores will be
> unavailable, so retrying is the right thing to do.
>
>
>>    - Should I check KafkaStreams.state() and only assume that I have got
>>    the correct metadata when the state() call returns RUNNING? If it
>>    returns REBALANCING, then I should re-query. Is this the correct approach?
>>
>
> Correct again! If the state is rebalancing, then the metadata (for some
> stores at least) is going to change, so you should get it again. You can
> set a StateListener on the KafkaStreams instance to listen to these events.

Re: Kafka streams store migration - best practices

Posted by Damian Guy <da...@gmail.com>.
Hi,

On Tue, 1 Aug 2017 at 08:34 Debasish Ghosh <gh...@gmail.com> wrote:

> Hi -
>
> I have a Kafka Streams application that needs to run on multiple instances.
> It fetches metadata from all local stores and has an http query layer for
> interactive queries. In some cases when I have new instances deployed,
> store migration takes place making the current metadata invalid. Here are
> my questions regarding some of the best practices to be followed to handle
> this issue of store migration -
>
>    - While the migration is in progress, a query for the metadata may result
>    in InvalidStateStoreException - is it good practice to always query the
>    metadata with retry semantics?
>

Yes. Whenever the application is rebalancing, the stores will be
unavailable, so retrying is the right thing to do.
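
A minimal sketch of the retry idea above, in Java. To keep it compilable
without the Kafka jars, StoreNotReadyException is a hypothetical stand-in for
org.apache.kafka.streams.errors.InvalidStateStoreException, and StoreRetry,
withRetries, maxAttempts and backoffMs are illustrative names, not part of the
Kafka Streams API. In a real application the lookup would be the
streams.store(...) call and the catch clause would name
InvalidStateStoreException instead:

```java
import java.util.function.Supplier;

/** Retries a store lookup while the store is unavailable, e.g. during a rebalance. */
final class StoreRetry {

    /** Hypothetical stand-in for Kafka's InvalidStateStoreException. */
    static class StoreNotReadyException extends RuntimeException {
        StoreNotReadyException(String message) {
            super(message);
        }
    }

    /**
     * Calls lookup up to maxAttempts times, sleeping backoffMs between failed
     * attempts. Rethrows the last failure once the attempts are used up.
     */
    static <T> T withRetries(Supplier<T> lookup, int maxAttempts, long backoffMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        StoreNotReadyException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return lookup.get();
            } catch (StoreNotReadyException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMs);
                    } catch (InterruptedException ie) {
                        // Preserve the interrupt and stop retrying.
                        Thread.currentThread().interrupt();
                        throw e;
                    }
                }
            }
        }
        throw last;
    }
}
```

A call along the lines of StoreRetry.withRetries(() -> streams.store(...), 10,
100) would then wrap the lookup. The attempt count and backoff here are
guesses; they would need tuning against how long rebalances take in a given
deployment.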


>    - Should I check KafkaStreams.state() and only assume that I have got
>    the correct metadata when the state() call returns RUNNING? If it
>    returns REBALANCING, then I should re-query. Is this the correct approach?
>

Correct again! If the state is rebalancing, then the metadata (for some
stores at least) is going to change, so you should get it again. You can
set a StateListener on the KafkaStreams instance to listen to these events.
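
One way the listener option could look, sketched in Java under some
assumptions. MetadataTracker and its methods are hypothetical names, and the
nested State enum is a stand-in for KafkaStreams.State that lists only the two
states discussed here, so the sketch compiles without the Kafka jars. The
onChange method deliberately has the same (newState, oldState) shape as
KafkaStreams.StateListener#onChange, so in a real application it could be
registered with streams.setStateListener(...), which in recent versions has to
happen before streams.start():

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Remembers whether cached store metadata may be out of date. */
final class MetadataTracker {

    /** Stand-in for KafkaStreams.State; only the states discussed in this thread. */
    enum State { RUNNING, REBALANCING }

    // Starts stale because nothing has been fetched yet.
    private final AtomicBoolean stale = new AtomicBoolean(true);

    /** Same parameter order as KafkaStreams.StateListener#onChange. */
    void onChange(State newState, State oldState) {
        if (newState == State.REBALANCING) {
            // Store assignments may move, so cached metadata can no longer be trusted.
            stale.set(true);
        }
    }

    /** True when the metadata should be fetched again before it is used. */
    boolean isStale() {
        return stale.get();
    }

    /** Call after the metadata has been re-fetched while the instance is RUNNING. */
    void markRefreshed() {
        stale.set(false);
    }
}
```

The query layer would check isStale() before serving from cached metadata and
call markRefreshed() after a successful re-fetch. This does not replace the
retry on InvalidStateStoreException, since a rebalance can still start between
the check and the query.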

