You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by David Garcia <da...@spiceworks.com> on 2016/08/02 15:47:07 UTC

KTable and Rebalance Operations

Hello, I’ve googled around for this, but haven’t had any luck.  Based upon this: http://docs.confluent.io/3.0.0/streams/architecture.html#state  KTables are local to instances.  An instance will process one or more partitions from one or more topics.  How does Kstreams/Ktables handle the following situation?

A single application instance is processing 4 partitions from a topic.  The application is using a Ktable.  Each event triggers lookups in the KTable.  Now, a new application instance is started.  This triggers a rebalancing of the partitions.  2 partitions originally processed by the first instance migrate to the new instance.  What happens with the KTable?  Is the entire table “migrated” also?  This would be nice because lookups (in the first instance) triggered by particular events should be identical to lookups (in the second instance) triggered by those same events.

-David

Re: KTable and Rebalance Operations

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Hi David,

on startup of the second application instance, the KTable is effectively
partitioned into two distinct partial KTables, each holding the
key-valus pairs for their corresponding assigned partitions.

Thus, your "lookups" on each instance, can only access the key-value
pairs for the set of keys assigned to each instance. There is no
replication of the whole KTable to both instances happening.

We are aware, that a global view over the whole KTable (ie, all local
KTable partitions over all running applications instances) is a nice
feature. There is already a KIP in place and we hope to release this
feature, soon:

Have look here for QA KIP-67
https://cwiki.apache.org/confluence/display/KAFKA/KIP-67%3A+Queryable+state+for+Kafka+Streams


-Matthias


On 08/02/2016 05:47 PM, David Garcia wrote:
> Hello, I’ve googled around for this, but haven’t had any luck.  Based upon this: http://docs.confluent.io/3.0.0/streams/architecture.html#state  KTables are local to instances.  An instance will process one or more partitions from one or more topics.  How does Kstreams/Ktables handle the following situation?
> 
> A single application instance is processing 4 partitions from a topic.  The application is using a Ktable.  Each event triggers lookups in the KTable.  Now, a new application instance is started.  This triggers a rebalancing of the partitions.  2 partitions originally processed by the first instance migrate to the new instance.  What happens with the KTable?  Is the entire table “migrated” also?  This would be nice because lookups (in the first instance) triggered by particular events should be identical to lookups (in the second instance) triggered by those same events.
> 
> -David
>