Posted to jira@kafka.apache.org by "Matthias J. Sax (Jira)" <ji...@apache.org> on 2020/04/26 19:00:03 UTC

[jira] [Commented] (KAFKA-9127) Needless group coordination overhead for GlobalKTables

    [ https://issues.apache.org/jira/browse/KAFKA-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092826#comment-17092826 ] 

Matthias J. Sax commented on KAFKA-9127:
----------------------------------------

[~ableegoldman] It seems we introduced a regression in 2.5.0 via KAFKA-7317 that would be fixed by this ticket (cf. [https://stackoverflow.com/questions/61342530/kafka-streams-2-5-0-requires-input-topic]). If you agree, we should cherry-pick the fix to the 2.5 branch and also add a corresponding test.

Thoughts?

> Needless group coordination overhead for GlobalKTables
> ------------------------------------------------------
>
>                 Key: KAFKA-9127
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9127
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.2.0
>            Reporter: Chris Toomey
>            Assignee: Sophie Blee-Goldman
>            Priority: Major
>             Fix For: 2.6.0
>
>
> When creating a simple stream topology to just populate a GlobalKTable, I noticed from logging that the stream consumer was doing group coordination requests (JoinGroup, SyncGroup, ...) to the server, which it had no reason to do since the global consumer thread populating the table fetches from all partitions and thus doesn't use the group requests. So this adds needless overhead on the client, network, and server.
> I tracked this down to the stream thread consumer, which is created regardless of whether it's needed, based solely on NUM_STREAM_THREADS_CONFIG, which defaults to 1.
> I found that setting NUM_STREAM_THREADS_CONFIG to 0 will prevent this from happening, but it'd be a worthwhile improvement to be able to override this setting in cases of topologies like this that don't have any need for stream threads. Hence this ticket.
> I originally asked about this on the users mailing list where Bruno suggested I file it as an improvement request.
> Here's the Scala code that I'm using that exhibits this:
> {code:scala}
> val builder: StreamsBuilder = new StreamsBuilder()
> val gTable = builder.globalTable[K, V](...)
> val stream = new KafkaStreams(builder.build(), props)
> stream.start()
> {code}
> Not shown is the state store that I'm populating/using.
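
For reference, the workaround mentioned in the description (setting NUM_STREAM_THREADS_CONFIG to 0) could look roughly like the sketch below. It is only an illustration under stated assumptions: the object name GlobalOnlyConfig and the helper globalOnlyProps are hypothetical, and the plain string keys are used in place of the StreamsConfig constants so that the snippet has no Kafka dependency.

```scala
import java.util.Properties

// Hypothetical helper (not part of Kafka): builds the properties for a
// topology that only populates a GlobalKTable.
object GlobalOnlyConfig {
  def globalOnlyProps(appId: String, bootstrapServers: String): Properties = {
    val props = new Properties()
    // String values of StreamsConfig.APPLICATION_ID_CONFIG and
    // StreamsConfig.BOOTSTRAP_SERVERS_CONFIG.
    props.put("application.id", appId)
    props.put("bootstrap.servers", bootstrapServers)
    // String value of StreamsConfig.NUM_STREAM_THREADS_CONFIG. Per the
    // description, 0 prevents the stream thread consumer from being
    // created, so no JoinGroup/SyncGroup requests are sent.
    props.put("num.stream.threads", "0")
    props
  }
}
```

The resulting Properties would then be passed as the props argument to new KafkaStreams(builder.build(), props) in the snippet from the description. Whether this workaround behaves the same on every release is not confirmed here.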



--
This message was sent by Atlassian Jira
(v8.3.4#803005)