You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/05/09 20:47:38 UTC

[GitHub] [kafka] bbejeck opened a new pull request #8637: KAFKA-9976: Reuse repartition node in all cases

bbejeck opened a new pull request #8637:
URL: https://github.com/apache/kafka/pull/8637


   If a `KGroupedStream` or `KGroupedTable` requires a repartition, reusing the same instance results in creating a new repartition topic.   Unless the user provides a name via `Grouped`.  However, we should have consistent behavior regardless if the user provides a name or Kafka Streams generates it.
   
   This PR changes the behavior to create one repartition topic (if required) per instance.
   
   I've updated the test cases.
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] bbejeck commented on pull request #8637: KAFKA-9976: Reuse repartition node in all cases for KGroupedStream and KGroupedTable aggregates

Posted by GitBox <gi...@apache.org>.
bbejeck commented on pull request #8637:
URL: https://github.com/apache/kafka/pull/8637#issuecomment-632249736


   >I think it's best to close this PR and also the ticker (either as "not a problem" or "won't fix")?
   
   ack


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] bbejeck edited a comment on pull request #8637: KAFKA-9976: Reuse repartition node in all cases for KGroupedStream and KGroupedTable aggregates

Posted by GitBox <gi...@apache.org>.
bbejeck edited a comment on pull request #8637:
URL: https://github.com/apache/kafka/pull/8637#issuecomment-626233467


   ping @mjsax and @vvcephei for review
   
   relates to fix in https://github.com/apache/kafka/pull/8504


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] bbejeck closed pull request #8637: KAFKA-9976: Reuse repartition node in all cases for KGroupedStream and KGroupedTable aggregates

Posted by GitBox <gi...@apache.org>.
bbejeck closed pull request #8637:
URL: https://github.com/apache/kafka/pull/8637


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] mjsax commented on pull request #8637: KAFKA-9976: Reuse repartition node in all cases for KGroupedStream and KGroupedTable aggregates

Posted by GitBox <gi...@apache.org>.
mjsax commented on pull request #8637:
URL: https://github.com/apache/kafka/pull/8637#issuecomment-628315503


   @bbejeck Wondering if there are any backward incompatibility concerns? Can you explain why this change is safe? For #8504 there would be an exception before and thus it was broken and no compatibility concern raises. Not 100% sure about this PR though.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] bbejeck commented on a change in pull request #8637: KAFKA-9976: Reuse repartition node in all cases

Posted by GitBox <gi...@apache.org>.
bbejeck commented on a change in pull request #8637:
URL: https://github.com/apache/kafka/pull/8637#discussion_r422544340



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/GroupedStreamAggregateBuilder.java
##########
@@ -85,10 +85,8 @@
             sourceName = createRepartitionSource(repartitionTopicPrefix, repartitionNodeBuilder);
 
             // First time through we need to create a repartition node.
-            // Any subsequent calls to GroupedStreamAggregateBuilder#build we check if
-            // the user has provided a name for the repartition topic, is so we re-use
-            // the existing repartition node, otherwise we create a new one.
-            if (repartitionNode == null || userProvidedRepartitionTopicName == null) {
+            // Otherwise we'll reuse the repartition node.
+            if (repartitionNode == null) {

Review comment:
       This is the change, if we don't have a repartition node, create it, otherwise, add the existing one to the graph.

##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/KGroupedTableImpl.java
##########
@@ -79,7 +79,7 @@
         final String repartitionTopic = (userProvidedRepartitionTopicName != null ? userProvidedRepartitionTopicName : materialized.storeName())
             + KStreamImpl.REPARTITION_TOPIC_SUFFIX;
 
-        if (repartitionGraphNode == null || userProvidedRepartitionTopicName == null) {
+        if (repartitionGraphNode == null) {

Review comment:
       same here




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] bbejeck commented on pull request #8637: KAFKA-9976: Reuse repartition node in all cases

Posted by GitBox <gi...@apache.org>.
bbejeck commented on pull request #8637:
URL: https://github.com/apache/kafka/pull/8637#issuecomment-626233467


   ping @mjsax and @vvcephei for review


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] mjsax commented on pull request #8637: KAFKA-9976: Reuse repartition node in all cases for KGroupedStream and KGroupedTable aggregates

Posted by GitBox <gi...@apache.org>.
mjsax commented on pull request #8637:
URL: https://github.com/apache/kafka/pull/8637#issuecomment-631803677


   Seems you agree to my last comment: https://github.com/apache/kafka/pull/8504#issuecomment-631757206
   
   I think it's best to close this PR and also the ticker (either as "not a problem" or "won't fix")?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org