You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/11/03 04:04:56 UTC

[GitHub] [kafka] mjsax commented on a change in pull request #11447: KAFKA-13024: Use not-null filter only in optimizable repartitions

mjsax commented on a change in pull request #11447:
URL: https://github.com/apache/kafka/pull/11447#discussion_r741604615



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/GroupedTableOperationRepartitionNode.java
##########
@@ -119,5 +119,10 @@ public void writeToTopology(final InternalTopologyBuilder topologyBuilder) {
                 processorParameters
             );
         }
+
+        @Override
+        public boolean isOptimizable() {
+            return false;

Review comment:
       Why is a table repartitioning not optimizable?

##########
File path: streams/src/test/java/org/apache/kafka/streams/integration/KStreamRepartitionIntegrationTest.java
##########
@@ -672,6 +672,42 @@ public void shouldGoThroughRebalancingCorrectly() throws Exception {
         assertEquals(2, getNumberOfPartitionsForTopic(repartitionTopicName));
     }
 
+    @Test
+    public void shouldNotFilterOutNullKeysOnRepartition() throws Exception {
+        final String repartitionName = "repartition-test";
+        final long timestamp = System.currentTimeMillis();
+        sendEvents(
+            timestamp,
+            Arrays.asList(
+                new KeyValue<>(1, "A"),
+                new KeyValue<>(2, "B"),
+                new KeyValue<>(3, null)
+            )
+        );
+
+        final StreamsBuilder builder = new StreamsBuilder();
+        final Repartitioned<String, String> repartitioned = Repartitioned.<String, String>as(repartitionName)
+            .withKeySerde(Serdes.String())
+            .withValueSerde(Serdes.String());
+
+        builder.stream(inputTopic, Consumed.with(Serdes.Integer(), Serdes.String()))
+            .selectKey((key, value) -> value == null ? null : key.toString())
+            .repartition(repartitioned)
+            .mapValues(value -> value != null ? "mapped-" + value  : "default-value")

Review comment:
       Why do we need the `mapValues()` step in this test?

##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/UnoptimizableRepartitionNode.java
##########
@@ -53,19 +53,13 @@ private UnoptimizableRepartitionNode(final String nodeName,
     public void writeToTopology(final InternalTopologyBuilder topologyBuilder) {
         topologyBuilder.addInternalTopic(repartitionTopic, internalTopicProperties);
 
-        topologyBuilder.addProcessor(

Review comment:
       Not sure if I understand this change? Can you elaborate?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org