You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "gharris1727 (via GitHub)" <gi...@apache.org> on 2023/06/05 23:31:25 UTC

[GitHub] [kafka] gharris1727 opened a new pull request, #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

gharris1727 opened a new pull request, #13816:
URL: https://github.com/apache/kafka/pull/13816

   The testOffsetTranslationBehindReplicationFlow produces a lot of records, and the MirrorConnectorsIntegrationTransactionsTest was initializing a transaction for every record. Instead, batch records together. This brings the transaction test runtimes in-line with other suites.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on pull request #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

Posted by "C0urante (via GitHub)" <gi...@apache.org>.
C0urante commented on PR #13816:
URL: https://github.com/apache/kafka/pull/13816#issuecomment-1599106351

   There are some test failures for the `MirrorConnectorsIntegrationExactlyOnceTest` suite. I believe these should be addressed by https://github.com/apache/kafka/pull/13819; can we rebase onto the latest trunk and check the CI build, just to be safe?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] gharris1727 commented on pull request #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

Posted by "gharris1727 (via GitHub)" <gi...@apache.org>.
gharris1727 commented on PR #13816:
URL: https://github.com/apache/kafka/pull/13816#issuecomment-1593378454

   I believe this PR will solve the current MirrorConnectorsIntegrationTransactionsTest#testReplication instability that appears on trunk. I'm getting < 1% failure rate in a 30% CPU test environment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] gharris1727 commented on pull request #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

Posted by "gharris1727 (via GitHub)" <gi...@apache.org>.
gharris1727 commented on PR #13816:
URL: https://github.com/apache/kafka/pull/13816#issuecomment-1579359334

   > Is there any chance those are related to this change?
   
   Yes, it was related to this change. We improved the performance of produce so drastically that the MirrorSourceConnector started to throttle offset syncs due to the change in KAFKA-14610. This caused the initial 0:0 offset sync to get skipped, which prevented a checkpoint from being emitted, which caused the test to time out.
   
   I've adjusted that test to perform a non-zero-offset commit, and wait for that sync/checkpoint to be emitted before proceeding with the rest of the test normally.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on pull request #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

Posted by "C0urante (via GitHub)" <gi...@apache.org>.
C0urante commented on PR #13816:
URL: https://github.com/apache/kafka/pull/13816#issuecomment-1601405761

   The test failures that occurred during CI are related to topic confic syncing and appear both unrelated and to be happening on trunk as well. Merging...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante commented on a diff in pull request #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

Posted by "C0urante (via GitHub)" <gi...@apache.org>.
C0urante commented on code in PR #13816:
URL: https://github.com/apache/kafka/pull/13816#discussion_r1219842657


##########
connect/mirror/src/test/java/org/apache/kafka/connect/mirror/TestUtils.java:
##########
@@ -34,14 +38,39 @@ static Map<String, String> makeProps(String... keyValues) {
         }
         return props;
     }
-    
-    /*
-     * return records with different but predictable key and value 
+
+    /**
+     * Assemble a collection of records arbitrarily distributed across all partitions of the specified topic
+     * @param topicName Destination topic
+     * @param numRecords count of records to produce to the topic in total
+     * @return A batch of records that can be sent to a producer.
      */
-    public static Map<String, String> generateRecords(int numRecords) {
-        Map<String, String> records = new HashMap<>();
+    public static List<ProducerRecord<byte[], byte[]>> generateRecords(String topicName, int numRecords) {
+        List<ProducerRecord<byte[], byte[]>> records = new ArrayList<>();
         for (int i = 0; i < numRecords; i++) {
-            records.put("key-" + i, "message-" + i);
+            String key = "key-" + i;
+            String value = "message-" + i;
+            records.add(new ProducerRecord<>(topicName, null, key.getBytes(), value.getBytes()));
+        }
+        return records;
+    }
+
+    /**
+     * Assemble a collection of records evenly distributed across some partitions of the specified topic

Review Comment:
   We may want to add a note here that, if `numPartitions` is greater than one, this will cause records with the same key to be written to different partitions.



##########
connect/mirror/src/test/java/org/apache/kafka/connect/mirror/integration/MirrorConnectorsIntegrationBaseTest.java:
##########
@@ -234,6 +245,8 @@ public void startClusters(Map<String, String> additionalMM2Config) throws Except
     @AfterEach
     public void shutdownClusters() throws Exception {
         try {
+            primaryProducer.close();
+            backupProducer.close();

Review Comment:
   We probably still want to try to gracefully shut down the cluster even if we encounter an error with stopping our producers:
   ```suggestion
               Utils.closeQuietly(primaryProducer);
               Utils.closeQuietly(backupProducer);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] C0urante merged pull request #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

Posted by "C0urante (via GitHub)" <gi...@apache.org>.
C0urante merged PR #13816:
URL: https://github.com/apache/kafka/pull/13816


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] gharris1727 commented on a diff in pull request #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

Posted by "gharris1727 (via GitHub)" <gi...@apache.org>.
gharris1727 commented on code in PR #13816:
URL: https://github.com/apache/kafka/pull/13816#discussion_r1220234170


##########
connect/mirror/src/test/java/org/apache/kafka/connect/mirror/TestUtils.java:
##########
@@ -34,14 +38,39 @@ static Map<String, String> makeProps(String... keyValues) {
         }
         return props;
     }
-    
-    /*
-     * return records with different but predictable key and value 
+
+    /**
+     * Assemble a collection of records arbitrarily distributed across all partitions of the specified topic
+     * @param topicName Destination topic
+     * @param numRecords count of records to produce to the topic in total
+     * @return A batch of records that can be sent to a producer.
      */
-    public static Map<String, String> generateRecords(int numRecords) {
-        Map<String, String> records = new HashMap<>();
+    public static List<ProducerRecord<byte[], byte[]>> generateRecords(String topicName, int numRecords) {
+        List<ProducerRecord<byte[], byte[]>> records = new ArrayList<>();
         for (int i = 0; i < numRecords; i++) {
-            records.put("key-" + i, "message-" + i);
+            String key = "key-" + i;
+            String value = "message-" + i;
+            records.add(new ProducerRecord<>(topicName, null, key.getBytes(), value.getBytes()));
+        }
+        return records;
+    }
+
+    /**
+     * Assemble a collection of records evenly distributed across some partitions of the specified topic

Review Comment:
   Since the tests don't rely on this keying, and having the same key in multiple partitions is strange, i'll change this to make the keys unique.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] gharris1727 commented on pull request #13816: MINOR: Optimize runtime of MM2 integration tests by batching transactions

Posted by "gharris1727 (via GitHub)" <gi...@apache.org>.
gharris1727 commented on PR #13816:
URL: https://github.com/apache/kafka/pull/13816#issuecomment-1599282092

   @C0urante I rebased again to remove the dependency on the reverted #13838, and pulled in the change in #13819 .
   
   PTAL thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org