You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "showuon (via GitHub)" <gi...@apache.org> on 2023/04/24 12:45:38 UTC

[GitHub] [kafka] showuon opened a new pull request, #13631: KAFKA-14909: check zkMigrationReady tag before migration

showuon opened a new pull request, #13631:
URL: https://github.com/apache/kafka/pull/13631

   1. add `ZkMigrationReady` in apiVersionsResponse
   2. check all nodes if `ZkMigrationReady` are ready before moving to next migration state
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on a diff in pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on code in PR #13631:
URL: https://github.com/apache/kafka/pull/13631#discussion_r1176025802


##########
metadata/src/test/java/org/apache/kafka/metadata/migration/KRaftMigrationDriverTest.java:
##########
@@ -383,8 +466,8 @@ public ZkMigrationLeadershipState claimControllerLeadership(ZkMigrationLeadershi
             driver.onMetadataUpdate(delta, image, new LogDeltaManifest(provenance,
                 new LeaderAndEpoch(OptionalInt.of(3000), 1), 1, 100, 42));
             Assertions.assertTrue(claimLeaderAttempts.await(1, TimeUnit.MINUTES));
-            TestUtils.waitForCondition(() -> driver.migrationState().get(1, TimeUnit.MINUTES).equals(MigrationDriverState.ZK_MIGRATION),
-                "Waiting for KRaftMigrationDriver to enter ZK_MIGRATION state");
+            TestUtils.waitForCondition(() -> driver.migrationState().get(1, TimeUnit.MINUTES).equals(MigrationDriverState.DUAL_WRITE),
+                "Waiting for KRaftMigrationDriver to enter DUAL_WRITE state");

Review Comment:
   This test failed sometimes because `ZK_MIGRATION` is just a middle state, so we might miss to get this state and cause the test failure. Changing to verify the `DUAL_WRITE` state which will be more reliable. This will also verify the `ZK_MIGRATION` state because we can't jump to `DUAL_WRITE` directly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] mumrah commented on a diff in pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "mumrah (via GitHub)" <gi...@apache.org>.
mumrah commented on code in PR #13631:
URL: https://github.com/apache/kafka/pull/13631#discussion_r1175362654


##########
metadata/src/main/java/org/apache/kafka/metadata/migration/KRaftMigrationDriver.java:
##########
@@ -134,8 +139,7 @@ private void initializeMigrationState() {
     }
 
     private boolean isControllerQuorumReadyForMigration() {
-        // TODO implement this
-        return true;
+        return this.apiVersions.isAllNodeZkMigrationReady();

Review Comment:
   When we check to see if the quorum nodes are ready, it would be useful to print an error log showing which controllers were not ready here (similar to what we do in `areZkBrokersReadyForMigration`). 
   
   
   
   



##########
clients/src/main/java/org/apache/kafka/clients/ApiVersions.java:
##########
@@ -64,4 +64,8 @@ public synchronized byte maxUsableProduceMagic() {
         return maxUsableProduceMagic;
     }
 
+    // check if all nodes are ZK Migration ready
+    public boolean isAllNodeZkMigrationReady() {

Review Comment:
   Can we move this logic into QuorumFeatures? That would allow us to verify that all the expected controllers are present in the ApiVersions (in addition to checking the `zkMigrationEnabled` flag)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] mumrah commented on a diff in pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "mumrah (via GitHub)" <gi...@apache.org>.
mumrah commented on code in PR #13631:
URL: https://github.com/apache/kafka/pull/13631#discussion_r1176825739


##########
metadata/src/main/java/org/apache/kafka/controller/QuorumFeatures.java:
##########
@@ -128,4 +128,27 @@ VersionRange localSupportedFeature(String featureName) {
     boolean isControllerId(int nodeId) {
         return quorumNodeIds.contains(nodeId);
     }
+
+    // check if all controller nodes are ZK Migration ready
+    public boolean isAllControllersZkMigrationReady() {

Review Comment:
   Can we follow the same pattern that we've got in `reasonNotSupported` where we return an optional string of a reason why the quorum isn't ready? That way we can have all the pertinent logging come from KRaftMigrationDriver



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on PR #13631:
URL: https://github.com/apache/kafka/pull/13631#issuecomment-1525377944

   Failed tests are unrelated and also failed in trunk build:
   ```
       Build / JDK 11 and Scala 2.13 / org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.testSingleNodeCluster()
       Build / JDK 11 and Scala 2.13 / org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.testMultiNodeCluster()
       Build / JDK 11 and Scala 2.13 / kafka.server.KRaftClusterTest.testLegacyAlterConfigs()
       Build / JDK 11 and Scala 2.13 / kafka.server.RaftClusterSnapshotTest.testSnapshotsGenerated()
       Build / JDK 11 and Scala 2.13 / kafka.server.RaftClusterSnapshotTest.testSnapshotsGenerated()
       Build / JDK 11 and Scala 2.13 / org.apache.kafka.controller.QuorumControllerTest.testConfigurationOperations()
       Build / JDK 11 and Scala 2.13 / org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated()
       Build / JDK 17 and Scala 2.13 / org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.testMultiNodeCluster()
       Build / JDK 17 and Scala 2.13 / kafka.server.KRaftClusterTest.testCreateClusterAndPerformReassignment()
       Build / JDK 17 and Scala 2.13 / kafka.server.KRaftClusterTest.testUnregisterBroker()
       Build / JDK 17 and Scala 2.13 / kafka.server.KRaftClusterTest.testCreateClusterAndCreateAndManyTopics()
       Build / JDK 17 and Scala 2.13 / org.apache.kafka.controller.QuorumControllerTest.testUpgradeMigrationStateFrom34()
       Build / JDK 17 and Scala 2.13 / org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated()
       Build / JDK 8 and Scala 2.12 / org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.testSingleNodeCluster()
       Build / JDK 8 and Scala 2.12 / org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.testMultiNodeCluster()
       Build / JDK 8 and Scala 2.12 / org.apache.kafka.connect.mirror.integration.IdentityReplicationIntegrationTest.testSyncTopicConfigs()
       Build / JDK 8 and Scala 2.12 / org.apache.kafka.connect.integration.OffsetsApiIntegrationTest.testGetSinkConnectorOffsetsDifferentKafkaClusterTargeted
       Build / JDK 8 and Scala 2.12 / kafka.server.KRaftClusterTest.testSetLog4jConfigurations()
       Build / JDK 8 and Scala 2.12 / kafka.server.KRaftClusterTest.testCreateClusterAndPerformReassignment()
       Build / JDK 8 and Scala 2.12 / kafka.server.KRaftClusterTest.testCreateClusterAndPerformReassignment()
       Build / JDK 8 and Scala 2.12 / org.apache.kafka.controller.QuorumControllerTest.testCreateAndClose()
       Build / JDK 8 and Scala 2.12 / org.apache.kafka.controller.QuorumControllerTest.testCreateAndClose()
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on a diff in pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on code in PR #13631:
URL: https://github.com/apache/kafka/pull/13631#discussion_r1176029069


##########
core/src/main/scala/kafka/server/ControllerServer.scala:
##########
@@ -270,7 +271,8 @@ class ControllerServer(
             "zk migration",
             fatal = false,
             () => {}
-          )
+          ),
+          quorumFeatures

Review Comment:
   feed `quorumFeatures` into `KRaftMigrationDriver` now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon merged pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon merged PR #13631:
URL: https://github.com/apache/kafka/pull/13631


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on a diff in pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on code in PR #13631:
URL: https://github.com/apache/kafka/pull/13631#discussion_r1176030390


##########
clients/src/main/java/org/apache/kafka/clients/ApiVersions.java:
##########
@@ -64,4 +64,8 @@ public synchronized byte maxUsableProduceMagic() {
         return maxUsableProduceMagic;
     }
 
+    // check if all nodes are ZK Migration ready
+    public boolean isAllNodeZkMigrationReady() {

Review Comment:
   Agree! Moving to `QuorumFeatures` now, and add tests to test the non-controller node case. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on PR #13631:
URL: https://github.com/apache/kafka/pull/13631#issuecomment-1525588039

   I'd like to merge it after the test failure fix completed in this PR: https://github.com/apache/kafka/pull/13647 to make sure we don't introduce more failed tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on a diff in pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on code in PR #13631:
URL: https://github.com/apache/kafka/pull/13631#discussion_r1176027142


##########
metadata/src/test/java/org/apache/kafka/metadata/migration/KRaftMigrationDriverTest.java:
##########
@@ -277,64 +312,65 @@ CompletableFuture<Void> enqueueMetadataChangeEventWithFuture(
     public void testOnlySendNeededRPCsToBrokers() throws Exception {
         CountingMetadataPropagator metadataPropagator = new CountingMetadataPropagator();
         CapturingMigrationClient migrationClient = new CapturingMigrationClient(new HashSet<>(Arrays.asList(1, 2, 3)));
-        KRaftMigrationDriver driver = new KRaftMigrationDriver(
+        try (KRaftMigrationDriver driver = new KRaftMigrationDriver(

Review Comment:
   Using try with resource. Nothing else changed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on a diff in pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on code in PR #13631:
URL: https://github.com/apache/kafka/pull/13631#discussion_r1176030100


##########
metadata/src/main/java/org/apache/kafka/metadata/migration/KRaftMigrationDriver.java:
##########
@@ -134,8 +139,7 @@ private void initializeMigrationState() {
     }
 
     private boolean isControllerQuorumReadyForMigration() {
-        // TODO implement this
-        return true;
+        return this.apiVersions.isAllNodeZkMigrationReady();

Review Comment:
   Good suggestion. Updated. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on a diff in pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on code in PR #13631:
URL: https://github.com/apache/kafka/pull/13631#discussion_r1177832658


##########
metadata/src/main/java/org/apache/kafka/controller/QuorumFeatures.java:
##########
@@ -128,4 +128,27 @@ VersionRange localSupportedFeature(String featureName) {
     boolean isControllerId(int nodeId) {
         return quorumNodeIds.contains(nodeId);
     }
+
+    // check if all controller nodes are ZK Migration ready
+    public boolean isAllControllersZkMigrationReady() {

Review Comment:
   Agree. PR updated. Thanks 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on pull request #13631: KAFKA-14909: check zkMigrationReady tag before migration

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on PR #13631:
URL: https://github.com/apache/kafka/pull/13631#issuecomment-1527058819

   Failed tests are unrelated:
   ```
       Build / JDK 11 and Scala 2.13 / org.apache.kafka.connect.runtime.distributed.DistributedHerderTest.testExternalZombieFencingRequestAsynchronousFailure
       Build / JDK 17 and Scala 2.13 / integration.kafka.server.FetchFromFollowerIntegrationTest.testRackAwareRangeAssignor()
       Build / JDK 17 and Scala 2.13 / kafka.server.CreateTopicsRequestTest.testErrorCreateTopicsRequests(String).quorum=kraft
       Build / JDK 17 and Scala 2.13 / org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated()
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org