You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "dengziming (via GitHub)" <gi...@apache.org> on 2023/05/06 08:24:50 UTC

[GitHub] [kafka] dengziming opened a new pull request, #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

dengziming opened a new pull request, #13679:
URL: https://github.com/apache/kafka/pull/13679

   *More detailed description of your change*
   
   The KRaft controller return empty finalized features in `ApiVersionResponse`, the brokers are not infected by this, so this problem doesn't have any impact currently, but it's worth fixing it to avoid unexpected problems.
   And there is a bunch of of confusing methods in `ApiVersionResponse` which are only used in test code, I moved them to `TestUtils` to make the code more clear, and force everyone to pass in the correct parameters instead of the default zero parameters, for example, empty `supportedFeatures` and empty `finalizedFeatures`.
   
   
   *Summary of testing strategy (including rationale)*
   Added logic for checking `supportedFeatures` and `finalizedFeatures` in `AbstractApiVersionsRequestTest`.
   
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] ijuma commented on a diff in pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "ijuma (via GitHub)" <gi...@apache.org>.
ijuma commented on code in PR #13679:
URL: https://github.com/apache/kafka/pull/13679#discussion_r1223558292


##########
core/src/main/scala/kafka/server/ApiVersionManager.scala:
##########
@@ -112,4 +157,8 @@ class DefaultApiVersionManager(
       zkMigrationEnabled
     )
   }
+
+  override def apiVersionResponse(throttleTimeMs: Int, finalizedFeatures: Map[String, java.lang.Short], finalizedFeatureEpoch: Long): ApiVersionsResponse = {
+    throw new UnsupportedOperationException("This method is not supported in DefaultApiVersionManager, use apiVersionResponse(throttleTimeMs) instead")

Review Comment:
   I think @cmccabe submitted a PR to fix this here https://github.com/apache/kafka/pull/13826



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] dengziming commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "dengziming (via GitHub)" <gi...@apache.org>.
dengziming commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1562174228

   @KarboniteKream I reproduced this in #13761 , you can run the new test case and help investigating if you are interested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] dengziming commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "dengziming (via GitHub)" <gi...@apache.org>.
dengziming commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1560714746

   @KarboniteKream Thank you for reporting this, I'm trying to reproduce your case locally but failed, do you mean that this problem only occur after the leader is shut down and restarted?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] KarboniteKream commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "KarboniteKream (via GitHub)" <gi...@apache.org>.
KarboniteKream commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1562213887

   Thank you, I'll check today!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on a diff in pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on code in PR #13679:
URL: https://github.com/apache/kafka/pull/13679#discussion_r1189322325


##########
core/src/main/scala/kafka/server/ApiVersionManager.scala:
##########
@@ -84,18 +105,30 @@ class SimpleApiVersionManager(
     throw new UnsupportedOperationException("This method is not supported in SimpleApiVersionManager, use apiVersionResponse(throttleTimeMs, finalizedFeatures, epoch) instead")
   }
 
-  override def apiVersionResponse(throttleTimeMs: Int, finalizedFeatures: Map[String, java.lang.Short], epoch: Long): ApiVersionsResponse = {
+  override def apiVersionResponse(throttleTimeMs: Int, finalizedFeatures: Map[String, java.lang.Short], finalizedFeaturesEpoch: Long): ApiVersionsResponse = {
     ApiVersionsResponse.createApiVersionsResponse(
       throttleTimeMs,
       apiVersions,
       brokerFeatures,
       finalizedFeatures.asJava,
-      epoch,
+      finalizedFeaturesEpoch,
       zkMigrationEnabled
     )
   }
 }
 
+/**
+ * The default ApiVersionManager that supports forwarding and has metadata cache, used in broker and zk controller.
+ * When forwarding is enabled, the enabled apis are determined by the broker listener type and the controller apis,
+ * otherwise the enabled apis are determined by the broker listener type, which is the same with SimpleApiVersionManager.
+ *
+ * @param listenerType the listener type
+ * @param forwardingManager the forwarding manager,
+ * @param features
+ * @param metadataCache
+ * @param enableUnstableLastVersion
+ * @param zkMigrationEnabled

Review Comment:
   @dengziming thanks for adding the javadocs for these 2 classes. Very clear. But please remember to complete them. :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] dengziming commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "dengziming (via GitHub)" <gi...@apache.org>.
dengziming commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1541425575

   @showuon There are a lot of errors, I find that `ApiVersionRequest` is very important, when all nodes start, the `NetworkClient` will be initialized and `ApiVersionRequest` will be sent to learn the api version information of other nodes. It is possible that at this time, the controller has not yet started consuming the log, and we should return an empty value, I made a minor change to `QuorumController#finalizedFeatures`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] KarboniteKream commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "KarboniteKream (via GitHub)" <gi...@apache.org>.
KarboniteKream commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1559681643

   This PR seems to have introduced a regression (confirmed using bisect). In a simple setup of two controllers using `config/kraft/controller.properties`, after the leader is shut down and restarted, `UNKNOWN_SERVER_EXCEPTION` will be thrown by `ApiVersionsRequest`.
   
   The other controller sees the following exception:
   ```
   [2023-05-19 15:50:18,834] WARN [QuorumController id=0] getFinalizedFeatures: failed with unknown server exception RuntimeException in 28 us.  The controller is already in standby mode. (org.apache.kafka.controller.QuorumController)
   java.lang.RuntimeException: No in-memory snapshot for epoch 84310. Snapshot epochs are: 61900
       at org.apache.kafka.timeline.SnapshotRegistry.getSnapshot(SnapshotRegistry.java:173)
       at org.apache.kafka.timeline.SnapshotRegistry.iterator(SnapshotRegistry.java:131)
       at org.apache.kafka.timeline.TimelineObject.get(TimelineObject.java:69)
       at org.apache.kafka.controller.FeatureControlManager.finalizedFeatures(FeatureControlManager.java:303)
       at org.apache.kafka.controller.QuorumController.lambda$finalizedFeatures$16(QuorumController.java:2016)
       at org.apache.kafka.controller.QuorumController$ControllerReadEvent.run(QuorumController.java:546)
       at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127)
       at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
       at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
       at java.base/java.lang.Thread.run(Thread.java:829)
   ```
   
   A similar issue was reported in [KAFKA-14996](https://issues.apache.org/jira/browse/KAFKA-14996). I'll report this in Jira once my account is approved.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1539844838

   Also, checkstyle failed. Please fix them. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] dengziming commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "dengziming (via GitHub)" <gi...@apache.org>.
dengziming commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1543964126

   I rerun the failed tests locally and they work correctly, I create a JIRA to track this problem.
   https://issues.apache.org/jira/browse/KAFKA-14989


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] KarboniteKream commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "KarboniteKream (via GitHub)" <gi...@apache.org>.
KarboniteKream commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1560737054

   Correct, the issue happens after shutting down and restarting the leader.
   
   Here's the example configuration:
   - [`1.properties`](https://gist.github.com/KarboniteKream/6ec0378f9b057f9e15467177e58539c6)
   - [`2.properties`](https://gist.github.com/KarboniteKream/4bd198db1db77e7b43c6bb37cecd9b68)
   
   Reproduction:
   1. Initialize the storage:
       - `./bin/kafka-storage.sh format --config 1.properties --cluster-id 9N8QxiJfRoe1rPZ1vpjd2w`
       - `./bin/kafka-storage.sh format --config 2.properties --cluster-id 9N8QxiJfRoe1rPZ1vpjd2w`
   2. Start both controllers:
       - `./bin/kafka-server-start.sh 1.properties`
       - `./bin/kafka-server-start.sh 2.properties`
   3. Wait for quorum to be established, then stop the leader with Ctrl-C
   4. Start the controller back up
   
   After a few moments, the controller will start logging the following:
   ```
   [2023-05-24 18:00:02,980] WARN [RaftManager id=3001] Received error UNKNOWN_SERVER_ERROR from node 3002 when making an ApiVersionsRequest with correlation id 4. Disconnecting. (org.apache.kafka.clients.NetworkClient)
   [2023-05-24 18:00:03,030] INFO [MetadataLoader id=3001] initializeNewPublishers: the loader is still catching up because we still don't know the high water mark yet. (org.apache.kafka.image.loader.MetadataLoader)
   ```
   
   On the other controller (new leader), you'll see the following logs in `logs/controller.log`:
   ```
   [2023-05-24 18:00:02,898] WARN [QuorumController id=3002] getFinalizedFeatures: failed with unknown server exception RuntimeException in 271 us.  The controller is already in standby mode. (org.apache.kafka.controller.QuorumController)
   java.lang.RuntimeException: No in-memory snapshot for epoch 9. Snapshot epochs are: 
   	at org.apache.kafka.timeline.SnapshotRegistry.getSnapshot(SnapshotRegistry.java:173)
   	at org.apache.kafka.timeline.SnapshotRegistry.iterator(SnapshotRegistry.java:131)
   	at org.apache.kafka.timeline.TimelineObject.get(TimelineObject.java:69)
   	at org.apache.kafka.controller.FeatureControlManager.finalizedFeatures(FeatureControlManager.java:303)
   	at org.apache.kafka.controller.QuorumController.lambda$finalizedFeatures$16(QuorumController.java:2016)
   	at org.apache.kafka.controller.QuorumController$ControllerReadEvent.run(QuorumController.java:546)
   	at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:127)
   	at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:210)
   	at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:181)
   	at java.base/java.lang.Thread.run(Thread.java:829)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] dengziming commented on a diff in pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "dengziming (via GitHub)" <gi...@apache.org>.
dengziming commented on code in PR #13679:
URL: https://github.com/apache/kafka/pull/13679#discussion_r1220994605


##########
core/src/main/scala/kafka/server/ApiVersionManager.scala:
##########
@@ -112,4 +157,8 @@ class DefaultApiVersionManager(
       zkMigrationEnabled
     )
   }
+
+  override def apiVersionResponse(throttleTimeMs: Int, finalizedFeatures: Map[String, java.lang.Short], finalizedFeatureEpoch: Long): ApiVersionsResponse = {
+    throw new UnsupportedOperationException("This method is not supported in DefaultApiVersionManager, use apiVersionResponse(throttleTimeMs) instead")

Review Comment:
   The `features` in `ApiVersionResponse` is retrieved directly from `MetadataCache` in BrokerServer/KafkaServer, but in ControllerServer it can only be got asynchronized and can't be unified, so I added 2 different methods here.
    I'm still checking whether there are better way to handle this, one way is to add a synchronized method to controller, another is to make `ApiVersionManager.apiVersionResponse` asynchronized, both will introduce new problems, the root cause is that `ApiVersionRequest` is treated static but finalized features is changing dynamically.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] showuon commented on a diff in pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "showuon (via GitHub)" <gi...@apache.org>.
showuon commented on code in PR #13679:
URL: https://github.com/apache/kafka/pull/13679#discussion_r1188407891


##########
core/src/main/scala/kafka/server/ApiVersionManager.scala:
##########
@@ -80,7 +81,18 @@ class SimpleApiVersionManager(
   private val apiVersions = ApiVersionsResponse.collectApis(enabledApis.asJava, enableUnstableLastVersion)
 
   override def apiVersionResponse(requestThrottleMs: Int): ApiVersionsResponse = {
-    ApiVersionsResponse.createApiVersionsResponse(requestThrottleMs, apiVersions, brokerFeatures, zkMigrationEnabled)
+    throw new UnsupportedOperationException("This method is not supported in SimpleApiVersionManager, use apiVersionResponse(throttleTimeMs, finalizedFeatures, epoch) instead")

Review Comment:
   Actually it's not clear when to use `SimpleApiVersionManager` and when to use `DefaultApiVersionManager`. Could we add some comments above each class?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] dengziming merged pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "dengziming (via GitHub)" <gi...@apache.org>.
dengziming merged PR #13679:
URL: https://github.com/apache/kafka/pull/13679


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] dengziming commented on pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "dengziming (via GitHub)" <gi...@apache.org>.
dengziming commented on PR #13679:
URL: https://github.com/apache/kafka/pull/13679#issuecomment-1545197689

   The failed tests are flaky and also failed recently in other PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [kafka] ijuma commented on a diff in pull request #13679: KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse

Posted by "ijuma (via GitHub)" <gi...@apache.org>.
ijuma commented on code in PR #13679:
URL: https://github.com/apache/kafka/pull/13679#discussion_r1220723779


##########
core/src/main/scala/kafka/server/ApiVersionManager.scala:
##########
@@ -112,4 +157,8 @@ class DefaultApiVersionManager(
       zkMigrationEnabled
     )
   }
+
+  override def apiVersionResponse(throttleTimeMs: Int, finalizedFeatures: Map[String, java.lang.Short], finalizedFeatureEpoch: Long): ApiVersionsResponse = {
+    throw new UnsupportedOperationException("This method is not supported in DefaultApiVersionManager, use apiVersionResponse(throttleTimeMs) instead")

Review Comment:
   What's the thinking around this? It's extremely brittle to design interfaces like this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org