You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/03/18 22:17:22 UTC

[GitHub] [kafka] rondagostino opened a new pull request #10357: MINOR: KIP-500 Cluster ID is a String

rondagostino opened a new pull request #10357:
URL: https://github.com/apache/kafka/pull/10357


   It is possible that an existing Cluster ID in Zookeeper is not convertible to a UUID.  KIP-500 cannot constrain Cluster IDs to be convertible to a UUID and still expect such a cluster to be able to upgrade to KIP-500.  This patch removes this UUID constraint and makes KIP-500 consistent with existing RPCs, which treat Cluster ID as a String.
   
   It is unlikely that an existing Cluster ID in Zookeeper would not be convertible to a UUID because the Cluster ID is auto-generated in the form of a UUID and stored in ZooKeeper as soon as the first broker connects.  However, there is nothing preventing someone from generating a Cluster ID that does not confirm to the UUID format and manually storing/bootstrapping that into ZooKeeper before the first broker connects.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe merged pull request #10357: MINOR: The new KIP-500 code should treat cluster ID as a string

Posted by GitBox <gi...@apache.org>.
cmccabe merged pull request #10357:
URL: https://github.com/apache/kafka/pull/10357


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rondagostino commented on pull request #10357: MINOR: KIP-500 Cluster ID is a String

Posted by GitBox <gi...@apache.org>.
rondagostino commented on pull request #10357:
URL: https://github.com/apache/kafka/pull/10357#issuecomment-802344461


   This needs to be cherry-picked to 2.8.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] hachikuji commented on a change in pull request #10357: MINOR: KIP-500 Cluster ID is a String

Posted by GitBox <gi...@apache.org>.
hachikuji commented on a change in pull request #10357:
URL: https://github.com/apache/kafka/pull/10357#discussion_r597285632



##########
File path: core/src/main/scala/kafka/server/BrokerMetadataCheckpoint.scala
##########
@@ -106,14 +105,8 @@ object MetaProperties {
     value.getOrElse(throw new RuntimeException(s"Failed to find required property $key."))
   }
 
-  def requireClusterId(properties: RawMetaProperties): Uuid = {
-    val value = require(ClusterIdKey, properties.clusterId)
-    try {
-      Uuid.fromString(value)
-    } catch {
-      case e: Throwable => throw new RuntimeException(s"Failed to parse $ClusterIdKey property " +
-        s"as a UUID: ${e.getMessage}")
-    }
+  def requireClusterId(properties: RawMetaProperties): String = {

Review comment:
       We may as well get rid of this method. We are already using `require(NodeIdKey, properties.nodeId)` above.

##########
File path: core/src/test/scala/kafka/server/BrokerMetadataCheckpointTest.scala
##########
@@ -81,21 +80,21 @@ class BrokerMetadataCheckpointTest {
   }
 
   @Test
-  def testMetaPropertiesDoesNotAllowHexEncodedUUIDs(): Unit = {
+  def testMetaPropertiesAllowsHexEncodedUUIDs(): Unit = {
     val properties = new RawMetaProperties()
     properties.version = 1
     properties.clusterId = "7bc79ca1-9746-42a3-a35a-efb3cde44492"
     properties.nodeId = 1
-    assertThrows(classOf[RuntimeException], () => MetaProperties.parse(properties))
+    MetaProperties.parse(properties)

Review comment:
       Can we assert something about the return value? For example, we could follow the pattern in `testCreateMetadataProperties`.

##########
File path: core/src/test/scala/unit/kafka/server/KafkaRaftServerTest.scala
##########
@@ -32,7 +32,7 @@ class KafkaRaftServerTest {
 
   @Test
   def testSuccessfulLoadMetaProperties(): Unit = {
-    val clusterId = Uuid.randomUuid()
+    val clusterId = Uuid.randomUuid().toString

Review comment:
       Probably doesn't matter too much, but perhaps we could hard-code all of these to some string such as "JgxuGe9URy-E-ceaL04lEw" from the other test. The way it's written after this change looks a little odd.

##########
File path: core/src/main/scala/kafka/tools/StorageTool.scala
##########
@@ -198,7 +198,7 @@ object StorageTool extends Logging {
         s"does not appear to be a valid UUID: ${e.getMessage}")
     }
     require(config.nodeId >= 0, s"The node.id must be set to a non-negative integer.")
-    new MetaProperties(effectiveClusterId, config.nodeId)
+    new MetaProperties(effectiveClusterId.toString, config.nodeId)

Review comment:
       Should we also remove the logic above which requires that the clusterId parses as a UUID?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] hachikuji commented on pull request #10357: MINOR: KIP-500 Cluster ID is a String

Posted by GitBox <gi...@apache.org>.
hachikuji commented on pull request #10357:
URL: https://github.com/apache/kafka/pull/10357#issuecomment-802417977


   @vvcephei Where are we with the RC? Do we still have time to merge this to 2.8? We can do this after the release if necessary, but we'd prefer not to break the protocol if we don't have to.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] hachikuji commented on a change in pull request #10357: MINOR: KIP-500 Cluster ID is a String

Posted by GitBox <gi...@apache.org>.
hachikuji commented on a change in pull request #10357:
URL: https://github.com/apache/kafka/pull/10357#discussion_r597325275



##########
File path: core/src/main/scala/kafka/tools/StorageTool.scala
##########
@@ -198,7 +198,7 @@ object StorageTool extends Logging {
         s"does not appear to be a valid UUID: ${e.getMessage}")
     }
     require(config.nodeId >= 0, s"The node.id must be set to a non-negative integer.")
-    new MetaProperties(effectiveClusterId, config.nodeId)
+    new MetaProperties(effectiveClusterId.toString, config.nodeId)

Review comment:
       I'm ok with doing this later. Let's be sure to file a jira so we don't forget about it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rondagostino commented on a change in pull request #10357: MINOR: KIP-500 Cluster ID is a String

Posted by GitBox <gi...@apache.org>.
rondagostino commented on a change in pull request #10357:
URL: https://github.com/apache/kafka/pull/10357#discussion_r597324486



##########
File path: core/src/main/scala/kafka/tools/StorageTool.scala
##########
@@ -198,7 +198,7 @@ object StorageTool extends Logging {
         s"does not appear to be a valid UUID: ${e.getMessage}")
     }
     require(config.nodeId >= 0, s"The node.id must be set to a non-negative integer.")
-    new MetaProperties(effectiveClusterId, config.nodeId)
+    new MetaProperties(effectiveClusterId.toString, config.nodeId)

Review comment:
       > Should we also remove the [StorageTool] logic above which requires that the clusterId parses as a UUID 
   
   Basically, this is asking if `StorageTool` should support accepting non-UUIDs via its `--cluster-id` argument.  One purpose of the tool is to minimize the chance that a broker could use data from the wrong volume (i.e. data from another cluster).  Generating a random UUID via the `--random-uuid` parameter encourages using a globally unique value for every cluster and is consistent with the behavior today with ZooKeeper, whereas allowing a non-UUID here would increase the chance that someone could reuse a Cluster ID value across clusters and short-circuit the risk mitigation that this tool provides.
   
   Merging this PR now avoids an uncomfortable "rename" in `BrokerRegistrationRequest` (or a breaking change) after 2.8 is released; letting `StorageTool` accept non-UUIDs can be done at any time with no impact.  Maybe we keep this PR about removing the constraint, which also simply cannot exist -- otherwise we can't upgrade clusters that don't use a UUID -- and leave this question about the `--cluster-id` parameter for a broader discussion later?  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rondagostino commented on a change in pull request #10357: MINOR: KIP-500 Cluster ID is a String

Posted by GitBox <gi...@apache.org>.
rondagostino commented on a change in pull request #10357:
URL: https://github.com/apache/kafka/pull/10357#discussion_r597326794



##########
File path: core/src/main/scala/kafka/tools/StorageTool.scala
##########
@@ -198,7 +198,7 @@ object StorageTool extends Logging {
         s"does not appear to be a valid UUID: ${e.getMessage}")
     }
     require(config.nodeId >= 0, s"The node.id must be set to a non-negative integer.")
-    new MetaProperties(effectiveClusterId, config.nodeId)
+    new MetaProperties(effectiveClusterId.toString, config.nodeId)

Review comment:
       > file a jira so we don't forget about it. 
   
   https://issues.apache.org/jira/browse/KAFKA-12505




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe commented on pull request #10357: MINOR: KIP-500 Cluster ID is a String

Posted by GitBox <gi...@apache.org>.
cmccabe commented on pull request #10357:
URL: https://github.com/apache/kafka/pull/10357#issuecomment-802556947


   @vvcephei : I like @hachikuji 's idea of merging this to 2.8.  It's worth noting that the protocol that we're changing here is a new one which was introduced by kip-500, and is not used in the ZK code path.  So the risk to existing code should be low.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org