Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/03/26 12:15:06 UTC

[GitHub] [kafka] ijuma opened a new pull request #10414: MINOR: Self-managed -> Raft

ijuma opened a new pull request #10414:
URL: https://github.com/apache/kafka/pull/10414


   "Self-managed" is also used in the context of Cloud vs on-prem deployments, so it can
   be confusing.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ijuma commented on a change in pull request #10414: MINOR: Self-managed -> KRaft (Kafka Raft)

Posted by GitBox <gi...@apache.org>.
ijuma commented on a change in pull request #10414:
URL: https://github.com/apache/kafka/pull/10414#discussion_r603591568



##########
File path: raft/bin/test-kraft-server-start.sh
##########
@@ -17,7 +17,7 @@
 base_dir=$(dirname $0)
 
 if [ "x$KAFKA_LOG4J_OPTS" = "x" ]; then
-    export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/raft-log4j.properties"
+    export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/kraft-log4j.properties"

Review comment:
    @jsancio The folder names for the modules don't include `kafka` even though the published artifacts do (e.g. the `clients` module publishes `kafka-clients`). The artifact for the `raft` module is `kafka-raft` and it was originally released in 2.7. I don't want to change that right now. Similarly, for the package names, since they're `kafka.raft`, I think we can leave it like that.





[GitHub] [kafka] ijuma merged pull request #10414: MINOR: Self-managed -> KRaft (Kafka Raft)

Posted by GitBox <gi...@apache.org>.
ijuma merged pull request #10414:
URL: https://github.com/apache/kafka/pull/10414


   



[GitHub] [kafka] cmccabe commented on pull request #10414: MINOR: Self-managed -> Kraft (Kafka Raft)

Posted by GitBox <gi...@apache.org>.
cmccabe commented on pull request #10414:
URL: https://github.com/apache/kafka/pull/10414#issuecomment-809550143


   @ijuma: I think "KRaft" would be better than "Kraft" since it emphasizes that Kafka Raft is a variant of Raft.  Of course, in the case of file paths, all-lowercase is fine (that's more in line with how we do other paths).
   
   I agree with @jsancio that we may want to consider renaming some internal classes.  However, we should think about doing that in a separate PR to avoid making this one bigger.
   
   LGTM once we rename Kraft -> KRaft



[GitHub] [kafka] jsancio commented on a change in pull request #10414: MINOR: Self-managed -> Kraft (Kafka Raft)

Posted by GitBox <gi...@apache.org>.
jsancio commented on a change in pull request #10414:
URL: https://github.com/apache/kafka/pull/10414#discussion_r603413500



##########
File path: config/kraft/README.md
##########
@@ -119,8 +131,8 @@ If you encounter an issue, you might want to take a look at the metadata log.
 One way to view the metadata log is with kafka-dump-log.sh tool, like so:
 
 ~~~~
-[cmccabe@zeratul kafka3]$ ./bin/kafka-dump-log.sh  --cluster-metadata-decoder --skip-record-metadata --files /tmp/self-managed-combined-logs/\@metadata-0/*.log
-Dumping /tmp/self-managed-combined-logs/@metadata-0/00000000000000000000.log
+[cmccabe@zeratul kafka3]$ ./bin/kafka-dump-log.sh  --cluster-metadata-decoder --skip-record-metadata --files /tmp/kraft-combined-logs/\@metadata-0/*.log

Review comment:
       We should remove personal information.
   ```suggestion
   $ ./bin/kafka-dump-log.sh  --cluster-metadata-decoder --skip-record-metadata --files /tmp/kraft-combined-logs/\@metadata-0/*.log
   ```

##########
File path: raft/bin/test-kraft-server-start.sh
##########
@@ -17,7 +17,7 @@
 base_dir=$(dirname $0)
 
 if [ "x$KAFKA_LOG4J_OPTS" = "x" ]; then
-    export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/raft-log4j.properties"
+    export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/kraft-log4j.properties"

Review comment:
       I like that we are renaming a few of the files in the module (`raft`). How far do we want to take this? For example:
   1. Should we rename the `raft` module to `kraft`?
   2. Should we rename the `org.apache.kafka.raft` package to `org.apache.kafka.kraft`? Maybe this is okay since `kafka` precedes `raft`.
   
   We don't have to do this in this PR. I am just interested to know whether we are only concerned with public-facing names or whether we should also rename internal names.
   
   cc @hachikuji 





[GitHub] [kafka] cmccabe commented on a change in pull request #10414: MINOR: Self-managed -> Kraft (Kafka Raft)

Posted by GitBox <gi...@apache.org>.
cmccabe commented on a change in pull request #10414:
URL: https://github.com/apache/kafka/pull/10414#discussion_r603464758



##########
File path: config/kraft/README.md
##########
@@ -119,8 +131,8 @@ If you encounter an issue, you might want to take a look at the metadata log.
 One way to view the metadata log is with kafka-dump-log.sh tool, like so:
 
 ~~~~
-[cmccabe@zeratul kafka3]$ ./bin/kafka-dump-log.sh  --cluster-metadata-decoder --skip-record-metadata --files /tmp/self-managed-combined-logs/\@metadata-0/*.log
-Dumping /tmp/self-managed-combined-logs/@metadata-0/00000000000000000000.log
+[cmccabe@zeratul kafka3]$ ./bin/kafka-dump-log.sh  --cluster-metadata-decoder --skip-record-metadata --files /tmp/kraft-combined-logs/\@metadata-0/*.log

Review comment:
       Yeah, this change makes sense





[GitHub] [kafka] rondagostino commented on a change in pull request #10414: MINOR: Self-managed -> Raft

Posted by GitBox <gi...@apache.org>.
rondagostino commented on a change in pull request #10414:
URL: https://github.com/apache/kafka/pull/10414#discussion_r602270763



##########
File path: config/raft/README.md
##########
@@ -65,16 +68,19 @@ controllers, you can tolerate 1 failure; with 5 controllers, you can tolerate 2
 ## Process Roles
 Each Kafka server now has a new configuration key called `process.roles` which can have the following values:
 
-* If `process.roles` is set to `broker`, the server acts as a self-managed broker.
-* If `process.roles` is set to `controller`, the server acts as a self-managed controller.
-* If `process.roles` is set to `broker,controller`, the server acts as both a self-managed broker and a self-managed controller.
-* If `process.roles` is not set at all then we are assumed to be in ZooKeeper mode.  As mentioned earlier, you can't currently transition back and forth between ZK mode and self-managed mode without reformatting.
+* If `process.roles` is set to `broker`, the server acts as a broker.
+* If `process.roles` is set to `controller`, the server acts as a controller.
+* If `process.roles` is set to `broker,controller`, the server acts as both a broker and a controller.
+* If `process.roles` is not set at all then we are assumed to be in ZooKeeper mode.  As mentioned earlier, you can't currently transition back and forth between ZK mode and Raft mode without reformatting.

Review comment:
       `s/ZK/ZooKeeper/` as was done below?
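
For readers following the thread, a hedged sketch of what a combined-mode configuration might look like in practice. The keys mirror the README excerpt above; the file path, host, and port values are illustrative assumptions, not taken from the PR:

```sh
# Write a throwaway combined-mode config for local experimentation (all values are placeholders).
cat > /tmp/kraft-combined.properties <<'EOF'
# Acts as both a broker and a controller (a "combined" node)
process.roles=broker,controller
# Must match the ID this host is given in controller.quorum.voters
node.id=1
# Single-voter quorum; fine for a one-node test cluster
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
EOF
```

A real deployment would also set the usual broker options (log directories, advertised listeners, and so on); only the keys discussed in the excerpt are shown here.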

##########
File path: config/raft/README.md
##########
@@ -84,21 +90,26 @@ listeners=CONTROLLER://controller1.example.com:9093
 controller.quorum.voters=1@controller1.example.com:9093,2@controller2.example.com:9093,3@controller3.example.com:9093
 ```
 
-Each broker and each controller must set `controller.quorum.voters`.  Note that the node ID supplied in the controller.quorum.voters configuration must match that supplied to the server.  So on controller1, node.id must be set to 1, and so forth.  Note that there is no requirement for controller IDs to start at 0 or 1.  However, the easiest and least confusing way to allocate node IDs is probably just to give each server a numeric ID, starting from 0.
+Each broker and each controller must set `controller.quorum.voters`.  Note that the node ID supplied in the `controller.quorum.voters` configuration must match that supplied to the server.
+So on controller1, node.id must be set to 1, and so forth.  Note that there is no requirement for controller IDs to start at 0 or 1.  However, the easiest and least confusing way to allocate
+node IDs is probably just to give each server a numeric ID, starting from 0.

Review comment:
       Just noting that I think these are just line break changes.
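
To make the ID-matching rule concrete, a hedged sketch that stamps out one dedicated-controller config per voter. The host names reuse the README's own example; everything else is illustrative:

```sh
# Every node shares the same voters string; node.id varies per host and must equal that host's voter ID.
VOTERS="1@controller1.example.com:9093,2@controller2.example.com:9093,3@controller3.example.com:9093"
for id in 1 2 3; do
  cat > "/tmp/controller-${id}.properties" <<EOF
process.roles=controller
node.id=${id}
controller.quorum.voters=${VOTERS}
EOF
done
```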

##########
File path: config/raft/README.md
##########
@@ -84,21 +90,26 @@ listeners=CONTROLLER://controller1.example.com:9093
 controller.quorum.voters=1@controller1.example.com:9093,2@controller2.example.com:9093,3@controller3.example.com:9093
 ```
 
-Each broker and each controller must set `controller.quorum.voters`.  Note that the node ID supplied in the controller.quorum.voters configuration must match that supplied to the server.  So on controller1, node.id must be set to 1, and so forth.  Note that there is no requirement for controller IDs to start at 0 or 1.  However, the easiest and least confusing way to allocate node IDs is probably just to give each server a numeric ID, starting from 0.
+Each broker and each controller must set `controller.quorum.voters`.  Note that the node ID supplied in the `controller.quorum.voters` configuration must match that supplied to the server.
+So on controller1, node.id must be set to 1, and so forth.  Note that there is no requirement for controller IDs to start at 0 or 1.  However, the easiest and least confusing way to allocate
+node IDs is probably just to give each server a numeric ID, starting from 0.
 
 Note that clients never need to configure `controller.quorum.voters`; only servers do.
 
 ## Kafka Storage Tool
 As described above in the QuickStart section, you must use the `kafka-storage.sh` tool to generate a cluster ID for your new cluster, and then run the format command on each node before starting the node.
 
-This is different from how Kafka has operated in the past.  Previously, Kafka would format blank storage directories automatically, and also generate a new cluster UUID automatically.  One reason for the change is that auto-formatting can sometimes obscure an error condition.  For example, under UNIX, if a data directory can't be mounted, it may show up as blank.  In this case, auto-formatting would be the wrong thing to do.
+This is different from how Kafka has operated in the past.  Previously, Kafka would format blank storage directories automatically, and also generate a new cluster UUID automatically.  One reason for the change
+is that auto-formatting can sometimes obscure an error condition.  For example, under UNIX, if a data directory can't be mounted, it may show up as blank.  In this case, auto-formatting would be the wrong thing to do.

Review comment:
       just noting linebreak-only changes
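
As a sketch of the format-before-start workflow the excerpt describes (the `kafka-storage.sh` sub-commands match the KRaft quickstart; the config path is an assumption):

```sh
# Generate one cluster ID, format this node's log directories with it, then start the server.
CLUSTER_ID="$(./bin/kafka-storage.sh random-uuid)"
./bin/kafka-storage.sh format -t "$CLUSTER_ID" -c ./config/kraft/server.properties
./bin/kafka-server-start.sh ./config/kraft/server.properties
```

The same cluster ID must be reused when formatting every node in the cluster, which is why the excerpt stresses generating it once up front.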

##########
File path: config/raft/README.md
##########
@@ -84,21 +90,26 @@ listeners=CONTROLLER://controller1.example.com:9093
 controller.quorum.voters=1@controller1.example.com:9093,2@controller2.example.com:9093,3@controller3.example.com:9093
 ```
 
-Each broker and each controller must set `controller.quorum.voters`.  Note that the node ID supplied in the controller.quorum.voters configuration must match that supplied to the server.  So on controller1, node.id must be set to 1, and so forth.  Note that there is no requirement for controller IDs to start at 0 or 1.  However, the easiest and least confusing way to allocate node IDs is probably just to give each server a numeric ID, starting from 0.
+Each broker and each controller must set `controller.quorum.voters`.  Note that the node ID supplied in the `controller.quorum.voters` configuration must match that supplied to the server.
+So on controller1, node.id must be set to 1, and so forth.  Note that there is no requirement for controller IDs to start at 0 or 1.  However, the easiest and least confusing way to allocate
+node IDs is probably just to give each server a numeric ID, starting from 0.
 
 Note that clients never need to configure `controller.quorum.voters`; only servers do.
 
 ## Kafka Storage Tool
 As described above in the QuickStart section, you must use the `kafka-storage.sh` tool to generate a cluster ID for your new cluster, and then run the format command on each node before starting the node.
 
-This is different from how Kafka has operated in the past.  Previously, Kafka would format blank storage directories automatically, and also generate a new cluster UUID automatically.  One reason for the change is that auto-formatting can sometimes obscure an error condition.  For example, under UNIX, if a data directory can't be mounted, it may show up as blank.  In this case, auto-formatting would be the wrong thing to do.
+This is different from how Kafka has operated in the past.  Previously, Kafka would format blank storage directories automatically, and also generate a new cluster UUID automatically.  One reason for the change
+is that auto-formatting can sometimes obscure an error condition.  For example, under UNIX, if a data directory can't be mounted, it may show up as blank.  In this case, auto-formatting would be the wrong thing to do.
 
-This is particularly important for the metadata log maintained by the controller servers.  If two controllers out of three controllers were able to start with blank logs, a leader might be able to be elected with nothing in the log, which would cause all metadata to be lost.
+This is particularly important for the metadata log maintained by the controller servers.  If two controllers out of three controllers were able to start with blank logs, a leader might be able to be elected with
+nothing in the log, which would cause all metadata to be lost.

Review comment:
       just noting linebreak-only changes

##########
File path: config/raft/README.md
##########
@@ -65,16 +68,19 @@ controllers, you can tolerate 1 failure; with 5 controllers, you can tolerate 2
 ## Process Roles
 Each Kafka server now has a new configuration key called `process.roles` which can have the following values:
 
-* If `process.roles` is set to `broker`, the server acts as a self-managed broker.
-* If `process.roles` is set to `controller`, the server acts as a self-managed controller.
-* If `process.roles` is set to `broker,controller`, the server acts as both a self-managed broker and a self-managed controller.
-* If `process.roles` is not set at all then we are assumed to be in ZooKeeper mode.  As mentioned earlier, you can't currently transition back and forth between ZK mode and self-managed mode without reformatting.
+* If `process.roles` is set to `broker`, the server acts as a broker.
+* If `process.roles` is set to `controller`, the server acts as a controller.
+* If `process.roles` is set to `broker,controller`, the server acts as both a broker and a controller.
+* If `process.roles` is not set at all then we are assumed to be in ZooKeeper mode.  As mentioned earlier, you can't currently transition back and forth between ZK mode and Raft mode without reformatting.
 
 Nodes that act as both brokers and controllers are referred to as "combined" nodes.  Combined nodes are simpler to operate for simple use cases and allow you to avoid
-some of the fixed memory overheads associated with JVMs.  The key disadvantage is that the controller will be less isolated from the rest of the system.  For example, if activity on the broker causes an out of memory condition, the controller part of the server is not isolated from that OOM condition.
+some fixed memory overheads associated with JVMs.  The key disadvantage is that the controller will be less isolated from the rest of the system.  For example, if activity on the broker causes an out of
+memory condition, the controller part of the server is not isolated from that OOM condition.
 
 ## Quorum Voters
-All nodes in the system must set the `controller.quorum.voters` configuration.  This identifies the quorum controller servers that should be used.  All the controllers must be enumerated.  This is similar to how, when using ZooKeeper, the `zookeeper.connect` configuration must contain all the ZooKeeper servers.  Unlike with the ZK config, however, `controller.quorum.voters` also has IDs for each node.  The format is id1@host1:port1,id2@host2:port2, etc.
+All nodes in the system must set the `controller.quorum.voters` configuration.  This identifies the quorum controller servers that should be used.  All the controllers must be enumerated.
+This is similar to how, when using ZooKeeper, the `zookeeper.connect` configuration must contain all the ZooKeeper servers.  Unlike with the ZooKeeper config, however, `controller.quorum.voters`
+also has IDs for each node.  The format is id1@host1:port1,id2@host2:port2, etc.

Review comment:
       Just noting that the change here was `s/ZK/ZooKeeper/` since the added line breaks make it difficult to see quickly what the change actually was.

##########
File path: config/raft/README.md
##########
@@ -84,21 +90,26 @@ listeners=CONTROLLER://controller1.example.com:9093
 controller.quorum.voters=1@controller1.example.com:9093,2@controller2.example.com:9093,3@controller3.example.com:9093
 ```
 
-Each broker and each controller must set `controller.quorum.voters`.  Note that the node ID supplied in the controller.quorum.voters configuration must match that supplied to the server.  So on controller1, node.id must be set to 1, and so forth.  Note that there is no requirement for controller IDs to start at 0 or 1.  However, the easiest and least confusing way to allocate node IDs is probably just to give each server a numeric ID, starting from 0.
+Each broker and each controller must set `controller.quorum.voters`.  Note that the node ID supplied in the `controller.quorum.voters` configuration must match that supplied to the server.
+So on controller1, node.id must be set to 1, and so forth.  Note that there is no requirement for controller IDs to start at 0 or 1.  However, the easiest and least confusing way to allocate
+node IDs is probably just to give each server a numeric ID, starting from 0.
 
 Note that clients never need to configure `controller.quorum.voters`; only servers do.
 
 ## Kafka Storage Tool
 As described above in the QuickStart section, you must use the `kafka-storage.sh` tool to generate a cluster ID for your new cluster, and then run the format command on each node before starting the node.
 
-This is different from how Kafka has operated in the past.  Previously, Kafka would format blank storage directories automatically, and also generate a new cluster UUID automatically.  One reason for the change is that auto-formatting can sometimes obscure an error condition.  For example, under UNIX, if a data directory can't be mounted, it may show up as blank.  In this case, auto-formatting would be the wrong thing to do.
+This is different from how Kafka has operated in the past.  Previously, Kafka would format blank storage directories automatically, and also generate a new cluster UUID automatically.  One reason for the change
+is that auto-formatting can sometimes obscure an error condition.  For example, under UNIX, if a data directory can't be mounted, it may show up as blank.  In this case, auto-formatting would be the wrong thing to do.
 
-This is particularly important for the metadata log maintained by the controller servers.  If two controllers out of three controllers were able to start with blank logs, a leader might be able to be elected with nothing in the log, which would cause all metadata to be lost.
+This is particularly important for the metadata log maintained by the controller servers.  If two controllers out of three controllers were able to start with blank logs, a leader might be able to be elected with
+nothing in the log, which would cause all metadata to be lost.
 
 # Missing Features
-We do not yet support generating or loading KIP-630 metadata snapshots.  This means that after a while, the time required to restart a broker will become very large.  This is a known issue and we are working on completing snapshots for the next release.
+We do not yet support generating or loading KIP-630 metadata snapshots.  This means that after a while, the time required to restart a broker will become very large.  This is a known issue and we are working on
+completing snapshots for the next release.

Review comment:
       just noting linebreak-only changes

##########
File path: config/raft/README.md
##########
@@ -65,16 +68,19 @@ controllers, you can tolerate 1 failure; with 5 controllers, you can tolerate 2
 ## Process Roles
 Each Kafka server now has a new configuration key called `process.roles` which can have the following values:
 
-* If `process.roles` is set to `broker`, the server acts as a self-managed broker.
-* If `process.roles` is set to `controller`, the server acts as a self-managed controller.
-* If `process.roles` is set to `broker,controller`, the server acts as both a self-managed broker and a self-managed controller.
-* If `process.roles` is not set at all then we are assumed to be in ZooKeeper mode.  As mentioned earlier, you can't currently transition back and forth between ZK mode and self-managed mode without reformatting.
+* If `process.roles` is set to `broker`, the server acts as a broker.
+* If `process.roles` is set to `controller`, the server acts as a controller.
+* If `process.roles` is set to `broker,controller`, the server acts as both a broker and a controller.

Review comment:
    It feels to me that we need to mention "Raft mode" in these lines somehow.  Perhaps `...the server acts as <a broker|a controller|both a broker and a controller> in Raft mode`?  Or maybe `acts in Raft mode as <a broker|a controller|both a broker and a controller>`?

##########
File path: config/raft/broker.properties
##########
@@ -14,13 +14,13 @@
 # limitations under the License.
 
 #
-# This configuration file is intended for use in self-managed mode, where
-# Apache ZooKeeper is not present.  See config/self-managed/README.md for details.
+# This configuration file is intended for use in Raft mode, where
+# Apache ZooKeeper is not present.  See config/raft/README.md for details.
 #
 
 ############################# Server Basics #############################
 
-# The role of this server. Setting this puts us in self-managed mode
+# The role of this server. Setting this puts us in raft mode

Review comment:
       `s/raft mode/Raft mode/` for consistency in capitalization






[GitHub] [kafka] ijuma commented on pull request #10414: MINOR: Self-managed -> Raft

Posted by GitBox <gi...@apache.org>.
ijuma commented on pull request #10414:
URL: https://github.com/apache/kafka/pull/10414#issuecomment-808245571


   Thanks @rondagostino. I addressed the review comments.



[GitHub] [kafka] jsancio commented on a change in pull request #10414: MINOR: Self-managed -> KRaft (Kafka Raft)

Posted by GitBox <gi...@apache.org>.
jsancio commented on a change in pull request #10414:
URL: https://github.com/apache/kafka/pull/10414#discussion_r603607525



##########
File path: raft/bin/test-kraft-server-start.sh
##########
@@ -17,7 +17,7 @@
 base_dir=$(dirname $0)
 
 if [ "x$KAFKA_LOG4J_OPTS" = "x" ]; then
-    export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/raft-log4j.properties"
+    export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:$base_dir/../config/kraft-log4j.properties"

Review comment:
       Good to know about the artifact names. Sounds good to me.



