Posted to dev@kafka.apache.org by "Jay Kreps (JIRA)" <ji...@apache.org> on 2013/02/08 23:55:13 UTC

[jira] [Updated] (KAFKA-554) Move all per-topic configuration into ZK and add to the CreateTopicCommand

     [ https://issues.apache.org/jira/browse/KAFKA-554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Kreps updated KAFKA-554:
----------------------------

    Attachment: KAFKA-554-v1.patch

This patch does two things:
1. Implement a dynamic configuration mechanism for topics
2. Remove the scripts bin/kafka-list-topic.sh, bin/kafka-delete-topic.sh, and bin/kafka-create-topic.sh and replace them with a new, more powerful tool:
jay@ahab:kafka> bin/kafka-topics.sh
Command must include exactly one action: --list, --describe, --create, --delete, or --alter
Option                                  Description                            
------                                  -----------                            
--alter                                 Alter the configuration for the topic. 
--config <name=value>                   A topic configuration for this topic.  
--create                                Create a new topic.                    
--delete                                Delete the topic.                      
--describe                              List details for the given topics.     
--help                                  Print usage information.               
--list                                  List all available topics.             
--partitions <Integer: # of partitions> The number of partitions for the topic.
--replica-assignment                    A list of manual partition-to-broker   
  <broker_id_for_part1_replica1 :         assignments.                         
  broker_id_for_part1_replica2 ,                                               
  broker_id_for_part2_replica1 :                                               
  broker_id_for_part2_replica2 , ...>                                          
--replication-factor <Integer:          The replication factor for each        
  replication factor>                     partition in the topic.              
--topic <topic>                         The topic to be created.               
--zookeeper <urls>                      REQUIRED: The connection string for    
                                          the zookeeper connection in the form 
                                          host:port. Multiple URLS can be      
                                          given to allow fail-over.  

This command line tool can list topics, describe topics, create topics, delete topics, or change the configuration for topics.

Here is an example of creating two topics with overrides:
./bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic first_topic --topic second_topic --replication-factor 1 --partitions 4 --config segment.bytes=1073741824 --config retention.ms=1000000
Created topic "first_topic".
Created topic "second_topic".

(Any action that takes a --topic option can operate on a list of topics by giving the --topic flag more than once.)

./bin/kafka-topics.sh  --zookeeper localhost:2181 --list
first_topic
second_topic

./bin/kafka-topics.sh  --zookeeper localhost:2181 --describe --topic second_topic
second_topic
	configs: segment.bytes = 1073741824, retention.ms = 1000000
	partitions: 4
		partition 0
		leader: 0 (ahab.linkedin.biz:9092)
		replicas: 0 (ahab.linkedin.biz:9092)
		isr: 0 (ahab.linkedin.biz:9092)
		partition 1
		leader: 0 (ahab.linkedin.biz:9092)
		replicas: 0 (ahab.linkedin.biz:9092)
		isr: 0 (ahab.linkedin.biz:9092)
		partition 2
		leader: 0 (ahab.linkedin.biz:9092)
		replicas: 0 (ahab.linkedin.biz:9092)
		isr: 0 (ahab.linkedin.biz:9092)
		partition 3
		leader: 0 (ahab.linkedin.biz:9092)
		replicas: 0 (ahab.linkedin.biz:9092)
		isr: 0 (ahab.linkedin.biz:9092)

The configuration for a topic can be changed later by running:
./bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic first_topic --config segment.bytes=673741824 --config retention.ms=500000
Updated config for topic "first_topic".

The dynamic config is implemented by adding a new zookeeper path
  /config
This path has two subdirectories:
  /config/topics/<topic_name>
and
  /config/changes
The per-topic path contains any override properties specified for that topic, stored in java.util.Properties format. If no overrides are given, the znode does not exist. The defaults are still taken from the server.properties file.
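
For illustration, here is a rough sketch of how a broker can combine an override znode's payload with the broker-wide defaults (the object and method names here are just for the example, not classes from the patch):

    import java.io.StringReader
    import java.util.Properties

    object TopicConfigSketch {
      // Parse the payload of /config/topics/<topic_name>, which is stored in
      // java.util.Properties text format (key=value lines).
      def parseOverrides(znodePayload: String): Properties = {
        val overrides = new Properties()
        overrides.load(new StringReader(znodePayload))
        overrides
      }

      // Overlay the overrides on the defaults from server.properties: keys present
      // in the override znode win, everything else falls back to the defaults.
      def effectiveConfig(defaults: Properties, overrides: Properties): Properties = {
        val merged = new Properties()
        merged.putAll(defaults)
        merged.putAll(overrides)
        merged
      }
    }

For the first_topic example above, the override payload carries segment.bytes=1073741824 and retention.ms=1000000, and every other property keeps its server.properties value.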

The /config/changes path is used to reduce the number of watches required. Instead of keeping a watch on each config override znode, whenever we update a config entry we also add a sequential entry under the changes directory containing the name of the topic whose config changed. Each broker keeps a watch on this directory and caches the last change it has executed. When the watch fires, the broker executes any new config changes. Old change entries are garbage collected after 10 minutes. This processing is handled by a new class, TopicConfigManager.
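
Here is an illustrative sketch of that notification scheme (not the actual TopicConfigManager from the patch) built on the ZkClient child-change listener; the class name, the znode naming assumption, and the applyConfigFor callback are all placeholders for the example:

    import java.util.concurrent.atomic.AtomicLong
    import scala.collection.JavaConverters._
    import org.I0Itec.zkclient.{IZkChildListener, ZkClient}

    class ConfigChangeWatcherSketch(zkClient: ZkClient, applyConfigFor: String => Unit) {
      private val changesPath = "/config/changes"
      // Highest change sequence number this broker has already executed.
      private val lastExecuted = new AtomicLong(-1L)

      def startWatching(): Unit = {
        zkClient.subscribeChildChanges(changesPath, new IZkChildListener {
          def handleChildChange(parent: String, children: java.util.List[String]): Unit = {
            if (children != null)
              processChanges(children.asScala.toSeq)
          }
        })
      }

      // Each child is a sequential znode whose payload is the name of the topic whose
      // config changed; re-read that topic's overrides and apply them, oldest first.
      private def processChanges(children: Seq[String]): Unit = {
        for (child <- children.sorted) {
          val seq = child.filter(_.isDigit).toLong // assumes names ending in a sequence number
          if (seq > lastExecuted.get) {
            val topic = zkClient.readData[String](changesPath + "/" + child)
            applyConfigFor(topic)
            lastExecuted.set(seq)
          }
        }
      }
    }

(The 10-minute garbage collection of old change entries is omitted here.)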

This patch also has two refactorings:
1. Renamed KafkaZookeeper to KafkaHealthcheck
2. Moved logic for creating topics out of CreateTopicCommand and replaced it with two utilities in AdminUtils:
       def createTopic(zkClient: ZkClient,
                       topic: String,
                       partitions: Int,
                       replicationFactor: Int,
                       topicConfig: Properties = new Properties)

       def createTopicWithAssignment(zkClient: ZkClient,
                                     topic: String,
                                     partitionReplicaAssignment: Map[Int, Seq[Int]],
                                     config: Properties = new Properties)
The first method chooses a partition assignment itself; the second just sanity-checks the assignment it is given.
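
A usage sketch of these two methods, mirroring the kafka-topics.sh example above (the kafka.admin package path and the bare ZkClient construction are assumptions; inside the broker the client is created with Kafka's ZK string serializer):

    import java.util.Properties
    import org.I0Itec.zkclient.ZkClient
    import kafka.admin.AdminUtils

    object CreateTopicExample {
      def main(args: Array[String]): Unit = {
        val zkClient = new ZkClient("localhost:2181", 30000, 30000)

        // Equivalent of --create --topic first_topic --partitions 4 --replication-factor 1
        // with the same config overrides as the command-line example above.
        val overrides = new Properties()
        overrides.put("segment.bytes", "1073741824")
        overrides.put("retention.ms", "1000000")
        AdminUtils.createTopic(zkClient, "first_topic", 4, 1, overrides)

        // Or pin the replicas explicitly: partitions 0 and 1, each assigned only to broker 0.
        AdminUtils.createTopicWithAssignment(zkClient, "second_topic", Map(0 -> Seq(0), 1 -> Seq(0)))

        zkClient.close()
      }
    }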

I had originally planned to implement an RPC API to create, delete, and alter topics, but I backed away from this since we don't yet seem to have a sane way to organize admin functionality.

I think the first step in cleaning this up is probably to refactor AdminUtils into a sane Admin client with methods that match the high-level administrative operations. This would still interact directly with zookeeper. It would be a reasonable starting point, since one could then at least implement a web console on top of this class even if the functionality was not available to other languages. In any case, that is beyond the scope of this patch.
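
Purely as a sketch of that direction (none of these names exist in the patch), such an interface might look something like:

    import java.util.Properties

    // Hypothetical high-level admin interface; a first implementation would simply
    // delegate to zookeeper the way AdminUtils does today.
    trait TopicAdmin {
      def listTopics(): Seq[String]
      def describeTopic(topic: String): TopicDescription
      def createTopic(topic: String, partitions: Int, replicationFactor: Int,
                      config: Properties = new Properties): Unit
      def alterTopicConfig(topic: String, overrides: Properties): Unit
      def deleteTopic(topic: String): Unit
    }

    // Minimal value type for --describe style results.
    case class TopicDescription(topic: String,
                                config: Properties,
                                replicaAssignment: Map[Int, Seq[Int]])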
                
> Move all per-topic configuration into ZK and add to the CreateTopicCommand
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-554
>                 URL: https://issues.apache.org/jira/browse/KAFKA-554
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Jay Kreps
>              Labels: project
>             Fix For: 0.8.1
>
>         Attachments: KAFKA-554-v1.patch
>
>
> We have a number of per-topic configurations that control message retention and flush interval. Here is the list of properties I find in KafkaConfig that appear to be per-topic:
>   topic.log.file.size
>   topic.log.roll.hours
>   topic.log.retention.hours
>   topic.log.retention.size
>   topic.flush.intervals.ms
> Currently we specify these in server.properties. This is not a good solution, as it requires a rolling bounce of the cluster to make a change, which just doesn't scale once you have hundreds of topics. Also, the map encoded in a CSV string is kind of hacky.
> We should move these into ZK in some kind of JSON blob that makes it easy to add new per-topic configs, and we should remove them from server.properties.
> It would be good to start with a wiki design and get consensus on that first.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira