Posted to dev@kafka.apache.org by "Allen Wang (JIRA)" <ji...@apache.org> on 2015/06/24 02:03:43 UTC
[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

    [ https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598615#comment-14598615 ] 

Allen Wang commented on KAFKA-1215:
-----------------------------------

We now have a working solution for rack-aware assignment. It is based on the current patch for this JIRA, with some improvements. The key ideas of the solution are:

- Rack ID is a String instead of an integer
- For replica assignment, add an extra parameter of type Map[Int, String] to the assignReplicasToBrokers() method, mapping broker ID to rack ID
- Before doing the rack-aware assignment, sort the broker list so that brokers are interlaced by rack. In other words, adjacent brokers should not be in the same rack if possible. For example, assuming 6 brokers mapped to 3 racks:

0 -> "rack1", 1 -> "rack1", 2 -> "rack2", 3 -> "rack2", 4 -> "rack3", 5 -> "rack3"

The sorted broker list could be (0, 2, 4, 1, 3, 5)

- Apply the same assignment algorithm to assign replicas, with the addition of skipping a broker if its rack is already used for the same partition (similar to what is done in the current patch)

The benefit of this approach is that replica distribution is kept as even as possible across all racks and brokers.
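
To make the interlacing and rack-skipping steps concrete, here is a minimal Python sketch. It is not the patch itself: the function names `interlace_by_rack` and `assign_replicas` are hypothetical, the ring walk is a simplified stand-in for Kafka's existing assignment algorithm (no random start or per-replica shift), and falling back to reusing racks once every rack already holds a replica is an assumption about how a replication factor larger than the rack count might be handled.

```python
from collections import defaultdict
from itertools import zip_longest


def interlace_by_rack(broker_rack):
    """Order brokers round-robin over racks so neighbours differ in rack where possible."""
    by_rack = defaultdict(list)
    for broker in sorted(broker_rack):
        by_rack[broker_rack[broker]].append(broker)
    columns = [by_rack[rack] for rack in sorted(by_rack)]
    # Take one broker from each rack in turn; shorter racks simply run out early.
    return [b for row in zip_longest(*columns) for b in row if b is not None]


def assign_replicas(broker_rack, n_partitions, replication_factor, start_index=0):
    """Simplified ring assignment that skips brokers whose rack is already used."""
    brokers = interlace_by_rack(broker_rack)
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds number of brokers")
    n_racks = len(set(broker_rack.values()))
    assignment = {}
    for partition in range(n_partitions):
        replicas, used_racks = [], set()
        offset = start_index + partition
        while len(replicas) < replication_factor:
            broker = brokers[offset % len(brokers)]
            offset += 1
            if broker in replicas:
                continue
            rack = broker_rack[broker]
            # Skip a rack that already holds a replica, unless every rack does
            # (assumed fallback when replication_factor > number of racks).
            if rack in used_racks and len(used_racks) < n_racks:
                continue
            replicas.append(broker)
            used_racks.add(rack)
        assignment[partition] = replicas
    return assignment
```

With the six-broker mapping from the example, `interlace_by_rack` yields the order 0, 2, 4, 1, 3, 5, and with a replication factor of 3 partition 0 lands on brokers 0, 2 and 4, one per rack.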

With regard to KAFKA-1792, an easy solution is to restrict replica movement to within the same rack, which I think should work in most practical cases. It also has the added benefit that replicas usually move faster within a rack. So basically we can apply the algorithm described in KAFKA-1792 separately for each rack. For example, if there are three racks, apply the algorithm three times, each time with the broker list and assignment for that specific rack. Again, we assume the broker-to-rack mapping will be available in the method signature.
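
The per-rack application could look roughly like the Python sketch below. The KAFKA-1792 algorithm itself is not reproduced here: it is passed in as a hypothetical `rebalance(brokers, assignment)` callable, and `rebalance_within_racks` is likewise an illustrative name. Note that this sketch merges results rack by rack, so it does not preserve the original replica order within a partition.

```python
def rebalance_within_racks(assignment, broker_rack, rebalance):
    """Run rebalance(brokers, assignment) once per rack and merge the results.

    assignment:  dict of partition -> list of broker IDs
    broker_rack: dict of broker ID -> rack ID
    rebalance:   placeholder for the KAFKA-1792 algorithm (an assumption here)
    """
    merged = {partition: [] for partition in assignment}
    for rack in sorted(set(broker_rack.values())):
        # Restrict both the broker list and the assignment to this rack.
        rack_brokers = sorted(b for b, r in broker_rack.items() if r == rack)
        rack_assignment = {
            partition: [b for b in replicas if broker_rack[b] == rack]
            for partition, replicas in assignment.items()
        }
        new_assignment = rebalance(rack_brokers, rack_assignment)
        for partition, replicas in new_assignment.items():
            # Guard the invariant: replicas may only move within their rack.
            if any(broker_rack.get(b) != rack for b in replicas):
                raise ValueError("replica moved outside rack %r" % rack)
            merged.setdefault(partition, []).extend(replicas)
    return merged
```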

The open question is how to obtain the broker-to-rack mapping. The information can be supplied when Kafka registers the broker with ZooKeeper, which means some information has to be added to ZooKeeper. However, it could be that the rack information is already available elsewhere, independent of Kafka. For example, in some deployments the rack information may be available in a database. What we can do is abstract the API required to obtain rack information into an interface and allow users to supply an implementation on the command line or at broker startup (to handle auto topic creation).
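
One possible shape for such an interface is sketched below in Python. The names `RackLocator`, `rack_info` and `StaticRackLocator`, as well as the `"broker:rack,..."` string format, are all hypothetical and only illustrate the idea. A ZooKeeper-backed or database-backed implementation would plug in the same way.

```python
from abc import ABC, abstractmethod


class RackLocator(ABC):
    """Hypothetical interface: the single source of the broker-to-rack mapping."""

    @abstractmethod
    def rack_info(self):
        """Return a dict mapping broker ID (int) to rack ID (str)."""


class StaticRackLocator(RackLocator):
    """Example implementation fed from a string such as '0:rack1,1:rack1,2:rack2'."""

    def __init__(self, spec):
        self._mapping = {}
        for entry in spec.split(","):
            if not entry.strip():
                continue  # tolerate empty entries and trailing commas
            broker, rack = entry.split(":", 1)
            self._mapping[int(broker)] = rack.strip()

    def rack_info(self):
        # Return a copy so callers cannot mutate the locator's state.
        return dict(self._mapping)
```

The assignment code then depends only on `rack_info()`, so the same logic serves both the admin command line and broker startup.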

> Rack-Aware replica assignment option
> ------------------------------------
>
>                 Key: KAFKA-1215
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1215
>             Project: Kafka
>          Issue Type: Improvement
>          Components: replication
>    Affects Versions: 0.8.0
>            Reporter: Joris Van Remoortere
>            Assignee: Jun Rao
>             Fix For: 0.9.0
>
>         Attachments: rack_aware_replica_assignment_v1.patch, rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to the Kafka config. This rack-id can be used during replica assignment by using the max-rack-replication argument in the admin scripts (create topic, etc.). By default the original replica assignment algorithm is used because max-rack-replication defaults to -1. max-rack-replication > -1 is not honored if you are doing manual replica assignment (preferred).
> If this looks good I can add some test cases specific to the rack-aware assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production and need this, so I wrote the patch against that.


