Posted to dev@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/10/10 11:41:20 UTC

[jira] [Commented] (KAFKA-2091) Expose a Partitioner interface in the new producer

    [ https://issues.apache.org/jira/browse/KAFKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15562055#comment-15562055 ] 

ASF GitHub Bot commented on KAFKA-2091:
---------------------------------------

GitHub user iconara opened a pull request:

    https://github.com/apache/kafka/pull/2000

    Make Partitioner a Closeable and close it when closing the producer

    Even though Partitioner has a close method, it is not closed when the producer is closed. Serializers, interceptors, and metrics are all closed, so partitioners should be closed too.
    
    Looking at [KAFKA-2091](https://issues.apache.org/jira/browse/KAFKA-2091) (d6c45c70fb9773043766446e88370db9709e7995), which introduced the `Partitioner` interface, it looks like the intention was for the producer to close the partitioner.
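
A minimal sketch of the behaviour described above, not the actual patch. `SketchPartitioner`, `SketchProducer`, and `RecordingPartitioner` are hypothetical stand-ins for the real Kafka classes; the only point is that a partitioner that is a `Closeable` gets closed when the producer is closed.

```java
import java.io.Closeable;

public class CloseSketch {
    // Hypothetical stand-in for the Partitioner interface, made a Closeable.
    interface SketchPartitioner extends Closeable {
        int partition(String topic, byte[] key, int numPartitions);

        // Narrowed so that closing does not throw a checked exception.
        @Override
        void close();
    }

    // Hypothetical stand-in for the producer; it owns the partitioner.
    static class SketchProducer implements Closeable {
        private final SketchPartitioner partitioner;

        SketchProducer(SketchPartitioner partitioner) {
            this.partitioner = partitioner;
        }

        @Override
        public void close() {
            // A real producer would also close serializers, interceptors
            // and metrics here; this sketch only shows the partitioner.
            partitioner.close();
        }
    }

    // Test double that records whether close() was called.
    static class RecordingPartitioner implements SketchPartitioner {
        boolean closed = false;

        @Override
        public int partition(String topic, byte[] key, int numPartitions) {
            return 0; // partition choice is irrelevant for this sketch
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        RecordingPartitioner partitioner = new RecordingPartitioner();
        SketchProducer producer = new SketchProducer(partitioner);
        producer.close();
        System.out.println("partitioner closed: " + partitioner.closed);
    }
}
```

A partitioner that holds resources (for example a background thread or a connection used to pick partitions) relies on this call to release them.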

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/iconara/kafka kafka-4284

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2000.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2000
    
----
commit dcf85e553f69df7a3327166c56e16f3a12694c6c
Author: Theo <th...@iconara.net>
Date:   2016-10-10T11:06:12Z

    Make Partitioner a Closeable and close it when closing the producer
    
    Even though Partitioner has a close method, it is not closed when the producer is closed. Serializers, interceptors, and metrics are all closed, so partitioners should be closed too.

----


> Expose a Partitioner interface in the new producer
> --------------------------------------------------
>
>                 Key: KAFKA-2091
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2091
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jay Kreps
>            Assignee: Sriharsha Chintalapani
>         Attachments: KAFKA-2091.patch, KAFKA-2091_2015-05-27_15:50:18.patch
>
>
> In the new producer you can pass in a key or hard code the partition as part of ProducerRecord.
> Internally we are using a class
> {code}
> class Partitioner {
>     public int partition(String topic, byte[] key, Integer partition, Cluster cluster) {...}
> }
> {code}
> This class uses the specified partition if there is one; uses a hash of the key if there isn't a partition but there is a key; and simply chooses a partition round robin if there is neither a partition nor a key.
> However, there are several partitioning strategies that could be useful that we don't support out of the box.
> An example would be having each producer periodically choose a random partition. This tends to be the most efficient, since all data goes to one server and uses the fewest TCP connections; however, it only produces good load balancing if there are many producers.
> Of course a user can do this now by just setting the partition manually, but that is a bit inconvenient if you need to do that across a bunch of apps since each will need to remember to set the partition every time.
> The idea would be to expose a configuration to set the partitioner implementation like
> {code}
> partitioner.class=org.apache.kafka.producer.DefaultPartitioner
> {code}
> This would default to the existing partitioner implementation.
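
The default selection order described in the issue (explicit partition, then key hash, then round robin) can be sketched roughly as follows. This is an illustration under assumptions, not Kafka's code: `PartitionChoiceSketch` and `choose` are hypothetical names, and `Arrays.hashCode` stands in for whatever hash the real partitioner applies to the key bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class PartitionChoiceSketch {
    // Counter used only for records with neither a partition nor a key.
    private final AtomicInteger counter = new AtomicInteger(0);

    public int choose(Integer partition, byte[] key, int numPartitions) {
        if (partition != null) {
            // 1. An explicitly specified partition always wins.
            return partition;
        }
        if (key != null) {
            // 2. Otherwise hash the key. Arrays.hashCode is an assumption
            //    made for this sketch; masking keeps the value non-negative.
            return (Arrays.hashCode(key) & 0x7fffffff) % numPartitions;
        }
        // 3. Neither partition nor key: rotate through the partitions.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        PartitionChoiceSketch sketch = new PartitionChoiceSketch();
        byte[] key = "user-1".getBytes(StandardCharsets.UTF_8);
        System.out.println("explicit:    " + sketch.choose(2, key, 3));
        System.out.println("keyed:       " + sketch.choose(null, key, 3));
        System.out.println("round robin: " + sketch.choose(null, null, 3));
    }
}
```

A pluggable partitioner, selected through a setting such as the proposed `partitioner.class`, would let an application replace this logic with a different strategy, such as the periodic random choice mentioned in the issue.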


