Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/06/18 19:55:00 UTC

[jira] [Commented] (FLINK-9610) Add Kafka partitioner that uses the key to partition by

    [ https://issues.apache.org/jira/browse/FLINK-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516283#comment-16516283 ] 

ASF GitHub Bot commented on FLINK-9610:
---------------------------------------

GitHub user nielsbasjes opened a pull request:

    https://github.com/apache/flink/pull/6181

    [FLINK-9610] [flink-connector-kafka-base] Add Kafka Partitioner that uses the hash of the provided key.

    ## What is the purpose of the change
    
    Add the ability to route records into Kafka using key-based partitioning.
    
    ## Brief change log
    
    - Added the FlinkKeyHashPartitioner class with some tests.
    
    ## Verifying this change
    
    This change added tests and can be verified as follows:
    - Use this instead of the FlinkFixedPartitioner when instantiating the FlinkKafkaProducer. Also add a KeyedSerializationSchema implementation that returns the key to be used.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): no
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes
      - The serializers: no
      - The runtime per-record code paths (performance sensitive): no
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no 
      - The S3 file system connector:  no 
    
    ## Documentation
    
      - Does this pull request introduce a new feature? yes
      - If yes, how is the feature documented? JavaDocs


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nielsbasjes/flink KafkaKeyHashPartitioner

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/6181.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6181
    
----
commit 22f814cab0710653ec1766531db7077f9e8fd534
Author: Niels Basjes <nb...@...>
Date:   2018-06-18T19:16:22Z

    [FLINK-9610] Add Kafka Partitioner that uses the hash of the provided key

----


> Add Kafka partitioner that uses the key to partition by
> -------------------------------------------------------
>
>                 Key: FLINK-9610
>                 URL: https://issues.apache.org/jira/browse/FLINK-9610
>             Project: Flink
>          Issue Type: New Feature
>          Components: Kafka Connector
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>            Priority: Major
>
> The Kafka connector package only contains the FlinkFixedPartitioner implementation of the FlinkKafkaPartitioner.
> The most common use case I have seen is the need to spread the records across the Kafka partitions while keeping all messages with the same key together.
> I'll put up a pull request with a very simple implementation that should make this a lot easier for others to use and extend.
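
For illustration only, the idea of spreading records across partitions while keeping equal keys together can be sketched in plain Java as below. This is not the code from the pull request: the class name KeyHashPartitionSketch, the method partitionForKey, the use of Arrays.hashCode, and the handling of null keys are all assumptions made for this example. The parameters mirror the serialized key and the partitions array that a FlinkKafkaPartitioner receives.

```java
import java.util.Arrays;

// Hypothetical, dependency-free sketch of key-hash partitioning.
// It does not extend FlinkKafkaPartitioner and is not the PR's implementation.
final class KeyHashPartitionSketch {

    private KeyHashPartitionSketch() {
    }

    /**
     * Picks one entry of {@code partitions} based on the hash of the key bytes,
     * so that records with equal keys always map to the same partition.
     */
    static int partitionForKey(byte[] key, int[] partitions) {
        if (partitions == null || partitions.length == 0) {
            throw new IllegalArgumentException("partitions must not be empty");
        }
        if (key == null) {
            // Assumption for this sketch: records without a key go to the first partition.
            return partitions[0];
        }
        // floorMod keeps the index non-negative even when the hash is negative.
        int index = Math.floorMod(Arrays.hashCode(key), partitions.length);
        return partitions[index];
    }
}
```

Calling partitionForKey twice with byte arrays of equal content returns the same partition id, which is the property the issue asks for. Note that the mapping changes if the number of partitions changes.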


