You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by madhu phatak <ph...@gmail.com> on 2016/03/09 10:07:48 UTC

keyBy using custom partitioner

Hi,
How to use a custom partitioner in keyBy operation? As of now it's using
hash partitioner to load balance across parallel tasks. I tried custom
partitioning the schema before calling keyBy operation. It doesn't seem to
preserve that partition.

-- 
Regards,
Madhukara Phatak
http://datamantra.io/

Re: keyBy using custom partitioner

Posted by madhu phatak <ph...@gmail.com>.
Hi,
Thank you.
On Mar 9, 2016 5:27 PM, "Stephan Ewen" <se...@apache.org> wrote:

> Hi!
>
> You can currently not override the hash function used by "keyBy()". The
> reason is that this function is used in multiple places, for the stream
> partitioning, and also for the partitioning of state. Both have to be
> aligned.
>
> What you can do is use "partitionCustom(...)" to use an arbitrary
> partitioner. However, you cannot window or access state using that...
>
> If you want to partition in a particular way and use windows after that,
> you would currently have to do something like a a map function that
> generates a special key, and then use keyBy() on that.
>
> Greetings,
> Stephan
>
>
> On Wed, Mar 9, 2016 at 10:07 AM, madhu phatak <ph...@gmail.com>
> wrote:
>
>> Hi,
>> How to use a custom partitioner in keyBy operation? As of now it's using
>> hash partitioner to load balance across parallel tasks. I tried custom
>> partitioning the schema before calling keyBy operation. It doesn't seem to
>> preserve that partition.
>>
>> --
>> Regards,
>> Madhukara Phatak
>> http://datamantra.io/
>>
>
>

Re: keyBy using custom partitioner

Posted by Stephan Ewen <se...@apache.org>.
Hi!

You can currently not override the hash function used by "keyBy()". The
reason is that this function is used in multiple places, for the stream
partitioning, and also for the partitioning of state. Both have to be
aligned.

What you can do is use "partitionCustom(...)" to use an arbitrary
partitioner. However, you cannot window or access state using that...

If you want to partition in a particular way and use windows after that,
you would currently have to do something like a a map function that
generates a special key, and then use keyBy() on that.

Greetings,
Stephan


On Wed, Mar 9, 2016 at 10:07 AM, madhu phatak <ph...@gmail.com> wrote:

> Hi,
> How to use a custom partitioner in keyBy operation? As of now it's using
> hash partitioner to load balance across parallel tasks. I tried custom
> partitioning the schema before calling keyBy operation. It doesn't seem to
> preserve that partition.
>
> --
> Regards,
> Madhukara Phatak
> http://datamantra.io/
>