Posted to user@helix.apache.org by santosh techie <sa...@gmail.com> on 2021/01/13 09:54:15 UTC

Partition Split and Merge

Hello,
From this Stack Overflow question,
https://stackoverflow.com/questions/40981743/does-apache-helix-support-partition-split-and-merge,
I understand that Helix does not support partition split and merge out of
the box.

I have a use case similar to topic and topic subscriptions in a typical
messaging framework.

1. Currently I model a topic as a resource in Helix, and it is partitioned.
2. Some of these topics could become hot and may co-exist with other
topics on the same node, which creates a bottleneck. Which topic will
become a hot spot is *not* known ahead of time.
3. Thus the partition to which a hot topic belongs is a candidate for
splitting, so that the original partition can be served from multiple
nodes.

4. I was thinking of a complicated approach: achieving the split via a
hierarchical logical resource arrangement.
   a. Monitor the current load of a partition, based on the number of
      current subscribers.
   b. Once a certain threshold is crossed, create, on the fly, a Helix
      resource that represents the hot partition and add participants
      (nodes) to it.
   c. By doing this, I try to achieve further subdivision within the
      original partition.
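
The monitoring and threshold steps above can be sketched roughly as
follows. This is only an illustration in plain Java with no Helix calls:
the class HotPartitionDetector, the "_split" naming scheme, and the
threshold value are all hypothetical, and creating the derived resource
in Helix is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: flags partitions whose subscriber count reaches a threshold. */
final class HotPartitionDetector {
    private final int threshold;

    HotPartitionDetector(int threshold) {
        this.threshold = threshold;
    }

    /** Returns the names of partitions with at least `threshold` subscribers, sorted by name. */
    List<String> findHot(Map<String, Integer> subscribersByPartition) {
        List<String> hot = new ArrayList<>();
        // Copy into a TreeMap so the result order is deterministic.
        for (Map.Entry<String, Integer> e : new TreeMap<>(subscribersByPartition).entrySet()) {
            if (e.getValue() >= threshold) {
                hot.add(e.getKey());
            }
        }
        return hot;
    }

    /** Name of the derived resource for a hot partition. The "_split" suffix is an assumption. */
    static String splitResourceName(String partitionName) {
        return partitionName + "_split";
    }
}
```

A periodic task could call findHot with the latest subscriber counts and
create one derived resource per returned partition name.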

Are there any recommendations on how to use Helix for partition splitting?

Thanks,
Santosh

Re: Partition Split and Merge

Posted by Wang Jiajun <er...@gmail.com>.
FYI, an alternative option is to let Helix know about the hot partition so
that it can re-assign the partitions accordingly. Here is the new
weight-aware rebalancer:
https://github.com/apache/helix/wiki/Weight-aware-Globally-Evenly-distributed-Rebalancer
Note that FULL_AUTO rebalance mode is required to use this rebalancer.
Also, a migration plan might be necessary if you are converting an existing
cluster rather than creating a new one.
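
To illustrate the general idea of weight-aware placement, here is a
minimal sketch in plain Java. It is a simple greedy heuristic, not the
algorithm used by the Helix rebalancer linked above, and it does not use
any Helix API; the class name WeightAwarePlacement and the weights are
made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative greedy placement: heaviest partition first, onto the least-loaded node. */
final class WeightAwarePlacement {
    static Map<String, List<String>> assign(Map<String, Integer> weightByPartition,
                                            List<String> nodes) {
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        Map<String, Integer> load = new LinkedHashMap<>();
        for (String node : nodes) {
            assignment.put(node, new ArrayList<>());
            load.put(node, 0);
        }

        // Sort by weight descending, then by name so ties are deterministic.
        List<Map.Entry<String, Integer>> parts = new ArrayList<>(weightByPartition.entrySet());
        parts.sort((a, b) -> {
            int byWeight = Integer.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });

        for (Map.Entry<String, Integer> part : parts) {
            // Pick the node with the smallest accumulated weight (first one wins on ties).
            String target = nodes.get(0);
            for (String node : nodes) {
                if (load.get(node) < load.get(target)) {
                    target = node;
                }
            }
            assignment.get(target).add(part.getKey());
            load.put(target, load.get(target) + part.getValue());
        }
        return assignment;
    }
}
```

With one hot partition of weight 10, three partitions of weight 2, and
two nodes, the hot partition ends up alone on one node and the other
three share the second node.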

Best Regards,
Jiajun


On Wed, Jan 13, 2021 at 9:20 AM Xue Junkai <jx...@apache.org> wrote:

> Hi Santosh,
>
> There could be some concerns of aggregation for resources, since you are
> keeping creating new resources for hot partitions. I would suggest thinking
> about recreating a resource to replace the original topic instead of mixing
> topic and partitions.
>
> There is a ureplicator project that Uber uses Helix to do the Kafka
> replication may give you some ideas: https://github.com/uber/uReplicator
>
> Best,
>
> Junkai

Re: Partition Split and Merge

Posted by Xue Junkai <jx...@apache.org>.
Hi Santosh,

There could be some concerns about resource aggregation, since you would
keep creating new resources for hot partitions. I would suggest
recreating a resource to replace the original topic instead of mixing
topics and partitions.
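
One possible reading of this suggestion is to replace the topic's
resource with a new one that has more partitions. The sketch below is
plain Java with no Helix calls and assumes keys are routed by hash
modulo the partition count, which the thread does not state. Under that
assumption, doubling the count from N to 2N sends every key in old
partition p to either p or p + N, so each old partition splits cleanly
in two. The class name PartitionMapping is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative key routing, assuming hash-modulo partitioning (an assumption, not a Helix API). */
final class PartitionMapping {
    /** Partition index for a key, given the resource's partition count. */
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** The two partitions that old partition `p` splits into when the count doubles. */
    static List<Integer> splitTargets(int p, int oldNumPartitions) {
        return Arrays.asList(p, p + oldNumPartitions);
    }
}
```

For example, with 4 partitions in the old resource and 8 in the
replacement, keys from old partition 1 land in partition 1 or 5 of the
new resource.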

uReplicator, a project in which Uber uses Helix for Kafka replication,
may give you some ideas: https://github.com/uber/uReplicator

Best,

Junkai
