Posted to user@helix.apache.org by Jianzhou Zhao <jz...@quora.com> on 2019/10/21 03:30:14 UTC

who uses helix

Hi,

I was looking for a list of who uses Helix and found
https://cwiki.apache.org/confluence/display/HELIX/Powered+By+Helix

The page has not been updated since 2017. Is the list still up to date?

Thank you, j

Re: who uses helix

Posted by Lei Xia <lx...@linkedin.com>.
You are right. Currently Helix distributes the partitions of each resource (a DB or table) evenly across nodes, i.e., each node holds roughly the same number of partitions from any one DB. That works fine as long as the traffic (or data size) is even across the partitions of the same DB, even if different DBs carry different traffic loads.

We are actively developing a new rebalancer (https://github.com/apache/helix/wiki/Weight-aware-Globally-Evenly-distributed-Rebalancer) with which Helix will distribute partitions based on their weight (a vector of metrics defined by the user). This will help us solve the "hot" partition issue. We expect the new rebalancer to be released by the end of this year.
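
To make the difference concrete, here is a minimal Python sketch (not Helix code) that contrasts count-based placement with weight-aware placement. The function names, the single scalar weight per partition, and the greedy heuristic are all assumptions made for illustration; the rebalancer described above works on a user-defined vector of metrics, and its actual algorithm is not shown in this thread.

```python
import heapq
from collections import defaultdict


def assign_by_count(partitions, nodes):
    """Round-robin placement: each node gets roughly the same number of
    partitions, regardless of how heavy each partition is."""
    assignment = defaultdict(list)
    for i, (name, _weight) in enumerate(partitions):
        assignment[nodes[i % len(nodes)]].append(name)
    return dict(assignment)


def assign_by_weight(partitions, nodes):
    """Greedy placement (illustrative heuristic): take partitions from
    heaviest to lightest and put each on the currently least-loaded node."""
    heap = [(0.0, node) for node in nodes]  # (accumulated weight, node)
    heapq.heapify(heap)
    assignment = {node: [] for node in nodes}
    for name, weight in sorted(partitions, key=lambda p: p[1], reverse=True):
        load, node = heapq.heappop(heap)
        assignment[node].append(name)
        heapq.heappush(heap, (load + weight, node))
    return assignment


def node_loads(assignment, partitions):
    """Total weight hosted by each node under the given assignment."""
    weights = dict(partitions)
    return {node: sum(weights[p] for p in parts)
            for node, parts in assignment.items()}


if __name__ == "__main__":
    # Hypothetical partitions: two "hot" ones and two "cold" ones.
    partitions = [("p0", 10), ("p1", 1), ("p2", 10), ("p3", 1)]
    nodes = ["n0", "n1"]
    print(node_loads(assign_by_count(partitions, nodes), partitions))
    print(node_loads(assign_by_weight(partitions, nodes), partitions))
```

In this toy input both strategies give every node two partitions, but count-based placement happens to put both hot partitions on the same node, while the weight-aware version pairs one hot partition with one cold partition on each node.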


Lei


Lei Xia


Data Infra/Helix

lxia@linkedin.com
www.linkedin.com/in/lxia1

________________________________
From: Jianzhou Zhao <jz...@quora.com>
Sent: Sunday, October 20, 2019 9:40 PM
To: user@helix.apache.org <us...@helix.apache.org>
Subject: Re: who uses helix

This is amazing.
This number means each table could have thousands of partitions on average.
At this scale, I feel the Helix UI could become the bottleneck for monitoring them :)

An unrelated question that came to mind after seeing the numbers...

In the Helix code, I did not see how it balances partitions according to how busy a resource is; Helix simply ensures that the number of shards is evenly distributed. What happens if some resources are busy for some reason, and their placement makes all the other resources busy as well because they share nodes?

I am new to Helix, and its tutorial says we can use semi-auto or customized balancing strategies, but I am still curious how the semi-auto approach scales in practice.





Re: who uses helix

Posted by Jianzhou Zhao <jz...@quora.com>.
This is amazing.
This number means each table could have thousands of partitions on average.
At this scale, I feel the Helix UI could become the bottleneck for monitoring
them :)

An unrelated question that came to mind after seeing the numbers...

In the Helix code, I did not see how it balances partitions according to how
busy a resource is; Helix simply ensures that the number of shards is evenly
distributed. What happens if some resources are busy for some reason, and
their placement makes all the other resources busy as well because they
share nodes?

I am new to Helix, and its tutorial says we can use semi-auto or customized
balancing strategies, but I am still curious how the semi-auto approach
scales in practice.




On Sun, Oct 20, 2019 at 9:31 PM kishore g <g....@gmail.com> wrote:

> At LinkedIn, Helix manages thousands of nodes, with 1 million segments
> (the equivalent of partitions) across thousands of tables.

Re: who uses helix

Posted by kishore g <g....@gmail.com>.
Let’s start another thread to discuss the specific feature and the
timeline.

On Mon, Oct 21, 2019 at 2:21 PM Hunter Lee <na...@gmail.com> wrote:

> Exact timeline may vary, but it's currently in active development and I'd
> say sometime in the first half of 2020.
>
> Hunter

Re: who uses helix

Posted by Hunter Lee <na...@gmail.com>.
Exact timeline may vary, but it's currently in active development and I'd
say sometime in the first half of 2020.

Hunter

On Mon, Oct 21, 2019 at 8:39 AM Yi Chen <yi...@airbnb.com> wrote:

> I am thinking of implementing load-based rebalancing based on Google's
> "Consistent Hashing with Bounded Loads" algorithm. I am glad the Helix
> team is working on a weighted rebalancing algorithm as well; that would
> help me a lot. Any idea when this feature will be released?

Re: who uses helix

Posted by Yi Chen <yi...@airbnb.com>.
I am thinking of implementing load-based rebalancing based on Google's
"Consistent Hashing with Bounded Loads" algorithm. I am glad the Helix
team is working on a weighted rebalancing algorithm as well; that would
help me a lot. Any idea when this feature will be released?
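
For readers unfamiliar with the idea, the following is a rough Python sketch of consistent hashing with a per-node load cap. It illustrates the general technique only; it is not Helix code and not necessarily the design being considered here. The function names, the MD5-based ring hash, the absence of virtual nodes, and the balance factor c=1.25 are assumptions chosen for illustration.

```python
import bisect
import hashlib
import math


def _ring_hash(value: str) -> int:
    """Stable 64-bit position on the ring (MD5 is used only for spreading)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def assign_bounded(keys, nodes, c=1.25):
    """Place each key on the first node clockwise from its ring position
    whose load is still below ceil(c * len(keys) / len(nodes)).

    c must be >= 1 so that the total capacity covers every key.
    Returns (placement, load): key -> node, and node -> number of keys.
    """
    capacity = math.ceil(c * len(keys) / len(nodes))
    ring = sorted((_ring_hash(node), node) for node in nodes)
    points = [position for position, _ in ring]
    load = {node: 0 for node in nodes}
    placement = {}
    for key in keys:
        # First node at or after the key's position, wrapping around the ring.
        start = bisect.bisect_left(points, _ring_hash(key)) % len(ring)
        for step in range(len(ring)):
            node = ring[(start + step) % len(ring)][1]
            if load[node] < capacity:
                load[node] += 1
                placement[key] = node
                break
    return placement, load


if __name__ == "__main__":
    # Hypothetical partition and node names.
    keys = [f"partition-{i}" for i in range(100)]
    nodes = [f"node-{i}" for i in range(4)]
    placement, load = assign_bounded(keys, nodes)
    print(load)
```

The cap guarantees that no node holds more than roughly c times the average number of keys; extending the counter from "number of keys" to a measured load per key would be one way to approach load-based rebalancing, though that extension is not shown here.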

On Sun, Oct 20, 2019 at 11:44 PM Hunter Lee <na...@gmail.com> wrote:

> As Kishore said, Helix powers Apache Pinot.
>
> Apache Gobblin also uses Helix for cluster management and Helix Task
> Framework (online workflow scheduler) for its data ingestion sync use
> cases, running 80K+ jobs per day.
>
> Hunter

Re: who uses helix

Posted by Hunter Lee <na...@gmail.com>.
As Kishore said, Helix powers Apache Pinot.

Apache Gobblin also uses Helix for cluster management and Helix Task
Framework (online workflow scheduler) for its data ingestion sync use
cases, running 80K+ jobs per day.

Hunter

On Sun, Oct 20, 2019 at 10:06 PM kishore g <g....@gmail.com> wrote:

> Can you talk about your use case? Helix does not rebalance partitions
> based on load automatically, but one can use customized mode to achieve
> that.

Re: who uses helix

Posted by kishore g <g....@gmail.com>.
Can you talk about your use case? Helix does not rebalance partitions
based on load automatically, but one can use customized mode to achieve
that.

On Sun, Oct 20, 2019 at 10:01 PM Lei Xia <lx...@linkedin.com> wrote:

> Here are some blog posts about recent adoptions of Helix outside of LinkedIn.
>
>
>    - Uber's Kafka replicator built on top of Helix:
>    https://eng.uber.com/ureplicator/
>
>    - Pinterest's RocksDB replicator:
>    https://medium.com/@Pinterest_Engineering/automated-cluster-management-and-recovery-for-rocksplicator-f1f8fd35c833
>
>    - Airbnb’s Change Data Capture system:
>    https://medium.com/airbnb-engineering/capturing-data-evolution-in-a-service-oriented-architecture-72f7c643ee6f

Re: who uses helix

Posted by Lei Xia <lx...@linkedin.com>.
Here are some blog posts about recent adoptions of Helix outside of LinkedIn.


  *   Uber's Kafka replicator built on top of Helix: https://eng.uber.com/ureplicator/
  *   Pinterest's RocksDB replicator: https://medium.com/@Pinterest_Engineering/automated-cluster-management-and-recovery-for-rocksplicator-f1f8fd35c833
  *   Airbnb’s Change Data Capture system: https://medium.com/airbnb-engineering/capturing-data-evolution-in-a-service-oriented-architecture-72f7c643ee6f



Lei Xia


Data Infra/Helix

lxia@linkedin.com
www.linkedin.com/in/lxia1

________________________________
From: kishore g <g....@gmail.com>
Sent: Sunday, October 20, 2019 9:24 PM
To: user@helix.apache.org <us...@helix.apache.org>
Subject: Re: who uses helix

At LinkedIn, Helix manages thousands of nodes, with 1 million segments (the equivalent of partitions) across thousands of tables.


Re: who uses helix

Posted by kishore g <g....@gmail.com>.
At LinkedIn, Helix manages thousands of nodes, with 1 million segments
(the equivalent of partitions) across thousands of tables.

On Sun, Oct 20, 2019 at 9:16 PM Jianzhou Zhao <jz...@quora.com> wrote:

> Cool.
>
> In the rocksplicator case, how many partitions, replicas, and nodes can a
> Helix cluster manage?

Re: who uses helix

Posted by Jianzhou Zhao <jz...@quora.com>.
Cool.

In the rocksplicator case, how many partitions, replicas, and nodes can a
Helix cluster manage?

On Sun, Oct 20, 2019 at 9:05 PM Bo Liu <ne...@gmail.com> wrote:

> We extensively use Helix at Pinterest. This blog post has more details and
> some tips.
>
> https://medium.com/pinterest-engineering/automated-cluster-management-and-recovery-for-rocksplicator-f1f8fd35c833

Re: who uses helix

Posted by Bo Liu <ne...@gmail.com>.
We extensively use Helix at Pinterest. This blog post has more details and
some tips.
https://medium.com/pinterest-engineering/automated-cluster-management-and-recovery-for-rocksplicator-f1f8fd35c833

On Sun, Oct 20, 2019 at 8:30 PM Jianzhou Zhao <jz...@quora.com> wrote:

> Hi,
>
> I was looking for a list of who uses Helix and found
> https://cwiki.apache.org/confluence/display/HELIX/Powered+By+Helix
>
> The page has not been updated since 2017. Is the list still up to date?
>
> Thank you, j
>


-- 
Best regards,
Bo