Posted to user@helix.apache.org by Zhen Zhang <ne...@gmail.com> on 2015/02/05 22:41:52 UTC

Not using ZK as metadata store

Hi,

As Helix grows to support bigger clusters with more nodes, more
resources, and more partitions, using ZK as the metadata store does not
seem scalable. This is especially true for ideal states and external views,
where Helix stores all partition states and mappings. Alternatively, we
could use, for example, a key-value store for ideal states and external
views, while leaving minimal information on ZK, say a URL pointing to the
metadata store. Helix listeners would still get ZK callbacks on all state
changes, then read the URL and retrieve the actual data from the more
scalable metadata store. Any thoughts?
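
To make the flow concrete, here is a minimal sketch of the pointer
indirection, not a Helix implementation. Everything in it is an assumption
for illustration: PointerStore is an in-memory stand-in for ZK (with
persistent callbacks, unlike real one-shot ZK watches), MetadataStore is an
in-memory stand-in for the key-value store, and the path and URL layouts
are hypothetical.

```python
from typing import Callable, Dict, List


class PointerStore:
    """In-memory stand-in for ZK: holds small values and fires callbacks on change."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Callable[[str], None]]] = {}

    def watch(self, path: str, callback: Callable[[str], None]) -> None:
        self._watchers.setdefault(path, []).append(callback)

    def set(self, path: str, value: str) -> None:
        self._data[path] = value
        for callback in self._watchers.get(path, []):
            callback(path)

    def get(self, path: str) -> str:
        return self._data[path]


class MetadataStore:
    """In-memory stand-in for the more scalable key-value store."""

    def __init__(self) -> None:
        self._blobs: Dict[str, dict] = {}

    def put(self, url: str, value: dict) -> None:
        self._blobs[url] = value

    def get(self, url: str) -> dict:
        return self._blobs[url]


def publish_external_view(zk: PointerStore, kv: MetadataStore,
                          resource: str, version: int, mapping: dict) -> str:
    """Write the large payload first, then update the small pointer on ZK."""
    url = f"kv://external-views/{resource}/v{version}"  # hypothetical URL scheme
    kv.put(url, mapping)
    zk.set(f"/EXTERNALVIEW/{resource}", url)  # this triggers the callbacks
    return url


class ExternalViewListener:
    """Gets the callback, reads the URL, then fetches the actual data."""

    def __init__(self, zk: PointerStore, kv: MetadataStore, resource: str) -> None:
        self._zk = zk
        self._kv = kv
        self.latest = None
        zk.watch(f"/EXTERNALVIEW/{resource}", self._on_change)

    def _on_change(self, path: str) -> None:
        url = self._zk.get(path)
        self.latest = self._kv.get(url)


zk = PointerStore()
kv = MetadataStore()
listener = ExternalViewListener(zk, kv, "myDB")
publish_external_view(zk, kv, "myDB", 1, {"myDB_0": {"node1": "MASTER"}})
```

Writing each version under a new URL means the value on ZK changes on every
update, so listeners are still notified, and the payload is already in the
key-value store by the time the callback fires.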

Thanks,
Jason

Re: Not using ZK as metadata store

Posted by kishore g <g....@gmail.com>.
I like the idea in general. Let's try to define the scalability problems of
ZK.

   - The amount of data stored is limited (by memory).
   - The number of paths where watches can be set is limited (I think it
   should fit within the TCP max size)

We can also define the limits in terms of Helix nodes/resources etc.
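
As a rough back-of-the-envelope sketch of such a limit: ZooKeeper's default
jute.maxbuffer caps a single znode at 0xfffff bytes (just under 1 MB). The
40 bytes per partition-replica entry and the replica count below are purely
assumed figures for illustration, not measurements of Helix records.

```python
ZNODE_LIMIT_BYTES = 0xFFFFF  # ZooKeeper's default jute.maxbuffer, just under 1 MB


def estimated_view_bytes(partitions: int, replicas: int, bytes_per_entry: int) -> int:
    """Rough size of one ideal state or external view stored as a single znode."""
    return partitions * replicas * bytes_per_entry


def max_partitions(replicas: int, bytes_per_entry: int,
                   limit: int = ZNODE_LIMIT_BYTES) -> int:
    """Largest partition count whose estimated size still fits within the limit."""
    return limit // (replicas * bytes_per_entry)


# Assumed figures: 3 replicas, about 40 bytes per partition-replica entry.
fits = estimated_view_bytes(10_000, 3, 40) <= ZNODE_LIMIT_BYTES
ceiling = max_partitions(3, 40)
```

With real per-entry sizes plugged in, the same arithmetic gives a ceiling in
terms of partitions per resource.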


On the flip side, using another KV store means we need the KV store to
provide the same properties as ZooKeeper, i.e., data consistency,
replication, etc.

thanks,
Kishore G


On Thu, Feb 5, 2015 at 1:41 PM, Zhen Zhang <ne...@gmail.com> wrote:

> Hi,
>
> As Helix is going to support bigger clusters with more nodes, more
> resources, and more partitions, using ZK as metadata store seems not
> scalable. This is especially the case for ideal states and external views
> where Helix stores all partition states and mappings. Alternatively, we can
> use for example, a key-value store for ideal states and external views,
> while leaving minimum information on ZK, say URL to the metadata store.
> Helix listeners will still get ZK callbacks on all state changes, then
> reads the URL, and retrieve the actual data from the more scalable metadata
> stores. Any thoughts?
>
> Thanks,
> Jason
>
