Posted to dev@drill.apache.org by "David Alves (JIRA)" <ji...@apache.org> on 2013/03/18 07:42:15 UTC

[jira] [Created] (DRILL-53) Setup cluster configuration and membership mgmt system

David Alves created DRILL-53:
--------------------------------

             Summary: Setup cluster configuration and membership mgmt system
                 Key: DRILL-53
                 URL: https://issues.apache.org/jira/browse/DRILL-53
             Project: Apache Drill
          Issue Type: New Feature
            Reporter: David Alves


Several configuration entries need to be managed across the cluster, namely the metastore (Hive?) location and the Drill daemon addresses.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Created] (DRILL-53) Setup cluster configuration and membership mgmt system

Posted by Jacques Nadeau <ja...@apache.org>.

Helix is very interesting.  Curator brings a lot of the same things with a
more library-style approach.  My initial pass is with Curator.  My question
about Helix is whether its extra complexity is there mainly to handle
partition management, which is more common in systems like HBase or Hadoop
where data ownership is a core part of the system.  I'm not sure that we
have those same requirements.

As far as alternatives to ZK, I think ZK is great in an environment where
you already have ZK.  I'm keeping the interfaces abstracted from that level
so that, hopefully, we can have alternatives.  Maybe we'll switch to Helix,
maybe 0xdata.  Let's get something done.  One thing that is great about
Cassandra is the lack of dependent services and thus the ease of spinning
up a cluster.  I'd really like that to be an option for people, even if it
is the Derby equivalent of a coordination layer.  I agree that building a
whole other coordination engine from scratch would be a stretch.
However, borrowing/leveraging other approaches could work very well.
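
To illustrate the kind of abstraction described above, here is a minimal
sketch in Python.  It is not Drill's actual interface: the names
ClusterCoordinator, LocalCoordinator, and DaemonEndpoint, as well as the
config key in the usage note, are hypothetical.  The idea is only that
membership and shared configuration sit behind one interface, so that an
in-process implementation with no dependent services can stand in for a
ZK-backed one.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class DaemonEndpoint:
    """Address of one daemon in the cluster (hypothetical type)."""
    host: str
    port: int


class ClusterCoordinator(ABC):
    """Hypothetical coordination interface.

    A ZooKeeper-, Curator-, or Helix-backed class could implement the same
    methods without callers needing to change.
    """

    @abstractmethod
    def register(self, endpoint): ...

    @abstractmethod
    def unregister(self, endpoint): ...

    @abstractmethod
    def members(self): ...

    @abstractmethod
    def set_config(self, key, value): ...

    @abstractmethod
    def get_config(self, key, default=None): ...


class LocalCoordinator(ClusterCoordinator):
    """In-process implementation with no external services.

    State lives in plain Python containers, so it is only useful for a
    single process or for tests.
    """

    def __init__(self):
        self._members = set()
        self._config = {}

    def register(self, endpoint):
        # Registering the same endpoint twice is a no-op.
        self._members.add(endpoint)

    def unregister(self, endpoint):
        # Unregistering an unknown endpoint is silently ignored.
        self._members.discard(endpoint)

    def members(self):
        # Return a sorted list so callers see a stable order.
        return sorted(self._members)

    def set_config(self, key, value):
        self._config[key] = value

    def get_config(self, key, default=None):
        return self._config.get(key, default)
```

A caller would register its own endpoint at startup, read shared entries
such as a (hypothetical) "metastore.location" key via get_config, and
never touch the backing store directly.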

thanks,
Jacques

On Mon, Mar 18, 2013 at 9:05 PM, Ted Dunning <te...@gmail.com> wrote:

> Check out Apache Helix.
>
> It provides pretty much everything you need.  The basic idea is that workers
> have a life cycle defined in terms of a state machine, and there are
> cluster-wide constraints on how many workers can be in different states.
> There are also resources, which are assigned to workers according to
> flexible preferences.
>
> What these states are, what they mean and what the resources are remains
> comfortably abstract.
>
> The actual coordination is done using ZK, which is a good thing.  For
> anything more than a single worker, you have to have some reliable handling
> of partition and resource assignment anyway, and there are few options
> other than implementing yet another Paxos engine or using ZooKeeper.  I
> would veto the first as a huge waste of time, so we are pretty much left
> with the ZooKeeper option.

Re: [jira] [Created] (DRILL-53) Setup cluster configuration and membership mgmt system

Posted by Ted Dunning <te...@gmail.com>.

Check out Apache Helix.

It provides pretty much everything you need.  The basic idea is that workers have a life cycle defined in terms of a state machine, and there are cluster-wide constraints on how many workers can be in different states.  There are also resources, which are assigned to workers according to flexible preferences.

What these states are, what they mean and what the resources are remains comfortably abstract.
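
The state-machine idea can be sketched in a few lines of Python.  This is an illustration of the concept only and does not use or mirror the Helix API: the class WorkerStateModel, the exception ConstraintViolation, and the state names OFFLINE, STANDBY, and LEADER are all assumptions made for the example.  Each worker moves through allowed transitions, and a cluster-wide limit caps how many workers may occupy a given state.  Resource assignment is left out.

```python
class ConstraintViolation(Exception):
    """Raised when a transition would exceed a cluster-wide state limit."""


class WorkerStateModel:
    """Toy worker life cycle with cluster-wide per-state limits (hypothetical)."""

    def __init__(self, transitions, max_per_state, initial="OFFLINE"):
        # transitions: dict mapping a state to the set of states reachable from it
        # max_per_state: dict mapping a state to the maximum number of workers
        #                allowed in it; states not listed are unlimited
        self.transitions = transitions
        self.max_per_state = max_per_state
        self.initial = initial
        self.states = {}

    def add_worker(self, worker):
        # New workers always start in the initial state.
        self.states[worker] = self.initial

    def count(self, state):
        # Number of workers currently in the given state.
        return sum(1 for s in self.states.values() if s == state)

    def transition(self, worker, new_state):
        current = self.states[worker]
        # Reject moves the state machine does not define.
        if new_state not in self.transitions.get(current, set()):
            raise ValueError(f"{current} -> {new_state} is not an allowed transition")
        # Enforce the cluster-wide limit on the target state.
        limit = self.max_per_state.get(new_state)
        if limit is not None and self.count(new_state) >= limit:
            raise ConstraintViolation(f"at most {limit} worker(s) may be {new_state}")
        self.states[worker] = new_state


model = WorkerStateModel(
    transitions={
        "OFFLINE": {"STANDBY"},
        "STANDBY": {"LEADER", "OFFLINE"},
        "LEADER": {"STANDBY"},
    },
    max_per_state={"LEADER": 1},
)
model.add_worker("w1")
model.add_worker("w2")
model.transition("w1", "STANDBY")
model.transition("w1", "LEADER")
model.transition("w2", "STANDBY")
```

With this setup, a further model.transition("w2", "LEADER") raises ConstraintViolation because "w1" already holds the single LEADER slot.  In this sketch the model lives in one process; the message's point is that a real system needs a coordination service to keep such state consistent across workers.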

The actual coordination is done using ZK, which is a good thing.  For anything more than a single worker, you have to have some reliable handling of partition and resource assignment anyway, and there are few options other than implementing yet another Paxos engine or using ZooKeeper.  I would veto the first as a huge waste of time, so we are pretty much left with the ZooKeeper option.
