Posted to dev@helix.apache.org by Sandeep Nayak <os...@gmail.com> on 2014/03/10 01:16:58 UTC

Questions on Helix Model

Hey guys,

I got a chance to get back to the model and had a few questions. All
these are aligned towards cleaning up the model. Let me know if I am
going down the wrong path in wanting to clean up the model.

* Are constraints only applied to cluster?

* What entities is CurrentState applicable to? It appears it applies
to a resource + partition combination, am I correct? If so why not add
the concept of 'Partition' to the model and have it return resources
and have state on the resources?

* Is there a reason we don't have a base 'State' to have derived
states of IdealState(end state) and CurrentState(where we are in the
transition)? That way we could potentially add StartState(start of
transitions) and IntermediateState(any one of the states transitioned
to while getting to ideal state) as a place holder for all other
states. What do you think?

* Is there a reason you guys decided not to have a representation of
'Cluster' in the model?

* HelixConfigScope applies to configurations related to Cluster,
Resource, Participant, Partition and Constraint. Constraint appears to
be the odd man, can we fix this? How about having Configuration
hierarchy i.e. ClusterConfiguration, ResourceConfiguration,
PartitionConfiguration and so on and do away with Scope?

* We should remove the AlertsHistory, LeaderHistory from the model
package. These are not true model elements, what do you guys think?

* We should remove StateModelDefinition, Transition and State from
model, it is not core model. It applies to entities in the model. What
do you guys think?

* We should remove Message from model, it's not a model entity. It
needs to be a separate package, let me know what you guys think?

* LiveInstance should go away, we should only have a status on the
Participant. Let me know what you guys think?

I am hoping to work a bit on the model tonight, so let me know what
you guys think.

Thanks,

Sandeep

Re: Questions on Helix Model

Posted by Sandeep Nayak <os...@gmail.com>.

Thanks for the responses and the wiki link. I apologize for the
delayed response on the question of what the definition of a model is.
Let me try to give my perspective; maybe it will help clarify how
I am looking at the model. Mind you, when I say something does not
belong in the model, I am not asking to delete it but simply to
repurpose it outside of the model as a feature which works on the model.

"A model is the minimum set of entities or elements that are necessary
to solve a given problem."

In this case the problem is cluster definition, management and execution.

So, having said that, I see the core model of Helix as Cluster,
Partition, Resource (both primary and replicas), Members (Participants
.. etc). Arguably there are more, which is why I am asking those
questions.

IMO state is a property of elements in the model. State machines and
transitions didn't seem to belong to the model because a clustering
solution can be achieved without a state machine, i.e. a lot of
peer-to-peer solutions don't have the concepts of state machines and
transitions. These are overlays on the model: a layer sitting on
top of the model entities which orchestrates the lifecycle of an
entity.

I added some of my responses inline to your responses.

Sandeep

On Mon, Mar 10, 2014 at 12:40 PM, Kanak Biscuitwala <ka...@hotmail.com> wrote:
>
> Hi,
>
> I added a new section to the wiki page. It's called definitions and it enumerates the Helix entities, what they do, and what scope they work at. I hope this will be a good starting point as it more or less describes all the functionality that this API needs to support/simplify.
>
> https://cwiki.apache.org/confluence/display/HELIX/API+Redesign+-+Progress
>
> Thanks,
> Kanak
>
> ----------------------------------------
>> From: kanak.b@hotmail.com
>> To: dev@helix.apache.org
>> Subject: RE: Questions on Helix Model
>> Date: Sun, 9 Mar 2014 18:04:02 -0700
>>
>>
>> Responses inline. Can you please define "model" and what should and shouldn't be a model? I'm a little unclear here. Also, it seems like there's already classes doing what you're suggesting in some cases. I'm not sure how we can best communicate when that is the case in general.
>>
>> ----------------------------------------
>>> Date: Sun, 9 Mar 2014 17:16:58 -0700
>>> Subject: Questions on Helix Model
>>> From: osgigeek@gmail.com
>>> To: dev@helix.apache.org
>>>
>>> Hey guys,
>>>
>>> I got a chance to get back to the model and had a few questions. All
>>> these are aligned towards cleaning up the model. Let me know if I am
>>> going down the wrong path in wanting to clean up the model.
>>>
>>> * Are constraints only applied to cluster?
>>
>> No, they can be applied at cluster, resource, participant, and partition scopes. Currently the Helix admin takes in a scope parameter when setting constraints. We can either continue to do this, or expose constraint commands for each scope builder.

[Sandeep] But currently we do not have ResourceConstraint or
ParticipantConstraint, though we do have ClusterConstraints. If we want
to apply them to all the other areas then I would do away with scope and
apply constraints for each of those commands.

>>
>>>
>>> * What entities is CurrentState applicable to? It appears it applies
>>> to a resource + partition combination, am I correct? If so why not add
>>> the concept of 'Partition' to the model and have it return resources
>>> and have state on the resources?
>>
>> Current state only exists at the scope of a participant. It's a runtime entity which indicates which resource partitions a node has and what state they are in. I wouldn't agree with partitions returning resources because a partition is a subset of a resource and a node can serve multiple partitions of the same resource, as well as multiple resources.

[Sandeep] Does a partition have exactly one resource? Or can a
partition carry multiple resources? If a partition has multiple
resources then I don't see why partitions cannot return resources. A
resource resides on all the partitions, no? I think I am missing
something. Can you explain why you think a partition is a subset of
a resource?

I see it as

node -> {set of partitions}
partition -> {set of resources}

>>
>>>
>>> * Is there a reason we don't have a base 'State' to have derived
>>> states of IdealState(end state) and CurrentState(where we are in the
>>> transition)? That way we could potentially add StartState(start of
>>> transitions) and IntermediateState(any one of the states transitioned
>>> to while getting to ideal state) as a place holder for all other
>>> states. What do you think?
>>
>> Because 'State' means something else in Helix terminology. It's the state of a single replica of a partition (i.e. it's scoped the same as a StateModelDefinition).
>>
>> I think IdealState and CurrentState are self-explanatory. I don't think adding more classes and interfaces with respect to these two will improve understanding, and will only cause confusion if people stumble across old documents.

[Sandeep] The problem is I don't think the two states are as such core
model entities, I think they should hang off one or more of the
entities whose state they represent.

>>
>>>
>>> * Is there a reason you guys decided not to have a representation of
>>> 'Cluster' in the model?
>>
>> The only entities that are cluster-wide are auto join status, constraints, and user configurations. We have something called ClusterConfiguration in the old model that sort of captures this. We also have a ClusterConfig and a Cluster (for runtime snapshots) class.

[Sandeep] Yes, you are correct. For the longest time I was thinking of
configuration as properties of the entity, so I always thought of
Cluster as having configuration, but now that you mention it, that
works too.

>>
>>>
>>> * HelixConfigScope applies to configurations related to Cluster,
>>> Resource, Participant, Partition and Constraint. Constraint appears to
>>> be the odd man, can we fix this? How about having Configuration
>>> hierarchy i.e. ClusterConfiguration, ResourceConfiguration,
>>> PartitionConfiguration and so on and do away with Scope?
>>
>> I would stop using HelixConfigScope and start using Scope where possible. We do have this hierarchy in ClusterConfig, ResourceConfig, and ParticipantConfig.

[Sandeep] Btw, there was an earlier class, ConfigScope, which was
deprecated in favor of HelixConfigScope, so are we thinking of
deprecating HelixConfigScope? I was trying to do away with scope;
doesn't a config at different levels of cluster, resource and
participant express scope anyway?

>>
>>>
>>> * We should remove the AlertsHistory, LeaderHistory from the model
>>> package. These are not true model elements, what do you guys think?
>>>
>>
>> AlertsHistory should go away from Helix altogether. No one uses it and it's not scalable. Leader history is a useful runtime thing to expose in the cluster snapshot class (Cluster).

[Sandeep] Agreed it's useful, not saying it's not... just pointing to it
not being in model.

>>
>>> * We should remove StateModelDefinition, Transition and State from
>>> model, it is not core model. It applies to entities in the model. What
>>> do you guys think?
>>>
>>
>> They're core Helix concepts. I don't know if that means they should be models or not.

[Sandeep] So far I saw it as an overlay on model i.e. models have the
additional layer of state management. But I can be swayed to look at
them as core entities.

>>
>>> * We should remove Message from model, it's not a model entity. It
>>> needs to be a separate package, let me know what you guys think?
>>>
>>
>> We support sending messages between nodes in the cluster. Whether or not they should be in model depends on how you define model.
>>
>>> * LiveInstance should go away, we should only have a status on the
>>> Participant. Let me know what you guys think?
>>>
>>
>> This is runtime information, and see the Participant class, which is meant to be a runtime snapshot of a participant. We can't just have a flag of live/not live. It's also important to know pid, qualified host, and version.

[Sandeep] Do we have a concept of a non-live participant? Wouldn't a
participant have properties of pid, qualified host and version as
attributes? Why would we not have them, i.e. what are the cases when we
don't see them on a participant?

>>
>>> I am hoping to work a bit on the model tonight, so let me know what
>>> you guys think.
>>>
>>> Thanks,
>>>
>>> Sandeep
>>
>

Re: Questions on Helix Model

Posted by kishore g <g....@gmail.com>.

This is very useful Kanak. I think a Helix beginner will probably need
something like this. How about a quick sync up on IRC (#apachehelix) at
2:00 PM PST.

thanks,
Kishore G



RE: Questions on Helix Model

Posted by Kanak Biscuitwala <ka...@hotmail.com>.

Hi,

I added a new section to the wiki page. It's called definitions and it enumerates the Helix entities, what they do, and what scope they work at. I hope this will be a good starting point as it more or less describes all the functionality that this API needs to support/simplify.

https://cwiki.apache.org/confluence/display/HELIX/API+Redesign+-+Progress

Thanks,
Kanak


RE: Questions on Helix Model

Posted by Kanak Biscuitwala <ka...@hotmail.com>.

Responses inline. Can you please define "model" and what should and shouldn't be a model? I'm a little unclear here. Also, it seems like there's already classes doing what you're suggesting in some cases. I'm not sure how we can best communicate when that is the case in general.

----------------------------------------
> Date: Sun, 9 Mar 2014 17:16:58 -0700
> Subject: Questions on Helix Model
> From: osgigeek@gmail.com
> To: dev@helix.apache.org
>
> Hey guys,
>
> I got a chance to get back to the model and had a few questions. All
> these are aligned towards cleaning up the model. Let me know if I am
> going down the wrong path in wanting to clean up the model.
>
> * Are constraints only applied to cluster?

No, they can be applied at cluster, resource, participant, and partition scopes. Currently the Helix admin takes in a scope parameter when setting constraints. We can either continue to do this, or expose constraint commands for each scope builder.
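
To make the two options concrete, here is a minimal sketch. Every name in it (Scope, Constraint, Admin, ResourceBuilder, the rule strings) is hypothetical and chosen only for illustration; none of them is the actual Helix admin API. It only shows that a per-scope builder can be a thin wrapper over a scope-parameterized call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are Helix classes.
final class ConstraintScopeSketch {
    // The four scopes named in the reply above.
    enum Scope { CLUSTER, RESOURCE, PARTICIPANT, PARTITION }

    // A constraint tagged with the scope and the entity it applies to.
    static final class Constraint {
        final Scope scope;
        final String target;
        final String rule;

        Constraint(Scope scope, String target, String rule) {
            this.scope = scope;
            this.target = target;
            this.rule = rule;
        }

        @Override
        public String toString() {
            return scope + ":" + target + " -> " + rule;
        }
    }

    // Option 1: one admin entry point that takes the scope as a parameter.
    static final class Admin {
        private final List<Constraint> constraints = new ArrayList<>();

        void setConstraint(Scope scope, String target, String rule) {
            constraints.add(new Constraint(scope, target, rule));
        }

        List<Constraint> constraints() {
            return List.copyOf(constraints);
        }
    }

    // Option 2: a per-scope builder that fixes the scope up front and
    // delegates to the same admin call.
    static final class ResourceBuilder {
        private final Admin admin;
        private final String resource;

        ResourceBuilder(Admin admin, String resource) {
            this.admin = admin;
            this.resource = resource;
        }

        ResourceBuilder addConstraint(String rule) {
            admin.setConstraint(Scope.RESOURCE, resource, rule);
            return this;
        }
    }

    public static void main(String[] args) {
        Admin admin = new Admin();
        admin.setConstraint(Scope.CLUSTER, "myCluster", "maxTransitions=10");
        new ResourceBuilder(admin, "myDB").addConstraint("maxTransitions=2");
        admin.constraints().forEach(System.out::println);
    }
}
```

In this sketch both options end up storing the same scoped record, so the choice between them is about API surface rather than about what is stored.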

>
> * What entities is CurrentState applicable to? It appears it applies
> to a resource + partition combination, am I correct? If so why not add
> the concept of 'Partition' to the model and have it return resources
> and have state on the resources?

Current state only exists at the scope of a participant. It's a runtime entity which indicates which resource partitions a node has and what state they are in. I wouldn't agree with partitions returning resources because a partition is a subset of a resource and a node can serve multiple partitions of the same resource, as well as multiple resources.
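
The containment described above can be sketched with plain collections. This is only an illustration under assumptions: CurrentStateSketch and its methods are hypothetical and are not the Helix CurrentState class, and the node, resource, partition and state names are made up. It shows a node holding several partitions of one resource as well as partitions of another resource, with one state per partition replica.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch using plain collections; not the Helix CurrentState class.
final class CurrentStateSketch {
    // participant -> resource -> partition -> state of the replica on that participant
    private final Map<String, Map<String, Map<String, String>>> byParticipant =
        new LinkedHashMap<>();

    // Record the state a participant reports for one partition of one resource.
    void report(String participant, String resource, String partition, String state) {
        byParticipant
            .computeIfAbsent(participant, p -> new LinkedHashMap<>())
            .computeIfAbsent(resource, r -> new LinkedHashMap<>())
            .put(partition, state);
    }

    // Resources for which the participant holds at least one partition.
    Set<String> resourcesOn(String participant) {
        return new TreeSet<>(byParticipant.getOrDefault(participant, Map.of()).keySet());
    }

    // Partition -> state for one resource on one participant (empty if none).
    Map<String, String> partitionStates(String participant, String resource) {
        return byParticipant
            .getOrDefault(participant, Map.of())
            .getOrDefault(resource, Map.of());
    }

    public static void main(String[] args) {
        CurrentStateSketch cs = new CurrentStateSketch();
        // One node serving two partitions of myDB and one partition of myIndex.
        cs.report("node1", "myDB", "myDB_0", "MASTER");
        cs.report("node1", "myDB", "myDB_1", "SLAVE");
        cs.report("node1", "myIndex", "myIndex_0", "ONLINE");
        System.out.println(cs.resourcesOn("node1"));
        System.out.println(cs.partitionStates("node1", "myDB"));
    }
}
```

The nesting order (resource above partition) reflects the point that a partition is a subset of a resource; inverting it would make a partition a container of resources.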

>
> * Is there a reason we don't have a base 'State' to have derived
> states of IdealState(end state) and CurrentState(where we are in the
> transition)? That way we could potentially add StartState(start of
> transitions) and IntermediateState(any one of the states transitioned
> to while getting to ideal state) as a place holder for all other
> states. What do you think?

Because 'State' means something else in Helix terminology. It's the state of a single replica of a partition (i.e. it's scoped the same as a StateModelDefinition).

I think IdealState and CurrentState are self-explanatory. I don't think adding more classes and interfaces with respect to these two will improve understanding, and will only cause confusion if people stumble across old documents.

>
> * Is there a reason you guys decided not to have a representation of
> 'Cluster' in the model?

The only entities that are cluster-wide are auto join status, constraints, and user configurations. We have something called ClusterConfiguration in the old model that sort of captures this. We also have a ClusterConfig and a Cluster (for runtime snapshots) class.

>
> * HelixConfigScope applies to configurations related to Cluster,
> Resource, Participant, Partition and Constraint. Constraint appears to
> be the odd man, can we fix this? How about having Configuration
> hierarchy i.e. ClusterConfiguration, ResourceConfiguration,
> PartitionConfiguration and so on and do away with Scope?

I would stop using HelixConfigScope and start using Scope where possible. We do have this hierarchy in ClusterConfig, ResourceConfig, and ParticipantConfig.

>
> * We should remove the AlertsHistory, LeaderHistory from the model
> package. These are not true model elements, what do you guys think?
>

AlertsHistory should go away from Helix altogether. No one uses it and it's not scalable. Leader history is a useful runtime thing to expose in the cluster snapshot class (Cluster).

> * We should remove StateModelDefinition, Transition and State from
> model, it is not core model. It applies to entities in the model. What
> do you guys think?
>

They're core Helix concepts. I don't know if that means they should be models or not.

> * We should remove Message from model, it's not a model entity. It
> needs to be a separate package, let me know what you guys think?
>

We support sending messages between nodes in the cluster. Whether or not they should be in model depends on how you define model.

> * LiveInstance should go away, we should only have a status on the
> Participant. Let me know what you guys think?
>

This is runtime information, and see the Participant class, which is meant to be a runtime snapshot of a participant. We can't just have a flag of live/not live. It's also important to know pid, qualified host, and version.
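
One way to picture why a single live/not-live flag is not enough is the sketch below. It assumes that pid, qualified host and version are only known while a participant is connected. ParticipantSketch, LiveInfo and the sample values are hypothetical; they are not the Helix Participant or LiveInstance classes.

```java
import java.util.Optional;

// Hypothetical sketch: not the Helix Participant or LiveInstance classes.
final class ParticipantSketch {
    // Runtime details assumed to exist only while the participant is connected.
    static final class LiveInfo {
        final long pid;
        final String qualifiedHost;
        final String version;

        LiveInfo(long pid, String qualifiedHost, String version) {
            this.pid = pid;
            this.qualifiedHost = qualifiedHost;
            this.version = version;
        }
    }

    // A participant that may or may not currently carry runtime details.
    static final class Participant {
        final String id;
        final Optional<LiveInfo> live;

        private Participant(String id, Optional<LiveInfo> live) {
            this.id = id;
            this.live = live;
        }

        static Participant connected(String id, LiveInfo info) {
            return new Participant(id, Optional.of(info));
        }

        static Participant offline(String id) {
            return new Participant(id, Optional.empty());
        }

        // Liveness is derived from the presence of runtime details.
        boolean isLive() {
            return live.isPresent();
        }
    }

    public static void main(String[] args) {
        Participant up = Participant.connected(
            "node1", new LiveInfo(4242L, "node1.example.com", "0.7.0"));
        Participant down = Participant.offline("node2");
        System.out.println(up.id + " live=" + up.isLive());
        System.out.println(down.id + " live=" + down.isLive());
    }
}
```

Here the flag falls out of whether the runtime details are present, rather than being stored next to them.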

> I am hoping to work a bit on the model tonight, so let me know what
> you guys think.
>
> Thanks,
>
> Sandeep