Posted to dev@ignite.apache.org by Anton Kalashnikov <ka...@yandex.ru> on 2019/01/24 15:33:04 UTC

Baseline auto-adjust`s discuss

Hello, Igniters!

Work on Phase II of IEP-4 (Baseline topology) [1] has started. I want to start a discussion of the implementation of "Baseline auto-adjust" [2].

"Baseline auto-adjust" feature implements mechanism of auto-adjust baseline corresponding to current topology after event join/left was appeared. It is required because when a node left the grid and nobody would change baseline manually it can lead to lost data(when some more nodes left the grid on depends in backup factor) but permanent tracking of grid is not always possible/desirible. Looks like in many cases auto-adjust baseline after some timeout is very helpfull.

Distributed metastore [3] (already done):

First of all, we need the ability to store configuration data consistently and cluster-wide. Ignite doesn't have any specific API for such configurations, and we don't want many similar implementations of the same feature in our code. After some thought, it was proposed to implement it as a kind of distributed metastorage that gives the ability to store any data in it.
The first implementation is based on the existing local metastorage API for persistent clusters (in-memory clusters will store data in memory). Write/remove operations use the Discovery SPI to send updates to the cluster; this guarantees update ordering and that all existing (alive) nodes have handled the update message. To find out which node has the latest data, there is a "version" value of the distributed metastorage, which is basically <number of all updates, hash of updates>. The update history up to some point in the past is stored along with the data, so when an outdated node connects to the cluster it will receive all the missing updates and apply them locally. If there is not enough history stored, or the joining node is clean, it will receive a snapshot of the distributed metastorage, so there won't be inconsistencies.
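The version check described above can be sketched as follows; the class and field names here are illustrative only, not Ignite's actual internal API:

```java
// Sketch of the distributed metastorage "version" described above:
// a pair <number of all updates, hash of updates>. Illustrative names.
import java.util.Objects;

public class MetastorageVersion {
    final long updateCount;  // how many updates have been applied
    final long updatesHash;  // accumulated hash of all applied updates

    MetastorageVersion(long updateCount, long updatesHash) {
        this.updateCount = updateCount;
        this.updatesHash = updatesHash;
    }

    /** Next version after applying one more update payload. */
    MetastorageVersion next(String updatePayload) {
        // Mix the payload hash into the running hash, so both the number
        // and the content of updates must match for versions to agree.
        return new MetastorageVersion(updateCount + 1,
            31 * updatesHash + Objects.hashCode(updatePayload));
    }

    /** A joining node is outdated if it has applied fewer updates. */
    boolean isAheadOf(MetastorageVersion other) {
        return updateCount > other.updateCount;
    }

    public static void main(String[] args) {
        MetastorageVersion cluster = new MetastorageVersion(0, 0)
            .next("baselineAutoAdjustEnabled=true")
            .next("baselineAutoAdjustTimeout=30000");
        MetastorageVersion joining = new MetastorageVersion(0, 0)
            .next("baselineAutoAdjustEnabled=true");

        // The cluster is ahead: the joining node must receive either the
        // missing history (one update here) or a full snapshot.
        System.out.println(cluster.isAheadOf(joining));                // true
        System.out.println(cluster.updateCount - joining.updateCount); // 1
    }
}
```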

Baseline auto-adjust:

Main scenario:
	- There is a grid whose baseline is equal to the current topology
	- A new node joins the grid, or some node leaves (fails)
	- The new mechanism detects this event and adds a baseline-change task to a queue with the configured timeout
	- If a new event happens before the baseline is changed, the old task is removed from the queue and a new task is added
	- When the timeout expires, the task tries to set a new baseline corresponding to the current topology
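The queueing behavior in the scenario above can be sketched with a plain scheduler. This is a minimal sketch with illustrative names; it does not use Ignite internals:

```java
// Every join/leave event cancels the pending baseline-change task and
// schedules a fresh one, so the baseline only changes after the topology
// has been quiet for the whole timeout. Names are illustrative.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BaselineAutoAdjustSketch {
    final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    final AtomicInteger adjustments = new AtomicInteger();
    private ScheduledFuture<?> pending;

    /** Called on every node join/leave event. */
    synchronized void onTopologyEvent(long timeoutMs) {
        if (pending != null)
            pending.cancel(false); // topology changed again: drop the old task
        // the baseline-change task runs only if no new event arrives in time
        pending = timer.schedule(
            () -> { adjustments.incrementAndGet(); }, // stands in for "set new baseline"
            timeoutMs, TimeUnit.MILLISECONDS);
    }

    static void awaitQuiet(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) { }
    }

    public static void main(String[] args) {
        BaselineAutoAdjustSketch s = new BaselineAutoAdjustSketch();
        s.onTopologyEvent(200); // node left
        s.onTopologyEvent(200); // node joined before the timeout: rescheduled
        awaitQuiet(500);
        System.out.println(s.adjustments.get()); // 1 - only one adjustment ran
        s.timer.shutdown();
    }
}
```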

First of all, we need to add two parameters [4]:
	- baselineAutoAdjustEnabled - enables/disables the "Baseline auto-adjust" feature.
	- baselineAutoAdjustTimeout - timeout after which the baseline should be changed.

These parameters are cluster-wide and can be changed at runtime because they are based on the "Distributed metastore". Initially they are seeded by the corresponding parameters (initBaselineAutoAdjustEnabled, initBaselineAutoAdjustTimeout) from "Ignite Configuration". The init value is valid only until the first change; after a value is changed, it is stored in the "Distributed metastore".
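The "init value is valid only until the first change" semantics might be modeled as below. The class and key names are hypothetical, chosen for illustration only:

```java
// The IgniteConfiguration value seeds the property only until the first
// cluster-wide change; afterwards the distributed-metastore value wins.
// Illustrative sketch, not Ignite's actual API.
import java.util.HashMap;
import java.util.Map;

public class DistributedProperty {
    // stands in for the distributed metastore key/value storage
    static final Map<String, Long> metastore = new HashMap<>();

    final String key;
    final long initValue; // e.g. initBaselineAutoAdjustTimeout from configuration

    DistributedProperty(String key, long initValue) {
        this.key = key;
        this.initValue = initValue;
    }

    /** The metastore value wins once it exists; the init value is only a seed. */
    long get() {
        return metastore.getOrDefault(key, initValue);
    }

    /** A runtime change is written cluster-wide to the metastore. */
    void set(long value) {
        metastore.put(key, value);
    }

    public static void main(String[] args) {
        DistributedProperty timeout =
            new DistributedProperty("baselineAutoAdjustTimeout", 30_000);
        System.out.println(timeout.get()); // 30000 - init value before any change
        timeout.set(60_000);               // changed at runtime, stored cluster-wide
        System.out.println(timeout.get()); // 60000 - metastore value wins from now on
    }
}
```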

Restrictions:
	- This mechanism handles events only on an active grid
	- If baselineNodes != gridNodes at activation, this feature is disabled
	- If lost partitions were detected, this feature is disabled
	- If the baseline was adjusted manually while baselineNodes != gridNodes, this feature is disabled

You can find a draft implementation here [5]. Feel free to ask for more details and make suggestions.

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches
[2] https://issues.apache.org/jira/browse/IGNITE-8571
[3] https://issues.apache.org/jira/browse/IGNITE-10640
[4] https://issues.apache.org/jira/browse/IGNITE-8573
[5] https://github.com/apache/ignite/pull/5907

-- 
Best regards,
Anton Kalashnikov


Re: Baseline auto-adjust`s discuss

Posted by Павлухин Иван <vo...@gmail.com>.
Anton,

Thank you for the clarification!

Tue, Jan 29, 2019 at 15:15, Anton Kalashnikov <ka...@yandex.ru>:
>
> Ivan, I'm glad you are interested in this feature. Some answers are below.
>
> Yes, that is correct about the properties: they are consistent across the cluster because they are based on the distributed metastore (we have another topic to discuss it).
>
> Some implementation details: when an event happens, we add a task to GridTimeoutProcessor (the old mechanism for delayed task execution). The task is added only on the coordinator. It is not required on non-coordinator nodes, because if the coordinator fails, a new event will appear and we will generate a new task on the new coordinator.
>
> --
> Best regards,
> Anton Kalashnikov
>
>
> 28.01.2019, 13:17, "Павлухин Иван" <vo...@gmail.com>:
> > Anton,
> >
> > Great feature!
> >
> > Could you please clarify a bit about implementation details? As I
> > understand, the auto-adjust properties are meant to be consistent across
> > the cluster. And baseline adjustment is put into some delay queue. Do we
> > put the event into a queue on each node? Or is there some dedicated node
> > driving baseline adjustment?
> >
> > Fri, Jan 25, 2019 at 16:31, Anton Kalashnikov <ka...@yandex.ru>:
> >>  Initially, the hard timeout was supposed to protect the grid from a constantly changing topology (a constantly blinking node). But in fact, if the topology is constantly changing, the baseline adjust operation fails in most cases. As a result, the hard timeout only adds complexity and doesn't give any new guarantee. So I think we can skip it in the first implementation.
> >>
> >>  First of all, the timeout protects us from unnecessary baseline adjustment, e.g. when a node leaves the grid and immediately (or after some time shorter than the timeout) joins back. The timeout is also helpful in other cases when several events happen one after another.
> >>
> >>  This feature doesn't have any complex heuristics, except those described in the restrictions section.
> >>
> >>  I also want to note that this feature doesn't protect us from a constantly blinking node. We need one more heuristic mechanism to detect that situation and take some action, like removing the node from the grid.
> >>
> >>  --
> >>  Best regards,
> >>  Anton Kalashnikov
> >>
> >>  25.01.2019, 15:43, "Sergey Chugunov" <se...@gmail.com>:
> >>  > Anton,
> >>  >
> >>  > As I understand from the IEP document, the policy was supposed to support
> >>  > two timeouts: soft and hard, so here you're proposing slightly simpler
> >>  > functionality.
> >>  >
> >>  > Just to clarify, do I understand correctly that this feature, when
> >>  > enabled, will auto-adjust the BLT on each node join/left event, and the
> >>  > timeout is necessary to protect us from blinking nodes?
> >>  > So no complexities with taking into account the number of alive backups
> >>  > or anything like that?
> >>  >
> >>  > On Fri, Jan 25, 2019 at 1:11 PM Vladimir Ozerov <vo...@gridgain.com>
> >>  > wrote:
> >>  >
> >>  >> Got it, makes sense.
> >>  >>
> >>  >> On Fri, Jan 25, 2019 at 11:06 AM Anton Kalashnikov <ka...@yandex.ru>
> >>  >> wrote:
> >>  >>
> >>  >> > Vladimir, thanks for your notes; both of them look good enough, but I
> >>  >> > have two different thoughts about them.
> >>  >> >
> >>  >> > I think I agree about enabling only one of manual/auto adjustment. It
> >>  >> > is easier than the current solution, and in fact, as an extra feature,
> >>  >> > we can allow the user to force the task to execute (if they don't want
> >>  >> > to wait until the timeout expires).
> >>  >> > But about the second one, I am not sure that one parameter instead of
> >>  >> > two would be more convenient. For example: if a user changed the
> >>  >> > timeout and then disabled auto-adjust, then whoever later wants to
> >>  >> > enable it again should know what the timeout value was before
> >>  >> > auto-adjust was disabled. I think the "negative value" pattern is a
> >>  >> > good choice for always-usable parameters like a connection timeout
> >>  >> > (e.g. -1 equals endless waiting) and so on, but in our case we want to
> >>  >> > disable the whole functionality rather than change a parameter value.
> >>  >> >
> >>  >> > --
> >>  >> > Best regards,
> >>  >> > Anton Kalashnikov
> >>  >> >
> >>  >> >
> >>  >> > 24.01.2019, 22:03, "Vladimir Ozerov" <vo...@gridgain.com>:
> >>  >> > > Hi Anton,
> >>  >> > >
> >>  >> > > This is a great feature, but I am a bit confused about automatic
> >>  >> > > disabling of the feature during manual baseline adjustment. This may
> >>  >> > > lead to unpleasant situations when a user enabled auto-adjustment,
> >>  >> > > then re-adjusted it manually somehow (e.g. from some previously
> >>  >> > > created script) so that the auto-adjustment disabling went unnoticed,
> >>  >> > > then added more nodes hoping that auto-baseline is still active, etc.
> >>  >> > >
> >>  >> > > Instead, I would rather make manual and auto adjustment mutually
> >>  >> > > exclusive - the baseline cannot be adjusted manually when auto mode
> >>  >> > > is set, and vice versa. If an exception is thrown in those cases,
> >>  >> > > administrators will always know the current behavior of the system.
> >>  >> > >
> >>  >> > > As far as configuration, wouldn't it be enough to have a single long
> >>  >> > > value as opposed to Boolean + long? Say, 0 - immediate auto
> >>  >> > > adjustment, negative - disabled, positive - auto adjustment after
> >>  >> > > timeout.
> >>  >> > >
> >>  >> > > Thoughts?
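The single-long encoding proposed in the quoted message could look like the sketch below; the class and method names are hypothetical, added here for illustration:

```java
// Sketch of the single-parameter encoding: negative disables auto-adjust,
// 0 means immediate adjustment, positive is the adjustment timeout in ms.
public class AutoAdjustEncoding {
    static boolean isEnabled(long timeoutMs) {
        return timeoutMs >= 0; // a negative value means the feature is off
    }

    static long effectiveTimeout(long timeoutMs) {
        if (!isEnabled(timeoutMs))
            throw new IllegalStateException("auto-adjust is disabled");
        return timeoutMs; // 0 => immediate, positive => delay in milliseconds
    }

    public static void main(String[] args) {
        System.out.println(isEnabled(-1));            // false - disabled
        System.out.println(isEnabled(0));             // true - immediate adjustment
        System.out.println(effectiveTimeout(30_000)); // 30000
    }
}
```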
> >>  >> > >
> > --
> > Best regards,
> > Ivan Pavlukhin



-- 
Best regards,
Ivan Pavlukhin

Re: Baseline auto-adjust`s discuss

Posted by Anton Kalashnikov <ka...@yandex.ru>.
Ivan, I'm glad you are interested in this feature. Some answers are below.

Yes, that is correct about the properties: they are consistent across the cluster because they are based on the distributed metastore (we have another topic to discuss it).

Some implementation details: when an event happens, we add a task to GridTimeoutProcessor (the old mechanism for delayed task execution). The task is added only on the coordinator. It is not required on non-coordinator nodes, because if the coordinator fails, a new event will appear and we will generate a new task on the new coordinator.
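The coordinator-only scheduling described above could be sketched as follows. A plain timer stands in for the internal GridTimeoutProcessor, and the names are illustrative, not Ignite's actual API:

```java
// Only the current coordinator schedules the delayed baseline-change task.
// If the coordinator dies, that very event reaches the new coordinator,
// which schedules a fresh task, so non-coordinators never need one.
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;

public class CoordinatorScheduling {
    /** The coordinator is conventionally the oldest node in the topology. */
    static boolean isCoordinator(String localNodeId, List<String> nodesOldestFirst) {
        return nodesOldestFirst.get(0).equals(localNodeId);
    }

    static void onTopologyEvent(String localNodeId, List<String> nodes,
                                Timer timer, Runnable adjustBaseline, long timeoutMs) {
        if (!isCoordinator(localNodeId, nodes))
            return; // non-coordinators do nothing; a new coordinator reschedules on failover
        timer.schedule(new TimerTask() {
            @Override public void run() { adjustBaseline.run(); }
        }, timeoutMs);
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("A", "B", "C"); // A is oldest => coordinator
        System.out.println(isCoordinator("A", nodes)); // true
        System.out.println(isCoordinator("B", nodes)); // false
    }
}
```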

-- 
Best regards,
Anton Kalashnikov


28.01.2019, 13:17, "Павлухин Иван" <vo...@gmail.com>:
> Anton,
>
> Great feature!
>
> Could you please clarify a bit about implementation details? As I
> understand, the auto-adjust properties are meant to be consistent across
> the cluster. And baseline adjustment is put into some delay queue. Do we
> put the event into a queue on each node? Or is there some dedicated node
> driving baseline adjustment?
>
> --
> Best regards,
> Ivan Pavlukhin

Re: Baseline auto-adjust`s discuss

Posted by Павлухин Иван <vo...@gmail.com>.
Anton,

Great feature!

Could you please clarify a bit about implementation details? As I
understand, the auto-adjust properties are meant to be consistent across
the cluster. And baseline adjustment is put into some delay queue. Do we
put the event into a queue on each node? Or is there some dedicated node
driving baseline adjustment?

> >>  > gridNodes
> >>  > >> this feature would be disabled
> >>  > >>
> >>  > >> Draft implementation you can find here[5]. Feel free to ask more
> >>  > details
> >>  > >> and make suggestions.
> >>  > >>
> >>  > >> [1]
> >>  > >>
> >>  >
> >>  https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches
> >>  > >> [2] https://issues.apache.org/jira/browse/IGNITE-8571
> >>  > >> [3] https://issues.apache.org/jira/browse/IGNITE-10640
> >>  > >> [4] https://issues.apache.org/jira/browse/IGNITE-8573
> >>  > >> [5] https://github.com/apache/ignite/pull/5907
> >>  > >>
> >>  > >> --
> >>  > >> Best regards,
> >>  > >> Anton Kalashnikov
> >>  >



-- 
Best regards,
Ivan Pavlukhin

Re: Baseline auto-adjust`s discuss

Posted by Anton Kalashnikov <ka...@yandex.ru>.
Initially, the hard timeout was meant to protect the grid from a constantly changing topology (a constantly blinking node). But in fact, if the topology changes constantly, the baseline adjust operation fails in most cases anyway. As a result, the hard timeout only added complexity without giving any new guarantee. So I think we can skip it in the first implementation.

First of all, the timeout protects us from unnecessary baseline adjustments, e.g. when a node leaves the grid and immediately (or after some time shorter than the timeout) joins back. The timeout is also helpful in other cases when several events happen one after another.

This feature doesn't have any complex heuristics, except for those described in the restrictions section.

Also, I want to note that this feature doesn't protect us from a constantly blinking node. We need an additional heuristic mechanism to detect that situation and take some action, such as removing the node from the grid.
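The timeout behavior described above is essentially a debounce: every new topology event cancels the pending baseline-change task and schedules a fresh one, so only a stable topology triggers an adjustment. A minimal sketch of that logic (class and method names are hypothetical, not Ignite's actual implementation):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Debounces topology events: only the last event within the timeout triggers a baseline change. */
public class BaselineAutoAdjustScheduler {
    private final ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
    private final long timeoutMs;
    private final Runnable adjustTask;
    private ScheduledFuture<?> pending;

    public BaselineAutoAdjustScheduler(long timeoutMs, Runnable adjustTask) {
        this.timeoutMs = timeoutMs;
        this.adjustTask = adjustTask;
    }

    /** Called on every node join/left event. Replaces any previously scheduled task. */
    public synchronized void onTopologyEvent() {
        if (pending != null)
            pending.cancel(false); // drop the previous task, per the main scenario above

        pending = exec.schedule(adjustTask, timeoutMs, TimeUnit.MILLISECONDS);
    }

    public void shutdown() {
        exec.shutdownNow();
    }
}
```

With this shape, two events arriving within the timeout produce a single baseline change, which matches the "task would be removed from queue and new task will be added" step of the main scenario.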

-- 
Best regards,
Anton Kalashnikov


25.01.2019, 15:43, "Sergey Chugunov" <se...@gmail.com>:
> Anton,
>
> As I understand from the IEP document policy was supposed to support two
> timeouts: soft and hard, so here you're proposing a bit simpler
> functionality.
>
> Just to clarify, do I understand correctly that this feature when enabled
> will auto-adjust blt on each node join/node left event, and timeout is
> necessary to protect us from blinking nodes?
> So no complexities with taking into account number of alive backups or
> something like that?
>
> On Fri, Jan 25, 2019 at 1:11 PM Vladimir Ozerov <vo...@gridgain.com>
> wrote:
>
>>  Got it, makes sense.
>>
>>  On Fri, Jan 25, 2019 at 11:06 AM Anton Kalashnikov <ka...@yandex.ru>
>>  wrote:
>>
>>  > Vladimir, thanks for your notes, both of them looks good enough but I
>>  > have two different thoughts about it.
>>  >
>>  > I think I agree about enabling only one of manual/auto adjustment. It is
>>  > easier than current solution and in fact as extra feature we can allow
>>  > user to force task to execute(if they doesn't want to wait until timeout
>>  > expired).
>>  > But about second one I don't sure that one parameters instead of two
>>  would
>>  > be more convenient. For example: in case when user changed timeout and
>>  then
>>  > disable auto-adjust after then when someone will want to enable it they
>>  > should know what value of timeout was before auto-adjust was disabled. I
>>  > think "negative value" pattern good choice for always usable parameters
>>  > like timeout of connection (ex. -1 equal to endless waiting) and so on,
>>  but
>>  > in our case we want to disable whole functionality rather than change
>>  > parameter value.
>>  >
>>  > --
>>  > Best regards,
>>  > Anton Kalashnikov
>>  >
>>  >
>>  > 24.01.2019, 22:03, "Vladimir Ozerov" <vo...@gridgain.com>:
>>  > > Hi Anton,
>>  > >
>>  > > This is great feature, but I am a bit confused about automatic
>>  disabling
>>  > of
>>  > > a feature during manual baseline adjustment. This may lead to
>>  unpleasant
>>  > > situations when a user enabled auto-adjustment, then re-adjusted it
>>  > > manually somehow (e.g. from some previously created script) so that
>>  > > auto-adjustment disabling went unnoticed, then added more nodes hoping
>>  > that
>>  > > auto-baseline is still active, etc.
>>  > >
>>  > > Instead, I would rather make manual and auto adjustment mutually
>>  > exclusive
>>  > > - baseline cannot be adjusted manually when auto mode is set, and vice
>>  > > versa. If exception is thrown in that cases, administrators will always
>>  > > know current behavior of the system.
>>  > >
>>  > > As far as configuration, wouldn’t it be enough to have a single long
>>  > value
>>  > > as opposed to Boolean + long? Say, 0 - immediate auto adjustment,
>>  > negative
>>  > > - disabled, positive - auto adjustment after timeout.
>>  > >
>>  > > Thoughts?
>>  > >

Re: Baseline auto-adjust`s discuss

Posted by Sergey Chugunov <se...@gmail.com>.
Anton,

As I understand from the IEP document, the policy was supposed to support two
timeouts, soft and hard, so here you're proposing slightly simpler
functionality.

Just to clarify, do I understand correctly that this feature when enabled
will auto-adjust blt on each node join/node left event, and timeout is
necessary to protect us from blinking nodes?
So no complexities with taking into account number of alive backups or
something like that?

On Fri, Jan 25, 2019 at 1:11 PM Vladimir Ozerov <vo...@gridgain.com>
wrote:

> Got it, makes sense.
>
> On Fri, Jan 25, 2019 at 11:06 AM Anton Kalashnikov <ka...@yandex.ru>
> wrote:
>
> > Vladimir, thanks  for your notes, both of them looks good enough but I
> > have two different thoughts about it.
> >
> > I think I agree about enabling only one of manual/auto adjustment. It is
> > easier than current solution and in fact as extra feature  we can allow
> > user to force task to execute(if they doesn't want to wait until timeout
> > expired).
> > But about second one I don't sure that one parameters instead of two
> would
> > be more convenient. For example: in case when user changed timeout and
> then
> > disable auto-adjust after then when someone will want to enable it they
> > should know what value of timeout was before auto-adjust was disabled. I
> > think "negative value" pattern good choice for always usable parameters
> > like timeout of connection (ex. -1 equal to endless waiting) and so on,
> but
> > in our case we want to disable whole functionality rather than change
> > parameter value.
> >
> > --
> > Best regards,
> > Anton Kalashnikov
> >
> >
> > 24.01.2019, 22:03, "Vladimir Ozerov" <vo...@gridgain.com>:
> > > Hi Anton,
> > >
> > > This is great feature, but I am a bit confused about automatic
> disabling
> > of
> > > a feature during manual baseline adjustment. This may lead to
> unpleasant
> > > situations when a user enabled auto-adjustment, then re-adjusted it
> > > manually somehow (e.g. from some previously created script) so that
> > > auto-adjustment disabling went unnoticed, then added more nodes hoping
> > that
> > > auto-baseline is still active, etc.
> > >
> > > Instead, I would rather make manual and auto adjustment mutually
> > exclusive
> > > - baseline cannot be adjusted manually when auto mode is set, and vice
> > > versa. If exception is thrown in that cases, administrators will always
> > > know current behavior of the system.
> > >
> > > As far as configuration, wouldn’t it be enough to have a single long
> > value
> > > as opposed to Boolean + long? Say, 0 - immediate auto adjustment,
> > negative
> > > - disabled, positive - auto adjustment after timeout.
> > >
> > > Thoughts?
> > >

Re: Baseline auto-adjust`s discuss

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Got it, makes sense.

On Fri, Jan 25, 2019 at 11:06 AM Anton Kalashnikov <ka...@yandex.ru>
wrote:

> Vladimir, thanks  for your notes, both of them looks good enough but I
> have two different thoughts about it.
>
> I think I agree about enabling only one of manual/auto adjustment. It is
> easier than current solution and in fact as extra feature  we can allow
> user to force task to execute(if they doesn't want to wait until timeout
> expired).
> But about second one I don't sure that one parameters instead of two would
> be more convenient. For example: in case when user changed timeout and then
> disable auto-adjust after then when someone will want to enable it they
> should know what value of timeout was before auto-adjust was disabled. I
> think "negative value" pattern good choice for always usable parameters
> like timeout of connection (ex. -1 equal to endless waiting) and so on, but
> in our case we want to disable whole functionality rather than change
> parameter value.
>
> --
> Best regards,
> Anton Kalashnikov
>
>
> 24.01.2019, 22:03, "Vladimir Ozerov" <vo...@gridgain.com>:
> > Hi Anton,
> >
> > This is great feature, but I am a bit confused about automatic disabling
> of
> > a feature during manual baseline adjustment. This may lead to unpleasant
> > situations when a user enabled auto-adjustment, then re-adjusted it
> > manually somehow (e.g. from some previously created script) so that
> > auto-adjustment disabling went unnoticed, then added more nodes hoping
> that
> > auto-baseline is still active, etc.
> >
> > Instead, I would rather make manual and auto adjustment mutually
> exclusive
> > - baseline cannot be adjusted manually when auto mode is set, and vice
> > versa. If exception is thrown in that cases, administrators will always
> > know current behavior of the system.
> >
> > As far as configuration, wouldn’t it be enough to have a single long
> value
> > as opposed to Boolean + long? Say, 0 - immediate auto adjustment,
> negative
> > - disabled, positive - auto adjustment after timeout.
> >
> > Thoughts?
> >

Re: Baseline auto-adjust`s discuss

Posted by Anton Kalashnikov <ka...@yandex.ru>.
Vladimir, thanks for your notes; both of them look good, but I have two different thoughts about them.

I think I agree about enabling only one of manual/auto adjustment. It is easier than the current solution and, in fact, as an extra feature we can allow the user to force the task to execute (if they don't want to wait until the timeout expires).
But about the second one, I'm not sure that one parameter instead of two would be more convenient. For example: if a user changes the timeout and then disables auto-adjust, whoever later wants to re-enable it has to know what the timeout value was before auto-adjust was disabled. I think the "negative value" pattern is a good choice for always-applicable parameters like a connection timeout (e.g. -1 meaning endless waiting), but in our case we want to disable the whole functionality rather than change a parameter value.
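To make the trade-off concrete, here is a sketch (hypothetical names, not the actual Ignite API) of the single-long encoding: negative means disabled, 0 means immediate adjustment, positive is the timeout. Note that disabling overwrites the stored timeout, which is exactly the information loss described above:

```java
/** Single-parameter encoding: negative = disabled, 0 = immediate, positive = adjust after timeout (ms). */
public class AutoAdjustParam {
    private long value;

    public AutoAdjustParam(long value) {
        this.value = value;
    }

    public boolean isEnabled() {
        return value >= 0;
    }

    public long timeoutMs() {
        return Math.max(value, 0);
    }

    /** Disabling discards the stored timeout -- the caller must remember it to re-enable with the same value. */
    public void disable() {
        value = -1;
    }

    public void enable(long timeoutMs) {
        value = timeoutMs;
    }
}
```

With separate enabled/timeout parameters, `disable()` would leave the timeout intact, so re-enabling restores the previous behavior without the caller tracking it.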

-- 
Best regards,
Anton Kalashnikov


24.01.2019, 22:03, "Vladimir Ozerov" <vo...@gridgain.com>:
> Hi Anton,
>
> This is great feature, but I am a bit confused about automatic disabling of
> a feature during manual baseline adjustment. This may lead to unpleasant
> situations when a user enabled auto-adjustment, then re-adjusted it
> manually somehow (e.g. from some previously created script) so that
> auto-adjustment disabling went unnoticed, then added more nodes hoping that
> auto-baseline is still active, etc.
>
> Instead, I would rather make manual and auto adjustment mutually exclusive
> - baseline cannot be adjusted manually when auto mode is set, and vice
> versa. If exception is thrown in that cases, administrators will always
> know current behavior of the system.
>
> As far as configuration, wouldn’t it be enough to have a single long value
> as opposed to Boolean + long? Say, 0 - immediate auto adjustment, negative
> - disabled, positive - auto adjustment after timeout.
>
> Thoughts?
>

Re: Baseline auto-adjust`s discuss

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Hi Anton,

This is a great feature, but I am a bit confused about the automatic disabling
of the feature during manual baseline adjustment. This may lead to unpleasant
situations where a user enables auto-adjustment, then re-adjusts the baseline
manually somehow (e.g. from some previously created script) so that the
disabling of auto-adjustment goes unnoticed, then adds more nodes hoping that
auto-baseline is still active, etc.

Instead, I would rather make manual and auto adjustment mutually exclusive
- the baseline cannot be adjusted manually when auto mode is set, and vice
versa. If an exception is thrown in those cases, administrators will always
know the current behavior of the system.

As for configuration, wouldn't it be enough to have a single long value
instead of Boolean + long? Say, 0 - immediate auto adjustment, negative -
disabled, positive - auto adjustment after the timeout.

Thoughts?
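The single-long encoding proposed above could be interpreted roughly like
this (a minimal sketch with a hypothetical helper name, not Ignite API):

```python
# Sketch of the proposed single-parameter scheme (hypothetical helper,
# not part of the Ignite API): negative -> auto-adjustment disabled,
# 0 -> immediate adjustment, positive -> adjustment after that many ms.
def interpret_auto_adjust(value: int):
    """Map a single long setting to (enabled, effective_timeout_ms)."""
    if value < 0:
        return (False, None)   # auto-adjustment disabled
    return (True, value)       # 0 means "adjust immediately"

# One knob covers all three states unambiguously:
assert interpret_auto_adjust(-1) == (False, None)
assert interpret_auto_adjust(0) == (True, 0)
assert interpret_auto_adjust(30_000) == (True, 30_000)
```

One advantage of a single value is that the enabled flag and the timeout can
never get out of sync when changed concurrently.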

чт, 24 янв. 2019 г. в 18:33, Anton Kalashnikov <ka...@yandex.ru>:

>
> Hello, Igniters!
>
> Work on the Phase II of IEP-4 (Baseline topology) [1] has started. I want
> to start to discuss of implementation of "Baseline auto-adjust" [2].
>
> The "Baseline auto-adjust" feature implements a mechanism that
> automatically adjusts the baseline to the current topology after a node
> join/leave event. It is needed because when a node leaves the grid and
> nobody changes the baseline manually, data can be lost (when more nodes
> leave the grid, depending on the backup factor), but permanently
> monitoring the grid is not always possible/desirable. In many cases
> auto-adjusting the baseline after some timeout is very helpful.
>
> Distributed metastore[3](it is already done):
>
> First of all, we need the ability to store configuration data
> consistently and cluster-wide. Ignite doesn't have any specific API for
> such configurations, and we don't want many similar implementations of
> the same feature in our code. After some thought it was proposed to
> implement it as a kind of distributed metastorage that allows storing
> arbitrary data.
> The first implementation is based on the existing local metastorage API
> for persistent clusters (in-memory clusters will store the data in
> memory). Write/remove operations use the Discovery SPI to send updates to
> the cluster, which guarantees the order of updates and the fact that all
> existing (alive) nodes have handled the update message. To find out which
> node has the latest data, there is a "version" value of the distributed
> metastorage, which is basically <number of all updates, hash of updates>.
> The update history up to some point in the past is stored along with the
> data, so when an outdated node connects to the cluster it will receive
> all the missing data and apply it locally. If there is not enough history
> stored, or the joining node is clean, then it will receive a snapshot of
> the distributed metastorage, so there won't be inconsistencies.
>
> Baseline auto-adjust:
>
> Main scenario:
>         - There is a grid whose baseline is equal to the current topology
>         - A new node joins the grid, or some node leaves (fails)
>         - The new mechanism detects this event and adds a baseline-change
> task to the queue with the configured timeout
>         - If a new event happens before the baseline is changed, the task
> is removed from the queue and a new task is added
>         - When the timeout expires, the task tries to set a new baseline
> corresponding to the current topology
>
> First of all we need to add two parameters[4]:
>         - baselineAutoAdjustEnabled - enable/disable "Baseline
> auto-adjust" feature.
>         - baselineAutoAdjustTimeout - timeout after which baseline should
> be changed.
>
> These parameters are cluster-wide and can be changed at runtime because
> they are backed by the "Distributed metastore". Initially they are seeded
> from the corresponding parameters (initBaselineAutoAdjustEnabled,
> initBaselineAutoAdjustTimeout) in the "Ignite Configuration". The initial
> values apply only until the first change; once changed, the values are
> stored in the "Distributed metastore".
>
> Restrictions:
>         - This mechanism handles events only on an active grid
>         - If baselineNodes != gridNodes at activation, this feature is
> disabled
>         - If lost partitions are detected, this feature is disabled
>         - If the baseline was adjusted manually while baselineNodes !=
> gridNodes, this feature is disabled
>
> A draft implementation can be found here[5]. Feel free to ask for more
> details and to make suggestions.
>
> [1]
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-4+Baseline+topology+for+caches
> [2] https://issues.apache.org/jira/browse/IGNITE-8571
> [3] https://issues.apache.org/jira/browse/IGNITE-10640
> [4] https://issues.apache.org/jira/browse/IGNITE-8573
> [5] https://github.com/apache/ignite/pull/5907
>
> --
> Best regards,
> Anton Kalashnikov
>
>
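
The "main scenario" quoted above is essentially a debounce: every topology
event cancels the pending baseline-change task and schedules a new one, so
the baseline is only changed once the topology has been stable for the
configured timeout. A minimal sketch (hypothetical class and callback
names, not the actual Ignite implementation):

```python
import threading

class BaselineAutoAdjuster:
    """Debounce sketch: each topology event supersedes the previously
    queued baseline-change task, so set_baseline runs only after the
    topology has been stable for `timeout` seconds."""

    def __init__(self, timeout, set_baseline):
        self.timeout = timeout             # seconds of required stability
        self.set_baseline = set_baseline   # callback applying the new baseline
        self._timer = None
        self._lock = threading.Lock()

    def on_topology_event(self, topology):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()       # drop the previously queued task
            # Schedule a fresh task for the current topology.
            self._timer = threading.Timer(
                self.timeout, self.set_baseline, args=(topology,))
            self._timer.start()
```

With this shape, a burst of join/leave events results in a single baseline
change reflecting the final topology, which matches the queue-replacement
behavior described in the scenario.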