You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ignite.apache.org by Aleksandr Polovtsev <al...@gmail.com> on 2022/10/18 07:59:45 UTC

Shutdown policy refactoring

Hello, dear Igniters!

I'd like to propose a refactoring of the current Shutdown Policy mechanism.
Currently, node shutdown is controlled by a bunch of flags and enums (I may
be missing some of them):

1. ShutdownPolicy enum has only two entries and is basically a flag that
dictates if we need to wait for the data to be backed up, so that we will
not lose any data when the node goes offline.
2. "cancel" flag is passed to the stop method and indicates that all long
running jobs should be interrupted. For example, it prevents the node
performing a checkpoint on shutdown. I'm pretty sure this flag is nearly
always set to "true", because it is used by the Shutdown Hook.
3. There are also a bunch of system properties that affect the shutdown:
    * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
node stop, which heavily correlates with the "cancel" flag.
    * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates the
ShutdownPolicy enum
    * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't know if
it is really useful.

I think that this approach is counter intuitive and would like to
incorporate all aforementioned flags into the ShutdownPolicy enum. For
example, it can look as follows:

public enum ShutdownPolicy {
    IMMEDIATE // Stops the node and cancels all running jobs
    GRACEFUL // Stops the node, does not cancel running jobs and waits for
backups
    GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running jobs,
does not wait for backups
}

After that, the "cancel" flag and all corresponding properties can be
removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still needed).

Does this make sense? What do you think?

-- 
With regards,
Aleksandr Polovtsev

Re: Shutdown policy refactoring

Posted by Aleksandr Polovtsev <al...@gmail.com>.
Thank you, Stan, for the detailed feedback.

I totally agree with the "dimensions" stuff, I think that's what I tried to
describe in the original proposal. I still think that we can get away with
a single enum, because it at least covers the current use cases. In any
case, it will be safe to add more values to it if needed. So, if everyone
agrees with the proposed enum (with three values), I'll prepare a ticket
and will start working on it some time in the future.

It would still be nice to get more feedback, though, because this
improvement may affect some users whose use cases I'm not aware of.

On Sun, Oct 30, 2022 at 9:46 PM Stanislav Lukyanov <st...@gmail.com>
wrote:

> Hi Aleksandr,
>
> Thanks for driving this. Someone should put a stop to the shutdown
> behavior nonsense :)
>
> But let me complicate things a bit for you.
>
> I see multiple dimensions that are in play when node shutdown happens:
>
> - Running operations (Compute, SQL, Cache): interrupt or wait for
> completion? If we wait r the for how long?
> - Persistence checkpoint: just shutdown? wait for completion if running?
> trigger and wait for completion? If we wait then for how long?
> - Partition loss: just shutdown? wait for more copies to avoid partition
> loss? is behavior the same for in-mem and persistence? is behavior the same
> for backups=0?
> - (anything else?)
>
> Ideally, we should first analyse each of these three dimensions and
> understand what are the per-dimensions behaviours we can come up with -
> what is reasonable.
> Then, we can try combine the dimensions: introduce an enum of policies
> that picks one behavior for each dimension.
> I guess, it should be a table like
>
> Policy           | Operations   | Checkpoint                     |
> Partition Loss
> IMMEDIATE | Cancel         | Cancel                            | Ignore
> GRACEFUL | Wait forever | Trigger and wait forever | Wait forever
>
> This table should be a part of the Javadoc and the website docs.
>
> On the other hand, you might want to separate the policies and control the
> running operations, the checkpoint, and the rebalance/partition loss
> separately. But then it won't solve the initial mess probably?
>
> Thoughts?
>
> Thanks,
> Stan
>
> > On 25 Oct 2022, at 17:17, Aleksandr Polovtsev <al...@gmail.com>
> wrote:
> >
> > Guys, any more feedback?
> >
> > On Tue, Oct 18, 2022 at 12:42 PM Aleksandr Polovtsev <
> > alexpolovtcev@gmail.com> wrote:
> >
> >> My guess is that it reduces the node shutdown time. We may want to keep
> it
> >> for compatibility purposes, because I'm afraid that changing the default
> >> behaviour is not that easy.
> >>
> >> On Tue, Oct 18, 2022 at 12:32 PM Николай Ижиков <ni...@apache.org>
> >> wrote:
> >>
> >>> So why it’s default behavior?
> >>> And why we want keep it?
> >>>
> >>>> 18 окт. 2022 г., в 12:27, Aleksandr Polovtsev <
> alexpolovtcev@gmail.com>
> >>> написал(а):
> >>>>
> >>>> But this is the default behaviour right now, do you mean we should
> >>> change
> >>>> it? I'm ok with that if it will not break something
> >>>>
> >>>> On Tue, Oct 18, 2022 at 12:13 PM Николай Ижиков <ni...@apache.org>
> >>> wrote:
> >>>>
> >>>>>> I don't know what most people use in the real world, maybe more
> >>>>> experienced guys can clarify that.
> >>>>>
> >>>>> My question is different.
> >>>>>
> >>>>> Why we may want to provide cancel=true?
> >>>>> Is there any reason to support this in Ignite?
> >>>>>
> >>>>>> 18 окт. 2022 г., в 11:47, Aleksandr Polovtsev <
> >>> alexpolovtcev@gmail.com>
> >>>>> написал(а):
> >>>>>>
> >>>>>> Thank you for the response, Nikolay. As I can see, this is the
> default
> >>>>>> behaviour now (ShutdownPolicy = IMMEDIATE, cancel = true). I don't
> >>> know
> >>>>>> what most people use in the real world, maybe more experienced guys
> >>> can
> >>>>>> clarify that.
> >>>>>>
> >>>>>> On Tue, Oct 18, 2022 at 11:15 AM Николай Ижиков <
> nizhikov@apache.org>
> >>>>> wrote:
> >>>>>>
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> Why do we need «immediate»?
> >>>>>>> Is there real-world scenarios for it?
> >>>>>>>
> >>>>>>> Can we do graceful or «save all possible data» shutdown always?
> >>>>>>>
> >>>>>>>> 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <
> >>> alexpolovtcev@gmail.com
> >>>>>>
> >>>>>>> написал(а):
> >>>>>>>>
> >>>>>>>> Hello, dear Igniters!
> >>>>>>>>
> >>>>>>>> I'd like to propose a refactoring of the current Shutdown Policy
> >>>>>>> mechanism.
> >>>>>>>> Currently, node shutdown is controlled by a bunch of flags and
> >>> enums (I
> >>>>>>> may
> >>>>>>>> be missing some of them):
> >>>>>>>>
> >>>>>>>> 1. ShutdownPolicy enum has only two entries and is basically a
> flag
> >>>>> that
> >>>>>>>> dictates if we need to wait for the data to be backed up, so that
> we
> >>>>> will
> >>>>>>>> not lose any data when the node goes offline.
> >>>>>>>> 2. "cancel" flag is passed to the stop method and indicates that
> all
> >>>>> long
> >>>>>>>> running jobs should be interrupted. For example, it prevents the
> >>> node
> >>>>>>>> performing a checkpoint on shutdown. I'm pretty sure this flag is
> >>>>> nearly
> >>>>>>>> always set to "true", because it is used by the Shutdown Hook.
> >>>>>>>> 3. There are also a bunch of system properties that affect the
> >>>>> shutdown:
> >>>>>>>> * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints
> on
> >>>>>>>> node stop, which heavily correlates with the "cancel" flag.
> >>>>>>>> * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates
> >>> the
> >>>>>>>> ShutdownPolicy enum
> >>>>>>>> * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't
> >>> know
> >>>>>>> if
> >>>>>>>> it is really useful.
> >>>>>>>>
> >>>>>>>> I think that this approach is counter intuitive and would like to
> >>>>>>>> incorporate all aforementioned flags into the ShutdownPolicy enum.
> >>> For
> >>>>>>>> example, it can look as follows:
> >>>>>>>>
> >>>>>>>> public enum ShutdownPolicy {
> >>>>>>>> IMMEDIATE // Stops the node and cancels all running jobs
> >>>>>>>> GRACEFUL // Stops the node, does not cancel running jobs and waits
> >>>>> for
> >>>>>>>> backups
> >>>>>>>> GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running
> >>> jobs,
> >>>>>>>> does not wait for backups
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> After that, the "cancel" flag and all corresponding properties can
> >>> be
> >>>>>>>> removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still
> >>>>> needed).
> >>>>>>>>
> >>>>>>>> Does this make sense? What do you think?
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> With regards,
> >>>>>>>> Aleksandr Polovtsev
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> With regards,
> >>>>>> Aleksandr Polovtsev
> >>>>>
> >>>>>
> >>>>
> >>>> --
> >>>> With regards,
> >>>> Aleksandr Polovtsev
> >>>
> >>>
> >>
> >> --
> >> With regards,
> >> Aleksandr Polovtsev
> >>
> >
> >
> > --
> > With regards,
> > Aleksandr Polovtsev
>
>

-- 
With regards,
Aleksandr Polovtsev

Re: Shutdown policy refactoring

Posted by Stanislav Lukyanov <st...@gmail.com>.
Hi Aleksandr,

Thanks for driving this. Someone should put a stop to the shutdown behavior nonsense :) 

But let me complicate things a bit for you.

I see multiple dimensions that are in play when node shutdown happens:

- Running operations (Compute, SQL, Cache): interrupt or wait for completion? If we wait r the for how long?
- Persistence checkpoint: just shutdown? wait for completion if running? trigger and wait for completion? If we wait then for how long?
- Partition loss: just shutdown? wait for more copies to avoid partition loss? is behavior the same for in-mem and persistence? is behavior the same for backups=0?
- (anything else?)

Ideally, we should first analyse each of these three dimensions and understand what are the per-dimensions behaviours we can come up with - what is reasonable.
Then, we can try combine the dimensions: introduce an enum of policies that picks one behavior for each dimension.
I guess, it should be a table like

Policy           | Operations   | Checkpoint                     | Partition Loss
IMMEDIATE | Cancel         | Cancel                            | Ignore
GRACEFUL | Wait forever | Trigger and wait forever | Wait forever

This table should be a part of the Javadoc and the website docs.

On the other hand, you might want to separate the policies and control the running operations, the checkpoint, and the rebalance/partition loss separately. But then it won't solve the initial mess probably?

Thoughts?

Thanks,
Stan

> On 25 Oct 2022, at 17:17, Aleksandr Polovtsev <al...@gmail.com> wrote:
> 
> Guys, any more feedback?
> 
> On Tue, Oct 18, 2022 at 12:42 PM Aleksandr Polovtsev <
> alexpolovtcev@gmail.com> wrote:
> 
>> My guess is that it reduces the node shutdown time. We may want to keep it
>> for compatibility purposes, because I'm afraid that changing the default
>> behaviour is not that easy.
>> 
>> On Tue, Oct 18, 2022 at 12:32 PM Николай Ижиков <ni...@apache.org>
>> wrote:
>> 
>>> So why it’s default behavior?
>>> And why we want keep it?
>>> 
>>>> 18 окт. 2022 г., в 12:27, Aleksandr Polovtsev <al...@gmail.com>
>>> написал(а):
>>>> 
>>>> But this is the default behaviour right now, do you mean we should
>>> change
>>>> it? I'm ok with that if it will not break something
>>>> 
>>>> On Tue, Oct 18, 2022 at 12:13 PM Николай Ижиков <ni...@apache.org>
>>> wrote:
>>>> 
>>>>>> I don't know what most people use in the real world, maybe more
>>>>> experienced guys can clarify that.
>>>>> 
>>>>> My question is different.
>>>>> 
>>>>> Why we may want to provide cancel=true?
>>>>> Is there any reason to support this in Ignite?
>>>>> 
>>>>>> 18 окт. 2022 г., в 11:47, Aleksandr Polovtsev <
>>> alexpolovtcev@gmail.com>
>>>>> написал(а):
>>>>>> 
>>>>>> Thank you for the response, Nikolay. As I can see, this is the default
>>>>>> behaviour now (ShutdownPolicy = IMMEDIATE, cancel = true). I don't
>>> know
>>>>>> what most people use in the real world, maybe more experienced guys
>>> can
>>>>>> clarify that.
>>>>>> 
>>>>>> On Tue, Oct 18, 2022 at 11:15 AM Николай Ижиков <ni...@apache.org>
>>>>> wrote:
>>>>>> 
>>>>>>> Hello,
>>>>>>> 
>>>>>>> Why do we need «immediate»?
>>>>>>> Is there real-world scenarios for it?
>>>>>>> 
>>>>>>> Can we do graceful or «save all possible data» shutdown always?
>>>>>>> 
>>>>>>>> 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <
>>> alexpolovtcev@gmail.com
>>>>>> 
>>>>>>> написал(а):
>>>>>>>> 
>>>>>>>> Hello, dear Igniters!
>>>>>>>> 
>>>>>>>> I'd like to propose a refactoring of the current Shutdown Policy
>>>>>>> mechanism.
>>>>>>>> Currently, node shutdown is controlled by a bunch of flags and
>>> enums (I
>>>>>>> may
>>>>>>>> be missing some of them):
>>>>>>>> 
>>>>>>>> 1. ShutdownPolicy enum has only two entries and is basically a flag
>>>>> that
>>>>>>>> dictates if we need to wait for the data to be backed up, so that we
>>>>> will
>>>>>>>> not lose any data when the node goes offline.
>>>>>>>> 2. "cancel" flag is passed to the stop method and indicates that all
>>>>> long
>>>>>>>> running jobs should be interrupted. For example, it prevents the
>>> node
>>>>>>>> performing a checkpoint on shutdown. I'm pretty sure this flag is
>>>>> nearly
>>>>>>>> always set to "true", because it is used by the Shutdown Hook.
>>>>>>>> 3. There are also a bunch of system properties that affect the
>>>>> shutdown:
>>>>>>>> * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
>>>>>>>> node stop, which heavily correlates with the "cancel" flag.
>>>>>>>> * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates
>>> the
>>>>>>>> ShutdownPolicy enum
>>>>>>>> * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't
>>> know
>>>>>>> if
>>>>>>>> it is really useful.
>>>>>>>> 
>>>>>>>> I think that this approach is counter intuitive and would like to
>>>>>>>> incorporate all aforementioned flags into the ShutdownPolicy enum.
>>> For
>>>>>>>> example, it can look as follows:
>>>>>>>> 
>>>>>>>> public enum ShutdownPolicy {
>>>>>>>> IMMEDIATE // Stops the node and cancels all running jobs
>>>>>>>> GRACEFUL // Stops the node, does not cancel running jobs and waits
>>>>> for
>>>>>>>> backups
>>>>>>>> GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running
>>> jobs,
>>>>>>>> does not wait for backups
>>>>>>>> }
>>>>>>>> 
>>>>>>>> After that, the "cancel" flag and all corresponding properties can
>>> be
>>>>>>>> removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still
>>>>> needed).
>>>>>>>> 
>>>>>>>> Does this make sense? What do you think?
>>>>>>>> 
>>>>>>>> --
>>>>>>>> With regards,
>>>>>>>> Aleksandr Polovtsev
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> With regards,
>>>>>> Aleksandr Polovtsev
>>>>> 
>>>>> 
>>>> 
>>>> --
>>>> With regards,
>>>> Aleksandr Polovtsev
>>> 
>>> 
>> 
>> --
>> With regards,
>> Aleksandr Polovtsev
>> 
> 
> 
> -- 
> With regards,
> Aleksandr Polovtsev


Re: Shutdown policy refactoring

Posted by Aleksandr Polovtsev <al...@gmail.com>.
Guys, any more feedback?

On Tue, Oct 18, 2022 at 12:42 PM Aleksandr Polovtsev <
alexpolovtcev@gmail.com> wrote:

> My guess is that it reduces the node shutdown time. We may want to keep it
> for compatibility purposes, because I'm afraid that changing the default
> behaviour is not that easy.
>
> On Tue, Oct 18, 2022 at 12:32 PM Николай Ижиков <ni...@apache.org>
> wrote:
>
>> So why it’s default behavior?
>> And why we want keep it?
>>
>> > 18 окт. 2022 г., в 12:27, Aleksandr Polovtsev <al...@gmail.com>
>> написал(а):
>> >
>> > But this is the default behaviour right now, do you mean we should
>> change
>> > it? I'm ok with that if it will not break something
>> >
>> > On Tue, Oct 18, 2022 at 12:13 PM Николай Ижиков <ni...@apache.org>
>> wrote:
>> >
>> >>> I don't know what most people use in the real world, maybe more
>> >> experienced guys can clarify that.
>> >>
>> >> My question is different.
>> >>
>> >> Why we may want to provide cancel=true?
>> >> Is there any reason to support this in Ignite?
>> >>
>> >>> 18 окт. 2022 г., в 11:47, Aleksandr Polovtsev <
>> alexpolovtcev@gmail.com>
>> >> написал(а):
>> >>>
>> >>> Thank you for the response, Nikolay. As I can see, this is the default
>> >>> behaviour now (ShutdownPolicy = IMMEDIATE, cancel = true). I don't
>> know
>> >>> what most people use in the real world, maybe more experienced guys
>> can
>> >>> clarify that.
>> >>>
>> >>> On Tue, Oct 18, 2022 at 11:15 AM Николай Ижиков <ni...@apache.org>
>> >> wrote:
>> >>>
>> >>>> Hello,
>> >>>>
>> >>>> Why do we need «immediate»?
>> >>>> Is there real-world scenarios for it?
>> >>>>
>> >>>> Can we do graceful or «save all possible data» shutdown always?
>> >>>>
>> >>>>> 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <
>> alexpolovtcev@gmail.com
>> >>>
>> >>>> написал(а):
>> >>>>>
>> >>>>> Hello, dear Igniters!
>> >>>>>
>> >>>>> I'd like to propose a refactoring of the current Shutdown Policy
>> >>>> mechanism.
>> >>>>> Currently, node shutdown is controlled by a bunch of flags and
>> enums (I
>> >>>> may
>> >>>>> be missing some of them):
>> >>>>>
>> >>>>> 1. ShutdownPolicy enum has only two entries and is basically a flag
>> >> that
>> >>>>> dictates if we need to wait for the data to be backed up, so that we
>> >> will
>> >>>>> not lose any data when the node goes offline.
>> >>>>> 2. "cancel" flag is passed to the stop method and indicates that all
>> >> long
>> >>>>> running jobs should be interrupted. For example, it prevents the
>> node
>> >>>>> performing a checkpoint on shutdown. I'm pretty sure this flag is
>> >> nearly
>> >>>>> always set to "true", because it is used by the Shutdown Hook.
>> >>>>> 3. There are also a bunch of system properties that affect the
>> >> shutdown:
>> >>>>>  * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
>> >>>>> node stop, which heavily correlates with the "cancel" flag.
>> >>>>>  * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates
>> the
>> >>>>> ShutdownPolicy enum
>> >>>>>  * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't
>> know
>> >>>> if
>> >>>>> it is really useful.
>> >>>>>
>> >>>>> I think that this approach is counter intuitive and would like to
>> >>>>> incorporate all aforementioned flags into the ShutdownPolicy enum.
>> For
>> >>>>> example, it can look as follows:
>> >>>>>
>> >>>>> public enum ShutdownPolicy {
>> >>>>>  IMMEDIATE // Stops the node and cancels all running jobs
>> >>>>>  GRACEFUL // Stops the node, does not cancel running jobs and waits
>> >> for
>> >>>>> backups
>> >>>>>  GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running
>> jobs,
>> >>>>> does not wait for backups
>> >>>>> }
>> >>>>>
>> >>>>> After that, the "cancel" flag and all corresponding properties can
>> be
>> >>>>> removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still
>> >> needed).
>> >>>>>
>> >>>>> Does this make sense? What do you think?
>> >>>>>
>> >>>>> --
>> >>>>> With regards,
>> >>>>> Aleksandr Polovtsev
>> >>>>
>> >>>>
>> >>>
>> >>> --
>> >>> With regards,
>> >>> Aleksandr Polovtsev
>> >>
>> >>
>> >
>> > --
>> > With regards,
>> > Aleksandr Polovtsev
>>
>>
>
> --
> With regards,
> Aleksandr Polovtsev
>


-- 
With regards,
Aleksandr Polovtsev

Re: Shutdown policy refactoring

Posted by Aleksandr Polovtsev <al...@gmail.com>.
My guess is that it reduces the node shutdown time. We may want to keep it
for compatibility purposes, because I'm afraid that changing the default
behaviour is not that easy.

On Tue, Oct 18, 2022 at 12:32 PM Николай Ижиков <ni...@apache.org> wrote:

> So why it’s default behavior?
> And why we want keep it?
>
> > 18 окт. 2022 г., в 12:27, Aleksandr Polovtsev <al...@gmail.com>
> написал(а):
> >
> > But this is the default behaviour right now, do you mean we should change
> > it? I'm ok with that if it will not break something
> >
> > On Tue, Oct 18, 2022 at 12:13 PM Николай Ижиков <ni...@apache.org>
> wrote:
> >
> >>> I don't know what most people use in the real world, maybe more
> >> experienced guys can clarify that.
> >>
> >> My question is different.
> >>
> >> Why we may want to provide cancel=true?
> >> Is there any reason to support this in Ignite?
> >>
> >>> 18 окт. 2022 г., в 11:47, Aleksandr Polovtsev <alexpolovtcev@gmail.com
> >
> >> написал(а):
> >>>
> >>> Thank you for the response, Nikolay. As I can see, this is the default
> >>> behaviour now (ShutdownPolicy = IMMEDIATE, cancel = true). I don't know
> >>> what most people use in the real world, maybe more experienced guys can
> >>> clarify that.
> >>>
> >>> On Tue, Oct 18, 2022 at 11:15 AM Николай Ижиков <ni...@apache.org>
> >> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> Why do we need «immediate»?
> >>>> Is there real-world scenarios for it?
> >>>>
> >>>> Can we do graceful or «save all possible data» shutdown always?
> >>>>
> >>>>> 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <
> alexpolovtcev@gmail.com
> >>>
> >>>> написал(а):
> >>>>>
> >>>>> Hello, dear Igniters!
> >>>>>
> >>>>> I'd like to propose a refactoring of the current Shutdown Policy
> >>>> mechanism.
> >>>>> Currently, node shutdown is controlled by a bunch of flags and enums
> (I
> >>>> may
> >>>>> be missing some of them):
> >>>>>
> >>>>> 1. ShutdownPolicy enum has only two entries and is basically a flag
> >> that
> >>>>> dictates if we need to wait for the data to be backed up, so that we
> >> will
> >>>>> not lose any data when the node goes offline.
> >>>>> 2. "cancel" flag is passed to the stop method and indicates that all
> >> long
> >>>>> running jobs should be interrupted. For example, it prevents the node
> >>>>> performing a checkpoint on shutdown. I'm pretty sure this flag is
> >> nearly
> >>>>> always set to "true", because it is used by the Shutdown Hook.
> >>>>> 3. There are also a bunch of system properties that affect the
> >> shutdown:
> >>>>>  * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
> >>>>> node stop, which heavily correlates with the "cancel" flag.
> >>>>>  * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates
> the
> >>>>> ShutdownPolicy enum
> >>>>>  * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't know
> >>>> if
> >>>>> it is really useful.
> >>>>>
> >>>>> I think that this approach is counter intuitive and would like to
> >>>>> incorporate all aforementioned flags into the ShutdownPolicy enum.
> For
> >>>>> example, it can look as follows:
> >>>>>
> >>>>> public enum ShutdownPolicy {
> >>>>>  IMMEDIATE // Stops the node and cancels all running jobs
> >>>>>  GRACEFUL // Stops the node, does not cancel running jobs and waits
> >> for
> >>>>> backups
> >>>>>  GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running jobs,
> >>>>> does not wait for backups
> >>>>> }
> >>>>>
> >>>>> After that, the "cancel" flag and all corresponding properties can be
> >>>>> removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still
> >> needed).
> >>>>>
> >>>>> Does this make sense? What do you think?
> >>>>>
> >>>>> --
> >>>>> With regards,
> >>>>> Aleksandr Polovtsev
> >>>>
> >>>>
> >>>
> >>> --
> >>> With regards,
> >>> Aleksandr Polovtsev
> >>
> >>
> >
> > --
> > With regards,
> > Aleksandr Polovtsev
>
>

-- 
With regards,
Aleksandr Polovtsev

Re: Shutdown policy refactoring

Posted by Николай Ижиков <ni...@apache.org>.
So why it’s default behavior?
And why we want keep it?

> 18 окт. 2022 г., в 12:27, Aleksandr Polovtsev <al...@gmail.com> написал(а):
> 
> But this is the default behaviour right now, do you mean we should change
> it? I'm ok with that if it will not break something
> 
> On Tue, Oct 18, 2022 at 12:13 PM Николай Ижиков <ni...@apache.org> wrote:
> 
>>> I don't know what most people use in the real world, maybe more
>> experienced guys can clarify that.
>> 
>> My question is different.
>> 
>> Why we may want to provide cancel=true?
>> Is there any reason to support this in Ignite?
>> 
>>> 18 окт. 2022 г., в 11:47, Aleksandr Polovtsev <al...@gmail.com>
>> написал(а):
>>> 
>>> Thank you for the response, Nikolay. As I can see, this is the default
>>> behaviour now (ShutdownPolicy = IMMEDIATE, cancel = true). I don't know
>>> what most people use in the real world, maybe more experienced guys can
>>> clarify that.
>>> 
>>> On Tue, Oct 18, 2022 at 11:15 AM Николай Ижиков <ni...@apache.org>
>> wrote:
>>> 
>>>> Hello,
>>>> 
>>>> Why do we need «immediate»?
>>>> Is there real-world scenarios for it?
>>>> 
>>>> Can we do graceful or «save all possible data» shutdown always?
>>>> 
>>>>> 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <alexpolovtcev@gmail.com
>>> 
>>>> написал(а):
>>>>> 
>>>>> Hello, dear Igniters!
>>>>> 
>>>>> I'd like to propose a refactoring of the current Shutdown Policy
>>>> mechanism.
>>>>> Currently, node shutdown is controlled by a bunch of flags and enums (I
>>>> may
>>>>> be missing some of them):
>>>>> 
>>>>> 1. ShutdownPolicy enum has only two entries and is basically a flag
>> that
>>>>> dictates if we need to wait for the data to be backed up, so that we
>> will
>>>>> not lose any data when the node goes offline.
>>>>> 2. "cancel" flag is passed to the stop method and indicates that all
>> long
>>>>> running jobs should be interrupted. For example, it prevents the node
>>>>> performing a checkpoint on shutdown. I'm pretty sure this flag is
>> nearly
>>>>> always set to "true", because it is used by the Shutdown Hook.
>>>>> 3. There are also a bunch of system properties that affect the
>> shutdown:
>>>>>  * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
>>>>> node stop, which heavily correlates with the "cancel" flag.
>>>>>  * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates the
>>>>> ShutdownPolicy enum
>>>>>  * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't know
>>>> if
>>>>> it is really useful.
>>>>> 
>>>>> I think that this approach is counter intuitive and would like to
>>>>> incorporate all aforementioned flags into the ShutdownPolicy enum. For
>>>>> example, it can look as follows:
>>>>> 
>>>>> public enum ShutdownPolicy {
>>>>>  IMMEDIATE // Stops the node and cancels all running jobs
>>>>>  GRACEFUL // Stops the node, does not cancel running jobs and waits
>> for
>>>>> backups
>>>>>  GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running jobs,
>>>>> does not wait for backups
>>>>> }
>>>>> 
>>>>> After that, the "cancel" flag and all corresponding properties can be
>>>>> removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still
>> needed).
>>>>> 
>>>>> Does this make sense? What do you think?
>>>>> 
>>>>> --
>>>>> With regards,
>>>>> Aleksandr Polovtsev
>>>> 
>>>> 
>>> 
>>> --
>>> With regards,
>>> Aleksandr Polovtsev
>> 
>> 
> 
> -- 
> With regards,
> Aleksandr Polovtsev


Re: Shutdown policy refactoring

Posted by Aleksandr Polovtsev <al...@gmail.com>.
But this is the default behaviour right now, do you mean we should change
it? I'm ok with that if it will not break something

On Tue, Oct 18, 2022 at 12:13 PM Николай Ижиков <ni...@apache.org> wrote:

> > I don't know what most people use in the real world, maybe more
> experienced guys can clarify that.
>
> My question is different.
>
> Why we may want to provide cancel=true?
> Is there any reason to support this in Ignite?
>
> > 18 окт. 2022 г., в 11:47, Aleksandr Polovtsev <al...@gmail.com>
> написал(а):
> >
> > Thank you for the response, Nikolay. As I can see, this is the default
> > behaviour now (ShutdownPolicy = IMMEDIATE, cancel = true). I don't know
> > what most people use in the real world, maybe more experienced guys can
> > clarify that.
> >
> > On Tue, Oct 18, 2022 at 11:15 AM Николай Ижиков <ni...@apache.org>
> wrote:
> >
> >> Hello,
> >>
> >> Why do we need «immediate»?
> >> Is there real-world scenarios for it?
> >>
> >> Can we do graceful or «save all possible data» shutdown always?
> >>
> >>> 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <alexpolovtcev@gmail.com
> >
> >> написал(а):
> >>>
> >>> Hello, dear Igniters!
> >>>
> >>> I'd like to propose a refactoring of the current Shutdown Policy
> >> mechanism.
> >>> Currently, node shutdown is controlled by a bunch of flags and enums (I
> >> may
> >>> be missing some of them):
> >>>
> >>> 1. ShutdownPolicy enum has only two entries and is basically a flag
> that
> >>> dictates if we need to wait for the data to be backed up, so that we
> will
> >>> not lose any data when the node goes offline.
> >>> 2. "cancel" flag is passed to the stop method and indicates that all
> long
> >>> running jobs should be interrupted. For example, it prevents the node
> >>> performing a checkpoint on shutdown. I'm pretty sure this flag is
> nearly
> >>> always set to "true", because it is used by the Shutdown Hook.
> >>> 3. There are also a bunch of system properties that affect the
> shutdown:
> >>>   * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
> >>> node stop, which heavily correlates with the "cancel" flag.
> >>>   * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates the
> >>> ShutdownPolicy enum
> >>>   * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't know
> >> if
> >>> it is really useful.
> >>>
> >>> I think that this approach is counter intuitive and would like to
> >>> incorporate all aforementioned flags into the ShutdownPolicy enum. For
> >>> example, it can look as follows:
> >>>
> >>> public enum ShutdownPolicy {
> >>>   IMMEDIATE // Stops the node and cancels all running jobs
> >>>   GRACEFUL // Stops the node, does not cancel running jobs and waits
> for
> >>> backups
> >>>   GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running jobs,
> >>> does not wait for backups
> >>> }
> >>>
> >>> After that, the "cancel" flag and all corresponding properties can be
> >>> removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still
> needed).
> >>>
> >>> Does this make sense? What do you think?
> >>>
> >>> --
> >>> With regards,
> >>> Aleksandr Polovtsev
> >>
> >>
> >
> > --
> > With regards,
> > Aleksandr Polovtsev
>
>

-- 
With regards,
Aleksandr Polovtsev

Re: Shutdown policy refactoring

Posted by Николай Ижиков <ni...@apache.org>.
> I don't know what most people use in the real world, maybe more experienced guys can clarify that.

My question is different.

Why we may want to provide cancel=true?
Is there any reason to support this in Ignite?

> 18 окт. 2022 г., в 11:47, Aleksandr Polovtsev <al...@gmail.com> написал(а):
> 
> Thank you for the response, Nikolay. As I can see, this is the default
> behaviour now (ShutdownPolicy = IMMEDIATE, cancel = true). I don't know
> what most people use in the real world, maybe more experienced guys can
> clarify that.
> 
> On Tue, Oct 18, 2022 at 11:15 AM Николай Ижиков <ni...@apache.org> wrote:
> 
>> Hello,
>> 
>> Why do we need «immediate»?
>> Is there real-world scenarios for it?
>> 
>> Can we do graceful or «save all possible data» shutdown always?
>> 
>>> 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <al...@gmail.com>
>> написал(а):
>>> 
>>> Hello, dear Igniters!
>>> 
>>> I'd like to propose a refactoring of the current Shutdown Policy
>> mechanism.
>>> Currently, node shutdown is controlled by a bunch of flags and enums (I
>> may
>>> be missing some of them):
>>> 
>>> 1. ShutdownPolicy enum has only two entries and is basically a flag that
>>> dictates if we need to wait for the data to be backed up, so that we will
>>> not lose any data when the node goes offline.
>>> 2. "cancel" flag is passed to the stop method and indicates that all long
>>> running jobs should be interrupted. For example, it prevents the node
>>> performing a checkpoint on shutdown. I'm pretty sure this flag is nearly
>>> always set to "true", because it is used by the Shutdown Hook.
>>> 3. There are also a bunch of system properties that affect the shutdown:
>>>   * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
>>> node stop, which heavily correlates with the "cancel" flag.
>>>   * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates the
>>> ShutdownPolicy enum
>>>   * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't know
>> if
>>> it is really useful.
>>> 
>>> I think that this approach is counter intuitive and would like to
>>> incorporate all aforementioned flags into the ShutdownPolicy enum. For
>>> example, it can look as follows:
>>> 
>>> public enum ShutdownPolicy {
>>>   IMMEDIATE // Stops the node and cancels all running jobs
>>>   GRACEFUL // Stops the node, does not cancel running jobs and waits for
>>> backups
>>>   GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running jobs,
>>> does not wait for backups
>>> }
>>> 
>>> After that, the "cancel" flag and all corresponding properties can be
>>> removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still needed).
>>> 
>>> Does this make sense? What do you think?
>>> 
>>> --
>>> With regards,
>>> Aleksandr Polovtsev
>> 
>> 
> 
> -- 
> With regards,
> Aleksandr Polovtsev


Re: Shutdown policy refactoring

Posted by Aleksandr Polovtsev <al...@gmail.com>.
Thank you for the response, Nikolay. As I can see, this is the default
behaviour now (ShutdownPolicy = IMMEDIATE, cancel = true). I don't know
what most people use in the real world, maybe more experienced guys can
clarify that.

On Tue, Oct 18, 2022 at 11:15 AM Николай Ижиков <ni...@apache.org> wrote:

> Hello,
>
> Why do we need «immediate»?
> Is there real-world scenarios for it?
>
> Can we do graceful or «save all possible data» shutdown always?
>
> > 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <al...@gmail.com>
> написал(а):
> >
> > Hello, dear Igniters!
> >
> > I'd like to propose a refactoring of the current Shutdown Policy
> mechanism.
> > Currently, node shutdown is controlled by a bunch of flags and enums (I
> may
> > be missing some of them):
> >
> > 1. ShutdownPolicy enum has only two entries and is basically a flag that
> > dictates if we need to wait for the data to be backed up, so that we will
> > not lose any data when the node goes offline.
> > 2. "cancel" flag is passed to the stop method and indicates that all long
> > running jobs should be interrupted. For example, it prevents the node
> > performing a checkpoint on shutdown. I'm pretty sure this flag is nearly
> > always set to "true", because it is used by the Shutdown Hook.
> > 3. There are also a bunch of system properties that affect the shutdown:
> >    * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
> > node stop, which heavily correlates with the "cancel" flag.
> >    * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates the
> > ShutdownPolicy enum
> >    * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't know
> if
> > it is really useful.
> >
> > I think that this approach is counter intuitive and would like to
> > incorporate all aforementioned flags into the ShutdownPolicy enum. For
> > example, it can look as follows:
> >
> > public enum ShutdownPolicy {
> >    IMMEDIATE // Stops the node and cancels all running jobs
> >    GRACEFUL // Stops the node, does not cancel running jobs and waits for
> > backups
> >    GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running jobs,
> > does not wait for backups
> > }
> >
> > After that, the "cancel" flag and all corresponding properties can be
> > removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still needed).
> >
> > Does this make sense? What do you think?
> >
> > --
> > With regards,
> > Aleksandr Polovtsev
>
>

-- 
With regards,
Aleksandr Polovtsev

Re: Shutdown policy refactoring

Posted by Николай Ижиков <ni...@apache.org>.
Hello, 

Why do we need «immediate»?
Is there real-world scenarios for it?

Can we do graceful or «save all possible data» shutdown always? 

> 18 окт. 2022 г., в 10:59, Aleksandr Polovtsev <al...@gmail.com> написал(а):
> 
> Hello, dear Igniters!
> 
> I'd like to propose a refactoring of the current Shutdown Policy mechanism.
> Currently, node shutdown is controlled by a bunch of flags and enums (I may
> be missing some of them):
> 
> 1. ShutdownPolicy enum has only two entries and is basically a flag that
> dictates if we need to wait for the data to be backed up, so that we will
> not lose any data when the node goes offline.
> 2. "cancel" flag is passed to the stop method and indicates that all long
> running jobs should be interrupted. For example, it prevents the node
> performing a checkpoint on shutdown. I'm pretty sure this flag is nearly
> always set to "true", because it is used by the Shutdown Hook.
> 3. There are also a bunch of system properties that affect the shutdown:
>    * IGNITE_PDS_SKIP_CHECKPOINT_ON_NODE_STOP - disables checkpoints on
> node stop, which heavily correlates with the "cancel" flag.
>    * IGNITE_WAIT_FOR_BACKUPS_ON_SHUTDOWN - deprecated and duplicates the
> ShutdownPolicy enum
>    * IGNITE_NO_SHUTDOWN_HOOK - disables the shutdown hook. I don't know if
> it is really useful.
> 
> I think that this approach is counter intuitive and would like to
> incorporate all aforementioned flags into the ShutdownPolicy enum. For
> example, it can look as follows:
> 
> public enum ShutdownPolicy {
>    IMMEDIATE // Stops the node and cancels all running jobs
>    GRACEFUL // Stops the node, does not cancel running jobs and waits for
> backups
>    GRACEFUL_NO_BACKUPS // Stops the node, does not cancel running jobs,
> does not wait for backups
> }
> 
> After that, the "cancel" flag and all corresponding properties can be
> removed (apart from the IGNITE_NO_SHUTDOWN_HOOK, if it is still needed).
> 
> Does this make sense? What do you think?
> 
> -- 
> With regards,
> Aleksandr Polovtsev