Posted to dev@ignite.apache.org by "Ivan Artukhov (JIRA)" <ji...@apache.org> on 2018/05/15 12:35:00 UTC

[jira] [Created] (IGNITE-8497) Ignite stops the node in the middle of checkpointing upon receiving a SIGINT

Ivan Artukhov created IGNITE-8497:
-------------------------------------

             Summary: Ignite stops the node in the middle of checkpointing upon receiving a SIGINT
                 Key: IGNITE-8497
                 URL: https://issues.apache.org/jira/browse/IGNITE-8497
             Project: Ignite
          Issue Type: Bug
          Components: persistence
    Affects Versions: 2.4
         Environment: Ubuntu 17.10
            Reporter: Ivan Artukhov
         Attachments: example-cache.xml, srv.1.log, srv.2.log

*Steps*
1. Start an Ignite server node with PDS enabled (see the attached [^example-cache.xml] config file).
2. Activate the cluster with _./bin/control.sh --activate_.
3. Put some data into the cluster (for example, with _CachePutGetExample.java_).
4. Stop the Ignite server node with SIGINT.

*Actual result*
The Ignite server node invokes the shutdown hook and a checkpoint starts, but the node *does not wait for the checkpoint to finish* and terminates.

An excerpt from  [^srv.1.log] :
{noformat}
[2018-05-15 15:20:59,976][INFO ][Thread-3][G] Invoking shutdown hook...
[2018-05-15 15:20:59,979][INFO ][Thread-3][GridTcpRestProtocol] Command protocol successfully stopped: TCP binary
[2018-05-15 15:20:59,998][INFO ][db-checkpoint-thread-#50][GridCacheDatabaseSharedManager] Checkpoint started [checkpointId=f0dde95a-6027-40dd-b3f3-4311aa8508c3, startPtr=FileWALPointer [idx=0, fileOff=460751, len=40871], checkpointLockWait=0ms, checkpointLockHoldTime=6ms, pages=167, reason='timeout']
[2018-05-15 15:21:00,011][INFO ][Thread-3][GridCacheProcessor] Stopped cache [cacheName=default]
[2018-05-15 15:21:00,011][INFO ][Thread-3][GridCacheProcessor] Stopped cache [cacheName=ignite-sys-cache]
[2018-05-15 15:21:00,012][INFO ][Thread-3][GridCacheProcessor] Stopped cache [cacheName=CachePutGetExample]
[2018-05-15 15:21:00,049][INFO ][Thread-3][IgniteKernal] 

>>> +-----------------------------------------------------+
>>> Ignite ver. 2.4.0-SNAPSHOT#19700101-sha1:DEV stopped OK
>>> +-----------------------------------------------------+
>>> Grid uptime: 00:00:36.228
{noformat}

When the node is started again, the following warning appears in the log ([^srv.2.log]):
{noformat}
[2018-05-15 15:21:39,848][WARN ][main][GridCacheDatabaseSharedManager] Ignite node stopped in the middle of checkpoint. Will restore memory state and finish checkpoint on node start.
{noformat}
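
The warning above implies that the node can tell at startup that a checkpoint was begun but never completed. Below is a minimal sketch of one way such detection can work, assuming a start marker file is written when a checkpoint begins and an end marker when it finishes. The class name `CheckpointMarkers`, the `-START` and `-END` suffixes, and the directory layout are hypothetical and used for illustration only; this is not Ignite's implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical model of start/end checkpoint markers; not Ignite's actual code. */
public class CheckpointMarkers {
    private static final String START = "-START";
    private static final String END = "-END";

    /** Records that the checkpoint with the given id has begun. */
    public static void markStarted(Path dir, String checkpointId) throws IOException {
        Files.createFile(dir.resolve(checkpointId + START));
    }

    /** Records that all pages of the checkpoint have been written. */
    public static void markFinished(Path dir, String checkpointId) throws IOException {
        Files.createFile(dir.resolve(checkpointId + END));
    }

    /** Returns the ids of checkpoints that have a start marker but no end marker. */
    public static List<String> unfinished(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files
                .map(p -> p.getFileName().toString())
                .filter(name -> name.endsWith(START))
                .map(name -> name.substring(0, name.length() - START.length()))
                .filter(id -> !Files.exists(dir.resolve(id + END)))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

In this model, a node that finds a non-empty result from `unfinished(dir)` at startup would know it has to restore its memory state and finish the checkpoint, which matches the situation the warning describes.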



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Re: [jira] [Created] (IGNITE-8497) Ignite stops the node in the middle of checkpointing upon receiving a SIGINT

Posted by Dmitry Pavlov <dp...@gmail.com>.
Hi Ivan R.,

I understand now, thank you. Perhaps we could use the title "Ignite triggers
a checkpoint upon receiving a SIGINT even if it is not required".

On Tue, 15 May 2018 at 16:08, Ivan Rakov <iv...@gmail.com> wrote:

> It's a regular-priority bug and should be fixed.
> The issue doesn't cause any data loss. It's harmless, but still
> undesirable: even if a checkpoint wasn't running, one will be triggered and
> then immediately interrupted by Ignition.stop(true). This behavior
> increases the time of the next node startup.
>
> I added "always" to the ticket summary to avoid misunderstanding.
>
> Best Regards,
> Ivan Rakov

Re: [jira] [Created] (IGNITE-8497) Ignite stops the node in the middle of checkpointing upon receiving a SIGINT

Posted by Ivan Rakov <iv...@gmail.com>.
It's a regular-priority bug and should be fixed.
The issue doesn't cause any data loss. It's harmless, but still
undesirable: even if a checkpoint wasn't running, one will be triggered and
then immediately interrupted by Ignition.stop(true). This behavior
increases the time of the next node startup.
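
The difference between interrupting a checkpoint and waiting for it can be illustrated with a small model built only on `java.util.concurrent`. This is a sketch, not Ignite code: the `CheckpointerModel` class and its `stop(boolean cancel)` method are hypothetical stand-ins for a background checkpoint thread and for a stop call with a cancel flag.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical model of a checkpoint thread stopped with or without cancellation. */
public class CheckpointerModel {
    private final ExecutorService exec = Executors.newSingleThreadExecutor();
    private final AtomicBoolean finished = new AtomicBoolean();
    private final CountDownLatch started = new CountDownLatch(1);

    /** Starts a "checkpoint" that needs roughly the given time to write its pages. */
    public void startCheckpoint(long durationMs) throws InterruptedException {
        exec.submit(() -> {
            started.countDown();
            try {
                Thread.sleep(durationMs); // Stands in for writing dirty pages to disk.
                finished.set(true);       // Reached only if the write was not interrupted.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Checkpoint abandoned midway.
            }
        });
        started.await(); // Ensure the checkpoint is in flight before returning.
    }

    /** Stops the model; returns true if the checkpoint completed before shutdown. */
    public boolean stop(boolean cancel) throws InterruptedException {
        if (cancel)
            exec.shutdownNow(); // Interrupts the running checkpoint.
        else
            exec.shutdown();    // Lets the running checkpoint finish.
        exec.awaitTermination(10, TimeUnit.SECONDS);
        return finished.get();
    }
}
```

In this model, `stop(false)` after `startCheckpoint(100)` returns `true`, while `stop(true)` during a long checkpoint returns `false`, leaving the work to be redone on the next start.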

I added "always" to the ticket summary to avoid misunderstanding.

Best Regards,
Ivan Rakov

On 15.05.2018 15:54, Dmitry Pavlov wrote:
> Hi Igniters, Ivan,
>
> In my opinion, this is not a bug. Ignite is able to restore the memory state
> without waiting for the checkpoint to complete. Note that a checkpoint may be
> a very long-running operation.
>
> Sincerely,
> Dmitriy Pavlov


Re: [jira] [Created] (IGNITE-8497) Ignite stops the node in the middle of checkpointing upon receiving a SIGINT

Posted by Dmitry Pavlov <dp...@gmail.com>.
Hi Igniters, Ivan,

In my opinion, this is not a bug. Ignite is able to restore the memory state
without waiting for the checkpoint to complete. Note that a checkpoint may be
a very long-running operation.

Sincerely,
Dmitriy Pavlov
