Posted to user@ignite.apache.org by shivakumar <sh...@gmail.com> on 2019/10/17 14:54:40 UTC

nodes in the baseline topology is going to OFFLINE state

Hi all,
I have an Ignite deployment on Kubernetes and wanted to restart all nodes, so
I am using the "kill -k" command from the Visor shell.
Running this command restarts all nodes. Once all nodes have rejoined the
topology, a few nodes sometimes go into the OFFLINE state [even though
the nodes are up and running], and it looks like this is causing a split-brain
or split-cluster scenario.


[ignite@ignite-shiva-visor-68cb697b5-qbccr bin]$ ./control.sh --user ignite --password ignite --host ignite-service-br --baseline
Control utility [ver. 2.7.6#19700101-sha1:DEV]
2019 Copyright(C) Apache Software Foundation
User: ignite
--------------------------------------------------------------------------------
Cluster state: inactive
Current topology version: 1

Baseline nodes:
    ConsistentID=253094b4-877b-45ae-ad06-07e639befffc, STATE=ONLINE
    ConsistentID=862e324b-f3d1-4198-92d0-0d1d2c4a2f88, STATE=OFFLINE
    ConsistentID=86b5f451-ac4f-4479-9f6e-2db6ab5d11e7, STATE=OFFLINE
--------------------------------------------------------------------------------
Number of baseline nodes: 3

Other nodes not found.
[ignite@ignite-shiva-visor-68cb697b5-qbccr bin]$ ./control.sh --user ignite --password ignite --host ignite-service-br --baseline
Control utility [ver. 2.7.6#19700101-sha1:DEV]
2019 Copyright(C) Apache Software Foundation
User: ignite
--------------------------------------------------------------------------------
Cluster state: inactive
Current topology version: 2

Baseline nodes:
    ConsistentID=253094b4-877b-45ae-ad06-07e639befffc, STATE=OFFLINE
    ConsistentID=862e324b-f3d1-4198-92d0-0d1d2c4a2f88, STATE=ONLINE
    ConsistentID=86b5f451-ac4f-4479-9f6e-2db6ab5d11e7, STATE=ONLINE
--------------------------------------------------------------------------------
Number of baseline nodes: 3

Other nodes not found.
[ignite@ignite-shiva-visor-68cb697b5-qbccr bin]$ ./control.sh --user ignite --password ignite --host ignite-service-br --baseline
Control utility [ver. 2.7.6#19700101-sha1:DEV]
2019 Copyright(C) Apache Software Foundation
User: ignite
--------------------------------------------------------------------------------
Cluster state: inactive
Current topology version: 2

Baseline nodes:
    ConsistentID=253094b4-877b-45ae-ad06-07e639befffc, STATE=OFFLINE
    ConsistentID=862e324b-f3d1-4198-92d0-0d1d2c4a2f88, STATE=ONLINE
    ConsistentID=86b5f451-ac4f-4479-9f6e-2db6ab5d11e7, STATE=ONLINE
--------------------------------------------------------------------------------
Number of baseline nodes: 3

Other nodes not found.
[ignite@ignite-shiva-visor-68cb697b5-qbccr bin]$ ./control.sh --user ignite --password ignite --host ignite-service-br --baseline
Control utility [ver. 2.7.6#19700101-sha1:DEV]
2019 Copyright(C) Apache Software Foundation
User: ignite
--------------------------------------------------------------------------------
Cluster state: inactive
Current topology version: 2

Baseline nodes:
    ConsistentID=253094b4-877b-45ae-ad06-07e639befffc, STATE=OFFLINE
    ConsistentID=862e324b-f3d1-4198-92d0-0d1d2c4a2f88, STATE=ONLINE
    ConsistentID=86b5f451-ac4f-4479-9f6e-2db6ab5d11e7, STATE=ONLINE
--------------------------------------------------------------------------------
Number of baseline nodes: 3

Other nodes not found.





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
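
The baseline reports above can be checked mechanically rather than by eye. The following is a minimal sketch that assumes the report format shown in this post (one "ConsistentID=..., STATE=..." line per baseline node); the function names count_offline and offline_ids are illustrative and not part of Ignite.

```shell
#!/usr/bin/env bash
# Helpers that read a `control.sh --baseline` report on stdin.
# Assumption: each baseline node appears on its own line as
# "ConsistentID=<id>, STATE=<ONLINE|OFFLINE>", as in the output above.

# Print the number of baseline nodes reported as OFFLINE.
count_offline() {
    # grep -c exits non-zero when the count is 0, so mask the exit status.
    grep -c 'STATE=OFFLINE' || true
}

# Print the consistent ID of every baseline node reported as OFFLINE.
offline_ids() {
    sed -n 's/^[[:space:]]*ConsistentID=\([^,]*\), STATE=OFFLINE.*$/\1/p'
}
```

One possible use is to pipe the real report into a helper, for example: ./control.sh --user ignite --password ignite --host ignite-service-br --baseline | offline_ids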

Re: nodes in the baseline topology is going to OFFLINE state

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

There is currently no supported way to gracefully restart a cluster. You will
have to stop all nodes, start them again, and then activate the cluster (or
rely on auto-activation).

Regards,
-- 
Ilya Kasnacheev
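
The activation step mentioned above could look roughly like the sketch below. It assumes the control.sh shipped with the 2.7.x line used in this thread, where the --activate and --baseline flags exist; other versions may name these flags differently. The variables CONTROL_SH, IGNITE_HOST, IGNITE_USER and IGNITE_PASSWORD and the helper names are illustrative, with defaults copied from the commands earlier in the thread.

```shell
#!/usr/bin/env bash
# Sketch: activate the cluster after all nodes have been started again,
# then print the baseline report so node states can be checked.
# All variable names are illustrative; defaults mirror this thread.
CONTROL_SH="${CONTROL_SH:-./control.sh}"
IGNITE_HOST="${IGNITE_HOST:-ignite-service-br}"
IGNITE_USER="${IGNITE_USER:-ignite}"
IGNITE_PASSWORD="${IGNITE_PASSWORD:-ignite}"

# Run control.sh with the connection options used in this thread.
ctl() {
    "$CONTROL_SH" --user "$IGNITE_USER" --password "$IGNITE_PASSWORD" \
        --host "$IGNITE_HOST" "$@"
}

# Activate the cluster; show the baseline only if activation succeeded.
activate_and_report() {
    ctl --activate && ctl --baseline
}
```

Calling activate_and_report once every node is back up performs the manual activation; with auto-activation, only the baseline check would be needed.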


On Fri, 18 Oct 2019 at 16:42, Shiva Kumar <sh...@gmail.com> wrote:

> Hi Ilya,
> My goal is to deactivate the cluster, not to restart it! There is an issue
> with deactivating the cluster in my deployment, so I am resorting to a restart.
>
> I have the Ignite deployment on Kubernetes, and during deactivation the entire
> cluster, and even the deactivate request itself (REST or control.sh), hangs
> because I have a few applications connected to this Ignite cluster over JDBC
> that run queries and also insert records into many tables in parallel.
> If I issue a deactivate request at this time, it hangs for more than 25
> minutes. My impression is that the many clients with established
> TCP connections and running queries are causing the cluster to hang, so I am
> thinking of restarting the cluster so that I can proceed with
> deactivation easily once the restart is done.
> Any suggestions are appreciated.
>
> Regards,
> Shiva

Re: nodes in the baseline topology is going to OFFLINE state

Posted by Shiva Kumar <sh...@gmail.com>.
Hi Ilya,
My goal is to deactivate the cluster, not to restart it! There is an issue
with deactivating the cluster in my deployment, so I am resorting to a restart.

I have the Ignite deployment on Kubernetes, and during deactivation the entire
cluster, and even the deactivate request itself (REST or control.sh), hangs
because I have a few applications connected to this Ignite cluster over JDBC
that run queries and also insert records into many tables in parallel.
If I issue a deactivate request at this time, it hangs for more than 25
minutes. My impression is that the many clients with established
TCP connections and running queries are causing the cluster to hang, so I am
thinking of restarting the cluster so that I can proceed with
deactivation easily once the restart is done.
Any suggestions are appreciated.

Regards,
Shiva


On Fri, 18 Oct, 2019, 6:37 PM Ilya Kasnacheev, <il...@gmail.com>
wrote:

> Hello!
>
> If the cluster is persistent, you can deactivate it and then restart it.
>
> Regards,
> --
> Ilya Kasnacheev

Re: nodes in the baseline topology is going to OFFLINE state

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

If the cluster is persistent, you can deactivate it and then restart it.

Regards,
-- 
Ilya Kasnacheev
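
Since the deactivate request was reported elsewhere in this thread to hang for more than 25 minutes, one option is to bound the call so that a script does not block indefinitely. This is a sketch only: it assumes the 2.7.x control.sh accepts --deactivate together with --yes (to skip the confirmation prompt) and that the coreutils timeout command is available. The variable and function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: request deactivation, but give up after a time limit.
# All variable names are illustrative; defaults mirror this thread.
CONTROL_SH="${CONTROL_SH:-./control.sh}"
IGNITE_HOST="${IGNITE_HOST:-ignite-service-br}"
IGNITE_USER="${IGNITE_USER:-ignite}"
IGNITE_PASSWORD="${IGNITE_PASSWORD:-ignite}"

# Usage: deactivate_with_timeout [seconds]
# Returns the exit status of control.sh, or 124 if `timeout` had to
# stop the command because the limit was reached.
deactivate_with_timeout() {
    local limit="${1:-300}"   # seconds to wait before giving up
    timeout "$limit" "$CONTROL_SH" --user "$IGNITE_USER" \
        --password "$IGNITE_PASSWORD" --host "$IGNITE_HOST" \
        --deactivate --yes
}
```

An exit status of 124 indicates only that the client-side call was cut off; it says nothing about whether the cluster finished deactivating, so the cluster state should still be checked afterwards.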


On Fri, 18 Oct 2019 at 09:51, shivakumar <sh...@gmail.com> wrote:

> Hi Ilya Kasnacheev,
> Is there any other way of gracefully shutting down or restarting the entire
> cluster?
>
> regards,
> shiva
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: nodes in the baseline topology is going to OFFLINE state

Posted by shivakumar <sh...@gmail.com>.
Hi Ilya Kasnacheev,
Is there any other way of gracefully shutting down or restarting the entire
cluster?

regards,
shiva




Re: nodes in the baseline topology is going to OFFLINE state

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Can you provide logs from the nodes that the cluster considers OFFLINE?

Please note that it is advisable to start one node first and then start all
the others once the first one is up, as opposed to starting them all at the
same moment.

Regards,
-- 
Ilya Kasnacheev
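
On Kubernetes, the staggered start advised above might be approximated as in the sketch below. It rests on assumptions the thread does not confirm: that the Ignite server nodes run as a StatefulSet (here hypothetically named "ignite" in a namespace "ignite") and that its update strategy lets kubectl rollout status report readiness. KUBECTL, STS_NAME, NAMESPACE and staggered_start are illustrative names.

```shell
#!/usr/bin/env bash
# Sketch: bring up one Ignite node first, wait until it is ready,
# then scale out to the full node count.
# STS_NAME and NAMESPACE are hypothetical; adjust to the real deployment.
KUBECTL="${KUBECTL:-kubectl}"
STS_NAME="${STS_NAME:-ignite}"
NAMESPACE="${NAMESPACE:-ignite}"

# Usage: staggered_start <total-replicas>
staggered_start() {
    local total="$1"
    # Start a single node and wait for it to become ready.
    "$KUBECTL" scale "statefulset/$STS_NAME" --replicas=1 -n "$NAMESPACE" &&
    "$KUBECTL" rollout status "statefulset/$STS_NAME" -n "$NAMESPACE" &&
    # Only then start the remaining nodes.
    "$KUBECTL" scale "statefulset/$STS_NAME" --replicas="$total" -n "$NAMESPACE" &&
    "$KUBECTL" rollout status "statefulset/$STS_NAME" -n "$NAMESPACE"
}
```

For the three-node baseline shown in this thread, the call would be staggered_start 3, run after the StatefulSet has been scaled down to zero.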

