You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Vladislav Pyatkov (JIRA)" <ji...@apache.org> on 2019/01/14 15:32:00 UTC

[jira] [Updated] (IGNITE-10933) Node may hang on join to topology and not move forward

     [ https://issues.apache.org/jira/browse/IGNITE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladislav Pyatkov updated IGNITE-10933:
---------------------------------------
    Description: 
Several nodes join to topology simultaneously and hang on a long time.

That can be on first start all cluster nodes or join nodes to completed topology.

In the logs of problem nodes can see messages:
{noformat}
2019-01-11 18:37:39.296 [WARN ][Thread-56][o.a.i.s.d.tcp.TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require sig
nificant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]

 2019-01-11 18:43:09.374 [WARN ][Thread-56][o.a.i.s.d.tcp.TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require sig
nificant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]

...

{noformat}
and so for a long time without others.

  was:
Several nodes join to topology simultaneously and hang on a long time.

That can be on first start all cluster nodes or join nodes to completed topology.

In the logs of problem nodes can see messages:

{noformat}

2019-01-11 18:37:39.296 [WARN ][Thread-56][o.a.i.s.d.tcp.TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require sig
nificant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]

 2019-01-11 18:43:09.374 [WARN ][Thread-56][o.a.i.s.d.tcp.TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require sig
nificant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]

...

{noformat}

and this long time without others.


> Node may hang on join to topology and not move forward
> ------------------------------------------------------
>
>                 Key: IGNITE-10933
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10933
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Vladislav Pyatkov
>            Priority: Major
>
> Several nodes join to topology simultaneously and hang on a long time.
> That can be on first start all cluster nodes or join nodes to completed topology.
> In the logs of problem nodes can see messages:
> {noformat}
> 2019-01-11 18:37:39.296 [WARN ][Thread-56][o.a.i.s.d.tcp.TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require sig
> nificant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
>  2019-01-11 18:43:09.374 [WARN ][Thread-56][o.a.i.s.d.tcp.TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require sig
> nificant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
> ...
> {noformat}
> and so for a long time without others.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)