Posted to issues@ignite.apache.org by "Alexey Kukushkin (Jira)" <ji...@apache.org> on 2021/05/18 09:48:00 UTC

[jira] [Assigned] (IGNITE-14711) Client discovery thread interrupt/stop causes endless communication reconnect attempt

     [ https://issues.apache.org/jira/browse/IGNITE-14711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Kukushkin reassigned IGNITE-14711:
-----------------------------------------

    Assignee: Alexey Kukushkin

> Client discovery thread interrupt/stop causes endless communication reconnect attempt
> -------------------------------------------------------------------------------------
>
>                 Key: IGNITE-14711
>                 URL: https://issues.apache.org/jira/browse/IGNITE-14711
>             Project: Ignite
>          Issue Type: Bug
>          Components: networking
>    Affects Versions: 2.10
>            Reporter: Ilya Kasnacheev
>            Assignee: Alexey Kukushkin
>            Priority: Major
>         Attachments: IgniteDiscoveryThreadKillingTest.java
>
>
> Original issue: if the tcp-client-disco-sock-reader thread dies on a client node, the node never disconnects from the cluster despite NODE_FAILED. Instead, it endlessly tries to open communication connections to the server, getting "Remote node does not observe current node in topology" exceptions on the client and "Close incoming connection, unknown node" on the server.
> Generalized issue: stop()ing or interrupt()ing discovery threads causes the cluster to hang in many cases, where it is expected that any such node will:
> * Restart the thread and continue normally
> * Disconnect from the cluster to re-establish discovery connection
> * Stop and close all remaining threads.
> See the attached reproducer.
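
For illustration only, the expected behaviour described above could be sketched in plain Java as a watchdog that waits for a worker thread to terminate and then applies a recovery action. This is not Ignite code and not the attached reproducer: the names WorkerDeathWatchdog, Policy, Recovery, watch and simulate are hypothetical. Treating the three listed expectations as alternative policies is also an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch (not Ignite code): react when a worker thread dies. */
public class WorkerDeathWatchdog {
    /** The three expectations from the issue, modelled as alternative policies (assumption). */
    public enum Policy { RESTART_THREAD, DISCONNECT_AND_REJOIN, STOP_NODE }

    /** Callbacks a node would supply; all method names are hypothetical. */
    public interface Recovery {
        void restartWorker();
        void disconnectAndRejoin();
        void stopNode();
    }

    /** Starts a daemon thread that waits for the worker to terminate, then runs the chosen recovery. */
    public static Thread watch(Thread worker, Policy policy, Recovery recovery) {
        Thread watchdog = new Thread(() -> {
            try {
                // join() returns once the worker has terminated, whatever the cause.
                worker.join();
            }
            catch (InterruptedException e) {
                // The watchdog itself was interrupted: restore the flag and give up.
                Thread.currentThread().interrupt();
                return;
            }
            switch (policy) {
                case RESTART_THREAD:
                    recovery.restartWorker();
                    break;
                case DISCONNECT_AND_REJOIN:
                    recovery.disconnectAndRejoin();
                    break;
                case STOP_NODE:
                    recovery.stopNode();
                    break;
            }
        }, "watchdog-" + worker.getName());
        watchdog.setDaemon(true);
        watchdog.start();
        return watchdog;
    }

    /** Simulates a reader thread being interrupted and returns the name of the recovery that ran. */
    public static String simulate(Policy policy) {
        AtomicReference<String> action = new AtomicReference<>("none");
        CountDownLatch done = new CountDownLatch(1);
        Recovery recovery = new Recovery() {
            @Override public void restartWorker() { action.set("restartWorker"); done.countDown(); }
            @Override public void disconnectAndRejoin() { action.set("disconnectAndRejoin"); done.countDown(); }
            @Override public void stopNode() { action.set("stopNode"); done.countDown(); }
        };
        // Stand-in for a socket reader: blocks until interrupted, then exits.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            }
            catch (InterruptedException e) {
                // Fall through and let the thread die, as in the reported scenario.
            }
        }, "fake-sock-reader");
        worker.start();
        watch(worker, policy, recovery);
        worker.interrupt();
        try {
            done.await(5, TimeUnit.SECONDS);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return action.get();
    }

    public static void main(String[] args) {
        System.out.println(simulate(Policy.DISCONNECT_AND_REJOIN));
    }
}
```

The sketch observes the worker from a separate thread via join() rather than from a finally block inside the worker, so it does not depend on how the worker was terminated. A real node would also have to distinguish an intentional shutdown from an unexpected thread death, which this sketch does not attempt.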



--
This message was sent by Atlassian Jira
(v8.3.4#803005)