Posted to dev@ignite.apache.org by "Ilya Kasnacheev (JIRA)" <ji...@apache.org> on 2018/05/28 17:26:00 UTC

[jira] [Created] (IGNITE-8633) Node fails to bail out of wrong BLT, instead hanging around indefinitely

Ilya Kasnacheev created IGNITE-8633:
---------------------------------------

             Summary: Node fails to bail out of wrong BLT, instead hanging around indefinitely
                 Key: IGNITE-8633
                 URL: https://issues.apache.org/jira/browse/IGNITE-8633
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.4
            Reporter: Ilya Kasnacheev
            Assignee: Stanislav Lukyanov


This is a follow-up to https://stackoverflow.com/questions/50234056/how-to-give-multiple-static-ip-in-apache-ignite-cache-configuration-xml-file/50270676?noredirect=1#comment88095814_50270676, but it is not quite the same issue.

I have three nodes: A, B and C.
I've started A and C and performed activation.
Then I stopped them both, started B and performed activation on it.
Now I have two BlT clusters: (A, C) and (B).
However, when I start B and then try to launch node A or C, I get inconsistent behavior:
When I launch C, I get the error:
{code}
org.apache.ignite.spi.IgniteSpiException: BaselineTopology of joining node (8c1e210f-52bb-424f-9c7c-a2e7b1bab546 ) is not compatible with BaselineTopology in the cluster. Branching history of cluster BlT ([-1349069127]) doesn't contain branching point hash of joining node BlT (631694798). Consider cleaning persistent storage of the node and adding it to the cluster again.
{code}
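
For illustration only, the check that this message describes can be modelled as a set-membership test. This is a minimal sketch, not Ignite's actual code: the class and method names are hypothetical, and the hash type (long) is an assumption. Only the two hash values are taken from the error above.

```java
import java.util.Collections;
import java.util.Set;

// Toy model of the compatibility check described by the error message.
// NOT Ignite's implementation: class and method names are hypothetical.
public class BltJoinCheckSketch {
    /**
     * A joining node is treated as compatible only if the cluster's branching
     * history contains the joining node's branching point hash.
     */
    static boolean isCompatible(Set<Long> clusterBranchingHistory, long joiningBranchingPointHash) {
        return clusterBranchingHistory.contains(joiningBranchingPointHash);
    }

    public static void main(String[] args) {
        // Values copied from the error message: the cluster (B) history and
        // the branching point hash reported for joining node C.
        Set<Long> clusterHistory = Collections.singleton(-1349069127L);
        long joiningHash = 631694798L;

        System.out.println(isCompatible(clusterHistory, joiningHash));  // prints "false"
        System.out.println(isCompatible(clusterHistory, -1349069127L)); // prints "true"
    }
}
```

Under this model a mismatch gives a definite "reject" answer, which is what node C receives.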

But when I launch A, it never enters the topology, yet it never fails either. Instead, A and B keep pinging each other indefinitely:
{code}
[20:16:38,596][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=5000]
[20:17:29,514][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/172.25.1.36, rmtPort=49030]
[20:17:29,522][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.36, rmtPort=49030]
[20:17:29,523][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.36:49030, rmtPort=49030]
[20:17:29,524][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Received ping request from the remote node [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc, rmtAddr=/172.25.1.36:49030, rmtPort=49030]
[20:17:29,525][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Finished writing ping response [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc, rmtAddr=/172.25.1.36:49030, rmtPort=49030]
[20:17:29,526][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/172.25.1.36:49030, rmtPort=49030
[20:18:30,733][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/172.25.1.36, rmtPort=50857]
[20:18:30,733][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.36, rmtPort=50857]
[20:18:30,733][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.36:50857, rmtPort=50857]
[20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Received ping request from the remote node [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc, rmtAddr=/172.25.1.36:50857, rmtPort=50857]
[20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Finished writing ping response [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc, rmtAddr=/172.25.1.36:50857, rmtPort=50857]
[20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/172.25.1.36:50857, rmtPort=50857
{code}
{code}
[20:16:28,793][INFO][tcp-disco-msg-worker-#3][GridSnapshotAwareClusterStateProcessorImpl] Received state change finish message: true
[20:16:28,803][INFO][exchange-worker-#62][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], crd=true]
[20:16:28,812][INFO][exchange-worker-#62][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=1, minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT, node=37104137-a21e-4b6f-a70b-09164300bbfc]
[20:16:28,818][INFO][sys-#68][GridSnapshotAwareClusterStateProcessorImpl] Successfully performed final activation steps [nodeId=37104137-a21e-4b6f-a70b-09164300bbfc, client=false, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1]]
[20:16:33,571][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/172.25.1.35, rmtPort=42500]
[20:16:33,579][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.35, rmtPort=42500]
[20:16:33,580][INFO][tcp-disco-sock-reader-#9][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.35:42500, rmtPort=42500]
[20:16:33,592][INFO][tcp-disco-sock-reader-#9][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/172.25.1.35:42500, rmtPort=42500
[20:16:39,801][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/172.25.1.35, rmtPort=42714]
[20:16:39,801][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/172.25.1.35, rmtPort=42714]
[20:16:39,802][INFO][tcp-disco-sock-reader-#10][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/172.25.1.35:42714, rmtPort=42714]
[20:16:39,806][INFO][tcp-disco-sock-reader-#10][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/172.25.1.35:42714, rmtPort=42714
{code}

I don't think this is expected behaviour. I will attach config and work directories.


