Posted to user@ignite.apache.org by risomas <ma...@risomas.com> on 2020/06/11 18:11:16 UTC

Ignite servers are not connecting to the cluster

Hello,
We have 12 identical Azure servers in one network, each with 32 CPUs and
64 GB of RAM. Ignite runs as a Docker container (identical on every server,
including the config). 8 of our servers work fine, but 4 work only
sometimes: they can't join the cluster and, even worse, they also block the
others. I have recreated these VMs (preserving the NICs because of the
licensing of some plugins), but it didn't help. Sometimes these 4 nodes work
fine and can connect together, but sometimes they can't connect even when
there are just 2 of them. The behavior is identical when running under
docker-compose or Kubernetes. Once the servers are connected, there is no
issue and they work fine.
We are using Ignite 2.8.1.

I am attaching logs and config from a setup with just 2 servers in the network:
 
node8.node8
<http://apache-ignite-users.70518.x6.nabble.com/file/t2900/node8.node8>  
node7.node7
<http://apache-ignite-users.70518.x6.nabble.com/file/t2900/node7.node7>  
Both logs continue in an infinite error loop.

2.xml <http://apache-ignite-users.70518.x6.nabble.com/file/t2900/2.xml>  



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Ignite servers are not connecting to the cluster

Posted by mcherkasov <mc...@gridgain.com>.
Hello,


Could you please say what type of network you use to run the containers?

In the logs you can find the following line:
[17:35:08,448][INFO][main][IgniteKernal] Non-loopback local IPs: 10.0.2.10,
172.17.0.1, 192.168.183.128, fe80:0:0:0:20d:3aff:febb:b0b6%eth0

Ignite finds all available network interfaces and propagates their addresses
to other nodes via the discovery protocol, so other nodes can choose any of
those addresses. You can limit this by setting the localHost property in
IgniteConfiguration.
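
A minimal sketch of that setting in a Spring XML config, assuming 10.0.2.10
(the first address in the log line above) is the one other VMs can actually
reach. The bean id is only illustrative, and each node would use its own
address:

```xml
<bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Assumption: 10.0.2.10 is this node's address on the VM network.
         Replace it with each node's own reachable address. -->
    <property name="localHost" value="10.0.2.10"/>
</bean>
```

With localHost set, the node should bind discovery and communication to that
address instead of advertising the Docker bridge (172.17.0.1) or the
192.168.183.128 address as well.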

Thanks,
Mike.




Re: Ignite servers are not connecting to the cluster

Posted by risomas <ma...@risomas.com>.
I have already added the ports to the configuration as you mentioned. I have
also tried increasing and decreasing all timeouts and adding other timeouts
such as networkTimeout, ackTimeout, and connectionTimeout, but the behavior
did not change.
The ports are exposed and reachable from every VM to every other VM.
We build the Docker image from a centos8 image plus the official Ignite
installation (because of a plugin licensing installation issue on the
official Ignite image).

For me the most suspicious message is:
...Connection timed out...addr=/192.168.183.128:47100]
That IP is the tunl0 address of the other VM and is not reachable.





Re: Ignite servers are not connecting to the cluster

Posted by Denis Magda <dm...@apache.org>.
Hi, there are two things to check.

First, add the port number 47500 to all the addresses in
your TcpDiscoveryVmIpFinder.addresses property. For instance, 10.0.2.4
needs to be changed to 10.0.2.4:47500.

Second, double-check that every Docker VM exposes the following port
numbers - 11211, 47100, 47500, 49112. Those should be exposed by default if
an official Ignite image is used. But I would still verify that.
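
For the first point, the address list could look roughly like the fragment
below. This is a sketch rather than the attached 2.xml: only 10.0.2.4 comes
from this thread, and the remaining server addresses would be listed the same
way.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <property name="ipFinder">
                <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                    <property name="addresses">
                        <list>
                            <!-- Address and discovery port of one server node;
                                 add one entry per server in the same form. -->
                            <value>10.0.2.4:47500</value>
                        </list>
                    </property>
                </bean>
            </property>
        </bean>
    </property>
</bean>
```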

-
Denis

