Posted to user@ignite.apache.org by Kristian Rosenvold <kr...@apache.org> on 2016/04/14 10:34:48 UTC

Peculiar loopback address on Mac seems to break cluster of Linux and Mac...

I was seeing substantial instability in my newly configured 1.5.0
cluster: messages like the following would pop up, followed by the
termination of the node:

java.net.UnknownHostException: no such interface lo
at java.net.Inet6Address.initstr(Inet6Address.java:487) ~[na:1.8.0_60]
at java.net.Inet6Address.<init>(Inet6Address.java:408) ~[na:1.8.0_60]
at java.net.InetAddress.getAllByName(InetAddress.java:1181) ~[na:1.8.0_60]
at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[na:1.8.0_60]
at java.net.InetAddress.getByName(InetAddress.java:1076) ~[na:1.8.0_60]
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1259) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1241) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.sendMessageAcrossRing(ServerImpl.java:2456) [ignite-core-1.5.0.final.jar:1.5.0.final]
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processHeartbeatMessage(ServerImpl.java:4432) [ignite-core-1.5.0.final.jar:1.5.0.final]
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2267) [ignite-core-1.5.0.final.jar:1.5.0.final]
at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:5784) [ignite-core-1.5.0.final.jar:1.5.0.final]
at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2161) [ignite-core-1.5.0.final.jar:1.5.0.final]
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) [ignite-core-1.5.0.final.jar:1.5.0.final]
08:23:23.189 [tcp-disco-msg-worker-#2%RA-ignite%] WARN o.a.i.s.d.tcp.TcpDiscoverySpi - Local node has detected failed nodes and started cluster-wide procedure. To speed up failure detection please see 'Failure Detection' section under javadoc for 'TcpDiscoverySpi'

In our MySQL discovery database I then saw a host called
'0:0:0:0:0:0:0:1%lo' (as well as '0:0:0:0:0:0:0:1'). On a hunch I deleted
the "lo" row from the database, and things seem to have stabilized.

It would appear that when I start a node on my local Mac, it inserts a row
into the discovery database that does not parse properly on the Linux node
(or vice versa; I have not been able to determine which).
According to the docs on TcpDiscoverySpi, a random entry from the
discovery addresses is used, and things seem to start breaking down
whenever this address is chosen.


It appears things have stabilized significantly once I switched the entire
cluster to -Djava.net.preferIPv4Stack=true
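
As a hedged sketch of how one might guard against a node being started without this flag, the following JDK-only check reads the system property at startup. The class name Ipv4StackCheck is hypothetical, and the sketch assumes the flag is passed on the JVM command line; setting it programmatically after networking classes have loaded may have no effect.

```java
final class Ipv4StackCheck {
    static final String PROP = "java.net.preferIPv4Stack";

    /** True if the JVM was started with -Djava.net.preferIPv4Stack=true. */
    static boolean ipv4StackPreferred() {
        // Boolean.getBoolean reads the named system property and parses it as a boolean.
        return Boolean.getBoolean(PROP);
    }

    public static void main(String[] args) {
        if (!ipv4StackPreferred()) {
            System.err.println("Warning: " + PROP + " is not set; start the JVM with -D"
                    + PROP + "=true to keep IPv6 addresses out of discovery.");
        }
    }
}
```

Calling such a check before the node starts makes a missing flag visible in the logs rather than surfacing later as discovery failures.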

Is there a known fix for this issue? What would be the appropriate root
problem to fix in a patch here?

Kristian

Re: Peculiar loopback address on Mac seems to break cluster of Linux and Mac...

Posted by Denis Magda <dm...@gridgain.com>.
Kristian,

According to the sources, the UnknownHostException should have been handled gracefully, and the discovery SPI should have moved on to the next host address when trying to connect.

Did you turn on the DEBUG logging level to see this exception (could you attach the full log)? Doesn't the Linux node then try to connect to the Mac node using the next address from the list?

--
Denis

> On Apr 15, 2016, at 7:43 AM, Denis Magda <dm...@gridgain.com> wrote:
> 
> Hi Kristian,
> 
> Thanks for reporting this. I've opened an issue in the Apache Ignite JIRA:
> https://issues.apache.org/jira/browse/IGNITE-3011
> 
> As a workaround, as you already noted, you can pass
> -Djava.net.preferIPv4Stack=true to the JVM at startup.
> 
> Another solution that may work in your case is to explicitly set the host
> address used for network communication in the configuration, using the
> IgniteConfiguration.setLocalHost method.
> So if the Mac node's IP address is "10.123.24.25" and the Linux node's is
> "11.123.24.25", then in the Mac node's configuration you should call
> IgniteConfiguration.setLocalHost("10.123.24.25") and in the Linux node's
> IgniteConfiguration.setLocalHost("11.123.24.25").
> 
> --
> Denis
> 
> 
> 
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Pecuilar-loopback-address-on-Mac-seems-to-break-cluster-of-linux-and-mac-tp4156p4210.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Peculiar loopback address on Mac seems to break cluster of Linux and Mac...

Posted by Denis Magda <dm...@gridgain.com>.
Hi Kristian,

Thanks for reporting this. I've opened an issue in the Apache Ignite JIRA:
https://issues.apache.org/jira/browse/IGNITE-3011

As a workaround, as you already noted, you can pass
-Djava.net.preferIPv4Stack=true to the JVM at startup.

Another solution that may work in your case is to explicitly set the host
address used for network communication in the configuration, using the
IgniteConfiguration.setLocalHost method.
So if the Mac node's IP address is "10.123.24.25" and the Linux node's is
"11.123.24.25", then in the Mac node's configuration you should call
IgniteConfiguration.setLocalHost("10.123.24.25") and in the Linux node's
IgniteConfiguration.setLocalHost("11.123.24.25").
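
If hard-coding the address per machine is inconvenient, one possible approach is to pick it at startup. The sketch below uses only the JDK to find the first non-loopback IPv4 address; the class and method names (LocalHostPicker, firstNonLoopbackIpv4) are hypothetical, and the choice of "first address found" is an assumption that may not suit multi-homed hosts. The Ignite call is shown only as a comment, since it needs ignite-core on the classpath.

```java
import java.io.UncheckedIOException;
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Optional;

final class LocalHostPicker {
    /** Returns the first IPv4 address found on an interface that is up and not loopback. */
    static Optional<String> firstNonLoopbackIpv4() {
        try {
            Enumeration<NetworkInterface> ifs = NetworkInterface.getNetworkInterfaces();
            if (ifs == null) {
                return Optional.empty(); // no interfaces reported
            }
            for (NetworkInterface nif : Collections.list(ifs)) {
                if (!nif.isUp() || nif.isLoopback()) {
                    continue; // skip lo / lo0 and interfaces that are down
                }
                for (InetAddress addr : Collections.list(nif.getInetAddresses())) {
                    if (addr instanceof Inet4Address && !addr.isLoopbackAddress()) {
                        return Optional.of(addr.getHostAddress());
                    }
                }
            }
            return Optional.empty();
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Optional<String> host = firstNonLoopbackIpv4();
        System.out.println("Candidate local host: " + host.orElse("<none found>"));
        // With ignite-core available, the value would be passed on as described above:
        //   IgniteConfiguration cfg = new IgniteConfiguration();
        //   host.ifPresent(cfg::setLocalHost);
    }
}
```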

--
Denis


