Posted to user@ignite.apache.org by kfeerick <ke...@ninemilefinancial.com> on 2017/03/16 00:16:07 UTC

ignite messaging disconnection behaviour

Hello friends,
Another query on some characteristics of Ignite behaviour: I'm not sure
whether this is down to my usage or is the intention of the product. We are
using the messaging sub-system to distribute streams of data between
interested processes.

The server is sending data as follows:
ignite.message().sendOrdered("my-topic", objectToSend, 1);

The client subscribing:
ignite.message().localListen("my-topic", (UUID uuid, Object o) -> {
    log.info(o);
    return true;
});

This works well for our use case apart from one behaviour. When a topic
subscriber process exits, the sender detects the TCP failure and blocks the
sending thread for a prolonged period (about five seconds in the log below)
before reporting the exception shown below.

We really don't care that a subscriber has gone away. Ours is closer to a
multicast use case, but we don't want to introduce the overhead of a
specialist messaging product for what is quite a narrow requirement. Ignite
messaging seems to do the trick apart from this one disconnection behaviour.

Thanks for the great product and keep up the good work!
Kevin


[WARN ] 2017-03-16 11:06:32.951 [grid-nio-worker-0-#9%null%]
TcpCommunicationSpi - Failed to process selector key (will close):
GridSelectorNioSessionImpl [selectorIdx=0, queueSize=0,
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
recovery=GridNioRecoveryDescriptor [acked=4624, resendCnt=0, rcvCnt=478,
sentCnt=4632, reserved=true, lastAck=478, nodeLeft=false,
node=TcpDiscoveryNode [id=613517f0-9534-46e3-bbe7-eb5ed1515b27,
addrs=[0:0:0:0:0:0:0:1, 10.1.1.15, 127.0.0.1, 172.27.236.2,
2001:0:9d38:6abd:18fe:1bb0:f5fe:fef0], sockAddrs=[/172.27.236.2:47501,
/2001:0:9d38:6abd:18fe:1bb0:f5fe:fef0:47501, A0004.lan/10.1.1.15:47501,
/0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501], discPort=47501, order=2,
intOrder=2, lastExchangeTime=1489622741577, loc=false,
ver=1.8.0#20161205-sha1:9ca40dbe, isClient=false], connected=true,
connectCnt=1, queueLimit=5120, reserveCnt=1], super=GridNioSessionImpl
[locAddr=/0:0:0:0:0:0:0:1:59801, rmtAddr=/0:0:0:0:0:0:0:1:47101,
createTime=1489622741736, closeTime=0, bytesSent=13680979, bytesRcvd=158379,
sndSchedTime=1489622792923, lastSndTime=1489622792923,
lastRcvTime=1489622792813, readsPaused=false,
filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=o.a.i.i.util.nio.GridDirectParser@69ad4e89, directMode=true],
GridConnectionBytesVerifyFilter], accepted=false]]
[WARN ] 2017-03-16 11:06:32.952 [grid-nio-worker-0-#9%null%]
TcpCommunicationSpi - Closing NIO session because of unhandled exception
[cls=class o.a.i.i.util.nio.GridNioException, msg=An existing connection was
forcibly closed by the remote host]
[INFO ] 2017-03-16 11:06:32.957 [market-data-worker] MarketDataPublisher -
ABC{bestBid=5.73, bestAsk=5.39, bidVol=575, askVol=2261, IAP=5.53}
[WARN ] 2017-03-16 11:06:33.980 [market-data-worker] TcpCommunicationSpi -
Connect timed out (consider increasing 'failureDetectionTimeout'
configuration property) [addr=/0:0:0:0:0:0:0:1:47101,
failureDetectionTimeout=10000]
[WARN ] 2017-03-16 11:06:34.977 [market-data-worker] TcpCommunicationSpi -
Connect timed out (consider increasing 'failureDetectionTimeout'
configuration property) [addr=/127.0.0.1:47101,
failureDetectionTimeout=10000]
[WARN ] 2017-03-16 11:06:35.982 [market-data-worker] TcpCommunicationSpi -
Connect timed out (consider increasing 'failureDetectionTimeout'
configuration property) [addr=/172.27.236.2:47101,
failureDetectionTimeout=10000]
[WARN ] 2017-03-16 11:06:36.983 [market-data-worker] TcpCommunicationSpi -
Connect timed out (consider increasing 'failureDetectionTimeout'
configuration property) [addr=/2001:0:9d38:6abd:18fe:1bb0:f5fe:fef0:47101,
failureDetectionTimeout=10000]
[WARN ] 2017-03-16 11:06:37.981 [market-data-worker] TcpCommunicationSpi -
Connect timed out (consider increasing 'failureDetectionTimeout'
configuration property) [addr=A0004.lan/10.1.1.15:47101,
failureDetectionTimeout=10000]
[WARN ] 2017-03-16 11:06:37.981 [market-data-worker] TcpCommunicationSpi -
Failed to connect to a remote node (make sure that destination node is alive
and operating system firewall is disabled on local and remote hosts)
[addrs=[/0:0:0:0:0:0:0:1:47101, /127.0.0.1:47101, /172.27.236.2:47101,
/2001:0:9d38:6abd:18fe:1bb0:f5fe:fef0:47101, A0004.lan/10.1.1.15:47101]]
[ERROR] 2017-03-16 11:06:37.987 [market-data-worker] MarketDataPublisher -
Failed to send message to remote node: TcpDiscoveryNode
[id=613517f0-9534-46e3-bbe7-eb5ed1515b27, addrs=[0:0:0:0:0:0:0:1, 10.1.1.15,
127.0.0.1, 172.27.236.2, 2001:0:9d38:6abd:18fe:1bb0:f5fe:fef0],
sockAddrs=[/172.27.236.2:47501, /2001:0:9d38:6abd:18fe:1bb0:f5fe:fef0:47501,
A0004.lan/10.1.1.15:47501, /0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501],
discPort=47501, order=2, intOrder=2, lastExchangeTime=1489622741577,
loc=false, ver=1.8.0#20161205-sha1:9ca40dbe, isClient=false]
org.apache.ignite.spi.IgniteSpiException: Failed to send message to remote
node: TcpDiscoveryNode [id=613517f0-9534-46e3-bbe7-eb5ed1515b27,
addrs=[0:0:0:0:0:0:0:1, 10.1.1.15, 127.0.0.1, 172.27.236.2,
2001:0:9d38:6abd:18fe:1bb0:f5fe:fef0], sockAddrs=[/172.27.236.2:47501,
/2001:0:9d38:6abd:18fe:1bb0:f5fe:fef0:47501, A0004.lan/10.1.1.15:47501,
/0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501], discPort=47501, order=2,
intOrder=2, lastExchangeTime=1489622741577, loc=false,
ver=1.8.0#20161205-sha1:9ca40dbe, isClient=false]
	at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2017) ~[ignite-core-1.8.0.jar:1.8.0]
	at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1955) ~[ignite-core-1.8.0.jar:1.8.0]
	at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1148) ~[ignite-core-1.8.0.jar:1.8.0]
	at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1571) ~[ignite-core-1.8.0.jar:1.8.0]
	at org.apache.ignite.internal.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1315) ~[ignite-core-1.8.0.jar:1.8.0]
	at org.apache.ignite.internal.managers.communication.GridIoManager.sendUserMessage(GridIoManager.java:1453) ~[ignite-core-1.8.0.jar:1.8.0]
	at org.apache.ignite.internal.IgniteMessagingImpl.sendOrdered(IgniteMessagingImpl.java:140) ~[ignite-core-1.8.0.jar:1.8.0]



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/ignite-messaging-disconnection-behaviour-tp11218.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: ignite messaging disconnection behaviour

Posted by dkarachentsev <dk...@gridgain.com>.
Hi Kevin,

Unfortunately, the community decided to leave it as is: messages are sent
asynchronously by design, but establishing a connection is not. This was
intentional, so users should take it into account and work around it as
their particular situation requires.

Thanks!

-Dmitry.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/ignite-messaging-disconnection-behaviour-tp11218p11658.html

Re: ignite messaging disconnection behaviour

Posted by kfeerick <ke...@ninemilefinancial.com>.
Hi Dmitry,
Did you have a chance to raise the ticket for this issue? If so, can you
supply a JIRA reference?

Cheers,
Kevin



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/ignite-messaging-disconnection-behaviour-tp11218p11648.html

Re: ignite messaging disconnection behaviour

Posted by dkarachentsev <dk...@gridgain.com>.
Hi Kevin,

It seems that async mode has no effect on message sending. I've written up
a suggestion to fix this on the dev list. For now, you could delegate
message sending to a separate thread, just to release your main thread.
I'll send you a follow-up message with the ticket number if and when one is
opened.
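
A minimal sketch of that workaround, using only the JDK: a single-thread
executor takes over the blocking send, so the publishing thread returns
immediately and message order is preserved. The names BackgroundSender,
publish and blockingSend are hypothetical and not part of Ignite; in real
use the Consumer would wrap the sendOrdered call from the original post.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper: hands each message to one background thread. */
class BackgroundSender implements AutoCloseable {
    // One worker thread, so messages are sent in the order they were published.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // Assumption: stands in for msg -> ignite.message().sendOrdered("my-topic", msg, 1)
    private final Consumer<Object> blockingSend;

    BackgroundSender(Consumer<Object> blockingSend) {
        this.blockingSend = blockingSend;
    }

    /** Queues the message and returns without waiting for the send. */
    void publish(Object msg) {
        executor.execute(() -> {
            try {
                blockingSend.accept(msg);
            } catch (RuntimeException e) {
                // A failed send surfaces here, not on the publishing thread.
                System.err.println("Send failed, dropping message: " + e.getMessage());
            }
        });
    }

    /** Stops accepting messages and waits up to five seconds for queued sends. */
    @Override
    public void close() {
        executor.shutdown();
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }

    /** Publishes two messages through a stand-in send that just records them. */
    static List<Object> demo() {
        List<Object> sent = Collections.synchronizedList(new ArrayList<>());
        try (BackgroundSender sender = new BackgroundSender(sent::add)) {
            sender.publish("tick-1");
            sender.publish("tick-2");
        }
        return sent;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the executor's queue is unbounded here, so while a send is stuck
on a reconnect attempt, newer messages simply pile up behind it.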

Thanks!

-Dmitry.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/ignite-messaging-disconnection-behaviour-tp11218p11418.html

Re: ignite messaging disconnection behaviour

Posted by kfeerick <ke...@ninemilefinancial.com>.
Hi Dmitry,
Thanks for the suggestion. It does not seem to make a difference: the
calling thread still seems to block before an exception gets thrown from the
TcpCommunicationSpi.

It appears you can tweak how quickly the exception gets thrown by adjusting
the failureDetectionTimeout on the Ignite configuration. However, this is
not that desirable, as we run across a variety of network topologies and
don't want to tune this value for all our cache instances. We really do want
to fire and forget: if someone is listening, great; if they're not, no
problem either.
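
For illustration only, fire-and-forget semantics can be approximated on the
sender side with plain JDK classes: one worker thread, a small bounded
backlog, and a policy that discards the oldest queued message when the
backlog is full. This is a sketch under assumptions, not an Ignite feature.
DroppingSender, publish and blockingSend are hypothetical names, and the
Consumer stands in for the sendOrdered call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper: one worker, bounded backlog, oldest message dropped on overflow. */
class DroppingSender {
    private final ThreadPoolExecutor executor;

    // Assumption: stands in for msg -> ignite.message().sendOrdered("my-topic", msg, 1)
    private final Consumer<Object> blockingSend;

    DroppingSender(Consumer<Object> blockingSend, int backlog) {
        this.blockingSend = blockingSend;
        this.executor = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(backlog),
            new ThreadPoolExecutor.DiscardOldestPolicy());
    }

    /** Never blocks the caller; may discard the oldest queued message. */
    void publish(Object msg) {
        executor.execute(() -> {
            try {
                blockingSend.accept(msg);
            } catch (RuntimeException e) {
                // Nobody listening is not an error for this use case.
            }
        });
    }

    /** Stops accepting messages and waits up to five seconds for the backlog to drain. */
    boolean shutdown() {
        executor.shutdown();
        try {
            return executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Simulates a send that stalls while four messages arrive with a backlog of two. */
    static List<Object> demo() {
        List<Object> delivered = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch stall = new CountDownLatch(1);
        DroppingSender sender = new DroppingSender(msg -> {
            try {
                stall.await(); // stand-in for a blocked connection attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            delivered.add(msg);
        }, 2);
        for (int i = 0; i < 4; i++) {
            sender.publish("m" + i);
        }
        stall.countDown(); // the stalled send completes
        sender.shutdown();
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo, m0 is handed straight to the worker, m1 and m2 fill the
backlog, and publishing m3 evicts m1, so m0, m2 and m3 are delivered in
that order. The trade-off is that stale data is dropped instead of being
delayed, which suits a market-data stream but not every use case.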

I've written a quick example to demonstrate the behaviour. I'd appreciate
it if you could take a look and advise.

Cheers
messaging-reproducer.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/n11329/messaging-reproducer.zip>  





--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/ignite-messaging-disconnection-behaviour-tp11218p11329.html

Re: ignite messaging disconnection behaviour

Posted by dkarachentsev <dk...@gridgain.com>.
Hi Kevin,

You may use IgniteMessaging in async mode:
IgniteMessaging msg = ignite.message().withAsync();
msg.sendOrdered("my-topic", objectToSend, 1);

In that case you won't wait for the message to be sent, or for reconnection
attempts if there was a failure.

-Dmitry.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/ignite-messaging-disconnection-behaviour-tp11218p11311.html