Posted to user@ignite.apache.org by Denis Magda <dm...@gridgain.com> on 2016/05/04 11:32:24 UTC

Re: Stopping the node in order to prevent cluster wide instability.

Hi, 

Please subscribe to the user list properly (that way we will not have to
manually approve your emails). All you need to do is send an email to
"user-subscribe@ignite.apache.org" and follow the simple instructions in
the reply.

As for your questions: most likely the node that was stopped became
segmented, that is, it was kicked out of the topology for some reason. Look
for the "local node segmented" message in the log.
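If the logs are large, a small scan like the one below can help find that message. This is only a rough sketch: the class name, the method name and the sample lines are made up for illustration, and the exact wording and casing of the real log entry may differ, so the match is case-insensitive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Ignite: filters log lines for the
// segmentation message mentioned above.
class SegmentationLogScan {
    // Text to look for; compared case-insensitively because the exact
    // casing in a real log is an assumption here.
    static final String MARKER = "local node segmented";

    // Returns every line that contains the marker, in original order.
    static List<String> findSegmentationLines(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (line.toLowerCase(Locale.ROOT).contains(MARKER)) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample lines, only to show the call.
        List<String> sample = Arrays.asList(
            "INFO  node started",
            "WARN  Local node SEGMENTED",
            "INFO  node stopped");
        for (String hit : findSegmentationLines(sample)) {
            System.out.println(hit);
        }
    }
}
```

In practice you would read the lines from the node's log file (for example with java.nio.file.Files.readAllLines) and pass them to the same method.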

Most often the cause is either a slow network connection, where a node
cannot reply to a message within IgniteConfiguration.failureDetectionTimeout,
or long GC pauses.

I would suggest first checking that there are no long GC pauses. Refer to
this page for details on how to gather GC logs [1]. If you see pauses longer
than 10 secs (the default value of
IgniteConfiguration.failureDetectionTimeout), then this is why the node was
segmented, and you have to tune the Java heap [2] and/or your app.

[1]
https://apacheignite.readme.io/v1.5/docs/jvm-and-system-tuning#section-detailed-garbage-collection-stats
[2]
https://apacheignite.readme.io/v1.5/docs/jvm-and-system-tuning#jvm-tuning-for-clusters-with-on_heap-caches
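
To compare GC pauses against the 10 sec threshold, something along these lines could be used. It is a minimal sketch under a few assumptions: the JVM is a HotSpot one started with -XX:+PrintGCApplicationStoppedTime (so the log has "Total time for which application threads were stopped" lines), and the class and method names are hypothetical. Other GC log formats would need a different pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ignite: reports stop-the-world pauses
// that are longer than a given threshold.
class GcPauseCheck {
    // Default IgniteConfiguration.failureDetectionTimeout, in milliseconds.
    static final long FAILURE_DETECTION_TIMEOUT_MS = 10_000L;

    // Assumed log line format (HotSpot, -XX:+PrintGCApplicationStoppedTime).
    private static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads were stopped: "
        + "([0-9]+\\.?[0-9]*) seconds");

    // Returns the pauses (in ms) that are strictly longer than thresholdMs.
    static List<Long> longPausesMs(List<String> gcLogLines, long thresholdMs) {
        List<Long> result = new ArrayList<>();
        for (String line : gcLogLines) {
            Matcher m = STOPPED.matcher(line);
            if (m.find()) {
                long pauseMs = Math.round(Double.parseDouble(m.group(1)) * 1000.0);
                if (pauseMs > thresholdMs) {
                    result.add(pauseMs);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample lines, only to show the call.
        List<String> sample = Arrays.asList(
            "Total time for which application threads were stopped: 0.0123 seconds",
            "Total time for which application threads were stopped: 12.5 seconds");
        System.out.println(longPausesMs(sample, FAILURE_DETECTION_TIMEOUT_MS));
    }
}
```

Any pause reported by this check is long enough, on its own, for the node to miss the failure detection timeout.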

--
Denis
