Posted to user@ignite.apache.org by Kopilov <ko...@gmail.com> on 2018/09/13 16:38:52 UTC

Unreasonable segmentation in Kubernetes on one node reboot

Hello!

I have made a Java application with embedded Ignite and deployed it to
Kubernetes using a StatefulSet with a liveness probe. The application has
three pods, one per each of three physical nodes.

As a failover test, I reboot one Kubernetes node (it hosts a pod with a
slave Ignite node). *While it is offline* (about two minutes), *another
slave Ignite replica goes down with Segmentation* [1].
This does not happen when a pod is killed by Kubernetes.

All Ignite timeouts are at their defaults. The Kubernetes liveness probe has
initialDelaySeconds: 10, periodSeconds: 2, timeoutSeconds: 1.
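
For context, this is roughly where the relevant timeout lives in the embedded
node's configuration (a minimal sketch only, nothing I actually changed; the
10000 ms value is the default, which also matches failureDetectionTimeout=10000
in the logs below):

```java
import org.apache.ignite.configuration.IgniteConfiguration;

public class NodeConfig {
    // Sketch only: everything is left at defaults in my deployment;
    // this just shows where the knob lives.
    public static IgniteConfiguration defaults() {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // 10_000 ms is the default value; the warnings in the logs
        // suggest increasing it.
        cfg.setFailureDetectionTimeout(10_000);
        return cfg;
    }
}
```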

The pods are:
replica-0: 172.30.53.18,  Ignite master (it is master because it was started
first by Kubernetes)
replica-1: 172.30.73.15,  Ignite slave, got segmented (and stopped itself) —
this is my problem
replica-2: 172.30.55.10,  Ignite slave, which was killed for the failover test

The failover reboot was done at 11:08:23.

*Some logs from the master Ignite node* (with my comments, marked with
/slashes/):
=========================================================
/Connect timed out from the killed node, received shortly after the default
failureDetectionTimeout elapsed — looks OK/
sep 13, 2018 11:08:36 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Connect timed out (consider increasing 'failureDetectionTimeout'
configuration property) [addr=/172.30.55.10:47100,
failureDetectionTimeout=10000]

/Timed out from the other node (the one that later stopped itself) — why? And
what is the difference between "Connect timed out" and "Timed out waiting for
message delivery receipt"?/
sep 13, 2018 11:08:40 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Timed out waiting for message delivery receipt (most probably, the
reason is in long GC pauses on remote node; consider tuning GC and
increasing 'ackTimeout' configuration property). Will retry to send message
with increased timeout [currentTimeout=10000, rmtAddr=/172.30.73.15:47500,
rmtPort=47500]

/Further problems, the same for both slaves, at a 10-second interval./
sep 13, 2018 11:08:40 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Failed to send message to next node
[msg=TcpDiscoveryStatusCheckMessage [creatorNode=TcpDiscoveryNode
[id=ebf41e67-e115-4185-9478-92ffa0463b5e, addrs=[127.0.0.1, 172.30.53.18],
sockAddrs=[replica-0.ignite-deployment.ignite-namespace.svc.cluster.local/172.30.53.18:47500,
/127.0.0.1:47500], discPort=47500, order=1, intOrder=1,
lastExchangeTime=1536826110655, loc=true, ver=2.6.1#20180906-sha1:8cf681db,
isClient=false], failedNodeId=null, status=0,
super=TcpDiscoveryAbstractMessage [sndNodeId=null,
id=523d4c2d561-ebf41e67-e115-4185-9478-92ffa0463b5e, verifierNodeId=null,
topVer=0, pendingIdx=0, failedNodes=null, isClient=false]],
next=TcpDiscoveryNode [id=20e118e5-cb8c-4139-902e-0e44f4a586c3,
addrs=[127.0.0.1, 172.30.73.15], sockAddrs=[/172.30.73.15:47500,
/127.0.0.1:47500], discPort=47500, order=2, intOrder=2,
lastExchangeTime=1536825520251, loc=false, ver=2.6.1#20180906-sha1:8cf681db,
isClient=false], errMsg=Failed to send message to next node
[msg=TcpDiscoveryStatusCheckMessage [creatorNode=TcpDiscoveryNode
[id=ebf41e67-e115-4185-9478-92ffa0463b5e, addrs=[127.0.0.1, 172.30.53.18],
sockAddrs=[replica-0.ignite-deployment.ignite-namespace.svc.cluster.local/172.30.53.18:47500,
/127.0.0.1:47500], discPort=47500, order=1, intOrder=1,
lastExchangeTime=1536826110655, loc=true, ver=2.6.1#20180906-sha1:8cf681db,
isClient=false], failedNodeId=null, status=0,
super=TcpDiscoveryAbstractMessage [sndNodeId=null,
id=523d4c2d561-ebf41e67-e115-4185-9478-92ffa0463b5e, verifierNodeId=null,
topVer=0, pendingIdx=0, failedNodes=null, isClient=false]], next=ClusterNode
[id=20e118e5-cb8c-4139-902e-0e44f4a586c3, order=2, addr=[127.0.0.1,
172.30.73.15], daemon=false]]]

sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Failed to send message to next node
[msg=TcpDiscoveryStatusCheckMessage [creatorNode=TcpDiscoveryNode
[id=ebf41e67-e115-4185-9478-92ffa0463b5e, addrs=[127.0.0.1, 172.30.53.18],
sockAddrs=[replica-0.ignite-deployment.ignite-namespace.svc.cluster.local/172.30.53.18:47500,
/127.0.0.1:47500], discPort=47500, order=1, intOrder=1,
lastExchangeTime=1536826110655, loc=true, ver=2.6.1#20180906-sha1:8cf681db,
isClient=false], failedNodeId=null, status=0,
super=TcpDiscoveryAbstractMessage [sndNodeId=null,
id=523d4c2d561-ebf41e67-e115-4185-9478-92ffa0463b5e, verifierNodeId=null,
topVer=0, pendingIdx=0, failedNodes=null, isClient=false]],
next=TcpDiscoveryNode [id=46f906ab-175f-4055-817c-fab88c5abfb0,
addrs=[127.0.0.1, 172.30.55.10], sockAddrs=[/172.30.55.10:47500,
/127.0.0.1:47500], discPort=47500, order=3, intOrder=3,
lastExchangeTime=1536825531571, loc=false, ver=2.6.1#20180906-sha1:8cf681db,
isClient=false], errMsg=Failed to send message to next node
[msg=TcpDiscoveryStatusCheckMessage [creatorNode=TcpDiscoveryNode
[id=ebf41e67-e115-4185-9478-92ffa0463b5e, addrs=[127.0.0.1, 172.30.53.18],
sockAddrs=[replica-0.ignite-deployment.ignite-namespace.svc.cluster.local/172.30.53.18:47500,
/127.0.0.1:47500], discPort=47500, order=1, intOrder=1,
lastExchangeTime=1536826110655, loc=true, ver=2.6.1#20180906-sha1:8cf681db,
isClient=false], failedNodeId=null, status=0,
super=TcpDiscoveryAbstractMessage [sndNodeId=null,
id=523d4c2d561-ebf41e67-e115-4185-9478-92ffa0463b5e, verifierNodeId=null,
topVer=0, pendingIdx=0, failedNodes=null, isClient=false]], next=ClusterNode
[id=46f906ab-175f-4055-817c-fab88c5abfb0, order=3, addr=[127.0.0.1,
172.30.55.10], daemon=false]]]
/order=1, intOrder=1, lastExchangeTime=1536826110655 — the time corresponds
to 11:08:30/
/order=2, intOrder=2, lastExchangeTime=1536825520251 — the time corresponds
to 10:58:40, the moment the second node joined/
/order=3, intOrder=3, lastExchangeTime=1536825531571 — the time corresponds
to 10:58:51, the moment the third node joined/

/Both nodes are dropped from the topology at the same time/
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Local node has detected failed nodes and started cluster-wide
procedure. To speed up failure detection please see 'Failure Detection'
section under javadoc for 'TcpDiscoverySpi'
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Node FAILED: TcpDiscoveryNode
[id=20e118e5-cb8c-4139-902e-0e44f4a586c3, addrs=[127.0.0.1, 172.30.73.15],
sockAddrs=[/172.30.73.15:47500, /127.0.0.1:47500], discPort=47500, order=2,
intOrder=2, lastExchangeTime=1536825520251, loc=false,
ver=2.6.1#20180906-sha1:8cf681db, isClient=false]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Topology snapshot [ver=4, servers=2, clients=0, CPUs=32,
offheap=48.0GB, heap=64.0GB]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO:   ^-- Node [id=EBF41E67-E115-4185-9478-92FFA0463B5E,
clusterState=ACTIVE]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Data Regions Configured:
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO:   ^-- Default_Region [initSize=1,0 GiB, maxSize=24,0 GiB,
persistenceEnabled=false]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Started exchange init [topVer=AffinityTopologyVersion [topVer=4,
minorTopVer=0], crd=true, evt=NODE_FAILED,
evtNode=20e118e5-cb8c-4139-902e-0e44f4a586c3, customEvt=null,
allowMerge=true]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Node FAILED: TcpDiscoveryNode
[id=46f906ab-175f-4055-817c-fab88c5abfb0, addrs=[127.0.0.1, 172.30.55.10],
sockAddrs=[/172.30.55.10:47500, /127.0.0.1:47500], discPort=47500, order=3,
intOrder=3, lastExchangeTime=1536825531571, loc=false,
ver=2.6.1#20180906-sha1:8cf681db, isClient=false]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Topology snapshot [ver=5, servers=1, clients=0, CPUs=16,
offheap=24.0GB, heap=32.0GB]
=========================================================

*Some logs from the self-stopped slave node:*
=========================================================
/Timed out waiting for message delivery receipt from the killed node — looks
OK; the master logged its message at the same time/
sep 13, 2018 11:08:36 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Timed out waiting for message delivery receipt (most probably, the
reason is in long GC pauses on remote node; consider tuning GC and
increasing 'ackTimeout' configuration property). Will retry to send message
with increased timeout [currentTimeout=10000, rmtAddr=/172.30.55.10:47500,
rmtPort=47500]

sep 13, 2018 11:08:36 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Failed to send message to next node
[msg=TcpDiscoveryMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage
[sndNodeId=ebf41e67-e115-4185-9478-92ffa0463b5e,
id=ab0a4c2d561-ebf41e67-e115-4185-9478-92ffa0463b5e,
verifierNodeId=ebf41e67-e115-4185-9478-92ffa0463b5e, topVer=0, pendingIdx=0,
failedNodes=null, isClient=false]], next=TcpDiscoveryNode
[id=46f906ab-175f-4055-817c-fab88c5abfb0, addrs=[127.0.0.1, 172.30.55.10],
sockAddrs=[/172.30.55.10:47500, /127.0.0.1:47500], discPort=47500, order=3,
intOrder=3, lastExchangeTime=1536825531578, loc=false,
ver=2.6.1#20180906-sha1:8cf681db, isClient=false], errMsg=Failed to send
message to next node [msg=TcpDiscoveryMetricsUpdateMessage
[super=TcpDiscoveryAbstractMessage
[sndNodeId=ebf41e67-e115-4185-9478-92ffa0463b5e,
id=ab0a4c2d561-ebf41e67-e115-4185-9478-92ffa0463b5e,
verifierNodeId=ebf41e67-e115-4185-9478-92ffa0463b5e, topVer=0, pendingIdx=0,
failedNodes=null, isClient=false]], next=ClusterNode
[id=46f906ab-175f-4055-817c-fab88c5abfb0, order=3, addr=[127.0.0.1,
172.30.55.10], daemon=false]]]

/The master will run the same cluster-wide procedure only at 11:08:50. Was
this one missed or ignored by it?/
sep 13, 2018 11:08:36 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Local node has detected failed nodes and started cluster-wide
procedure. To speed up failure detection please see 'Failure Detection'
section under javadoc for 'TcpDiscoverySpi'

/Segmentation is detected at the same time as the master shrinks the topology
to one node/
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Local node SEGMENTED: TcpDiscoveryNode
[id=20e118e5-cb8c-4139-902e-0e44f4a586c3, addrs=[127.0.0.1, 172.30.73.15],
sockAddrs=[replica-1.ignite-deployment.ignite-namespace.svc.cluster.local/172.30.73.15:47500,
/127.0.0.1:47500], discPort=47500, order=2, intOrder=2,
lastExchangeTime=1536826130700, loc=true, ver=2.6.1#20180906-sha1:8cf681db,
isClient=false]
sep 13, 2018 11:08:50 AM java.util.logging.LogManager$RootLogger log
SEVERE: Critical system error detected. Will be handled accordingly to
configured handler [hnd=class o.a.i.failure.StopNodeFailureHandler,
failureCtx=FailureContext [type=SEGMENTATION, err=null]]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger error
SEVERE: Ignite node is in invalid state due to a critical failure.
sep 13, 2018 11:08:50 AM java.util.logging.LogManager$RootLogger log
SEVERE: Stopping local node on Ignite failure: [failureCtx=FailureContext
[type=SEGMENTATION, err=null]]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Node FAILED: TcpDiscoveryNode
[id=ebf41e67-e115-4185-9478-92ffa0463b5e, addrs=[127.0.0.1, 172.30.53.18],
sockAddrs=[/172.30.53.18:47500, /127.0.0.1:47500], discPort=47500, order=1,
intOrder=1, lastExchangeTime=1536825520301, loc=false,
ver=2.6.1#20180906-sha1:8cf681db, isClient=false]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Topology snapshot [ver=4, servers=2, clients=0, CPUs=32,
offheap=48.0GB, heap=64.0GB]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO:   ^-- Node [id=20E118E5-CB8C-4139-902E-0E44F4A586C3,
clusterState=ACTIVE]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Data Regions Configured:
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO:   ^-- Default_Region [initSize=1,0 GiB, maxSize=24,0 GiB,
persistenceEnabled=false]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Command protocol successfully stopped: TCP binary
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Become new coordinator 20e118e5-cb8c-4139-902e-0e44f4a586c3
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Node FAILED: TcpDiscoveryNode
[id=46f906ab-175f-4055-817c-fab88c5abfb0, addrs=[127.0.0.1, 172.30.55.10],
sockAddrs=[/172.30.55.10:47500, /127.0.0.1:47500], discPort=47500, order=3,
intOrder=3, lastExchangeTime=1536825531578, loc=false,
ver=2.6.1#20180906-sha1:8cf681db, isClient=false]
sep 13, 2018 11:08:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Topology snapshot [ver=5, servers=1, clients=0, CPUs=16,
offheap=24.0GB, heap=32.0GB]

/And a little later/
>>> +------------------------------------------------------------------------------------------+
>>> Ignite ver.
>>> 2.6.1-SNAPSHOT#20180906-sha1:8cf681db120b41dd7935cda9fabbb501b6b05402
>>> stopped OK
>>> +------------------------------------------------------------------------------------------+
>>> Grid uptime: 00:10:18.337
=========================================================

The log messages look like network problems between replica-0 (master) and
replica-1 (the segmented, self-stopped slave), but I ran a connection test
(using watch nc -l on replica-1 and watch nc on replica-0) on the same pods
at the same time, and it did not show any long pauses.
Meanwhile, replica-2 (together with its co-located kube-proxy and kube-dns)
really was offline, and I do not know whether packets to its IP were refused
or dropped.
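
(For completeness: what I actually ran was nc, but the equivalent check can be
expressed as a small Java probe — a sketch, not the tool that ran:)

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ReachabilityProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Calling canConnect(...) in a loop with a timestamp is the Java equivalent of
the watch nc test.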

*Generally*, has anybody run into the same problem, and does anybody know how
to avoid it?
What can be wrong with my Kubernetes or Ignite configuration?
I can restart the segmented node, but that is a last resort; avoiding the
segmentation would be much better. (I have already tried the same test with
six replicas; four of them were deployed on nodes that were not rebooted, and
only one of them did not stop itself.)

*[1]* By default (Ignite 2.6.0), the node crashed with the unhelpful message
java.lang.IllegalStateException: Thread tcp-disco-srvr-#3 is terminated
unexpectedly
I manually backported the patch from the
https://issues.apache.org/jira/browse/IGNITE-8751 issue and instead got
Critical system error detected. Will be handled accordingly to configured
handler [hnd=class o.a.i.failure.StopNodeFailureHandler,
failureCtx=FailureContext [type=SEGMENTATION, err=null]]
followed by
Ignite node stopped OK
Building Ignite from source with that patch is why the log shows the
"2.6.1-SNAPSHOT" version.
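
For reference, the handler that the patched build uses (visible in the log as
o.a.i.failure.StopNodeFailureHandler) can also be set explicitly in the
configuration; a sketch:

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.failure.StopNodeFailureHandler;

public class FailureHandling {
    public static IgniteConfiguration withStopOnFailure() {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // On SEGMENTATION (and other critical failures), stop the local node
        // cleanly instead of dying with an unhandled thread termination.
        cfg.setFailureHandler(new StopNodeFailureHandler());
        return cfg;
    }
}
```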



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by Roman Guseinov <ro...@gromtech.ru>.
Hi,

According to the last logs, this is a different issue.

2018-09-28T05:25:03.598Z [RxComputationThreadPool-1]/W: Get: batch { keys: 1 
} failed, retrying. 
java.util.concurrent.TimeoutException 
        at 
io.reactivex.internal.operators.flowable.FlowableTimeoutTimed$TimeoutSubscriber.onTimeout(FlowableTimeoutTimed.java:137) 
        at 
io.reactivex.internal.operators.flowable.FlowableTimeoutTimed$TimeoutTask.run(FlowableTimeoutTimed.java:169) 
        at 
io.reactivex.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:66) 
        at 
io.reactivex.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:57) 
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) 
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) 
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
        at java.lang.Thread.run(Thread.java:748)

"io.reactivex" is definitely not a part of Apache Ignite. Also, it is not
clear what happens during the "GET /health" request.

I am not sure the last provided deployment files are enough to reproduce the
current issue, since they don't contain any references to external libs.
Could you share a reproducer (the actual YAML files, client tools, and a
step-by-step instruction)? Do you run a custom application with an embedded
Ignite node inside the Docker image? If so, could you also upload a simple
Maven project to GitHub?

Thanks.

Best Regards,
Roman




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ar...@gmail.com>.
Please ignore the ports. I don't think it matters in the end if connections
on these random ports are established as client sockets, as long as the
exposed server sockets use fixed ports.




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ar...@gmail.com>.
Here is where the random ports come from: both ServerImpl and ClientImpl ask
TcpDiscoverySpi for a socket, and it is implemented as `sock.bind(new
InetSocketAddress(locHost, 0));`

https://github.com/apache/ignite/blob/2.6.0/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.java#L1471

I'm sure this is fine in general, because, well, that's how you get a client
socket, but I'm not sure whether it plays well with Kubernetes, given how
strict it is about container ports. I do, however, expect k8s to only require
ingress ports to be exposed explicitly.
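
For illustration, binding to port 0 is the standard way to ask the OS for an
ephemeral client port; a minimal stdlib sketch of what that bind call does:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class EphemeralPortDemo {
    /** Binds an unconnected client socket the way TcpDiscoverySpi does:
        port 0 means the OS picks any free ephemeral port. */
    public static int bindEphemeral() throws IOException {
        try (Socket sock = new Socket()) {
            sock.bind(new InetSocketAddress("127.0.0.1", 0));
            return sock.getLocalPort(); // OS-assigned ephemeral port
        }
    }
}
```

On Linux the assigned port typically falls in net.ipv4.ip_local_port_range
(32768-60999 by default), which is exactly where numbers like 42504 and 43851
come from.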




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ar...@gmail.com>.
What bothers me after looking carefully at the logs is that there are ports
in use that I have not specified,

like this log entry (and others similar to it):

```
locAddr=/10.42.32.36:42504,
rmtAddr=ignite-0.ignite.buildcache.svc.cluster.local/10.42.40.84:47100
```

Note that the local port is `42504`, even though it is set to the default
47100 for communication and 47500 for discovery, and both port ranges are set
to `0`.

If `42504` and other odd ports like `43851` that I see in other logs are
actually in use and exposed for network communication, *that is problematic
in a Kubernetes environment*. In k8s, a container cannot currently expose a
range of ports; only a fixed set of predefined ports can be exposed.

Moreover, I don't understand where a port number like `42504` would come
from; it is not in any default range of Ignite ports, as far as I know.




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ar...@gmail.com>.
With the timeouts you've recommended, we are still observing sudden latency
spikes on client -> server communication, leading to timeouts and a restart
after just 2 minutes of running, *with no production load on the service*,
literally just the health checks running.

Here are logs from client pod:

```
$ kubectl logs -f -n buildcache buildcache-3 --previous

2018-09-28T05:23:31.231Z [main]/I: Starting the server...
2018-09-28T05:23:31.381Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=10, freeMemoryMb=4085)
Sep 28, 2018 5:23:31 AM java.util.logging.LogManager$RootLogger log
SEVERE: Failed to resolve default logging config file:
config/java.util.logging.properties
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: 

>>>    __________  ________________  
>>>   /  _/ ___/ |/ /  _/_  __/ __/  
>>>  _/ // (7 7    // /  / / / _/    
>>> /___/\___/_/|_/___/ /_/ /___/   
>>> 
>>> ver. 2.6.0#20180710-sha1:669feacc
>>> 2018 Copyright(C) Apache Software Foundation
>>> 
>>> Ignite documentation: http://ignite.apache.org

Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Config URL: n/a
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: IgniteConfiguration [igniteInstanceName=null, pubPoolSize=8,
svcPoolSize=8, callbackPoolSize=8, stripedPoolSize=8, sysPoolSize=8,
mgmtPoolSize=4, igfsPoolSize=8, dataStreamerPoolSize=8,
utilityCachePoolSize=8, utilityCacheKeepAliveTime=60000, p2pPoolSize=2,
qryPoolSize=8, igniteHome=null, igniteWorkDir=/tmp/ignite/work,
mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer@363ee3a2,
nodeId=395941f3-a9fd-41c9-8035-06b2fac1b55e,
marsh=org.apache.ignite.internal.binary.BinaryMarshaller@2898ac89,
marshLocJobs=false, daemon=false, p2pEnabled=false, netTimeout=5000,
sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000,
metricsUpdateFreq=2000, metricsExpTime=9223372036854775807,
discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=0, ackTimeout=0,
marsh=null, reconCnt=10, reconDelay=2000, maxAckTimeout=600000,
forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null],
segPlc=STOP, segResolveAttempts=2, waitForSegOnStart=true,
allResolversPassReq=true, segChkFreq=10000, commSpi=TcpCommunicationSpi
[connectGate=null, connPlc=null, enableForcibleNodeKill=false,
enableTroubleshootingLog=false,
srvLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2@6a2b953e,
locAddr=null, locHost=null, locPort=47100, locPortRange=0, shmemPort=-1,
directBuf=true, directSndBuf=false, idleConnTimeout=600000,
connTimeout=5000, maxConnTimeout=600000, reconCnt=10, sockSndBuf=32768,
sockRcvBuf=32768, msgQueueLimit=512, slowClientQueueLimit=0, nioSrvr=null,
shmemSrv=null, usePairedConnections=false, connectionsPerNode=1,
tcpNoDelay=true, filterReachableAddresses=false, ackSndThreshold=32,
unackedMsgsBufSize=0, sockWriteTimeout=2000, lsnr=null, boundTcpPort=-1,
boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0, addrRslvr=null,
ctxInitLatch=java.util.concurrent.CountDownLatch@313b2ea6[Count = 1],
stopping=false,
metricsLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationMetricsListener@2f465398],
evtSpi=org.apache.ignite.spi.eventstorage.NoopEventStorageSpi@548e6d58,
colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi [lsnr=null],
indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi@610f7aa,
addrRslvr=null, clientMode=true, rebalanceThreadPoolSize=1,
txCfg=org.apache.ignite.configuration.TransactionConfiguration@6a03bcb1,
cacheSanityCheckEnabled=true, discoStartupDelay=60000, deployMode=SHARED,
p2pMissedCacheSize=100, locHost=null, timeSrvPortBase=31100,
timeSrvPortRange=100, failureDetectionTimeout=60000,
clientFailureDetectionTimeout=30000, metricsLogFreq=60000, hadoopCfg=null,
connectorCfg=org.apache.ignite.configuration.ConnectorConfiguration@21b2e768,
odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration
[seqReserveSize=1000, cacheMode=PARTITIONED, backups=1, aff=null,
grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null,
binaryCfg=null, memCfg=null, pstCfg=null, dsCfg=DataStorageConfiguration
[sysRegionInitSize=41943040, sysCacheMaxSize=104857600, pageSize=8192,
concLvl=0, dfltDataRegConf=DataRegionConfiguration [name=Default_Region,
maxSize=52428800, initSize=52428800, swapPath=null,
pageEvictionMode=DISABLED, evictionThreshold=0.9, emptyPagesPoolSize=100,
metricsEnabled=false, metricsSubIntervalCount=5,
metricsRateTimeInterval=60000, persistenceEnabled=false,
checkpointPageBufSize=0], storagePath=null, checkpointFreq=180000,
lockWaitTime=10000, checkpointThreads=4, checkpointWriteOrder=SEQUENTIAL,
walHistSize=20, walSegments=10, walSegmentSize=67108864, walPath=db/wal,
walArchivePath=db/wal/archive, metricsEnabled=false, walMode=LOG_ONLY,
walTlbSize=131072, walBuffSize=0, walFlushFreq=2000, walFsyncDelay=1000,
walRecordIterBuffSize=67108864, alwaysWriteFullPages=false,
fileIOFactory=org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory@5609159b,
metricsSubIntervalCnt=5, metricsRateTimeInterval=60000,
walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false,
walCompactionEnabled=false], activeOnStart=true, autoActivation=true,
longQryWarnTimeout=3000, sqlConnCfg=null,
cliConnCfg=ClientConnectorConfiguration [host=null, port=10800,
portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true,
maxOpenCursorsPerConn=128, threadPoolSize=8, idleTimeout=0,
jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true, sslEnabled=false,
useIgniteSslCtxFactory=true, sslClientAuth=false, sslCtxFactory=null],
authEnabled=false, failureHnd=null, commFailureRslvr=null]
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Daemon mode: off
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: OS: Linux 4.17.17-200.fc28.x86_64 amd64
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: OS user: root
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: PID: 1
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Language runtime: Java Platform API Specification ver. 1.8
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: VM information: OpenJDK Runtime Environment 1.8.0_171-b11 Oracle
Corporation OpenJDK 64-Bit Server VM 25.171-b11
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: VM total memory: 4.0GB
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Remote Management [restart: off, REST: off, JMX (remote: off)]
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Logger: JavaLogger [quiet=false, config=null]
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: IGNITE_HOME=null
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: VM arguments: [-Xms4096M, -Xmx4096M, -XX:+PrintGC,
-XX:+PrintGCDateStamps, -XX:+UseG1GC, -XX:+AlwaysPreTouch,
-XX:+ScavengeBeforeFullGC, -XX:+DisableExplicitGC, -XX:MaxGCPauseMillis=30,
-XX:+UnlockExperimentalVMOptions, -XX:G1NewSizePercent=20,
-DIGNITE_QUIET=false, -Djava.net.preferIPv4Stack=true]
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: System cache's DataRegion size is configured to 40 MB. Use
DataStorageConfiguration.systemCacheMemorySize property to change the
setting.
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Configured caches [in 'sysMemPlc' dataRegion: ['ignite-sys-cache']]
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Configured plugins:
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO:   ^-- None
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: 
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Configured failure handler: [hnd=StopNodeOrHaltFailureHandler
[tryStop=false, timeout=0]]
2018-09-28T05:23:32.156+0000: [GC pause (Metadata GC Threshold) (young)
(initial-mark) 55M->11M(4096M), 0.0205314 secs]
2018-09-28T05:23:32.177+0000: [GC concurrent-root-region-scan-start]
2018-09-28T05:23:32.183+0000: [GC concurrent-root-region-scan-end, 0.0065217
secs]
2018-09-28T05:23:32.183+0000: [GC concurrent-mark-start]
2018-09-28T05:23:32.185+0000: [GC concurrent-mark-end, 0.0018376 secs]
2018-09-28T05:23:32.186+0000: [GC remark, 0.0040225 secs]
2018-09-28T05:23:32.190+0000: [GC cleanup 13M->13M(4096M), 0.0022399 secs]
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Successfully bound communication NIO server to TCP port [port=47100,
locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Checkpoints are disabled (to enable configure any GridCheckpointSpi
implementation)
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Collision resolution is disabled (all jobs will be activated upon
arrival).
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Security status [authentication=off, tls/ssl=off]
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Client connector processor has started on TCP port 10800
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: REST protocols do not start on client node. To start the protocols on
client node set '-DIGNITE_REST_START_ON_CLIENT=true' system property.
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Non-loopback local IPs: 10.42.32.36
Sep 28, 2018 5:23:32 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Enabled local MACs: 0A094ED0320C
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Started exchange init [topVer=AffinityTopologyVersion [topVer=632,
minorTopVer=0], crd=false, evt=NODE_JOINED,
evtNode=395941f3-a9fd-41c9-8035-06b2fac1b55e, customEvt=null,
allowMerge=true]
2018-09-28T05:23:33.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=107, freeMemoryMb=3988)
2018-09-28T05:23:33.488+0000: [GC pause (Metadata GC Threshold) (young)
(initial-mark) 114M->37M(4096M), 0.0533984 secs]
2018-09-28T05:23:33.541+0000: [GC concurrent-root-region-scan-start]
2018-09-28T05:23:33.571+0000: [GC concurrent-root-region-scan-end, 0.0297871
secs]
2018-09-28T05:23:33.571+0000: [GC concurrent-mark-start]
2018-09-28T05:23:33.573+0000: [GC concurrent-mark-end, 0.0020858 secs]
2018-09-28T05:23:33.574+0000: [GC remark, 0.0129509 secs]
2018-09-28T05:23:33.587+0000: [GC cleanup 42M->42M(4096M), 0.0034809 secs]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Started cache [name=ignite-sys-cache, id=-2100569601,
memoryPolicyName=sysMemPlc, mode=REPLICATED, atomicity=TRANSACTIONAL,
backups=2147483647]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:42504,
rmtAddr=ignite-0.ignite.buildcache.svc.cluster.local/10.42.40.84:47100]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Finished exchange init [topVer=AffinityTopologyVersion [topVer=632,
minorTopVer=0], crd=false]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Received full message, will finish exchange
[node=f47de522-5e60-49c8-9f2b-5a55254ffeba, resVer=AffinityTopologyVersion
[topVer=632, minorTopVer=0]]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Finish exchange future [startVer=AffinityTopologyVersion [topVer=632,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=632, minorTopVer=0],
err=null]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Performance suggestions for grid  (fix if possible)
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO:   ^-- Set max direct memory size if getting 'OOME: Direct buffer
memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Refer to this page for more performance suggestions:
https://apacheignite.readme.io/docs/jvm-and-system-tuning
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: 
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: 
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: 

>>> +----------------------------------------------------------------------+
>>> Ignite ver. 2.6.0#20180710-sha1:669feacc5d3a4e60efcdd300dc8de99780f38eed
>>> +----------------------------------------------------------------------+
>>> OS name: Linux 4.17.17-200.fc28.x86_64 amd64
>>> CPU(s): 8
>>> Heap: 4.0GB
>>> VM name: 1@buildcache-3.buildcache-service.buildcache.svc.cluster.local
>>> Local node [ID=395941F3-A9FD-41C9-8035-06B2FAC1B55E, order=632,
>>> clientMode=true]
>>> Local node addresses:
>>> [buildcache-3.buildcache-service.buildcache.svc.cluster.local/10.42.32.36,
>>> /127.0.0.1]
>>> Local ports: TCP:10800 TCP:47100 

Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Topology snapshot [ver=632, servers=12, clients=4, CPUs=128,
offheap=290.0GB, heap=64.0GB]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO:   ^-- Node [id=395941F3-A9FD-41C9-8035-06B2FAC1B55E,
clusterState=ACTIVE]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Data Regions Configured:
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO:   ^-- Default_Region [initSize=50.0 MiB, maxSize=50.0 MiB,
persistenceEnabled=false]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Cache write synchronization mode is set to FULL_ASYNC. All
single-key 'put' and 'remove' operations will return 'null', all 'putx' and
'removex' operations will return 'true' [cacheName=buildcache]
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Started cache [name=buildcache, id=-467288620,
memoryPolicyName=Default_Region, mode=PARTITIONED, atomicity=ATOMIC,
backups=0]
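
For readers following along: the FULL_ASYNC warning above is driven by the cache's write synchronization mode. Below is a minimal Java sketch of a cache configuration matching the parameters reported in the log (the application's real configuration is not shown in this thread, so the cache name and the FULL_ASYNC choice here only mirror what the log prints):

```java
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

// Mirrors the cache parameters reported above: PARTITIONED, ATOMIC,
// 0 backups, FULL_ASYNC write synchronization.
CacheConfiguration<String, byte[]> cacheCfg =
    new CacheConfiguration<String, byte[]>("buildcache")
        .setCacheMode(CacheMode.PARTITIONED)
        .setAtomicityMode(CacheAtomicityMode.ATOMIC)
        .setBackups(0)
        .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_ASYNC);
```

With FULL_ASYNC the node does not wait for remote acknowledgements of a write, which is why the warning notes that single-key 'put'/'remove' return 'null' and 'putx'/'removex' return 'true'.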
Sep 28, 2018 5:23:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:43284,
rmtAddr=ignite-11.ignite.buildcache.svc.cluster.local/10.42.20.184:47100]
2018-09-28 05:23:33.921:INFO::main: Logging initialized @4865ms to
org.eclipse.jetty.util.log.StdErrLog
2018-09-28 05:23:33.978:INFO:oejs.Server:main: jetty-9.4.z-SNAPSHOT; built:
2018-06-05T18:24:03.829Z; git: d5fc0523cfa96bfebfbda19606cad384d772f04c; jvm
1.8.0_171-b11
2018-09-28 05:23:34.301:INFO:oejs.AbstractConnector:main: Started
ServerConnector@113bebca{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2018-09-28 05:23:34.302:INFO:oejs.Server:main: Started @5245ms
2018-09-28T05:23:34.302Z [main]/I: Server has started in 3074 ms and is
ready to handle requests on port 8080.
2018-09-28T05:23:35.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=79, freeMemoryMb=4016)
2018-09-28T05:23:37.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=82, freeMemoryMb=4013)
2018-09-28T05:23:38.188Z [RxComputationThreadPool-4]/VERBOSE: ---> req 0 GET
/health
Sep 28, 2018 5:23:38 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:50798,
rmtAddr=ignite-5.ignite.buildcache.svc.cluster.local/10.42.25.137:47100]
2018-09-28T05:23:38.254Z [RxComputationThreadPool-8]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 29 ms
2018-09-28T05:23:38.263Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 0 200
GET /health , took 94.56 ms
2018-09-28T05:23:39.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=102, freeMemoryMb=3993)
2018-09-28T05:23:41.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=102, freeMemoryMb=3993)
2018-09-28T05:23:43.068Z [RxComputationThreadPool-2]/VERBOSE: ---> req 1 GET
/health
2018-09-28T05:23:43.161Z [RxComputationThreadPool-6]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 75 ms
2018-09-28T05:23:43.162Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 1 200
GET /health , took 94.16 ms
Sep 28, 2018 5:23:43 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Your version is up to date.
2018-09-28T05:23:43.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=105, freeMemoryMb=3990)
2018-09-28T05:23:45.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=106, freeMemoryMb=3989)
2018-09-28T05:23:47.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=106, freeMemoryMb=3989)
2018-09-28T05:23:48.068Z [RxComputationThreadPool-8]/VERBOSE: ---> req 2 GET
/health
Sep 28, 2018 5:23:48 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:53006,
rmtAddr=ignite-6.ignite.buildcache.svc.cluster.local/10.42.39.9:47100]
2018-09-28T05:23:48.100Z [RxComputationThreadPool-4]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 5 ms
2018-09-28T05:23:48.100Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 2 200
GET /health , took 32.90 ms
2018-09-28T05:23:49.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=108, freeMemoryMb=3987)
2018-09-28T05:23:50.090Z [RxComputationThreadPool-6]/VERBOSE: ---> req 3 GET
/health
Sep 28, 2018 5:23:50 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:49148,
rmtAddr=ignite-7.ignite.buildcache.svc.cluster.local/10.42.34.115:47100]
2018-09-28T05:23:50.109Z [RxComputationThreadPool-2]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 4 ms
2018-09-28T05:23:50.110Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 3 200
GET /health , took 20.23 ms
2018-09-28T05:23:51.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=110, freeMemoryMb=3985)
2018-09-28T05:23:52.090Z [RxComputationThreadPool-4]/VERBOSE: ---> req 4 GET
/health
Sep 28, 2018 5:23:52 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:46304,
rmtAddr=ignite-9.ignite.buildcache.svc.cluster.local/10.42.60.171:47100]
2018-09-28T05:23:52.119Z [RxComputationThreadPool-7]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 4 ms
2018-09-28T05:23:52.120Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 4 200
GET /health , took 30.09 ms
2018-09-28T05:23:53.068Z [RxComputationThreadPool-1]/VERBOSE: ---> req 5 GET
/health
Sep 28, 2018 5:23:53 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:35318,
rmtAddr=ignite-10.ignite.buildcache.svc.cluster.local/10.42.46.110:47100]
2018-09-28T05:23:53.078Z [RxComputationThreadPool-5]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 3 ms
2018-09-28T05:23:53.079Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 5 200
GET /health , took 11.45 ms
2018-09-28T05:23:53.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=113, freeMemoryMb=3982)
2018-09-28T05:23:54.089Z [RxComputationThreadPool-7]/VERBOSE: ---> req 6 GET
/health
Sep 28, 2018 5:23:54 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:54638,
rmtAddr=ignite-2.ignite.buildcache.svc.cluster.local/10.42.64.23:47100]
2018-09-28T05:23:54.099Z [RxComputationThreadPool-2]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 4 ms
2018-09-28T05:23:54.100Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 6 200
GET /health , took 10.43 ms
2018-09-28T05:23:55.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=115, freeMemoryMb=3980)
2018-09-28T05:23:56.089Z [RxComputationThreadPool-4]/VERBOSE: ---> req 7 GET
/health
2018-09-28T05:23:56.107Z [RxComputationThreadPool-8]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 2 ms
2018-09-28T05:23:56.108Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 7 200
GET /health , took 18.79 ms
2018-09-28T05:23:57.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=116, freeMemoryMb=3979)
2018-09-28T05:23:58.067Z [RxComputationThreadPool-2]/VERBOSE: ---> req 8 GET
/health
Sep 28, 2018 5:23:58 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:46570,
rmtAddr=ignite-3.ignite.buildcache.svc.cluster.local/10.42.40.12:47100]
2018-09-28T05:23:58.161Z [RxComputationThreadPool-6]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 76 ms
2018-09-28T05:23:58.161Z [RxComputationThreadPool-8]/VERBOSE: ---> req 9 GET
/health
2018-09-28T05:23:58.162Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 8 200
GET /health , took 94.79 ms
2018-09-28T05:23:58.177Z [RxComputationThreadPool-4]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 2 ms
2018-09-28T05:23:58.178Z [RxCachedThreadScheduler-2]/VERBOSE: <--- res 9 200
GET /health , took 88.32 ms
2018-09-28T05:23:59.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=118, freeMemoryMb=3977)
2018-09-28T05:24:00.089Z [RxComputationThreadPool-6]/VERBOSE: ---> req 10
GET /health
2018-09-28T05:24:00.097Z [RxComputationThreadPool-2]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 2 ms
2018-09-28T05:24:00.097Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 10
200 GET /health , took 8.14 ms
2018-09-28T05:24:01.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=119, freeMemoryMb=3976)
2018-09-28T05:24:02.089Z [RxComputationThreadPool-4]/VERBOSE: ---> req 11
GET /health
Sep 28, 2018 5:24:02 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:49956,
rmtAddr=ignite-8.ignite.buildcache.svc.cluster.local/10.42.20.199:47100]
[... res 11, routine /health checks 12-19 (all 200 OK) and periodic JvmState lines from 05:24:02 to 05:24:13 elided ...]
2018-09-28T05:24:14.089Z [RxComputationThreadPool-1]/VERBOSE: ---> req 20
GET /health
Sep 28, 2018 5:24:14 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:36994,
rmtAddr=ignite-1.ignite.buildcache.svc.cluster.local/10.42.43.251:47100]
2018-09-28T05:24:14.109Z [RxComputationThreadPool-5]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 4 ms
2018-09-28T05:24:14.109Z [RxCachedThreadScheduler-1]/VERBOSE: <--- res 20
200 GET /health , took 20.27 ms
2018-09-28T05:24:15.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=125, freeMemoryMb=3970)
2018-09-28T05:24:16.089Z [RxComputationThreadPool-7]/VERBOSE: ---> req 21
GET /health
2018-09-28T05:24:16.117Z [RxComputationThreadPool-3]/VERBOSE: Get: batch {
keys: 1, size: "0 B" } was processed, took 2 ms
2018-09-28T05:24:16.117Z [RxCachedThreadScheduler-2]/VERBOSE: <--- res 21
200 GET /health , took 28.43 ms
2018-09-28T05:24:17.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=126, freeMemoryMb=3969)
2018-09-28T05:24:18.067Z [RxComputationThreadPool-5]/VERBOSE: ---> req 22
GET /health
2018-09-28T05:24:18.089Z [RxComputationThreadPool-1]/VERBOSE: ---> req 23
GET /health
Sep 28, 2018 5:24:18 AM org.apache.ignite.logger.java.JavaLogger info
INFO: Established outgoing communication connection
[locAddr=/10.42.32.36:54172,
rmtAddr=ignite-4.ignite.buildcache.svc.cluster.local/10.42.58.57:47100]
[... res 22-23, routine /health checks 24-33 (all 200 OK) and periodic JvmState lines from 05:24:18 to 05:24:33 elided ...]
Sep 28, 2018 5:24:33 AM org.apache.ignite.logger.java.JavaLogger info
INFO: 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=395941f3, uptime=00:01:00.004]
    ^-- H/N/C [hosts=16, nodes=16, CPUs=128]
    ^-- CPU [cur=0.1%, avg=0.34%, GC=0%]
    ^-- PageMemory [pages=0]
    ^-- Heap [used=138MB, free=96.63%, comm=4096MB]
    ^-- Non heap [used=55MB, free=-1%, comm=56MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=0, qSize=0]
[... routine /health checks 34-53 (all 200 OK) and periodic JvmState lines from 05:24:34 to 05:25:02 elided ...]
2018-09-28T05:25:03.067Z [RxComputationThreadPool-5]/VERBOSE: ---> req 54
GET /health
2018-09-28T05:25:03.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=152, freeMemoryMb=3943)
2018-09-28T05:25:03.598Z [RxComputationThreadPool-1]/W: Get: batch { keys: 1
} failed, retrying.
java.util.concurrent.TimeoutException
	at
io.reactivex.internal.operators.flowable.FlowableTimeoutTimed$TimeoutSubscriber.onTimeout(FlowableTimeoutTimed.java:137)
	at
io.reactivex.internal.operators.flowable.FlowableTimeoutTimed$TimeoutTask.run(FlowableTimeoutTimed.java:169)
	at
io.reactivex.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:66)
	at
io.reactivex.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:57)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
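
The TimeoutException above (and its repeats below) comes from a client-side timeout around the cache get, which retries before giving up. A minimal sketch of that timeout-and-retry pattern using only java.util.concurrent (hypothetical helper names; the application itself uses RxJava's Flowable timeout operators, not this code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class RetryTimeoutSketch {
    // Runs the supplier, waiting at most timeoutMs for a result; on
    // TimeoutException, retries up to maxRetries more times, then rethrows.
    static <T> T getWithRetry(Supplier<T> op, long timeoutMs, int maxRetries)
            throws Exception {
        TimeoutException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Future<T> f = CompletableFuture.supplyAsync(op);
            try {
                return f.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                f.cancel(true);
                last = e;            // logged as "failed, retrying."
            }
        }
        throw last;                  // logged as "failed, took ... ms."
    }

    public static void main(String[] args) throws Exception {
        // A fast operation succeeds on the first attempt.
        String ok = getWithRetry(() -> "cached-value", 500, 2);
        System.out.println("result=" + ok);

        // An operation that keeps blocking exhausts its retries and the
        // TimeoutException surfaces, as in the log above.
        try {
            getWithRetry(() -> {
                try { Thread.sleep(1_000); } catch (InterruptedException ie) { }
                return "never";
            }, 100, 2);
        } catch (TimeoutException e) {
            System.out.println("gave up after retries");
        }
    }
}
```

The two "failed, retrying." warnings followed by one "failed, took 1506 ms." error in the log are consistent with this shape: a fixed per-attempt timeout and a small retry budget before the failure is surfaced to the health check.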
2018-09-28T05:25:04.090Z [RxComputationThreadPool-5]/VERBOSE: ---> req 55
GET /health
2018-09-28T05:25:04.099Z [RxComputationThreadPool-3]/W: Get: batch { keys: 1
} failed, retrying.
java.util.concurrent.TimeoutException
	(stack trace identical to the first occurrence above)
2018-09-28T05:25:04.601Z [RxComputationThreadPool-1]/E: Get: batch { keys: 1
} failed, took 1506 ms.
java.util.concurrent.TimeoutException
	(stack trace identical to the first occurrence above)
2018-09-28T05:25:04.602Z [RxComputationThreadPool-1]/W: Cache is not healthy
java.util.concurrent.TimeoutException
	(stack trace identical to the first occurrence above)
2018-09-28T05:25:04.615Z [RxCachedThreadScheduler-2]/W: <--- res 54 500 GET
/health  Not OK, took 1547.83 ms
2018-09-28T05:25:04.661Z [RxComputationThreadPool-3]/W: Get: batch { keys: 1
} failed, retrying.
java.util.concurrent.TimeoutException
	at
io.reactivex.internal.operators.flowable.FlowableTimeoutTimed$TimeoutSubscriber.onTimeout(FlowableTimeoutTimed.java:137)
	at
io.reactivex.internal.operators.flowable.FlowableTimeoutTimed$TimeoutTask.run(FlowableTimeoutTimed.java:169)
	at
io.reactivex.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:66)
	at
io.reactivex.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:57)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2018-09-28T05:25:05.162Z [RxComputationThreadPool-5]/W: Get: batch { keys: 1
} failed, retrying.
java.util.concurrent.TimeoutException
	at
io.reactivex.internal.operators.flowable.FlowableTimeoutTimed$TimeoutSubscriber.onTimeout(FlowableTimeoutTimed.java:137)
	at
io.reactivex.internal.operators.flowable.FlowableTimeoutTimed$TimeoutTask.run(FlowableTimeoutTimed.java:169)
	at
io.reactivex.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:66)
	at
io.reactivex.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:57)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2018-09-28T05:25:05.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=155, freeMemoryMb=3940)
2018-09-28T05:25:05.420Z [Thread-12]/I: Shutting down the server...
2018-09-28
05:25:05.462:INFO:oejs.AbstractConnector:RxCachedThreadScheduler-2: Stopped
ServerConnector@113bebca{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
2018-09-28T05:25:07.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=158, freeMemoryMb=3937)
2018-09-28T05:25:09.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=158, freeMemoryMb=3937)
2018-09-28T05:25:11.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=158, freeMemoryMb=3937)
2018-09-28T05:25:13.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=158, freeMemoryMb=3937)
2018-09-28T05:25:15.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=158, freeMemoryMb=3937)
2018-09-28T05:25:17.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=158, freeMemoryMb=3937)
2018-09-28T05:25:19.367Z [RxComputationThreadPool-1]/I: JvmState =
JvmState(totalMemoryMb=4096, usedMemoryMb=158, freeMemoryMb=3937)
2018-09-28T05:25:20.569Z [Thread-12]/I: Successfully shut down the server in
15149 ms
```

I doubt there is network throttling or anything like that: CPU is idle, gigabytes
of heap are free, and GC is clearly not active at this moment.

I believe it's a bug in Ignite; I'm out of ideas beyond that.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ar...@gmail.com>.
Sure, I can try that.

However, just so we're on the same page: high timeouts usually hide errors and
let the system go into a degraded state while still accepting client requests,
so if something actually goes wrong we can't gracefully recover after, say, 59
seconds compared to 10 or less.

I'll get back to you with the results using the timeouts you've suggested 👍




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by Roman Guseinov <ro...@gromtech.ru>.
Hi Artem,

I see quite small timeouts in your configurations. Could you try increasing
them and reproducing the issue?

I suggest trying the following values:
* networkTimeout = 15000
* failureDetectionTimeout = 60000
* joinTimeout = 120000
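In the server XML posted earlier in the thread, those values would land roughly here (a sketch showing only the relevant properties: networkTimeout and joinTimeout are TcpDiscoverySpi properties, while failureDetectionTimeout sits on IgniteConfiguration):

```
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="failureDetectionTimeout" value="60000"/>

    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <property name="networkTimeout" value="15000"/>
            <property name="joinTimeout" value="120000"/>
            <!-- ipFinder and the rest of the SPI config stay unchanged -->
        </bean>
    </property>
</bean>
```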

Best Regards,
Roman




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ar...@gmail.com>.
As I noted before, we see these issues on smaller deployments too.

For example, right now I'm running a configuration with 2 Ignite server pods
and 4 Ignite client pods, and we see both of the symptoms mentioned: sudden
cache-read latency spikes on clients for some requests, and a topology split
between just two Ignite servers.




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ar...@gmail.com>.
Sure, my bad.

For context, we run it as a distributed key/value store and don't run any
computations or SQL on it.

*Ignite XML config for the server pods:*

```
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="failureDetectionTimeout" value="10000"/>
        <property name="metricsLogFrequency" value="5000"/>

        
        <property name="segmentationPolicy" value="NOOP"/>

        <property name="discoverySpi">
            <bean
class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    
                    <bean
class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
                        <property name="namespace" value="buildcache"/>
                        <property name="serviceName" value="ignite"/>
                    </bean>
                </property>

                
                <property name="joinTimeout" value="10000" />
            </bean>
        </property>

        <property name="communicationSpi">
            <bean
class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                
                <property name="messageQueueLimit" value="1024"/>

                
                <property name="slowClientQueueLimit" value="512"/>
            </bean>
        </property>

        <property name="dataStorageConfiguration">
          <bean
class="org.apache.ignite.configuration.DataStorageConfiguration">
            
            <property name="pageSize" value="8192"/>

            <property name="defaultDataRegionConfiguration">
              <bean
class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="name" value="Default_Region"/>
                
                <property name="initialSize" value="#{24L * 1024 * 1024 *
1024}"/>
                
                <property name="maxSize" value="#{24L * 1024 * 1024 *
1024}"/>
                
                <property name="pageEvictionMode" value="RANDOM_2_LRU"/>
                
                <property name="emptyPagesPoolSize" value="25600"/>
                
                <property name="metricsEnabled" value="true"/>
              </bean>
            </property>
          </bean>
        </property>

        <property name="cacheConfiguration">
            <bean
class="org.apache.ignite.configuration.CacheConfiguration">
                <property name="name" value="buildcache"/>
                <property name="dataRegionName" value="Default_Region"/>

                
                <property name="cacheMode" value="PARTITIONED"/>

                
                <property name="writeSynchronizationMode"
value="FULL_ASYNC"/>
            </bean>
        </property>
    </bean>
</beans>
```

*Ignite Client XML config* (embedded into a JVM app that acts as an HTTP
server and reads/writes key/value pairs from Ignite):

```
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
            http://www.springframework.org/schema/beans
            http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        
        <property name="clientMode" value="true"/>

        <property name="failureDetectionTimeout" value="10000"/>

        <property name="discoverySpi">
            <bean
class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    
                    <bean
class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
                        <property name="namespace" value="buildcache"/>
                        <property name="serviceName" value="ignite"/>
                    </bean>
                </property>

                
                <property name="joinTimeout" value="4500" />
                
                <property name="networkTimeout" value="4000" />
            </bean>
        </property>

        <property name="communicationSpi">
            <bean
class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                
                <property name="messageQueueLimit" value="512"/>
            </bean>
        </property>

        <property name="dataStorageConfiguration">
            <bean
class="org.apache.ignite.configuration.DataStorageConfiguration">
                
                <property name="pageSize" value="8192"/>

                <property name="defaultDataRegionConfiguration">
                    
                    <bean
class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <property name="name" value="Default_Region"/>
                        
                        <property name="initialSize" value="#{50 * 1024 *
1024}"/>
                        
                        <property name="maxSize" value="#{50 * 1024 *
1024}"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>
```

*Kubernetes Ignite StatefulSet*

```
apiVersion: v1
kind: Service
metadata:
  name: ignite
  namespace: buildcache
spec:
  selector:
    app: ignite-node
  type: LoadBalancer
  ports:
  - name: thinclients
    port: 10900
  - name: spicommunication
    port: 47100
  - name: discoveryspi
    port: 47500
---
apiVersion: apps/v1
kind: StatefulSet # StatefulSet allows us to start/stop pods one by one
                  # (important for Ignite) and to add disk persistence in future.
metadata:
  name: ignite
  namespace: buildcache
spec:
  replicas: 12
  selector:
    matchLabels:
      app: ignite-node
  serviceName: ignite
  template:
    metadata:
      labels:
        app: ignite-node
    spec:
      serviceAccountName: ignite
      volumes:
      - name: config
        configMap:
          name: ignite-config
          items:
          - key: ignite-server-xml
            path: ignite-server-config.xml
      containers:
      - name: ignite-node
        image: apacheignite/ignite:2.6.0
        env:
        - name: OPTION_LIBS
          value: "ignite-kubernetes,ignite-rest-http"
        - name: JVM_OPTS
          # See https://apacheignite.readme.io/docs/jvm-and-system-tuning
          # 24 GB is actual data limit (off heap), see ignite-config-map.
          # 25 GB as JVM off-heap limit (JVM can also use off-heap for its needs).
          # 4 GB as heap limit for intermediate operations (normally Ignite uses
          # about 500 MB, but can jump to 1 GB+ on topology change or load spike).
          # 30 GB as container limit (+1 GB of expected normal memory use just in case).
          value: "-server -Djava.net.preferIPv4Stack=true
-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap
-XX:MaxDirectMemorySize=25g -Xms4g -Xmx4g -XX:+UseG1GC -XX:+AlwaysPreTouch
-XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC -XX:MaxGCPauseMillis=50
-XX:G1NewSizePercent=20"
        - name: CONFIG_URI
          value: /config/ignite-server-config.xml
        - name: IGNITE_QUIET
          value: "false"
        resources:
          requests:
            memory: "30G"
            cpu: 1
          limits:
            memory: "30G"
            cpu: 2
        volumeMounts:
        - mountPath: /config
          name: config
        ports:
        - containerPort: 8080  # Ignite REST API.
        - containerPort: 47100 # Ignite communication SPI port.
        - containerPort: 47500 # Ignite discovery SPI port number.
        - containerPort: 10900 # Ignite Thin Clients Protocol.
        readinessProbe:
          httpGet:
            path: /ignite?cmd=top
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 1
          periodSeconds: 5
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /ignite?cmd=top
            port: 8080
          initialDelaySeconds: 20
          timeoutSeconds: 1
          periodSeconds: 2
          failureThreshold: 1
```

The Java service that acts as an Ignite client has similar GC tuning options,
and we run 4 client instances at the moment.

So far I see two problems with this deployment:

1) Ignite clients have sudden latency spikes for a subset of cache reads while
other reads are served fine. GC doesn't seem to be the issue, as I've not
noticed it in the logs and the heap has gigabytes available at that time.

2) Ignite server nodes fail one by one because they think a topology
segmentation has occurred. You can see in the log below that there is plenty
of Java heap and the CPU is doing fine on the Ignite nodes.

At this point I think it's either a networking problem (but we do allow the
Ignite ports 47100, 47500, and 10900) or a bug in Ignite 2.6.0.

```
[19:12:12,527][INFO][grid-timeout-worker-#23][IgniteKernal] FreeList
[name=null, buckets=256, dataPages=36186, reusePages=702]
[19:12:13,852][INFO][tcp-disco-sock-reader-#116][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.46.176:56691, rmtPort=56691
[19:12:17,534][INFO][grid-timeout-worker-#23][IgniteKernal] 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=8a06dcdd, uptime=01:07:04.517]
    ^-- H/N/C [hosts=7, nodes=7, CPUs=56]
    ^-- CPU [cur=1.97%, avg=1.2%, GC=0.1%]
    ^-- PageMemory [pages=2824373]
    ^-- Heap [used=511MB, free=87.51%, comm=4096MB]
    ^-- Non heap [used=75MB, free=-1%, comm=77MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=2, qSize=0]
    ^-- System thread pool [active=0, idle=8, qSize=0]
[19:12:17,534][INFO][grid-timeout-worker-#23][IgniteKernal] FreeList
[name=null, buckets=256, dataPages=35867, reusePages=702]
[19:12:22,536][INFO][grid-timeout-worker-#23][IgniteKernal] 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=8a06dcdd, uptime=01:07:09.517]
    ^-- H/N/C [hosts=7, nodes=7, CPUs=56]
    ^-- CPU [cur=0.6%, avg=1.2%, GC=0%]
    ^-- PageMemory [pages=2824373]
    ^-- Heap [used=658MB, free=83.92%, comm=4096MB]
    ^-- Non heap [used=75MB, free=-1%, comm=77MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=8, qSize=0]
[19:12:22,536][INFO][grid-timeout-worker-#23][IgniteKernal] FreeList
[name=null, buckets=256, dataPages=36632, reusePages=702]
[19:12:24,699][INFO][tcp-comm-worker-#1][TcpDiscoverySpi] Pinging node:
74330756-9301-401b-bd7f-7d840e921d04
[19:12:24,700][INFO][tcp-comm-worker-#1][TcpDiscoverySpi] Finished node ping
[nodeId=74330756-9301-401b-bd7f-7d840e921d04, res=true, time=5ms]
[19:12:24,957][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.66.84, rmtPort=60023]
[19:12:24,957][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.66.84, rmtPort=60023]
[19:12:24,957][INFO][tcp-disco-sock-reader-#117][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.66.84:60023, rmtPort=60023]
[19:12:24,958][WARNING][tcp-disco-msg-worker-#2][TcpDiscoverySpi] Node is
out of topology (probably, due to short-time network problems).
[19:12:24,958][INFO][tcp-disco-sock-reader-#117][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.66.84:60023, rmtPort=60023
[19:12:24,959][WARNING][disco-event-worker-#41][GridDiscoveryManager] Local
node SEGMENTED: TcpDiscoveryNode [id=8a06dcdd-3d20-4625-b23c-f7d33a2cfceb,
addrs=[10.42.32.222, 127.0.0.1],
sockAddrs=[ignite-0.ignite.buildcache.svc.cluster.local/10.42.32.222:47500,
/127.0.0.1:47500], discPort=47500, order=1, intOrder=1,
lastExchangeTime=1537989144955, loc=true, ver=2.6.0#20180710-sha1:669feacc,
isClient=false]
[19:12:24,960][SEVERE][tcp-disco-srvr-#3][] Critical system error detected.
Will be handled accordingly to configured handler [hnd=class
o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext
[type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Thread
tcp-disco-srvr-#3 is terminated unexpectedly.]]
java.lang.IllegalStateException: Thread tcp-disco-srvr-#3 is terminated
unexpectedly.
	at
org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServer.body(ServerImpl.java:5686)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[19:12:24,961][SEVERE][tcp-disco-srvr-#3][] JVM will be halted immediately
due to the failure: [failureCtx=FailureContext
[type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Thread
tcp-disco-srvr-#3 is terminated unexpectedly.]]
```




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by "Maxim.Pudov" <pu...@gmail.com>.
What about logs from the other nodes? This one doesn't tell much. Could you
share your Kubernetes config?




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ce...@artemzin.com>.
Never mind; I tested by starting the Ignite pods one by one and then the
clients one by one, and it still fails after a few minutes of run-time :(

I guess there is something we're missing that happens in k8s setups.




Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ce...@artemzin.com>.
I believe there is a detail about our setup that might be important: we start
multiple (12) Ignite server pods via a k8s Deployment, and by default a k8s
Deployment starts pods in parallel with each other.

I've read somewhere, and heard Denis Magda say
(https://youtu.be/zqzOgdJjmtw?t=775), that starting Ignite nodes in parallel
is a bad idea, as it might cause cluster segmentation.

However, I don't see that being mentioned, nor any special rolling-update
configuration being applied, in the official docs and Ignite + k8s examples:
https://apacheignite.readme.io/docs/stateless-deployment

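For what it's worth, a StatefulSet with the default OrderedReady pod management policy makes Kubernetes start pods one at a time, each only after the previous one is Running and Ready, so sequential Ignite startup can be enforced that way. A minimal sketch (names and labels are illustrative):

```
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ignite
spec:
  replicas: 12
  serviceName: ignite
  # OrderedReady (the default for StatefulSets) starts pod N+1 only after
  # pod N is Running and Ready, so Ignite nodes join one by one.
  podManagementPolicy: OrderedReady
  selector:
    matchLabels:
      app: ignite-node
  template:
    metadata:
      labels:
        app: ignite-node
    spec:
      containers:
      - name: ignite-node
        image: apacheignite/ignite:2.6.0
```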



Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by artem_zin <ce...@artemzin.com>.
We're observing something *very similar* in our setup.

Setup:

- k8s cluster running on AWS
- Ignite deployment (official Ignite 2.6.0 docker image), server mode,
stores key/values in memory (12 pods)
- Java service deployment (embeds Ignite 2.6.0), client mode, talks to
Ignite deployment

Several times a week (sometimes several times a day) we observe Ignite cluster
segmentation; it crashes the Ignite servers and hangs connections on the Java
service side.

It can happen without any activity, so it is very unlikely to be related to
long GC pauses. In fact, I was able to reproduce it in an isolated setup where
only the k8s health checks were touching the cluster (in a health check I try
to read a value out of an Ignite cache).

The worst part is that this cascades and brings the whole cluster down; in
fact, the client nodes become split as well, and I often observe at least two
Ignite clusters instead of one (I expose a /metrics endpoint that looks into
the Ignite topology).
Here is typical picture:

```
kubectl get pods -n somenamespace
NAME                                             READY     STATUS   
RESTARTS   AGE
java-service-5494f775c4-fxb7j                    1/1       Running   1         
13m
java-service-5494f775c4-lrfks                    1/1       Running   3         
13m
ignite-78c4598c97-28fb6                          1/1       Running   0         
15m
ignite-78c4598c97-58dk4                          1/1       Running   1         
15m
ignite-78c4598c97-6nqpv                          1/1       Running   0         
15m
ignite-78c4598c97-c8sfb                          1/1       Running   1         
15m
ignite-78c4598c97-j5nlx                          1/1       Running   0         
15m
ignite-78c4598c97-jr2s8                          1/1       Running   0         
15m
ignite-78c4598c97-k6xh9                          1/1       Running   0         
15m
ignite-78c4598c97-rnmf8                          1/1       Running   0         
15m
ignite-78c4598c97-sqf6j                          1/1       Running   0         
15m
ignite-78c4598c97-v5b6b                          1/1       Running   0         
15m
ignite-78c4598c97-wqr49                          1/1       Running   1         
15m
ignite-78c4598c97-zxm24                          1/1       Running   0         
15m
```

You can see some of the Ignite deployment pods being restarted (because the
default behavior is to restart on critical failure), and the clients are
restarted because Ignite operations just time out and cause the k8s health
check to fail.

The only way for us to bring it back up (temporarily, until it splits again)
is to scale the whole deployment down to 0 replicas, losing all data, and then
scale it up again; otherwise we only make things worse by bringing up more
pods.

We run on AWS with all pods in the same region, no Ignite locks are used in
our code (just key/value reads and writes), and Ignite's standard
TcpDiscoveryKubernetesIpFinder is used for autodiscovery.

I'm happy to provide more info if needed; there is nothing super secret about
our service or our use of Ignite :)

Below is a full log of an Ignite pod being restarted without production load,
with only one client running, after just a few minutes of deployment run-time:

```
Ignite Command Line Startup, ver. 2.6.0#20180710-sha1:669feacc
2018 Copyright(C) Apache Software Foundation

[21:53:43,075][INFO][main][IgniteKernal]

>>>    __________  ________________
>>>   /  _/ ___/ |/ /  _/_  __/ __/
>>>  _/ // (7 7    // /  / / / _/
>>> /___/\___/_/|_/___/ /_/ /___/
>>>
>>> ver. 2.6.0#20180710-sha1:669feacc
>>> 2018 Copyright(C) Apache Software Foundation
>>>
>>> Ignite documentation: http://ignite.apache.org

[21:53:43,076][INFO][main][IgniteKernal] Config URL:
file:/config/ignite-server-config.xml
[21:53:43,090][INFO][main][IgniteKernal] IgniteConfiguration
[igniteInstanceName=null, pubPoolSize=8, svcPoolSize=8, callbackPoolSize=8,
stripedPoolSize=8, sysPoolSize=8, mgmtPoolSize=4, igfsPoolSize=8,
dataStreamerPoolSize=8, utilityCachePoolSize=8,
utilityCacheKeepAliveTime=60000, p2pPoolSize=2, qryPoolSize=8,
igniteHome=/opt/ignite/apache-ignite-fabric,
igniteWorkDir=/opt/ignite/apache-ignite-fabric/work,
mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer@6f94fa3e,
nodeId=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8,
marsh=org.apache.ignite.internal.binary.BinaryMarshaller@3eb25e1a,
marshLocJobs=false, daemon=false, p2pEnabled=false, netTimeout=5000,
sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000,
metricsUpdateFreq=2000, metricsExpTime=9223372036854775807,
discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=0, ackTimeout=0,
marsh=null, reconCnt=10, reconDelay=2000, maxAckTimeout=600000,
forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null],
segPlc=STOP, segResolveAttempts=2, waitForSegOnStart=true,
allResolversPassReq=true, segChkFreq=10000, commSpi=TcpCommunicationSpi
[connectGate=null, connPlc=null, enableForcibleNodeKill=false,
enableTroubleshootingLog=false,
srvLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2@3eb738bb,
locAddr=null, locHost=null, locPort=47100, locPortRange=100, shmemPort=-1,
directBuf=true, directSndBuf=false, idleConnTimeout=600000,
connTimeout=5000, maxConnTimeout=600000, reconCnt=10, sockSndBuf=32768,
sockRcvBuf=32768, msgQueueLimit=1024, slowClientQueueLimit=512,
nioSrvr=null, shmemSrv=null, usePairedConnections=false,
connectionsPerNode=1, tcpNoDelay=true, filterReachableAddresses=false,
ackSndThreshold=32, unackedMsgsBufSize=0, sockWriteTimeout=2000, lsnr=null,
boundTcpPort=-1, boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0,
addrRslvr=null,
ctxInitLatch=java.util.concurrent.CountDownLatch@5bda8e08[Count = 1],
stopping=false,
metricsLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationMetricsListener@1e800aaa],
evtSpi=org.apache.ignite.spi.eventstorage.NoopEventStorageSpi@185a6e9,
colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi [lsnr=null],
indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi@3cd3e762,
addrRslvr=null, clientMode=false, rebalanceThreadPoolSize=1,
txCfg=org.apache.ignite.configuration.TransactionConfiguration@1fa121e2,
cacheSanityCheckEnabled=true, discoStartupDelay=60000, deployMode=SHARED,
p2pMissedCacheSize=100, locHost=null, timeSrvPortBase=31100,
timeSrvPortRange=100, failureDetectionTimeout=10000,
clientFailureDetectionTimeout=30000, metricsLogFreq=60000, hadoopCfg=null,
connectorCfg=org.apache.ignite.configuration.ConnectorConfiguration@7eac9008,
odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration
[seqReserveSize=1000, cacheMode=PARTITIONED, backups=1, aff=null,
grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null,
binaryCfg=null, memCfg=null, pstCfg=null, dsCfg=DataStorageConfiguration
[sysRegionInitSize=41943040, sysCacheMaxSize=104857600, pageSize=8192,
concLvl=0, dfltDataRegConf=DataRegionConfiguration [name=Default_Region,
maxSize=25769803776, initSize=25769803776, swapPath=null,
pageEvictionMode=RANDOM_2_LRU, evictionThreshold=0.9,
emptyPagesPoolSize=25600, metricsEnabled=true, metricsSubIntervalCount=5,
metricsRateTimeInterval=60000, persistenceEnabled=false,
checkpointPageBufSize=0], storagePath=null, checkpointFreq=180000,
lockWaitTime=10000, checkpointThreads=4, checkpointWriteOrder=SEQUENTIAL,
walHistSize=20, walSegments=10, walSegmentSize=67108864, walPath=db/wal,
walArchivePath=db/wal/archive, metricsEnabled=false, walMode=LOG_ONLY,
walTlbSize=131072, walBuffSize=0, walFlushFreq=2000, walFsyncDelay=1000,
walRecordIterBuffSize=67108864, alwaysWriteFullPages=false,
fileIOFactory=org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory@62ddbd7e,
metricsSubIntervalCnt=5, metricsRateTimeInterval=60000,
walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false,
walCompactionEnabled=false], activeOnStart=true, autoActivation=true,
longQryWarnTimeout=3000, sqlConnCfg=null,
cliConnCfg=ClientConnectorConfiguration [host=null, port=10800,
portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true,
maxOpenCursorsPerConn=128, threadPoolSize=8, idleTimeout=0,
jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true, sslEnabled=false,
useIgniteSslCtxFactory=true, sslClientAuth=false, sslCtxFactory=null],
authEnabled=false, failureHnd=null, commFailureRslvr=null]
[21:53:43,090][INFO][main][IgniteKernal] Daemon mode: off
[21:53:43,090][INFO][main][IgniteKernal] OS: Linux 4.17.17-200.fc28.x86_64
amd64
[21:53:43,091][INFO][main][IgniteKernal] OS user: root
[21:53:43,091][INFO][main][IgniteKernal] PID: 97
[21:53:43,091][INFO][main][IgniteKernal] Language runtime: Java Platform API
Specification ver. 1.8
[21:53:43,091][INFO][main][IgniteKernal] VM information: OpenJDK Runtime
Environment 1.8.0_171-b11 Oracle Corporation OpenJDK 64-Bit Server VM
25.171-b11
[21:53:43,092][INFO][main][IgniteKernal] VM total memory: 3.0GB
[21:53:43,092][INFO][main][IgniteKernal] Remote Management [restart: on,
REST: on, JMX (remote: on, port: 49112, auth: off, ssl: off)]
[21:53:43,093][INFO][main][IgniteKernal] Logger: JavaLogger [quiet=false,
config=null]
[21:53:43,093][INFO][main][IgniteKernal]
IGNITE_HOME=/opt/ignite/apache-ignite-fabric
[21:53:43,093][INFO][main][IgniteKernal] VM arguments:
[-Djava.net.preferIPv4Stack=true, -XX:+UnlockExperimentalVMOptions,
-XX:+UseCGroupMemoryLimitForHeap, -XX:MaxDirectMemorySize=25g, -Xms3g,
-Xmx3g, -XX:+UseG1GC, -XX:+AlwaysPreTouch, -XX:+ScavengeBeforeFullGC,
-XX:+DisableExplicitGC, -DIGNITE_QUIET=false,
-DIGNITE_SUCCESS_FILE=/opt/ignite/apache-ignite-fabric/work/ignite_success_09f322c2-7864-444b-96fa-3dfbb5eef466,
-Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.port=49112,
-Dcom.sun.management.jmxremote.authenticate=false,
-Dcom.sun.management.jmxremote.ssl=false,
-DIGNITE_HOME=/opt/ignite/apache-ignite-fabric,
-DIGNITE_PROG_NAME=/opt/ignite/apache-ignite-fabric/bin/ignite.sh]
[21:53:43,094][INFO][main][IgniteKernal] System cache's DataRegion size is
configured to 40 MB. Use DataStorageConfiguration.systemCacheMemorySize
property to change the setting.
[21:53:43,094][INFO][main][IgniteKernal] Configured caches [in 'sysMemPlc'
dataRegion: ['ignite-sys-cache'], in 'Default_Region' dataRegion:
['buildcache']]
[21:53:43,097][INFO][main][IgniteKernal] 3-rd party licenses can be found
at: /opt/ignite/apache-ignite-fabric/libs/licenses
[21:53:43,161][INFO][main][IgnitePluginProcessor] Configured plugins:
[21:53:43,161][INFO][main][IgnitePluginProcessor]   ^-- None
[21:53:43,161][INFO][main][IgnitePluginProcessor]
[21:53:43,162][INFO][main][FailureProcessor] Configured failure handler:
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0]]
[21:53:43,199][INFO][main][TcpCommunicationSpi] Successfully bound
communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0,
selectorsCnt=4, selectorSpins=0, pairedConn=false]
[21:53:43,215][WARNING][main][NoopCheckpointSpi] Checkpoints are disabled
(to enable configure any GridCheckpointSpi implementation)
[21:53:43,236][WARNING][main][GridCollisionManager] Collision resolution is
disabled (all jobs will be activated upon arrival).
[21:53:43,236][INFO][main][IgniteKernal] Security status
[authentication=off, tls/ssl=off]
[21:53:43,497][INFO][main][ClientListenerProcessor] Client connector
processor has started on TCP port 10800
[21:53:43,534][INFO][main][GridTcpRestProtocol] Command protocol
successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211]
[21:53:43,565][INFO][main][IgniteKernal] Non-loopback local IPs: 10.42.69.5
[21:53:43,565][INFO][main][IgniteKernal] Enabled local MACs: 0AF321998484
[21:53:43,606][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port
[port=47500, localHost=0.0.0.0/0.0.0.0,
locNodeId=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8]
[21:53:44,011][WARNING][main][IgniteKernal] Nodes started on local machine
require more than 20% of physical RAM what can lead to significant slowdown
due to swapping (please decrease JVM heap size, data region size or
checkpoint buffer size) [required=27748MB, available=31366MB]
[21:53:44,059][INFO][exchange-worker-#42][time] Started exchange init
[topVer=AffinityTopologyVersion [topVer=1, minorTopVer=0], crd=true,
evt=NODE_JOINED, evtNode=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8,
customEvt=null, allowMerge=false]
[21:53:44,399][INFO][exchange-worker-#42][GridCacheProcessor] Started cache
[name=ignite-sys-cache, id=-2100569601, memoryPolicyName=sysMemPlc,
mode=REPLICATED, atomicity=TRANSACTIONAL, backups=2147483647]
[21:53:44,402][WARNING][exchange-worker-#42][GridCacheProcessor] Cache write
synchronization mode is set to FULL_ASYNC. All single-key 'put' and 'remove'
operations will return 'null', all 'putx' and 'removex' operations will
return 'true' [cacheName=buildcache]
[21:53:44,466][INFO][exchange-worker-#42][GridCacheProcessor] Started cache
[name=buildcache, id=-467288620, memoryPolicyName=Default_Region,
mode=PARTITIONED, atomicity=ATOMIC, backups=0]
[21:53:44,503][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=1, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:53:44,505][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=1, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:53:44,683][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=1,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=1, minorTopVer=0]]
[21:53:44,705][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finish exchange future [startVer=AffinityTopologyVersion [topVer=1,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=1, minorTopVer=0],
err=null]
[21:53:44,714][INFO][exchange-worker-#42][time] Finished exchange init
[topVer=AffinityTopologyVersion [topVer=1, minorTopVer=0], crd=true]
[21:53:44,719][INFO][exchange-worker-#42][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=1, minorTopVer=0], evt=NODE_JOINED,
node=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8]
[21:53:44,805][INFO][main][IgniteKernal] Refer to this page for more
performance suggestions:
https://apacheignite.readme.io/docs/jvm-and-system-tuning
[21:53:44,805][INFO][main][IgniteKernal]
[21:53:44,805][INFO][main][IgniteKernal] To start Console Management &
Monitoring run ignitevisorcmd.{sh|bat}
[21:53:44,805][INFO][main][IgniteKernal]
[21:53:44,806][INFO][main][IgniteKernal]

>>> +----------------------------------------------------------------------+
>>> Ignite ver. 2.6.0#20180710-sha1:669feacc5d3a4e60efcdd300dc8de99780f38eed
>>> +----------------------------------------------------------------------+
>>> OS name: Linux 4.17.17-200.fc28.x86_64 amd64
>>> CPU(s): 8
>>> Heap: 3.0GB
>>> VM name: 97@ignite-6cc588fd6f-dw78n
>>> Local node [ID=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, order=1,
>>> clientMode=false]
>>> Local node addresses: [ignite-6cc588fd6f-dw78n/10.42.69.5, /127.0.0.1]
>>> Local ports: TCP:10800 TCP:11211 TCP:47100 TCP:47500

[21:53:44,807][INFO][main][GridDiscoveryManager] Topology snapshot [ver=1,
servers=1, clients=0, CPUs=8, offheap=24.0GB, heap=3.0GB]
[21:53:44,807][INFO][main][GridDiscoveryManager]   ^-- Node
[id=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, clusterState=ACTIVE]
[21:53:44,807][INFO][main][GridDiscoveryManager] Data Regions Configured:
[21:53:44,808][INFO][main][GridDiscoveryManager]   ^-- Default_Region
[initSize=24.0 GiB, maxSize=24.0 GiB, persistenceEnabled=false]
[21:53:54,018][INFO][ignite-update-notifier-timer][GridUpdateNotifier] Your
version is up to date.
[21:54:01,813][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.41.210, rmtPort=50641]
[21:54:01,821][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.41.210, rmtPort=50641]
[21:54:01,821][INFO][tcp-disco-sock-reader-#5][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.41.210:50641, rmtPort=50641]
[21:54:01,823][INFO][tcp-disco-sock-reader-#5][TcpDiscoverySpi] Received
ping request from the remote node
[rmtNodeId=abb0bd1b-4a2b-4cf6-90fc-ea223169d1e0,
rmtAddr=/10.42.41.210:50641, rmtPort=50641]
[21:54:01,825][INFO][tcp-disco-sock-reader-#5][TcpDiscoverySpi] Finished
writing ping response [rmtNodeId=abb0bd1b-4a2b-4cf6-90fc-ea223169d1e0,
rmtAddr=/10.42.41.210:50641, rmtPort=50641]
[21:54:01,825][INFO][tcp-disco-sock-reader-#5][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.41.210:50641, rmtPort=50641]
[21:54:03,049][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.58.43, rmtPort=39907]
[21:54:03,049][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.58.43, rmtPort=39907]
[21:54:03,050][INFO][tcp-disco-sock-reader-#6][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.58.43:39907, rmtPort=39907]
[21:54:03,068][INFO][tcp-disco-sock-reader-#6][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.58.43:39907, rmtPort=39907]
[21:54:03,156][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.58.43, rmtPort=56747]
[21:54:03,157][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.58.43, rmtPort=56747]
[21:54:03,157][INFO][tcp-disco-sock-reader-#7][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.58.43:56747, rmtPort=56747]
[21:54:03,162][INFO][disco-event-worker-#41][GridDiscoveryManager] Added new
node to topology: TcpDiscoveryNode [id=2e4b4b61-4d24-447a-ad6d-d9d560c65eae,
addrs=[10.42.58.43, 127.0.0.1], sockAddrs=[/10.42.58.43:47500,
/127.0.0.1:47500], discPort=47500, order=2, intOrder=2,
lastExchangeTime=1537653243051, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false]
[21:54:03,162][INFO][disco-event-worker-#41][GridDiscoveryManager] Topology
snapshot [ver=2, servers=2, clients=0, CPUs=16, offheap=48.0GB, heap=6.0GB]
[21:54:03,162][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Node [id=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, clusterState=ACTIVE]
[21:54:03,162][INFO][disco-event-worker-#41][GridDiscoveryManager] Data
Regions Configured:
[21:54:03,163][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Default_Region [initSize=24.0 GiB, maxSize=24.0 GiB,
persistenceEnabled=false]
[21:54:03,164][INFO][exchange-worker-#42][time] Started exchange init
[topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], crd=true,
evt=NODE_JOINED, evtNode=2e4b4b61-4d24-447a-ad6d-d9d560c65eae,
customEvt=null, allowMerge=true]
[21:54:03,164][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:03,617][INFO][grid-nio-worker-tcp-comm-0-#25][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/10.42.69.5:47100,
rmtAddr=/10.42.58.43:47398]
[21:54:03,646][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partitions release latch: ServerLatch [permits=0,
pendingAcks=[], super=CompletableLatch [id=exchange,
topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]]
[21:54:03,647][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:03,647][INFO][exchange-worker-#42][time] Finished exchange init
[topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], crd=true]
[21:54:03,675][INFO][sys-#47][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=2,
minorTopVer=0], node=2e4b4b61-4d24-447a-ad6d-d9d560c65eae, allReceived=true]
[21:54:03,675][INFO][sys-#47][GridDhtPartitionsExchangeFuture] Coordinator
received all messages, try merge [ver=AffinityTopologyVersion [topVer=2,
minorTopVer=0]]
[21:54:03,675][INFO][sys-#47][GridDhtPartitionsExchangeFuture]
finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=2,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]
[21:54:03,691][INFO][sys-#47][GridDhtPartitionsExchangeFuture] Finish
exchange future [startVer=AffinityTopologyVersion [topVer=2, minorTopVer=0],
resVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], err=null]
[21:54:03,695][INFO][exchange-worker-#42][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=2, minorTopVer=0], evt=NODE_JOINED,
node=2e4b4b61-4d24-447a-ad6d-d9d560c65eae]
[21:54:05,170][INFO][tcp-disco-sock-reader-#7][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.58.43:56747, rmtPort=56747]
[21:54:05,251][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.33.231, rmtPort=41995]
[21:54:05,251][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.33.231, rmtPort=41995]
[21:54:05,251][INFO][tcp-disco-sock-reader-#8][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.33.231:41995, rmtPort=41995]
[21:54:05,255][INFO][disco-event-worker-#41][GridDiscoveryManager] Added new
node to topology: TcpDiscoveryNode [id=dfda47e8-9b48-42e1-b2f2-81e660ded9ce,
addrs=[10.42.33.231, 127.0.0.1], sockAddrs=[/10.42.33.231:47500,
/127.0.0.1:47500], discPort=47500, order=3, intOrder=3,
lastExchangeTime=1537653245095, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false]
[21:54:05,256][INFO][disco-event-worker-#41][GridDiscoveryManager] Topology
snapshot [ver=3, servers=3, clients=0, CPUs=24, offheap=72.0GB, heap=9.0GB]
[21:54:05,256][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Node [id=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, clusterState=ACTIVE]
[21:54:05,256][INFO][disco-event-worker-#41][GridDiscoveryManager] Data
Regions Configured:
[21:54:05,256][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Default_Region [initSize=24.0 GiB, maxSize=24.0 GiB,
persistenceEnabled=false]
[21:54:05,257][INFO][exchange-worker-#42][time] Started exchange init
[topVer=AffinityTopologyVersion [topVer=3, minorTopVer=0], crd=true,
evt=NODE_JOINED, evtNode=dfda47e8-9b48-42e1-b2f2-81e660ded9ce,
customEvt=null, allowMerge=true]
[21:54:05,257][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=3, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:05,735][INFO][grid-nio-worker-tcp-comm-1-#26][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/10.42.69.5:47100,
rmtAddr=/10.42.33.231:60236]
[21:54:05,752][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partitions release latch: ServerLatch [permits=0,
pendingAcks=[], super=CompletableLatch [id=exchange,
topVer=AffinityTopologyVersion [topVer=3, minorTopVer=0]]]
[21:54:05,752][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=3, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:05,752][INFO][exchange-worker-#42][time] Finished exchange init
[topVer=AffinityTopologyVersion [topVer=3, minorTopVer=0], crd=true]
[21:54:05,757][INFO][sys-#45][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=3,
minorTopVer=0], node=2e4b4b61-4d24-447a-ad6d-d9d560c65eae,
allReceived=false]
[21:54:05,758][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=3,
minorTopVer=0], node=dfda47e8-9b48-42e1-b2f2-81e660ded9ce, allReceived=true]
[21:54:05,761][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Coordinator
received all messages, try merge [ver=AffinityTopologyVersion [topVer=3,
minorTopVer=0]]
[21:54:05,761][INFO][sys-#46][GridDhtPartitionsExchangeFuture]
finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=3,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=3, minorTopVer=0]]
[21:54:05,773][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Finish
exchange future [startVer=AffinityTopologyVersion [topVer=3, minorTopVer=0],
resVer=AffinityTopologyVersion [topVer=3, minorTopVer=0], err=null]
[21:54:05,776][INFO][exchange-worker-#42][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=3, minorTopVer=0], evt=NODE_JOINED,
node=dfda47e8-9b48-42e1-b2f2-81e660ded9ce]
[21:54:06,166][INFO][tcp-disco-sock-reader-#8][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.33.231:41995, rmtPort=41995]
[21:54:06,217][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.36.168, rmtPort=40153]
[21:54:06,217][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.36.168, rmtPort=40153]
[21:54:06,218][INFO][tcp-disco-sock-reader-#9][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.36.168:40153, rmtPort=40153]
[21:54:06,221][INFO][disco-event-worker-#41][GridDiscoveryManager] Added new
node to topology: TcpDiscoveryNode [id=c562fb32-5ed6-4ad1-a8f8-6a95203dd519,
addrs=[10.42.36.168, 127.0.0.1], sockAddrs=[/127.0.0.1:47500,
/10.42.36.168:47500], discPort=47500, order=4, intOrder=4,
lastExchangeTime=1537653246143, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false]
[21:54:06,222][INFO][disco-event-worker-#41][GridDiscoveryManager] Topology
snapshot [ver=4, servers=4, clients=0, CPUs=32, offheap=96.0GB, heap=12.0GB]
[21:54:06,222][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Node [id=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, clusterState=ACTIVE]
[21:54:06,222][INFO][disco-event-worker-#41][GridDiscoveryManager] Data
Regions Configured:
[21:54:06,222][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Default_Region [initSize=24.0 GiB, maxSize=24.0 GiB,
persistenceEnabled=false]
[21:54:06,222][INFO][exchange-worker-#42][time] Started exchange init
[topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], crd=true,
evt=NODE_JOINED, evtNode=c562fb32-5ed6-4ad1-a8f8-6a95203dd519,
customEvt=null, allowMerge=true]
[21:54:06,223][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:06,281][INFO][tcp-disco-sock-reader-#9][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.36.168:40153, rmtPort=40153]
[21:54:06,344][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.33.158, rmtPort=53895]
[21:54:06,344][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.33.158, rmtPort=53895]
[21:54:06,344][INFO][tcp-disco-sock-reader-#10][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.33.158:53895, rmtPort=53895]
[21:54:06,348][INFO][disco-event-worker-#41][GridDiscoveryManager] Added new
node to topology: TcpDiscoveryNode [id=23c48f2c-571e-4b92-8b76-75e4658d7d77,
addrs=[10.42.33.158, 127.0.0.1], sockAddrs=[/10.42.33.158:47500,
/127.0.0.1:47500], discPort=47500, order=5, intOrder=5,
lastExchangeTime=1537653246254, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false]
[21:54:06,348][INFO][disco-event-worker-#41][GridDiscoveryManager] Topology
snapshot [ver=5, servers=5, clients=0, CPUs=40, offheap=120.0GB,
heap=15.0GB]
[21:54:06,348][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Node [id=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, clusterState=ACTIVE]
[21:54:06,348][INFO][disco-event-worker-#41][GridDiscoveryManager] Data
Regions Configured:
[21:54:06,348][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Default_Region [initSize=24.0 GiB, maxSize=24.0 GiB,
persistenceEnabled=false]
[21:54:06,691][INFO][grid-nio-worker-tcp-comm-2-#27][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/10.42.69.5:47100,
rmtAddr=/10.42.36.168:36522]
[21:54:06,705][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partitions release latch: ServerLatch [permits=0,
pendingAcks=[], super=CompletableLatch [id=exchange,
topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0]]]
[21:54:06,705][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:06,705][INFO][exchange-worker-#42][time] Finished exchange init
[topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], crd=true]
[21:54:06,709][INFO][sys-#52][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=4,
minorTopVer=0], node=2e4b4b61-4d24-447a-ad6d-d9d560c65eae,
allReceived=false]
[21:54:06,710][INFO][sys-#45][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=4,
minorTopVer=0], node=c562fb32-5ed6-4ad1-a8f8-6a95203dd519,
allReceived=false]
[21:54:06,710][INFO][sys-#53][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=4,
minorTopVer=0], node=dfda47e8-9b48-42e1-b2f2-81e660ded9ce, allReceived=true]
[21:54:06,758][INFO][sys-#53][GridDhtPartitionsExchangeFuture] Coordinator
received all messages, try merge [ver=AffinityTopologyVersion [topVer=4,
minorTopVer=0]]
[21:54:06,758][INFO][sys-#53][GridCachePartitionExchangeManager] Merge
exchange future [curFut=AffinityTopologyVersion [topVer=4, minorTopVer=0],
mergedFut=AffinityTopologyVersion [topVer=5, minorTopVer=0],
evt=NODE_JOINED, evtNode=23c48f2c-571e-4b92-8b76-75e4658d7d77,
evtNodeClient=false]
[21:54:06,758][INFO][sys-#53][GridDhtPartitionsExchangeFuture] Merge server
join exchange, wait for message [curFut=AffinityTopologyVersion [topVer=4,
minorTopVer=0], node=23c48f2c-571e-4b92-8b76-75e4658d7d77]
[21:54:06,832][INFO][grid-nio-worker-tcp-comm-3-#28][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/10.42.69.5:47100,
rmtAddr=/10.42.33.158:41558]
[21:54:06,854][INFO][sys-#47][GridDhtPartitionsExchangeFuture] Merge server
join exchange, received message [curFut=AffinityTopologyVersion [topVer=4,
minorTopVer=0], node=23c48f2c-571e-4b92-8b76-75e4658d7d77,
msgVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], process=true,
awaited=1]
[21:54:06,854][INFO][sys-#47][GridDhtPartitionsExchangeFuture]
finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=4,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=5, minorTopVer=0]]
[21:54:06,868][INFO][sys-#47][GridDhtPartitionsExchangeFuture] Finish
exchange future [startVer=AffinityTopologyVersion [topVer=4, minorTopVer=0],
resVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], err=null]
[21:54:06,871][INFO][exchange-worker-#42][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=5, minorTopVer=0], evt=NODE_JOINED,
node=c562fb32-5ed6-4ad1-a8f8-6a95203dd519]
[21:54:08,316][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.60.115, rmtPort=49065]
[21:54:08,316][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.60.115, rmtPort=49065]
[21:54:08,317][INFO][tcp-disco-sock-reader-#11][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.60.115:49065, rmtPort=49065]
[21:54:08,661][INFO][exchange-worker-#42][time] Started exchange init
[topVer=AffinityTopologyVersion [topVer=5, minorTopVer=1], crd=true,
evt=DISCOVERY_CUSTOM_EVT, evtNode=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8,
customEvt=CacheAffinityChangeMessage
[id=3ec26430661-55f9c16f-e3e4-4420-90dc-dc89e739996a,
topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], exchId=null,
partsMsg=null, exchangeNeeded=true], allowMerge=false]
[21:54:08,666][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=5, minorTopVer=1], waitTime=0ms,
futInfo=NA]
[21:54:08,715][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partitions release latch: ServerLatch [permits=0,
pendingAcks=[], super=CompletableLatch [id=exchange,
topVer=AffinityTopologyVersion [topVer=5, minorTopVer=1]]]
[21:54:08,716][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=5, minorTopVer=1], waitTime=0ms,
futInfo=NA]
[21:54:08,717][INFO][exchange-worker-#42][time] Finished exchange init
[topVer=AffinityTopologyVersion [topVer=5, minorTopVer=1], crd=true]
[21:54:08,719][INFO][sys-#53][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=5,
minorTopVer=1], node=2e4b4b61-4d24-447a-ad6d-d9d560c65eae,
allReceived=false]
[21:54:08,721][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=5,
minorTopVer=1], node=dfda47e8-9b48-42e1-b2f2-81e660ded9ce,
allReceived=false]
[21:54:08,721][INFO][sys-#47][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=5,
minorTopVer=1], node=23c48f2c-571e-4b92-8b76-75e4658d7d77,
allReceived=false]
[21:54:08,721][INFO][sys-#49][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=5,
minorTopVer=1], node=c562fb32-5ed6-4ad1-a8f8-6a95203dd519, allReceived=true]
[21:54:08,722][INFO][sys-#49][GridDhtPartitionsExchangeFuture]
finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=5,
minorTopVer=1], resVer=AffinityTopologyVersion [topVer=5, minorTopVer=1]]
[21:54:08,729][INFO][sys-#49][GridDhtPartitionsExchangeFuture] Finish
exchange future [startVer=AffinityTopologyVersion [topVer=5, minorTopVer=1],
resVer=AffinityTopologyVersion [topVer=5, minorTopVer=1], err=null]
[21:54:08,799][INFO][exchange-worker-#42][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=5, minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT,
node=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8]
[21:54:13,333][INFO][tcp-disco-sock-reader-#11][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.60.115:49065, rmtPort=49065]
[21:54:19,353][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.16.120, rmtPort=55009]
[21:54:19,354][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.16.120, rmtPort=55009]
[21:54:19,354][INFO][tcp-disco-sock-reader-#12][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.16.120:55009, rmtPort=55009]
[21:54:19,357][INFO][tcp-disco-sock-reader-#12][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.16.120:55009, rmtPort=55009]
[21:54:19,370][INFO][tcp-disco-sock-reader-#10][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.33.158:53895, rmtPort=53895]
[21:54:19,445][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.60.115, rmtPort=43839]
[21:54:19,445][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.60.115, rmtPort=43839]
[21:54:19,445][INFO][tcp-disco-sock-reader-#13][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.60.115:43839, rmtPort=43839]
[21:54:19,448][INFO][disco-event-worker-#41][GridDiscoveryManager] Added new
node to topology: TcpDiscoveryNode [id=375ce9ce-36c8-46d2-a4f8-41674df021ad,
addrs=[10.42.60.115, 127.0.0.1], sockAddrs=[/10.42.60.115:47500,
/127.0.0.1:47500], discPort=47500, order=6, intOrder=6,
lastExchangeTime=1537653248324, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false]
[21:54:19,449][INFO][disco-event-worker-#41][GridDiscoveryManager] Topology
snapshot [ver=6, servers=6, clients=0, CPUs=48, offheap=140.0GB,
heap=18.0GB]
[21:54:19,449][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Node [id=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, clusterState=ACTIVE]
[21:54:19,449][INFO][disco-event-worker-#41][GridDiscoveryManager] Data
Regions Configured:
[21:54:19,449][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Default_Region [initSize=24.0 GiB, maxSize=24.0 GiB,
persistenceEnabled=false]
[21:54:19,450][INFO][exchange-worker-#42][time] Started exchange init
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], crd=true,
evt=NODE_JOINED, evtNode=375ce9ce-36c8-46d2-a4f8-41674df021ad,
customEvt=null, allowMerge=true]
[21:54:19,450][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:19,456][INFO][tcp-disco-sock-reader-#13][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.60.115:43839, rmtPort=43839]
[21:54:19,529][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.16.120, rmtPort=43151]
[21:54:19,529][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.16.120, rmtPort=43151]
[21:54:19,530][INFO][tcp-disco-sock-reader-#14][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.16.120:43151, rmtPort=43151]
[21:54:19,574][INFO][disco-event-worker-#41][GridDiscoveryManager] Added new
node to topology: TcpDiscoveryNode [id=edad0b96-c58d-43f4-8511-533ddfae1069,
addrs=[10.42.16.120, 127.0.0.1], sockAddrs=[/127.0.0.1:47500,
/10.42.16.120:47500], discPort=47500, order=7, intOrder=7,
lastExchangeTime=1537653249302, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false]
[21:54:19,575][INFO][disco-event-worker-#41][GridDiscoveryManager] Topology
snapshot [ver=7, servers=7, clients=0, CPUs=56, offheap=170.0GB,
heap=21.0GB]
[21:54:19,575][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Node [id=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, clusterState=ACTIVE]
[21:54:19,575][INFO][disco-event-worker-#41][GridDiscoveryManager] Data
Regions Configured:
[21:54:19,575][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Default_Region [initSize=24.0 GiB, maxSize=24.0 GiB,
persistenceEnabled=false]
[21:54:19,772][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.47.29, rmtPort=45773]
[21:54:19,772][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.47.29, rmtPort=45773]
[21:54:19,773][INFO][tcp-disco-sock-reader-#15][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.47.29:45773, rmtPort=45773]
[21:54:19,775][INFO][tcp-disco-sock-reader-#15][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.47.29:45773, rmtPort=45773]
[21:54:20,058][INFO][grid-nio-worker-tcp-comm-0-#25][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/10.42.69.5:47100,
rmtAddr=/10.42.60.115:49612]
[21:54:20,074][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partitions release latch: ServerLatch [permits=0,
pendingAcks=[], super=CompletableLatch [id=exchange,
topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0]]]
[21:54:20,074][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:20,074][INFO][exchange-worker-#42][time] Finished exchange init
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], crd=true]
[21:54:20,157][INFO][sys-#53][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=6,
minorTopVer=0], node=2e4b4b61-4d24-447a-ad6d-d9d560c65eae,
allReceived=false]
[21:54:20,158][INFO][sys-#47][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=6,
minorTopVer=0], node=dfda47e8-9b48-42e1-b2f2-81e660ded9ce,
allReceived=false]
[21:54:20,158][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=6,
minorTopVer=0], node=c562fb32-5ed6-4ad1-a8f8-6a95203dd519,
allReceived=false]
[21:54:20,159][INFO][sys-#50][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=6,
minorTopVer=0], node=375ce9ce-36c8-46d2-a4f8-41674df021ad,
allReceived=false]
[21:54:20,159][INFO][sys-#49][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=6,
minorTopVer=0], node=23c48f2c-571e-4b92-8b76-75e4658d7d77, allReceived=true]
[21:54:20,159][INFO][sys-#49][GridDhtPartitionsExchangeFuture] Coordinator
received all messages, try merge [ver=AffinityTopologyVersion [topVer=6,
minorTopVer=0]]
[21:54:20,159][INFO][sys-#49][GridCachePartitionExchangeManager] Merge
exchange future [curFut=AffinityTopologyVersion [topVer=6, minorTopVer=0],
mergedFut=AffinityTopologyVersion [topVer=7, minorTopVer=0],
evt=NODE_JOINED, evtNode=edad0b96-c58d-43f4-8511-533ddfae1069,
evtNodeClient=false]
[21:54:20,160][INFO][sys-#49][GridDhtPartitionsExchangeFuture] Merge server
join exchange, wait for message [curFut=AffinityTopologyVersion [topVer=6,
minorTopVer=0], node=edad0b96-c58d-43f4-8511-533ddfae1069]
[21:54:29,663][INFO][tcp-disco-sock-reader-#14][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.16.120:43151, rmtPort=43151]
[21:54:29,810][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.47.29, rmtPort=40445]
[21:54:29,810][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.47.29, rmtPort=40445]
[21:54:29,810][INFO][tcp-disco-sock-reader-#16][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.47.29:40445, rmtPort=40445]
[21:54:29,815][INFO][disco-event-worker-#41][GridDiscoveryManager] Added new
node to topology: TcpDiscoveryNode [id=f879f812-c1bc-43bf-a4f4-95af9aa6d647,
addrs=[10.42.47.29, 127.0.0.1], sockAddrs=[/127.0.0.1:47500,
/10.42.47.29:47500], discPort=47500, order=8, intOrder=8,
lastExchangeTime=1537653254313, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false]
[21:54:29,815][INFO][disco-event-worker-#41][GridDiscoveryManager] Topology
snapshot [ver=8, servers=8, clients=0, CPUs=64, offheap=190.0GB,
heap=24.0GB]
[21:54:29,815][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Node [id=C0377E70-4C1F-4BCE-A1E9-CF8CDAD84FC8, clusterState=ACTIVE]
[21:54:29,815][INFO][disco-event-worker-#41][GridDiscoveryManager] Data
Regions Configured:
[21:54:29,815][INFO][disco-event-worker-#41][GridDiscoveryManager]   ^--
Default_Region [initSize=24.0 GiB, maxSize=24.0 GiB,
persistenceEnabled=false]
[21:54:29,818][INFO][tcp-disco-sock-reader-#16][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.47.29:40445, rmtPort=40445]
[21:54:30,075][WARNING][exchange-worker-#42][diagnostic] Failed to wait for
partition map exchange [topVer=AffinityTopologyVersion [topVer=6,
minorTopVer=0], node=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8]. Consider
changing TransactionConfiguration.txTimeoutOnPartitionMapSynchronization to
non default value to avoid this message. Dumping pending objects that might
be the cause:
[21:54:30,075][WARNING][exchange-worker-#42][diagnostic] Ready affinity
version: AffinityTopologyVersion [topVer=5, minorTopVer=1]
[21:54:30,077][WARNING][exchange-worker-#42][diagnostic] Last exchange
future: GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent
[evtNode=TcpDiscoveryNode [id=375ce9ce-36c8-46d2-a4f8-41674df021ad,
addrs=[10.42.60.115, 127.0.0.1], sockAddrs=[/10.42.60.115:47500,
/127.0.0.1:47500], discPort=47500, order=6, intOrder=6,
lastExchangeTime=1537653248324, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false], topVer=6, nodeId8=c0377e70, msg=Node joined:
TcpDiscoveryNode [id=375ce9ce-36c8-46d2-a4f8-41674df021ad,
addrs=[10.42.60.115, 127.0.0.1], sockAddrs=[/10.42.60.115:47500,
/127.0.0.1:47500], discPort=47500, order=6, intOrder=6,
lastExchangeTime=1537653248324, loc=false, ver=2.6.0#20180710-sha1:669feacc,
isClient=false], type=NODE_JOINED, tstamp=1537653259449],
crd=TcpDiscoveryNode [id=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8,
addrs=[10.42.69.5, 127.0.0.1], sockAddrs=[/127.0.0.1:47500,
ignite-6cc588fd6f-dw78n/10.42.69.5:47500], discPort=47500, order=1,
intOrder=1, lastExchangeTime=1537653269848, loc=true,
ver=2.6.0#20180710-sha1:669feacc, isClient=false],
exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=6,
minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode
[id=375ce9ce-36c8-46d2-a4f8-41674df021ad, addrs=[10.42.60.115, 127.0.0.1],
sockAddrs=[/10.42.60.115:47500, /127.0.0.1:47500], discPort=47500, order=6,
intOrder=6, lastExchangeTime=1537653248324, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], topVer=6,
nodeId8=c0377e70, msg=Node joined: TcpDiscoveryNode
[id=375ce9ce-36c8-46d2-a4f8-41674df021ad, addrs=[10.42.60.115, 127.0.0.1],
sockAddrs=[/10.42.60.115:47500, /127.0.0.1:47500], discPort=47500, order=6,
intOrder=6, lastExchangeTime=1537653248324, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], type=NODE_JOINED,
tstamp=1537653259449], nodeId=375ce9ce, evt=NODE_JOINED], added=true,
initFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=true,
hash=1593307370], init=true, lastVer=GridCacheVersion [topVer=0,
order=1537653247737, nodeOrder=0], partReleaseFut=PartitionReleaseFuture
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0],
futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion
[topVer=6, minorTopVer=0], futures=[]], AtomicUpdateReleaseFuture
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], futures=[]],
DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=6,
minorTopVer=0], futures=[]], LocalTxReleaseFuture
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], futures=[]],
AllTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=6,
minorTopVer=0], futures=[RemoteTxReleaseFuture
[topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], futures=[]]]]]],
exchActions=null, affChangeMsg=null, initTs=1537653259449,
centralizedAff=false, forceAffReassignment=false, changeGlobalStateE=null,
done=false, state=CRD, evtLatch=0, remaining=[], super=GridFutureAdapter
[ignoreInterrupts=false, state=INIT, res=null, hash=763033251]]
[21:54:30,077][WARNING][exchange-worker-#42][GridCachePartitionExchangeManager]
First 10 pending exchange futures [total=2]
[21:54:30,078][WARNING][exchange-worker-#42][GridCachePartitionExchangeManager]
>>> GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion
[topVer=7, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=edad0b96-c58d-43f4-8511-533ddfae1069, addrs=[10.42.16.120, 127.0.0.1],
sockAddrs=[/127.0.0.1:47500, /10.42.16.120:47500], discPort=47500, order=7,
intOrder=7, lastExchangeTime=1537653249302, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=false]
[21:54:30,078][WARNING][exchange-worker-#42][GridCachePartitionExchangeManager]
>>> GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion
[topVer=8, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=f879f812-c1bc-43bf-a4f4-95af9aa6d647, addrs=[10.42.47.29, 127.0.0.1],
sockAddrs=[/127.0.0.1:47500, /10.42.47.29:47500], discPort=47500, order=8,
intOrder=8, lastExchangeTime=1537653254313, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=false]
[21:54:30,079][WARNING][exchange-worker-#42][diagnostic] Last 10 exchange
futures (total: 9):
[21:54:30,079][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=8,
minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=f879f812-c1bc-43bf-a4f4-95af9aa6d647, addrs=[10.42.47.29, 127.0.0.1],
sockAddrs=[/127.0.0.1:47500, /10.42.47.29:47500], discPort=47500, order=8,
intOrder=8, lastExchangeTime=1537653254313, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=false]
[21:54:30,080][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=7,
minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=edad0b96-c58d-43f4-8511-533ddfae1069, addrs=[10.42.16.120, 127.0.0.1],
sockAddrs=[/127.0.0.1:47500, /10.42.16.120:47500], discPort=47500, order=7,
intOrder=7, lastExchangeTime=1537653249302, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=false]
[21:54:30,080][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=6,
minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=375ce9ce-36c8-46d2-a4f8-41674df021ad, addrs=[10.42.60.115, 127.0.0.1],
sockAddrs=[/10.42.60.115:47500, /127.0.0.1:47500], discPort=47500, order=6,
intOrder=6, lastExchangeTime=1537653248324, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=false]
[21:54:30,080][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=5,
minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT, evtNode=TcpDiscoveryNode
[id=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8, addrs=[10.42.69.5, 127.0.0.1],
sockAddrs=[/127.0.0.1:47500, ignite-6cc588fd6f-dw78n/10.42.69.5:47500],
discPort=47500, order=1, intOrder=1, lastExchangeTime=1537653270070,
loc=true, ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=true]
[21:54:30,080][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=5,
minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=23c48f2c-571e-4b92-8b76-75e4658d7d77, addrs=[10.42.33.158, 127.0.0.1],
sockAddrs=[/10.42.33.158:47500, /127.0.0.1:47500], discPort=47500, order=5,
intOrder=5, lastExchangeTime=1537653246254, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=true]
[21:54:30,081][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=4,
minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=c562fb32-5ed6-4ad1-a8f8-6a95203dd519, addrs=[10.42.36.168, 127.0.0.1],
sockAddrs=[/127.0.0.1:47500, /10.42.36.168:47500], discPort=47500, order=4,
intOrder=4, lastExchangeTime=1537653246143, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=true]
[21:54:30,081][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=3,
minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=dfda47e8-9b48-42e1-b2f2-81e660ded9ce, addrs=[10.42.33.231, 127.0.0.1],
sockAddrs=[/10.42.33.231:47500, /127.0.0.1:47500], discPort=47500, order=3,
intOrder=3, lastExchangeTime=1537653245095, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=true]
[21:54:30,081][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=2,
minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=2e4b4b61-4d24-447a-ad6d-d9d560c65eae, addrs=[10.42.58.43, 127.0.0.1],
sockAddrs=[/10.42.58.43:47500, /127.0.0.1:47500], discPort=47500, order=2,
intOrder=2, lastExchangeTime=1537653243051, loc=false,
ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=true]
[21:54:30,082][WARNING][exchange-worker-#42][diagnostic] >>>
GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=1,
minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8, addrs=[10.42.69.5, 127.0.0.1],
sockAddrs=[/127.0.0.1:47500, ignite-6cc588fd6f-dw78n/10.42.69.5:47500],
discPort=47500, order=1, intOrder=1, lastExchangeTime=1537653270070,
loc=true, ver=2.6.0#20180710-sha1:669feacc, isClient=false], done=true]
[21:54:30,082][WARNING][exchange-worker-#42][diagnostic] Latch manager
state: ExchangeLatchManager [serverLatches={}, clientLatches={}]
[21:54:30,082][WARNING][exchange-worker-#42][diagnostic] Pending
transactions:
[21:54:30,082][WARNING][exchange-worker-#42][diagnostic] Pending explicit
locks:
[21:54:30,082][WARNING][exchange-worker-#42][diagnostic] Pending cache
futures:
[21:54:30,083][WARNING][exchange-worker-#42][diagnostic] Pending atomic
cache futures:
[21:54:30,083][WARNING][exchange-worker-#42][diagnostic] Pending data
streamer futures:
[21:54:30,083][WARNING][exchange-worker-#42][diagnostic] Pending transaction
deadlock detection futures:
[21:54:30,217][INFO][grid-nio-worker-tcp-comm-1-#26][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/10.42.69.5:47100,
rmtAddr=/10.42.16.120:35696]
[21:54:30,235][INFO][sys-#49][GridDhtPartitionsExchangeFuture] Merge server
join exchange, received message [curFut=AffinityTopologyVersion [topVer=6,
minorTopVer=0], node=edad0b96-c58d-43f4-8511-533ddfae1069,
msgVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], process=true,
awaited=1]
[21:54:30,235][INFO][sys-#49][GridDhtPartitionsExchangeFuture]
finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=6,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=7, minorTopVer=0]]
[21:54:30,244][INFO][sys-#49][GridDhtPartitionsExchangeFuture] Finish
exchange future [startVer=AffinityTopologyVersion [topVer=6, minorTopVer=0],
resVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], err=null]
[21:54:30,247][INFO][exchange-worker-#42][GridCachePartitionExchangeManager]
Skipping rebalancing (obsolete exchange ID) [top=AffinityTopologyVersion
[topVer=7, minorTopVer=0], evt=NODE_JOINED,
node=375ce9ce-36c8-46d2-a4f8-41674df021ad]
[21:54:30,247][INFO][exchange-worker-#42][time] Started exchange init
[topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], crd=true,
evt=NODE_JOINED, evtNode=f879f812-c1bc-43bf-a4f4-95af9aa6d647,
customEvt=null, allowMerge=true]
[21:54:30,247][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:30,571][INFO][grid-nio-worker-tcp-comm-2-#27][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/10.42.69.5:47100,
rmtAddr=/10.42.47.29:36634]
[21:54:30,585][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partitions release latch: ServerLatch [permits=0,
pendingAcks=[], super=CompletableLatch [id=exchange,
topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0]]]
[21:54:30,585][INFO][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
Finished waiting for partition release future
[topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], waitTime=0ms,
futInfo=NA]
[21:54:30,585][INFO][exchange-worker-#42][time] Finished exchange init
[topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], crd=true]
[21:54:30,658][INFO][sys-#49][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=8,
minorTopVer=0], node=23c48f2c-571e-4b92-8b76-75e4658d7d77,
allReceived=false]
[21:54:30,658][INFO][sys-#52][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=8,
minorTopVer=0], node=dfda47e8-9b48-42e1-b2f2-81e660ded9ce,
allReceived=false]
[21:54:30,659][INFO][sys-#46][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=8,
minorTopVer=0], node=edad0b96-c58d-43f4-8511-533ddfae1069,
allReceived=false]
[21:54:30,659][INFO][sys-#51][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=8,
minorTopVer=0], node=2e4b4b61-4d24-447a-ad6d-d9d560c65eae,
allReceived=false]
[21:54:30,659][INFO][sys-#45][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=8,
minorTopVer=0], node=c562fb32-5ed6-4ad1-a8f8-6a95203dd519,
allReceived=false]
[21:54:30,659][INFO][sys-#53][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=8,
minorTopVer=0], node=375ce9ce-36c8-46d2-a4f8-41674df021ad,
allReceived=false]
[21:54:30,659][INFO][sys-#50][GridDhtPartitionsExchangeFuture] Coordinator
received single message [ver=AffinityTopologyVersion [topVer=8,
minorTopVer=0], node=f879f812-c1bc-43bf-a4f4-95af9aa6d647, allReceived=true]
[21:54:30,660][INFO][sys-#50][GridDhtPartitionsExchangeFuture] Coordinator
received all messages, try merge [ver=AffinityTopologyVersion [topVer=8,
minorTopVer=0]]
[21:54:30,660][INFO][sys-#50][GridDhtPartitionsExchangeFuture]
finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=8,
minorTopVer=0], resVer=AffinityTopologyVersion [topVer=8, minorTopVer=0]]
[21:54:30,668][INFO][sys-#50][GridDhtPartitionsExchangeFuture] Finish
exchange future [startVer=AffinityTopologyVersion [topVer=8, minorTopVer=0],
resVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], err=null]
[21:54:30,671][INFO][exchange-worker-#42][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=8, minorTopVer=0], evt=NODE_JOINED,
node=f879f812-c1bc-43bf-a4f4-95af9aa6d647]
[21:54:40,438][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/10.42.58.43, rmtPort=55679]
[21:54:40,438][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
spawning a new thread for connection [rmtAddr=/10.42.58.43, rmtPort=55679]
[21:54:40,438][INFO][tcp-disco-sock-reader-#17][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/10.42.58.43:55679, rmtPort=55679]
[21:54:40,440][WARNING][tcp-disco-msg-worker-#2][TcpDiscoverySpi] Node is
out of topology (probably, due to short-time network problems).
[21:54:40,440][INFO][tcp-disco-sock-reader-#17][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/10.42.58.43:55679, rmtPort=55679
[21:54:40,441][WARNING][disco-event-worker-#41][GridDiscoveryManager] Local
node SEGMENTED: TcpDiscoveryNode [id=c0377e70-4c1f-4bce-a1e9-cf8cdad84fc8,
addrs=[10.42.69.5, 127.0.0.1], sockAddrs=[/127.0.0.1:47500,
ignite-6cc588fd6f-dw78n/10.42.69.5:47500], discPort=47500, order=1,
intOrder=1, lastExchangeTime=1537653280433, loc=true,
ver=2.6.0#20180710-sha1:669feacc, isClient=false]
[21:54:40,442][SEVERE][tcp-disco-srvr-#3][] Critical system error detected.
Will be handled accordingly to configured handler [hnd=class
o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext
[type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Thread
tcp-disco-srvr-#3 is terminated unexpectedly.]]
java.lang.IllegalStateException: Thread tcp-disco-srvr-#3 is terminated
unexpectedly.
	at
org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServer.body(ServerImpl.java:5686)
	at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[21:54:40,443][SEVERE][tcp-disco-srvr-#3][] JVM will be halted immediately
due to the failure: [failureCtx=FailureContext
[type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Thread
tcp-disco-srvr-#3 is terminated unexpectedly.]]
```
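For reference, the settings that the warnings above point at live in the Ignite Spring XML configuration. Below is a minimal sketch only; the values are illustrative assumptions, not recommendations, and the `txTimeoutOnPartitionMapExchange` property name assumes Ignite 2.5+ (the log message above calls it "txTimeoutOnPartitionMapSynchronization"):

```xml
<!-- Sketch of an ignite-config.xml fragment; values are illustrative. -->
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Raise the failure detection timeout (default 10000 ms) so a node
         on a rebooting Kubernetes host is not dropped too aggressively. -->
    <property name="failureDetectionTimeout" value="30000"/>

    <!-- What the local node does when it detects it is SEGMENTED; with
         STOP (as seen in the log above via StopNodeOrHaltFailureHandler)
         the node shuts itself down. RESTART_JVM lets an external
         supervisor such as a Kubernetes liveness probe bring it back. -->
    <property name="segmentationPolicy" value="RESTART_JVM"/>

    <!-- Bound how long transactions may delay partition map exchange
         (0, the default, means wait indefinitely). -->
    <property name="transactionConfiguration">
        <bean class="org.apache.ignite.configuration.TransactionConfiguration">
            <property name="txTimeoutOnPartitionMapExchange" value="20000"/>
        </bean>
    </property>
</bean>
```

Whether these would have prevented the segmentation here depends on the root cause, which is why the full logs from all nodes matter.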



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by "Maxim.Pudov" <pu...@gmail.com>.
Hi,

Can you share the full logs from all the nodes? From what you have posted, everything looks like it should work fine.

Maxim.





Re: Unreasonable segmentation in Kubernetes on one node reboot

Posted by Kopilov <ko...@gmail.com>.
For some reason, Nabble did not display the formatting (it showed up correctly in the preview); sorry about that.


