Posted to user@ignite.apache.org by vkulichenko <va...@gmail.com> on 2016/08/01 20:37:08 UTC

Re: Ignite Cluster node stopped

Hi,

Generally, a 60GB heap is too large. Can you try giving the JVM about 10GB of
heap and switching the caches to off-heap mode [1]?

[1] https://apacheignite.readme.io/docs/off-heap-memory

-Val



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-Cluster-node-stopped-tp6608p6662.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Ignite Cluster node stopped

Posted by suhuadong <su...@163.com>.
I have attached the GC logs from the Ignite server node: gc-logs.bz2
<http://apache-ignite-users.70518.x6.nabble.com/file/n6758/gc-logs.bz2>




Re: Ignite Cluster node stopped

Posted by suhuadong <su...@163.com>.
OK.
I have attached the logs from IGNITE_HOME/work/logs on all server nodes,
covering the time when a server node stopped.
The IP of the stopped server node is 172.21.0.183. ignite-logs.bz2
<http://apache-ignite-users.70518.x6.nabble.com/file/n6757/ignite-logs.bz2>




Re: Ignite Cluster node stopped

Posted by Vladislav Pyatkov <vl...@gmail.com>.
Hello,

1) That was my mistake. I meant that the log level should be set to DEBUG for
the package org.apache.ignite.spi.discovery.

2) Was the cluster group (172.21.0.181 and 172.21.0.183) segmented again at
that time?
I think the problem is that the network is not stable.

3) Unfortunately, in my opinion, this problem is related to system settings
rather than to the Ignite version.
However, upgrading is still a good idea.

*About segmentation:*

*1. Log files from each node (including GC logs).*

*2. System monitoring from each node for the whole period (using dstat -t
--top-mem -m -s -g -d --fs --top-io-adv). We need to make sure that there
are no delays caused by memory swapping or IO activity.*

*3. Network monitoring for the whole period. You can do it with ping, as you
did before.*

Please attach all the logs as soon as the issue occurs again.
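
For the GC logs requested in item 1, a quick way to look for long
stop-the-world pauses is a small script such as the Python sketch below. It
assumes the log lines have the usual Java 8 G1 shape produced by
-XX:+PrintGCDetails and -XX:+PrintGCDateStamps. The sample lines, the function
name long_pauses, and the 1-second default threshold are illustrative
assumptions and are not taken from the attached logs.

```python
import re

# Assumed line shape (Java 8, G1, date stamps enabled), for example:
# 2016-08-10T14:15:01.000+0800: 51237.444: [Full GC (Allocation Failure)  58G->30G(60G), 12.5000000 secs]
GC_LINE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{4}): .*"
    r"\[(?P<kind>Full GC|GC pause)[^\]]*?, (?P<secs>\d+\.\d+) secs\]"
)


def long_pauses(lines, threshold_secs=1.0):
    """Return (date_stamp, kind, seconds) for every pause at or above the threshold."""
    pauses = []
    for line in lines:
        match = GC_LINE.search(line)
        if match is None:
            continue  # detail lines and unrelated output are ignored
        seconds = float(match.group("secs"))
        if seconds >= threshold_secs:
            pauses.append((match.group("stamp"), match.group("kind"), seconds))
    return pauses


# Made-up sample lines, only to show the expected input shape.
sample = [
    "2016-08-10T14:14:58.123+0800: 51234.567: [GC pause (G1 Evacuation Pause) (young), 0.0312345 secs]",
    "   [Parallel Time: 25.1 ms, GC Workers: 23]",
    "2016-08-10T14:15:01.000+0800: 51237.444: [Full GC (Allocation Failure)  58G->30G(60G), 12.5000000 secs]",
]

for stamp, kind, seconds in long_pauses(sample):
    print(stamp, kind, seconds)
```

To use it on a real file, pass the open file object instead of the sample
list, and pick a threshold that is meaningful relative to the discovery
timeouts configured on the cluster.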

On Thu, Aug 11, 2016 at 1:58 PM, suhuadong <su...@163.com> wrote:

> Hi,
> 1: How do I add an additional logger for the Discovery classes
> (org.apache.ignite.spi.discovery)?
>
> 2: The network is OK.
> Before all server nodes stopped, I deployed a shell script that pings
> 172.21.0.181 from 172.21.0.183 and 172.21.0.183.
> Log:
> [ping output snipped; the full log is in the original message below]
>
> 3: Given this problem, I plan to upgrade Ignite from 1.6 to 1.7.
>



-- 
Vladislav Pyatkov

Re: Ignite Cluster node stopped

Posted by suhuadong <su...@163.com>.
Hi,
1: How do I add an additional logger for the Discovery classes
(org.apache.ignite.spi.discovery)?

2: The network is OK.
Before all server nodes stopped, I deployed a shell script that pings
172.21.0.181 from 172.21.0.183 and 172.21.0.183.
Log:
64 bytes from 172.21.0.181: icmp_seq=51251 ttl=64 time=0.141 ms 14:14:15
64 bytes from 172.21.0.181: icmp_seq=51296 ttl=64 time=0.196 ms 14:15:00
64 bytes from 172.21.0.181: icmp_seq=51297 ttl=64 time=0.160 ms 14:15:01
64 bytes from 172.21.0.181: icmp_seq=51298 ttl=64 time=0.199 ms 14:15:02
64 bytes from 172.21.0.181: icmp_seq=51299 ttl=64 time=0.205 ms 14:15:03
64 bytes from 172.21.0.181: icmp_seq=51300 ttl=64 time=0.191 ms 14:15:04
64 bytes from 172.21.0.181: icmp_seq=51301 ttl=64 time=0.200 ms 14:15:05
64 bytes from 172.21.0.181: icmp_seq=51302 ttl=64 time=0.190 ms 14:15:06
64 bytes from 172.21.0.181: icmp_seq=51303 ttl=64 time=0.195 ms 14:15:07
64 bytes from 172.21.0.181: icmp_seq=51304 ttl=64 time=0.209 ms 14:15:08
64 bytes from 172.21.0.181: icmp_seq=51305 ttl=64 time=0.204 ms 14:15:09
64 bytes from 172.21.0.181: icmp_seq=51306 ttl=64 time=0.208 ms 14:15:10
64 bytes from 172.21.0.181: icmp_seq=51307 ttl=64 time=0.206 ms 14:15:11
64 bytes from 172.21.0.181: icmp_seq=51308 ttl=64 time=0.231 ms 14:15:12
64 bytes from 172.21.0.181: icmp_seq=51309 ttl=64 time=0.251 ms 14:15:13
64 bytes from 172.21.0.181: icmp_seq=51310 ttl=64 time=0.213 ms 14:15:14
64 bytes from 172.21.0.181: icmp_seq=51311 ttl=64 time=0.240 ms 14:15:15
64 bytes from 172.21.0.181: icmp_seq=51312 ttl=64 time=0.250 ms 14:15:16
64 bytes from 172.21.0.181: icmp_seq=51313 ttl=64 time=0.227 ms 14:15:17
64 bytes from 172.21.0.181: icmp_seq=51314 ttl=64 time=0.236 ms 14:15:18
64 bytes from 172.21.0.181: icmp_seq=51315 ttl=64 time=0.253 ms 14:15:19
64 bytes from 172.21.0.181: icmp_seq=51316 ttl=64 time=0.138 ms 14:15:20
64 bytes from 172.21.0.181: icmp_seq=51317 ttl=64 time=0.139 ms 14:15:21
64 bytes from 172.21.0.181: icmp_seq=51318 ttl=64 time=0.180 ms 14:15:22
64 bytes from 172.21.0.181: icmp_seq=51319 ttl=64 time=0.131 ms 14:15:23
64 bytes from 172.21.0.181: icmp_seq=51320 ttl=64 time=0.187 ms 14:15:24
64 bytes from 172.21.0.181: icmp_seq=51321 ttl=64 time=0.134 ms 14:15:25
64 bytes from 172.21.0.181: icmp_seq=51322 ttl=64 time=0.132 ms 14:15:26
64 bytes from 172.21.0.181: icmp_seq=51323 ttl=64 time=0.133 ms 14:15:27
64 bytes from 172.21.0.181: icmp_seq=51324 ttl=64 time=0.153 ms 14:15:28
64 bytes from 172.21.0.181: icmp_seq=51325 ttl=64 time=0.172 ms 14:15:29
64 bytes from 172.21.0.181: icmp_seq=51326 ttl=64 time=0.170 ms 14:15:30
64 bytes from 172.21.0.181: icmp_seq=51327 ttl=64 time=0.172 ms 14:15:31
64 bytes from 172.21.0.181: icmp_seq=51328 ttl=64 time=0.170 ms 14:15:32
64 bytes from 172.21.0.181: icmp_seq=51329 ttl=64 time=0.175 ms 14:15:33
64 bytes from 172.21.0.181: icmp_seq=51330 ttl=64 time=0.169 ms 14:15:34
64 bytes from 172.21.0.181: icmp_seq=51331 ttl=64 time=0.191 ms 14:15:35
64 bytes from 172.21.0.181: icmp_seq=51332 ttl=64 time=0.171 ms 14:15:36
64 bytes from 172.21.0.181: icmp_seq=51333 ttl=64 time=0.192 ms 14:15:37
64 bytes from 172.21.0.181: icmp_seq=51334 ttl=64 time=0.190 ms 14:15:38
64 bytes from 172.21.0.181: icmp_seq=51335 ttl=64 time=0.188 ms 14:15:39
64 bytes from 172.21.0.181: icmp_seq=51336 ttl=64 time=0.189 ms 14:15:40
64 bytes from 172.21.0.181: icmp_seq=51337 ttl=64 time=0.179 ms 14:15:41
64 bytes from 172.21.0.181: icmp_seq=51338 ttl=64 time=0.175 ms 14:15:42
64 bytes from 172.21.0.181: icmp_seq=51339 ttl=64 time=0.171 ms 14:15:43
64 bytes from 172.21.0.181: icmp_seq=51340 ttl=64 time=0.146 ms 14:15:44
64 bytes from 172.21.0.181: icmp_seq=51341 ttl=64 time=0.136 ms 14:15:45
64 bytes from 172.21.0.181: icmp_seq=51342 ttl=64 time=0.124 ms 14:15:46
64 bytes from 172.21.0.181: icmp_seq=51343 ttl=64 time=0.130 ms 14:15:47
64 bytes from 172.21.0.181: icmp_seq=51344 ttl=64 time=0.136 ms 14:15:48
64 bytes from 172.21.0.181: icmp_seq=51345 ttl=64 time=0.129 ms 14:15:49
64 bytes from 172.21.0.181: icmp_seq=51346 ttl=64 time=0.175 ms 14:15:50
64 bytes from 172.21.0.181: icmp_seq=51347 ttl=64 time=0.168 ms 14:15:51
64 bytes from 172.21.0.181: icmp_seq=51348 ttl=64 time=0.138 ms 14:15:52
64 bytes from 172.21.0.181: icmp_seq=51349 ttl=64 time=0.144 ms 14:15:53
64 bytes from 172.21.0.181: icmp_seq=51350 ttl=64 time=0.152 ms 14:15:54
64 bytes from 172.21.0.181: icmp_seq=51351 ttl=64 time=0.135 ms 14:15:55
64 bytes from 172.21.0.181: icmp_seq=51352 ttl=64 time=0.134 ms 14:15:56
64 bytes from 172.21.0.181: icmp_seq=51353 ttl=64 time=0.133 ms 14:15:57
64 bytes from 172.21.0.181: icmp_seq=51354 ttl=64 time=0.177 ms 14:15:58
64 bytes from 172.21.0.181: icmp_seq=51355 ttl=64 time=0.127 ms 14:15:59

3: Given this problem, I plan to upgrade Ignite from 1.6 to 1.7.
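
One way to check ping output like the log above for missing replies is
sketched below in Python. It assumes every reply line has the shape shown in
the log, with the HH:MM:SS clock time appended by the shell script; the
function names parse_ping_log and find_gaps are hypothetical. Note that the
posted excerpt jumps from icmp_seq=51251 (14:14:15) to icmp_seq=51296
(14:15:00). The sketch flags that jump, but the excerpt alone does not show
whether lines were trimmed before posting or replies were actually missing.

```python
import re

# Matches reply lines such as:
# 64 bytes from 172.21.0.181: icmp_seq=51251 ttl=64 time=0.141 ms 14:14:15
PING_LINE = re.compile(
    r"icmp_seq=(?P<seq>\d+) .*time=(?P<rtt>[\d.]+) ms (?P<clock>\d{2}:\d{2}:\d{2})"
)


def parse_ping_log(lines):
    """Return (seq, rtt_ms, seconds_since_midnight) for every reply line."""
    replies = []
    for line in lines:
        match = PING_LINE.search(line)
        if match is None:
            continue  # skip headers, blank lines, and anything unrecognized
        hours, minutes, seconds = (int(part) for part in match.group("clock").split(":"))
        replies.append(
            (int(match.group("seq")), float(match.group("rtt")),
             hours * 3600 + minutes * 60 + seconds)
        )
    return replies


def find_gaps(replies, max_step=1):
    """Return (prev_seq, next_seq, seconds_between) wherever icmp_seq jumps by more than max_step."""
    gaps = []
    for (prev_seq, _, prev_time), (next_seq, _, next_time) in zip(replies, replies[1:]):
        if next_seq - prev_seq > max_step:
            gaps.append((prev_seq, next_seq, next_time - prev_time))
    return gaps


# First three lines of the log posted above.
sample = [
    "64 bytes from 172.21.0.181: icmp_seq=51251 ttl=64 time=0.141 ms 14:14:15",
    "64 bytes from 172.21.0.181: icmp_seq=51296 ttl=64 time=0.196 ms 14:15:00",
    "64 bytes from 172.21.0.181: icmp_seq=51297 ttl=64 time=0.160 ms 14:15:01",
]

replies = parse_ping_log(sample)
print(find_gaps(replies))  # prints [(51251, 51296, 45)]
```

The same parsed tuples also carry the round-trip time, so a latency spike
check could be added in the same loop if needed.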




Re: Ignite Cluster node stopped

Posted by Vladislav Pyatkov <vl...@gmail.com>.
Hello,

This seems to be a network issue.
Please add an additional logger for the Discovery classes:
org.apache.ignite.spi.discovery

Also, check the network at the time the topology starts to segment.

On Wed, Aug 10, 2016 at 2:12 PM, suhuadong <su...@163.com> wrote:

> Hi,
> Today, all server nodes stopped. It happened on Aug 10, 2016, at 14:15.
>
> ignite-logs: ignite-logs.bz2
> <http://apache-ignite-users.70518.x6.nabble.com/file/n6917/ignite-logs.bz2>
>
> gc-logs: gc-logs.bz2
> <http://apache-ignite-users.70518.x6.nabble.com/file/n6917/gc-logs.bz2>
>



-- 
Vladislav Pyatkov

Re: Ignite Cluster node stopped

Posted by suhuadong <su...@163.com>.
Hi,
Today, all server nodes stopped. It happened on Aug 10, 2016, at 14:15.

ignite-logs: ignite-logs.bz2
<http://apache-ignite-users.70518.x6.nabble.com/file/n6917/ignite-logs.bz2>

gc-logs: gc-logs.bz2
<http://apache-ignite-users.70518.x6.nabble.com/file/n6917/gc-logs.bz2>







Re: Ignite Cluster node stopped

Posted by suhuadong <su...@163.com>.
Hello,

Sorry, I did not describe this clearly.
My cluster has stopped many times.
One topology destruction took place at 22:03:38, but the log files I attached
do not contain it.
Another took place at 21:26:55, and the log files I attached do contain it.

The stopped server node's IP is 172.21.0.183.
The stopped server node's ID is ac184004.

The topology destruction at 21:26:55 is recorded in
ignite-ac184004.0.log-172.21.0.183.
ignite-ac184004.0.log-172.21.0.183:

[21:26:55,427][WARNING][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] Node
is out of topology (probably, due to short-time network problems).
[21:26:55,430][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Local node SEGMENTED: TcpDiscoveryNode
[id=ac184004-f488-40be-9dd5-dc54632bcbbe, addrs=[124.250.36.149, 127.0.0.1,
172.21.0.183], sockAddrs=[/172.21.0.183:58600, /124.250.36.149:58600,
/124.250.36.149:58600, /127.0.0.1:58600, /172.21.0.183:58600],
discPort=58600, order=526, intOrder=275, lastExchangeTime=1470144415424,
loc=true, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,466][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Stopping local node according to configured segmentation policy.
[21:26:55,467][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=588dbb39-a124-4879-a476-3bb723e1e4d9,
addrs=[124.250.36.148, 127.0.0.1, 172.21.0.182],
sockAddrs=[/124.250.36.148:58600, /124.250.36.148:58600,
/124.250.36.148:58600, /127.0.0.1:58600, /172.21.0.182:58600],
discPort=58600, order=244, intOrder=132, lastExchangeTime=1470103699289,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,470][INFO][disco-event-worker-#144%null%][GridDiscoveryManager]
Topology snapshot [ver=567, servers=1, clients=0, CPUs=32, heap=60.0GB]
[21:26:55,471][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=ea39b0d0-fb6b-483b-86d5-d2cfda7c0abf,
addrs=[127.0.0.1, 172.21.0.39], sockAddrs=[/172.21.0.39:58600,
/127.0.0.1:58600, /172.21.0.39:58600], discPort=58600, order=251,
intOrder=136, lastExchangeTime=1470103699833, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,473][INFO][Thread-69][GridTcpRestProtocol] Command protocol
successfully stopped: TCP binary
[21:26:55,474][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=51421fee-0d08-4f46-82ea-1a12f8d80f6f,
addrs=[127.0.0.1, 172.21.0.40], sockAddrs=[/172.21.0.40:58600,
/127.0.0.1:58600, /172.21.0.40:58600], discPort=58600, order=257,
intOrder=139, lastExchangeTime=1470103699833, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,475][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=14d75a27-f8ae-4da6-9285-f6b6abfe645c,
addrs=[127.0.0.1, 172.21.0.41], sockAddrs=[/172.21.0.41:58600,
/127.0.0.1:58600, /172.21.0.41:58600], discPort=58600, order=263,
intOrder=142, lastExchangeTime=1470103699833, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,475][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=2333a45f-4d81-423d-90fe-445efc233883,
addrs=[127.0.0.1, 172.21.0.42], sockAddrs=[/172.21.0.42:58600,
/127.0.0.1:58600, /172.21.0.42:58600], discPort=58600, order=265,
intOrder=143, lastExchangeTime=1470103699833, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,476][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=f67c2596-bf28-41d5-b79d-7e351b72768f,
addrs=[127.0.0.1, 172.21.0.43], sockAddrs=[/172.21.0.43:58600,
/127.0.0.1:58600, /172.21.0.43:58600], discPort=58600, order=267,
intOrder=144, lastExchangeTime=1470103699833, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,477][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=ddcbfdba-cf93-415c-a914-e84ac729ccc4,
addrs=[124.250.36.139, 127.0.0.1, 172.21.0.173],
sockAddrs=[/124.250.36.139:0, /124.250.36.139:0, /124.250.36.139:0,
/127.0.0.1:0, /172.21.0.173:0], discPort=0, order=299, intOrder=160,
lastExchangeTime=1470103699833, loc=false, ver=1.6.0#20160518-sha1:0b22c45b,
isClient=true]
[21:26:55,478][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=36e21ab4-2d96-4e22-8496-755ae90253ba,
addrs=[124.250.36.70, 124.250.36.74, 127.0.0.1, 172.21.0.77],
sockAddrs=[/124.250.36.70:0, /124.250.36.70:0, /124.250.36.74:0,
/124.250.36.74:0, /127.0.0.1:0, /127.0.0.1:0, /172.21.0.77:0], discPort=0,
order=307, intOrder=164, lastExchangeTime=1470103700085, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,478][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=048e6c2a-7520-4197-8074-a65beefac047,
addrs=[124.250.36.73, 124.250.36.74, 127.0.0.1, 172.21.0.80],
sockAddrs=[/124.250.36.73:0, /124.250.36.73:0, /124.250.36.73:0,
/124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.80:0],
discPort=0, order=313, intOrder=167, lastExchangeTime=1470103700360,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,479][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=0a8abdd3-ebe9-4bf3-a1af-ffaa40f8c3ac,
addrs=[124.250.36.196, 124.250.36.74, 127.0.0.1, 172.21.0.223],
sockAddrs=[/172.21.0.223:0, /124.250.36.196:0, /124.250.36.196:0,
/124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.223:0],
discPort=0, order=319, intOrder=170, lastExchangeTime=1470103700511,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,480][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=aef17bf7-ef0d-4a9b-8d7b-a87ffa42afbf,
addrs=[124.250.36.69, 124.250.36.74, 124.250.36.74, 127.0.0.1, 172.21.0.76],
sockAddrs=[/124.250.36.69:0, /124.250.36.69:0, /124.250.36.74:0,
/124.250.36.74:0, /124.250.36.74:0, /124.250.36.74:0,
test.com/69.172.200.235:0, /127.0.0.1:0, /172.21.0.76:0], discPort=0,
order=358, intOrder=188, lastExchangeTime=1470103700511, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,481][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=a51432a3-25e8-41e1-9201-ad626852af80,
addrs=[124.250.36.72, 124.250.36.74, 127.0.0.1, 172.21.0.79],
sockAddrs=[/124.250.36.72:0, /124.250.36.72:0, /124.250.36.72:0,
/124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.79:0],
discPort=0, order=359, intOrder=189, lastExchangeTime=1470103702071,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,481][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=1178eb68-3025-4cbf-8515-d34e5a28a452,
addrs=[124.250.36.71, 124.250.36.74, 127.0.0.1, 172.21.0.78],
sockAddrs=[/124.250.36.71:0, /124.250.36.71:0, /124.250.36.74:0,
/124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.78:0],
discPort=0, order=360, intOrder=190, lastExchangeTime=1470103702957,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,483][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=89f1d6cd-8c40-4465-86e7-b2169bb59e8c,
addrs=[124.250.36.197, 124.250.36.74, 127.0.0.1, 172.21.0.224],
sockAddrs=[/172.21.0.224:0, /124.250.36.197:0, /124.250.36.74:0,
/124.250.36.74:0, /124.250.36.197:0, /127.0.0.1:0, /172.21.0.224:0],
discPort=0, order=362, intOrder=192, lastExchangeTime=1470103703048,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,483][INFO][Thread-69][GridJettyRestProtocol] Command protocol
successfully stopped: Jetty REST
[21:26:55,483][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=4e040db4-6a6e-4f90-ad1c-a4100a28b792,
addrs=[124.250.36.185, 124.250.36.45, 127.0.0.1, 172.21.0.35],
sockAddrs=[/172.21.0.35:0, /124.250.36.185:0, /124.250.36.45:0,
/124.250.36.45:0, /124.250.36.185:0, /127.0.0.1:0, /172.21.0.35:0],
discPort=0, order=366, intOrder=194, lastExchangeTime=1470103703048,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,484][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=2f84fb8b-ec6c-435a-8175-bc87e3db252c,
addrs=[124.250.36.185, 124.250.36.185, 124.250.36.46, 127.0.0.1,
172.21.0.36], sockAddrs=[/124.250.36.185:0, /124.250.36.185:0,
/124.250.36.185:0, /124.250.36.185:0, /124.250.36.46:0, /124.250.36.46:0,
/124.250.36.185:0, /127.0.0.1:0, /172.21.0.36:0], discPort=0, order=372,
intOrder=197, lastExchangeTime=1470103703048, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,484][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=b79cced2-1175-443b-8e7b-b73b2a622d5f,
addrs=[127.0.0.1, 172.21.0.38], sockAddrs=[/172.21.0.38:58600,
/127.0.0.1:58600, /172.21.0.38:58600], discPort=58600, order=390,
intOrder=207, lastExchangeTime=1470103703078, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,485][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=f0f33c05-7173-4a38-97ed-74f709514b32,
addrs=[124.250.36.223, 127.0.0.1, 172.21.0.127], sockAddrs=[/172.21.0.127:0,
/124.250.36.223:0, /124.250.36.223:0, /127.0.0.1:0, /172.21.0.127:0],
discPort=0, order=427, intOrder=225, lastExchangeTime=1470103703078,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,486][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=ec94b23e-f3d7-4e6d-aa22-570a248f7172,
addrs=[124.250.36.222, 127.0.0.1, 172.21.0.126], sockAddrs=[/172.21.0.126:0,
/124.250.36.222:0, /124.250.36.222:0, /127.0.0.1:0, /172.21.0.126:0],
discPort=0, order=431, intOrder=227, lastExchangeTime=1470103703088,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,486][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=0687433b-a7a2-47db-ba38-6d1d4f34f82e,
addrs=[124.250.36.147, 127.0.0.1, 172.21.0.181],
sockAddrs=[/172.21.0.181:58600, /124.250.36.147:58600,
/124.250.36.147:58600, /127.0.0.1:58600, /172.21.0.181:58600],
discPort=58600, order=440, intOrder=232, lastExchangeTime=1470103703088,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
[21:26:55,487][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=0e163fa7-091d-4804-8e8b-95feb1f0e2ef,
addrs=[124.250.36.47, 127.0.0.1, 172.21.0.37, 33.33.33.1],
sockAddrs=[/33.33.33.1:0, /124.250.36.47:0, /172.21.0.37:0, /127.0.0.1:0,
/124.250.36.47:0, /172.21.0.37:0, /33.33.33.1:0], discPort=0, order=536,
intOrder=280, lastExchangeTime=1470135253093, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
[21:26:55,856][INFO][Thread-69][GridCacheProcessor] Stopped cache:
bjdqAppRecommendingCache
[21:26:55,858][INFO][Thread-69][GridCacheProcessor] Stopped cache:
bjdqAppRecommendedCache
[21:26:55,859][INFO][Thread-69][GridCacheProcessor] Stopped cache:
yiCheAppRecommendingCache
[21:26:55,860][INFO][Thread-69][GridCacheProcessor] Stopped cache:
yiCheAppRecommendedCache
[21:26:55,861][INFO][Thread-69][GridCacheProcessor] Stopped cache: idfaCache
[21:26:55,862][INFO][Thread-69][GridCacheProcessor] Stopped cache:
clickedArticleCache
[21:26:55,862][INFO][Thread-69][GridCacheProcessor] Stopped cache:
deviceCache
[21:26:55,863][INFO][Thread-69][GridCacheProcessor] Stopped cache:
ignite-marshaller-sys-cache
[21:26:55,863][INFO][Thread-69][GridCacheProcessor] Stopped cache:
ignite-sys-cache
[21:26:55,864][INFO][Thread-69][GridCacheProcessor] Stopped cache:
ignite-atomics-sys-cache
[21:26:55,867][INFO][Thread-69][GridDeploymentLocalStore] Removed undeployed
class: GridDeployment [ts=1470103704338, depMode=SHARED,
clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2,
clsLdrId=16432094651-ac184004-f488-40be-9dd5-dc54632bcbbe, userVer=0,
loc=true,
sampleClsName=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionFullMap,
pendingUndeploy=false, undeployed=true, usage=0]
[21:26:55,880][INFO][Thread-69][IgniteKernal] 
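
A log excerpt like the one above can be summarized with a short script. The
Python sketch below assumes each entry starts with
[time][level][thread][logger] as in the excerpt and that wrapped lines belong
to the preceding entry. The function names read_entries and summarize are
hypothetical, and the sample is an abbreviated version of three entries from
the log, with most fields removed.

```python
import re

# An entry starts with [HH:MM:SS,mmm][LEVEL][thread][Logger]; any other line
# (including ones that begin with "[id=") is a wrapped continuation.
ENTRY_START = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2},\d{3})\]\[(\w+)\]\[([^\]]*)\]\[(\w+)\] ?(.*)$"
)


def read_entries(lines):
    """Group wrapped lines into (time, level, logger, message) tuples."""
    entries = []
    for line in lines:
        match = ENTRY_START.match(line)
        if match:
            time, level, _thread, logger, text = match.groups()
            entries.append([time, level, logger, text.strip()])
        elif entries and line.strip():
            # Re-join a wrapped line with the entry it belongs to.
            entries[-1][3] = (entries[-1][3] + " " + line.strip()).strip()
    return [tuple(entry) for entry in entries]


def summarize(entries):
    """Report when the local node was segmented and how many nodes it marked as failed."""
    summary = {"segmented_at": None, "failed_servers": 0, "failed_clients": 0}
    for time, _level, logger, message in entries:
        if logger != "GridDiscoveryManager":
            continue
        if message.startswith("Local node SEGMENTED"):
            if summary["segmented_at"] is None:
                summary["segmented_at"] = time
        elif message.startswith("Node FAILED"):
            key = "failed_clients" if "isClient=true" in message else "failed_servers"
            summary[key] += 1
    return summary


# Abbreviated entries based on the log above (most fields removed).
sample = [
    "[21:26:55,430][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]",
    "Local node SEGMENTED: TcpDiscoveryNode",
    "[id=ac184004-f488-40be-9dd5-dc54632bcbbe, loc=true, isClient=false]",
    "[21:26:55,467][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]",
    "Node FAILED: TcpDiscoveryNode [id=588dbb39-a124-4879-a476-3bb723e1e4d9,",
    "loc=false, isClient=false]",
    "[21:26:55,477][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]",
    "Node FAILED: TcpDiscoveryNode [id=ddcbfdba-cf93-415c-a914-e84ac729ccc4,",
    "loc=false, isClient=true]",
]

print(summarize(read_entries(sample)))
```

Passing an open log file to read_entries instead of the sample list works the
same way, since each line is stripped before it is joined.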





Re: Ignite Cluster node stopped

Posted by Vladislav Pyatkov <vl...@gmail.com>.
Hello,

OK, the topology destruction took place at 22:03:38, but I do not see any
entries at that time, neither in the GC log (172.21.0.183-gc.log) nor in
ignite-ac184004.0.log-172.21.0.183. I also cannot find the relevant entries in
the other files.

The logs are needed to investigate and to understand what to do next.
Please check (by timestamp) that all the files contain entries from the time
of the issue.
Can you attach the needed log files?

On Tue, Aug 9, 2016 at 4:26 AM, suhuadong <su...@163.com> wrote:

> Hi,
> GC logs from all nine nodes:
>
> http://apache-ignite-users.70518.x6.nabble.com/file/n6758/gc-logs.bz2
> Ignite logs from all nine nodes:
>
> http://apache-ignite-users.70518.x6.nabble.com/file/n6757/ignite-logs.bz2
>
> I allocate 60 GB of memory to the JVM; the JVM uses about 30 GB.
>
> My JVM_OPTS:
> -server -Xms60g -Xmx60g -Djava.net.preferIPv4Stack=true -XX:+UseG1GC
> -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=30
> -XX:ConcGCThreads=8 -XX:+UseTLAB -XX:+DisableExplicitGC -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation
> -XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=100M
> -Xloggc:/data/ignite-1.6/gc/logs/log.txt
>
> My ignite config file:
>
> <beans xmlns="http://www.springframework.org/schema/beans"
>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>        xsi:schemaLocation="
>         http://www.springframework.org/schema/beans
>         http://www.springframework.org/schema/beans/spring-beans.xsd">
>
>     <bean id="cassandraAdminCredentials"
>           class="org.apache.ignite.cache.store.cassandra.datasource.PlainCredentials">
>         <constructor-arg index="0" value=""/>
>         <constructor-arg index="1" value=""/>
>     </bean>
>     <bean id="loadBalancingPolicy"
>           class="com.datastax.driver.core.policies.RoundRobinPolicy"/>
>     <bean id="device_persistence_settings"
>           class="org.apache.ignite.cache.store.cassandra.persistence.KeyValuePersistenceSettings">
>         <constructor-arg type="org.springframework.core.io.Resource"
>                          value="classpath:com/yiche/abraham/domain/persistence-device.xml"/>
>     </bean>
>     <bean id="click_persistence_settings"
>           class="org.apache.ignite.cache.store.cassandra.persistence.KeyValuePersistenceSettings">
>         <constructor-arg type="org.springframework.core.io.Resource"
>                          value="classpath:com/yiche/abraham/domain/persistence-click.xml"/>
>     </bean>
>     <bean id="idfa_persistence_settings"
>           class="org.apache.ignite.cache.store.cassandra.persistence.KeyValuePersistenceSettings">
>         <constructor-arg type="org.springframework.core.io.Resource"
>                          value="classpath:com/yiche/abraham/domain/persistence-idfa.xml"/>
>     </bean>
>
>     <bean id="cassandraRegularDataSource"
>           class="org.apache.ignite.cache.store.cassandra.datasource.DataSource">
>         <property name="credentials" ref="cassandraAdminCredentials"/>
>
>         <property name="contactPoints">
>             <list>
>                 <value>172.21.0.177</value>
>                 <value>172.21.0.178</value>
>                 <value>172.21.0.179</value>
>                 <value>172.21.0.180</value>
>             </list>
>         </property>
>         <property name="readConsistency" value="ONE"/>
>         <property name="writeConsistency" value="ONE"/>
>         <property name="loadBalancingPolicy" ref="loadBalancingPolicy"/>
>     </bean>
>
>     <bean id="ignite.cfg"
>           class="org.apache.ignite.configuration.IgniteConfiguration">
>         <property name="peerClassLoadingEnabled" value="true"/>
>         <property name="cacheConfiguration">
>             <list>
>
>                 <bean class="org.apache.ignite.configuration.CacheConfiguration">
>                     <property name="name" value="deviceCache"/>
>
>                     <property name="writeSynchronizationMode" value="PRIMARY_SYNC"/>
>
>                     <property name="cacheMode" value="PARTITIONED"/>
>                     <property name="atomicityMode" value="ATOMIC"/>
>                     <property name="backups" value="0"/>
>                     <property name="readThrough" value="true"/>
>                     <property name="writeThrough" value="true"/>
>                     <property name="cacheStoreFactory">
>                         <bean class="org.apache.ignite.cache.store.cassandra.CassandraCacheStoreFactory">
>                             <property name="dataSourceBean" value="cassandraRegularDataSource"/>
>                             <property name="persistenceSettingsBean" value="device_persistence_settings"/>
>                         </bean>
>                     </property>
>
>                     <property name="memoryMode" value="ONHEAP_TIERED"/>
>                     <property name="evictionPolicy">
>                         <bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
>                             <property name="maxSize" value="1000000"/>
>                         </bean>
>                     </property>
>
>                 </bean>
>
>                 <bean class="org.apache.ignite.configuration.CacheConfiguration">
>                     <property name="name" value="clickedArticleCache"/>
>
>                     <property name="memoryMode" value="ONHEAP_TIERED"/>
>                     <property name="evictionPolicy">
>                         <bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
>                             <property name="maxSize" value="1000000"/>
>                         </bean>
>                     </property>
>
>                     <property name="writeSynchronizationMode" value="PRIMARY_SYNC"/>
>
>                     <property name="cacheMode" value="PARTITIONED"/>
>                     <property name="atomicityMode" value="ATOMIC"/>
>                     <property name="backups" value="0"/>
>
>                     <property name="readThrough" value="true"/>
>                     <property name="writeThrough" value="true"/>
>                     <property name="cacheStoreFactory">
>                         <bean class="org.apache.ignite.cache.store.cassandra.CassandraCacheStoreFactory">
>                             <property name="dataSourceBean" value="cassandraRegularDataSource"/>
>                             <property name="persistenceSettingsBean" value="click_persistence_settings"/>
>                         </bean>
>                     </property>
>
>                 </bean>
>                 <bean class="org.apache.ignite.configuration.CacheConfiguration">
>                     <property name="name" value="idfaCache"/>
>
>                     <property name="memoryMode" value="ONHEAP_TIERED"/>
>                     <property name="evictionPolicy">
>                         <bean class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
>                             <property name="maxSize" value="1000000"/>
>                         </bean>
>                     </property>
>
>                     <property name="writeSynchronizationMode" value="PRIMARY_SYNC"/>
>                     <property name="cacheMode" value="PARTITIONED"/>
>                     <property name="atomicityMode" value="ATOMIC"/>
>                     <property name="backups" value="0"/>
>
>                     <property name="readThrough" value="true"/>
>                     <property name="writeThrough" value="true"/>
>                     <property name="cacheStoreFactory">
>
>                         <bean
> class="org.apache.ignite.cache.store.cassandra.
> CassandraCacheStoreFactory">
>                             <property name="dataSourceBean"
> value="cassandraRegularDataSource"/>
>                             <property name="persistenceSettingsBean"
> value="idfa_persistence_settings"/>
>                         </bean>
>                     </property>
>                 </bean>
>
>                 <bean
> class="org.apache.ignite.configuration.CacheConfiguration">
>                     <property name="name" value="
> yiCheAppRecommendedCache"/>
>
>                     <property name="memoryMode" value="ONHEAP_TIERED"/>
>                     <property name="evictionPolicy">
>
>                         <bean
> class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
>
>                             <property name="maxSize" value="2000000"/>
>                         </bean>
>                     </property>
>
>                     <property name="writeSynchronizationMode"
> value="PRIMARY_SYNC"/>
>
>                     <property name="cacheMode" value="PARTITIONED"/>
>                     <property name="atomicityMode" value="ATOMIC"/>
>                     <property name="backups" value="1"/>
>                 </bean>
>
>                 <bean
> class="org.apache.ignite.configuration.CacheConfiguration">
>
>                     <property name="memoryMode" value="ONHEAP_TIERED"/>
>                     <property name="evictionPolicy">
>
>                         <bean
> class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
>
>                             <property name="maxSize" value="500000"/>
>                         </bean>
>                     </property>
>
>                     <property name="name"
> value="yiCheAppRecommendingCache"/>
>
>                     <property name="writeSynchronizationMode"
> value="PRIMARY_SYNC"/>
>
>                     <property name="cacheMode" value="PARTITIONED"/>
>                     <property name="atomicityMode" value="ATOMIC"/>
>                     <property name="backups" value="0"/>
>
>                 </bean>
>
>                 <bean
> class="org.apache.ignite.configuration.CacheConfiguration">
>                     <property name="name" value="
> bjdqAppRecommendedCache"/>
>
>                     <property name="memoryMode" value="ONHEAP_TIERED"/>
>                     <property name="evictionPolicy">
>
>                         <bean
> class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
>
>                             <property name="maxSize" value="2000000"/>
>                         </bean>
>                     </property>
>
>
>                     <property name="writeSynchronizationMode"
> value="PRIMARY_SYNC"/>
>
>                     <property name="cacheMode" value="PARTITIONED"/>
>                     <property name="atomicityMode" value="ATOMIC"/>
>                     <property name="backups" value="1"/>
>
>
>                 </bean>
>
>                 <bean
> class="org.apache.ignite.configuration.CacheConfiguration">
>                     <property name="name" value="
> bjdqAppRecommendingCache"/>
>
>                     <property name="memoryMode" value="ONHEAP_TIERED"/>
>                     <property name="evictionPolicy">
>
>                         <bean
> class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
>
>                             <property name="maxSize" value="500000"/>
>                         </bean>
>                     </property>
>
>
>                     <property name="writeSynchronizationMode"
> value="PRIMARY_SYNC"/>
>
>                     <property name="cacheMode" value="PARTITIONED"/>
>                     <property name="atomicityMode" value="ATOMIC"/>
>                     <property name="backups" value="0"/>
>                 </bean>
>
>             </list>
>         </property>
>
>
>         <property name="discoverySpi">
>             <bean
> class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>
>
>
>                 <property name="localPort" value="58600"/>
>
>
>                 <property name="localPortRange" value="20"/>
>
>                 <property name="ipFinder">
>
>
>
>                     <bean
> class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.
> TcpDiscoveryVmIpFinder">
>
>                         <property name="addresses">
>                             <list>
>
>
>                                 <value>172.21.0.181:58600..58620</value>
>                                 <value>172.21.0.182:58600..58620</value>
>                                 <value>172.21.0.183:58600..58620</value>
>                             </list>
>                         </property>
>                     </bean>
>                 </property>
>             </bean>
>         </property>
>
>
>         <property name="communicationSpi">
>             <bean
> class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
>                 <property name="localPort" value="58200"/>
>                 <property name="localAddress" value="172.21.0.183"/>
>             </bean>
>         </property>
>     </bean>
> </beans>
>
>
>
> The stoped server node'ip is 172.20.0.183.
>
> work/log:
>
> [22:03:38,100][WARNING][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi]
> Node
> is out of topology (probably, due to short-time network problems).
> [22:03:38,103][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Local node SEGMENTED: TcpDiscoveryNode
> [id=424ef276-c1b6-48b0-9ded-c6fca0997502, addrs=[124.250.36.149,
> 127.0.0.1,
> 172.21.0.183], sockAddrs=[/172.21.0.183:58600, /124.250.36.149:58600,
> /124.250.36.149:58600, /127.0.0.1:58600, /172.21.0.183:58600],
> discPort=58600, order=485, intOrder=255, lastExchangeTime=1469714618102,
> loc=true, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,134][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Stopping local node according to configured segmentation policy.
> [22:03:38,135][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=588dbb39-a124-4879-a476-3bb723e1e4d9,
> addrs=[124.250.36.148, 127.0.0.1, 172.21.0.182],
> sockAddrs=[/124.250.36.148:58600, /124.250.36.148:58600,
> /124.250.36.148:58600, /127.0.0.1:58600, /172.21.0.182:58600],
> discPort=58600, order=244, intOrder=132, lastExchangeTime=1469624391582,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,138][INFO][disco-event-worker-#144%null%][GridDiscoveryManager]
> Topology snapshot [ver=527, servers=1, clients=0, CPUs=32, heap=60.0GB]
> [22:03:38,139][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=ea39b0d0-fb6b-483b-86d5-d2cfda7c0abf,
> addrs=[127.0.0.1, 172.21.0.39], sockAddrs=[/172.21.0.39:58600,
> /127.0.0.1:58600, /172.21.0.39:58600], discPort=58600, order=251,
> intOrder=136, lastExchangeTime=1469624391582, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,143][INFO][Thread-117][GridTcpRestProtocol] Command protocol
> successfully stopped: TCP binary
> [22:03:38,144][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=51421fee-0d08-4f46-82ea-1a12f8d80f6f,
> addrs=[127.0.0.1, 172.21.0.40], sockAddrs=[/172.21.0.40:58600,
> /127.0.0.1:58600, /172.21.0.40:58600], discPort=58600, order=257,
> intOrder=139, lastExchangeTime=1469624391582, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,145][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=14d75a27-f8ae-4da6-9285-f6b6abfe645c,
> addrs=[127.0.0.1, 172.21.0.41], sockAddrs=[/172.21.0.41:58600,
> /127.0.0.1:58600, /172.21.0.41:58600], discPort=58600, order=263,
> intOrder=142, lastExchangeTime=1469624391582, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,146][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=2333a45f-4d81-423d-90fe-445efc233883,
> addrs=[127.0.0.1, 172.21.0.42], sockAddrs=[/172.21.0.42:58600,
> /127.0.0.1:58600, /172.21.0.42:58600], discPort=58600, order=265,
> intOrder=143, lastExchangeTime=1469624391582, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,146][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=f67c2596-bf28-41d5-b79d-7e351b72768f,
> addrs=[127.0.0.1, 172.21.0.43], sockAddrs=[/172.21.0.43:58600,
> /127.0.0.1:58600, /172.21.0.43:58600], discPort=58600, order=267,
> intOrder=144, lastExchangeTime=1469624391582, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,147][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=ddcbfdba-cf93-415c-a914-e84ac729ccc4,
> addrs=[124.250.36.139, 127.0.0.1, 172.21.0.173],
> sockAddrs=[/124.250.36.139:0, /124.250.36.139:0, /124.250.36.139:0,
> /127.0.0.1:0, /172.21.0.173:0], discPort=0, order=299, intOrder=160,
> lastExchangeTime=1469624391582, loc=false, ver=1.6.0#20160518-sha1:
> 0b22c45b,
> isClient=true]
> [22:03:38,148][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=36e21ab4-2d96-4e22-8496-755ae90253ba,
> addrs=[124.250.36.70, 124.250.36.74, 127.0.0.1, 172.21.0.77],
> sockAddrs=[/124.250.36.70:0, /124.250.36.70:0, /124.250.36.74:0,
> /124.250.36.74:0, /127.0.0.1:0, /127.0.0.1:0, /172.21.0.77:0], discPort=0,
> order=307, intOrder=164, lastExchangeTime=1469624391582, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,150][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=048e6c2a-7520-4197-8074-a65beefac047,
> addrs=[124.250.36.73, 124.250.36.74, 127.0.0.1, 172.21.0.80],
> sockAddrs=[/124.250.36.73:0, /124.250.36.73:0, /124.250.36.73:0,
> /124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.80:0],
> discPort=0, order=313, intOrder=167, lastExchangeTime=1469624391582,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,150][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=0a8abdd3-ebe9-4bf3-a1af-ffaa40f8c3ac,
> addrs=[124.250.36.196, 124.250.36.74, 127.0.0.1, 172.21.0.223],
> sockAddrs=[/172.21.0.223:0, /124.250.36.196:0, /124.250.36.196:0,
> /124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.223:0],
> discPort=0, order=319, intOrder=170, lastExchangeTime=1469624391582,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,151][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=aef17bf7-ef0d-4a9b-8d7b-a87ffa42afbf,
> addrs=[124.250.36.69, 124.250.36.74, 124.250.36.74, 127.0.0.1,
> 172.21.0.76],
> sockAddrs=[/124.250.36.69:0, /124.250.36.69:0, /124.250.36.74:0,
> /124.250.36.74:0, /124.250.36.74:0, /124.250.36.74:0,
> test.com/69.172.200.235:0, /127.0.0.1:0, /172.21.0.76:0], discPort=0,
> order=358, intOrder=188, lastExchangeTime=1469624391592, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,152][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=a51432a3-25e8-41e1-9201-ad626852af80,
> addrs=[124.250.36.72, 124.250.36.74, 127.0.0.1, 172.21.0.79],
> sockAddrs=[/124.250.36.72:0, /124.250.36.72:0, /124.250.36.72:0,
> /124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.79:0],
> discPort=0, order=359, intOrder=189, lastExchangeTime=1469624391592,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,152][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=1178eb68-3025-4cbf-8515-d34e5a28a452,
> addrs=[124.250.36.71, 124.250.36.74, 127.0.0.1, 172.21.0.78],
> sockAddrs=[/124.250.36.71:0, /124.250.36.71:0, /124.250.36.74:0,
> /124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.78:0],
> discPort=0, order=360, intOrder=190, lastExchangeTime=1469624391592,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,153][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=89f1d6cd-8c40-4465-86e7-b2169bb59e8c,
> addrs=[124.250.36.197, 124.250.36.74, 127.0.0.1, 172.21.0.224],
> sockAddrs=[/172.21.0.224:0, /124.250.36.197:0, /124.250.36.74:0,
> /124.250.36.74:0, /124.250.36.197:0, /127.0.0.1:0, /172.21.0.224:0],
> discPort=0, order=362, intOrder=192, lastExchangeTime=1469624391592,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,154][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=4e040db4-6a6e-4f90-ad1c-a4100a28b792,
> addrs=[124.250.36.185, 124.250.36.45, 127.0.0.1, 172.21.0.35],
> sockAddrs=[/172.21.0.35:0, /124.250.36.185:0, /124.250.36.45:0,
> /124.250.36.45:0, /124.250.36.185:0, /127.0.0.1:0, /172.21.0.35:0],
> discPort=0, order=366, intOrder=194, lastExchangeTime=1469624391592,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,154][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=2f84fb8b-ec6c-435a-8175-bc87e3db252c,
> addrs=[124.250.36.185, 124.250.36.185, 124.250.36.46, 127.0.0.1,
> 172.21.0.36], sockAddrs=[/124.250.36.185:0, /124.250.36.185:0,
> /124.250.36.185:0, /124.250.36.185:0, /124.250.36.46:0, /124.250.36.46:0,
> /124.250.36.185:0, /127.0.0.1:0, /172.21.0.36:0], discPort=0, order=372,
> intOrder=197, lastExchangeTime=1469624391592, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,156][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=b79cced2-1175-443b-8e7b-b73b2a622d5f,
> addrs=[127.0.0.1, 172.21.0.38], sockAddrs=[/172.21.0.38:58600,
> /127.0.0.1:58600, /172.21.0.38:58600], discPort=58600, order=390,
> intOrder=207, lastExchangeTime=1469624391592, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,156][INFO][Thread-117][GridJettyRestProtocol] Command protocol
> successfully stopped: Jetty REST
> [22:03:38,156][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=f0f33c05-7173-4a38-97ed-74f709514b32,
> addrs=[124.250.36.223, 127.0.0.1, 172.21.0.127], sockAddrs=[/
> 172.21.0.127:0,
> /124.250.36.223:0, /124.250.36.223:0, /127.0.0.1:0, /172.21.0.127:0],
> discPort=0, order=427, intOrder=225, lastExchangeTime=1469624388223,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,157][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=ec94b23e-f3d7-4e6d-aa22-570a248f7172,
> addrs=[124.250.36.222, 127.0.0.1, 172.21.0.126], sockAddrs=[/
> 172.21.0.126:0,
> /124.250.36.222:0, /124.250.36.222:0, /127.0.0.1:0, /172.21.0.126:0],
> discPort=0, order=431, intOrder=227, lastExchangeTime=1469624388234,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,158][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=0687433b-a7a2-47db-ba38-6d1d4f34f82e,
> addrs=[124.250.36.147, 127.0.0.1, 172.21.0.181],
> sockAddrs=[/172.21.0.181:58600, /124.250.36.147:58600,
> /124.250.36.147:58600, /127.0.0.1:58600, /172.21.0.181:58600],
> discPort=58600, order=440, intOrder=232, lastExchangeTime=1469624391592,
> loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
> [22:03:38,158][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=0d2fc522-5d2d-41b3-a0a4-98e12f3aa876,
> addrs=[124.250.36.47, 127.0.0.1, 172.21.0.37, 33.33.33.1],
> sockAddrs=[/33.33.33.1:0, /124.250.36.47:0, /172.21.0.37:0, /127.0.0.1:0,
> /124.250.36.47:0, /172.21.0.37:0, /33.33.33.1:0], discPort=0, order=496,
> intOrder=260, lastExchangeTime=1469676024755, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,159][WARNING][disco-event-worker-#144%null%][
> GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=bf64226a-7597-4bc8-866b-6d99c7e9f2aa,
> addrs=[124.250.36.47, 127.0.0.1, 172.21.0.37, 33.33.33.1],
> sockAddrs=[/33.33.33.1:0, /124.250.36.47:0, /172.21.0.37:0, /127.0.0.1:0,
> /124.250.36.47:0, /172.21.0.37:0, /33.33.33.1:0], discPort=0, order=503,
> intOrder=264, lastExchangeTime=1469698198271, loc=false,
> ver=1.6.0#20160518-sha1:0b22c45b, isClient=true]
> [22:03:38,561][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> bjdqAppRecommendingCache
> [22:03:38,563][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> bjdqAppRecommendedCache
> [22:03:38,564][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> yiCheAppRecommendingCache
> [22:03:38,564][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> yiCheAppRecommendedCache
> [22:03:38,565][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> idfaCache
> [22:03:38,566][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> clickedArticleCache
> [22:03:38,566][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> deviceCache
> [22:03:38,567][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> ignite-marshaller-sys-cache
> [22:03:38,567][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> ignite-sys-cache
> [22:03:38,568][INFO][Thread-117][GridCacheProcessor] Stopped cache:
> ignite-atomics-sys-cache
> [22:03:38,571][INFO][Thread-117][GridDeploymentLocalStore] Removed
> undeployed class: GridDeployment [ts=1469624393224, depMode=SHARED,
> clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2,
> clsLdrId=a1d707c2651-424ef276-c1b6-48b0-9ded-c6fca0997502, userVer=0,
> loc=true,
> sampleClsName=org.apache.ignite.internal.processors.cache.distributed.dht.
> preloader.GridDhtPartitionFullMap,
> pendingUndeploy=false, undeployed=true, usage=0]
> [22:03:38,585][INFO][Thread-117][IgniteKernal]
>
>
>
> --
> View this message in context: http://apache-ignite-users.
> 70518.x6.nabble.com/Ignite-Cluster-node-stopped-tp6608p6865.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>



-- 
Vladislav Pyatkov

Re: Ignite Cluster node stopped

Posted by suhuadong <su...@163.com>.
Hi,

GC logs from all nine nodes:
http://apache-ignite-users.70518.x6.nabble.com/file/n6758/gc-logs.bz2

Ignite logs from all nine nodes:
http://apache-ignite-users.70518.x6.nabble.com/file/n6757/ignite-logs.bz2

I allocated 60 GB of memory to the JVM; the JVM actually uses about 30 GB.

My JVM_OPTS: 
-server -Xms60g -Xmx60g -Djava.net.preferIPv4Stack=true -XX:+UseG1GC
-XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=30
-XX:ConcGCThreads=8 -XX:+UseTLAB -XX:+DisableExplicitGC -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=100M
-Xloggc:/data/ignite-1.6/gc/logs/log.txt 

My Ignite config file:

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation=" 
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean id="cassandraAdminCredentials"
class="org.apache.ignite.cache.store.cassandra.datasource.PlainCredentials">
        <constructor-arg index="0" value=""/>
        <constructor-arg index="1" value=""/>
    </bean>
    <bean id="loadBalancingPolicy"
class="com.datastax.driver.core.policies.RoundRobinPolicy"/>
    <bean id="device_persistence_settings"
class="org.apache.ignite.cache.store.cassandra.persistence.KeyValuePersistenceSettings">
        <constructor-arg type="org.springframework.core.io.Resource"
value="classpath:com/yiche/abraham/domain/persistence-device.xml"/>
    </bean>
    <bean id="click_persistence_settings"
class="org.apache.ignite.cache.store.cassandra.persistence.KeyValuePersistenceSettings">
        <constructor-arg type="org.springframework.core.io.Resource"
value="classpath:com/yiche/abraham/domain/persistence-click.xml"/>
    </bean>
    <bean id="idfa_persistence_settings"
class="org.apache.ignite.cache.store.cassandra.persistence.KeyValuePersistenceSettings">
        <constructor-arg type="org.springframework.core.io.Resource"
value="classpath:com/yiche/abraham/domain/persistence-idfa.xml"/>
    </bean>

    <bean id="cassandraRegularDataSource"
class="org.apache.ignite.cache.store.cassandra.datasource.DataSource">
        <property name="credentials" ref="cassandraAdminCredentials"/>

        <property name="contactPoints">
            <list>
                <value>172.21.0.177</value>
                <value>172.21.0.178</value>
                <value>172.21.0.179</value>
                <value>172.21.0.180</value>
            </list>
        </property>
        <property name="readConsistency" value="ONE"/>
        <property name="writeConsistency" value="ONE"/>
        <property name="loadBalancingPolicy" ref="loadBalancingPolicy"/>
    </bean>

    <bean id="ignite.cfg"
class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="peerClassLoadingEnabled" value="true"/>
        <property name="cacheConfiguration">
            <list>
                
                <bean
class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="deviceCache"/>
                    
                    <property name="writeSynchronizationMode"
value="PRIMARY_SYNC"/>
                    
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="backups" value="0"/>
                    <property name="readThrough" value="true"/>
                    <property name="writeThrough" value="true"/>
                    <property name="cacheStoreFactory">

                        <bean
class="org.apache.ignite.cache.store.cassandra.CassandraCacheStoreFactory">
                            <property name="dataSourceBean"
value="cassandraRegularDataSource"/>
                            <property name="persistenceSettingsBean"
value="device_persistence_settings"/>
                        </bean>
                    </property>
                    
                    <property name="memoryMode" value="ONHEAP_TIERED"/>
                    <property name="evictionPolicy">
                        
                        <bean
class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
                            
                            <property name="maxSize" value="1000000"/>
                        </bean>
                    </property>

                </bean>

                <bean
class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="clickedArticleCache"/>
                    
                    <property name="memoryMode" value="ONHEAP_TIERED"/>
                    <property name="evictionPolicy">
                        
                        <bean
class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
                            
                            <property name="maxSize" value="1000000"/>
                        </bean>
                    </property>

                    
                    <property name="writeSynchronizationMode"
value="PRIMARY_SYNC"/>
                    
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="backups" value="0"/>

                    <property name="readThrough" value="true"/>
                    <property name="writeThrough" value="true"/>
                    <property name="cacheStoreFactory">

                        <bean
class="org.apache.ignite.cache.store.cassandra.CassandraCacheStoreFactory">
                            <property name="dataSourceBean"
value="cassandraRegularDataSource"/>
                            <property name="persistenceSettingsBean"
value="click_persistence_settings"/>
                        </bean>
                    </property>

                </bean>
                <bean
class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="idfaCache"/>
                    
                    <property name="memoryMode" value="ONHEAP_TIERED"/>
                    <property name="evictionPolicy">
                        
                        <bean
class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
                            
                            <property name="maxSize" value="1000000"/>
                        </bean>
                    </property>


                    
                    <property name="writeSynchronizationMode"
value="PRIMARY_SYNC"/>
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="backups" value="0"/>

                    <property name="readThrough" value="true"/>
                    <property name="writeThrough" value="true"/>
                    <property name="cacheStoreFactory">

                        <bean
class="org.apache.ignite.cache.store.cassandra.CassandraCacheStoreFactory">
                            <property name="dataSourceBean"
value="cassandraRegularDataSource"/>
                            <property name="persistenceSettingsBean"
value="idfa_persistence_settings"/>
                        </bean>
                    </property>
                </bean>

                <bean
class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="yiCheAppRecommendedCache"/>
                    
                    <property name="memoryMode" value="ONHEAP_TIERED"/>
                    <property name="evictionPolicy">
                        
                        <bean
class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
                            
                            <property name="maxSize" value="2000000"/>
                        </bean>
                    </property>
                    
                    <property name="writeSynchronizationMode"
value="PRIMARY_SYNC"/>
                    
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="backups" value="1"/>
                </bean>

                <bean
class="org.apache.ignite.configuration.CacheConfiguration">
                    
                    <property name="memoryMode" value="ONHEAP_TIERED"/>
                    <property name="evictionPolicy">
                        
                        <bean
class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
                            
                            <property name="maxSize" value="500000"/>
                        </bean>
                    </property>

                    <property name="name"
value="yiCheAppRecommendingCache"/>
                    
                    <property name="writeSynchronizationMode"
value="PRIMARY_SYNC"/>
                    
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="backups" value="0"/>

                </bean>

                <bean
class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="bjdqAppRecommendedCache"/>
                    
                    <property name="memoryMode" value="ONHEAP_TIERED"/>
                    <property name="evictionPolicy">
                        
                        <bean
class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
                            
                            <property name="maxSize" value="2000000"/>
                        </bean>
                    </property>

                    
                    <property name="writeSynchronizationMode"
value="PRIMARY_SYNC"/>
                    
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="backups" value="1"/>
                    

                </bean>

                <bean
class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="bjdqAppRecommendingCache"/>
                    
                    <property name="memoryMode" value="ONHEAP_TIERED"/>
                    <property name="evictionPolicy">
                        
                        <bean
class="org.apache.ignite.cache.eviction.lru.LruEvictionPolicy">
                            
                            <property name="maxSize" value="500000"/>
                        </bean>
                    </property>

                    
                    <property name="writeSynchronizationMode"
value="PRIMARY_SYNC"/>
                    
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="atomicityMode" value="ATOMIC"/>
                    <property name="backups" value="0"/>
                </bean>

            </list>
        </property>

        
        <property name="discoverySpi">
            <bean
class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">

                
                
                <property name="localPort" value="58600"/>

                
                <property name="localPortRange" value="20"/>

                <property name="ipFinder">
                    
                    
                    
                    <bean
class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                        
                        <property name="addresses">
                            <list>
                                

                                <value>172.21.0.181:58600..58620</value>
                                <value>172.21.0.182:58600..58620</value>
                                <value>172.21.0.183:58600..58620</value>
                            </list>
                        </property>
                    </bean>
                </property>
            </bean>
        </property>

        
        <property name="communicationSpi">
            <bean
class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                <property name="localPort" value="58200"/>
                <property name="localAddress" value="172.21.0.183"/>
            </bean>
        </property>
    </bean>
</beans>



The stopped server node's IP is 172.20.0.183.

work/log:

[22:03:38,100][WARNING][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] Node
is out of topology (probably, due to short-time network problems). 
[22:03:38,103][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Local node SEGMENTED: TcpDiscoveryNode
[id=424ef276-c1b6-48b0-9ded-c6fca0997502, addrs=[124.250.36.149, 127.0.0.1,
172.21.0.183], sockAddrs=[/172.21.0.183:58600, /124.250.36.149:58600,
/124.250.36.149:58600, /127.0.0.1:58600, /172.21.0.183:58600],
discPort=58600, order=485, intOrder=255, lastExchangeTime=1469714618102,
loc=true, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,134][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Stopping local node according to configured segmentation policy. 
[22:03:38,135][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=588dbb39-a124-4879-a476-3bb723e1e4d9,
addrs=[124.250.36.148, 127.0.0.1, 172.21.0.182],
sockAddrs=[/124.250.36.148:58600, /124.250.36.148:58600,
/124.250.36.148:58600, /127.0.0.1:58600, /172.21.0.182:58600],
discPort=58600, order=244, intOrder=132, lastExchangeTime=1469624391582,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,138][INFO][disco-event-worker-#144%null%][GridDiscoveryManager]
Topology snapshot [ver=527, servers=1, clients=0, CPUs=32, heap=60.0GB] 
[22:03:38,139][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=ea39b0d0-fb6b-483b-86d5-d2cfda7c0abf,
addrs=[127.0.0.1, 172.21.0.39], sockAddrs=[/172.21.0.39:58600,
/127.0.0.1:58600, /172.21.0.39:58600], discPort=58600, order=251,
intOrder=136, lastExchangeTime=1469624391582, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,143][INFO][Thread-117][GridTcpRestProtocol] Command protocol
successfully stopped: TCP binary 
[22:03:38,144][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=51421fee-0d08-4f46-82ea-1a12f8d80f6f,
addrs=[127.0.0.1, 172.21.0.40], sockAddrs=[/172.21.0.40:58600,
/127.0.0.1:58600, /172.21.0.40:58600], discPort=58600, order=257,
intOrder=139, lastExchangeTime=1469624391582, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,145][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=14d75a27-f8ae-4da6-9285-f6b6abfe645c,
addrs=[127.0.0.1, 172.21.0.41], sockAddrs=[/172.21.0.41:58600,
/127.0.0.1:58600, /172.21.0.41:58600], discPort=58600, order=263,
intOrder=142, lastExchangeTime=1469624391582, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,146][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=2333a45f-4d81-423d-90fe-445efc233883,
addrs=[127.0.0.1, 172.21.0.42], sockAddrs=[/172.21.0.42:58600,
/127.0.0.1:58600, /172.21.0.42:58600], discPort=58600, order=265,
intOrder=143, lastExchangeTime=1469624391582, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,146][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=f67c2596-bf28-41d5-b79d-7e351b72768f,
addrs=[127.0.0.1, 172.21.0.43], sockAddrs=[/172.21.0.43:58600,
/127.0.0.1:58600, /172.21.0.43:58600], discPort=58600, order=267,
intOrder=144, lastExchangeTime=1469624391582, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,147][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=ddcbfdba-cf93-415c-a914-e84ac729ccc4,
addrs=[124.250.36.139, 127.0.0.1, 172.21.0.173],
sockAddrs=[/124.250.36.139:0, /124.250.36.139:0, /124.250.36.139:0,
/127.0.0.1:0, /172.21.0.173:0], discPort=0, order=299, intOrder=160,
lastExchangeTime=1469624391582, loc=false, ver=1.6.0#20160518-sha1:0b22c45b,
isClient=true] 
[22:03:38,148][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=36e21ab4-2d96-4e22-8496-755ae90253ba,
addrs=[124.250.36.70, 124.250.36.74, 127.0.0.1, 172.21.0.77],
sockAddrs=[/124.250.36.70:0, /124.250.36.70:0, /124.250.36.74:0,
/124.250.36.74:0, /127.0.0.1:0, /127.0.0.1:0, /172.21.0.77:0], discPort=0,
order=307, intOrder=164, lastExchangeTime=1469624391582, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,150][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=048e6c2a-7520-4197-8074-a65beefac047,
addrs=[124.250.36.73, 124.250.36.74, 127.0.0.1, 172.21.0.80],
sockAddrs=[/124.250.36.73:0, /124.250.36.73:0, /124.250.36.73:0,
/124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.80:0],
discPort=0, order=313, intOrder=167, lastExchangeTime=1469624391582,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,150][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=0a8abdd3-ebe9-4bf3-a1af-ffaa40f8c3ac,
addrs=[124.250.36.196, 124.250.36.74, 127.0.0.1, 172.21.0.223],
sockAddrs=[/172.21.0.223:0, /124.250.36.196:0, /124.250.36.196:0,
/124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.223:0],
discPort=0, order=319, intOrder=170, lastExchangeTime=1469624391582,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,151][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=aef17bf7-ef0d-4a9b-8d7b-a87ffa42afbf,
addrs=[124.250.36.69, 124.250.36.74, 124.250.36.74, 127.0.0.1, 172.21.0.76],
sockAddrs=[/124.250.36.69:0, /124.250.36.69:0, /124.250.36.74:0,
/124.250.36.74:0, /124.250.36.74:0, /124.250.36.74:0,
test.com/69.172.200.235:0, /127.0.0.1:0, /172.21.0.76:0], discPort=0,
order=358, intOrder=188, lastExchangeTime=1469624391592, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,152][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=a51432a3-25e8-41e1-9201-ad626852af80,
addrs=[124.250.36.72, 124.250.36.74, 127.0.0.1, 172.21.0.79],
sockAddrs=[/124.250.36.72:0, /124.250.36.72:0, /124.250.36.72:0,
/124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.79:0],
discPort=0, order=359, intOrder=189, lastExchangeTime=1469624391592,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,152][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=1178eb68-3025-4cbf-8515-d34e5a28a452,
addrs=[124.250.36.71, 124.250.36.74, 127.0.0.1, 172.21.0.78],
sockAddrs=[/124.250.36.71:0, /124.250.36.71:0, /124.250.36.74:0,
/124.250.36.74:0, /124.250.36.74:0, /127.0.0.1:0, /172.21.0.78:0],
discPort=0, order=360, intOrder=190, lastExchangeTime=1469624391592,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,153][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=89f1d6cd-8c40-4465-86e7-b2169bb59e8c,
addrs=[124.250.36.197, 124.250.36.74, 127.0.0.1, 172.21.0.224],
sockAddrs=[/172.21.0.224:0, /124.250.36.197:0, /124.250.36.74:0,
/124.250.36.74:0, /124.250.36.197:0, /127.0.0.1:0, /172.21.0.224:0],
discPort=0, order=362, intOrder=192, lastExchangeTime=1469624391592,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,154][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=4e040db4-6a6e-4f90-ad1c-a4100a28b792,
addrs=[124.250.36.185, 124.250.36.45, 127.0.0.1, 172.21.0.35],
sockAddrs=[/172.21.0.35:0, /124.250.36.185:0, /124.250.36.45:0,
/124.250.36.45:0, /124.250.36.185:0, /127.0.0.1:0, /172.21.0.35:0],
discPort=0, order=366, intOrder=194, lastExchangeTime=1469624391592,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,154][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=2f84fb8b-ec6c-435a-8175-bc87e3db252c,
addrs=[124.250.36.185, 124.250.36.185, 124.250.36.46, 127.0.0.1,
172.21.0.36], sockAddrs=[/124.250.36.185:0, /124.250.36.185:0,
/124.250.36.185:0, /124.250.36.185:0, /124.250.36.46:0, /124.250.36.46:0,
/124.250.36.185:0, /127.0.0.1:0, /172.21.0.36:0], discPort=0, order=372,
intOrder=197, lastExchangeTime=1469624391592, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,156][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=b79cced2-1175-443b-8e7b-b73b2a622d5f,
addrs=[127.0.0.1, 172.21.0.38], sockAddrs=[/172.21.0.38:58600,
/127.0.0.1:58600, /172.21.0.38:58600], discPort=58600, order=390,
intOrder=207, lastExchangeTime=1469624391592, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,156][INFO][Thread-117][GridJettyRestProtocol] Command protocol
successfully stopped: Jetty REST 
[22:03:38,156][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=f0f33c05-7173-4a38-97ed-74f709514b32,
addrs=[124.250.36.223, 127.0.0.1, 172.21.0.127], sockAddrs=[/172.21.0.127:0,
/124.250.36.223:0, /124.250.36.223:0, /127.0.0.1:0, /172.21.0.127:0],
discPort=0, order=427, intOrder=225, lastExchangeTime=1469624388223,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,157][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=ec94b23e-f3d7-4e6d-aa22-570a248f7172,
addrs=[124.250.36.222, 127.0.0.1, 172.21.0.126], sockAddrs=[/172.21.0.126:0,
/124.250.36.222:0, /124.250.36.222:0, /127.0.0.1:0, /172.21.0.126:0],
discPort=0, order=431, intOrder=227, lastExchangeTime=1469624388234,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,158][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=0687433b-a7a2-47db-ba38-6d1d4f34f82e,
addrs=[124.250.36.147, 127.0.0.1, 172.21.0.181],
sockAddrs=[/172.21.0.181:58600, /124.250.36.147:58600,
/124.250.36.147:58600, /127.0.0.1:58600, /172.21.0.181:58600],
discPort=58600, order=440, intOrder=232, lastExchangeTime=1469624391592,
loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false] 
[22:03:38,158][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=0d2fc522-5d2d-41b3-a0a4-98e12f3aa876,
addrs=[124.250.36.47, 127.0.0.1, 172.21.0.37, 33.33.33.1],
sockAddrs=[/33.33.33.1:0, /124.250.36.47:0, /172.21.0.37:0, /127.0.0.1:0,
/124.250.36.47:0, /172.21.0.37:0, /33.33.33.1:0], discPort=0, order=496,
intOrder=260, lastExchangeTime=1469676024755, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,159][WARNING][disco-event-worker-#144%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=bf64226a-7597-4bc8-866b-6d99c7e9f2aa,
addrs=[124.250.36.47, 127.0.0.1, 172.21.0.37, 33.33.33.1],
sockAddrs=[/33.33.33.1:0, /124.250.36.47:0, /172.21.0.37:0, /127.0.0.1:0,
/124.250.36.47:0, /172.21.0.37:0, /33.33.33.1:0], discPort=0, order=503,
intOrder=264, lastExchangeTime=1469698198271, loc=false,
ver=1.6.0#20160518-sha1:0b22c45b, isClient=true] 
[22:03:38,561][INFO][Thread-117][GridCacheProcessor] Stopped cache:
bjdqAppRecommendingCache 
[22:03:38,563][INFO][Thread-117][GridCacheProcessor] Stopped cache:
bjdqAppRecommendedCache 
[22:03:38,564][INFO][Thread-117][GridCacheProcessor] Stopped cache:
yiCheAppRecommendingCache 
[22:03:38,564][INFO][Thread-117][GridCacheProcessor] Stopped cache:
yiCheAppRecommendedCache 
[22:03:38,565][INFO][Thread-117][GridCacheProcessor] Stopped cache:
idfaCache 
[22:03:38,566][INFO][Thread-117][GridCacheProcessor] Stopped cache:
clickedArticleCache 
[22:03:38,566][INFO][Thread-117][GridCacheProcessor] Stopped cache:
deviceCache 
[22:03:38,567][INFO][Thread-117][GridCacheProcessor] Stopped cache:
ignite-marshaller-sys-cache 
[22:03:38,567][INFO][Thread-117][GridCacheProcessor] Stopped cache:
ignite-sys-cache 
[22:03:38,568][INFO][Thread-117][GridCacheProcessor] Stopped cache:
ignite-atomics-sys-cache 
[22:03:38,571][INFO][Thread-117][GridDeploymentLocalStore] Removed
undeployed class: GridDeployment [ts=1469624393224, depMode=SHARED,
clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2,
clsLdrId=a1d707c2651-424ef276-c1b6-48b0-9ded-c6fca0997502, userVer=0,
loc=true,
sampleClsName=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionFullMap,
pendingUndeploy=false, undeployed=true, usage=0] 
[22:03:38,585][INFO][Thread-117][IgniteKernal]
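
For context on the "Stopping local node according to configured segmentation policy" line in the log above: the behavior of a segmented node and the time allowed before a node is considered failed are both set on IgniteConfiguration. The fragment below is only a sketch, assuming the Ignite 1.6 property names segmentationPolicy and failureDetectionTimeout; the 30000 ms value is illustrative and is not a recommendation for this cluster.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- What the node does once it detects that it is segmented.
         STOP is the default (it matches the log above);
         RESTART_JVM and NOOP are the other values. -->
    <property name="segmentationPolicy" value="STOP"/>

    <!-- Time in milliseconds before an unresponsive node is treated as failed.
         The default is 10000; 30000 here is an assumed, illustrative value. -->
    <property name="failureDetectionTimeout" value="30000"/>
</bean>
```

A longer timeout only makes the cluster more tolerant of long GC pauses or short network interruptions; it does not remove their cause, so the GC and network logs are still worth checking.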




Re: Ignite Cluster node stopped

Posted by Vladislav Pyatkov <vl...@gmail.com>.
Hello,

I have seen the log from 172.21.0.40 only, and only one GC log file.
There is no need to provide log files covering several days. Can you
provide the logs for the 4-5 minutes around the time of the failure (that
would be enough)?

Could you please provide the logs (GC and application) from all nine nodes?

On Mon, Aug 8, 2016 at 6:12 AM, suhuadong <su...@163.com> wrote:

> Hi vkulichenko,
> Can you find out the reason why the node stopped?



-- 
Vladislav Pyatkov

Re: Ignite Cluster node stopped

Posted by suhuadong <su...@163.com>.
Hi vkulichenko,
Can you find out the reason why the node stopped?





Re: Ignite Cluster node stopped

Posted by vkulichenko <va...@gmail.com>.
Got it. Please attach *full* log files from all the nodes.

-Val 




Re: Ignite Cluster node stopped

Posted by vkulichenko <va...@gmail.com>.
Do you mean that the client could not reconnect? That is expected: the client
always uses the addresses from the IP finder to connect and needs at least
one of them to be available. You can use one of the shared IP finders [1]
(JDBC, shared FS, etc.) to avoid this problem.

[1] https://apacheignite.readme.io/docs/cluster-config

-Val
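
As a rough illustration of the shared FS option mentioned above, the static TcpDiscoveryVmIpFinder in the discovery SPI could be swapped for TcpDiscoverySharedFsIpFinder. This is a minimal sketch, not a tested configuration: the directory /mnt/shared/ignite-disco is a hypothetical path and would have to be a location that every server and client node can read and write.

```xml
<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
            <!-- Nodes register their addresses in a shared directory,
                 so no address list is hard-coded in the configuration. -->
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.sharedfs.TcpDiscoverySharedFsIpFinder">
                <!-- Hypothetical path, assumed to be mounted on all nodes -->
                <property name="path" value="/mnt/shared/ignite-disco"/>
            </bean>
        </property>
    </bean>
</property>
```

With a shared finder, a client learns the current server addresses from the shared location rather than from a fixed list in its own configuration.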


