Posted to user@ignite.apache.org by 李玉...@163, 18...@163.com on 2019/01/11 16:36:53 UTC

Abnormal termination of nodes with native persistence enabled

Hi,

Currently, after cluster activation, if a node with native persistence 
enabled terminates abnormally, it cannot rejoin the cluster when it is 
restarted.

So my questions are:

1. If the node terminates abnormally, how can it rejoin the cluster?

2. How can the node be restarted gracefully?



Re: Abnormal termination of nodes with native persistence enabled

Posted by 李玉...@163, 18...@163.com.
Hi,

The console log is below.

Note that if all nodes are killed, they can all start successfully and the 
data is intact. Only when a single node fails is it unable to rejoin the 
cluster on restart.

Thanks!

-----------------------------log start------------------------------

2019-01-14T10:33:35,438][INFO ][main][IgniteKernal]

 >>>    __________  ________________
 >>>   /  _/ ___/ |/ /  _/_  __/ __/
 >>>  _/ // (7 7    // /  / / / _/
 >>> /___/\___/_/|_/___/ /_/ /___/
 >>>
 >>> ver. 2.6.0#20180710-sha1:669feacc
 >>> 2018 Copyright(C) Apache Software Foundation
 >>>
 >>> Ignite documentation: http://ignite.apache.org

2019-01-14T10:33:35,441][INFO ][main][IgniteKernal] Config URL: 
file:/opt/ignite/apache-ignite-fabric-2.6.0-bin/config/practice-config.xml
2019-01-14T10:33:35,458][INFO ][main][IgniteKernal] IgniteConfiguration 
[igniteInstanceName=null, pubPoolSize=8, svcPoolSize=8, 
callbackPoolSize=8, stripedPoolSize=8, sysPoolSize=8, mgmtPoolSize=4, 
igfsPoolSize=4, dataStreamerPoolSize=8, utilityCachePoolSize=8, 
utilityCacheKeepAliveTime=60000, p2pPoolSize=2, qryPoolSize=8, 
igniteHome=/opt/ignite/apache-ignite-fabric-2.6.0-bin, 
igniteWorkDir=/opt/ignite/apache-ignite-fabric-2.6.0-bin/work, 
mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer@6f94fa3e, 
nodeId=142b548a-6480-4c31-9559-7d7b2092175c, 
marsh=org.apache.ignite.internal.binary.BinaryMarshaller@4e0ae11f, 
marshLocJobs=false, daemon=false, p2pEnabled=true, netTimeout=5000, 
sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000, 
metricsUpdateFreq=2000, metricsExpTime=9223372036854775807, 
discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=15000, 
ackTimeout=60000, marsh=null, reconCnt=10, reconDelay=2000, 
maxAckTimeout=600000, forceSrvMode=false, clientReconnectDisabled=false, 
internalLsnr=null], segPlc=STOP, segResolveAttempts=2, 
waitForSegOnStart=true, allResolversPassReq=true, segChkFreq=10000, 
commSpi=TcpCommunicationSpi [connectGate=null, connPlc=null, 
enableForcibleNodeKill=false, enableTroubleshootingLog=false, 
srvLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2@4c2bb6e0, 
locAddr=null, locHost=null, locPort=47100, locPortRange=100, 
shmemPort=-1, directBuf=true, directSndBuf=false, 
idleConnTimeout=600000, connTimeout=5000, maxConnTimeout=600000, 
reconCnt=10, sockSndBuf=32768, sockRcvBuf=32768, msgQueueLimit=0, 
slowClientQueueLimit=0, nioSrvr=null, shmemSrv=null, 
usePairedConnections=false, connectionsPerNode=1, tcpNoDelay=true, 
filterReachableAddresses=false, ackSndThreshold=32, 
unackedMsgsBufSize=0, sockWriteTimeout=2000, lsnr=null, boundTcpPort=-1, 
boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0, addrRslvr=null, 
ctxInitLatch=java.util.concurrent.CountDownLatch@3e62d773[Count = 1], 
stopping=false, 
metricsLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationMetricsListener@4ef74c30], 
evtSpi=org.apache.ignite.spi.eventstorage.NoopEventStorageSpi@7283d3eb, 
colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi [lsnr=null], 
indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi@47c81abf, 
addrRslvr=null, clientMode=false, rebalanceThreadPoolSize=1, 
txCfg=org.apache.ignite.configuration.TransactionConfiguration@776a6d9b, 
cacheSanityCheckEnabled=true, discoStartupDelay=60000, 
deployMode=PRIVATE, p2pMissedCacheSize=100, locHost=null, 
timeSrvPortBase=31100, timeSrvPortRange=100, 
failureDetectionTimeout=10000, clientFailureDetectionTimeout=30000, 
metricsLogFreq=60000, hadoopCfg=null, 
connectorCfg=org.apache.ignite.configuration.ConnectorConfiguration@21d03963, 
odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration 
[seqReserveSize=1000, cacheMode=PARTITIONED, backups=1, aff=null, 
grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null, 
binaryCfg=null, memCfg=null, pstCfg=null, dsCfg=DataStorageConfiguration 
[sysRegionInitSize=41943040, sysCacheMaxSize=104857600, pageSize=0, 
concLvl=4, dfltDataRegConf=DataRegionConfiguration [name=default, 
maxSize=34359738368, initSize=268435456, swapPath=null, 
pageEvictionMode=DISABLED, evictionThreshold=0.9, 
emptyPagesPoolSize=100, metricsEnabled=false, metricsSubIntervalCount=5, 
metricsRateTimeInterval=60000, persistenceEnabled=true, 
checkpointPageBufSize=0], storagePath=/data/ignite/storage, 
checkpointFreq=180000, lockWaitTime=10000, checkpointThreads=4, 
checkpointWriteOrder=SEQUENTIAL, walHistSize=20, walSegments=10, 
walSegmentSize=67108864, walPath=/data/ignite/wal, 
walArchivePath=db/wal/archive, metricsEnabled=false, walMode=LOG_ONLY, 
walTlbSize=131072, walBuffSize=0, walFlushFreq=2000, walFsyncDelay=1000, 
walRecordIterBuffSize=67108864, alwaysWriteFullPages=false, 
fileIOFactory=org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory@18ece7f4, 
metricsSubIntervalCnt=5, metricsRateTimeInterval=60000, 
walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false, 
walCompactionEnabled=true], activeOnStart=true, autoActivation=true, 
longQryWarnTimeout=3000, sqlConnCfg=null, 
cliConnCfg=ClientConnectorConfiguration [host=10.37.184.213, port=10800, 
portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true, 
maxOpenCursorsPerConn=128, threadPoolSize=8, idleTimeout=0, 
jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true, 
sslEnabled=false, useIgniteSslCtxFactory=true, sslClientAuth=false, 
sslCtxFactory=null], authEnabled=false, 
failureHnd=RestartProcessFailureHandler [], 
commFailureRslvr=null]
2019-01-14T10:33:35,459][INFO ][main][IgniteKernal] Daemon mode: off
2019-01-14T10:33:35,460][INFO ][main][IgniteKernal] OS: Linux 
3.10.0-229.el7.x86_64 amd64
2019-01-14T10:33:35,460][INFO ][main][IgniteKernal] OS user: root
2019-01-14T10:33:35,461][INFO ][main][IgniteKernal] PID: 25000
2019-01-14T10:33:35,461][INFO ][main][IgniteKernal] Language runtime: 
Java Platform API Specification ver. 1.8
2019-01-14T10:33:35,461][INFO ][main][IgniteKernal] VM information: 
Java(TM) SE Runtime Environment 1.8.0_151-b12 Oracle Corporation Java 
HotSpot(TM) 64-Bit Server VM 25.151-b12
2019-01-14T10:33:35,463][INFO ][main][IgniteKernal] VM total memory: 8.0GB
2019-01-14T10:33:35,463][INFO ][main][IgniteKernal] Remote Management 
[restart: on, REST: on, JMX (remote: on, port: 49224, auth: off, ssl: off)]
2019-01-14T10:33:35,464][INFO ][main][IgniteKernal] Logger: Log4J2Logger 
[quiet=false, config=config/log4j2.xml]
2019-01-14T10:33:35,464][INFO ][main][IgniteKernal] 
IGNITE_HOME=/opt/ignite/apache-ignite-fabric-2.6.0-bin
2019-01-14T10:33:35,464][INFO ][main][IgniteKernal] VM arguments: 
[-Xms1g, -Xmx8g, -XX:+AggressiveOpts, -XX:MaxMetaspaceSize=384m, 
-XX:+AlwaysPreTouch, -XX:+ScavengeBeforeFullGC, -XX:+DisableExplicitGC, 
-XX:+UseG1GC, -Xss4m, -Djava.net.preferIPv4Stack=true, 
-DIGNITE_QUIET=false, 
-DIGNITE_SUCCESS_FILE=/opt/ignite/apache-ignite-fabric-2.6.0-bin/work/ignite_success_911c4c15-f1f8-49b4-9a92-85e54059d4d4, 
-Dcom.sun.management.jmxremote, 
-Dcom.sun.management.jmxremote.port=49224, 
-Dcom.sun.management.jmxremote.authenticate=false, 
-Dcom.sun.management.jmxremote.ssl=false, 
-DIGNITE_HOME=/opt/ignite/apache-ignite-fabric-2.6.0-bin, 
-DIGNITE_PROG_NAME=/opt/ignite/apache-ignite-fabric-2.6.0-bin/bin/ignite.sh]
2019-01-14T10:33:35,465][INFO ][main][IgniteKernal] System cache's 
DataRegion size is configured to 40 
MB. Use DataStorageConfiguration.systemCacheMemorySize property to 
change the setting.
2019-01-14T10:33:35,484][INFO ][main][IgniteKernal] Configured caches 
[in 'sysMemPlc' dataRegion: ['ignite-sys-cache']]
2019-01-14T10:33:35,485][WARN ][main][IgniteKernal] Peer class loading 
is enabled (disable it in production for performance and deployment 
consistency reasons)
2019-01-14T10:33:35,494][INFO ][main][IgniteKernal] 3-rd party licenses 
can be found at: /opt/ignite/apache-ignite-fabric-2.6.0-bin/libs/licenses
2019-01-14T10:33:35,494][INFO ][main][IgniteKernal] Local node user 
attribute [DATA_ROLE=BUDS]
2019-01-14T10:33:35,555][INFO ][main][IgnitePluginProcessor] Configured 
plugins:
2019-01-14T10:33:35,556][INFO ][main][IgnitePluginProcessor]   ^-- None
2019-01-14T10:33:35,556][INFO ][main][IgnitePluginProcessor]
2019-01-14T10:33:35,557][INFO ][main][FailureProcessor] Configured 
failure handler: [hnd=RestartProcessFailureHandler []]
2019-01-14T10:33:35,601][INFO ][main][TcpCommunicationSpi] Successfully 
bound communication NIO server to TCP port [port=47100, 
locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, 
pairedConn=false]
2019-01-14T10:33:35,603][WARN ][main][TcpCommunicationSpi] 
Message queue limit is set to 0 which may lead to potential OOMEs when 
running cache operations in FULL_ASYNC or PRIMARY_SYNC modes 
due to message queues growth on sender and receiver sides.
2019-01-14T10:33:35,626][WARN ][main][NoopCheckpointSpi] 
Checkpoints are disabled (to enable configure any GridCheckpointSpi 
implementation)
2019-01-14T10:33:35,650][WARN ][main][GridCollisionManager] Collision 
resolution is disabled (all jobs will be activated upon arrival).
2019-01-14T10:33:35,652][INFO ][main][IgniteKernal] Security status 
[authentication=off, tls/ssl=off]
2019-01-14T10:33:35,682][INFO ][main][TcpDiscoverySpi] Successfully 
bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, 
locNodeId=142b548a-6480-4c31-9559-7d7b2092175c]
2019-01-14T10:33:35,691][INFO ][main][PdsFoldersResolver] Successfully 
locked persistence storage folder 
[/data/ignite/storage/node00-cf5501f0-c13e-457f-8c34-7477b2101905]
2019-01-14T10:33:35,692][INFO ][main][PdsFoldersResolver] Consistent ID 
used for local node is [cf5501f0-c13e-457f-8c34-7477b2101905] according 
to persistence data storage folders
2019-01-14T10:33:35,692][INFO ][main][CacheObjectBinaryProcessorImpl] 
Resolved directory for serialized binary metadata: 
/opt/ignite/apache-ignite-fabric-2.6.0-bin/work/binary_meta/node00-cf5501f0-c13e-457f-8c34-7477b2101905
2019-01-14T10:33:35,813][WARN ][main][GridCacheProcessor] Deployment 
mode for cache is not CONTINUOUS 
or SHARED (it is recommended that you change deployment mode and 
restart): PRIVATE
2019-01-14T10:33:35,914][INFO ][main][FilePageStoreManager] Resolved 
page store work directory: 
/data/ignite/storage/node00-cf5501f0-c13e-457f-8c34-7477b2101905
2019-01-14T10:33:35,914][INFO ][main][FileWriteAheadLogManager] Resolved 
write ahead log work directory: 
/data/ignite/wal/node00-cf5501f0-c13e-457f-8c34-7477b2101905
2019-01-14T10:33:35,915][INFO ][main][FileWriteAheadLogManager] Resolved 
write ahead log archive directory: 
/opt/ignite/apache-ignite-fabric-2.6.0-bin/work/db/wal/archive/node00-cf5501f0-c13e-457f-8c34-7477b2101905
2019-01-14T10:33:35,942][INFO ][main][FileWriteAheadLogManager] Started 
write-ahead log manager 
[mode=LOG_ONLY]
2019-01-14T10:33:35,953][WARN ][main][GridCacheDatabaseSharedManager] 
Page eviction mode set for [DR_MEM] data will have no effect because the 
oldest pages are evicted automatically if Ignite persistence is enabled.
2019-01-14T10:33:35,975][INFO ][main][GridCacheDatabaseSharedManager] 
Read checkpoint status 
[startMarker=/data/ignite/storage/node00-cf5501f0-c13e-457f-8c34-7477b2101905/cp/1547192473734-c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6-START.bin, 
endMarker=/data/ignite/storage/node00-cf5501f0-c13e-457f-8c34-7477b2101905/cp/1547192473734-c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6-END.bin]
2019-01-14T10:33:35,988][INFO ][main][PageMemoryImpl] Started page 
memory [memoryAllocated=100.0 MiB, 
pages=24812, tableSize=1.9 MiB, checkpointBuffer=100.0 MiB]
2019-01-14T10:33:35,989][INFO ][main][GridCacheDatabaseSharedManager] 
Checking memory state [lastValidPos=FileWALPointer [idx=3612, 
fileOff=49901060, len=40363], lastMarked=FileWALPointer [idx=3612, 
fileOff=49901060, len=40363], 
lastCheckpointId=c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6]
2019-01-14T10:33:36,017][INFO ][main][FileWriteAheadLogManager] 
Stopping WAL iteration due to an 
exception: Failed to read WAL record at position: 49941423, 
ptr=FileWALPointer [idx=3612, fileOff=49941423, len=0]
2019-01-14T10:33:36,018][INFO ][main][GridCacheDatabaseSharedManager] 
Found last checkpoint marker 
[cpId=c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6, pos=FileWALPointer 
[idx=3612, fileOff=49901060, len=40363]]
2019-01-14T10:33:36,048][INFO ][main][GridCacheDatabaseSharedManager] 
Applying lost cache updates 
since last checkpoint record [lastMarked=FileWALPointer [idx=3612, 
fileOff=49901060, len=40363], 
lastCheckpointId=c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6]
2019-01-14T10:33:36,062][INFO ][main][FileWriteAheadLogManager] 
Stopping WAL iteration due to an 
exception: Failed to read WAL record at position: 49941423, 
ptr=FileWALPointer [idx=3612, fileOff=49941423, len=0]
2019-01-14T10:33:36,063][INFO ][main][GridCacheDatabaseSharedManager] 
Finished applying WAL changes 
[updatesApplied=0, time=10ms]
2019-01-14T10:33:36,113][INFO ][main][GridClusterStateProcessor] 
Restoring history for BaselineTopology[id=0]
2019-01-14T10:33:36,234][INFO ][main][ClientListenerProcessor] Client 
connector processor has started on TCP port 10800
2019-01-14T10:33:36,286][INFO ][main][GridTcpRestProtocol] Command 
protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, 
port=11211]
2019-01-14T10:33:36,317][INFO ][main][IgniteKernal] Non-loopback local 
IPs: 10.37.184.213
2019-01-14T10:33:36,317][INFO ][main][IgniteKernal] Enabled local MACs: 
FA163E3C967A
2019-01-14T10:33:37,663][INFO ][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP 
discovery accepted incoming connection [rmtAddr=/10.37.184.217, 
rmtPort=37803]
2019-01-14T10:33:37,674][INFO ][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP 
discovery spawning a new thread for connection [rmtAddr=/10.37.184.217, 
rmtPort=37803]
2019-01-14T10:33:37,675][INFO 
][tcp-disco-sock-reader-#5][TcpDiscoverySpi] Started serving remote node 
connection [rmtAddr=/10.37.184.217:37803, rmtPort=37803]
2019-01-14T10:33:37,751][ERROR][tcp-disco-msg-worker-#3][TcpDiscoverySpi] 
TcpDiscoverSpi's message worker thread failed abnormally. Stopping the 
node in order to prevent cluster wide instability.
2019-01-14T10:33:37,757][ERROR][tcp-disco-msg-worker-#3][] Critical 
system error detected. Will be handled accordingly to configured handler 
[hnd=class o.a.i.failure.RestartProcessFailureHandler, 
failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, 
err=class o.a.i.IgniteException: Node with BaselineTopology cannot join 
mixed cluster running in compatibility mode]]
2019-01-14T10:33:37,760][ERROR][tcp-disco-msg-worker-#3][FailureProcessor] 
Ignite node is in invalid state due to a critical failure.
2019-01-14T10:33:37,761][ERROR][tcp-disco-msg-worker-#3][TcpDiscoverySpi] 
Runtime error caught during grid runnable execution: IgniteSpiThread 
[name=tcp-disco-msg-worker-#3]
2019-01-14T10:33:37,761][ERROR][node-restarter][] Restarting JVM on 
Ignite failure: [failureCtx=FailureContext 
[type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Node 
with BaselineTopology cannot join mixed cluster running in compatibility 
mode]]
[10:33:37] Restarting node. Will exit (250).
2019-01-14T10:33:37,757][ERROR][main][IgniteKernal] Failed to start 
manager: GridManagerAdapter [enabled=true, 
name=o.a.i.i.managers.discovery.GridDiscoveryManager]
2019-01-14T10:33:37,763][ERROR][main][IgniteKernal] Got exception while 
starting (will rollback startup routine).
[10:33:37] (wrn) Ignoring stopping Ignite instance that was already 
stopped or never started: null
2019-01-14T10:33:37,765][INFO ][node-stop-thread][TcpDiscoverySpi] 
Stopped the node successfully in response to TcpDiscoverySpi's message 
worker thread abnormal termination.
2019-01-14T10:33:37,776][INFO ][main][GridTcpRestProtocol] Command 
protocol successfully stopped: TCP binary
2019-01-14T10:33:37,788][INFO 
][tcp-disco-sock-reader-#5][TcpDiscoverySpi] Finished serving remote 
node connection [rmtAddr=/10.37.184.217:37803, rmtPort=37803
2019-01-14T10:33:37,955][INFO ][main][IgniteKernal]

 >>> 
+---------------------------------------------------------------------------------+
 >>> Ignite ver. 
2.6.0#20180710-sha1:669feacc5d3a4e60efcdd300dc8de99780f38eed stopped OK
 >>> 
+---------------------------------------------------------------------------------+
 >>> Grid uptime: 00:00:03.441


class org.apache.ignite.IgniteException: Failed to start manager: 
GridManagerAdapter [enabled=true, 
name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
     at 
org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:990)
     at org.apache.ignite.Ignition.start(Ignition.java:355)
     at 
org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301)
2019-01-14T10:33:37,959][WARN ][node-restarter][G] Attempting to stop an 
already stopped Ignite instance (ignore): null
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to 
start manager: GridManagerAdapter [enabled=true, 
name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
     at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1726)
     at 
org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1028)
     at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014)
     at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723)
     at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151)
     at 
org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1069)
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:955)
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:854)
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:724)
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:693)
     at org.apache.ignite.Ignition.start(Ignition.java:352)
     ... 1 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to 
start SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=15000, 
ackTimeout=60000, marsh=JdkMarshaller 
[clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@41f4fe5], reconCnt=10, 
reconDelay=2000, maxAckTimeout=600000, forceSrvMode=false, 
clientReconnectDisabled=false, internalLsnr=null]
     at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:300)
     at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915)
     at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1721)
     ... 11 more
Caused by: class org.apache.ignite.spi.IgniteSpiException: Thread has 
been interrupted.
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:938)
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:373)
     at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1948)
     at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
     ... 13 more
Failed to start grid: Failed to start manager: GridManagerAdapter 
[enabled=true, 
name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]

-----------------------------log end------------------------------
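
In a startup log this long, the decisive entry is the FailureContext record, which carries the failure type and the root exception message. The following is a minimal Python sketch for pulling those records out of a pasted log. The names find_failures and FAILURE_RE are hypothetical helpers, not part of Ignite, and the pattern is inferred only from the log lines shown above. It assumes the mail client wrapped the log by inserting bare line breaks (keeping any space that preceded the break), so deleting the line breaks restores the original entries.

```python
import re

# Pattern inferred from the log above (an assumption, not an Ignite-defined format):
#   FailureContext [type=<TYPE>, err=class <exception class>: <message>]]
FAILURE_RE = re.compile(
    r"FailureContext \[type=(?P<type>\w+), "
    r"err=class (?P<cls>[\w.$]+): (?P<msg>.*?)\]\]"
)


def find_failures(log_text):
    """Return distinct (type, exception class, message) tuples, in order of appearance."""
    # Undo the hard wrapping: drop line breaks so split words are rejoined.
    joined = log_text.replace("\r", "").replace("\n", "")
    failures = []
    for match in FAILURE_RE.finditer(joined):
        item = (match.group("type"), match.group("cls"), match.group("msg"))
        if item not in failures:  # the same failure is usually logged more than once
            failures.append(item)
    return failures
```

Run against the console log above, this should report a single SYSTEM_WORKER_TERMINATION failure whose message is the "Node with BaselineTopology cannot join mixed cluster running in compatibility mode" error. If the wrapping in your copy dropped trailing spaces, words at wrap points will be fused and the message will need manual cleanup.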


On 2019/1/12 at 12:58 AM, Ilya Kasnacheev wrote:
> Hello!
>
> Can you show what you get in logs as your nodes attempt to join the 
> cluster?
>
> Regards,
> -- 
> Ilya Kasnacheev
>
>
> On Fri, 11 Jan 2019 at 19:43, 李玉珏@163 <18624049226@163.com 
> <ma...@163.com>> wrote:
>
>     Hi,
>
>     Currently, after cluster activation, if a node with native
>     persistence
>     is enabled terminates abnormally,when the node is restarted, it
>     cannot
>     join the cluster.
>
>     So the question is:
>
>     1.If the node terminates abnormally, how can the node rejoin the
>     cluster?
>
>     2.How to restart the node gracefully?
>
>

Re: Abnormal termination of nodes with native persistence enabled

Posted by 李玉...@163, 18...@163.com.
thanks!

On 2019/1/14 at 11:44 PM, ilya.kasnacheev wrote:
> Hello!
>
> Please see
> http://apache-ignite-users.70518.x6.nabble.com/Native-persistence-ignite-server-failed-to-join-when-client-has-been-started-td26248.html#a26250
> with regards to visor node.
>
>> Node with BaselineTopology cannot join mixed cluster running in
>> compatibility mode
> This is the key.
>
> Upgrade to 2.7 is recommended since
> https://issues.apache.org/jira/browse/IGNITE-8774 is fixed there.
>
> Regards,
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Abnormal termination of nodes with native persistence enabled

Posted by "ilya.kasnacheev" <il...@gmail.com>.
Hello!

Please see
http://apache-ignite-users.70518.x6.nabble.com/Native-persistence-ignite-server-failed-to-join-when-client-has-been-started-td26248.html#a26250
with regards to visor node.

> Node with BaselineTopology cannot join mixed cluster running in
> compatibility mode

This is the key.

Upgrading to 2.7 is recommended, since
https://issues.apache.org/jira/browse/IGNITE-8774 is fixed there.

Regards,



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
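
Since the advice above hinges on the Ignite version, it can help to confirm which version each node actually runs. The following is a minimal Python sketch that reads the version from the startup banner in a node's log (the "ver. 2.6.0#..." line) and compares it with 2.7.0. The helper names ignite_version and predates_fix are hypothetical, and the (2, 7, 0) threshold is taken only from the recommendation in the reply above.

```python
import re

# Matches the version printed in the startup banner,
# e.g. "ver. 2.6.0#20180710-sha1:669feacc".
VERSION_RE = re.compile(r"ver\.\s+(\d+)\.(\d+)\.(\d+)#")


def ignite_version(log_text):
    """Return the first banner version found as a tuple of ints, or None."""
    match = VERSION_RE.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def predates_fix(version, fixed_in=(2, 7, 0)):
    """True if a known version is older than the release named in the reply."""
    return version is not None and version < fixed_in
```

For the node in this thread, the banner reports 2.6.0, so predates_fix(ignite_version(log_text)) would be true. Checking every node's log this way shows which nodes still need the upgrade.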

Re: Abnormal termination of nodes with native persistence enabled

Posted by 李玉...@163, 18...@163.com.
Hi,

Here is the log4j log:

-------------------------log start----------------------------

[10:33:42:399] [WARN]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.warning(Log4J2Logger.java:488) 
- Attempting to stop an already stopped Ignite instance (ignore): null
[10:33:44:318] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) -
>>>    __________  ________________
>>>   /  _/ ___/ |/ /  _/_  __/ __/
>>>  _/ // (7 7    // /  / / / _/
>>> /___/\___/_/|_/___/ /_/ /___/
>>>
>>> ver. 2.6.0#20180710-sha1:669feacc
>>> 2018 Copyright(C) Apache Software Foundation
>>>
>>> Ignite documentation: http://ignite.apache.org
[10:33:44:321] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Config URL: 
file:/opt/ignite/apache-ignite-fabric-2.6.0-bin/config/practice-config.xml
[10:33:44:337] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- IgniteConfiguration [igniteInstanceName=null, pubPoolSize=8, 
svcPoolSize=8, callbackPoolSize=8, 
stripedPoolSize=8, sysPoolSize=8, mgmtPoolSize=4, igfsPoolSize=4, 
dataStreamerPoolSize=8, utilityCachePoolSize=8, 
utilityCacheKeepAliveTime=60000, p2pPoolSize=2, qryPoolSize=8, 
igniteHome=/opt/ignite/apache-ignite-fabric-2.6.0-bin, 
igniteWorkDir=/opt/ignite/apache-ignite-fabric-2.6.0-bin/work, 
mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer@6f94fa3e, 
nodeId=7568b741-7451-458b-a805-89c67ac014fb, 
marsh=org.apache.ignite.internal.binary.BinaryMarshaller@4e0ae11f, 
marshLocJobs=false, daemon=false, p2pEnabled=true, netTimeout=5000, 
sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000, 
metricsUpdateFreq=2000, metricsExpTime=9223372036854775807, 
discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=15000, 
ackTimeout=60000, marsh=null, reconCnt=10, reconDelay=2000, 
maxAckTimeout=600000, forceSrvMode=false, clientReconnectDisabled=false, 
internalLsnr=null], segPlc=STOP, segResolveAttempts=2, 
waitForSegOnStart=true, allResolversPassReq=true, segChkFreq=10000, 
commSpi=TcpCommunicationSpi [connectGate=null, connPlc=null, 
enableForcibleNodeKill=false, enableTroubleshootingLog=false, 
srvLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2@4c2bb6e0, 
locAddr=null, locHost=null, locPort=47100, locPortRange=100, 
shmemPort=-1, directBuf=true, directSndBuf=false, 
idleConnTimeout=600000, connTimeout=5000, maxConnTimeout=600000, 
reconCnt=10, sockSndBuf=32768, sockRcvBuf=32768, msgQueueLimit=0, 
slowClientQueueLimit=0, nioSrvr=null, shmemSrv=null, 
usePairedConnections=false, connectionsPerNode=1, tcpNoDelay=true, 
filterReachableAddresses=false, ackSndThreshold=32, 
unackedMsgsBufSize=0, sockWriteTimeout=2000, lsnr=null, boundTcpPort=-1, 
boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0, addrRslvr=null, 
ctxInitLatch=java.util.concurrent.CountDownLatch@3e62d773[Count = 1], 
stopping=false, 
metricsLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationMetricsListener@4ef74c30], 
evtSpi=org.apache.ignite.spi.eventstorage.NoopEventStorageSpi@7283d3eb, 
colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi [lsnr=null], 
indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi@47c81abf, 
addrRslvr=null, clientMode=false, rebalanceThreadPoolSize=1, 
txCfg=org.apache.ignite.configuration.TransactionConfiguration@776a6d9b, 
cacheSanityCheckEnabled=true, discoStartupDelay=60000, 
deployMode=PRIVATE, p2pMissedCacheSize=100, locHost=null, 
timeSrvPortBase=31100, timeSrvPortRange=100, 
failureDetectionTimeout=10000, clientFailureDetectionTimeout=30000, 
metricsLogFreq=60000, hadoopCfg=null, 
connectorCfg=org.apache.ignite.configuration.ConnectorConfiguration@21d03963, 
odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration 
[seqReserveSize=1000, cacheMode=PARTITIONED, backups=1, aff=null, 
grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null, 
binaryCfg=null, memCfg=null, pstCfg=null, dsCfg=DataStorageConfiguration 
[sysRegionInitSize=41943040, sysCacheMaxSize=104857600, pageSize=0, 
concLvl=4, dfltDataRegConf=DataRegionConfiguration [name=default, 
maxSize=34359738368, initSize=268435456, swapPath=null, 
pageEvictionMode=DISABLED, evictionThreshold=0.9, 
emptyPagesPoolSize=100, metricsEnabled=false, metricsSubIntervalCount=5, 
metricsRateTimeInterval=60000, persistenceEnabled=true, 
checkpointPageBufSize=0], storagePath=/data/ignite/storage, 
checkpointFreq=180000, lockWaitTime=10000, checkpointThreads=4, 
checkpointWriteOrder=SEQUENTIAL, walHistSize=20, walSegments=10, 
walSegmentSize=67108864, walPath=/data/ignite/wal, 
walArchivePath=db/wal/archive, metricsEnabled=false, walMode=LOG_ONLY, 
walTlbSize=131072, walBuffSize=0, walFlushFreq=2000, walFsyncDelay=1000, 
walRecordIterBuffSize=67108864, alwaysWriteFullPages=false, 
fileIOFactory=org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory@18ece7f4, 
metricsSubIntervalCnt=5, metricsRateTimeInterval=60000, 
walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false, 
walCompactionEnabled=true], activeOnStart=true, autoActivation=true, 
longQryWarnTimeout=3000, sqlConnCfg=null, 
cliConnCfg=ClientConnectorConfiguration [host=10.37.184.213, port=10800, 
portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true, 
maxOpenCursorsPerConn=128, threadPoolSize=8, idleTimeout=0, 
jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true, 
sslEnabled=false, useIgniteSslCtxFactory=true, sslClientAuth=false, 
sslCtxFactory=null], authEnabled=false, 
failureHnd=RestartProcessFailureHandler [], 
commFailureRslvr=null]
[10:33:44:339] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Daemon mode: off
[10:33:44:339] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- OS: Linux 3.10.0-229.el7.x86_64 amd64
[10:33:44:339] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- OS user: root
[10:33:44:340] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- PID: 25169
[10:33:44:340] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Language runtime: Java Platform API Specification ver. 1.8
[10:33:44:340] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- VM information: Java(TM) SE Runtime Environment 1.8.0_151-b12 Oracle 
Corporation Java HotSpot(TM) 64-Bit Server VM 25.151-b12
[10:33:44:342] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- VM total memory: 8.0GB
[10:33:44:342] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Remote Management [restart: on, REST: on, JMX (remote: on, port: 
49224, auth: off, ssl: off)]
[10:33:44:343] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Logger: Log4J2Logger [quiet=false, config=config/log4j2.xml]
[10:33:44:343] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- IGNITE_HOME=/opt/ignite/apache-ignite-fabric-2.6.0-bin
[10:33:44:344] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- VM arguments: [-Xms1g, -Xmx8g, -XX:+AggressiveOpts, 
-XX:MaxMetaspaceSize=384m, -XX:+AlwaysPreTouch, 
-XX:+ScavengeBeforeFullGC, -XX:+DisableExplicitGC, -XX:+UseG1GC, 
-Xss4m, -Djava.net.preferIPv4Stack=true, -DIGNITE_QUIET=false, 
-DIGNITE_SUCCESS_FILE=/opt/ignite/apache-ignite-fabric-2.6.0-bin/work/ignite_success_911c4c15-f1f8-49b4-9a92-85e54059d4d4, 
-Dcom.sun.management.jmxremote, 
-Dcom.sun.management.jmxremote.port=49224, 
-Dcom.sun.management.jmxremote.authenticate=false, 
-Dcom.sun.management.jmxremote.ssl=false, 
-DIGNITE_HOME=/opt/ignite/apache-ignite-fabric-2.6.0-bin, 
-DIGNITE_PROG_NAME=/opt/ignite/apache-ignite-fabric-2.6.0-bin/bin/ignite.sh]
[10:33:44:344] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- System cache's DataRegion size is configured to 40 MB. Use 
DataStorageConfiguration.systemCacheMemorySize property to change the 
setting.
[10:33:44:364] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Configured caches [in 'sysMemPlc' dataRegion: ['ignite-sys-cache']]
[10:33:44:364] [WARN]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.warning(Log4J2Logger.java:488) 
- Peer class loading is enabled (disable it in production for 
performance and deployment consistency reasons)
[10:33:44:377] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- 3-rd party licenses can be found at: 
/opt/ignite/apache-ignite-fabric-2.6.0-bin/libs/licenses
[10:33:44:378] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Local node user attribute [DATA_ROLE=BUDS]
[10:33:44:436] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Configured plugins:
[10:33:44:437] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- ^-- None
[10:33:44:437] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) -
[10:33:44:437] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Configured failure handler: [hnd=RestartProcessFailureHandler []]
[10:33:44:485] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Successfully bound communication NIO server to TCP port [port=47100, 
locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, 
pairedConn=false]
[10:33:44:488] [WARN]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.warning(Log4J2Logger.java:488) 
- Message queue limit is set to 0 which may lead to potential OOMEs when 
running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to 
message queues growth on sender and receiver sides.
[10:33:44:511] [WARN]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.warning(Log4J2Logger.java:488) 
- Checkpoints are disabled (to enable configure any GridCheckpointSpi 
implementation)
[10:33:44:537] [WARN]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.warning(Log4J2Logger.java:488) 
- Collision resolution is disabled (all jobs will be activated upon 
arrival).
[10:33:44:538] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Security status [authentication=off, tls/ssl=off]
[10:33:44:572] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, 
locNodeId=7568b741-7451-458b-a805-89c67ac014fb]
[10:33:44:581] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Successfully locked persistence storage folder 
[/data/ignite/storage/node00-cf5501f0-c13e-457f-8c34-7477b2101905]
[10:33:44:582] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Consistent ID used for local node is 
[cf5501f0-c13e-457f-8c34-7477b2101905] according to persistence data 
storage folders
[10:33:44:582] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Resolved directory for serialized binary metadata: 
/opt/ignite/apache-ignite-fabric-2.6.0-bin/work/binary_meta/node00-cf5501f0-c13e-457f-8c34-7477b2101905
[10:33:44:707] [WARN]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.warning(Log4J2Logger.java:488) 
- Deployment mode for cache is not CONTINUOUS or SHARED (it is 
recommended that you change deployment mode and restart): PRIVATE
[10:33:44:809] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Resolved page store work directory: 
/data/ignite/storage/node00-cf5501f0-c13e-457f-8c34-7477b2101905
[10:33:44:810] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Resolved write ahead log work directory: 
/data/ignite/wal/node00-cf5501f0-c13e-457f-8c34-7477b2101905
[10:33:44:810] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Resolved write ahead log archive directory: 
/opt/ignite/apache-ignite-fabric-2.6.0-bin/work/db/wal/archive/node00-cf5501f0-c13e-457f-8c34-7477b2101905
[10:33:44:836] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Started write-ahead log manager [mode=LOG_ONLY]
[10:33:44:847] [WARN]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.warning(Log4J2Logger.java:488) 
- Page eviction mode set for [DR_MEM] data will have no effect because 
the oldest pages are evicted automatically if Ignite persistence is enabled.
[10:33:44:872] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Read checkpoint status 
[startMarker=/data/ignite/storage/node00-cf5501f0-c13e-457f-8c34-7477b2101905/cp/1547192473734-c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6-START.bin, 
endMarker=/data/ignite/storage/node00-cf5501f0-c13e-457f-8c34-7477b2101905/cp/1547192473734-c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6-END.bin]
[10:33:44:884] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Started page memory [memoryAllocated=100.0MiB, pages=24812, 
tableSize=1.9MiB, checkpointBuffer=100.0MiB]
[10:33:44:885] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Checking memory state [lastValidPos=FileWALPointer [idx=3612, 
fileOff=49901060, len=40363], 
lastMarked=FileWALPointer [idx=3612, fileOff=49901060, len=40363], 
lastCheckpointId=c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6]
[10:33:44:915] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Stopping WAL iteration due to an exception: Failed to read WAL record 
at position: 49941423, 
ptr=FileWALPointer [idx=3612, fileOff=49941423, len=0]
[10:33:44:916] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Found last checkpoint marker 
[cpId=c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6, pos=FileWALPointer 
[idx=3612, fileOff=49901060, len=40363]]
[10:33:44:948] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Applying lost cache updates since last checkpoint record 
[lastMarked=FileWALPointer [idx=3612, fileOff=49901060, len=40363], 
lastCheckpointId=c5ed49c2-0263-4d21-9c1e-63fcd0f6c9d6]
[10:33:44:962] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Stopping WAL iteration due to an exception: Failed to read WAL record 
at position: 49941423, 
ptr=FileWALPointer [idx=3612, fileOff=49941423, len=0]
[10:33:44:963] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Finished applying WAL changes [updatesApplied=0, time=20ms]
[10:33:45:015] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Restoring history for BaselineTopology[id=0]
[10:33:45:141] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Client connector processor has started on TCP port 10800
[10:33:45:197] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Command protocol successfully started [name=TCP binary, 
host=0.0.0.0/0.0.0.0, port=11211]
[10:33:45:224] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Non-loopback local IPs: 10.37.184.213
[10:33:45:225] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Enabled local MACs: FA163E3C967A
[10:33:46:584] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- TCP discovery accepted incoming connection [rmtAddr=/10.37.184.217, 
rmtPort=37330]
[10:33:46:595] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- TCP discovery spawning a new thread for connection 
[rmtAddr=/10.37.184.217, rmtPort=37330]
[10:33:46:596] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Started serving remote node connection [rmtAddr=/10.37.184.217:37330, 
rmtPort=37330]
[10:33:46:667] [ERROR]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.error(Log4J2Logger.java:498) 
- TcpDiscoverSpi's message worker thread failed abnormally. Stopping the 
node in order to prevent cluster wide instability.
org.apache.ignite.IgniteException: Node 
with BaselineTopology cannot join mixed cluster running in compatibility 
mode
     at 
org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.onGridDataReceived(GridClusterStateProcessor.java:714) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$5.onExchange(GridDiscoveryManager.java:883) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.onExchange(TcpDiscoverySpi.java:1939) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:4354) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2744) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) 
[ignite-core-2.6.0.jar:2.6.0]
[10:33:46:672] [ERROR]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.error(Log4J2Logger.java:498) 
- Critical system error detected. Will be handled accordingly to 
configured handler [hnd=class o.a.i.failure.RestartProcessFailureHandler, 
failureCtx=FailureContext 
[type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Node 
with BaselineTopology cannot join mixed cluster running in compatibility 
mode]]
org.apache.ignite.IgniteException: Node with BaselineTopology 
cannot join mixed cluster running in compatibility mode
     at 
org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.onGridDataReceived(GridClusterStateProcessor.java:714) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$5.onExchange(GridDiscoveryManager.java:883) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.onExchange(TcpDiscoverySpi.java:1939) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:4354) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2744) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) 
[ignite-core-2.6.0.jar:2.6.0]
[10:33:46:672] [ERROR]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.error(Log4J2Logger.java:498) 
- Failed to start manager: GridManagerAdapter [enabled=true, 
name=o.a.i.i.managers.discovery.GridDiscoveryManager]
org.apache.ignite.IgniteCheckedException: Failed 
to start SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=15000, 
ackTimeout=60000, marsh=JdkMarshaller 
[clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@41f4fe5], 
reconCnt=10, reconDelay=2000, 
maxAckTimeout=600000, forceSrvMode=false, clientReconnectDisabled=false, 
internalLsnr=null]
     at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:300) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1721) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1028) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1069) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:955) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:854) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:724) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:693) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.Ignition.start(Ignition.java:352) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) 
[ignite-core-2.6.0.jar:2.6.0]
Caused by: org.apache.ignite.spi.IgniteSpiException: Thread has been 
interrupted.
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:938) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:373) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1948) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297) 
~[ignite-core-2.6.0.jar:2.6.0]
     ... 13 more
[10:33:46:673] [ERROR]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.error(Log4J2Logger.java:498) 
- Ignite node is in invalid state due to a critical failure.
[10:33:46:673] [ERROR]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.error(Log4J2Logger.java:498) 
- Got exception while starting (will rollback startup routine).
org.apache.ignite.IgniteCheckedException: Failed to start manager: 
GridManagerAdapter [enabled=true, 
name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
     at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1726) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1028) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2014) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1723) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1151) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1069) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:955) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:854) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:724) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:693) 
[ignite-core-2.6.0.jar:2.6.0]
     at org.apache.ignite.Ignition.start(Ignition.java:352) 
[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301) 
[ignite-core-2.6.0.jar:2.6.0]
Caused by: org.apache.ignite.IgniteCheckedException: Failed to start 
SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=15000, 
ackTimeout=60000, marsh=JdkMarshaller 
[clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@41f4fe5], 
reconCnt=10, reconDelay=2000, 
maxAckTimeout=600000, forceSrvMode=false, clientReconnectDisabled=false, 
internalLsnr=null]
     at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:300) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1721) 
~[ignite-core-2.6.0.jar:2.6.0]
     ... 11 more
Caused by: org.apache.ignite.spi.IgniteSpiException: Thread has been 
interrupted.
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:938) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:373) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:1948) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:915) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1721) 
~[ignite-core-2.6.0.jar:2.6.0]
     ... 11 more
[10:33:46:674] [ERROR]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.error(Log4J2Logger.java:498) 
- Runtime error caught during grid runnable execution: IgniteSpiThread 
[name=tcp-disco-msg-worker-#3]
org.apache.ignite.IgniteException: Node with BaselineTopology 
cannot join mixed cluster running in compatibility mode
     at 
org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.onGridDataReceived(GridClusterStateProcessor.java:714) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$5.onExchange(GridDiscoveryManager.java:883) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.onExchange(TcpDiscoverySpi.java:1939) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:4354) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2744) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621) 
~[ignite-core-2.6.0.jar:2.6.0]
     at 
org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) 
[ignite-core-2.6.0.jar:2.6.0]
[10:33:46:675] [ERROR]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.error(Log4J2Logger.java:498) 
- Restarting JVM on Ignite failure: [failureCtx=FailureContext 
[type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Node 
with BaselineTopology cannot 
join mixed cluster running in compatibility mode]]
[10:33:46:679] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Stopped the node successfully in response to TcpDiscoverySpi's message 
worker thread abnormal termination.
[10:33:46:687] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Command protocol successfully stopped: TCP binary
[10:33:46:696] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) 
- Finished serving remote node connection [rmtAddr=/10.37.184.217:37330, 
rmtPort=37330]
[10:33:46:862] [INFO]- 
org.apache.ignite.logger.log4j2.Log4J2Logger.info(Log4J2Logger.java:478) -
>>> +---------------------------------------------------------------------------------+
>>> Ignite ver. 2.6.0#20180710-sha1:669feacc5d3a4e60efcdd300dc8de99780f38eed stopped OK
>>> +---------------------------------------------------------------------------------+
>>> Grid uptime: 00:00:03.470

-------------------------log end----------------------------
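
The log above was hard-wrapped twice, so several entries are fused onto one line and some tokens are split across lines. The following is a minimal sketch, assuming Python 3.9+ and the `[HH:MM:SS:mmm] [LEVEL]` prefix seen in this paste, for splitting such a paste back into entries and keeping only the ERROR ones. The function names are hypothetical and are not part of Ignite or any logging library.

```python
import re

# Start of a log entry as it appears in this paste: "[HH:MM:SS:mmm] [LEVEL]".
ENTRY_START = re.compile(
    r"\[\d{2}:\d{2}:\d{2}:\d{3}\]\s*\[(INFO|WARN|ERROR)\s*\]"
)


def split_entries(raw: str) -> list[str]:
    """Undo the hard wrapping and return one string per log entry."""
    # Wrapped fragments are concatenated without a separator, because most
    # wraps fell after a space or in the middle of a token. Where a wrap
    # swallowed a space, two words may end up fused.
    joined = "".join(raw.splitlines())
    starts = [m.start() for m in ENTRY_START.finditer(joined)]
    bounds = starts + [len(joined)]
    # Text before the first timestamp (banner, separators) is dropped.
    return [joined[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def error_entries(raw: str) -> list[str]:
    """Keep only the entries whose level is ERROR."""
    return [
        entry
        for entry in split_entries(raw)
        if ENTRY_START.match(entry).group(1) == "ERROR"
    ]
```

Run over the pasted log, `error_entries` should return the ERROR entries together with their stack traces, which makes the repeated "Node with BaselineTopology cannot join mixed cluster running in compatibility mode" exception easier to spot.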


On 2019/1/12 at 12:58 AM, Ilya Kasnacheev wrote:
> Hello!
>
> Can you show what you get in logs as your nodes attempt to join the 
> cluster?
>
> Regards,
> -- 
> Ilya Kasnacheev
>
>
> Fri, 11 Jan 2019 at 19:43, 李玉珏@163 <18624049226@163.com>:
>
>     Hi,
>
>     Currently, after cluster activation, if a node with native
>     persistence enabled terminates abnormally, it cannot rejoin the
>     cluster when it is restarted.
>
>     So the questions are:
>
>     1. If the node terminates abnormally, how can it rejoin the
>     cluster?
>
>     2. How can the node be restarted gracefully?
>
>

Re: Abnormal termination of nodes with native persistence enabled

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Can you show what you get in logs as your nodes attempt to join the cluster?

Regards,
-- 
Ilya Kasnacheev


Fri, 11 Jan 2019 at 19:43, 李玉珏@163 <18...@163.com>:

> Hi,
>
> Currently, after cluster activation, if a node with native persistence
> enabled terminates abnormally, it cannot rejoin the cluster when it is
> restarted.
>
> So the questions are:
>
> 1. If the node terminates abnormally, how can it rejoin the cluster?
>
> 2. How can the node be restarted gracefully?
>
>
>