You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@ignite.apache.org by Vladimir Ozerov <vo...@gridgain.com> on 2016/05/11 11:31:49 UTC

Re: Ignite Still waiting for initial partition map exchange problem

H Kevin,

Does it lead to any problems with application? If yes, please describe your
observations and provide thread dump of the node.

Vladimir.

On Wed, May 11, 2016 at 11:16 AM, Zhengqingzheng <zh...@huawei.com>
wrote:

> Hi there,
>
> When I tried to run an writethrough example, I got the following error
> message from log file:
>
>
>
> [15:29:48,902][WARNING][main][GridCachePartitionExchangeManager] Still
> waiting for initial partition map exchange
> [fut=GridDhtPartitionsExchangeFuture [dummy=false, forcePreload=false,
> reassign=false, discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode
> [id=98392607-3eaa-417c-8782-5a180e655ec2, addrs=[0:0:0:0:0:0:0:1,
> 10.135.66.169, 127.0.0.1], sockAddrs=[
> NKGY2Z002066461.china.huawei.com/10.135.66.169:0, /0:0:0:0:0:0:0:1:0, /
> 10.135.66.169:0, /127.0.0.1:0], discPort=0, order=8, intOrder=0,
> lastExchangeTime=1462951688135, loc=true, ver=1.5.0#19700101-sha1:00000000,
> isClient=true], topVer=8, nodeId8=98392607, msg=null, type=NODE_JOINED,
> tstamp=1462951698228], rcvdIds=GridConcurrentHashSet [elements=[]],
> rmtIds=[6bb71edf-8c09-46f4-8138-d36310615582,
> c34f8715-7707-4052-96ba-2120c3a38158], exchId=GridDhtPartitionExchangeId
> [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], nodeId=98392607,
> evt=NODE_JOINED], init=true, ready=false, replied=false, added=false,
> initFut=GridFutureAdapter [resFlag=2, res=true, startTime=1462951698725,
> endTime=1462951698781, ignoreInterrupts=false, lsnr=null, state=DONE],
> topSnapshot=null, lastVer=null, partReleaseFut=null, skipPreload=true,
> clientOnlyExchange=true, initTs=1462951698725,
> oldest=c34f8715-7707-4052-96ba-2120c3a38158, oldestOrder=1, evtLatch=0,
> remaining=[6bb71edf-8c09-46f4-8138-d36310615582,
> c34f8715-7707-4052-96ba-2120c3a38158], super=GridFutureAdapter [resFlag=0,
> res=null, startTime=1462951698725, endTime=0, ignoreInterrupts=false,
> lsnr=null, state=INIT]]]
>
> [15:30:18,933][WARNING][main][GridCachePartitionExchangeManager] Still
> waiting for initial partition map exchange
> [fut=GridDhtPartitionsExchangeFuture [dummy=false, forcePreload=false,
> reassign=false, discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode
> [id=98392607-3eaa-417c-8782-5a180e655ec2, addrs=[0:0:0:0:0:0:0:1,
> 10.135.66.169, 127.0.0.1], sockAddrs=[
> NKGY2Z002066461.china.huawei.com/10.135.66.169:0, /0:0:0:0:0:0:0:1:0, /
> 10.135.66.169:0, /127.0.0.1:0], discPort=0, order=8, intOrder=0,
> lastExchangeTime=1462951688135, loc=true, ver=1.5.0#19700101-sha1:00000000,
> isClient=true], topVer=8, nodeId8=98392607, msg=null, type=NODE_JOINED,
> tstamp=1462951698228], rcvdIds=GridConcurrentHashSet [elements=[]],
> rmtIds=[6bb71edf-8c09-46f4-8138-d36310615582,
> c34f8715-7707-4052-96ba-2120c3a38158], exchId=GridDhtPartitionExchangeId
> [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], nodeId=98392607,
> evt=NODE_JOINED], init=true, ready=false, replied=false, added=false,
> initFut=GridFutureAdapter [resFlag=2, res=true, startTime=1462951698725,
> endTime=1462951698781, ignoreInterrupts=false, lsnr=null, state=DONE],
> topSnapshot=null, lastVer=null, partReleaseFut=null, skipPreload=true,
> clientOnlyExchange=true, initTs=1462951698725,
> oldest=c34f8715-7707-4052-96ba-2120c3a38158, oldestOrder=1, evtLatch=0,
> remaining=[6bb71edf-8c09-46f4-8138-d36310615582,
> c34f8715-7707-4052-96ba-2120c3a38158], super=GridFutureAdapter [resFlag=0,
> res=null, startTime=1462951698725, endTime=0, ignoreInterrupts=false,
> lsnr=null, state=INIT]]]
>
>
>
>
>
> Any idea to fix it?
>
>
>
> Best regards,
>
> Kevin
>

Re: Ignite Still waiting for initial partition map exchange problem

Posted by Denis Magda <dm...@gridgain.com>.
Hi Kevin,

> My first observation is that, the client side keeps running without doing any write through actions to the database;

Client does write-through to a database only if it’s a coordinator of an Ignite transaction or writes to a transaction cache. In other cases server nodes are responsible for writing data to the storage. How do you detect that there are no updates into the storage? Don’t you use write-behind mode that postpones sync with database?

Second, I cannot get into the cmd interface to check cache or node information (running ./ignitevisorcmd.sh),

Try to use the same configuration file that is used by your server nodes. If this doesn’t help create thread dumps and check where the visor is hanging.

Also I see that you specified big Java heap size. Such big heap size can lead to long GC pauses and to the situation when a node is kicked out of the cluster. If you expect to store more than 10 GB of cache data on every node then I would suggest switching to off-heap caches [1] and tune GC properly [2]. In addition refer to similar discussions [3] and [4]

[1] https://apacheignite.readme.io/docs/off-heap-memory
[2]  https://apacheignite.readme.io/docs/jvm-and-system-tuning#jvm-tuning-for-clusters-with-off_heap-caches <https://apacheignite.readme.io/docs/jvm-and-system-tuning#jvm-tuning-for-clusters-with-off_heap-caches>
[3] http://apache-ignite-users.70518.x6.nabble.com/Cluster-size-for-Terrabytes-of-data-tp4857.html <http://apache-ignite-users.70518.x6.nabble.com/Cluster-size-for-Terrabytes-of-data-tp4857.html>
[4] http://apache-ignite-users.70518.x6.nabble.com/Slow-data-loading-and-high-very-memory-usage-issues-tp4798.html <http://apache-ignite-users.70518.x6.nabble.com/Slow-data-loading-and-high-very-memory-usage-issues-tp4798.html>

—
Denis

> On May 11, 2016, at 4:11 PM, Zhengqingzheng <zh...@huawei.com> wrote:
> 
> Hi Vladimir,
> The warning message disappeared. However, there are some other abnormal behavior occurred.
> My first observation is that, the client side keeps running without doing any write through actions to the database;
>  
> Second, I cannot get into the cmd interface to check cache or node information (running ./ignitevisorcmd.sh),
> After typing open command, and select the default ‘0’ configure file, It will stop at:
> [20:50:55]
> [20:50:55] Quiet mode.
> [20:50:55]   ^-- Logging to file '/opt/apache-ignite-bin/work/log/ignite-dc2a77e5.0.log'
> [20:50:55]   ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
> [20:50:55]
> [20:50:55] OS: Linux 3.13.0-24-generic amd64
> [20:50:55] VM information: OpenJDK Runtime Environment 1.7.0_95-b00 Oracle Corporation OpenJDK 64-Bit Server VM 24.95-b01
> [20:50:55] Configured plugins:
> [20:50:55]   ^-- None
> [20:50:55]
> [20:50:55] Security status [authentication=off, tls/ssl=off]
>  
> No further information occurred
>  
> The latest log file shows information like this:
> [20:49:28,368][INFO][main][IgniteKernal]
>  
> >>>    __________  ________________ 
> >>>   /  _/ ___/ |/ /  _/_  __/ __/ 
> >>>  _/ // (7 7    // /  / / / _/   
> >>> /___/\___/_/|_/___/ /_/ /___/  
> >>>
> >>> ver. 1.5.0-final#20151229-sha1:f1f8cda2
> >>> 2015 Copyright(C) Apache Software Foundation
> >>>
> >>> Ignite documentation: http://ignite.apache.org <http://ignite.apache.org/>
>  
> [20:49:28,370][INFO][main][IgniteKernal] Config URL: file:/opt/apache-ignite-bin/config/default-config.xml
> [20:49:28,370][INFO][main][IgniteKernal] Daemon mode: off
> [20:49:28,371][INFO][main][IgniteKernal] OS: Linux 3.13.0-24-generic amd64
> [20:49:28,371][INFO][main][IgniteKernal] OS user: root
> [20:49:28,371][INFO][main][IgniteKernal] Language runtime: Java Platform API Specification ver. 1.7
> [20:49:28,371][INFO][main][IgniteKernal] VM information: OpenJDK Runtime Environment 1.7.0_95-b00 Oracle Corporation OpenJDK 64-Bit Server VM 24.95-b01
> [20:49:28,372][INFO][main][IgniteKernal] VM total memory: 44.0GB
> [20:49:28,372][INFO][main][IgniteKernal] Remote Management [restart: on, REST: on, JMX (remote: on, port: 49115, auth: off, ssl: off)]
> [20:49:28,372][INFO][main][IgniteKernal] IGNITE_HOME=/opt/apache-ignite-bin
> [20:49:28,372][INFO][main][IgniteKernal] VM arguments: [-Xms1g, -Xmx1g, -XX:+AggressiveOpts, -XX:MaxPermSize=256m, -DIGNITE_QUIET=true, -DIGNITE_SUCCESS_FILE=/opt/apache-ignite-bin/work/ignite_success_aecbd7da-db6b-4e53-ae30-cde30062da2e, -Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.port=49115, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.ssl=false, -DIGNITE_HOME=/opt/apache-ignite-bin, -DIGNITE_PROG_NAME=./ignite.sh, -Xmx48g, -Xms46g]
> [20:49:28,372][INFO][main][IgniteKernal] Configured caches ['ignite-marshaller-sys-cache', 'ignite-sys-cache', 'ignite-atomics-sys-cache', 'default']
> [20:49:28,375][INFO][main][IgniteKernal] 3-rd party licenses can be found at: /opt/apache-ignite-bin/libs/licenses
> [20:49:28,436][INFO][main][IgniteKernal] Non-loopback local IPs: 10.12.23.34, fe80:0:0:0:2a6e:d4ff:fe89:c620%2
> [20:49:28,436][INFO][main][IgniteKernal] Enabled local MACs: 286ED489C620
> [20:49:28,446][INFO][main][IgnitePluginProcessor] Configured plugins:
> [20:49:28,446][INFO][main][IgnitePluginProcessor]   ^-- None
> [20:49:28,446][INFO][main][IgnitePluginProcessor]
> [20:49:28,523][INFO][main][TcpCommunicationSpi] IPC shared memory server endpoint started [port=48100, tokDir=/opt/apache-ignite-bin/work/ipc/shmem/65b246cd-8da3-42f6-b434-5a0cdb20ea6c-22394]
> [20:49:28,523][INFO][main][TcpCommunicationSpi] Successfully bound shared memory communication to TCP port [port=48100, locHost=0.0.0.0/0.0.0.0]
> [20:49:28,539][INFO][main][TcpCommunicationSpi] Successfully bound to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0]
> [20:49:28,566][WARNING][main][NoopCheckpointSpi] Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
> [20:49:28,597][WARNING][main][GridCollisionManager] Collision resolution is disabled (all jobs will be activated upon arrival).
> [20:49:28,601][WARNING][main][NoopSwapSpaceSpi] Swap space is disabled. To enable use FileSwapSpaceSpi.
> [20:49:28,603][INFO][main][IgniteKernal] Security status [authentication=off, tls/ssl=off]
> [20:49:28,849][INFO][main][GridTcpRestProtocol] Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211]
> [20:49:28,931][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0]
> [20:49:29,560][WARNING][main][GridDiscoveryManager] Local java version is different from remote [loc=7, rmt=8]
> [20:49:29,797][INFO][main][GridCacheProcessor] Started cache [name=ignite-marshaller-sys-cache, mode=REPLICATED]
> [20:49:29,830][INFO][main][GridCacheProcessor] Started cache [name=ignite-atomics-sys-cache, mode=PARTITIONED]
> [20:49:31,044][INFO][main][GridCacheProcessor] Started cache [name=default, mode=PARTITIONED]
> [20:49:31,051][INFO][main][GridCacheProcessor] Started cache [name=ignite-sys-cache, mode=REPLICATED]
> [20:49:31,491][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Cache rebalancing scheduled: [cache=ignite-sys-cache, waitList=[]]
> [20:49:31,492][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Cache rebalancing scheduled: [cache=ignite-marshaller-sys-cache, waitList=[]]
> [20:49:31,492][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Cache rebalancing scheduled: [cache=ignite-atomics-sys-cache, waitList=[ignite-sys-cache, ignite-marshaller-sys-cache]]
> [20:49:31,492][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Cache rebalancing scheduled: [cache=null, waitList=[ignite-sys-cache, ignite-marshaller-sys-cache, ignite-atomics-sys-cache]]
> [20:49:31,493][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Rebalancing required[top=AffinityTopologyVersion [topVer=2, minorTopVer=0], evt=NODE_JOINED, node=65b246cd-8da3-42f6-b434-5a0cdb20ea6c]
> [20:49:31,493][INFO][exchange-worker-#50%null%][GridDhtPartitionDemander] <ignite-marshaller-sys-cache> Starting rebalancing [cache=ignite-marshaller-sys-cache, mode=SYNC, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, partitionsCount=20, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], updateSeq=1]
> [20:49:31,496][INFO][sys-#20%null%][GridDhtPartitionDemander] <ignite-sys-cache> Starting rebalancing [cache=ignite-sys-cache, mode=SYNC, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, partitionsCount=100, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], updateSeq=1]
> [20:49:31,507][INFO][marshaller-cache-#52%null%][GridDhtPartitionDemander] <ignite-marshaller-sys-cache> Completed (final) rebalancing [cache=ignite-marshaller-sys-cache, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], time=10 ms]
> [20:49:31,511][INFO][marshaller-cache-#52%null%][GridDhtPartitionDemander] <ignite-sys-cache> Completed (final) rebalancing [cache=ignite-sys-cache, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], time=20 ms]
> [20:49:31,514][INFO][sys-#20%null%][GridDhtPartitionDemander] <ignite-atomics-sys-cache> Starting rebalancing [cache=ignite-atomics-sys-cache, mode=SYNC, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, partitionsCount=485, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], updateSeq=1]
> [20:49:31,540][INFO][sys-#21%null%][GridDhtPartitionDemander] <ignite-atomics-sys-cache> Completed (final) rebalancing [cache=ignite-atomics-sys-cache, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], time=31 ms]
> [20:49:31,542][INFO][sys-#20%null%][GridDhtPartitionDemander] <default> Starting rebalancing [cache=null, mode=ASYNC, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, partitionsCount=485, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], updateSeq=1]
> [20:49:32,565][INFO][sys-#22%null%][GridDhtPartitionDemander] <default> Completed (final) rebalancing [cache=null, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], time=1025 ms]
> [20:49:32,597][INFO][main][IgniteKernal] Performance suggestions for grid  (fix if possible)
> [20:49:32,597][INFO][main][IgniteKernal] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
> [20:49:32,597][INFO][main][IgniteKernal]   ^-- Enable write-behind to persistent store (set 'writeBehindEnabled' to true)
> [20:49:32,597][INFO][main][IgniteKernal]
> [20:49:32,598][INFO][main][IgniteKernal] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
> [20:49:32,598][INFO][main][IgniteKernal]
> [20:49:32,599][INFO][main][IgniteKernal]
>  
> >>> +----------------------------------------------------------------------------+
> >>> Ignite ver. 1.5.0-final#20151229-sha1:f1f8cda2f3f62231f42a59951bf34c39577c1bec
> >>> +----------------------------------------------------------------------------+
> >>> OS name: Linux 3.13.0-24-generic amd64
> >>> CPU(s): 8
> >>> Heap: 46.0GB
> >>> VM name: 22394@SZV1000071420
> >>> Local node [ID=65B246CD-8DA3-42F6-B434-5A0CDB20EA6C, order=2, clientMode=false]
> >>> Local node addresses: [SZV1000071420/0:0:0:0:0:0:0:1%1, /10.12.23.34, /127.0.0.1]
> >>> Local ports: TCP:11211 TCP:47100 TCP:47500 TCP:48100
>  
> [20:49:32,600][INFO][main][GridDiscoveryManager] Topology snapshot [ver=2, servers=2, clients=0, CPUs=16, heap=76.0GB]
> [20:50:32,598][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.17%, avg=0.24%, GC=0%]
>     ^-- Heap [used=10444MB, free=76.86%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=0, idle=16, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:50:38,461][INFO][ignite-update-notifier-timer][GridUpdateNotifier] Update status is not available.
> [20:50:56,320][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=3, minorTopVer=0], evt=NODE_JOINED, node=dc2a77e5-0aeb-47eb-a0df-651ec6a54802]
> [20:51:32,605][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.13%, avg=0.21%, GC=0%]
>     ^-- Heap [used=10939MB, free=75.77%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=0, idle=16, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:51:37,307][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=4, minorTopVer=0], evt=NODE_LEFT, node=dc2a77e5-0aeb-47eb-a0df-651ec6a54802]
> [20:52:32,608][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.1%, avg=0.2%, GC=0%]
>     ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=0, idle=16, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:53:32,612][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.1%, avg=0.18%, GC=0%]
>     ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=0, idle=16, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:54:32,615][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.17%, avg=0.17%, GC=0%]
>     ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=0, idle=16, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:55:32,620][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.17%, avg=0.16%, GC=0%]
>     ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=0, idle=16, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:56:32,627][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.03%, avg=0.15%, GC=0%]
>     ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=0, idle=16, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:57:02,850][WARNING][disco-event-worker-#48%null%][GridDiscoveryManager] Local java version is different from remote [loc=7, rmt=8]
> [20:57:02,851][INFO][disco-event-worker-#48%null%][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=aca2bc4c-1b3e-4488-9773-0a7efca3aebc, addrs=[0:0:0:0:0:0:0:1, 10.135.66.169, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1:0, /0:0:0:0:0:0:0:1:0, /10.135.66.169:0, /127.0.0.1:0], discPort=0, order=5, intOrder=4, lastExchangeTime=1462971422737, loc=false, ver=1.5.0#19700101-sha1:00000000, isClient=true]
> [20:57:02,852][INFO][disco-event-worker-#48%null%][GridDiscoveryManager] Topology snapshot [ver=5, servers=2, clients=1, CPUs=20, heap=77.0GB]
> [20:57:02,862][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=0], evt=NODE_JOINED, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
> [20:57:08,829][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestDatabaseStat_Cache, mode=PARTITIONED]
> [20:57:08,957][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
> [20:57:09,032][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestDatabaseFieldStat_Cache, mode=PARTITIONED]
> [20:57:09,147][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=2], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
> [20:57:09,196][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestObject_Cache, mode=PARTITIONED]
> [20:57:09,272][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=3], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
> [20:57:09,322][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestColumn_Cache, mode=PARTITIONED]
> [20:57:09,393][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=4], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
> [20:57:09,470][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestIndex_Cache, mode=PARTITIONED]
> [20:57:09,525][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=5], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
> [20:57:09,582][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestUniqueField_Cache, mode=PARTITIONED]
> [20:57:09,643][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=6], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
> [20:57:09,737][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestDataBase_Cache, mode=PARTITIONED]
> [20:57:09,797][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=7], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
> [20:57:32,633][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
>     ^-- CPU [cur=0.5%, avg=0.21%, GC=0%]
>     ^-- Heap [used=11814MB, free=73.83%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=4, idle=12, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:57:55,689][WARNING][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] Local node has detected failed nodes and started cluster-wide procedure. To speed up failure detection please see 'Failure Detection' section under javadoc for 'TcpDiscoverySpi'
> [20:58:32,635][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
>     ^-- CPU [cur=0.13%, avg=0.2%, GC=0%]
>     ^-- Heap [used=12050MB, free=73.31%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=4, idle=12, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [20:59:32,642][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
>     ^-- CPU [cur=0.13%, avg=0.19%, GC=0%]
>     ^-- Heap [used=12050MB, free=73.31%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=4, idle=12, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [21:00:32,650][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
>     ^-- CPU [cur=0.1%, avg=0.19%, GC=0%]
>     ^-- Heap [used=12050MB, free=73.31%, comm=45142MB]
>    ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=4, idle=12, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [21:01:32,656][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
>     ^-- CPU [cur=0.13%, avg=0.18%, GC=0%]
>     ^-- Heap [used=12050MB, free=73.31%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=4, idle=12, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [21:01:38,374][WARNING][disco-event-worker-#48%null%][GridDiscoveryManager] Node FAILED: TcpDiscoveryNode [id=aca2bc4c-1b3e-4488-9773-0a7efca3aebc, addrs=[0:0:0:0:0:0:0:1, 10.135.66.169, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1:0, /0:0:0:0:0:0:0:1:0, /10.135.66.169:0, /127.0.0.1:0], discPort=0, order=5, intOrder=4, lastExchangeTime=1462971422737, loc=false, ver=1.5.0#19700101-sha1:00000000, isClient=true]
> [21:01:38,374][INFO][disco-event-worker-#48%null%][GridDiscoveryManager] Topology snapshot [ver=8, servers=2, clients=0, CPUs=16, heap=77.0GB]
> [21:02:32,661][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.13%, avg=0.17%, GC=0%]
>     ^-- Heap [used=12300MB, free=72.75%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=4, idle=12, qSize=0]
>     ^-- Outbound messages queue [size=0]
> [21:03:32,667][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=65b246cd, name=null]
>     ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
>     ^-- CPU [cur=0.1%, avg=0.17%, GC=0%]
>     ^-- Heap [used=12536MB, free=72.23%, comm=45142MB]
>     ^-- Public thread pool [active=0, idle=16, qSize=0]
>     ^-- System thread pool [active=4, idle=12, qSize=0]
>     ^-- Outbound messages queue [size=0]
>  
>  
>  
> 发件人: Vladimir Ozerov [mailto:vozerov@gridgain.com] 
> 发送时间: 2016年5月11日 19:32
> 收件人: user@ignite.apache.org
> 主题: Re: Ignite Still waiting for initial partition map exchange problem
>  
> H Kevin,
>  
> Does it lead to any problems with application? If yes, please describe your observations and provide thread dump of the node.
>  
> Vladimir.
>  
> On Wed, May 11, 2016 at 11:16 AM, Zhengqingzheng <zhengqingzheng@huawei.com <ma...@huawei.com>> wrote:
> Hi there,
> When I tried to run an writethrough example, I got the following error message from log file:
>  
> [15:29:48,902][WARNING][main][GridCachePartitionExchangeManager] Still waiting for initial partition map exchange [fut=GridDhtPartitionsExchangeFuture [dummy=false, forcePreload=false, reassign=false, discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=98392607-3eaa-417c-8782-5a180e655ec2, addrs=[0:0:0:0:0:0:0:1, 10.135.66.169, 127.0.0.1], sockAddrs=[NKGY2Z002066461.china.huawei.com/10.135.66.169:0 <http://nkgy2z002066461.china.huawei.com/10.135.66.169:0>, /0:0:0:0:0:0:0:1:0, /10.135.66.169:0 <http://10.135.66.169:0/>, /127.0.0.1:0 <http://127.0.0.1:0/>], discPort=0, order=8, intOrder=0, lastExchangeTime=1462951688135, loc=true, ver=1.5.0#19700101-sha1:00000000, isClient=true], topVer=8, nodeId8=98392607, msg=null, type=NODE_JOINED, tstamp=1462951698228], rcvdIds=GridConcurrentHashSet [elements=[]], rmtIds=[6bb71edf-8c09-46f4-8138-d36310615582, c34f8715-7707-4052-96ba-2120c3a38158], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], nodeId=98392607, evt=NODE_JOINED], init=true, ready=false, replied=false, added=false, initFut=GridFutureAdapter [resFlag=2, res=true, startTime=1462951698725, endTime=1462951698781, ignoreInterrupts=false, lsnr=null, state=DONE], topSnapshot=null, lastVer=null, partReleaseFut=null, skipPreload=true, clientOnlyExchange=true, initTs=1462951698725, oldest=c34f8715-7707-4052-96ba-2120c3a38158, oldestOrder=1, evtLatch=0, remaining=[6bb71edf-8c09-46f4-8138-d36310615582, c34f8715-7707-4052-96ba-2120c3a38158], super=GridFutureAdapter [resFlag=0, res=null, startTime=1462951698725, endTime=0, ignoreInterrupts=false, lsnr=null, state=INIT]]]
> [15:30:18,933][WARNING][main][GridCachePartitionExchangeManager] Still waiting for initial partition map exchange [fut=GridDhtPartitionsExchangeFuture [dummy=false, forcePreload=false, reassign=false, discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=98392607-3eaa-417c-8782-5a180e655ec2, addrs=[0:0:0:0:0:0:0:1, 10.135.66.169, 127.0.0.1], sockAddrs=[NKGY2Z002066461.china.huawei.com/10.135.66.169:0 <http://nkgy2z002066461.china.huawei.com/10.135.66.169:0>, /0:0:0:0:0:0:0:1:0, /10.135.66.169:0 <http://10.135.66.169:0/>, /127.0.0.1:0 <http://127.0.0.1:0/>], discPort=0, order=8, intOrder=0, lastExchangeTime=1462951688135, loc=true, ver=1.5.0#19700101-sha1:00000000, isClient=true], topVer=8, nodeId8=98392607, msg=null, type=NODE_JOINED, tstamp=1462951698228], rcvdIds=GridConcurrentHashSet [elements=[]], rmtIds=[6bb71edf-8c09-46f4-8138-d36310615582, c34f8715-7707-4052-96ba-2120c3a38158], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], nodeId=98392607, evt=NODE_JOINED], init=true, ready=false, replied=false, added=false, initFut=GridFutureAdapter [resFlag=2, res=true, startTime=1462951698725, endTime=1462951698781, ignoreInterrupts=false, lsnr=null, state=DONE], topSnapshot=null, lastVer=null, partReleaseFut=null, skipPreload=true, clientOnlyExchange=true, initTs=1462951698725, oldest=c34f8715-7707-4052-96ba-2120c3a38158, oldestOrder=1, evtLatch=0, remaining=[6bb71edf-8c09-46f4-8138-d36310615582, c34f8715-7707-4052-96ba-2120c3a38158], super=GridFutureAdapter [resFlag=0, res=null, startTime=1462951698725, endTime=0, ignoreInterrupts=false, lsnr=null, state=INIT]]]
>  
>  
> Any idea to fix it?
>  
> Best regards,
> Kevin


re: Ignite Still waiting for initial partition map exchange problem

Posted by Zhengqingzheng <zh...@huawei.com>.
Hi Vladimir,
The warning message disappeared. However, there are some other abnormal behavior occurred.
My first observation is that, the client side keeps running without doing any write through actions to the database;

Second, I cannot get into the cmd interface to check cache or node information (running ./ignitevisorcmd.sh),
After typing open command, and select the default ‘0’ configure file, It will stop at:
[20:50:55]
[20:50:55] Quiet mode.
[20:50:55]   ^-- Logging to file '/opt/apache-ignite-bin/work/log/ignite-dc2a77e5.0.log'
[20:50:55]   ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[20:50:55]
[20:50:55] OS: Linux 3.13.0-24-generic amd64
[20:50:55] VM information: OpenJDK Runtime Environment 1.7.0_95-b00 Oracle Corporation OpenJDK 64-Bit Server VM 24.95-b01
[20:50:55] Configured plugins:
[20:50:55]   ^-- None
[20:50:55]
[20:50:55] Security status [authentication=off, tls/ssl=off]

No further information occurred

The latest log file shows information like this:
[20:49:28,368][INFO][main][IgniteKernal]

>>>    __________  ________________
>>>   /  _/ ___/ |/ /  _/_  __/ __/
>>>  _/ // (7 7    // /  / / / _/
>>> /___/\___/_/|_/___/ /_/ /___/
>>>
>>> ver. 1.5.0-final#20151229-sha1:f1f8cda2
>>> 2015 Copyright(C) Apache Software Foundation
>>>
>>> Ignite documentation: http://ignite.apache.org

[20:49:28,370][INFO][main][IgniteKernal] Config URL: file:/opt/apache-ignite-bin/config/default-config.xml
[20:49:28,370][INFO][main][IgniteKernal] Daemon mode: off
[20:49:28,371][INFO][main][IgniteKernal] OS: Linux 3.13.0-24-generic amd64
[20:49:28,371][INFO][main][IgniteKernal] OS user: root
[20:49:28,371][INFO][main][IgniteKernal] Language runtime: Java Platform API Specification ver. 1.7
[20:49:28,371][INFO][main][IgniteKernal] VM information: OpenJDK Runtime Environment 1.7.0_95-b00 Oracle Corporation OpenJDK 64-Bit Server VM 24.95-b01
[20:49:28,372][INFO][main][IgniteKernal] VM total memory: 44.0GB
[20:49:28,372][INFO][main][IgniteKernal] Remote Management [restart: on, REST: on, JMX (remote: on, port: 49115, auth: off, ssl: off)]
[20:49:28,372][INFO][main][IgniteKernal] IGNITE_HOME=/opt/apache-ignite-bin
[20:49:28,372][INFO][main][IgniteKernal] VM arguments: [-Xms1g, -Xmx1g, -XX:+AggressiveOpts, -XX:MaxPermSize=256m, -DIGNITE_QUIET=true, -DIGNITE_SUCCESS_FILE=/opt/apache-ignite-bin/work/ignite_success_aecbd7da-db6b-4e53-ae30-cde30062da2e, -Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.port=49115, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.ssl=false, -DIGNITE_HOME=/opt/apache-ignite-bin, -DIGNITE_PROG_NAME=./ignite.sh, -Xmx48g, -Xms46g]
[20:49:28,372][INFO][main][IgniteKernal] Configured caches ['ignite-marshaller-sys-cache', 'ignite-sys-cache', 'ignite-atomics-sys-cache', 'default']
[20:49:28,375][INFO][main][IgniteKernal] 3-rd party licenses can be found at: /opt/apache-ignite-bin/libs/licenses
[20:49:28,436][INFO][main][IgniteKernal] Non-loopback local IPs: 10.12.23.34, fe80:0:0:0:2a6e:d4ff:fe89:c620%2
[20:49:28,436][INFO][main][IgniteKernal] Enabled local MACs: 286ED489C620
[20:49:28,446][INFO][main][IgnitePluginProcessor] Configured plugins:
[20:49:28,446][INFO][main][IgnitePluginProcessor]   ^-- None
[20:49:28,446][INFO][main][IgnitePluginProcessor]
[20:49:28,523][INFO][main][TcpCommunicationSpi] IPC shared memory server endpoint started [port=48100, tokDir=/opt/apache-ignite-bin/work/ipc/shmem/65b246cd-8da3-42f6-b434-5a0cdb20ea6c-22394]
[20:49:28,523][INFO][main][TcpCommunicationSpi] Successfully bound shared memory communication to TCP port [port=48100, locHost=0.0.0.0/0.0.0.0]
[20:49:28,539][INFO][main][TcpCommunicationSpi] Successfully bound to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0]
[20:49:28,566][WARNING][main][NoopCheckpointSpi] Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
[20:49:28,597][WARNING][main][GridCollisionManager] Collision resolution is disabled (all jobs will be activated upon arrival).
[20:49:28,601][WARNING][main][NoopSwapSpaceSpi] Swap space is disabled. To enable use FileSwapSpaceSpi.
[20:49:28,603][INFO][main][IgniteKernal] Security status [authentication=off, tls/ssl=off]
[20:49:28,849][INFO][main][GridTcpRestProtocol] Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211]
[20:49:28,931][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0]
[20:49:29,560][WARNING][main][GridDiscoveryManager] Local java version is different from remote [loc=7, rmt=8]
[20:49:29,797][INFO][main][GridCacheProcessor] Started cache [name=ignite-marshaller-sys-cache, mode=REPLICATED]
[20:49:29,830][INFO][main][GridCacheProcessor] Started cache [name=ignite-atomics-sys-cache, mode=PARTITIONED]
[20:49:31,044][INFO][main][GridCacheProcessor] Started cache [name=default, mode=PARTITIONED]
[20:49:31,051][INFO][main][GridCacheProcessor] Started cache [name=ignite-sys-cache, mode=REPLICATED]
[20:49:31,491][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Cache rebalancing scheduled: [cache=ignite-sys-cache, waitList=[]]
[20:49:31,492][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Cache rebalancing scheduled: [cache=ignite-marshaller-sys-cache, waitList=[]]
[20:49:31,492][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Cache rebalancing scheduled: [cache=ignite-atomics-sys-cache, waitList=[ignite-sys-cache, ignite-marshaller-sys-cache]]
[20:49:31,492][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Cache rebalancing scheduled: [cache=null, waitList=[ignite-sys-cache, ignite-marshaller-sys-cache, ignite-atomics-sys-cache]]
[20:49:31,493][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Rebalancing required[top=AffinityTopologyVersion [topVer=2, minorTopVer=0], evt=NODE_JOINED, node=65b246cd-8da3-42f6-b434-5a0cdb20ea6c]
[20:49:31,493][INFO][exchange-worker-#50%null%][GridDhtPartitionDemander] <ignite-marshaller-sys-cache> Starting rebalancing [cache=ignite-marshaller-sys-cache, mode=SYNC, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, partitionsCount=20, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], updateSeq=1]
[20:49:31,496][INFO][sys-#20%null%][GridDhtPartitionDemander] <ignite-sys-cache> Starting rebalancing [cache=ignite-sys-cache, mode=SYNC, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, partitionsCount=100, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], updateSeq=1]
[20:49:31,507][INFO][marshaller-cache-#52%null%][GridDhtPartitionDemander] <ignite-marshaller-sys-cache> Completed (final) rebalancing [cache=ignite-marshaller-sys-cache, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], time=10 ms]
[20:49:31,511][INFO][marshaller-cache-#52%null%][GridDhtPartitionDemander] <ignite-sys-cache> Completed (final) rebalancing [cache=ignite-sys-cache, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], time=20 ms]
[20:49:31,514][INFO][sys-#20%null%][GridDhtPartitionDemander] <ignite-atomics-sys-cache> Starting rebalancing [cache=ignite-atomics-sys-cache, mode=SYNC, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, partitionsCount=485, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], updateSeq=1]
[20:49:31,540][INFO][sys-#21%null%][GridDhtPartitionDemander] <ignite-atomics-sys-cache> Completed (final) rebalancing [cache=ignite-atomics-sys-cache, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], time=31 ms]
[20:49:31,542][INFO][sys-#20%null%][GridDhtPartitionDemander] <default> Starting rebalancing [cache=null, mode=ASYNC, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, partitionsCount=485, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], updateSeq=1]
[20:49:32,565][INFO][sys-#22%null%][GridDhtPartitionDemander] <default> Completed (final) rebalancing [cache=null, fromNode=d17a8010-18e8-47dd-a016-2893d9522df9, topology=AffinityTopologyVersion [topVer=2, minorTopVer=0], time=1025 ms]
[20:49:32,597][INFO][main][IgniteKernal] Performance suggestions for grid  (fix if possible)
[20:49:32,597][INFO][main][IgniteKernal] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[20:49:32,597][INFO][main][IgniteKernal]   ^-- Enable write-behind to persistent store (set 'writeBehindEnabled' to true)
[20:49:32,597][INFO][main][IgniteKernal]
[20:49:32,598][INFO][main][IgniteKernal] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
[20:49:32,598][INFO][main][IgniteKernal]
[20:49:32,599][INFO][main][IgniteKernal]

>>> +----------------------------------------------------------------------------+
>>> Ignite ver. 1.5.0-final#20151229-sha1:f1f8cda2f3f62231f42a59951bf34c39577c1bec
>>> +----------------------------------------------------------------------------+
>>> OS name: Linux 3.13.0-24-generic amd64
>>> CPU(s): 8
>>> Heap: 46.0GB
>>> VM name: 22394@SZV1000071420
>>> Local node [ID=65B246CD-8DA3-42F6-B434-5A0CDB20EA6C, order=2, clientMode=false]
>>> Local node addresses: [SZV1000071420/0:0:0:0:0:0:0:1%1, /10.12.23.34, /127.0.0.1]
>>> Local ports: TCP:11211 TCP:47100 TCP:47500 TCP:48100

[20:49:32,600][INFO][main][GridDiscoveryManager] Topology snapshot [ver=2, servers=2, clients=0, CPUs=16, heap=76.0GB]
[20:50:32,598][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.17%, avg=0.24%, GC=0%]
    ^-- Heap [used=10444MB, free=76.86%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:50:38,461][INFO][ignite-update-notifier-timer][GridUpdateNotifier] Update status is not available.
[20:50:56,320][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=3, minorTopVer=0], evt=NODE_JOINED, node=dc2a77e5-0aeb-47eb-a0df-651ec6a54802]
[20:51:32,605][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.13%, avg=0.21%, GC=0%]
    ^-- Heap [used=10939MB, free=75.77%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:51:37,307][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=4, minorTopVer=0], evt=NODE_LEFT, node=dc2a77e5-0aeb-47eb-a0df-651ec6a54802]
[20:52:32,608][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.1%, avg=0.2%, GC=0%]
    ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:53:32,612][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.1%, avg=0.18%, GC=0%]
    ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:54:32,615][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.17%, avg=0.17%, GC=0%]
    ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:55:32,620][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.17%, avg=0.16%, GC=0%]
    ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:56:32,627][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.03%, avg=0.15%, GC=0%]
    ^-- Heap [used=11174MB, free=75.25%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=0, idle=16, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:57:02,850][WARNING][disco-event-worker-#48%null%][GridDiscoveryManager] Local java version is different from remote [loc=7, rmt=8]
[20:57:02,851][INFO][disco-event-worker-#48%null%][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=aca2bc4c-1b3e-4488-9773-0a7efca3aebc, addrs=[0:0:0:0:0:0:0:1, 10.135.66.169, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1:0, /0:0:0:0:0:0:0:1:0, /10.135.66.169:0, /127.0.0.1:0], discPort=0, order=5, intOrder=4, lastExchangeTime=1462971422737, loc=false, ver=1.5.0#19700101-sha1:00000000, isClient=true]
[20:57:02,852][INFO][disco-event-worker-#48%null%][GridDiscoveryManager] Topology snapshot [ver=5, servers=2, clients=1, CPUs=20, heap=77.0GB]
[20:57:02,862][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=0], evt=NODE_JOINED, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
[20:57:08,829][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestDatabaseStat_Cache, mode=PARTITIONED]
[20:57:08,957][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
[20:57:09,032][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestDatabaseFieldStat_Cache, mode=PARTITIONED]
[20:57:09,147][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=2], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
[20:57:09,196][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestObject_Cache, mode=PARTITIONED]
[20:57:09,272][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=3], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
[20:57:09,322][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestColumn_Cache, mode=PARTITIONED]
[20:57:09,393][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=4], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
[20:57:09,470][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestIndex_Cache, mode=PARTITIONED]
[20:57:09,525][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=5], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
[20:57:09,582][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestUniqueField_Cache, mode=PARTITIONED]
[20:57:09,643][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=6], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
[20:57:09,737][INFO][exchange-worker-#50%null%][GridCacheProcessor] Started cache [name=org.apache.ignite.model.IgniteTestDataBase_Cache, mode=PARTITIONED]
[20:57:09,797][INFO][exchange-worker-#50%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=5, minorTopVer=7], evt=DISCOVERY_CUSTOM_EVT, node=aca2bc4c-1b3e-4488-9773-0a7efca3aebc]
[20:57:32,633][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
    ^-- CPU [cur=0.5%, avg=0.21%, GC=0%]
    ^-- Heap [used=11814MB, free=73.83%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=4, idle=12, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:57:55,689][WARNING][tcp-disco-msg-worker-#2%null%][TcpDiscoverySpi] Local node has detected failed nodes and started cluster-wide procedure. To speed up failure detection please see 'Failure Detection' section under javadoc for 'TcpDiscoverySpi'
[20:58:32,635][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
    ^-- CPU [cur=0.13%, avg=0.2%, GC=0%]
    ^-- Heap [used=12050MB, free=73.31%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=4, idle=12, qSize=0]
    ^-- Outbound messages queue [size=0]
[20:59:32,642][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
    ^-- CPU [cur=0.13%, avg=0.19%, GC=0%]
    ^-- Heap [used=12050MB, free=73.31%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=4, idle=12, qSize=0]
    ^-- Outbound messages queue [size=0]
[21:00:32,650][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
    ^-- CPU [cur=0.1%, avg=0.19%, GC=0%]
    ^-- Heap [used=12050MB, free=73.31%, comm=45142MB]
   ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=4, idle=12, qSize=0]
    ^-- Outbound messages queue [size=0]
[21:01:32,656][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=3, nodes=3, CPUs=20]
    ^-- CPU [cur=0.13%, avg=0.18%, GC=0%]
    ^-- Heap [used=12050MB, free=73.31%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=4, idle=12, qSize=0]
    ^-- Outbound messages queue [size=0]
[21:01:38,374][WARNING][disco-event-worker-#48%null%][GridDiscoveryManager] Node FAILED: TcpDiscoveryNode [id=aca2bc4c-1b3e-4488-9773-0a7efca3aebc, addrs=[0:0:0:0:0:0:0:1, 10.135.66.169, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1:0, /0:0:0:0:0:0:0:1:0, /10.135.66.169:0, /127.0.0.1:0], discPort=0, order=5, intOrder=4, lastExchangeTime=1462971422737, loc=false, ver=1.5.0#19700101-sha1:00000000, isClient=true]
[21:01:38,374][INFO][disco-event-worker-#48%null%][GridDiscoveryManager] Topology snapshot [ver=8, servers=2, clients=0, CPUs=16, heap=77.0GB]
[21:02:32,661][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.13%, avg=0.17%, GC=0%]
    ^-- Heap [used=12300MB, free=72.75%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=4, idle=12, qSize=0]
    ^-- Outbound messages queue [size=0]
[21:03:32,667][INFO][grid-timeout-worker-#33%null%][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=65b246cd, name=null]
    ^-- H/N/C [hosts=2, nodes=2, CPUs=16]
    ^-- CPU [cur=0.1%, avg=0.17%, GC=0%]
    ^-- Heap [used=12536MB, free=72.23%, comm=45142MB]
    ^-- Public thread pool [active=0, idle=16, qSize=0]
    ^-- System thread pool [active=4, idle=12, qSize=0]
    ^-- Outbound messages queue [size=0]



发件人: Vladimir Ozerov [mailto:vozerov@gridgain.com]
发送时间: 2016年5月11日 19:32
收件人: user@ignite.apache.org
主题: Re: Ignite Still waiting for initial partition map exchange problem

H Kevin,

Does it lead to any problems with application? If yes, please describe your observations and provide thread dump of the node.

Vladimir.

On Wed, May 11, 2016 at 11:16 AM, Zhengqingzheng <zh...@huawei.com>> wrote:
Hi there,
When I tried to run an writethrough example, I got the following error message from log file:

[15:29:48,902][WARNING][main][GridCachePartitionExchangeManager] Still waiting for initial partition map exchange [fut=GridDhtPartitionsExchangeFuture [dummy=false, forcePreload=false, reassign=false, discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=98392607-3eaa-417c-8782-5a180e655ec2, addrs=[0:0:0:0:0:0:0:1, 10.135.66.169, 127.0.0.1], sockAddrs=[NKGY2Z002066461.china.huawei.com/10.135.66.169:0<http://NKGY2Z002066461.china.huawei.com/10.135.66.169:0>, /0:0:0:0:0:0:0:1:0, /10.135.66.169:0<http://10.135.66.169:0>, /127.0.0.1:0<http://127.0.0.1:0>], discPort=0, order=8, intOrder=0, lastExchangeTime=1462951688135, loc=true, ver=1.5.0#19700101-sha1:00000000, isClient=true], topVer=8, nodeId8=98392607, msg=null, type=NODE_JOINED, tstamp=1462951698228], rcvdIds=GridConcurrentHashSet [elements=[]], rmtIds=[6bb71edf-8c09-46f4-8138-d36310615582, c34f8715-7707-4052-96ba-2120c3a38158], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], nodeId=98392607, evt=NODE_JOINED], init=true, ready=false, replied=false, added=false, initFut=GridFutureAdapter [resFlag=2, res=true, startTime=1462951698725, endTime=1462951698781, ignoreInterrupts=false, lsnr=null, state=DONE], topSnapshot=null, lastVer=null, partReleaseFut=null, skipPreload=true, clientOnlyExchange=true, initTs=1462951698725, oldest=c34f8715-7707-4052-96ba-2120c3a38158, oldestOrder=1, evtLatch=0, remaining=[6bb71edf-8c09-46f4-8138-d36310615582, c34f8715-7707-4052-96ba-2120c3a38158], super=GridFutureAdapter [resFlag=0, res=null, startTime=1462951698725, endTime=0, ignoreInterrupts=false, lsnr=null, state=INIT]]]
[15:30:18,933][WARNING][main][GridCachePartitionExchangeManager] Still waiting for initial partition map exchange [fut=GridDhtPartitionsExchangeFuture [dummy=false, forcePreload=false, reassign=false, discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=98392607-3eaa-417c-8782-5a180e655ec2, addrs=[0:0:0:0:0:0:0:1, 10.135.66.169, 127.0.0.1], sockAddrs=[NKGY2Z002066461.china.huawei.com/10.135.66.169:0<http://NKGY2Z002066461.china.huawei.com/10.135.66.169:0>, /0:0:0:0:0:0:0:1:0, /10.135.66.169:0<http://10.135.66.169:0>, /127.0.0.1:0<http://127.0.0.1:0>], discPort=0, order=8, intOrder=0, lastExchangeTime=1462951688135, loc=true, ver=1.5.0#19700101-sha1:00000000, isClient=true], topVer=8, nodeId8=98392607, msg=null, type=NODE_JOINED, tstamp=1462951698228], rcvdIds=GridConcurrentHashSet [elements=[]], rmtIds=[6bb71edf-8c09-46f4-8138-d36310615582, c34f8715-7707-4052-96ba-2120c3a38158], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], nodeId=98392607, evt=NODE_JOINED], init=true, ready=false, replied=false, added=false, initFut=GridFutureAdapter [resFlag=2, res=true, startTime=1462951698725, endTime=1462951698781, ignoreInterrupts=false, lsnr=null, state=DONE], topSnapshot=null, lastVer=null, partReleaseFut=null, skipPreload=true, clientOnlyExchange=true, initTs=1462951698725, oldest=c34f8715-7707-4052-96ba-2120c3a38158, oldestOrder=1, evtLatch=0, remaining=[6bb71edf-8c09-46f4-8138-d36310615582, c34f8715-7707-4052-96ba-2120c3a38158], super=GridFutureAdapter [resFlag=0, res=null, startTime=1462951698725, endTime=0, ignoreInterrupts=false, lsnr=null, state=INIT]]]


Any idea to fix it?

Best regards,
Kevin