Posted to user@ignite.apache.org by praveeng <pr...@gmail.com> on 2019/03/22 12:10:05 UTC
Ignite node is down due to full RAM usage
Hi,
Ignite version : 1.8
One of the Ignite nodes in a 3-node cluster went down due to full RAM usage.
At that time I observed the following logs on this node:
[00:32:02,119][INFO ][grid-timeout-worker-#7%CasinoApacheIgniteServices%][IgniteKernal%CasinoApacheIgniteServices]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=9f8df386, name=CasinoApacheIgniteServices, uptime=23:21:45:744]
^-- H/N/C [hosts=8, nodes=8, CPUs=44]
^-- CPU [cur=8.33%, avg=1.6%, GC=0%]
^-- Heap [used=3886MB, free=36.65%, comm=6134MB]
^-- Non heap [used=78MB, free=85.96%, comm=529MB]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=16, qSize=0]
^-- Outbound messages queue [size=0]
[00:33:24,674][WARN ][exchange-worker-#23%CasinoApacheIgniteServices%][GridCachePartitionExchangeManager] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=84, minorTopVer=0], node=9f8df386-2886-451f-b1ff-53713878d432]. Dumping pending objects that might be the cause:
[00:33:24,674][WARN ][exchange-worker-#23%CasinoApacheIgniteServices%][GridCachePartitionExchangeManager] Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=84, minorTopVer=0], node=9f8df386-2886-451f-b1ff-53713878d432]. Dumping pending objects that might be the cause:
SAR stats for memory usage on this date:
-- mar 6
12:00:01 AM  kbmemfree  kbmemused  %memused  kbbuffers  kbcached  kbcommit  %commit  kbactive  kbinact  kbdirty
12:10:01 PM     170120   16090232     98.95          0   3393384   8222696    45.02   9887268  2088504       60
01:50:01 PM     168176   16092176     98.97          0   2120848   8224724    45.03  10804712  1596792       48
03:10:01 PM     199128   16061224     98.78          0    991832   8224904    45.04  11384652  1241284      436
04:10:01 PM     153060   16107292     99.06          0    229984   8224880    45.04  11255628  1627600      208
04:20:01 PM     165580   16094772     98.98          0     78572   8224828    45.03  11338592  1560944       52
04:30:01 PM     153508   16106844     99.06          0     29740   8224872    45.03  11436544  1579468       44
04:40:01 PM     162184   16098168     99.00          0     33152   8224892    45.04  11606584  1580388       24
11:10:01 PM     370956   15889396     97.72          0     74816   8225312    45.04  11927676  1610828       36
11:20:01 PM     348576   15911776     97.86          0     69012   8225272    45.04  11929820  1602748       48
11:30:01 PM     359132   15901220     97.79          0     27060   8225308    45.04  11912656  1577848       36
11:40:01 PM     340252   15920100     97.91          0     24908   8225272    45.04  11910516  1577668       32
11:50:01 PM     308340   15952012     98.10          0     39208   8242284    45.13  11914564  1589208       48
Average:        253568   16006784     98.44          0   2317289   8226063    45.04  10368276  1955525      142
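To make the trend in the sar samples above easier to read, here is a minimal sketch that subtracts buffers and page cache from kbmemused for the first and last samples. It assumes this sysstat version reports kbmemused as total minus free (which matches %memused here: 16090232 / (16090232 + 170120) = 98.95%). The class and method names are made up for illustration, and the numbers are copied from the table.

```java
import java.util.Locale;

public class SarMemory {
    /** Memory in use excluding kernel buffers and page cache, in kB. */
    public static long usedExcludingCacheKb(long kbMemUsed, long kbBuffers, long kbCached) {
        return kbMemUsed - kbBuffers - kbCached;
    }

    public static void main(String[] args) {
        // {kbmemused, kbbuffers, kbcached} from the first and last sar samples.
        String[] labels = {"12:10:01 PM", "11:50:01 PM"};
        long[][] rows = {
            {16090232L, 0L, 3393384L},
            {15952012L, 0L, 39208L}
        };
        for (int i = 0; i < rows.length; i++) {
            long kb = usedExcludingCacheKb(rows[i][0], rows[i][1], rows[i][2]);
            // 1 GiB = 1024 * 1024 kB
            System.out.printf(Locale.ROOT, "%s  %d kB (%.1f GiB)%n",
                labels[i], kb, kb / (1024.0 * 1024.0));
        }
    }
}
```

By this measure, memory held outside the page cache grows from 12696848 kB to 15912804 kB over the day while kbcached shrinks from about 3.3 GB to tens of MB. That would be consistent with processes on the host steadily taking RAM, though sar alone does not show which process is responsible.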
Please find the attached file for the cache configuration.
ignite-clb-cache-config_dev.xml
<http://apache-ignite-users.70518.x6.nabble.com/file/t1753/ignite-clb-cache-config_dev.xml>
Please find attached the memory snapshot captured by the AppDynamics tool.
memorySnapshot.JPG
<http://apache-ignite-users.70518.x6.nabble.com/file/t1753/memorySnapshot.JPG>
Following is my analysis.
When data is evicted from on-heap to off-heap memory, there is not much space
left off-heap.
As a result, off-heap memory fills up and the application becomes slow
and unresponsive.
The data in off-heap memory also does not expire, so there is little
free RAM.
After I restarted the application on this node, RAM usage dropped to
25%, and it is now at 45%.
Could you please check and advise?
Thanks,
Praveen
--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Re: Ignite node is down due to full RAM usage
Posted by praveeng <pr...@gmail.com>.
Hi,
Since we can't upgrade our Java version to 1.8, we can't use the latest
Ignite version.
If this were a heap memory issue, I would have seen an OOM error in the
logs, and a heap dump might have been generated automatically.
Instead, this could be because the data in off-heap memory is not expiring, so
the RAM is used up completely.
Thanks,
Praveen
Re: Ignite node is down due to full RAM usage
Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!
Unfortunately, I would not expect anyone to debug your 1.8 cluster,
since most people have upgraded to 2.x.
Next time this happens, can you capture a heap dump from the problematic node?
A dominator graph and a per-class histogram may help tremendously.
Regards,
--
Ilya Kasnacheev
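
For reference, one way to capture the heap dump requested above is the HotSpot diagnostic MXBean. The sketch below is only an illustration: it assumes a HotSpot JVM on Java 7 or later, the class name HeapDumper and the default file name are made up, and it dumps the JVM it runs in, so it would have to be invoked from inside the Ignite node's process. From outside the process, `jmap -dump:live,format=b,file=<file> <pid>` does the same job.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    /**
     * Writes an HPROF heap dump of the current JVM to the given path.
     * HotSpot-specific; the call fails if the target file already exists.
     */
    public static File dumpHeap(String path, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // liveOnly = true dumps only objects that are still reachable.
        bean.dumpHeap(path, liveOnly);
        return new File(path);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default name; newer JDKs require the .hprof extension.
        String path = args.length > 0
            ? args[0]
            : "ignite-node-" + System.currentTimeMillis() + ".hprof";
        File dump = dumpHeap(path, true);
        System.out.println("Heap dump written to " + dump.getAbsolutePath());
    }
}
```

The resulting .hprof file can be opened in a heap analyzer to get the dominator tree and per-class histogram. Note that a heap dump only covers on-heap objects, so memory allocated off-heap will not show up in it directly.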