Posted to user@ignite.apache.org by "Lewis, Cory (Genworth)" <co...@genworth.com> on 2018/07/23 15:45:00 UTC
Ignite threads in wait/park; cache stops responding
Hi,
We are trying to convert our app to use Ignite in place of our current caching implementation (Infinispan). It works fine for an hour or so, and then things begin to fall apart. Eventually all of our application's AJP threads are blocked waiting on Ignite to retrieve cached objects.
We are currently using Ignite 2.2.
Here are some of the log messages we see as things begin to deteriorate:
[CLIENT] WARNING [org.apache.ignite.internal.diagnostic] (grid-timeout-worker-#15%null%) Found long running cache future [startTime=10:09:00.032, curTime=10:10:45.486, fut=GridPartitionedSingleGetFuture [topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], key=UserKeyCacheObjectImpl [part=75, val=bisyspa, hasValBytes=true], readThrough=true, forcePrimary=false, futId=b382ea9b461-fd143a5a-1e15-4157-ab2c-e1b11e84af1b, trackable=true, subjId=9a152366-4292-459c-8959-403a4da42357, taskName=null, deserializeBinary=true, skipVals=false, expiryPlc=null, canRemap=true, needVer=false, keepCacheObjects=false, recovery=false, node=TcpDiscoveryNode [id=bc32133d-4188-455f-8dbf-d4a13f9b4be6, addrs=[10.60.9.4, 127.0.0.1], sockAddrs=[/127.0.0.1:47500, /10.60.9.4:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1532123004911, loc=false, ver=2.2.0#20170915-sha1:5747ce6b, isClient=false]]]
[CLIENT] WARNING [org.apache.ignite.internal.util.typedef.G] (grid-timeout-worker-#15%null%) >>> Possible starvation in striped pool.
Thread name: sys-stripe-1-#2%null%
Queue: []
Deadlock: false
Completed: 4
Thread [name="sys-stripe-1-#2%null%", id=130, state=RUNNABLE, blockCnt=0, waitCnt=4]
at o.a.i.i.binary.BinaryObjectExImpl.appendValue(BinaryObjectExImpl.java:247)
at o.a.i.i.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:229)
at o.a.i.i.binary.BinaryObjectExImpl.appendValue(BinaryObjectExImpl.java:280)
at o.a.i.i.binary.BinaryObjectExImpl.appendValue(BinaryObjectExImpl.java:306)
at o.a.i.i.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:229)
at o.a.i.i.binary.BinaryObjectExImpl.appendValue(BinaryObjectExImpl.java:280)
at o.a.i.i.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:229)
at o.a.i.i.binary.BinaryObjectExImpl.appendValue(BinaryObjectExImpl.java:280)
at o.a.i.i.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:229)
at o.a.i.i.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:186)
at o.a.i.i.binary.BinaryObjectImpl.toString(BinaryObjectImpl.java:853)
at java.lang.String.valueOf(String.java:2849)
at o.a.i.i.util.GridStringBuilder.a(GridStringBuilder.java:101)
at o.a.i.i.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:884)
at o.a.i.i.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:786)
at o.a.i.i.processors.cache.distributed.near.GridNearSingleGetResponse.toString(GridNearSingleGetResponse.java:317)
at java.lang.String.valueOf(String.java:2849)
at java.lang.StringBuilder.append(StringBuilder.java:128)
at o.a.i.i.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetResponse(GridDhtCacheAdapter.java:336)
at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$1400(GridDhtAtomicCache.java:129)
at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$15.apply(GridDhtAtomicCache.java:421)
at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$15.apply(GridDhtAtomicCache.java:416)
at o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1042)
at o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:561)
at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
at o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
at o.a.i.i.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
at o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
at o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
at o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
at o.a.i.i.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
at o.a.i.i.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:483)
at java.lang.Thread.run(Thread.java:748)
[SERVER] WARNING: Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=as0418alyn.dmz.genworth.net/172.16.32.171:47100, writeTimeout=30000]
I don't have a thread dump from the moment of that stripe error, but I have attached what I do have, which shows many of the threads waiting on a get. Also attached are our Ignite server and client configs. We are trying to run with 2 servers and 2 clients; I have attached one client thread dump (the other didn't show anything happening at all) and both server thread dumps.
Thanks in advance for any insight.
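For readers unsure where the 'socketWriteTimeout' property named in the [SERVER] warning lives: it is a property of TcpCommunicationSpi. The fragment below is only a sketch of where it would go in a Spring XML config. The value 60000 is a placeholder chosen to be above the 30000 ms shown in the log, not a recommendation, and a longer timeout may only delay the symptom if the remote node is genuinely unresponsive.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="communicationSpi">
        <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
            <!-- Milliseconds. 60000 is an illustrative placeholder, not a tuned value. -->
            <property name="socketWriteTimeout" value="60000"/>
        </bean>
    </property>
</bean>
```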
RE: Ignite threads in wait/park; cache stops responding
Posted by "Lewis, Cory (Genworth)" <co...@genworth.com>.
I will try the expiry policy thing.
We currently have some dependencies in our app that prevent us from going to JDK 1.8; we’ll have to do some larger refactoring/upgrades first before we can try 2.6, but it looks like that’s what we may have to do.
Thanks for your input. I’ll come back if anything interesting happens or we get to 2.6 and still have issues.
Thanks,
Cory
RE: Ignite threads in wait/park; cache stops responding
Posted by Stanislav Lukyanov <st...@gmail.com>.
Hi,
I’d suggest upgrading from 2.2 to 2.6.
One thing that stands out in your configs is the expiryPolicy setting.
The expiryPolicy property was deprecated; you need to use expiryPolicyFactory instead.
I recently shared a sample of setting it in an XML config on SO:
https://stackoverflow.com/questions/51459535/how-to-set-expiry-policy-for-ignite-cache/51469556
If neither upgrading nor replacing expiryPolicy with expiryPolicyFactory helps, please share your Ignite logs along with the thread dumps.
Thanks,
Stan
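For reference, a minimal sketch of what an expiryPolicyFactory setting might look like in a Spring XML cache config. This is not the poster's actual config (the attachments are not part of this archive): the cache name, the choice of CreatedExpiryPolicy, and the 5-minute duration are all placeholders for illustration. The javax.cache.expiry package also provides AccessedExpiryPolicy, ModifiedExpiryPolicy and TouchedExpiryPolicy, each with the same static factoryOf method.

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <!-- Hypothetical cache name -->
    <property name="name" value="exampleCache"/>
    <property name="expiryPolicyFactory">
        <!-- Calls the static CreatedExpiryPolicy.factoryOf(Duration) -->
        <bean class="javax.cache.expiry.CreatedExpiryPolicy" factory-method="factoryOf">
            <constructor-arg>
                <!-- Duration(TimeUnit, long): 5 minutes, a placeholder value -->
                <bean class="javax.cache.expiry.Duration">
                    <constructor-arg value="MINUTES"/>
                    <constructor-arg value="5"/>
                </bean>
            </constructor-arg>
        </bean>
    </property>
</bean>
```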
Re: Ignite threads in wait/park; cache stops responding
Posted by mcherkasov <mc...@gridgain.com>.
Hi Lewisc,
Please try upgrading Ignite to the latest version, 2.6; I think it makes sense to investigate this problem only on the latest version of Ignite. Please share the logs from all nodes, and take stack traces from all nodes too.
Thanks,
Mike.
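Stack traces are usually taken from outside the process with the JDK's jstack tool (jstack <pid>). Where that is inconvenient, a dump can also be captured from inside the JVM with the standard java.lang.management API. The sketch below assumes it runs inside the node's own JVM and sees only that JVM's threads; the class and method names are hypothetical, and it avoids Java 8 features so it also compiles on older JDKs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Ignite.
final class ThreadDumpSketch {
    /** Returns a text dump of every live thread in the current JVM. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Ask for lock details only where the JVM supports reporting them.
        ThreadInfo[] infos = mx.dumpAllThreads(
            mx.isObjectMonitorUsageSupported(),
            mx.isSynchronizerUsageSupported());

        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append('"')
              .append(" id=").append(info.getThreadId())
              .append(" state=").append(info.getThreadState())
              .append('\n');
            // Print frames manually: ThreadInfo.toString() truncates deep stacks.
            for (StackTraceElement frame : info.getStackTrace())
                sb.append("    at ").append(frame).append('\n');
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Calling dumpAllThreads() from a scheduled task or a diagnostic endpoint at the moment the starvation warning appears would give a dump that lines up with the log, which is what the original post was missing.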