Posted to user@ignite.apache.org by Zhenya Stanilovsky via user <us...@ignite.apache.org> on 2022/09/07 10:54:50 UTC

Re[4]: Checkpointing threads

Ok, Raymond, I understand. But it seems no one has a good answer here; it depends on the particular fs and the near (probably cloud) storage layer implementation.
If you do not observe «throttling» messages (described in the previous link), everything is likely fine, but of course you can benchmark your I/O yourself with a third-party tool.
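For reference, the pool size in question is set on DataStorageConfiguration. A minimal sketch of raising it, assuming the standard Ignite Java configuration API (the value 8 and the class name are only illustrative):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class CheckpointThreadsExample {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();

        // Default is 4 checkpoint threads; raise it only if I/O benchmarking shows spare write bandwidth.
        storageCfg.setCheckpointThreads(8);

        // Checkpointing only applies when persistence is enabled on the data region.
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration().setDataStorageConfiguration(storageCfg);

        Ignite ignite = Ignition.start(cfg);
    }
}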
 
>Thanks Zhenya.
> 
>I have seen that the link you provided has a lot of good information on this system, but it does not discuss the checkpoint writers in any detail.
> 
>I appreciate this cannot be a bottleneck; my question is more: "If I have more checkpointing threads, will checkpoints take less time?" In our case we use AWS EFS, so if each checkpoint thread spends a relatively long time blocking on write I/O to the persistent store, then more checkpoint threads allow more concurrent writes to take place. Of course, if the checkpoint threads themselves use async I/O tasks and interleave I/O activity on that basis, then there may not be an opportunity for performance improvement, but I am not an expert in the Ignite code base :)
> 
>Raymond.
>   
>On Wed, Sep 7, 2022 at 7:51 PM Zhenya Stanilovsky via user < user@ignite.apache.org > wrote:
>>
>>No, there are no logging or metrics suggestions, and as I said earlier — this place can't become a bottleneck. If you have any performance problems, describe them in more detail; there is also some interesting reading here [1]
>> 
>>[1]  https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood
>>    
>>>Thanks Zhenya. 
>>> 
>>>Is there any logging or metrics that would indicate whether there is value in increasing the size of this pool?
>>> 
>>> 
>>>On Fri, 2 Sep 2022 at 8:20 PM, Zhenya Stanilovsky via user < user@ignite.apache.org > wrote:
>>>>Hi  Raymond
>>>> 
>>>>The checkpoint threads are responsible for dumping modified pages, so you can consider it a purely I/O-bound operation, and the pool size is the number of disk-writing workers.
>>>>I think the default is enough and there is no need to raise it, but that is up to you.
>>>>   
>>>>>Hi,
>>>>> 
>>>>>I am looking at our configuration of the Ignite checkpointing system to ensure we have it tuned correctly.
>>>>> 
>>>>>There is a checkpointing thread pool defined, which defaults to 4 threads in size. I have not been able to find much of a discussion on when/how this pool size should be changed to reflect the node size Ignite is running on.
>>>>> 
>>>>>In our case, we are running 16 core servers with 128 GB RAM with persistence on an NFS storage layer.
>>>>> 
>>>>>Given the number of cores, and the relative latency of NFS compared to local SSD, is 4 checkpointing threads appropriate, or are we likely to see better performance if we increased it to 8 (or more)?
>>>>> 
>>>>>If there is a discussion related to this, a pointer to it would be good (it's not really covered in the performance tuning section).
>>>>> 
>>>>>Thanks,
>>>>>Raymond.
>>>>>  --
>>>>>
>>>>>Raymond Wilson
>>>>>Trimble Distinguished Engineer, Civil Construction Software (CCS)
>>>>>11 Birmingham Drive   |   Christchurch, New Zealand
>>>>>raymond_wilson@trimble.com
>>>>>         
>>>>> 
>>>> 
>>>> 
>>>> 
>>>>  
>> 
>> 
>> 
>>  
> 
>  --
>
>Raymond Wilson
>Trimble Distinguished Engineer, Civil Construction Software (CCS)
>11 Birmingham Drive |  Christchurch, New Zealand
>raymond_wilson@trimble.com
>         
> 
 
 
 
 

Re[8]: Checkpointing threads

Posted by Zhenya Stanilovsky via user <us...@ignite.apache.org>.
Not throttling, but: «Thread dump is hidden due to throttling settings». There is extensive documentation about persistence tuning in Apache Ignite.
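For reference, the suppressed thread dumps mentioned in that message can be re-enabled by setting the property it names, e.g. as a JVM argument on the node:

-DIGNITE_DUMP_THREADS_ON_FAILURE_THROTTLING_TIMEOUT=0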



 
>Hi,
>Throttling is disabled in the Ignite config, as mentioned in the previous reply. What do you suggest so that Ignite can keep up with the SSD limits during checkpointing?
 
 
 
 

Re: Re[6]: Checkpointing threads

Posted by Surinder Mehra <re...@gmail.com>.
Hi,
Throttling is disabled in the Ignite config, as mentioned in the previous reply. What do you
suggest so that Ignite can keep up with the SSD limits during checkpointing?


Re[6]: Checkpointing threads

Posted by Zhenya Stanilovsky via user <us...@ignite.apache.org>.



 
>We have observed one interesting issue with checkpointing. We are using 64G RAM 12 CPU with 3K iops/128mbps SSDs. Our application fills up the WAL directory really fast and hence the RAM. We made the following observations
>
>0. Not so bad news first, it resumes processing after getting stuck for several minutes.
>
>1. WAL and WAL archive writes are a lot faster than writes to the work directory through checkpointing. Very curious to know why this is the case: checkpointing writes never exceed 15 mbps, while WAL and WAL archive writes go up to the max limits of the SSD.
 
A very simple example: sequentially changing a single key. In the WAL you get every change, while the checkpoint (in your terms) writes only one key change.
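A tiny sketch of that scenario (a fragment; the cache name and count are only illustrative, and 'ignite' is assumed to be a started node with persistence enabled):

IgniteCache<Integer, Long> cache = ignite.getOrCreateCache("demo");

// Every put appends a record to the WAL (100 000 of them here),
// but the next checkpoint only writes the dirty page(s) that hold key 1.
for (long i = 0; i < 100_000; i++)
    cache.put(1, i);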
 
>
>2. We observed that when off-heap memory usage tends to zero, checkpointing takes minutes to complete, sometimes 30+ minutes, which stalls application writes completely on all nodes. It means the whole cluster freezes.
 
It seems Ignite enables throttling in such a case; you need some system and cluster tuning.
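If that is the case, the relevant knobs are on DataStorageConfiguration / DataRegionConfiguration. A minimal sketch, assuming the standard Java API (the 2 GB buffer size is only an illustrative value):

DataStorageConfiguration storageCfg = new DataStorageConfiguration();

// Pages-write-throttle slows user threads down gradually instead of freezing them
// once dirty pages accumulate faster than the checkpointer can flush them.
storageCfg.setWriteThrottlingEnabled(true);

// A larger checkpoint page buffer gives the checkpointer more headroom
// before updates have to wait.
storageCfg.getDefaultDataRegionConfiguration().setCheckpointPageBufferSize(2L * 1024 * 1024 * 1024);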
 
 
 
 
 

Re: Re[4]: Checkpointing threads

Posted by Surinder Mehra <re...@gmail.com>.
We have observed one interesting issue with checkpointing. We are using 64G
RAM, 12 CPU machines with 3K IOPS / 128 mbps SSDs. Our application fills up
the WAL directory, and hence the RAM, really fast. We made the following
observations:

0. Not so bad news first, it resumes processing after getting stuck for
several minutes.

1. WAL and WAL archive writes are a lot faster than writes to the work
directory through checkpointing. Very curious to know why this is the case:
checkpointing writes never exceed 15 mbps, while WAL and WAL archive writes
go up to the max limits of the SSD.

2. We observed that when off-heap memory usage tends to zero, checkpointing
takes minutes to complete, sometimes 30+ minutes, which stalls application
writes completely on all nodes. It means the whole cluster freezes.

3. The checkpointing thread gets stuck on the checkpoint pages future's
get(), and after several minutes it logs this error and the grid resumes
processing:

"sys-stripe-0-#1" #19 prio=5 os_prio=0 cpu=86537.69ms elapsed=2166.63s
tid=0x00007fa52a6f1000 nid=0x3b waiting on condition  [0x00007fa4c58be000]
   java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@11.0.14.1/Native Method)
at java.util.concurrent.locks.LockSupport.park(java.base@11.0.14.1/Unknown
Source)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.getUninterruptibly(GridFutureAdapter.java:146)
at
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:144)
at
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1613)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processDhtAtomicUpdateRequest(GridDhtAtomicCache.java:3313)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:143)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:322)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:317)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1151)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:592)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:110)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:309)
at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1908)
at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1529)
at
org.apache.ignite.internal.managers.communication.GridIoManager.access$5300(GridIoManager.java:242)
at
org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1422)
at
org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55)
at
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:569)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(java.base@11.0.14.1/Unknown Source)
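The wait above corresponds to the fragment below (which appears to be from CheckpointTimeoutLock.checkpointReadLock, per the stack trace): a checkpoint is scheduled with reason "too many dirty pages" and the thread blocks until its LOCK_RELEASED future completes.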
CheckpointProgress pages = checkpointer.scheduleCheckpoint(0, "too many dirty pages");

checkpointReadWriteLock.readUnlock();

if (timeout > 0 && U.currentTimeMillis() - start >= timeout)
    failCheckpointReadLock();

try {
    pages
        .futureFor(LOCK_RELEASED)
        .getUninterruptibly();
}


[2022-09-09 18:58:35,148][ERROR][sys-stripe-9-#10][CheckpointTimeoutLock] Checkpoint read lock acquisition has been timed out.
class org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock$CheckpointReadLockTimeoutException: Checkpoint read lock acquisition has been timed out.
at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.failCheckpointReadLock(CheckpointTimeoutLock.java:210)
at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:108)
at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1613)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processDhtAtomicUpdateRequest(GridDhtAtomicCache.java:3313)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:143)
at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:322)
[2022-09-09 18:58:35,148][INFO ][sys-stripe-7-#8][FailureProcessor] Thread dump is hidden due to throttling settings. Set IGNITE_DUMP_THREADS_ON_FAILURE_THROTTLING_TIMEOUT property to 0 to see all thread dumps.


4. The other nodes print the logs below during the window in which the
problematic node is stuck at checkpointing:

[2022-09-09 18:58:35,153][WARN ][push-metrics-exporter-#80][G] >>> Possible
starvation in striped pool.
    Thread name: sys-stripe-5-#6
    Queue:
[o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@eb9f832,
Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8,
ordered=false, timeout=0, skipOnTimeout=false,
msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1,
arr=[351148]]]]], Message closure [msg=GridIoMessage [plc=2,
topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0,
skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse
[futIds=GridLongList [idx=2, arr=[273841,273843]]]]], Message closure
[msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false,
timeout=0, skipOnTimeout=false, msg=GridNearSingleGetRequest
[futId=1662749921887, key=BinaryObjectImpl [arr= true, ctx=false, start=0],
flags=1, topVer=AffinityTopologyVersion [topVer=14, minorTopVer=0],
subjId=12746da1-ac0d-4ba1-933e-5aa3f92d2f68, taskNameHash=0, createTtl=-1,
accessTtl=-1, txLbl=null, mvccSnapshot=null]]], Message closure
[msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false,
timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse
[futIds=GridLongList [idx=1, arr=[351149]]]]],
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@110ec0fa,
Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8,
ordered=false, timeout=0, skipOnTimeout=false,
msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=10,
arr=[414638,414655,414658,414661,414662,414663,414666,414668,414673,414678]]]]],
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@63ae8204,
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@2d3cc0b,
Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8,
ordered=false, timeout=0, skipOnTimeout=false,
msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1,
arr=[414667]]]]], Message closure [msg=GridIoMessage [plc=2,
topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0,
skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse
[futIds=GridLongList [idx=4, arr=[351159,351162,351163,351164]]]]], Message
closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8,
ordered=false, timeout=0, skipOnTimeout=false,
msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1,
arr=[290762]]]]], Message closure [msg=GridIoMessage [plc=2,
topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0,
skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse
[futIds=GridLongList [idx=1, arr=[400357]]]]],
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@71887193,
Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8,
ordered=false, timeout=0, skipOnTimeout=false,
msg=GridDhtAtomicSingleUpdateRequest [key=BinaryObjectImpl [arr= true,
ctx=false, start=0], val=BinaryObjectImpl [arr= true, ctx=false, start=0],
prevVal=null, super=GridDhtAtomicAbstractUpdateRequest [onRes=false,
nearNodeId=null, nearFutId=0, flags=]]]], Message closure
[msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false,
timeout=0, skipOnTimeout=false, msg=GridNearAtomicSingleUpdateRequest
[key=BinaryObjectImpl [arr= true, ctx=false, start=0],
parent=GridNearAtomicAbstractSingleUpdateRequest [nodeId=null,
futId=1324019, topVer=AffinityTopologyVersion [topVer=14, minorTopVer=0],
parent=GridNearAtomicAbstractUpdateRequest [res=null, flags=]]]]]]
    Deadlock: false
    Completed: 205703


On Wed, Sep 7, 2022 at 4:25 PM Zhenya Stanilovsky via user <
user@ignite.apache.org> wrote:

> Ok, Raymond, I understand. But it seems no one has a good answer here; it
> depends on the particular fs and the near (probably cloud) storage layer
> implementation. If you do not observe «throttling» messages (described in
> the previous link), everything is likely fine, but of course you can
> benchmark your I/O yourself with a third-party tool.