Posted to user@ignite.apache.org by Ilya Kasnacheev <il...@apache.org> on 2019/09/02 15:59:21 UTC

Re: How to stop a node from dying when memory is full?

Hello!

You can try enabling Page Eviction. In that case, pages are dropped together with the K-V pairs they contain.
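
As a rough illustration of the suggestion above, here is a minimal configuration sketch using the Ignite 2.x Java API. The region name, the 512 MB size, and the choice of RANDOM_2_LRU are assumptions made for the example and do not come from this thread; the 0.9 threshold is simply set explicitly for clarity.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataPageEvictionMode;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PageEvictionExample {
    public static void main(String[] args) {
        // Hypothetical region: name and size are placeholders, not values from the thread.
        DataRegionConfiguration region = new DataRegionConfiguration()
            .setName("evicting_region")
            .setMaxSize(512L * 1024 * 1024)
            // Enable page eviction for this in-memory region.
            .setPageEvictionMode(DataPageEvictionMode.RANDOM_2_LRU)
            // Start evicting once the region is about 90% full.
            .setEvictionThreshold(0.9);

        DataStorageConfiguration storage = new DataStorageConfiguration()
            .setDefaultDataRegionConfiguration(region);

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storage);

        // Caches that use the default region may now lose entries when it fills up.
        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.getOrCreateCache("example").put(1, "value");
        }
    }
}
```

Entries dropped this way are gone for good in a cluster without native persistence, so this only suits data that can be reloaded or recomputed.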

Regards,

On 2019/08/29 11:54:11, colinc <co...@yahoo.co.uk> wrote: 
> In a system that is not using native persistence, what is the recommended way
> of stopping a cluster from running out of memory - or stopping it from
> crashing when it does? 
> 
> As per the below jira, memory monitoring appears to be unreliable in the
> latest version of Ignite:
> https://issues.apache.org/jira/browse/IGNITE-12096
> 
> Even when working, this is an estimate that is updated periodically, which
> makes it hard to reliably avoid a critical OOM in a system that is rapidly
> filling caches.
> 
> It is technically possible to create a custom failure handler - but I
> understand that trapping the failure in this way is considered to be bad
> practice, since it can leave Ignite in an inconsistent state.
> 
> How are people addressing this challenge?
> 
> Regards,
> Colin.
> 
> 
> 
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
> 

Re: How to stop a node from dying when memory is full?

Posted by Mikhail Cherkasov <mc...@gridgain.com>.
Hi Dana,

Do you get java.lang.OutOfMemoryError or IgniteOutOfMemoryException?

If you use TRANSACTIONAL_SNAPSHOT, then the data is transactional. Page
eviction means that you just remove some random pages, so it doesn't make
sense to use TRANSACTIONAL_SNAPSHOT mode and remove random data at the
same time; it destroys the whole idea of transactions and data
consistency.
I might be missing something about your case, but I would say that if you
use TRANSACTIONAL_SNAPSHOT you simply cannot use page eviction, and if you
can use page eviction, then don't use TRANSACTIONAL_SNAPSHOT.

Regarding the custom failure handler, it should work; I don't see any
reason why it shouldn't. I would really appreciate it if you could send us
an update on this approach.

Thanks,
Mike.


On Thu, Aug 27, 2020 at 2:13 AM danami <da...@gmail.com> wrote:

> I'd like to extend Colin's question.
>
> What if I'm using TRANSACTIONAL_SNAPSHOT mode, therefore I can't use Page
> Eviction?
> How then, other than persistence, can I avoid OOM errors?
> Can I write a custom failure handler, to clear the cache/data region for
> example? Is it technically possible? (If so, how?) Will it work or is it bad
> and inconsistent as Colin suggested?
> Is using persistence my only option to avoid OOM errors or do I have other
> choices?
>
> Thank you for your help,
> Dana
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


-- 
Thanks,
Mikhail.

Re: How to stop a node from dying when memory is full?

Posted by Denis Magda <dm...@apache.org>.
Folks, this article should be relevant to you. It covers all techniques to
avoid the OOM except for swapping:
https://www.gridgain.com/resources/blog/out-of-memory-apache-ignite-cluster-handling-techniques

You can use swapping as part of your toolbox to ride out periods when a
node is running out of memory:
https://apacheignite.readme.io/docs/swap-space
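
For reference, a minimal sketch of what enabling swap space for a data region might look like with the Ignite 2.x Java API. The region name, cache name, size, and swap directory are placeholders assumed for the example; none of them come from this thread.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class SwapSpaceExample {
    public static void main(String[] args) {
        // Hypothetical region backed by a swap directory (placeholder path).
        DataRegionConfiguration region = new DataRegionConfiguration()
            .setName("swap_region")
            // Swapping only matters once the region can outgrow available RAM.
            .setMaxSize(8L * 1024 * 1024 * 1024)
            .setSwapPath("/path/to/swap/dir");

        DataStorageConfiguration storage = new DataStorageConfiguration()
            .setDataRegionConfigurations(region);

        // Bind a (hypothetical) cache to the swap-enabled region.
        CacheConfiguration<Integer, String> cacheCfg =
            new CacheConfiguration<Integer, String>("swapped_cache")
                .setDataRegionName("swap_region");

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storage)
            .setCacheConfiguration(cacheCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.cache("swapped_cache").put(1, "value");
        }
    }
}
```

Swap space is not a substitute for persistence: the swapped data does not survive a node restart, and access slows down once the OS starts paging.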

-
Denis



Re: How to stop a node from dying when memory is full?

Posted by danami <da...@gmail.com>.
I'd like to extend Colin's question.

What if I'm using TRANSACTIONAL_SNAPSHOT mode, therefore I can't use Page
Eviction?
How then, other than persistence, can I avoid OOM errors?
Can I write a custom failure handler, to clear the cache/data region for
example? Is it technically possible? (If so, how?) Will it work or is it bad
and inconsistent as Colin suggested?
Is using persistence my only option to avoid OOM errors or do I have other
choices? 

Thank you for your help,
Dana





Re: How to stop a node from dying when memory is full?

Posted by colinc <co...@yahoo.co.uk>.
Thanks - that does seem to be effective at stopping the OOM condition at
least.

Is there any way to determine which cache entries were affected by the page
eviction, do you know? The EVT_CACHE_ENTRY_EVICTED event doesn't seem to be
fired in this case as far as I can tell. Is that your expectation?

This is important for performing clean-up of related cache entries to ensure
referential integrity.
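
One possible fallback, if no per-entry notification turns out to be available, is a periodic reconciliation pass that removes child entries whose parent key has disappeared. The sketch below is only a model of that idea: plain java.util maps stand in for the caches, and the class, method, and "orders/line items" names are hypothetical, not Ignite API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

class OrphanCleanup {
    /**
     * Removes every child whose parent key is no longer present.
     * childToParent maps a child key to the parent key it references.
     * Returns the removed child keys in sorted order.
     */
    static <K extends Comparable<K>, P> List<K> removeOrphans(
            Map<K, P> childToParent, Set<P> liveParentKeys) {
        List<K> removed = new ArrayList<>();
        Iterator<Map.Entry<K, P>> it = childToParent.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, P> entry = it.next();
            if (!liveParentKeys.contains(entry.getValue())) {
                removed.add(entry.getKey());
                it.remove(); // safe removal while iterating
            }
        }
        Collections.sort(removed);
        return removed;
    }

    public static void main(String[] args) {
        // Stand-ins for two related caches: orders and their line items.
        Map<Integer, String> orders = new HashMap<>();
        orders.put(1, "order-1"); // order 2 is assumed to have been evicted

        Map<Integer, Integer> lineItems = new HashMap<>();
        lineItems.put(10, 1); // references the surviving order
        lineItems.put(11, 2); // references the missing order

        System.out.println(removeOrphans(lineItems, orders.keySet()));
    }
}
```

Against a real cluster the same check would have to run over the caches themselves, and it can only detect loss after the fact, so readers still need to tolerate briefly dangling references.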

Regards,
Colin.



