Posted to user@ignite.apache.org by breischl <br...@gmail.com> on 2018/07/01 15:44:26 UTC

Re: Deadlock during cache loading

@DaveHarvey, I'll look at that tomorrow. Seems potentially complicated, but
if that's what has to happen we'll figure it out. 

Interestingly, cutting the cluster to half as many nodes (by reducing the
number of backups) seems to have resolved the issue. Is there a guideline
for how large a cluster should be? 

We were running a single 44-node cluster, with 3 data backups (4 total
copies) and hitting the issue consistently. I switched to running two
separate clusters, each with 22 nodes using 1 data backup (2 total copies).
The smaller clusters seem to work perfectly every time, though I haven't
tried them as much.
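For context, the backup counts above are just the cache's setBackups() setting. Here is a minimal sketch of the two layouts; the cache name, key/value types, and the put at the end are hypothetical placeholders, not our actual config:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class BackupConfigExample {
        public static void main(String[] args) {
            CacheConfiguration<Long, String> cacheCfg = new CacheConfiguration<>("exampleCache");

            // 1 backup = 2 total copies per partition (the two-cluster setup above);
            // setBackups(3) would reproduce the original 4-copy layout.
            cacheCfg.setBackups(1);

            IgniteConfiguration cfg = new IgniteConfiguration().setCacheConfiguration(cacheCfg);

            try (Ignite ignite = Ignition.start(cfg)) {
                ignite.getOrCreateCache("exampleCache").put(1L, "value");
            }
        }
    }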


@smovva - We're still actively experimenting with instance and cluster
sizing. We were running on c4.4xl instances, but we were barely using the
CPUs while consistently hitting memory issues (using a 20GB heap, plus a bit
of off-heap). We just switched to r4.2xl instances, which is working better
for us so far and is a bit cheaper. However, I would imagine that the
optimal size depends on your use case - it's basically a tradeoff between
your memory, CPU, networking and operational cost requirements. 
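For reference, the heap portion is a plain JVM option while the off-heap portion is sized through the data region configuration. A minimal sketch of how that split is typically expressed - the 4 GB off-heap figure is an assumed placeholder, not our actual setting:

    // Heap is set on the node's JVM command line, e.g.:
    //   java -Xms20g -Xmx20g ... org.apache.ignite.startup.cmdline.CommandLineStartup
    // Off-heap is sized through the default data region.
    import org.apache.ignite.configuration.DataRegionConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class MemoryConfigExample {
        public static IgniteConfiguration configure() {
            DataRegionConfiguration region = new DataRegionConfiguration()
                .setName("Default_Region")
                .setMaxSize(4L * 1024 * 1024 * 1024); // 4 GB off-heap (placeholder)

            DataStorageConfiguration storage = new DataStorageConfiguration()
                .setDefaultDataRegionConfiguration(region);

            return new IgniteConfiguration().setDataStorageConfiguration(storage);
        }
    }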



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Deadlock during cache loading

Posted by David Harvey <dh...@jobcase.com>.
Transactions are easy to use: see the examples in
org.apache.ignite.examples.datagrid.store.auto.
We use them in the stream receiver. You simply bracket the get/put in
a transaction with a timeout, then wrap that in an "until done"
while loop, perhaps adding a sleep to back off.
We ended up with better performance with PESSIMISTIC transactions, though
we expected OPTIMISTIC to win.
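A minimal sketch of that pattern (not our actual receiver; the key/value types, timeout, backoff, and merge logic are placeholder assumptions):

    import java.util.Collection;
    import java.util.Map;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.resources.IgniteInstanceResource;
    import org.apache.ignite.stream.StreamReceiver;
    import org.apache.ignite.transactions.Transaction;
    import org.apache.ignite.transactions.TransactionConcurrency;
    import org.apache.ignite.transactions.TransactionIsolation;

    public class RetryingTxReceiver implements StreamReceiver<Long, String> {
        /** Injected on the node where the receiver runs. */
        @IgniteInstanceResource
        private transient Ignite ignite;

        @Override
        public void receive(IgniteCache<Long, String> cache,
                            Collection<Map.Entry<Long, String>> entries) {
            for (Map.Entry<Long, String> entry : entries) {
                boolean done = false;

                // "Until done" loop: retry the transactional get/put until it commits.
                while (!done) {
                    // Pessimistic transaction with a timeout (2000 ms, 1 entry) so a
                    // lock conflict fails and is retried instead of deadlocking.
                    try (Transaction tx = ignite.transactions().txStart(
                            TransactionConcurrency.PESSIMISTIC,
                            TransactionIsolation.REPEATABLE_READ,
                            2_000, 1)) {
                        String cur = cache.get(entry.getKey());
                        cache.put(entry.getKey(), merge(cur, entry.getValue()));
                        tx.commit();
                        done = true;
                    }
                    catch (Exception e) {
                        // Timed out or rolled back: back off briefly, then retry.
                        try {
                            Thread.sleep(100);
                        }
                        catch (InterruptedException ie) {
                            Thread.currentThread().interrupt();
                            return;
                        }
                    }
                }
            }
        }

        /** Hypothetical merge standing in for whatever the receiver actually does. */
        private String merge(String current, String incoming) {
            return incoming;
        }
    }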

My guess would be that the DataStreamer is not a fundamental contributor to the
deadlock you are seeing, and you may have discovered an Ignite bug.



