Posted to user@ignite.apache.org by David Harvey <dh...@jobcase.com> on 2018/02/12 14:55:37 UTC

20 minute 12x throughput drop using data streamer and Ignite persistence

I have an 8-node cluster with 244 GB/node, and I'm seeing a behavior I don't
have any insight into and which doesn't make sense.

I'm using a custom StreamReceiver which starts a transaction and updates 4
partitioned caches, 2 of which should be local updates.  Ignite persistence
is on, and there is 1 sync backup per cache.
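
For reference, the receiver is shaped roughly like the sketch below; the
cache name, the String key/value types, and the SERIALIZABLE isolation are
placeholders for illustration, not the actual code:

import java.util.Collection;
import java.util.Map;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.stream.StreamReceiver;
import org.apache.ignite.transactions.Transaction;
import org.apache.ignite.transactions.TransactionConcurrency;
import org.apache.ignite.transactions.TransactionIsolation;

public class TxStreamReceiver implements StreamReceiver<String, String> {
    @IgniteInstanceResource
    private transient Ignite ignite;   // injected on the receiving node

    @Override
    public void receive(IgniteCache<String, String> cache,
                        Collection<Map.Entry<String, String>> entries) {
        // One of the other partitioned caches touched by the same transaction;
        // the name "relatedCache" is hypothetical.
        IgniteCache<String, String> related = ignite.cache("relatedCache");

        try (Transaction tx = ignite.transactions().txStart(
                TransactionConcurrency.OPTIMISTIC, TransactionIsolation.SERIALIZABLE)) {
            for (Map.Entry<String, String> e : entries) {
                cache.put(e.getKey(), e.getValue());
                related.put(e.getKey(), e.getValue());
            }
            tx.commit();
        }
    }
}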


I start out with no caches.   I'm normally getting about 16K
transactions/sec, and that drops to about 1K/s for about 20 minutes, and
then recovers.

One node starts transmitting/receiving with peaks up to 260 MB/s vs. the
normal peaks of about 60 MB/s.  The thread count on that node hits a peak
and stays there for the duration of the event.  The SSD write times are very
low.  This is prior to filling up the cache, so there are no reads.  The
transmit BW drops off.


The logs show nothing interesting, only checkpoints, and their frequency is
low.  The checkpoint times don't get worse, and the checkpoint frequency
actually drops off because of the throughput drop.

I have 6 threads feeding the DataStreamer from a client node.  When each
finishes a batch of 200,000 transactions, it waits for the Futures to
complete, and will issue a tryFlush() if it waits too long.  (The
DataStreamer API is not ideal for the case where there are multiple threads
using the same streamer: the choice is either to flush, which degrades the
throughput of the other threads' data, or to wait, in which case the data is
not sent if the buffers aren't filling.)  Normally each batch takes 2
minutes or so; in this case the flush did not complete for 20 minutes.  At
the low point, I was seeing 260 futures completing per second, vs. the
normal ~16K.
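
Each feeding thread does roughly the following with the shared streamer (the
30-second wait and the value construction are illustrative assumptions, not
the real code):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.lang.IgniteFuture;
import org.apache.ignite.lang.IgniteFutureTimeoutException;

public class FeederWorker {
    public void feedBatch(IgniteDataStreamer<String, String> streamer, List<String> keys) {
        List<IgniteFuture<?>> futures = new ArrayList<>();

        // Queue the whole batch; each addData() returns a future that completes
        // when that entry has been delivered and processed.
        for (String key : keys)
            futures.add(streamer.addData(key, "value-for-" + key));

        for (IgniteFuture<?> fut : futures) {
            try {
                fut.get(30, TimeUnit.SECONDS);
            }
            catch (IgniteFutureTimeoutException e) {
                // Buffers are not filling on their own; push out the partial ones.
                streamer.tryFlush();
                fut.get();
            }
        }
    }
}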

I've attached the current configuration file.  This originally occurred
when using 64 DataStreamer threads with no other thread counts changed.  It
also seemed to cause peer class loading to fail and I needed to increase
the timeout to avoid that.

Thanks,
Dave Harvey


Re: 20 minute 12x throughput drop using data streamer and Ignite persistence

Posted by Dave Harvey <dh...@jobcase.com>.
I've started reproducing this issue with more statistics.  I have not
reached the worst performance point yet, but some things are starting to
become clearer:

The DataStreamer hashes the affinity key to a partition, then maps the
partition to a node, and fills a single buffer at a time for that node.  A
DataStreamer thread on the node therefore gets a buffer's worth of requests
grouped by the time of the addData() call, with no per-thread grouping by
affinity key (as I had originally assumed).
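
The mapping it relies on can be seen through the public Affinity API; a small
sketch (the cache name is just a placeholder):

import org.apache.ignite.Ignite;
import org.apache.ignite.cache.affinity.Affinity;
import org.apache.ignite.cluster.ClusterNode;

public class AffinityProbe {
    public static void print(Ignite ignite, Object affinityKey) {
        Affinity<Object> aff = ignite.affinity("myCache");

        int part = aff.partition(affinityKey);               // affinity key -> partition
        ClusterNode primary = aff.mapPartitionToNode(part);  // partition -> primary node

        System.out.printf("key=%s partition=%d primary=%s%n", affinityKey, part, primary.id());
    }
}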

The test I was running was using a large amount of data where the average
number of keys for each unique affinity key is 3, with some outliers up to
50K.   One of the caches being updated in the optimistic transaction in the
StreamReceiver contains an object whose key is the affinity key, and whose
contents are the set of keys that have that affinity key.     We expect some
temporal locality for objects with the same affinity key.
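
That record is essentially an index keyed by the affinity key; something like
this hypothetical value class (the names are invented for illustration):

import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

public class AffinityKeyIndex implements Serializable {
    // All keys that share this record's affinity key; updated by every
    // transaction that streams in another key with the same affinity key.
    private final Set<String> memberKeys = new HashSet<>();

    public void add(String key) {
        memberKeys.add(key);
    }

    public Set<String> members() {
        return memberKeys;
    }
}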

We had a number of worker threads on a client node but only one data
streamer, for which we had increased the buffer count.  Once we understood
how the data streamer actually works, we made each worker have its own
DataStreamer.  This way, each worker can issue a flush without affecting the
other workers.  That, in turn, allowed us to use smaller batches per worker,
decreasing the odds of temporal locality.
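
Roughly, each worker now does something like the following (cache name and
buffer size are placeholders):

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;

public class PerWorkerStreamer implements Runnable {
    private final Ignite ignite;

    public PerWorkerStreamer(Ignite ignite) {
        this.ignite = ignite;
    }

    @Override
    public void run() {
        // Each worker owns its own streamer, so a flush()/tryFlush() here
        // cannot stall the other workers.
        try (IgniteDataStreamer<String, String> streamer = ignite.dataStreamer("myCache")) {
            streamer.perNodeBufferSize(512);
            // ... this worker's addData()/tryFlush() loop ...
        }
    }
}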

So it seems like we would get updates for the same affinity key on different
data streamer threads, and they could conflict updating the common record.  
The more keys per affinity key the more likely a conflict, and the more data
would need to be saved.   A flush operation could stall multiple workers,
and the flush operation might be dependent on requests that are conflicting.    

We chose to use OPTIMISTIC transactions because of their lack-of-deadlock
characteristics, rather than because we thought there would be high
contention.  I do think this behavior suggests something sub-optimal about
the OPTIMISTIC lock implementation, because I see a dramatic decrease in
throughput but not a dramatic increase in transaction restarts.

"In OPTIMISTIC transactions, entry locks are acquired on primary nodes
during the prepare step" does not say anything about the order in which
locks are acquired.  Sorting the locks into a consistent order would avoid
deadlocks.
If there are no deadlocks, then there could be n-1 restarts of the
transaction for each commit, where n is the number of data streamer threads.
This is the old "thundering herd" problem, which can easily be made order n
by only allowing one of the waiting threads to proceed at a time.
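
On the application side, the restarts can at least be staggered with a
randomized backoff around the transaction.  The sketch below illustrates the
idea; the retry/backoff policy is an illustration only, not anything Ignite
provides and not our production code:

import java.util.concurrent.ThreadLocalRandom;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.transactions.Transaction;
import org.apache.ignite.transactions.TransactionConcurrency;
import org.apache.ignite.transactions.TransactionIsolation;
import org.apache.ignite.transactions.TransactionOptimisticException;

public class RetryingUpdate {
    public static void update(Ignite ignite, IgniteCache<String, String> cache,
                              String affinityKey, String value) {
        for (int attempt = 0; ; attempt++) {
            try (Transaction tx = ignite.transactions().txStart(
                    TransactionConcurrency.OPTIMISTIC, TransactionIsolation.SERIALIZABLE)) {
                cache.put(affinityKey, value);
                tx.commit();
                return;
            }
            catch (TransactionOptimisticException e) {
                // Randomized, growing pause: colliding threads retry staggered
                // instead of all n-1 of them restarting in lockstep.
                long pause = ThreadLocalRandom.current().nextLong(1L << Math.min(attempt, 6)) + 1;
                try {
                    Thread.sleep(pause);
                }
                catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}

Sorting the keys touched by each transaction into a canonical order, as
mentioned above, would address the deadlock side independently of any
backoff.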
 




Re: 20 minute 12x throughput drop using data streamer and Ignite persistence

Posted by Dave Harvey <dh...@jobcase.com>.
I made improvements to the statistics collection in the stream receiver, and
I'm finding an excessive number of retries of the optimistic transactions we
are using.  I will try to understand that and retry the test.




Re: 20 minute 12x throughput drop using data streamer and Ignite persistence

Posted by David Harvey <dh...@jobcase.com>.
We are pulling in a large number of records from an RDB, and reorganizing
the data so that our analytics will be much faster.

I'm running Sumo and have looked at all of the log files from all the nodes,
and the only things there are checkpoints and GC logs.  The checkpoints are
fast, and they occur at a lower rate during the slowdowns.  GC is not a
problem at all.
(I see a project in the future where the number of messages/bytes per TOPIC
is counted.)

The average packet size goes to 6 KB from a normal ~400 bytes.  I'm going to
add a rebalance throttle.
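
Probably something along the lines of the sketch below, expressed here in
Java config rather than our Spring XML; the values are placeholders I still
need to tune:

import org.apache.ignite.configuration.CacheConfiguration;

public class RebalanceThrottling {
    public static CacheConfiguration<String, String> throttledCache(String name) {
        CacheConfiguration<String, String> ccfg = new CacheConfiguration<>(name);

        ccfg.setRebalanceThrottle(100);          // pause (ms) between rebalance batch sends
        ccfg.setRebalanceBatchSize(256 * 1024);  // smaller batches => smaller bursts on the wire

        return ccfg;
    }
}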


On Tue, Feb 13, 2018 at 5:41 AM, slava.koptilin <sl...@gmail.com>
wrote:

> Hi Dave,
>
> Could you please provide more details about that use-case.
> Is it possible to reproduce the issue and gather JFR and log files from all
> participated nodes?
> It would be very helpful in order to understand the cause of that behavior.
>
> Thanks!
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: 20 minute 12x throughput drop using data streamer and Ignite persistence

Posted by "slava.koptilin" <sl...@gmail.com>.
Hi Dave,

Is it possible to share a code snippet which illustrates the DataStreamer
settings and the stream receiver code?

Best regards,
Slava.




Re: 20 minute 12x throughput drop using data streamer and Ignite persistence

Posted by "slava.koptilin" <sl...@gmail.com>.
Hi Dave,

Could you please provide more details about that use case?
Is it possible to reproduce the issue and gather JFR and log files from all
participating nodes?
It would be very helpful in order to understand the cause of that behavior.

Thanks!


