Posted to commits@cassandra.apache.org by "T Jake Luciani (JIRA)" <ji...@apache.org> on 2016/05/02 22:48:13 UTC

[jira] [Commented] (CASSANDRA-9766) Bootstrap outgoing streaming speeds are much slower than during repair

    [ https://issues.apache.org/jira/browse/CASSANDRA-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267478#comment-15267478 ] 

T Jake Luciani commented on CASSANDRA-9766:
-------------------------------------------

bq. Running LongStreamingTest on my laptop went from 24/25 seconds on trunk HEAD to 22/23 seconds with the patch applied.

Hmm, looks like the BTreeSearchIterator recycling is causing too high a CPU hit to be worth the GC savings.  I've pushed a quick commit which brings the test back down to 19 seconds for me; could you try it out and let me know what you see? Without recycling, BTreeSearchIterator accounts for >25% of the heap pressure :(

I think since the object is so hotly used, it just causes too much contention on the recycler. It's important to avoid too much allocation, but it seems like in this case it's gone too far.  Perhaps we can avoid the recycler here and just keep a reusable BTreeSearchIterator in the SSTableWriter.
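
A rough sketch of that alternative, in plain Java rather than Cassandra's real classes: {{ReusableSearchIterator}} and {{WriterSketch}} are hypothetical names made up for illustration, and the sorted-array search only stands in for the BTree. The point is that the writer owns one iterator and rewinds it per row, so nothing goes through the recycler at all. This assumes the writer is only ever used from a single thread.

```java
import java.util.Arrays;

// Hypothetical stand-in for BTreeSearchIterator: a forward-only search over
// a sorted array that can be re-pointed at new data instead of reallocated.
final class ReusableSearchIterator {
    private long[] sorted = new long[0];
    private int position;

    // Re-point the iterator at a new sorted array and rewind it.
    void reset(long[] newSorted) {
        this.sorted = newSorted;
        this.position = 0;
    }

    // Returns true if key exists at or after the current position.
    // Keys must be requested in ascending order.
    boolean seek(long key) {
        int idx = Arrays.binarySearch(sorted, position, sorted.length, key);
        if (idx >= 0) {
            position = idx + 1;   // continue after the match
            return true;
        }
        position = -idx - 1;      // insertion point: nothing smaller remains
        return false;
    }
}

// Hypothetical stand-in for SSTableWriter: it holds exactly one iterator
// for its whole lifetime, so there is no per-row allocation and no pool.
final class WriterSketch {
    private final ReusableSearchIterator iterator = new ReusableSearchIterator();

    // Counts how many of sortedKeys are present in sortedRow.
    int countMatches(long[] sortedRow, long[] sortedKeys) {
        iterator.reset(sortedRow);
        int matches = 0;
        for (long key : sortedKeys) {
            if (iterator.seek(key)) {
                matches++;
            }
        }
        return matches;
    }
}
```

The trade-off is that the iterator's lifetime is tied to the writer, which is fine as long as no reference to it escapes the method that uses it.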

bq. I would like to make sure this is justifiable and I would probably want the opinion of one more committer with more experience than me
The FastThreadLocal changes were an optimization by [~benedict] from a [while back|https://github.com/netty/netty/pull/2504], plus some recycler changes.
Since we already use netty and it's built to be used as a general library, it seemed like a good place to start.

bq. do we have a micro benchmark comparing Netty FastThreadLocal and the JDK ThreadLocal? 
The netty FastThreadLocal microbenchmarks show a significant throughput increase over the JDK ThreadLocal:

{code}
Benchmark                                    Mode  Cnt      Score      Error  Units
FastThreadLocalBenchmark.fastThreadLocal    thrpt   20  55452.027 ±  725.713  ops/s
FastThreadLocalBenchmark.jdkThreadLocalGet  thrpt   20  35481.888 ± 1471.647  ops/s
{code}

bq. Should we perhaps make recyclable objects ref counted, at least for debugging purposes when Ref.DEBUG_ENABLED is true?

The reason I didn't do this, and one reason I like the Recycler, is that it's not strictly required to recycle every object. If we added ref counting, it would force every code path to be properly cleaned up even when we don't care about recycling.
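
To illustrate the "not strictly required" part, here is a minimal per-thread pool in plain Java. It is not Netty's Recycler: {{SimplePool}} and its methods are hypothetical, and the real implementation is considerably more involved. It only shows the property I care about: an object that is never handed back is simply left to the GC, so only the hot paths need to call {{recycle}}.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical per-thread object pool, for illustration only.
final class SimplePool<T> {
    private final Supplier<T> factory;
    private final int maxPerThread;
    // Each thread gets its own stack, so get/recycle need no locking.
    private final ThreadLocal<ArrayDeque<T>> stack =
            ThreadLocal.withInitial(ArrayDeque::new);

    SimplePool(Supplier<T> factory, int maxPerThread) {
        this.factory = factory;
        this.maxPerThread = maxPerThread;
    }

    // Reuse a pooled instance if this thread has one, otherwise allocate.
    T get() {
        T pooled = stack.get().pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    // Optional: a caller that never calls this just lets the object be
    // garbage collected. Nothing leaks and nothing needs to be tracked.
    void recycle(T obj) {
        ArrayDeque<T> s = stack.get();
        if (s.size() < maxPerThread) {
            s.addFirst(obj);
        }
    }
}
```

With ref counting, by contrast, every path that obtains an object would also have to release it, including the cold paths where pooling buys nothing.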



> Bootstrap outgoing streaming speeds are much slower than during repair
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-9766
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9766
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>         Environment: Cassandra 2.1.2. more details in the pdf attached 
>            Reporter: Alexei K
>            Assignee: T Jake Luciani
>              Labels: performance
>             Fix For: 3.x
>
>         Attachments: problem.pdf
>
>
> I have a cluster in the Amazon cloud; it's described in detail in the attachment. What I've noticed is that during bootstrap we never go above 12MB/sec transmission speeds, and those speeds flatline almost as if we're hitting some sort of limit (this remains true for other tests that I've run). During repair, however, we see much higher, variable sending rates. I've provided network charts in the attachment as well. Is there an explanation for this? Is something wrong with my configuration, or is it a possible bug?


