Posted to commits@cassandra.apache.org by "Wei Deng (JIRA)" <ji...@apache.org> on 2016/08/24 22:51:21 UTC

[jira] [Commented] (CASSANDRA-11833) Bulk load mode for LCS

    [ https://issues.apache.org/jira/browse/CASSANDRA-11833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435872#comment-15435872 ] 

Wei Deng commented on CASSANDRA-11833:
--------------------------------------

I have done a rudimentary benchmark comparing a forced LCS major compaction against the regular approach: dump everything into L0 with auto-compaction disabled, then enable auto-compaction and let all L0 SSTables move up from one level to the next. From the results I got, I'm not convinced major compaction (at least in its current incarnation) is suitable for the "bulk load" mode we're talking about here.

The test platform is a GCE node with 4 cores and 16GB RAM. I know it's not a lot of hardware, but the observations from the benchmark lead me to believe that we will easily see the same behavior on beefier machines. To have a fair comparison, I generated data via cassandra-stress with auto-compaction disabled and only flush running, so that I could accumulate 600+ L0 SSTables of 83MB each (about 60GB total); I then took a snapshot so that I always have the same data set for the various experiments. The maximum SSTable size (the LCS table option sstable_size_in_mb) is always set to 160MB in my tests. Since this is C* 3.x, I don't have to worry about compression: I just use the default schema from cassandra-stress, which creates uncompressed tables and makes the comparison easier. I then changed concurrent_compactors to 4 and compaction_throughput_mb_per_sec to 0 (which disables throttling) in cassandra.yaml, so that all hardware on this machine is dedicated to running compactions.

With this configuration, I restarted the JVM, changed the schema to enable auto-compaction (while keeping the node otherwise completely idle, i.e. no reads/writes or any other external activity that could consume CPU, IO or memory) and waited for auto-compaction to finish. To test major compaction for comparison, I restored the same snapshot, restarted the JVM again, left auto-compaction disabled this time, and ran "nodetool compact keyspace1.standard1" to trigger the major compaction. In both cases, once the last compaction on the keyspace1.standard1 table finished, I calculated the wall-clock time, the aggregated time (the sum of the time spent by each individual compaction session), and the aggregated written bytes. Here are the benchmark numbers from the two runs:

compaction type  | wall-clock time | aggregated time | aggregated written bytes
-----------------+-----------------+-----------------+-------------------------
auto-compaction  |  3.139 hours    |  7.474 hours    | 244,166.164 MB
major compaction | 10.652 hours    | 10.652 hours    |  60,960 MB

So as you can see, even though major compaction writes about 4x less data (it writes each 160MB SSTable just once), it takes far longer to finish, because the whole major compaction runs in a single thread, while auto-compaction can use the 4 compaction threads more effectively.
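
For anyone who wants to re-derive the ratios from the table, here is a minimal Python sketch. The numbers are copied from the table above; the helper names and the "effective parallelism" metric (aggregated time divided by wall-clock time) are my own illustration and not something Cassandra reports. The written-bytes throughput is only a rough indicator, not a measurement of disk bandwidth.

```python
# Figures copied from the benchmark table above.
AUTO = {"wall_hours": 3.139, "aggregate_hours": 7.474, "written_mb": 244_166.164}
MAJOR = {"wall_hours": 10.652, "aggregate_hours": 10.652, "written_mb": 60_960.0}


def effective_parallelism(run):
    """Aggregated compaction time divided by wall-clock time.

    Hypothetical metric: roughly how many compactions ran concurrently
    on average. A value of 1.0 means everything ran serially.
    """
    return run["aggregate_hours"] / run["wall_hours"]


def written_mb_per_sec(run):
    """Average written bytes per second of wall-clock time (rough indicator)."""
    return run["written_mb"] / (run["wall_hours"] * 3600)


# How much more data auto-compaction wrote than major compaction.
write_ratio = AUTO["written_mb"] / MAJOR["written_mb"]

# How much longer major compaction took in wall-clock terms.
slowdown = MAJOR["wall_hours"] / AUTO["wall_hours"]

if __name__ == "__main__":
    print(f"write ratio (auto / major):      {write_ratio:.2f}x")
    print(f"wall-clock ratio (major / auto): {slowdown:.2f}x")
    for name, run in (("auto", AUTO), ("major", MAJOR)):
        print(
            f"{name:>5}: parallelism {effective_parallelism(run):.2f}, "
            f"{written_mb_per_sec(run):.2f} MB/s written"
        )
```

Under these assumptions, the auto-compaction run averages somewhat more than two concurrent compactions even with concurrent_compactors set to 4, while the major compaction run sits at exactly one.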

> Bulk load mode for LCS
> ----------------------
>
>                 Key: CASSANDRA-11833
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11833
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Jonathan Ellis
>              Labels: lcs
>
> We can currently avoid LCS write amplification on bulk load as follows:
> 1. Disable automatic compaction
> 2. Load all the data
> 3. Force major compaction
> 4. Re-enable automatic compaction
> It would be nice to automate this.  E.g. if we detect that L0 contains as much data as the rest of our arena, skip bubbling sstables up and just major compact automatically.  In a bulk load situation, this would result in exponentially increasing major compaction sizes -- not as ideal as the manual form with just a single one, but much better than our current behavior which is not a good experience for users.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)