Posted to commits@cassandra.apache.org by "sankalp kohli (JIRA)" <ji...@apache.org> on 2012/06/06 00:18:22 UTC

[jira] [Updated] (CASSANDRA-4310) Make Level Compaction go faster (multiple independent compactions in parallel) for insert heavy workload

     [ https://issues.apache.org/jira/browse/CASSANDRA-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sankalp kohli updated CASSANDRA-4310:
-------------------------------------

    Description: 
Problem: If you are inserting data into Cassandra and leveled compaction cannot catch up, a large number of files accumulate in L0.

Here is a solution that would help in this case and also improve the performance of leveled compaction.

We can run many compactions in parallel on unrelated data.
1) For non-overlapping levels. Ex: when an L0 sstable is compacting with L1, we can run compactions in other levels such as L2 and L3 if they are eligible.
2) We can also run compactions on files in L1 that are not participating in L0 compactions.
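
The two cases above can be sketched as a small selection routine. This is only an illustration of the idea, not Cassandra's implementation: the `SSTable` and `Candidate` types, the integer key ranges, and the `conflicts` and `pick_parallel` helpers are all hypothetical names made up for this sketch. It assumes two compactions are independent when they share no input file and do not write overlapping key ranges into the same level.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable: a level and an inclusive key range."""
    name: str
    level: int
    first_key: int
    last_key: int


@dataclass(frozen=True)
class Candidate:
    """A proposed compaction: its input files and the level it writes to."""
    inputs: FrozenSet[SSTable]
    target_level: int

    def key_range(self) -> Tuple[int, int]:
        # The output can cover at most the union of the input ranges.
        return (min(t.first_key for t in self.inputs),
                max(t.last_key for t in self.inputs))


def conflicts(a: Candidate, b: Candidate) -> bool:
    # Two compactions can never share an input file.
    if a.inputs & b.inputs:
        return True
    # Assumption: writing overlapping ranges into the same level would
    # break the no-overlap property of that level, so treat it as a conflict.
    if a.target_level == b.target_level:
        a_lo, a_hi = a.key_range()
        b_lo, b_hi = b.key_range()
        return a_lo <= b_hi and b_lo <= a_hi
    return False


def pick_parallel(candidates: List[Candidate]) -> List[Candidate]:
    """Greedily keep each candidate that conflicts with none already chosen."""
    chosen: List[Candidate] = []
    for candidate in candidates:
        if not any(conflicts(candidate, other) for other in chosen):
            chosen.append(candidate)
    return chosen
```

With this model, an L0 to L1 compaction, an L1 to L2 compaction over an L1 file that the first one does not touch, and an L2 to L3 compaction over separate files would all be selected together, while a candidate that reuses a file already being compacted would be skipped. The greedy pass favours whichever candidates come first in the list, so the order would need to reflect compaction priority.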

This is especially useful if you are using SSDs and are not bottlenecked by I/O.

I am seeing this issue in my cluster: there are more than 50k pending compactions.
I have multithreaded compaction set to true and am not throttling I/O (the throttle value is set to 0).

 



  was:
Problem: If you are inserting data into Cassandra and leveled compaction cannot catch up, a large number of files accumulate in L0. This starves compactions in the lower levels and makes things worse.

There is a comment about this problem in the code as well.

Here is a solution that would help in this case and also improve the performance of leveled compaction.

We can run many compactions in parallel on unrelated data.
1) For non-overlapping levels. Ex: when an L0 sstable is compacting with L1, we can run compactions in other levels such as L2 and L3 if they are eligible.
2) We can also run compactions on files in L1 that are not participating in L0 compactions.

This is especially useful if you are using SSDs and are not bottlenecked by I/O.

I am seeing this issue in my cluster: there are more than 50k pending compactions.
I have multithreaded compaction set to true and am not throttling I/O (the throttle value is set to 0).

 



    
> Make Level Compaction go faster (multiple independent compactions in parallel) for insert heavy workload
> --------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4310
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4310
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Priority: Minor
>              Labels: compaction, leveled
>
> Problem: If you are inserting data into Cassandra and leveled compaction cannot catch up, a large number of files accumulate in L0.
> Here is a solution that would help in this case and also improve the performance of leveled compaction.
> We can run many compactions in parallel on unrelated data.
> 1) For non-overlapping levels. Ex: when an L0 sstable is compacting with L1, we can run compactions in other levels such as L2 and L3 if they are eligible.
> 2) We can also run compactions on files in L1 that are not participating in L0 compactions.
> This is especially useful if you are using SSDs and are not bottlenecked by I/O.
> I am seeing this issue in my cluster: there are more than 50k pending compactions.
> I have multithreaded compaction set to true and am not throttling I/O (the throttle value is set to 0).
>  
