Posted to user@cassandra.apache.org by Narendra Sharma <na...@gmail.com> on 2010/12/03 21:23:08 UTC

Cassandra 0.7 - Impact of row size and columns on compaction

What is the impact (on performance and I/O) of row size (in bytes) on
compaction?
What is the impact (on performance and I/O) of the number of super columns
and columns on compaction?

Does anyone have any details or data to share?

Thanks,
Naren

Re: Cassandra 0.7 - Impact of row size and columns on compaction

Posted by Narendra Sharma <na...@gmail.com>.
This is very useful. Thanks Aaron!

-Naren

On Sun, Dec 5, 2010 at 12:35 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> AFAIK, if the entire row can be read into memory, compaction will be
> faster. The in_memory_compaction_limit_in_mb setting determines how big
> a row can be before compaction has to use a slower two-pass process.
>
> Also, my understanding is that one of the main factors for compaction is
> the number of overwrites for rows / columns. E.g., if the data for a row
> is spread over a lot of SSTables (from new columns and/or updates and/or
> deletes), it will take longer to compact that row.
>
> Hope that helps.
> Aaron

Re: Cassandra 0.7 - Impact of row size and columns on compaction

Posted by Aaron Morton <aa...@thelastpickle.com>.
AFAIK, if the entire row can be read into memory, compaction will be faster. The in_memory_compaction_limit_in_mb setting determines how big a row can be before compaction has to use a slower two-pass process.
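
A rough sketch of that size check, in illustrative Python rather than Cassandra's actual code. The function name, the 64 MB default, and the exact comparison (<= versus <) are assumptions made for the example; only the setting name comes from the text above.

```python
# Illustrative only: this is not Cassandra's implementation.
MB = 1024 * 1024

def compaction_path(row_size_bytes: int,
                    in_memory_compaction_limit_in_mb: int = 64) -> str:
    """Say which path a row of the given size would take in this sketch.

    The default of 64 and the use of <= are assumptions for illustration.
    """
    limit_bytes = in_memory_compaction_limit_in_mb * MB
    if row_size_bytes <= limit_bytes:
        # Row fits under the limit: it can be read into memory and merged there.
        return "in-memory"
    # Row is too large: fall back to the slower two-pass process.
    return "two-pass"
```

So with these assumed values, a 10 MB row would take the in-memory path and a 65 MB row would take the two-pass one.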

Also, my understanding is that one of the main factors for compaction is the number of overwrites for rows / columns. E.g., if the data for a row is spread over a lot of SSTables (from new columns and/or updates and/or deletes), it will take longer to compact that row.
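
To make the "spread over a lot of SSTables" point concrete, here is a simplified Python sketch, not Cassandra's code. It assumes each SSTable contributes a fragment of the row as a map from column name to (timestamp, value), that None stands in for a delete marker, and that the newest timestamp wins. The names Fragment and merge_row_fragments are hypothetical, and real tie-breaking rules are ignored.

```python
from typing import Dict, List, Optional, Tuple

# Column name -> (write timestamp, value). A value of None stands in for a
# delete marker. This layout is an assumption made for the example.
Fragment = Dict[str, Tuple[int, Optional[str]]]

def merge_row_fragments(fragments: List[Fragment]) -> Fragment:
    """Merge one row's fragments, keeping the newest version of each column."""
    merged: Fragment = {}
    for fragment in fragments:          # one fragment per SSTable holding the row
        for name, (ts, value) in fragment.items():
            # Keep this version only if the column is new or strictly newer.
            if name not in merged or ts > merged[name][0]:
                merged[name] = (ts, value)
    return merged
```

Every extra fragment is another pass through the inner loop, so in this sketch the work grows with the number of SSTables the row is spread across and with how many column versions they hold.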

Hope that helps. 
Aaron
 
