Posted to user@cassandra.apache.org by Mateusz Korniak <ma...@ant.gliwice.pl> on 2012/06/14 11:39:31 UTC

Meaning of compression chunk_length_kb

Hi!
What is the meaning of
"chunk_length_kb: sets the compression chunk size in kilobytes."
?

Does it mean that uncompressed sstable data is compressed down to approximately
chunk_length_kb, so every read has to read approximately chunk_length_kb and
decompress it to get any value from the compressed range?

Or does it mean that approximately chunk_length_kb of sstable data is compressed
and stored on disk, so similar values must fall within a chunk_length_kb range
for compression to be efficient?

Or something else?

Thanks in advance, regards, 

-- 
Mateusz Korniak

Re: Meaning of compression chunk_length_kb

Posted by Sylvain Lebresne <sy...@datastax.com>.
> It means that uncompressed sstable data is compressed to approximately
> chunk_length_kb and every read needs to read approximately chunk_length_kb and
> decompress it to read any value from compressed range  ?
>
> Or it means approximately chunk_length_kb of sstable data is compressed and
> stored on disk, so similar values must be in chunk_length_kb range to make
> compression efficient  ?

Pretty much the second one. We compress the sstable data in blocks of
chunk_length_kb (that is, chunk_length_kb of uncompressed data), yielding a
number of compressed blocks that are hopefully smaller than that. It does mean,
however, that every read needs to read and deserialize a full (compressed)
block and decompress it to fetch any value within that block.
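
To make that concrete, here is a rough Python sketch of the idea. It is not
Cassandra's actual code or on-disk format: zlib stands in for whichever
compressor is configured, the 64 KB value is only an example, and the names
compress_in_chunks and read_range are made up for illustration.

```python
import zlib

CHUNK_LENGTH_KB = 64                 # example value only (assumption)
CHUNK_SIZE = CHUNK_LENGTH_KB * 1024  # size of each block of UNCOMPRESSED data


def compress_in_chunks(data, chunk_size=CHUNK_SIZE):
    """Compress every chunk_size slice of uncompressed data independently."""
    return [
        zlib.compress(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


def read_range(chunks, offset, length, chunk_size=CHUNK_SIZE):
    """Return `length` (>= 1) bytes starting at uncompressed `offset`.

    Every chunk the range touches is decompressed in full, even if only
    a few bytes of it are needed.
    """
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    plain = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return plain[start:start + length]


# 50,000 fake 13-byte rows, 650,000 bytes of uncompressed data in total.
data = b"".join(b"row-%08d;" % i for i in range(50_000))
chunks = compress_in_chunks(data)

# Fetching one 13-byte row still decompresses a whole 64 KB chunk.
value = read_range(chunks, 13 * 20_000, 13)
```

So a larger chunk gives the compressor more data to find redundancy in, but
makes each small read pay for decompressing a larger block.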

Sylvain