Posted to commits@cassandra.apache.org by "Joseph Lynch (JIRA)" <ji...@apache.org> on 2018/11/13 01:53:00 UTC

[jira] [Created] (CASSANDRA-14886) Add a tool for estimating compression effects for different block sizes / compressors

Joseph Lynch created CASSANDRA-14886:
----------------------------------------

             Summary: Add a tool for estimating compression effects for different block sizes / compressors
                 Key: CASSANDRA-14886
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14886
             Project: Cassandra
          Issue Type: Improvement
          Components: Compression
            Reporter: Joseph Lynch


A common question from users of compression is "which block size should I use?". Until we figure out how to auto-tune the block size (or use something like zstd dictionary training), it might be useful to ship a tool similar to the one [~aweisberg] created ([gist mirror|https://gist.github.com/jolynch/411e62ac592bfb55cfdd5db87c77ef6f]) for CASSANDRA-13241. Users could point it at an existing sstable, and it would output the expected ratios for that sstable re-compressed with either different block sizes or a different compressor altogether. For example, maybe something like:
{noformat}
$ /cassandra/tools/bin/sstable-compression-estimate <foo>
Compressor | Chunk Size | Ratio | Read Speed | Off-Heap Memory |
----------------------------------------------------------------
LZ4        | 4096       | 0.54  | 0.2 ms     | 100kb           |
LZ4        | 8192       | 0.46  | 0.3 ms     | 50kb            |
LZ4        | 16384      | 0.42  | 0.3 ms     | 24kb            |
LZ4        | 32768      | 0.38  | 0.4 ms     | 12kb            |
LZ4        | 65536      | 0.35  | 0.8 ms     | 6kb             |
----------------------------------------------------------------
Zstd       | 4096       | 0.40  | 0.3 ms     | 100kb           |
Zstd       | 8192       | 0.34  | 0.4 ms     | 50kb            |
Zstd       | 16384      | 0.25  | 0.5 ms     | 24kb            |

...
{noformat}
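
To make the idea concrete, here is a minimal sketch of the core measurement such a tool might perform. It is not the tool from the gist and does not read the sstable format: it takes raw bytes, splits them into fixed-size chunks, compresses each chunk independently, and reports the ratio, the average per-chunk decompression time, and an estimate of offset metadata. Several things are assumptions made for illustration: the names estimate and report are hypothetical, zlib from the Python standard library stands in for LZ4 and Zstd (which need third-party bindings), and the 8 bytes of offset metadata per chunk is an assumed figure.

```python
import time
import zlib

# Assumption for illustration: about 8 bytes of offset metadata per chunk.
OFFSET_BYTES_PER_CHUNK = 8


def estimate(data: bytes, chunk_size: int, level: int = 6) -> dict:
    """Compress data in independent fixed-size chunks and summarize the result."""
    if not data:
        raise ValueError("data must not be empty")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    # Each chunk is compressed on its own, as a chunked file format would do.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    compressed = [zlib.compress(chunk, level) for chunk in chunks]

    # Time decompression of every chunk to approximate the cost of one read.
    start = time.perf_counter()
    for blob in compressed:
        zlib.decompress(blob)
    elapsed = time.perf_counter() - start

    return {
        "chunk_size": chunk_size,
        "chunks": len(chunks),
        "ratio": sum(len(blob) for blob in compressed) / len(data),
        "read_ms": 1000.0 * elapsed / len(chunks),
        "offset_bytes": len(chunks) * OFFSET_BYTES_PER_CHUNK,
    }


def report(data: bytes, chunk_sizes=(4096, 8192, 16384, 32768, 65536)) -> str:
    """Render one table row per chunk size, loosely following the layout above."""
    lines = ["Compressor | Chunk Size | Ratio | Read Speed | Offset Memory"]
    for size in chunk_sizes:
        r = estimate(data, size)
        lines.append(
            f"{'zlib':<10} | {size:<10} | {r['ratio']:<5.2f} | "
            f"{r['read_ms']:.3f} ms   | {r['offset_bytes']} B"
        )
    return "\n".join(lines)
```

Calling print(report(open(path, "rb").read())) on any file of uncompressed data gives a rough feel for the trade-off. A real tool would need to decompress the sstable's existing chunks first and plug in the actual compressors, and timings from this sketch are not comparable to Cassandra's read path.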



