Posted to commits@cassandra.apache.org by "Jim Witschey (JIRA)" <ji...@apache.org> on 2016/01/11 23:17:39 UTC

[jira] [Comment Edited] (CASSANDRA-10995) Consider disabling sstable compression by default in 3.x

    [ https://issues.apache.org/jira/browse/CASSANDRA-10995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092770#comment-15092770 ] 

Jim Witschey edited comment on CASSANDRA-10995 at 1/11/16 10:17 PM:
--------------------------------------------------------------------

One problem we currently have with benchmarking on-disk data size, in particular w.r.t. compression, is this: we don't have tools that will generate representative, compressible data. It's easy to generate random data ({{UUID}} s, random strings from {{cassandra-stress}}), but random data is neither representative nor very compressible.

[~iamaleksey] How important is it that we use such a dataset? You'd know better than I, but I don't imagine compressibility would affect resource utilization other than disk much.
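
As a rough illustration of the gap between random and compressible test data, the Python sketch below compares how well two synthetic datasets compress. It assumes {{zlib}} from the Python standard library as a stand-in compressor (not necessarily what a given table uses), and the helper names, column values, and row counts are all hypothetical. It is not a replacement for {{cassandra-stress}} and says nothing about real workloads.

```python
import random
import uuid
import zlib


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(payload)) / len(payload)


def random_uuid_rows(n: int, seed: int = 0) -> bytes:
    """Build n newline-separated UUID strings from a seeded RNG (high-entropy data)."""
    rng = random.Random(seed)
    rows = (str(uuid.UUID(int=rng.getrandbits(128))) for _ in range(n))
    return "\n".join(rows).encode("ascii")


def repetitive_rows(n: int, seed: int = 0) -> bytes:
    """Build n CSV-like rows drawn from small value sets (low-entropy data).

    The column values are made up for illustration only.
    """
    rng = random.Random(seed)
    statuses = ["active", "inactive", "pending"]
    cities = ["London", "Paris", "Austin", "Tokyo"]
    rows = (
        f"{rng.choice(statuses)},{rng.choice(cities)},{rng.randint(0, 99)}"
        for _ in range(n)
    )
    return "\n".join(rows).encode("ascii")


if __name__ == "__main__":
    for name, data in [
        ("random UUIDs", random_uuid_rows(5000)),
        ("repetitive rows", repetitive_rows(5000)),
    ]:
        print(f"{name}: {len(data)} bytes, ratio {compression_ratio(data):.2f}")
```

Hex-encoded UUIDs still compress somewhat because they use a small alphabet, but the repetitive rows should compress considerably better, which is the kind of difference a benchmark built only on random data would miss.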


was (Author: mambocab):
One problem we currently have with benchmarking on-disk data size, in particular w.r.t. compression, is this: we don't have tools that will generate representative, compressible data. It's easy to generate random data ({{UUID}}s, random strings from {{cassandra-stress}}).

[~iamaleksey] How important is it that we use such a dataset? You'd know better than I, but I don't imagine compressibility would affect resource utilization other than disk much.

> Consider disabling sstable compression by default in 3.x
> --------------------------------------------------------
>
>                 Key: CASSANDRA-10995
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10995
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Jim Witschey
>
> With the new sstable format introduced in CASSANDRA-8099, it's very likely that enabled sstable compression is no longer the right default option.
> [~slebresne]'s [blog post|http://www.datastax.com/2015/12/storage-engine-30] on the new storage engine has some comparison numbers for 2.2/3.0, with and without compression, that show that in many cases compression no longer has a significant effect on sstable sizes - all while still consuming extra resources for both writes (compression) and reads (decompression).
> We should run a comprehensive set of benchmarks to determine whether or not compression should be switched to 'off' now in 3.x.


