Posted to user@cassandra.apache.org by Michael Theroux <mt...@yahoo.com> on 2012/09/24 23:00:10 UTC

Cassandra compression not working?

Hello,

We are running into an unusual situation, and I'm wondering if anyone has any insight into it. We've been running a Cassandra cluster for some time, with compression enabled on one column family in which text documents are stored. Compression on that column family uses the SnappyCompressor with a 64k chunk length.

We recently discovered that Cassandra was reporting a compression ratio of 0. I took a snapshot of the data and started a Cassandra node in isolation to investigate.

Running nodetool scrub or nodetool upgradesstables had little impact on the amount of data being stored.

I then disabled compression and ran nodetool upgradesstables on the column family. Again, no impact on the data size stored.

I then re-enabled compression and ran nodetool upgradesstables on the column family. This resulted in a 60% reduction in the data size stored, and Cassandra now reports a compression ratio of about .38.
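
For reference, here is how the two numbers relate. As far as I understand, the ratio Cassandra reports is compressed size divided by uncompressed size, so a ratio of about .38 corresponds to a reduction of about 62%, which is consistent with the roughly 60% observed. Below is a minimal Python sketch of that arithmetic. The function names and byte counts are made up for illustration and are not taken from the cluster.

```python
def compression_ratio(compressed_bytes: int, uncompressed_bytes: int) -> float:
    """Compressed size divided by uncompressed size (smaller means better compression)."""
    if uncompressed_bytes <= 0:
        raise ValueError("uncompressed_bytes must be positive")
    return compressed_bytes / uncompressed_bytes


def size_reduction(ratio: float) -> float:
    """Fraction of on-disk space saved for a given compression ratio."""
    return 1.0 - ratio


# Hypothetical sizes, chosen only to reproduce a ratio of .38.
uncompressed = 100 * 1024**3  # 100 GiB before compression
compressed = 38 * 1024**3     # 38 GiB after compression

ratio = compression_ratio(compressed, uncompressed)
reduction = size_reduction(ratio)
print(f"ratio={ratio:.2f} reduction={reduction:.0%}")
```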

Any idea what is going on here? Obviously I can go through this process in production to enable compression. However, does anyone know what is currently happening, and why new data does not appear to be compressed?

Any insights are appreciated,
Thanks,
-Mike

Re: Cassandra compression not working?

Posted by aaron morton <aa...@thelastpickle.com>.
Nothing jumps out. Are you able to reproduce the fault on a test node?

There were some schema change problems in the early 1.1.x releases. Did you enable compression via a schema change?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



Re: Cassandra compression not working?

Posted by Mike <mt...@yahoo.com>.
I forgot to mention we are running Cassandra 1.1.2.

Thanks,
-Mike


Re: Cassandra compression not working?

Posted by Fred Groen <fg...@student.american.edu>.
You are going to need a fully optimized flux-capacitor for that.
