
Posted to user@cassandra.apache.org by Robert Wille <rw...@fold3.com> on 2015/09/24 21:59:28 UTC

Compaction not happening

I have some tables that have quite a bit of churn, and the deleted data doesn’t get compacted out of them, in spite of gc_grace_seconds=0. I periodically get updates in bulk for some of my tables. I write the new data to the tables, and then delete the old data (no updates, just insertion with new primary keys, followed by deletion of the old records). On any given day, I might add and delete 5 to 10 percent of the records. The amount of disk space that these tables take up has historically been surprisingly constant. For many months, the space didn’t vary by more than a few gigs. A couple of months ago, the tables started growing. They grew to be about 50% bigger than they used to be, and they just kept growing. We decided to upgrade our cluster (from 2.0.14 to 2.0.16), and right after the upgrade, the tables got compacted down to their original size. The size then stayed pretty constant and I was feeling pretty good about it. Unfortunately, a couple of weeks ago, they started growing again, and are now about twice their original size. I’m using leveled compaction.

One thing I’ve noticed is that back when compaction was working great, whenever I’d start writing to these tables, compactions would get triggered, and they would run for hours following the bulk writing. Now, when I’m writing records, I see short little compactions that take several seconds.

One other thing that may be relevant is that while I'm writing, max compactions pending can get into the thousands, but drops to 0 as soon as I’m done writing. Seems quite strange that Cassandra can chug through the pending compactions so quickly, while achieving so little. Half the data in these tables can be compacted out, and yet compaction does almost nothing.

This seems really strange to me:

[Two attached graphs; images not preserved in this plain-text archive]

Compactions pending shoots up when I’m done writing. Doesn’t make a lot of sense.

Any thoughts on how I can figure out what’s going on? Any idea what caused the tables to be compacted following the upgrade? Any thoughts on why I used to have compactions that took hours and actually did something, but now I get compactions that run really fast but don’t really do anything? Perhaps if I’m patient enough, the space will eventually get compacted out, and yearning for the good old days is just a waste of time. I can accept that, although if that’s the case, I may need to buy more nodes.

Thanks in advance

Robert

Re: Compaction not happening

Posted by Robert Wille <rw...@fold3.com>.

CASSANDRA-9662 definitely sounds like the source of my spikes. Good to know they are fake. Just wish I knew why it won’t compact when 50% of my data has been tombstoned. The other day it shed 10% of its size, and hasn’t grown since, so I guess that’s something.

Re: Compaction not happening

Posted by Paulo Motta <pa...@gmail.com>.

I don't know about the other issues, but the compaction pending spikes
look like CASSANDRA-9662
(https://issues.apache.org/jira/browse/CASSANDRA-9662). Could you try
upgrading to 2.0.17/2.1.9 and check whether that fixes them?

Also, if you're not already doing this, try monitoring the droppable
tombstone ratio JMX metric (or inspect each sstable's droppable tombstone
ratio with sstablemetadata) and experiment with the tombstone compaction
subproperties: tombstone_threshold, tombstone_compaction_interval and
unchecked_tombstone_compaction (more details at:
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/compactSubprop.html).
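
As a rough illustration of how those three subproperties interact, here is a minimal Python sketch. It is a simplified model, not Cassandra's actual code: the SSTableStats class, both function names, and the keyspace/table names in the usage note are hypothetical. It assumes the documented defaults (tombstone_threshold = 0.2, tombstone_compaction_interval = 86400 seconds) and deliberately leaves out the extra overlap check that Cassandra performs when unchecked_tombstone_compaction is false.

```python
from dataclasses import dataclass


@dataclass
class SSTableStats:
    """Hypothetical container for two numbers you would collect per sstable."""
    droppable_tombstone_ratio: float  # e.g. the estimate reported by sstablemetadata
    age_seconds: int                  # time elapsed since the sstable was written


def is_tombstone_compaction_candidate(stats,
                                      tombstone_threshold=0.2,
                                      tombstone_compaction_interval=86400):
    """Simplified eligibility test for a single-sstable tombstone compaction.

    An sstable is only considered once it is older than the interval AND its
    droppable tombstone ratio exceeds the threshold. The overlap check that
    applies when unchecked_tombstone_compaction is false is NOT modelled here.
    """
    old_enough = stats.age_seconds > tombstone_compaction_interval
    dirty_enough = stats.droppable_tombstone_ratio > tombstone_threshold
    return old_enough and dirty_enough


def alter_table_cql(keyspace, table, **subproperties):
    """Render an ALTER TABLE statement that sets compaction subproperties.

    The compaction map is set as a whole, so the strategy class is repeated.
    LeveledCompactionStrategy is assumed because the thread uses LCS.
    """
    options = {"class": "LeveledCompactionStrategy"}
    options.update({name: str(value).lower() for name, value in subproperties.items()})
    body = ", ".join(f"'{name}': '{value}'" for name, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{body}}};"
```

For example, alter_table_cql("my_ks", "my_table", tombstone_threshold=0.1, unchecked_tombstone_compaction=True) renders a statement that lowers the threshold and skips the overlap check. Whether those particular values suit a given workload is something to test on a non-production node first.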

Cheers,

Re: Compaction not happening

Posted by Dan Kinder <dk...@turnitin.com>.

+1, it would be great to hear a response on this. I see similar strange
behavior where "Compactions Pending" spikes up into the thousands. In my
case it's an LCS table with a fluctuating but sometimes pretty high write
load, lots of (intentional) overwrites, and infrequent deletes. C* 2.1.7.
