Posted to user@cassandra.apache.org by "Sholes, Joshua" <Jo...@cable.comcast.com> on 2014/10/13 17:27:58 UTC

Compact tables when offline

I feel like similar questions have been asked recently but not in this specific way:

I have a cluster with some I/O capacity issues, which I know is the real cause here, but with leveled compaction strategy my SSTables are piling up at the lowest level (my cfstats output for the problem child table ends up looking something like 114000/4, 10/10, 98, 0, 0, 0). In some cases it's even enough to crash Cassandra because it runs out of available open file handles.
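
For reading a per-level listing like the one above, here is a minimal Python sketch. It assumes an entry written as N/M means a level holds N SSTables against a target of M, and that a bare number means the level is within its target; the function names are made up for illustration and are not part of any Cassandra tooling.

```python
def parse_levels(levels):
    """Parse a string like '114000/4, 10/10, 98, 0, 0, 0' into (count, limit) pairs.

    Assumption: 'N/M' is N SSTables against a target of M for that level;
    a bare 'N' is recorded with limit None.
    """
    parsed = []
    for entry in levels.strip("[] \t").split(","):
        entry = entry.strip()
        if not entry:
            continue
        if "/" in entry:
            count, limit = entry.split("/", 1)
            parsed.append((int(count), int(limit)))
        else:
            parsed.append((int(entry), None))
    return parsed


def backlogged_levels(levels):
    """Return (level_index, count, limit) for levels holding more SSTables than their limit."""
    return [
        (index, count, limit)
        for index, (count, limit) in enumerate(parse_levels(levels))
        if limit is not None and count > limit
    ]


# Level 0 is the only level over its target in the output quoted above.
print(backlogged_levels("114000/4, 10/10, 98, 0, 0, 0"))
```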

My question is this: Is there a command that I'm missing that I could use to force that node to do compaction on those tables and clean up some of the thousands of 100-500 byte tables while Cassandra itself is offline? I've got plenty of disk space so that's not an issue.

I'm running Cassandra 2.0.9.

--
Josh

Re: Compact tables when offline

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Oct 15, 2014 at 3:41 PM, Alain RODRIGUEZ <ar...@gmail.com> wrote:

> +1 for "nodetool disablegossip && nodetool disablethrift && nodetool
> disablebinary" (there is a binary protocol now too, port 9042, you might
> want to disable it as well depending on your clients)
>
> "nodetool enablegossip && nodetool enablethrift && nodetool enablebinary"
> to come back "online"
>

This does not actually do what you probably think it does ("disable all
potential writes to this node"), which is why I mentioned iptables.

https://issues.apache.org/jira/browse/CASSANDRA-4162 - "nodetool
disablegossip does not prevent gossip delivery of writes via
already-initiated hinted handoff"
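
As a rough sketch of combining the nodetool calls with a firewall, the Python below only builds and prints the commands and does not run anything. The port numbers are assumed defaults (7000 and 7001 for inter-node traffic, 9042 for the binary protocol, 9160 for Thrift) and should be checked against cassandra.yaml. The function names are hypothetical.

```python
# Assumed default ports; verify against cassandra.yaml before using.
DEFAULT_PORTS = (7000, 7001, 9042, 9160)


def isolation_commands(ports=DEFAULT_PORTS):
    """Commands to stop client and inter-node traffic reaching this node."""
    commands = [
        ["nodetool", "disablegossip"],
        ["nodetool", "disablethrift"],
        ["nodetool", "disablebinary"],
    ]
    for port in ports:
        # Drop inbound TCP on each port so writes still in flight cannot land.
        commands.append(
            ["iptables", "-A", "INPUT", "-p", "tcp", "--dport", str(port), "-j", "DROP"]
        )
    return commands


def restore_commands(ports=DEFAULT_PORTS):
    """Commands to undo isolation_commands: remove firewall rules, then re-enable."""
    commands = [
        ["iptables", "-D", "INPUT", "-p", "tcp", "--dport", str(port), "-j", "DROP"]
        for port in ports
    ]
    commands += [
        ["nodetool", "enablebinary"],
        ["nodetool", "enablethrift"],
        ["nodetool", "enablegossip"],
    ]
    return commands


# Print for review; nothing is executed.
for command in isolation_commands() + restore_commands():
    print(" ".join(command))
```

nodetool talks to the node over JMX, which is not among the blocked ports here, so it should keep working while the rules are in place.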

=Rob

Re: Compact tables when offline

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
+1 for "nodetool disablegossip && nodetool disablethrift && nodetool
disablebinary" (there is a binary protocol now too, port 9042, you might
want to disable it as well depending on your clients)

"nodetool enablegossip && nodetool enablethrift && nodetool enablebinary"
to come back "online"

Cheers

2014-10-13 22:41 GMT+02:00 Robert Coli <rc...@eventbrite.com>:

> On Mon, Oct 13, 2014 at 12:26 PM, Sholes, Joshua <
> Joshua_Sholes@cable.comcast.com> wrote:
>
>>  I thought setcompactionthroughput just adjusted the compaction speed
>> when the server is online?  I'm looking for something like a scrub (which
>> as far as I know does not do this) that will compact the tables
>> appropriately while the Cassandra daemon is down.
>>
>
> Oh, sorry I missed the "offline" detail. No.
>
> You could bring up the node with join_ring=false and compact, and then
> join the ring. But if you really want the node to be isolated, you may need
> to disablegossip/disablethrift and/or firewall with iptables for the
> duration.
>
> =Rob
>
>

Re: Compact tables when offline

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Oct 13, 2014 at 12:26 PM, Sholes, Joshua <
Joshua_Sholes@cable.comcast.com> wrote:

>  I thought setcompactionthroughput just adjusted the compaction speed
> when the server is online?  I'm looking for something like a scrub (which
> as far as I know does not do this) that will compact the tables
> appropriately while the Cassandra daemon is down.
>

Oh, sorry I missed the "offline" detail. No.

You could bring up the node with join_ring=false and compact, and then join
the ring. But if you really want the node to be isolated, you may need to
disablegossip/disablethrift and/or firewall with iptables for the duration.
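
The join_ring=false approach might look roughly like the sequence below, written as a Python sketch that prints the steps without running them. The keyspace and table names are placeholders, passing the option on the cassandra command line is one assumed way of setting it, and whether nodetool compact actually clears a level 0 backlog under leveled compaction on 2.0.9 is not confirmed in this thread.

```python
import shlex


def join_ring_false_plan(keyspace, table):
    """Ordered commands: start outside the ring, compact, check progress, then join."""
    return [
        # Start the daemon without joining the ring.
        ["cassandra", "-Dcassandra.join_ring=false"],
        # Request compaction of the problem table.
        ["nodetool", "compact", keyspace, table],
        # Check this until no compactions remain pending.
        ["nodetool", "compactionstats"],
        # Join the ring once compaction has caught up.
        ["nodetool", "join"],
    ]


# Placeholder names; substitute the real keyspace and table.
for step in join_ring_false_plan("my_keyspace", "my_table"):
    print(" ".join(shlex.quote(arg) for arg in step))
```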

=Rob

Re: Compact tables when offline

Posted by "Sholes, Joshua" <Jo...@cable.comcast.com>.
I thought setcompactionthroughput just adjusted the compaction speed when the server is online? I'm looking for something like a scrub (which as far as I know does not do this) that will compact the tables appropriately while the Cassandra daemon is down.

And I know why I've got heap pressure: the cluster's data sources are growing faster than the cluster is (and faster than it was designed for). That's a somewhat longer-term fix involving new hardware, I think.
--
Josh

From: Robert Coli <rc...@eventbrite.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Monday, October 13, 2014 at 1:49 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Compact tables when offline

On Mon, Oct 13, 2014 at 8:27 AM, Sholes, Joshua <Jo...@cable.comcast.com> wrote:
My question is this: Is there a command that I'm missing that I could use to force that node to do compaction on those tables and clean up some of the thousands of 100-500 byte tables while Cassandra itself is offline? I've got plenty of disk space so that's not an issue.

nodetool setcompactionthroughput 0

But if you have that many tiny files, you probably have serious heap pressure and are flushing all the time and should figure out why.

=Rob
http://twitter.com/rcolidba

Re: Compact tables when offline

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Oct 13, 2014 at 8:27 AM, Sholes, Joshua <
Joshua_Sholes@cable.comcast.com> wrote:

>   My question is this:  Is there a command that I'm missing that I could
> use to force that node to do compaction on those tables and clean up some
> of the thousands of 100-500 byte tables while Cassandra itself is offline?
> I've got plenty of disk space so that's not an issue.
>

nodetool setcompactionthroughput 0

But if you have that many tiny files, you probably have serious heap
pressure and are flushing all the time and should figure out why.
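
To gauge how many tiny files there really are, a small Python sketch like the one below can walk a data directory and list SSTable data files at or under a size threshold. It assumes data files end in -Data.db and that 500 bytes is the threshold of interest; the function name is hypothetical.

```python
import os


def tiny_sstables(data_dir, max_bytes=500):
    """Return sorted paths of '*-Data.db' files at or under max_bytes in data_dir."""
    tiny = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            # Assumption: SSTable data components are named '...-Data.db'.
            if name.endswith("-Data.db"):
                path = os.path.join(root, name)
                if os.path.getsize(path) <= max_bytes:
                    tiny.append(path)
    return sorted(tiny)
```

Calling len(tiny_sstables("/var/lib/cassandra/data")) would give a count, assuming that is where the node keeps its data; the real location is set in cassandra.yaml.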

=Rob
http://twitter.com/rcolidba