Posted to user@cassandra.apache.org by "Laing, Michael" <mi...@nytimes.com> on 2014/04/07 21:22:54 UTC

Re: Setting gc_grace_seconds to zero and skipping "nodetool repair" (was RE: Timeseries with TTL)

Perhaps following this recent thread would help clarify things:

http://mail-archives.apache.org/mod_mbox/cassandra-user/201401.mbox/%3CCAKgmDnFK3PA-w+LtUsM88a15JDg275O31p4UjwoL1B7BkajQRQ@mail.gmail.com%3E

Cheers,

Michael


On Mon, Apr 7, 2014 at 2:00 PM, Donald Smith <
Donald.Smith@audiencescience.com> wrote:

>  This statement is significant: “BTW if you never delete and only ttl
> your values at a constant value, you can set gc=0 and forget about periodic
> repair of the table, saving some space, IO, CPU, and an operational step.”
>
>
> Setting gc_grace_seconds to zero also has the effect, I believe, of not
> storing hinted handoffs for the table (hints replay missed writes, including
> deletes, to replicas and so help prevent deleted data from reappearing).
> “Periodic repair” refers to running “nodetool repair” (a.k.a. anti-entropy
> repair).
>
>
>
> I too have wondered if setting gc_grace_seconds to zero and skipping
> “nodetool repair” are safe.
>
>
>
> We’re using C* 2.0.6. In the 2.0.X versions, with vnodes, “nodetool repair
> …” is very slow (see https://issues.apache.org/jira/browse/CASSANDRA-5220 and
> https://issues.apache.org/jira/browse/CASSANDRA-6611). We found repairs via
> “nodetool repair” unacceptably slow, even when we restricted them to one
> table, and often the repairs hung or failed. We also tried subrange repairs
> and the other options.
>
>
>
> Our app does no deletes and only rarely updates a row (when bad data needs
> to be replaced). So it’s very tempting to set gc_grace_seconds = 0 in the
> table definitions and skip running “nodetool repair”.
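>
> For example, something like this (hypothetical keyspace and table names,
> just to sketch what I mean):
>
> ALTER TABLE our_keyspace.our_table
> WITH gc_grace_seconds = 0;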
>
>
>
> But there is Cassandra documentation that warns that regular repairs are
> necessary even if you don’t do deletes. For example,
> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html says:
>
>
>
>      Note: If deletions never occur, you should still schedule regular
> repairs. Be aware that setting a column to null is a delete.
>
>
>
> The Apache wiki
> https://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair says:
>
>  Unless your application performs no deletes, it is strongly recommended
> that production clusters run nodetool repair periodically on all nodes in
> the cluster.
>
> *IF* your operations team is sufficiently on the ball, you can get by
> without repair as long as you do not have hardware failure -- in that case,
> HintedHandoff <https://wiki.apache.org/cassandra/HintedHandoff> is
> adequate to repair successful updates that some replicas have missed.
> Hinted handoff is active for max_hint_window_in_ms after a replica fails.
>
> Full repair or re-bootstrap is necessary to re-replicate data lost to
> hardware failure (see below).
>
> So, if there are hardware failures, “nodetool repair” is needed.  And
> http://planetcassandra.org/general-faq/ says:
>
>
>
> Anti-Entropy Node Repair – For data that is not read frequently, or to
> update data on a node that has been down for an extended period, the node
> repair process (also referred to as anti-entropy repair) ensures that all
> data on a replica is made consistent. Node repair (using the nodetool
> utility) should be run routinely as part of regular cluster maintenance
> operations.
>
>
>
> If RF=2, the read consistency is ONE, and data failed to get replicated to
> the second node, might a read incorrectly report the data as “missing”?
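>
> To make the scenario concrete, here is a cqlsh sketch (hypothetical
> keyspace, table, and key, purely illustrative):
>
> -- With RF=2, a read at ONE may be served by the one replica that missed the write.
> CONSISTENCY ONE;
> SELECT * FROM our_keyspace.our_table WHERE id = 42;
>
> -- QUORUM with RF=2 requires both replicas to respond, so the surviving copy
> -- of the data is returned (but the read fails if either replica is down).
> CONSISTENCY QUORUM;
> SELECT * FROM our_keyspace.our_table WHERE id = 42;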
>
>
>
> It seems to me that the need to run “nodetool repair” reflects a design
> bug; it should be automated.
>
>
>
> Don
>
>
>
> *From:* Laing, Michael [mailto:michael.laing@nytimes.com]
> *Sent:* Sunday, April 06, 2014 11:31 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Timeseries with TTL
>
>
>
> Since you are using LeveledCompactionStrategy there is no major/minor
> compaction - just compaction.
>
>
>
> Leveled compaction does more work - your logs don't look unreasonable to
> me - the real question is whether your nodes can keep up w the IO. SSDs
> work best.
>
>
>
> BTW if you never delete and only ttl your values at a constant value, you
> can set gc=0 and forget about periodic repair of the table, saving some
> space, IO, CPU, and an operational step.
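>
> For example, on a version that supports the table-level default_time_to_live
> setting (2.0+), a table set up that way might look like this - hypothetical
> names, just a sketch:
>
> CREATE TABLE events_by_day (
>   day text,
>   ts timestamp,
>   payload text,
>   PRIMARY KEY (day, ts)
> )
> WITH default_time_to_live = 604800  -- every write expires after 7 days
> AND gc_grace_seconds = 0;           -- no deletes, so no tombstone grace period needed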
>
>
>
> If your nodes cannot keep up with the IO, switch to
> SizeTieredCompactionStrategy and monitor read response times. Or add SSDs.
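>
> E.g. (substitute your own keyspace name - just a sketch):
>
> ALTER TABLE your_keyspace.metrics_5min
> WITH compaction = {'class': 'SizeTieredCompactionStrategy'};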
>
>
>
> In my experience, for smallish nodes running C* 2 without
> SSDs, LeveledCompactionStrategy can cause the disk cache to churn, reducing
> read performance substantially. So watch out for that.
>
>
>
> Good luck,
>
>
>
> Michael
>
>
>
> On Sun, Apr 6, 2014 at 10:25 AM, Vicent Llongo <vi...@gmail.com> wrote:
>
> Hi,
>
>
>
> Most of the queries to that table are just getting a range of values for a
> metric:
> SELECT val FROM metrics_5min WHERE object_id = ? AND metric = ? AND ts >= ?
> AND ts <= ?
>
>
>
> I'm not sure from the logs what kind of compactions they are. This is what
> I see in system.log (grepping for that specific table):
>
> ...
> INFO [CompactionExecutor:742] 2014-04-06 13:30:11,223 CompactionTask.java
> (line 105) Compacting
> [SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14991-Data.db'),
> SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14990-Data.db')]
> INFO [CompactionExecutor:753] 2014-04-06 13:35:22,495 CompactionTask.java
> (line 105) Compacting
> [SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14992-Data.db'),
> SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14993-Data.db')]
> INFO [CompactionExecutor:770] 2014-04-06 13:41:09,146 CompactionTask.java
> (line 105) Compacting
> [SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14995-Data.db'),
> SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14994-Data.db')]
> INFO [CompactionExecutor:783] 2014-04-06 13:46:21,250 CompactionTask.java
> (line 105) Compacting
> [SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14996-Data.db'),
> SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14997-Data.db')]
> INFO [CompactionExecutor:798] 2014-04-06 13:51:28,369 CompactionTask.java
> (line 105) Compacting
> [SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14998-Data.db'),
> SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-14999-Data.db')]
> INFO [CompactionExecutor:816] 2014-04-06 13:57:17,585 CompactionTask.java
> (line 105) Compacting
> [SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-15000-Data.db'),
> SSTableReader(path='/mnt/disk1/cassandra/data/keyspace/metrics_5min/keyspace-metrics_5min-ic-15001-Data.db')]
> ...
>
>
>
> As you can see, every ~5 minutes there's a compaction going on.
>
>
>
>
>
> On Sun, Apr 6, 2014 at 4:33 PM, Sergey Murylev <se...@gmail.com>
> wrote:
>
> Hi Vincent,
>
>
>
>
>  Is that a good pattern for Cassandra? Are there any compaction tunings I
> should take into account?
>
> Actually it depends on how you use Cassandra :). If you use it as
> key-value storage, TTL works fine. But if you run rather complex CQL
> queries against this table, I'm not sure it would work as well.
>
>
>
>
>  With this structure it is obvious that, after one week of inserting data,
> there are going to be new expired columns every 5 minutes in that table.
> Because of that I've noticed that this table is being compacted every 5
> minutes.
>
> Compaction isn't triggered when a column expires. It is triggered according
> to the compaction strategy, and expired data is only purged once the
> gc_grace_seconds timeout has passed. You can find a more detailed
> description of LeveledCompactionStrategy in the following article: Leveled
> compaction in Cassandra <http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra>.
>
>
> There are 2 types of compaction: minor and major. Which kind of compaction
> do you see, and how did you come to the conclusion that compaction is
> triggered every 5 minutes? If you see major compactions, that situation is
> very bad; otherwise it is the normal case.
>
> --
> Thanks,
> Sergey
>
>
>
>  On 06/04/14 15:48, Vicent Llongo wrote:
>
>  Hi there,
>
> I have this table where I'm inserting timeseries values with a TTL of
> 86400*7 (1 week):
>
>
> CREATE TABLE metrics_5min (
>   object_id varchar,
>   metric varchar,
>   ts timestamp,
>   val double,
>   PRIMARY KEY ((object_id, metric), ts)
> )
> WITH gc_grace_seconds = 86400
> AND compaction = {'class': 'LeveledCompactionStrategy',
> 'sstable_size_in_mb' : 100};
>
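> Each write then carries the one-week TTL, e.g. (values made up, just to show
> how the TTL is applied):
>
> INSERT INTO metrics_5min (object_id, metric, ts, val)
> VALUES ('obj-1', 'cpu_idle', '2014-04-06 13:30:00', 97.5)
> USING TTL 604800;  -- 86400*7
>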
>   With this structure it is obvious that, after one week of inserting data,
> there are going to be new expired columns every 5 minutes in that table.
> Because of that I've noticed that this table is being compacted every 5
> minutes.
>
> Is that a good pattern for Cassandra? Are there any compaction tunings I
> should take into account?
>
> Thanks!
>
>
>
>
>
>
>
>
>