Posted to commits@cassandra.apache.org by "Roman Tkachenko (JIRA)" <ji...@apache.org> on 2015/06/11 23:03:02 UTC

[jira] [Comment Edited] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows

    [ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582508#comment-14582508 ] 

Roman Tkachenko edited comment on CASSANDRA-9045 at 6/11/15 9:02 PM:
---------------------------------------------------------------------

Hey guys.

So I switched writes to EACH_QUORUM several weeks ago and have been monitoring since, but it does not look like it fixed the issue.
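
For reference, the change amounts to something like this (illustration via cqlsh; the application actually goes through its driver, and the domainid/address values here are just placeholders):

{code}
-- Illustration only: set the session consistency before a write.
-- The application does the equivalent through its driver for all writes/deletes.
cqlsh> CONSISTENCY EACH_QUORUM;
cqlsh> DELETE FROM blackbook.bounces WHERE domainid = 'domain.com' AND address = 'alice@example.com';
{code}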

Check this out. I pulled the logs from both datacenters for one of the reappeared entries and correlated them with our repair schedule. Reads (GETs) are done at LOCAL_QUORUM.

{code}
Time                    DC      RESULT
==========================================
2015-06-04T11:31:38     DC2     GET 200  --> record is present in both DCs
2015-06-04T15:25:01     DC1     GET 200
2015-06-04T19:24:06     DC1     DELETE 200  --> deleted in DC1
2015-06-04T19:45:16     DC2     GET 404  --> record disappeared from both DCs...
2015-06-05T07:10:32     DC1     GET 404
2015-06-05T10:16:28     DC2     GET 200  --> ... but somehow appeared back in DC2 (no POST requests happened for this record)
2015-06-07T18:59:57     DC1     GET 404
2AM NODE IN DC2 REPAIR
4AM NODE IN DC1 REPAIR
2015-06-08T08:27:36     DC1     GET 200  --> record is present in both DCs again, looks like DC2 "repaired" DC1
2015-06-09T15:29:50     DC2     GET 200
2015-06-09T16:05:30     DC1     DELETE 200
2015-06-09T16:05:30     DC1     GET 404
2015-06-09T21:08:24     DC2     GET 404
{code}

So the question is how the record managed to reappear in DC2. Do you have any suggestions on how we can investigate this?
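
One thing I can try next (sketch below; the key and SSTable path are placeholders, and I'm assuming getsstables/sstable2json work the way I remember for 2.0.x): right after the record reappears in DC2, find the SSTables holding that partition on a DC2 replica and dump its cells to compare their write timestamps against the DELETE and check whether the tombstone is still there.

{code}
# On a DC2 replica that owns the partition (placeholder key):
nodetool getsstables blackbook bounces <partition key>
# Dump just that partition from each returned SSTable to inspect cell timestamps
# and tombstones (the key is passed in hex, if I remember the tool correctly):
sstable2json <one of the Data.db files returned above> -k <hex partition key>
{code}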

Thanks,
Roman


> Deleted columns are resurrected after repair in wide rows
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9045
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Roman Tkachenko
>            Assignee: Marcus Eriksson
>            Priority: Critical
>             Fix For: 2.0.x
>
>         Attachments: 9045-debug-tracing.txt, another.txt, apache-cassandra-2.0.13-SNAPSHOT.jar, cqlsh.txt, debug.txt, inconsistency.txt
>
>
> Hey guys,
> After almost a week of researching the issue and trying out multiple things with (almost) no luck, it was suggested (on the user@cass list) that I file a report here.
> h5. Setup
> Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it would go away).
> Multi-datacenter cluster with 12+6 nodes.
> h5. Schema
> {code}
> cqlsh> describe keyspace blackbook;
> CREATE KEYSPACE blackbook WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'IAD': '3',
>   'ORD': '3'
> };
> USE blackbook;
> CREATE TABLE bounces (
>   domainid text,
>   address text,
>   message text,
>   "timestamp" bigint,
>   PRIMARY KEY (domainid, address)
> ) WITH
>   bloom_filter_fp_chance=0.100000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.100000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.000000 AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'LeveledCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> {code}
> h5. Use case
> Each row (defined by a domainid) can have a great many columns (bounce entries), so rows can get pretty wide. In practice, most of the rows are not that big, but some of them contain hundreds of thousands or even millions of columns.
> Columns are not TTL'ed but can be deleted using the following CQL3 statement:
> {code}
> delete from bounces where domainid = 'domain.com' and address = 'alice@example.com';
> {code}
> All queries are performed using LOCAL_QUORUM CL.
> h5. Problem
> We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing so we noticed that some of the previously deleted columns (bounce entries) are there again, as if the tombstones had disappeared.
> I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue (a sketch of the commands follows the list):
> * delete an entry
> * verify it's not returned even with CL=ALL
> * run repair on nodes that own this row's key
> * the columns reappear and are returned even with CL=ALL
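> Roughly, the steps look like this in cqlsh (a sketch; same placeholder domainid/address as in the delete example above):
> {code}
> cqlsh> USE blackbook;
> cqlsh> CONSISTENCY ALL;
> cqlsh> DELETE FROM bounces WHERE domainid = 'domain.com' AND address = 'alice@example.com';
> cqlsh> SELECT * FROM bounces WHERE domainid = 'domain.com' AND address = 'alice@example.com';  -- nothing comes back at CL=ALL
> -- run "nodetool repair blackbook bounces" on the replicas owning this key, then:
> cqlsh> SELECT * FROM bounces WHERE domainid = 'domain.com' AND address = 'alice@example.com';  -- the deleted columns are back
> {code}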
> I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair.
> h5. Other steps I've taken so far
> Made sure NTP is running on all servers and clocks are synchronized.
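> (For the record, the clock check on each node was roughly along these lines:)
> {code}
> ntpq -p    # peers reachable, offsets in the low-millisecond range
> {code}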
> Increased gc_grace_seconds to 100 days, ran a full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days. Didn't help.
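> The gc_grace change was along these lines (sketch):
> {code}
> ALTER TABLE bounces WITH gc_grace_seconds = 8640000;  -- 100 days
> -- full repair of the keyspace on every node, then revert:
> ALTER TABLE bounces WITH gc_grace_seconds = 864000;   -- the default 10 days
> {code}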
> Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared.
> Finally, I noticed these log entries for the row in question:
> {code}
> INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally
> {code}
> Figuring it might be related, I bumped "in_memory_compaction_limit_in_mb" to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear.
> We have a lot of rows much larger than 512MB, so we can't keep increasing this parameter forever, if that is indeed the issue.
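> For reference, the setting in question (sketch of the cassandra.yaml line; the default is 64 MB as far as I know, and the change needs a rolling restart to take effect):
> {code}
> # cassandra.yaml -- raised so this particular row fits in memory during validation compaction
> in_memory_compaction_limit_in_mb: 512
> {code}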
> Please let me know if you need more information on the case or if I can run more experiments.
> Thanks!
> Roman


