Posted to commits@cassandra.apache.org by "Bartłomiej Romański (JIRA)" <ji...@apache.org> on 2013/12/26 15:01:54 UTC

[jira] [Updated] (CASSANDRA-6527) Random tombstones after adding a CF with sstableloader

     [ https://issues.apache.org/jira/browse/CASSANDRA-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bartłomiej Romański updated CASSANDRA-6527:
-------------------------------------------

    Description: 
I've marked this bug as critical since it results in data loss without any warning.

Here's the scenario:

- create a fresh one-node cluster with Cassandra 1.2.11
- add a sample row:

{code}
CREATE KEYSPACE keyspace1 WITH replication = {'class':'SimpleStrategy', 'replication_factor':1};
use keyspace1;
create table table1 (key text primary key, value1 text);
update table1 set value1 = 'some-value' where key = 'some-key';
{code}
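
Optionally, verify the row is readable before flushing:

{code}
select * from table1;
{code}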

- flush and drain, then shut down the cluster:
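
For example (a sketch assuming a single local node and default nodetool settings):

{code}
nodetool flush keyspace1 table1
nodetool drain
{code}

You should now have a single sstable: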

{code}
root@l1:~# ls /var/lib/cassandra/data/keyspace1/table1/
keyspace1-table1-ic-1-CompressionInfo.db  
keyspace1-table1-ic-1-Filter.db  
keyspace1-table1-ic-1-Statistics.db  
keyspace1-table1-ic-1-TOC.txt
keyspace1-table1-ic-1-Data.db             
keyspace1-table1-ic-1-Index.db   
keyspace1-table1-ic-1-Summary.db
{code}

with perfectly correct content:

{code}
root@l1:~# sstable2json /var/lib/cassandra/data/keyspace1/table1/keyspace1-table1-ic-1-Data.db
[
{"key": "736f6d652d6b6579","columns": [["","",1387822268786000], ["value1","some-value",1387822268786000]]}
]
{code}
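
The row key is printed hex-encoded; it decodes back to the key we inserted:

{code}
root@l1:~# echo 736f6d652d6b6579 | xxd -r -p
some-key
{code}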

- create a new cluster with Cassandra 2.0.3 (we used 3 nodes with replication_factor=2, but I guess it doesn't matter)

- copy the sstables from the machine in the old cluster to one of the machines in the new cluster (we do not want to use the old sstableloader):
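
For example (the copy method doesn't matter; the only requirement is that the files end up under a keyspace1/table1/ directory, since sstableloader derives the keyspace and table names from the last two path components):

{code}
rsync -av root@l1:/var/lib/cassandra/data/keyspace1/table1/ keyspace1/table1/
{code}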

- load sstables with sstableloader:

{code}
sstableloader -d 172.16.9.12 keyspace1/table1
{code}

- analyze the content of the newly loaded sstable:

{code}
root@l13:~# sstable2json /var/lib/cassandra/data/keyspace1/table1/keyspace1-table1-jb-1-Data.db
[
{"key": "736f6d652d6b6579","metadata": {"deletionInfo": {"markedForDeleteAt":294205259775,"localDeletionTime":0}},"columns": [["","",1387824835597000], ["value1","some-value",1387824835597000]]}
]
{code}

There's a random tombstone inserted!
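
The tombstone's timestamp is clearly bogus. Assuming markedForDeleteAt uses the same microseconds-since-epoch convention as the column timestamps above, 294205259775 decodes to just a few days after the epoch (and localDeletionTime:0 is the epoch itself):

{code}
root@l13:~# date -u -d @$((294205259775 / 1000000))
Sun Jan  4 09:43:25 UTC 1970
{code}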

We've hit this bug in production. We never use deletes for this column family, but tombstones appeared for every row. The timestamps look random. In our case they were mostly in the past, but sometimes (for about 3% of rows) in the future. That's even worse than a missing row: you cannot simply insert it again, because a tombstone from the future will hide it.
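
Since sstable2json prints one row per line (as in the output above), a rough way to count affected rows in a loaded sstable is:

{code}
root@l13:~# sstable2json /var/lib/cassandra/data/keyspace1/table1/keyspace1-table1-jb-1-Data.db | grep -c '"deletionInfo"'
1
{code}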

Fortunately, we noticed this quickly and canceled the migration. However, we were quite lucky: there are no warnings or errors during the whole process, and losing less than 3% of the data may be hard to notice at first sight for many kinds of apps.


> Random tombstones after adding a CF with sstableloader
> ------------------------------------------------------
>
>                 Key: CASSANDRA-6527
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6527
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Bartłomiej Romański
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)