Posted to user@cassandra.apache.org by Sebastian Martinka <se...@mercateo.com> on 2014/11/03 16:44:32 UTC

new data not flushed to sstables

System and Keyspace Information:
4 Nodes
Cassandra 2.0.9
cqlsh 4.1.1
CQL spec 3.1.1
Thrift protocol 19.39.0

java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)

CREATE KEYSPACE restore_test WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': '3'};

CREATE TABLE inkr_test (
  objid int,
  creation_date timestamp,
  data text,
  PRIMARY KEY ((objid))
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
------------------------------------------------------------------------------------------

I created some rows for backup and incremental restore tests.
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (1, dateof(now()),'ini. Load');
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (2, dateof(now()),'ini. Load');
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (3, dateof(now()),'ini. Load');
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (4, dateof(now()),'ini. Load');
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (5, dateof(now()),'ini. Load');
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (6, dateof(now()),'ini. Load');
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (7, dateof(now()),'ini. Load');

cqlsh:restore_test> Select * from inkr_test;

objid | creation_date            | data
-------+--------------------------+-----------
     5 | 2014-10-29 06:20:06+0100 | ini. Load
     1 | 2014-10-29 06:19:50+0100 | ini. Load
     2 | 2014-10-29 06:19:56+0100 | ini. Load
     4 | 2014-10-29 06:20:02+0100 | ini. Load
     7 | 2014-10-29 06:20:13+0100 | ini. Load
     6 | 2014-10-29 06:20:09+0100 | ini. Load
     3 | 2014-10-29 06:19:59+0100 | ini. Load

(7 rows)

Next, I ran nodetool flush on all nodes to write the data to disk. After that, I inserted more rows.
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (8, dateof(now()),'nach Backup');
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (9, dateof(now()),'nach Backup');
cqlsh:restore_test> insert into inkr_test (objid, creation_date, data) VALUES (10, dateof(now()),'nach Backup');
cqlsh:restore_test> Select * from inkr_test;

objid | creation_date            | data
-------+--------------------------+-------------
     5 | 2014-10-29 06:20:06+0100 |   ini. Load
    10 | 2014-10-29 06:44:12+0100 | nach Backup
     1 | 2014-10-29 06:19:50+0100 |   ini. Load
     8 | 2014-10-29 06:44:00+0100 | nach Backup
     2 | 2014-10-29 06:19:56+0100 |   ini. Load
     4 | 2014-10-29 06:20:02+0100 |   ini. Load
     7 | 2014-10-29 06:20:13+0100 |   ini. Load
     6 | 2014-10-29 06:20:09+0100 |   ini. Load
     9 | 2014-10-29 06:44:04+0100 | nach Backup
     3 | 2014-10-29 06:19:59+0100 |   ini. Load

(10 rows)

Now I ran nodetool flush on node1 only and checked the contents of the resulting sstables:

[root@dev-stage-cassandra1 backup]# ll /opt/data/restore_test/inkr_test/
total 68
drwxr-xr-x 2 cassandra cassandra 4096 Oct 29 06:45 backups
-rw-r--r-- 1 cassandra cassandra   43 Oct 29 06:35 restore_test-inkr_test-jb-2-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra  213 Oct 29 06:35 restore_test-inkr_test-jb-2-Data.db
-rw-r--r-- 1 cassandra cassandra  336 Oct 29 06:35 restore_test-inkr_test-jb-2-Filter.db
-rw-r--r-- 1 cassandra cassandra   90 Oct 29 06:35 restore_test-inkr_test-jb-2-Index.db
-rw-r--r-- 1 cassandra cassandra 4393 Oct 29 06:35 restore_test-inkr_test-jb-2-Statistics.db
-rw-r--r-- 1 cassandra cassandra   80 Oct 29 06:35 restore_test-inkr_test-jb-2-Summary.db
-rw-r--r-- 1 cassandra cassandra   79 Oct 29 06:35 restore_test-inkr_test-jb-2-TOC.txt
-rw-r--r-- 2 cassandra cassandra   43 Oct 29 06:45 restore_test-inkr_test-jb-3-CompressionInfo.db
-rw-r--r-- 2 cassandra cassandra  130 Oct 29 06:45 restore_test-inkr_test-jb-3-Data.db
-rw-r--r-- 2 cassandra cassandra   16 Oct 29 06:45 restore_test-inkr_test-jb-3-Filter.db
-rw-r--r-- 2 cassandra cassandra   36 Oct 29 06:45 restore_test-inkr_test-jb-3-Index.db
-rw-r--r-- 2 cassandra cassandra 4389 Oct 29 06:45 restore_test-inkr_test-jb-3-Statistics.db
-rw-r--r-- 2 cassandra cassandra   80 Oct 29 06:45 restore_test-inkr_test-jb-3-Summary.db
-rw-r--r-- 2 cassandra cassandra   79 Oct 29 06:45 restore_test-inkr_test-jb-3-TOC.txt
[root@dev-stage-cassandra1 backup]# sstable2json /opt/data/restore_test/inkr_test/restore_test-inkr_test-jb-3-Data.db > /home/sstable2json.log
[root@dev-stage-cassandra1 backup]# sstable2json /opt/data/restore_test/inkr_test/restore_test-inkr_test-jb-2-Data.db > /home/sstable2json2.log

# content from restore_test-inkr_test-jb-2-Data.db
[
{"key": "00000001","columns": [["","",1414559990531000], ["creation_date","2014-10-29 06:19+0100",1414559990531000], ["data","ini. Load",1414559990531000]]},
{"key": "00000002","columns": [["","",1414559996219000], ["creation_date","2014-10-29 06:19+0100",1414559996219000], ["data","ini. Load",1414559996219000]]},
{"key": "00000004","columns": [["","",1414560002891000], ["creation_date","2014-10-29 06:20+0100",1414560002891000], ["data","ini. Load",1414560002891000]]},
{"key": "00000006","columns": [["","",1414560009867000], ["creation_date","2014-10-29 06:20+0100",1414560009867000], ["data","ini. Load",1414560009867000]]},
{"key": "00000003","columns": [["","",1414559999371000], ["creation_date","2014-10-29 06:19+0100",1414559999371000], ["data","ini. Load",1414559999371000]]}
]
# content from restore_test-inkr_test-jb-3-Data.db
[
{"key": "00000008","columns": [["","",1414561440089000], ["creation_date","2014-10-29 06:44+0100",1414561440089000], ["data","nach Backup",1414561440089000]]},
{"key": "00000009","columns": [["","",1414561444022000], ["creation_date","2014-10-29 06:44+0100",1414561444022000], ["data","nach Backup",1414561444022000]]}
]

Since I couldn't find all of the data there, I ran nodetool flush on nodes 2, 3 and 4 and checked the contents of their sstables.

[root@dev-stage-cassandra2 ~]# sstable2json /opt/data/restore_test/inkr_test/restore_test-inkr_test-jb-2-Data.db > /home/sstable2json.log
[root@dev-stage-cassandra2 ~]# sstable2json /opt/data/restore_test/inkr_test/restore_test-inkr_test-jb-1-Data.db > /home/sstable2json2.log

# content from restore_test-inkr_test-jb-1-Data.db
[
{"key": "00000005","columns": [["","",1414560006131000], ["creation_date","2014-10-29 06:20+0100",1414560006131000], ["data","ini. Load",1414560006131000]]},
{"key": "00000002","columns": [["","",1414559996219000], ["creation_date","2014-10-29 06:19+0100",1414559996219000], ["data","ini. Load",1414559996219000]]},
{"key": "00000004","columns": [["","",1414560002891000], ["creation_date","2014-10-29 06:20+0100",1414560002891000], ["data","ini. Load",1414560002891000]]},
{"key": "00000007","columns": [["","",1414560013034000], ["creation_date","2014-10-29 06:20+0100",1414560013034000], ["data","ini. Load",1414560013034000]]}
]

# content from restore_test-inkr_test-jb-2-Data.db
[
{"key": "0000000a","columns": [["","",1414561452413000], ["creation_date","2014-10-29 06:44+0100",1414561452413000], ["data","nach Backup",1414561452413000]]},
{"key": "00000008","columns": [["","",1414561440089000], ["creation_date","2014-10-29 06:44+0100",1414561440089000], ["data","nach Backup",1414561440089000]]}
]


[root@dev-stage-cassandra3 ~]# sstable2json /opt/data/restore_test/inkr_test/restore_test-inkr_test-jb-2-Data.db > /home/sstable2json.log
[root@dev-stage-cassandra3 ~]# sstable2json /opt/data/restore_test/inkr_test/restore_test-inkr_test-jb-1-Data.db > /home/sstable2json2.log

# content from restore_test-inkr_test-jb-1-Data.db
[
{"key": "00000005","columns": [["","",1414560006131000], ["creation_date","2014-10-29 06:20+0100",1414560006131000], ["data","ini. Load",1414560006131000]]},
{"key": "00000001","columns": [["","",1414559990531000], ["creation_date","2014-10-29 06:19+0100",1414559990531000], ["data","ini. Load",1414559990531000]]},
{"key": "00000002","columns": [["","",1414559996219000], ["creation_date","2014-10-29 06:19+0100",1414559996219000], ["data","ini. Load",1414559996219000]]},
{"key": "00000004","columns": [["","",1414560002891000], ["creation_date","2014-10-29 06:20+0100",1414560002891000], ["data","ini. Load",1414560002891000]]},
{"key": "00000007","columns": [["","",1414560013034000], ["creation_date","2014-10-29 06:20+0100",1414560013034000], ["data","ini. Load",1414560013034000]]},
{"key": "00000006","columns": [["","",1414560009867000], ["creation_date","2014-10-29 06:20+0100",1414560009867000], ["data","ini. Load",1414560009867000]]},
{"key": "00000003","columns": [["","",1414559999371000], ["creation_date","2014-10-29 06:19+0100",1414559999371000], ["data","ini. Load",1414559999371000]]}
]

# content from restore_test-inkr_test-jb-2-Data.db
[
{"key": "0000000a","columns": [["","",1414561452413000], ["creation_date","2014-10-29 06:44+0100",1414561452413000], ["data","nach Backup",1414561452413000]]},
{"key": "00000009","columns": [["","",1414561444022000], ["creation_date","2014-10-29 06:44+0100",1414561444022000], ["data","nach Backup",1414561444022000]]}
]

[root@dev-stage-cassandra4 ~]# sstable2json /opt/data/restore_test/inkr_test/restore_test-inkr_test-jb-2-Data.db > /home/sstable2json.log
[root@dev-stage-cassandra4 ~]# sstable2json /opt/data/restore_test/inkr_test/restore_test-inkr_test-jb-1-Data.db > /home/sstable2json2.log

# content from restore_test-inkr_test-jb-1-Data.db
[
{"key": "00000005","columns": [["","",1414560006131000], ["creation_date","2014-10-29 06:20+0100",1414560006131000], ["data","ini. Load",1414560006131000]]},
{"key": "00000001","columns": [["","",1414559990531000], ["creation_date","2014-10-29 06:19+0100",1414559990531000], ["data","ini. Load",1414559990531000]]},
{"key": "00000007","columns": [["","",1414560013034000], ["creation_date","2014-10-29 06:20+0100",1414560013034000], ["data","ini. Load",1414560013034000]]},
{"key": "00000006","columns": [["","",1414560009867000], ["creation_date","2014-10-29 06:20+0100",1414560009867000], ["data","ini. Load",1414560009867000]]},
{"key": "00000003","columns": [["","",1414559999371000], ["creation_date","2014-10-29 06:19+0100",1414559999371000], ["data","ini. Load",1414559999371000]]}
]

# content from restore_test-inkr_test-jb-2-Data.db
[
{"key": "0000000a","columns": [["","",1414561452413000], ["creation_date","2014-10-29 06:44+0100",1414561452413000], ["data","nach Backup",1414561452413000]]},
{"key": "00000008","columns": [["","",1414561440089000], ["creation_date","2014-10-29 06:44+0100",1414561440089000], ["data","nach Backup",1414561440089000]]},
{"key": "00000009","columns": [["","",1414561444022000], ["creation_date","2014-10-29 06:44+0100",1414561444022000], ["data","nach Backup",1414561444022000]]}
]

I assumed that a flush writes all data to the sstables, so that we can use them for backup and restore. Did I forget something, or is my understanding wrong?

Best Regards,
Sebastian

Re: new data not flushed to sstables

Posted by Bryan Talbot <br...@playnext.com>.
On Mon, Nov 3, 2014 at 7:44 AM, Sebastian Martinka <sebastian.martinka@mercateo.com> wrote:

> System and Keyspace Information:
> 4 Nodes
>
> CREATE KEYSPACE restore_test WITH replication = {
>   'class': 'SimpleStrategy',
>   'replication_factor': '3'};
>
> I assumed,  that a flush write all data in the sstables and we can use it
> for backup and restore. Did I forget something or is my understanding
> wrong?

I think you forgot that with N=4 and RF=3, each node will contain
approximately 75% of the data. From a quick eyeball check of the JSON dump
you provided, it looks like each partition key is present on 3 nodes
and absent from 1, which is exactly as expected.

-Bryan
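
The eyeball check described in the reply can be repeated mechanically. The following is only a minimal sketch in Python, not something either poster ran: the hex keys are copied from the sstable2json dumps in the original post, the labels node1 to node4 stand for dev-stage-cassandra1 to dev-stage-cassandra4, and the names keys_by_node and replica_counts are made up for this illustration. It decodes each hex key back to its objid and counts on how many nodes that key was found.

```python
from collections import Counter

# Partition keys (hex, as printed by sstable2json) found in the sstables of
# each node, copied from the dumps in the original post.
keys_by_node = {
    "node1": ["00000001", "00000002", "00000004", "00000006", "00000003",
              "00000008", "00000009"],
    "node2": ["00000005", "00000002", "00000004", "00000007",
              "0000000a", "00000008"],
    "node3": ["00000005", "00000001", "00000002", "00000004", "00000007",
              "00000006", "00000003", "0000000a", "00000009"],
    "node4": ["00000005", "00000001", "00000007", "00000006", "00000003",
              "0000000a", "00000008", "00000009"],
}


def replica_counts(keys_by_node):
    """Return {objid: number of nodes whose sstables contain that key}."""
    counts = Counter()
    for keys in keys_by_node.values():
        # set() guards against a key showing up in several sstables of one node
        for hex_key in set(keys):
            counts[int(hex_key, 16)] += 1  # "0000000a" -> objid 10
    return dict(counts)


counts = replica_counts(keys_by_node)
nodes, rf = 4, 3                 # cluster size and replication_factor from the post
expected_share = rf / nodes      # average fraction of the data held by one node

for objid in sorted(counts):
    print(f"objid {objid:2d}: found on {counts[objid]} of {nodes} nodes")
print(f"expected average share per node: {expected_share:.0%}")
```

With only ten rows, the share held by an individual node varies (between 6 and 9 of the 10 keys in these dumps); the 75% figure is the average over all nodes.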