Posted to user@cassandra.apache.org by "Thakrar, Jayesh" <jt...@conversantmedia.com> on 2018/01/25 16:05:23 UTC

TWCS not deleting expired sstables

Wondering if I can get some pointers to what's happening here and why sstables that I think should be expired are not being dropped.

Here's the table's compaction property. Note that I have also set "unchecked_tombstone_compaction" to true.

compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '7', 'compaction_window_unit': 'DAYS', 'max_threshold': '4', 'min_threshold': '4', 'unchecked_tombstone_compaction': 'true'}

We insert data with a timestamp and a TTL programmatically.

Here's one set of sstables that I expect to be removed:

$ ls -lt *Data.db | tail -5
-rw-r--r--. 1 vchadoop vchadoop  31245097312 Sep 20 17:16 mc-1308-big-Data.db
-rw-r--r--. 1 vchadoop vchadoop  31524316252 Sep 19 14:27 mc-1187-big-Data.db
-rw-r--r--. 1 vchadoop vchadoop  21405216502 Sep 18 14:14 mc-1070-big-Data.db
-rw-r--r--. 1 vchadoop vchadoop  13609890747 Sep 13 20:53 mc-178-big-Data.db

$ date +%s
1516895877

$ date
Thu Jan 25 15:58:00 UTC 2018

$ sstablemetadata $PWD/mc-130-big-Data.db | head -20
SSTable: /ae/disk1/data/ae/raw_logs_by_user-f58b9960980311e79ac26928246f09c1/mc-130-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.010000
Minimum timestamp: 1496602800000000
Maximum timestamp: 1498078800000000
SSTable min local deletion time: 1507924954
SSTable max local deletion time: 1509400832
Compressor: org.apache.cassandra.io.compress.LZ4Compressor
Compression ratio: 0.17430158132352797
TTL min: 2630598
TTL max: 4086188
First token: -9177441867697829836 (key=823134638755651936)
Last token: 9155171035305804798 (key=395118640769012487)
minClustringValues: [-1, da, 3, 1498082382078, -9223371818124305448, -9223371652504795402, -1]
maxClustringValues: [61818, tpt, 325, 1496602800000, -4611686088173246790, 9223372014135560885, 1]
Estimated droppable tombstones: 1.1983492967652476
SSTable Level: 0
Repaired at: 0
Replay positions covered: {CommitLogPosition(segmentId=1505171071629, position=7157684)=CommitLogPosition(segmentId=1505171075152, position=6263269)}
totalColumnsSet: 111047277
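
As a quick sanity check on the numbers above, the shell arithmetic below compares the `date +%s` value with the "SSTable max local deletion time" and measures how wide the write-timestamp range is. It is only a sketch: the variable names are made up for illustration, and the values are copied from the output in this post. It does not explain why the file is still on disk.

```shell
# Values copied from the `date +%s` and sstablemetadata output in the post.
now=1516895877                 # current time, seconds since epoch
max_ldt=1509400832             # SSTable max local deletion time (seconds)
min_ts_us=1496602800000000     # Minimum timestamp (microseconds)
max_ts_us=1498078800000000     # Maximum timestamp (microseconds)

# Whole days since the last cell in the sstable passed its TTL.
expired_days=$(( (now - max_ldt) / 86400 ))

# Whole days between the oldest and newest write timestamps in the file.
span_days=$(( (max_ts_us - min_ts_us) / 1000000 / 86400 ))

echo "last cell expired ${expired_days} days ago"
echo "write timestamps span ${span_days} days"
```

With these inputs the last cell expired 86 days before the `date` call, which is well past the default gc_grace_seconds of 10 days (the table's actual setting is not shown in the thread). The write timestamps span 17 days, which is wider than the configured 7-day window.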


Re: TWCS not deleting expired sstables

Posted by "Thakrar, Jayesh" <jt...@conversantmedia.com>.
Thank you so much Kurt - that helped!!

Here's the thing: I was logged into the server under my own ID and had the Cassandra binaries under my home directory with the default configuration.
Cassandra itself was running on the server under a service account, with its own user ID and configuration.

I assumed that, like nodetool, this command would also go through the Cassandra service's JMX or server port.
However, your suggestion made me realize that's not the case.

So I copied the service account's Cassandra/conf to my local install and things worked.

I also looked at the source code, and it does not show any "connection" via a service or JMX port :)

So now I need to see why the data is not getting purged.

Thanks again!!

From: kurt greaves <ku...@instaclustr.com>
Date: Wednesday, January 31, 2018 at 11:27 PM
To: User <us...@cassandra.apache.org>
Subject: Re: TWCS not deleting expired sstables

Well, that shouldn't happen. Seems like it's possibly not looking in the correct location for data directories. Try setting CASSANDRA_INCLUDE=<path to cassandra.in.sh> prior to running the script?
e.g.: CASSANDRA_INCLUDE=<path_to_cassandra_bin>/cassandra.in.sh sstableexpiredblockers ae raw_logs_by_user

On 30 January 2018 at 15:34, Thakrar, Jayesh <jt...@conversantmedia.com> wrote:
Thanks Kurt and Kenneth.

Now if only they would work as expected.

node111.ord.ae.tsg.cnvr.net:/ae/disk1/data/ae/raw_logs_by_user-f58b9960980311e79ac26928246f09c1>ls -lt | tail
-rw-r--r--. 1 vchadoop vchadoop    286889260 Sep 18 14:14 mc-1070-big-Index.db
-rw-r--r--. 1 vchadoop vchadoop        12236 Sep 13 20:53 mc-178-big-Statistics.db
-rw-r--r--. 1 vchadoop vchadoop           92 Sep 13 20:53 mc-178-big-TOC.txt
-rw-r--r--. 1 vchadoop vchadoop      9371211 Sep 13 20:53 mc-178-big-CompressionInfo.db
-rw-r--r--. 1 vchadoop vchadoop           10 Sep 13 20:53 mc-178-big-Digest.crc32
-rw-r--r--. 1 vchadoop vchadoop  13609890747 Sep 13 20:53 mc-178-big-Data.db
-rw-r--r--. 1 vchadoop vchadoop      1394968 Sep 13 20:53 mc-178-big-Summary.db
-rw-r--r--. 1 vchadoop vchadoop     11172592 Sep 13 20:53 mc-178-big-Filter.db
-rw-r--r--. 1 vchadoop vchadoop    190508739 Sep 13 20:53 mc-178-big-Index.db
drwxr-xr-x. 2 vchadoop vchadoop           10 Sep 12 21:47 backups

node111.ord.ae.tsg.cnvr.net:/ae/disk1/data/ae/raw_logs_by_user-f58b9960980311e79ac26928246f09c1>sstableexpiredblockers ae raw_logs_by_user
Exception in thread "main" java.lang.IllegalArgumentException: Unknown keyspace/table ae.raw_logs_by_user
                at org.apache.cassandra.tools.SSTableExpiredBlockers.main(SSTableExpiredBlockers.java:66)

node111.ord.ae.tsg.cnvr.net:/ae/disk1/data/ae/raw_logs_by_user-f58b9960980311e79ac26928246f09c1>sstableexpiredblockers system peers
No sstables for system.peers

node111.ord.ae.tsg.cnvr.net:/ae/disk1/data/ae/raw_logs_by_user-f58b9960980311e79ac26928246f09c1>ls -l ../../system/peers-37f71aca7dc2383ba70672528af04d4f/
total 308
drwxr-xr-x. 2 vchadoop vchadoop     10 Sep 11 22:59 backups
-rw-rw-r--. 1 vchadoop vchadoop     83 Jan 25 02:11 mc-137-big-CompressionInfo.db
-rw-rw-r--. 1 vchadoop vchadoop 180369 Jan 25 02:11 mc-137-big-Data.db
-rw-rw-r--. 1 vchadoop vchadoop     10 Jan 25 02:11 mc-137-big-Digest.crc32
-rw-rw-r--. 1 vchadoop vchadoop     64 Jan 25 02:11 mc-137-big-Filter.db
-rw-rw-r--. 1 vchadoop vchadoop    386 Jan 25 02:11 mc-137-big-Index.db
-rw-rw-r--. 1 vchadoop vchadoop   5171 Jan 25 02:11 mc-137-big-Statistics.db
-rw-rw-r--. 1 vchadoop vchadoop     56 Jan 25 02:11 mc-137-big-Summary.db
-rw-rw-r--. 1 vchadoop vchadoop     92 Jan 25 02:11 mc-137-big-TOC.txt
-rw-rw-r--. 1 vchadoop vchadoop     43 Jan 29 21:11 mc-138-big-CompressionInfo.db
-rw-rw-r--. 1 vchadoop vchadoop   9723 Jan 29 21:11 mc-138-big-Data.db
-rw-rw-r--. 1 vchadoop vchadoop     10 Jan 29 21:11 mc-138-big-Digest.crc32
-rw-rw-r--. 1 vchadoop vchadoop     16 Jan 29 21:11 mc-138-big-Filter.db
-rw-rw-r--. 1 vchadoop vchadoop     17 Jan 29 21:11 mc-138-big-Index.db
-rw-rw-r--. 1 vchadoop vchadoop   5015 Jan 29 21:11 mc-138-big-Statistics.db
-rw-rw-r--. 1 vchadoop vchadoop     56 Jan 29 21:11 mc-138-big-Summary.db
-rw-rw-r--. 1 vchadoop vchadoop     92 Jan 29 21:11 mc-138-big-TOC.txt
-rw-rw-r--. 1 vchadoop vchadoop     43 Jan 29 21:53 mc-139-big-CompressionInfo.db
-rw-rw-r--. 1 vchadoop vchadoop  18908 Jan 29 21:53 mc-139-big-Data.db
-rw-rw-r--. 1 vchadoop vchadoop     10 Jan 29 21:53 mc-139-big-Digest.crc32
-rw-rw-r--. 1 vchadoop vchadoop     16 Jan 29 21:53 mc-139-big-Filter.db
-rw-rw-r--. 1 vchadoop vchadoop     36 Jan 29 21:53 mc-139-big-Index.db
-rw-rw-r--. 1 vchadoop vchadoop   5055 Jan 29 21:53 mc-139-big-Statistics.db
-rw-rw-r--. 1 vchadoop vchadoop     56 Jan 29 21:53 mc-139-big-Summary.db
-rw-rw-r--. 1 vchadoop vchadoop     92 Jan 29 21:53 mc-139-big-TOC.txt



From: Kenneth Brotman <ke...@yahoo.com.INVALID>
Date: Tuesday, January 30, 2018 at 7:37 AM
To: <us...@cassandra.apache.org>
Subject: RE: TWCS not deleting expired sstables

Wow!  It’s in the DataStax documentation: https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSStabExpiredBlockers.html

Other nice tools there as well: https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTableUtilitiesTOC.html

Kenneth Brotman

From: kurt greaves [mailto:kurt@instaclustr.com]
Sent: Monday, January 29, 2018 8:20 PM
To: User
Subject: Re: TWCS not deleting expired sstables

Likely a read repair caused old data to be brought into a newer SSTable. Try running sstableexpiredblockers to find out if there's a newer SSTable blocking that one from being dropped.
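
To make the "blocking" idea concrete: as far as I understand Cassandra's expiry check, a fully expired sstable is only dropped when its newest write timestamp is older than the oldest write timestamp in every overlapping sstable that still holds live data. The sketch below shows just that comparison. The function name `is_blocked` and the second sstable's timestamps are hypothetical; only 1498078800000000 comes from the sstablemetadata output in this thread.

```shell
# is_blocked CANDIDATE_MAX_TS OTHER_MIN_TS   (both in microseconds)
# Prints "blocked" if the other sstable contains a write timestamp that is
# not newer than the candidate's newest one, otherwise "droppable".
is_blocked() {
    if [ "$2" -le "$1" ]; then
        echo blocked
    else
        echo droppable
    fi
}

candidate_max_ts=1498078800000000   # "Maximum timestamp" from the thread

# Hypothetical newer sstable that picked up an old cell via read repair.
is_blocked "$candidate_max_ts" 1497000000000000

# Hypothetical newer sstable whose oldest cell is newer than the candidate.
is_blocked "$candidate_max_ts" 1500000000000000
```

The first call prints "blocked" and the second prints "droppable". sstableexpiredblockers reports the real sstables that play the role of the second argument here.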


Re: TWCS not deleting expired sstables

Posted by "Thakrar, Jayesh" <jt...@conversantmedia.com>.
Hi Brian,

Thanks for your response.
Yes, I did look at that post.
In fact, after reading that post, I set "unchecked_tombstone_compaction" to true.

For the sstable in my original example (and its neighbors), all the data for those time windows has been compacted into a single sstable, so there is no dependency on, or delay caused by, other sstables.

Thanks,
Jayesh


From: <br...@gmail.com>
Date: Sunday, January 28, 2018 at 1:02 PM
To: <us...@cassandra.apache.org>
Subject: Re: TWCS not deleting expired sstables

I would start here:  http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html

Specifically the “Hints and repairs” and “Timestamp overlap” sections might be of use.
-B

