Posted to commits@cassandra.apache.org by "Piush (Jira)" <ji...@apache.org> on 2019/10/26 21:31:00 UTC

[jira] [Updated] (CASSANDRA-15378) Data not cleaned up from disk for SSTables after compaction

     [ https://issues.apache.org/jira/browse/CASSANDRA-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piush updated CASSANDRA-15378:
------------------------------
    Description: 
Hello Team,

We have an application that creates data in a column family (cf) and frequently deletes it by partition key. We have set gc_grace_seconds to a low value (2 minutes) so that tombstones on the cf are evicted quickly.

 

We are noticing that even though the number of records in the cf is 0, data is left behind on disk in the Cassandra data directory for that cf.

 

Size on the file system for the cfs subscriber_event_by_id_shadow and subscriber_event_shadow:

112M subscriber_event_by_id_shadow-4f08b880f59311e98530a93a5d955b83
129M subscriber_event_shadow-4e7b1e80f59311e98530a93a5d955b83

We see 0 records in these tables:

 

cqlsh:apim> select count(id) from subscriber_event_shadow;

 count
-------
     0

(1 rows)

Warnings :
Aggregation query used without partition key

cqlsh:apim> select count(id) from subscriber_event_by_id_shadow;

 count
-------
     0

(1 rows)

 

Schema for the cfs:

CREATE TABLE apim.subscriber_event_by_id_shadow (
    transaction_id uuid,
    shadow_version text,
    id uuid,
    namespace text,
    generated_at timeuuid,
    api_version text,
    created_at timestamp,
    event text,
    event_type text,
    filter text,
    metadata map<text, text>,
    name text,
    occ_keys list<text>,
    operation text,
    payload blob,
    retries int,
    scope text,
    shadow boolean,
    shadow_id timeuuid,
    shadow_metadata map<text, text>,
    state text,
    summary text,
    title text,
    type text,
    updated_at timestamp,
    url text,
    PRIMARY KEY (transaction_id, shadow_version, id, namespace, generated_at)
) WITH CLUSTERING ORDER BY (shadow_version ASC, id ASC, namespace ASC, generated_at ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': '10'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4', 'tombstone_threshold': '0.1', 'unchecked_tombstone_compaction': 'true'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 120
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

gc_grace_seconds is set to 120 (2 minutes), so my understanding is that the tombstones should have been evicted and the disk space reclaimed.
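
To make that expectation concrete, here is a minimal Python sketch of how I understand the eligibility rules. It is a simplified model, not Cassandra code: `SSTableInfo` and both function names are hypothetical, and `tombstone_compaction_interval` is assumed to be at its documented default of 86400 seconds because the schema above does not set it.

```python
from dataclasses import dataclass

GC_GRACE_SECONDS = 120       # from the schema above
TOMBSTONE_THRESHOLD = 0.1    # from the schema above
# Assumption: not set in the schema, so the documented default (1 day) is used.
TOMBSTONE_COMPACTION_INTERVAL = 86_400


@dataclass
class SSTableInfo:
    """Hypothetical summary of one SSTable; not a Cassandra type."""
    created_at: int                   # creation time, seconds since epoch
    droppable_tombstone_ratio: float  # estimated share of droppable tombstones


def tombstone_is_droppable(deleted_at: int, now: int,
                           gc_grace_seconds: int = GC_GRACE_SECONDS) -> bool:
    """A tombstone may be dropped by compaction once gc_grace_seconds have passed."""
    return now - deleted_at > gc_grace_seconds


def eligible_for_tombstone_compaction(
        sstable: SSTableInfo, now: int,
        threshold: float = TOMBSTONE_THRESHOLD,
        interval: int = TOMBSTONE_COMPACTION_INTERVAL) -> bool:
    """Simplified check for a single-SSTable tombstone compaction."""
    old_enough = now - sstable.created_at > interval
    return old_enough and sstable.droppable_tombstone_ratio > threshold


now = 1_000_000
# Deleted 5 minutes ago: past gc_grace_seconds, so the tombstone is droppable.
print(tombstone_is_droppable(deleted_at=now - 300, now=now))
# SSTable written 1 hour ago: too young for a tombstone compaction in this model.
recent = SSTableInfo(created_at=now - 3600, droppable_tombstone_ratio=0.9)
print(eligible_for_tombstone_compaction(recent, now))
```

In this model a tombstone older than gc_grace_seconds is only droppable, not dropped; the space is reclaimed when a compaction actually rewrites the SSTable that holds it.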

 

However, the cf's data directory has the following contents on the file system:

bash-4.2$ cd subscriber_event_shadow-4e7b1e80f59311e98530a93a5d955b83
bash-4.2$ du -sh *
4.0K backups
4.0K mc-102-big-CompressionInfo.db
22M mc-102-big-Data.db
4.0K mc-102-big-Digest.crc32
4.0K mc-102-big-Filter.db
8.0K mc-102-big-Index.db
8.0K mc-102-big-Statistics.db
4.0K mc-102-big-Summary.db
4.0K mc-102-big-TOC.txt
4.0K mc-103-big-CompressionInfo.db
4.5M mc-103-big-Data.db
4.0K mc-103-big-Digest.crc32
4.0K mc-103-big-Filter.db
4.0K mc-103-big-Index.db
8.0K mc-103-big-Statistics.db
4.0K mc-103-big-Summary.db
4.0K mc-103-big-TOC.txt
4.0K mc-104-big-CompressionInfo.db
4.0K mc-104-big-Data.db
4.0K mc-104-big-Digest.crc32
4.0K mc-104-big-Filter.db
4.0K mc-104-big-Index.db
8.0K mc-104-big-Statistics.db
4.0K mc-104-big-Summary.db
4.0K mc-104-big-TOC.txt
8.0K mc-95-big-CompressionInfo.db
52M mc-95-big-Data.db
4.0K mc-95-big-Digest.crc32
4.0K mc-95-big-Filter.db
8.0K mc-95-big-Index.db
8.0K mc-95-big-Statistics.db
4.0K mc-95-big-Summary.db
4.0K mc-95-big-TOC.txt
8.0K mc-96-big-CompressionInfo.db
51M mc-96-big-Data.db
4.0K mc-96-big-Digest.crc32
4.0K mc-96-big-Filter.db
12K mc-96-big-Index.db
8.0K mc-96-big-Statistics.db
4.0K mc-96-big-Summary.db
4.0K mc-96-big-TOC.txt
4.0K snapshots
bash-4.2$

We are not able to figure out why .db files holding about 50 MB of data each remain on disk.
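
As a quick sanity check on the listing, the sketch below adds up the `-Data.db` entries from the `du -sh *` output. The helper names are hypothetical, the sizes are copied from the listing, and `du -h` values are rounded, so the total is only approximate.

```python
# Multipliers for the size suffixes printed by `du -h` (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> float:
    """Convert a `du -h` size such as '4.0K' or '22M' to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def total_data_bytes(du_output: str) -> float:
    """Sum the sizes of all lines whose file name ends in '-Data.db'."""
    total = 0.0
    for line in du_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].endswith("-Data.db"):
            total += parse_size(parts[0])
    return total


# The Data.db lines from the listing above.
DU_OUTPUT = """\
22M mc-102-big-Data.db
4.5M mc-103-big-Data.db
4.0K mc-104-big-Data.db
52M mc-95-big-Data.db
51M mc-96-big-Data.db
"""

print(f"{total_data_bytes(DU_OUTPUT) / 1024**2:.1f} MiB")  # prints "129.5 MiB"
```

The Data.db files alone add up to roughly the 129M reported for the subscriber_event_shadow directory, so almost all of the space is held by SSTable data rather than by the index, filter, or summary components.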

  was:
Hello Team,

We have an application that creates data in a column family (cf) and frequently deletes it by partition key. We have set gc_grace_seconds to a low value (2 minutes) so that tombstones on the cf are evicted quickly.

 

We are noticing that even though the number of records in the cf is 0, data is left behind on disk in the Cassandra data directory for that cf.

 

Size on the file system for the cfs subscriber_event_by_id_shadow and subscriber_event_shadow:

```
112M subscriber_event_by_id_shadow-4f08b880f59311e98530a93a5d955b83
129M subscriber_event_shadow-4e7b1e80f59311e98530a93a5d955b83
```

We see 0 records in these tables:

```
cqlsh:apim> select count(*) from subscriber_event_shadow;

 count
-------
     0

(1 rows)

Warnings :
Aggregation query used without partition key

cqlsh:apim> select count(*) from subscriber_event_by_id_shadow;

 count
-------
     0

(1 rows)
```

Schema for the cfs:

```
CREATE TABLE apim.subscriber_event_by_id_shadow (
    transaction_id uuid,
    shadow_version text,
    id uuid,
    namespace text,
    generated_at timeuuid,
    api_version text,
    created_at timestamp,
    event text,
    event_type text,
    filter text,
    metadata map<text, text>,
    name text,
    occ_keys list<text>,
    operation text,
    payload blob,
    retries int,
    scope text,
    shadow boolean,
    shadow_id timeuuid,
    shadow_metadata map<text, text>,
    state text,
    summary text,
    title text,
    type text,
    updated_at timestamp,
    url text,
    PRIMARY KEY (transaction_id, shadow_version, id, namespace, generated_at)
) WITH CLUSTERING ORDER BY (shadow_version ASC, id ASC, namespace ASC, generated_at ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': '10'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4', 'tombstone_threshold': '0.1', 'unchecked_tombstone_compaction': 'true'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 120
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
```

gc_grace_seconds is set to 120 (2 minutes), so my understanding is that the tombstones should have been evicted and the disk space reclaimed.



 

However, the cf's data directory has the following contents on the file system:

bash-4.2$ cd subscriber_event_shadow-4e7b1e80f59311e98530a93a5d955b83
bash-4.2$ du -sh *
4.0K backups
4.0K mc-102-big-CompressionInfo.db
22M mc-102-big-Data.db
4.0K mc-102-big-Digest.crc32
4.0K mc-102-big-Filter.db
8.0K mc-102-big-Index.db
8.0K mc-102-big-Statistics.db
4.0K mc-102-big-Summary.db
4.0K mc-102-big-TOC.txt
4.0K mc-103-big-CompressionInfo.db
4.5M mc-103-big-Data.db
4.0K mc-103-big-Digest.crc32
4.0K mc-103-big-Filter.db
4.0K mc-103-big-Index.db
8.0K mc-103-big-Statistics.db
4.0K mc-103-big-Summary.db
4.0K mc-103-big-TOC.txt
4.0K mc-104-big-CompressionInfo.db
4.0K mc-104-big-Data.db
4.0K mc-104-big-Digest.crc32
4.0K mc-104-big-Filter.db
4.0K mc-104-big-Index.db
8.0K mc-104-big-Statistics.db
4.0K mc-104-big-Summary.db
4.0K mc-104-big-TOC.txt
8.0K mc-95-big-CompressionInfo.db
52M mc-95-big-Data.db
4.0K mc-95-big-Digest.crc32
4.0K mc-95-big-Filter.db
8.0K mc-95-big-Index.db
8.0K mc-95-big-Statistics.db
4.0K mc-95-big-Summary.db
4.0K mc-95-big-TOC.txt
8.0K mc-96-big-CompressionInfo.db
51M mc-96-big-Data.db
4.0K mc-96-big-Digest.crc32
4.0K mc-96-big-Filter.db
12K mc-96-big-Index.db
8.0K mc-96-big-Statistics.db
4.0K mc-96-big-Summary.db
4.0K mc-96-big-TOC.txt
4.0K snapshots
bash-4.2$

We are not able to figure out why .db files holding about 50 MB of data each remain on disk.


> Data not cleaned up from disk for SSTables after compaction
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-15378
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15378
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Piush
>            Priority: Normal



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org