Posted to user@cassandra.apache.org by Eric Wong <Er...@solvians.com> on 2021/12/14 05:50:14 UTC

Try to understand expires_at in liveness_info

Hello Cassandra guru:

We are using Cassandra 4.0.1.  I want to understand the meaning of liveness_info in sstabledump output.  The following is a sample record from sstabledump:

{
  "partition" : {
    "key" : [ "1065", "25034769", "6" ],
    "position" : 110384220
  },
  "rows" : [
    {
      "type" : "row",
      "position" : 110384255,
      "clustering" : [ "2021-12-03 08:11:00.000Z" ],
      "liveness_info" : { "tstamp" : "2021-12-03T08:11:00Z", "ttl" : 259200, "expires_at" : "2021-12-13T08:34:04Z", "expired" : false },
      "cells" : [
        { "name" : "close", "value" : {"size": 10000, "ts": "2021-12-03 08:11:51.919Z", "value": 132.259} },
        { "name" : "high", "value" : {"size": 10000, "ts": "2021-12-03 08:11:37.852Z", "value": 132.263} },
        { "name" : "low", "value" : {"size": 10000, "ts": "2021-12-03 08:11:21.377Z", "value": 132.251} },
        { "name" : "open", "value" : {"size": 10000, "ts": "2021-12-03 08:11:00.434Z", "value": 132.261} }
      ]
    }
  ]
},

What I am puzzled about is the "expires_at" timestamp.

The TTL is 259200 seconds, which is 3 days.  Yet expires_at is set roughly 10 days after the cell timestamp.  This non-expiring data seems to stay in the database longer than expected, which causes read latency towards the end of the week.
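To make the discrepancy concrete, here is the arithmetic in Python, with the values copied from the sstabledump output above (the assumption that expires_at = write time + TTL is mine, based on my understanding of how TTL works):

```python
from datetime import datetime, timedelta, timezone

# Values taken from the liveness_info in the sstabledump output above
tstamp = datetime(2021, 12, 3, 8, 11, 0, tzinfo=timezone.utc)
expires_at = datetime(2021, 12, 13, 8, 34, 4, tzinfo=timezone.utc)
ttl = timedelta(seconds=259200)  # 3 days

# expires_at is roughly 10 days after the cell timestamp, not 3
print(expires_at - tstamp)  # 10 days, 0:23:04

# If expires_at were computed as (write time + TTL), the write
# would have happened here -- a week after the tstamp shown
print(expires_at - ttl)     # 2021-12-10 08:34:04+00:00
```

Note that the tstamp in liveness_info matches the clustering slot exactly, which is why I wonder whether the timestamp shown and the time used for expiry are set from different clocks.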

I am not sure where the 10-day "expires_at" comes from.  Our table is configured as follows:

CREATE TABLE storage_system.sample_rate (
    market smallint,
    sin bigint,
    field smallint,
    slot timestamp,
    close frozen<pricerecord>,
    high frozen<pricerecord>,
    low frozen<pricerecord>,
    open frozen<pricerecord>,
    PRIMARY KEY ((market, sin, field), slot)
) WITH CLUSTERING ORDER BY (slot ASC)
    AND additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '2', 'compaction_window_unit': 'HOURS', 'max_threshold': '32', 'min_threshold': '4', 'tombstone_compaction_interval': '86400', 'unchecked_tombstone_compaction': 'true', 'unsafe_aggressive_sstable_expiration': 'true'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 86400
    AND extensions = {}
    AND gc_grace_seconds = 3600
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

Our application has logic that gives Monday-to-Thursday records a TTL of one day, and Friday records a TTL of 3 days.

The Monday-to-Thursday records are cleaned up properly.  It is always Friday's data that seems to have an extended expires_at.
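For context, our TTL selection logic is roughly the following (a simplified sketch; the function name and structure are illustrative, not our actual code):

```python
from datetime import date

def ttl_for_slot(slot: date) -> int:
    """Return the record TTL in seconds: one day Monday-Thursday,
    three days on Friday so the data survives the weekend."""
    # date.weekday(): Monday == 0 ... Friday == 4
    if slot.weekday() == 4:
        return 3 * 86400  # Friday: 259200 seconds
    return 86400          # Monday-Thursday: one day

# 2021-12-03 (the sample record above) was a Friday
print(ttl_for_slot(date(2021, 12, 3)))  # 259200
```

This matches the ttl of 259200 shown in the liveness_info of the sample record, so the TTL itself is being applied as intended.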

Thanks in advance to anyone who can provide some pointers on where to look for the problem.
Eric Wong