You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Avi Levi <av...@indeni.com> on 2017/10/02 14:50:35 UTC

Creating materialized view for bulk selections

I have metrics streaming in load of 10mil per minute.
I am using ttl in order to maintain the retention.
I have a cleanup process that needs to clean resources from other places
(not cassandra) when that key is no longer exists (i.e the it's ttl reach
it's due).

for that I thought to maintain table metric_retention((key), expire_time)
this table will keep getting inserts will either add new record or update

*insert into metric_retention (key, expire_time) values ([metric key],
[metric timestamp]+ttl)*

I thought of creating materialized view

CREATE MATERIALIZED VIEW metricsdb.mv_metric_retention AS
    SELECT exp_time, mkey
    FROM *metric_retention*
    WHERE exp_time IS NOT NULL
    PRIMARY KEY (exp_time, mkey)
    WITH CLUSTERING ORDER BY (mkey ASC)

most of queries  will have the following shape
*"select key from metric_retention where Token(expire_time ) < Token(now)" *

of course once the resources are cleared I need also to delete those keys
from this table as well.

I do expect that reads will not be very often (couple of times a day) but
inserts/updates as I stated will suffer heavy load (10mil per minute) .

creating a materialized view over the metric_retaention table is the proper
approach here ?
are there any pitfalls to look out from ?

Thanks
Avi