Posted to user@cassandra.apache.org by Ahmy Yulrizka <yu...@gmail.com> on 2014/11/18 15:38:51 UTC

Counter Deletion in a window

Hi

I'm very new to Cassandra and I've created a schema like this:

CREATE TABLE statistics.stats_count (
    id text,
    metric text,
    resolution text,
    time timestamp,
    value counter,
    PRIMARY KEY ((id, metric, resolution), time)
)

Below is an example of the data.
It simply tracks a counter of POST API calls for a user at two
resolutions (minute and hour).


 id     | metric     | resolution | time                     | value
--------+------------+------------+--------------------------+--------
 user-1 |  POST-call |     minute | 2000-12-15 06:07:00+0100 |      1
 user-1 |  POST-call |     minute | 2000-12-15 06:08:00+0100 |      1
 user-1 |  POST-call |       hour | 2000-12-15 06:00:00+0100 |      2


This schema makes it easy for me to propagate a value to a coarser
resolution: every time I increment a minute bucket, I also increment the
hour bucket that contains it.
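
As a sketch of that write pattern, a single POST call at 06:08 would
increment both the minute bucket and the hour bucket that contains it. The
literal values come from the sample rows above; truncating the timestamp to
the start of each bucket is assumed to happen in the client.

```cql
-- Minute bucket: timestamp truncated to the start of the minute.
UPDATE statistics.stats_count
   SET value = value + 1
 WHERE id = 'user-1' AND metric = 'POST-call' AND resolution = 'minute'
   AND time = '2000-12-15 06:08:00+0100';

-- Hour bucket: same event, timestamp truncated to the start of the hour.
UPDATE statistics.stats_count
   SET value = value + 1
 WHERE id = 'user-1' AND metric = 'POST-call' AND resolution = 'hour'
   AND time = '2000-12-15 06:00:00+0100';
```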

I have a couple of requirements:

1. Each resolution has its own TTL: one week for minute, one month for hour.
2. I never update a counter outside of its TTL. For example, I never update
minute data that is more than one week old.

I just found out that I cannot set a TTL on a counter. I was planning to
do an update followed by a delete to clean up.
For example:
  * update id: user-1, metric: POST-call, resolution: minute, time: 2000-12-15
06:08:00+0100, value: value + 1
  * delete id: user-1, metric: POST-call, resolution: minute, time: 2000-12-08
06:08:00+0100 (one week earlier)
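
In CQL, the cleanup step of that example might look like the sketch below.
It assumes the client computes the timestamp of the bucket that has just
aged out (here, exactly one week before the bucket being incremented).

```cql
-- Remove the minute bucket that is now one week old.
DELETE FROM statistics.stats_count
 WHERE id = 'user-1' AND metric = 'POST-call' AND resolution = 'minute'
   AND time = '2000-12-08 06:08:00+0100';
```

Cassandra's counter documentation warns against incrementing a counter
again after it has been deleted; requirement 2 above already avoids that.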

So my questions are:

1. Is this a proper use of a counter?
2. Would the delete operations have an impact on performance?
3. Would it be better not to use a counter, and instead use an integer
column with a TTL?

I would really appreciate some advice.

Thank you.




Ahmy Yulrizka
http://ahmy.yulrizka.com
@yulrizka

Re: Counter Deletion in a window

Posted by Tyler Hobbs <ty...@datastax.com>.
On Tue, Nov 18, 2014 at 8:38 AM, Ahmy Yulrizka <yu...@gmail.com> wrote:

>
>
> 1. Is this a proper use of a counter
>

It seems reasonable.


> 2. Would the delete operation have an impact on performance?
>

Depending on how you query the data, no.  If you restrict the query to not
cover times where you have deleted the counters, the only impact should be
to compactions and repairs.
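
A query restricted in that way might look like the following sketch. It
assumes minute rows older than one week have already been deleted, so the
lower bound keeps the read away from the deleted range; the bounds
themselves are illustrative.

```cql
-- Read only the last week of minute buckets for one user and metric.
SELECT time, value
  FROM statistics.stats_count
 WHERE id = 'user-1' AND metric = 'POST-call' AND resolution = 'minute'
   AND time >  '2000-12-08 06:08:00+0100'
   AND time <= '2000-12-15 06:08:00+0100';
```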


> 3. Is it better if I don't use a counter? Use an integer column type and
> use the TTL?


Are you getting multiple updates to the value from different sources?  If
so, you need to use a counter (or pick a master somewhere to coordinate
updates).  If you're only going to update each counter from a single source
and you always have the full count (not just a delta), normal ints are a
better choice.
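
For the single-writer case, a rough sketch with a plain int column and a
per-write TTL follows. The table name stats_total is hypothetical, and the
TTL values assume one week = 604800 seconds and one month = 30 days =
2592000 seconds.

```cql
-- Hypothetical non-counter variant of the table; TTLs are allowed here.
CREATE TABLE statistics.stats_total (
    id text,
    metric text,
    resolution text,
    time timestamp,
    value int,
    PRIMARY KEY ((id, metric, resolution), time)
);

-- The writer supplies the full count (not a delta); the row expires on its own.
INSERT INTO statistics.stats_total (id, metric, resolution, time, value)
VALUES ('user-1', 'POST-call', 'minute', '2000-12-15 06:08:00+0100', 1)
USING TTL 604800;

INSERT INTO statistics.stats_total (id, metric, resolution, time, value)
VALUES ('user-1', 'POST-call', 'hour', '2000-12-15 06:00:00+0100', 2)
USING TTL 2592000;
```

Each INSERT overwrites the previous value for that bucket, which is why
this only works when one source always knows the full count.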


-- 
Tyler Hobbs
DataStax <http://datastax.com/>