Posted to dev@cassandra.apache.org by Mark Robson <ma...@gmail.com> on 2009/11/30 10:58:47 UTC

Re: Access counts (was: The concurrent access problem and solutions)

Personally I'd have each server record the access counts itself in local
storage for a while, then push them up to Cassandra under a column name which
is unique to that server and push instance. This delays the update of the
access counts; I assume that is OK.

So we'd see something like

product21_access_count:{'server1_time12345678':42,
'server2_time123124':99,'server3_time123127385'} ...

Now someone who wants to know the exact count can just read the entire row
and add them up.
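
As a rough sketch of the above (not real Cassandra client code): a plain
Python dict stands in for the column family, and the names
LocalAccessCounter, push and exact_count are made up for illustration. It
assumes a server pushes at most once per second, so that server id plus
timestamp is enough to make the column name unique.

```python
import time


class LocalAccessCounter:
    """Counts accesses locally on one server until the next push."""

    def __init__(self, server_id):
        self.server_id = server_id
        self.pending = {}  # product -> accesses since the last push

    def record(self, product):
        self.pending[product] = self.pending.get(product, 0) + 1

    def push(self, store, now=None):
        # Column name is unique to this server and this push
        # (assumption: no more than one push per second per server).
        stamp = int(time.time() if now is None else now)
        for product, count in self.pending.items():
            row = store.setdefault(f"{product}_access_count", {})
            row[f"{self.server_id}_time{stamp}"] = count
        self.pending = {}


def exact_count(store, product):
    # Read the entire row and add up every column value.
    return sum(store.get(f"{product}_access_count", {}).values())


store = {}  # hypothetical stand-in for the Cassandra column family
s1 = LocalAccessCounter("server1")
s2 = LocalAccessCounter("server2")
for _ in range(42):
    s1.record("product21")
for _ in range(99):
    s2.record("product21")
s1.push(store, now=12345678)
s2.push(store, now=123124)
print(exact_count(store, "product21"))
```

No server ever writes to a column another server writes to, so nothing here
needs a lock or a read-modify-write.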

Of course, over time this will consume more and more storage, so a
summarisation process (of which you run just one instance, or which has a
protocol to avoid summarising the same items at once) can come along and
consolidate them into a single count. Then you get:

product21_access_count:{'total':141}

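And if any of the individual servers were pushing more data at the same
time, that's fine too: each push goes into a new, uniquely named column, so
it just sits alongside 'total' until the next summarisation.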
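
A minimal sketch of that summarisation step, again with a plain dict
standing in for the row and a made-up function name (consolidate). The
summariser only deletes the columns it actually read, so a column pushed
in the meantime survives. The column 'server1_time12345999' and its count
are hypothetical, just to show a push arriving mid-summarisation.

```python
def consolidate(row, snapshot):
    """Fold the columns seen in `snapshot` into a single 'total' column."""
    # Any earlier 'total' is part of the snapshot, so it is summed too.
    row["total"] = sum(snapshot.values())
    # Delete only what was read; columns pushed after the snapshot stay.
    for name in snapshot:
        if name != "total":
            row.pop(name, None)


row = {"server1_time12345678": 42, "server2_time123124": 99}

snapshot = dict(row)               # summariser reads the row
row["server1_time12345999"] = 5    # hypothetical push arriving meanwhile
consolidate(row, snapshot)

print(row)
print(sum(row.values()))
```

The sketch ignores the gap between writing 'total' and deleting the old
columns, during which a reader adding up the row would see both.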

Mark