Posted to user@cassandra.apache.org by Chen Wang <ch...@gmail.com> on 2015/01/13 02:15:57 UTC

Design a system maintaining historical view of users.

Hey Guys,
I am seeking advice on designing a system that maintains a historical view of
a user's activities over the past year. Each user can have different
activities: email_open, email_click, item_view, add_to_cart, purchase, etc.
The query I would like to run is, for example:

Find all customers who browsed item A in the past 6 months and also clicked
an email.

I would like the query to complete in a reasonable time frame (for
example, within 30 minutes to retrieve 10 million such users).
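
To put that target in perspective, here is a quick back-of-the-envelope
calculation of the sustained rate it implies. The function name is only
illustrative, and this counts matching users returned, not rows scanned.

```python
def required_rate(total_users: int, minutes: float) -> float:
    """Matching users that must be returned per second to meet the deadline."""
    return total_users / (minutes * 60)

# 10 million users within 30 minutes
rate = required_rate(10_000_000, 30)
print(f"{rate:.0f} users/second")
```

That works out to roughly 5,500 matching users per second, sustained.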

I can have customer_id as the row key and 'Activity' as the column family, then
store attributes for each activity under that column family, something like:

customer_id, browse:{item_id:12334, timestamp:epoch}
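
To make the layout and the query concrete, here is a minimal in-memory
sketch in Python. It is not Cassandra or HBase code: the `activities` dict,
`record`, and `viewed_and_clicked` are hypothetical stand-ins, the sample
timestamps are made up, and it assumes the email click may have happened at
any time in the retained history (the question above does not say).

```python
from collections import defaultdict

DAY = 86_400  # seconds

# Hypothetical stand-in for the 'Activity' column family:
# customer_id -> list of (activity_type, item_id or None, epoch seconds)
activities = defaultdict(list)

def record(customer_id, activity_type, timestamp, item_id=None):
    """Append one activity to the customer's row."""
    activities[customer_id].append((activity_type, item_id, timestamp))

def viewed_and_clicked(item_id, since_epoch):
    """Customers who viewed item_id at or after since_epoch and also clicked an email."""
    matches = []
    for customer_id, events in activities.items():  # scans every customer row
        viewed = any(
            kind == "item_view" and item == item_id and ts >= since_epoch
            for kind, item, ts in events
        )
        clicked = any(kind == "email_click" for kind, _, _ in events)
        if viewed and clicked:
            matches.append(customer_id)
    return sorted(matches)

now = 1_421_100_000  # arbitrary "current" epoch, used only for this example
record("c1", "item_view", now - 10 * DAY, item_id="A")
record("c1", "email_click", now - 5 * DAY)
record("c2", "item_view", now - 300 * DAY, item_id="A")  # view is older than 6 months
record("c2", "email_click", now - 2 * DAY)
record("c3", "item_view", now - 1 * DAY, item_id="A")    # never clicked an email

result = viewed_and_clicked("A", since_epoch=now - 182 * DAY)
print(result)
```

With customer_id as the only key, this sketch has to visit every customer
row to answer the query, which is the cost I am worried about at scale.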

Is Cassandra a good candidate for such a system? We have an HBase cluster in
place, but it does not seem like a good fit for such queries.

Thanks in advance.
Chen