Posted to user@cassandra.apache.org by Jacob Rhoden <ja...@me.com> on 2014/11/12 23:29:51 UTC

Cassandra patterns/design for setting up a history/version/change log table?

Hi Guys,

Assume you have, for example, an “account” table and an “account_history” table, which simply tracks older versions of what a person’s account looked like each time an administrator edits a customer account.

Given that we don’t have the luxury of a safe transaction to update the account record, i.e. to do:

 - select account details
 - compare old account details with new account details
 - if there are changes to the account:
    - copy old account details to account_history table
    - update account
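
A minimal sketch of the steps above, using in-memory Python structures as stand-ins for the two tables. The names `account`, `account_history`, and `save_account` are hypothetical, and nothing here is a Cassandra API. Without a transaction, the read and the two writes are not atomic, so two concurrent saves could interleave between the compare and the update.

```python
from datetime import datetime, timezone

# In-memory stand-ins for the two tables (hypothetical; not a Cassandra API).
account = {}            # account_id -> dict of fields
account_history = []    # list of (account_id, changed_at, old_fields)

def save_account(account_id, new_fields, now=None):
    """Copy the old version to history only if something changed, then update."""
    now = now or datetime.now(timezone.utc)
    old_fields = account.get(account_id)        # select account details
    if old_fields == new_fields:                # compare old details with new
        return False                            # no changes: write nothing
    if old_fields is not None:
        # copy old account details to the history "table"
        account_history.append((account_id, now, dict(old_fields)))
    account[account_id] = dict(new_fields)      # update account
    return True
```

The return value only reports whether anything was written; a save that changes no fields leaves both structures untouched.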

How do people deal with this in a multi data centre environment? The closest thing I can think of is something like this on “save”:

 - insert new record into account_history table
 - update record in account table
 - every hour, look for duplicate rows in the account_history table and remove those where someone did a save that did not change any fields on the account table.
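
The hourly clean-up step could look roughly like the sketch below. It assumes each history row is an `(account_id, changed_at, fields)` tuple holding the version that was saved, and it drops any row whose fields match the previous version of the same account. The function name `remove_noop_versions` and the row layout are assumptions for illustration only.

```python
from itertools import groupby
from operator import itemgetter

def remove_noop_versions(history_rows):
    """Drop history rows whose fields match the previous version of the same account.

    history_rows: iterable of (account_id, changed_at, fields) tuples.
    """
    kept = []
    # Order by account, then by time, so each account's versions are adjacent.
    ordered = sorted(history_rows, key=itemgetter(0, 1))
    for _, versions in groupby(ordered, key=itemgetter(0)):
        previous = None
        for row in versions:
            if row[2] != previous:      # fields differ from the last kept version
                kept.append(row)
                previous = row[2]
    return kept
```

Only consecutive identical versions are collapsed, so an account that changes and later changes back keeps all three rows.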

My biggest problem with the above is what happens if you want to bulk load a data file into your account table and it contains, for example, 1 million records but only actually changes 100 account entries. For bulk loading you could probably resort to doing a “select before update” just to prevent 1 million pointless inserts into the account_history table, but that feels a bit yucky. Some sort of Java stored procedure might help here, but surely this is a common enough use case that we shouldn’t have to write custom Java code for Cassandra, right?
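
For reference, the “select before update” filter for a bulk load amounts to something like this sketch. Here `current` is a plain dict standing in for a lookup against the account table (in practice one read per key), and `changed_records` is a hypothetical helper name.

```python
def changed_records(incoming, current):
    """Yield (account_id, fields) only for records that differ from what is stored.

    incoming: dict of account_id -> fields from the data file.
    current:  dict of account_id -> fields standing in for the account table.
    """
    for account_id, fields in incoming.items():
        if current.get(account_id) != fields:   # new account or changed fields
            yield account_id, fields
```

Only the records this yields would then go through the history insert and the account update.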

Thanks!
Jacob