Posted to user@cassandra.apache.org by Jacob Rhoden <ja...@me.com> on 2014/11/12 23:29:51 UTC
Cassandra patterns/design for setting up a history/version/change log table?
Hi Guys,
Assuming you have, for example, an “account” table, and an “account_history” table which simply tracks older versions of what a person’s account looked like, recorded whenever an administrator edits a customer account.
Given that we don’t have the luxury of a safe transaction to update the account record, i.e. to do:
- select account details
- compare old account details with new account details
- if there are changes to the account:
- copy old account details to account_history table
- update account
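The steps above can be sketched roughly as follows. This is only an illustration: plain Python dicts stand in for the two tables, and the function and variable names (`save_account`, `accounts`, `history`) are made up for the example. Nothing here is atomic, which is exactly the problem being asked about.

```python
import copy

def save_account(accounts, history, account_id, new_details):
    """Read-compare-write. Returns True if a change was recorded.

    `accounts` and `history` are plain dicts standing in for the
    account and account_history tables (an assumption for illustration).
    """
    old_details = accounts.get(account_id)      # select account details
    if old_details == new_details:              # compare old with new
        return False                            # no changes, nothing to do
    if old_details is not None:
        # copy old account details to the history "table"
        history.setdefault(account_id, []).append(copy.deepcopy(old_details))
    accounts[account_id] = copy.deepcopy(new_details)   # update account
    return True

accounts = {"a1": {"name": "Ann", "email": "ann@example.com"}}
history = {}
new_version = {"name": "Ann", "email": "ann@example.org"}
changed = save_account(accounts, history, "a1", new_version)
unchanged = save_account(accounts, history, "a1", new_version)
```

Without a transaction, another writer can modify the account between the read and the two writes, so the history entry and the account row can end up out of step.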
How do people deal with this in a multi data centre environment? The closest thing I can think of is something like this on “save”:
- insert new record into account_history table
- update record in account table
- every hour look for duplicate rows in account_history table and de-duplicate where someone did a save that did not change any fields on the account table.
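As a rough sketch of the hourly clean-up step, assuming each history row carries an account id, a save timestamp and the saved field values (the tuple layout and the name `dedupe_history` are hypothetical, and a list stands in for the table):

```python
def dedupe_history(rows):
    """Drop history rows that repeat the previous version of the same account.

    `rows` is a list of (account_id, saved_at, details) tuples; this
    layout is an assumption made for the example.
    """
    kept = []
    last_seen = {}  # account_id -> details of the most recent kept row
    for account_id, saved_at, details in sorted(rows, key=lambda r: (r[0], r[1])):
        if last_seen.get(account_id) == details:
            continue  # a save that did not change any fields
        kept.append((account_id, saved_at, details))
        last_seen[account_id] = details
    return kept

rows = [
    ("a1", 1, {"email": "x@example.com"}),
    ("a1", 2, {"email": "x@example.com"}),  # no-op save
    ("a1", 3, {"email": "y@example.com"}),
    ("a2", 1, {"email": "z@example.com"}),
]
kept = dedupe_history(rows)
```

Only consecutive identical versions are dropped, so an account that changes and later changes back keeps both entries.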
My biggest problem with the above is: what happens if you want to bulk load a data file into your account table, and it contains, for example, 1 million records but only actually changes 100 account entries? For bulk loading you could probably resort to doing a “select before update” just to prevent 1 million pointless inserts into the account_history table, but that feels a bit yucky. Some sort of Java stored procedure might help here, but surely this is a common enough use case that we shouldn’t have to write custom Java code for Cassandra, right?
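For what it is worth, the “select before update” filter for a bulk load might look something like this. Again a dict stands in for the account table, and `changed_records` is a made-up name; a real load would still pay for one read per record.

```python
def changed_records(accounts, incoming):
    """Yield only the (account_id, details) pairs that differ from what is stored.

    `accounts` is a dict standing in for the account table (an assumption).
    """
    for account_id, details in incoming:
        if accounts.get(account_id) != details:   # the "select" and compare
            yield account_id, details

accounts = {
    "a1": {"email": "x@example.com"},
    "a2": {"email": "z@example.com"},
}
incoming = [
    ("a1", {"email": "x@example.com"}),   # unchanged, skipped
    ("a2", {"email": "new@example.com"}), # changed
    ("a3", {"email": "c@example.com"}),   # new account
]
to_write = list(changed_records(accounts, incoming))
```

Only the entries in `to_write` would then go through the history insert and the account update.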
Thanks!
Jacob