Posted to user@cassandra.apache.org by Kent Närling <ke...@gmail.com> on 2011/08/01 00:15:16 UTC

Re: Using Cassandra for transaction logging, good idea?

Sounds interesting.

Reading a bit about Snowflake, it seems uncertain whether it fulfills
criteria A & B?

ie:

>     A, eventually return all known transactions
>     B, Not return the same transaction more than once
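To make the concern concrete, here is a minimal sketch of why A is the hard part with time-ordered ids. It assumes Snowflake-style ids (timestamp in the high bits, then a 10-bit worker number and a 12-bit sequence, which is how I read the Snowflake README). The clock values, `fetch_after`, and `grace_ms` are hypothetical names made up for illustration, and a plain Python set stands in for the stored transactions. A naive "id greater than cursor" scan skips a transaction that arrives late from a node whose clock runs behind; holding the scan back by a grace period larger than the expected skew avoids that, at the price of added latency.

```python
WORKER_BITS, SEQ_BITS = 10, 12  # assumed Snowflake-style layout


def make_id(ts_ms, worker, seq):
    """Pack timestamp | worker | sequence into one sortable integer."""
    return (ts_ms << (WORKER_BITS + SEQ_BITS)) | (worker << SEQ_BITS) | seq


def fetch_after(ids, cursor, limit, now_ms=None, grace_ms=0):
    """Return up to `limit` ids greater than `cursor`, in ascending order.

    If `now_ms` is given, ids stamped within the last `grace_ms`
    milliseconds are held back so that slow or skewed writers can catch up.
    """
    upper = None if now_ms is None else make_id(now_ms - grace_ms, 0, 0)
    page = sorted(i for i in ids if i > cursor and (upper is None or i < upper))
    return page[:limit]


a = make_id(1000, worker=1, seq=0)  # node 1, clock reads 1000 ms
b = make_id(990, worker=2, seq=0)   # node 2, clock is 10 ms behind

# Naive scan: a is read first, then b arrives with a smaller id.
stored = {a}
page1 = fetch_after(stored, cursor=0, limit=10)
cursor = page1[-1]
stored.add(b)
naive_page2 = fetch_after(stored, cursor, limit=10)  # b is never returned

# Scan with a 20 ms grace period: nothing is returned until it is safe.
stored = {a}
safe1 = fetch_after(stored, cursor=0, limit=10, now_ms=1005, grace_ms=20)
stored.add(b)
safe2 = fetch_after(stored, cursor=0, limit=10, now_ms=1030, grace_ms=20)
```

Criterion B follows from always moving the cursor to the last id returned; criterion A only holds as long as no writer lags by more than the grace period.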

Also, any reflections on the general idea to use Cassandra like this?

It would seem to me that if you set the write consistency very high, it
should be possible to get reliability comparable to a classic transactional
database?
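As a rough illustration of what "very high" buys, the usual quorum rule is that a read is guaranteed to overlap the latest acknowledged write when R + W > N, where N is the replication factor and W and R are the number of replicas that must acknowledge a write or answer a read. The sketch below only does that arithmetic; the function names are hypothetical, and it says nothing about isolation or multi-row transactions.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def read_sees_latest_write(n: int, w: int, r: int) -> bool:
    """True when every read set must intersect every write set."""
    return r + w > n


N = 3          # replication factor (assumed for the example)
W = quorum(N)  # write at QUORUM
R = quorum(N)  # read at QUORUM

print(W, R, read_sees_latest_write(N, W, R))
```

With N = 3, writing and reading at quorum tolerates one replica being down while still overlapping; writing to all three replicas lets reads use a single replica, but any node failure then blocks writes.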

Since we are talking about business transactions here, it is VERY important
that once we write a transaction we know that it will not be lost or
partially written etc.
On the other hand, we also know that it is insert only (ie no updates) and
that the insert operation is atomic, so it could fit the Cassandra model
quite well?

Any other people using cassandra to store business data like this?

On Sun, Jul 31, 2011 at 10:20 PM, Lior Golan [via
cassandra-user@incubator.apache.org] <
ml-node+6639001-1863341575-348244@n2.nabble.com> wrote:

> How about using Snowflake to generate the transaction ids:
> https://github.com/twitter/snowflake
>
> From: Kent Narling [mailto:[hidden email]]
> Sent: Thursday, July 28, 2011 5:46 PM
> To: [hidden email]
> Subject: Using Cassandra for transaction logging, good idea?
>
>
> Hi!
>
>
> I am considering using Cassandra for clustered transaction logging in a
> project.
>
> What I need are, in principle, 3 functions:
>
> 1 - Log transaction with a unique (but possibly non-sequential) id
> 2 - Fetch transaction with a specific id
> 3 - Fetch X new transactions "after" a specific cursor/transaction
>      This function must be guaranteed to:
>      A, eventually return all known transactions
>      B, Not return the same transaction more than once
>      The order in which transactions are fetched does not have to be
>      strictly time-sorted, but in practice it probably has to be based on
>      some time-oriented order to be able to support cursors.
>
> I can see that 1 & 2 are trivial to solve in Cassandra, but is there any
> elegant way to solve 3?
> Since there might be multiple nodes logging transactions, their clocks
> might not be perfectly synchronized (to millisec level) etc so sorting on
> time is not stable.
> Possibly creating a synchronized incremental id might be one option but
> that could create a cluster bottleneck etc?
>
> Another alternative might be to use Cassandra for 1 & 2 and then store an
> ordered list of ids in a standard DB. This might be a reasonable compromise
> since 3 is less critical from an HA point of view, but maybe someone can
> point me to a more elegant solution using Cassandra?

Re: Using Cassandra for transaction logging, good idea?

Posted by aaron morton <aa...@thelastpickle.com>.
If you are doing insert-only writes it should be OK. If you want a unique and roughly ordered Tx id, consider a TimeUUID first; they are only as ordered as the clocks generating the UUIDs, which is about as good as Snowflake does. I cannot remember what resolution the two use.
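For reference, a TimeUUID is a version 1 UUID, whose timestamp field counts 100-nanosecond intervals since 1582-10-15 (Snowflake ids, by comparison, carry a millisecond timestamp). The sketch below uses Python's standard `uuid` module to generate one and recover the embedded time; `uuid1_datetime` is a hypothetical helper name, and the precision actually achieved depends on the clock of the generating machine.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Start of the Gregorian calendar, the epoch used by version 1 UUIDs.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_datetime(u: uuid.UUID) -> datetime:
    """Convert the 100 ns timestamp of a version 1 UUID to a UTC datetime."""
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


tx_id = uuid.uuid1()             # time-based UUID from this machine's clock
created = uuid1_datetime(tx_id)  # when this id was generated, per that clock

print(tx_id, created.isoformat())
```

Two ids generated on different machines compare according to those machines' clocks, so skew between nodes shows up directly in the ordering.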

Be aware that writes are not isolated: if your write involves multiple columns, readers may see some of the columns before all are written. This may be an issue if you are doing reads around the area where new transactions are being written.
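One possible way to sidestep partially visible transactions, offered only as a sketch, is to serialize each transaction into a single column value, on the assumption that a reader sees one column's value either entirely or not at all. The field names and the `pack_transaction` / `unpack_transaction` helpers below are hypothetical, and the actual write to Cassandra is left out.

```python
import json


def pack_transaction(tx: dict) -> bytes:
    """Serialize the whole transaction into one value for a single column."""
    return json.dumps(tx, sort_keys=True).encode("utf-8")


def unpack_transaction(value: bytes) -> dict:
    """Inverse of pack_transaction, for readers."""
    return json.loads(value.decode("utf-8"))


tx = {"id": "tx-0001", "account": "A-42", "amount_cents": 1999}
column_value = pack_transaction(tx)  # store this under one column name

print(unpack_transaction(column_value))
```

The trade-off is that individual fields can no longer be read or indexed separately on the server side.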

Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
