Posted to user@cassandra.apache.org by Jérémy SEVELLEC <js...@gmail.com> on 2011/08/04 09:31:06 UTC

transaction principle between multiple column families

Hi All,

Achieving "transaction"-like behaviour is my current preoccupation.

My (simplified) need is:

- update data in column family #1
- insert data in column family #2

I need these two operations to behave as a single "transaction" because the
data is tightly coupled.

I use ZooKeeper/Cages to take a distributed lock, so that multiple clients
cannot insert or update the same data concurrently.

But there is a problem if the insert into column family #2 fails: I then
have to "roll back" the data already updated in column family #1.
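The locking pattern I use looks roughly like this. This is only a single-process sketch: the `KeyLocks` class is my own stand-in, not a library API — Cages provides the same per-key mutual exclusion across processes via ZooKeeper.

```python
import threading
from collections import defaultdict

class KeyLocks:
    """Per-key mutual exclusion. A single-process stand-in for the
    distributed lock Cages takes through ZooKeeper."""
    def __init__(self):
        self._guard = threading.Lock()              # protects the lock table
        self._locks = defaultdict(threading.Lock)   # one lock per key

    def lock(self, key):
        with self._guard:
            return self._locks[key]

locks = KeyLocks()
store = {"counter": 0}

def update(key, times):
    # the read-modify-write is only safe because the per-key lock is held
    for _ in range(times):
        with locks.lock(key):
            store[key] += 1

threads = [threading.Thread(target=update, args=("counter", 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# all 4000 increments are applied, none lost to interleaving
```

This serialises concurrent updaters of the same key, but it does nothing for the failure case below.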


From my reading on the subject, the options for handling the failure are:
- Can we really consider that a "write never fails" in Cassandra once the
mutation has reached a node? What could cause a failure at that point?
Is it important to think about this potential problem? (Yes, in my
opinion, but I'm not totally sure.)
- Retry first. Is there really a chance that a second attempt succeeds
if the first one fails?
- Keep the "transaction data" so that it is possible to roll back
programmatically by deleting the inserted data. The problem is rolling
back the updated data, because the old values are lost. I have read that
"read before write" (to save the old values before the update) is a bad
idea, so the problem remains. How should this be done?
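On the retry option, one thing that seems to make a retry safe is fixing the column timestamp before the first attempt: Cassandra reconciles columns by timestamp (last write wins), so replaying an identical mutation is idempotent. A sketch of what I have in mind — the `write_mutation` callable is a hypothetical stand-in for the client library's write call, not a real API:

```python
import time

def apply_with_retry(write_mutation, cf, key, columns, attempts=3, delay_s=0.0):
    """Retry a mutation with a timestamp fixed before the first attempt,
    so every retry replays exactly the same write. Cassandra reconciles
    columns by timestamp (last write wins), so the replay is idempotent."""
    ts = int(time.time() * 1_000_000)   # microseconds, chosen once
    last_error = None
    for _ in range(attempts):
        try:
            write_mutation(cf, key, columns, timestamp=ts)
            return ts
        except IOError as e:            # stand-in for a transport/timeout error
            last_error = e
            time.sleep(delay_s)
    raise last_error

# Flaky in-memory stand-in for the client: the first attempt times out.
store = {}
calls = {"n": 0}

def flaky_write(cf, key, columns, timestamp):
    calls["n"] += 1
    if calls["n"] == 1:
        raise IOError("simulated timeout")
    store.setdefault(cf, {})[key] = (columns, timestamp)

ts = apply_with_retry(flaky_write, "cf2", "row1", {"col": "val"})
```

This still doesn't answer the rollback question for the update in CF #1, though.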


Do you have any feedback on this topic?

Regards,

--
Jérémy

Re: transaction principle between multiple column families

Posted by Jérémy SEVELLEC <js...@gmail.com>.
Hi Aaron,

Thanks for your answer.

In my case, the insertions into my multiple CFs are not on the same key.

I've already read the presentation you pointed me to, and I think that
writing a transaction log in a dedicated CF is the approach I will follow
to try to solve my problem.

Cheers,




-- 
Jérémy

Re: transaction principle between multiple column families

Posted by aaron morton <aa...@thelastpickle.com>.
A write for a single row is atomic, including writes to multiple CFs with the same row key. http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic

They are not isolated, though: reads may see the write partially applied.

Have a look at the data modelling slides at http://www.datastax.com/events/cassandrasf2011/presentations for a discussion of how to create an application Tx log.

Rollback is difficult or impossible if you are doing updates: you would need to know what value to roll back to, and you would have to lock out anyone else updating until you had finished the abort, which is probably (technically) an unbounded wait.

Most people use ZooKeeper / Cages to serialise access to a particular key.
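Roughly, the Tx log pattern from those slides looks like this. This is an in-memory sketch only — the `TxLog` class and the plain dicts stand in for real CFs, none of it is a library API. The key point is that you roll incomplete transactions forward rather than trying to restore lost old values:

```python
import uuid

class TxLog:
    """Minimal application-level transaction log (in-memory sketch).
    Pattern: 1) record intent, 2) apply mutations, 3) mark done.
    Anything still pending after a crash is re-applied (rolled forward);
    replaying a write with the same values is harmless."""
    def __init__(self):
        self.entries = {}  # tx_id -> {"mutations": [...], "done": bool}

    def begin(self, mutations):
        tx_id = str(uuid.uuid4())
        self.entries[tx_id] = {"mutations": mutations, "done": False}
        return tx_id

    def commit(self, tx_id):
        self.entries[tx_id]["done"] = True

    def pending(self):
        return [(i, e["mutations"])
                for i, e in self.entries.items() if not e["done"]]

def apply_mutation(store, mutation):
    cf, key, columns = mutation
    store.setdefault(cf, {}).setdefault(key, {}).update(columns)

def recover(log, store):
    """Roll every incomplete transaction forward instead of rolling back."""
    for tx_id, mutations in log.pending():
        for m in mutations:
            apply_mutation(store, m)   # idempotent replay
        log.commit(tx_id)

# Simulate a crash between the CF #1 write and the CF #2 write:
log, store = TxLog(), {}
muts = [("cf1", "k", {"a": "1"}), ("cf2", "k", {"b": "2"})]
tx = log.begin(muts)                   # 1) intent recorded first
apply_mutation(store, muts[0])         # 2) ...crash before cf2 and commit
recover(log, store)                    # 3) restart completes the transaction
```

In a real deployment the log entries would live in their own CF, and the recovery pass would run on startup or on a schedule.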
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
