Posted to user@cassandra.apache.org by Alex Yiu <bi...@gmail.com> on 2010/07/20 21:58:29 UTC

more questions on Cassandra ACID properties

Hi,

I have more questions on Cassandra ACID properties.
Say, I have a row that has 3 columns already: colA, colB and colC

And, if two *concurrent* clients perform a different insert(...) into the
same row,
one insert is for colD and the other insert is for colE.
Then, Cassandra would guarantee both columns will be added to the same row.

Is that correct?

That is, insert(...) of a column does NOT involve reading and rewriting
other existing columns of the same row?
That is, we do not face the following situation:
client X: read colA, colB and colC; then write: colA, colB, colC and colD
client Y: read colA, colB and colC; then write: colA, colB, colC and colE
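
The situation being asked about can be sketched as a toy model. This is only
an illustration of the whole-row read-then-write hazard, using a plain Python
dict as a hypothetical "row"; it is not Cassandra code and says nothing about
how Cassandra behaves.

```python
# Toy model only: a "row" is a plain dict; this is not Cassandra code.
row = {"colA": 1, "colB": 2, "colC": 3}

# Both clients read the whole row before either one writes.
snapshot_x = dict(row)
snapshot_y = dict(row)

# Client X writes back its snapshot plus colD.
snapshot_x["colD"] = 4
row = snapshot_x

# Client Y writes back its (stale) snapshot plus colE, replacing X's row.
snapshot_y["colE"] = 5
row = snapshot_y

# X's column is gone because Y rewrote the row from a stale read.
lost_update = "colD" not in row
```

In this model the final row holds colA, colB, colC and colE only, which is
the lost update the question is worried about.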


BTW, it seems to me that insert() API as described in the wiki page:
http://wiki.apache.org/cassandra/API
should handle updating an existing column as well, by replacing the
existing column value.

If that is the case, I guess we should change the wording from "insert" to
"insert or update" in the wiki doc
And, ideally, insert(...) API operation name would be adapted
to update_or_insert(...)


Looking forward to replies that may confirm my understanding.
Thanks!


Regards,
Alex Yiu

Re: more questions on Cassandra ACID properties

Posted by Alex Yiu <bi...@gmail.com>.

Hi, all,
(Jonathan Ellis, Jonathan Shook, Aaron Morton)

Thanks for the confirmation.

JonE, the "update" wording has been added to the wiki page for the insert and
mutation APIs.


Regards,
Alex Yiu




Re: more questions on Cassandra ACID properties

Posted by Jonathan Ellis <jb...@gmail.com>.

On Tue, Jul 20, 2010 at 2:58 PM, Alex Yiu <bi...@gmail.com> wrote:
> Say, I have a row that has 3 columns already: colA, colB and colC
> And, if two *concurrent* clients perform a different insert(...) into the
> same row,
> one insert is for colD and the other insert is for colE.
> Then, Cassandra would guarantee both columns will be added to the same row.
> Is that correct?

Yes.

> That is, insert(...) of a column does NOT involve reading and rewriting
> other existing columns of the same row?

Right.

> If that is the case, I guess we should change the wording from "insert" to
> "insert or update" in the wiki doc

That would be a good change.  (Click "Login" to create an account.)

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: more questions on Cassandra ACID properties

Posted by Jonathan Shook <js...@gmail.com>.

You are correct. In this case, Cassandra would journal two writes to
the same logical row, but they would be 2 independent writes. Writes
do not depend on reads, so they are self-contained. If either column
exists already, it will be overwritten.
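
A minimal sketch of that idea, as a toy model rather than Cassandra's actual
code: each write touches only its own column, so the two writes give the same
row in either order, and writing an existing column replaces its value. The
names insert and apply_all are hypothetical helpers, not the Thrift API.

```python
# Toy model only: each write is self-contained and touches one column.
def insert(row, column, value):
    """Set one column; behaves as insert-or-update for that column."""
    row[column] = value

def apply_all(writes):
    """Apply a sequence of (column, value) writes to a fresh copy of the row."""
    row = {"colA": 1, "colB": 2, "colC": 3}
    for column, value in writes:
        insert(row, column, value)
    return row

# The two concurrent writes, applied in either order.
x_then_y = apply_all([("colD", 4), ("colE", 5)])
y_then_x = apply_all([("colE", 5), ("colD", 4)])

# Writing a column that already exists replaces its value.
overwritten = apply_all([("colA", 99)])["colA"]
```

Both orders end with all five columns present, because neither write reads
or rewrites the columns it does not name.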

These journaled actions would then be applied to the memtables, and
optionally to the on-disk structures, depending on the configuration.
(Asynchronous accumulation and flushing provide the best performance,
but "write through" persistence is an option in the config.)
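
The journal-then-memtable flow can be sketched roughly as below. This is a
toy model under stated assumptions, not Cassandra's implementation: ToyStore,
flush_threshold and on_disk are hypothetical names, and the real
configuration options are not shown here.

```python
# Toy model only: journal first, then memtable, then an occasional flush.
class ToyStore:
    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only journal of every write
        self.memtable = {}     # in-memory: row key -> {column: value}
        self.on_disk = []      # flushed memtable snapshots
        self.flush_threshold = flush_threshold  # hypothetical setting

    def write(self, row_key, column, value):
        # 1. Journal the write; no read of existing data is needed.
        self.commit_log.append((row_key, column, value))
        # 2. Apply it to the single in-memory instance of the row.
        self.memtable.setdefault(row_key, {})[column] = value
        # 3. Flush once enough columns have accumulated.
        if self._column_count() >= self.flush_threshold:
            self.flush()

    def _column_count(self):
        return sum(len(cols) for cols in self.memtable.values())

    def flush(self):
        # Move the accumulated memtable to "disk" and start a new one.
        if self.memtable:
            self.on_disk.append(self.memtable)
            self.memtable = {}

store = ToyStore(flush_threshold=3)
store.write("row1", "colD", 4)
store.write("row1", "colE", 5)
store.write("row1", "colA", 1)  # third column triggers a flush
```

In this sketch, setting flush_threshold to 1 flushes on every write, which
loosely mimics the "write through" option, while a larger value accumulates
writes in memory first.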

The memtables may have to be read and written, but as far as I know
they keep only one logical instance of each row. Maybe a dev can
confirm this.


Re: more questions on Cassandra ACID properties

Posted by Aaron Morton <aa...@thelastpickle.com>.

Yes, both inserts (colD and colE) will succeed, whether you send them as insert() or batch_mutate() calls from the client.

It's also correct to think of them as insert-or-update calls.

Aaron

