Posted to user@cassandra.apache.org by Kant Kodali <ka...@peernova.com> on 2016/10/04 18:29:38 UTC

How to write a trigger in Cassandra to only detect updates of an existing row?

Hi all,
How can I write a trigger in Cassandra to detect updates? My requirement is that
the trigger alerts me only when there is an update to an existing row. Given the
way INSERT and UPDATE work, this looks hard to do: an INSERT simply overwrites an
existing row, and an UPDATE becomes a new insert when no row exists for that
partition key. Is there a way to solve this problem?
Thanks,

kant

Re: How to write a trigger in Cassandra to only detect updates of an existing row?

Posted by Kant Kodali <ka...@peernova.com>.

Hi Siddharth,
That seems like a cool trick. But since I am looking only for updates of an
existing row, how would I know that from the "insert/update(length > 0)" logic?
Do I need to create a hashmap entry for every row and keep track of length > 0?
That would blow up the memory, right?
Thanks,
kant

Re: How to write a trigger in Cassandra to only detect updates of an existing row?

Posted by siddharth verma <si...@gmail.com>.

Hi,
Consider the schema:
pk1 text,
ck1 text,
v1 text,
v2 text,
PRIMARY KEY (pk1, ck1)

1. insert into ks.tablename(pk1,ck1,v1,v2) values('PK1','CK1','a','a');
2. delete from ks.tablename where pk1='PK2' and ck1='CK2';
3. insert into ks.tablename(pk1,ck1) values('PK3','CK3');
4. insert into ks.tablename(pk1,ck1,v1) values('PK4','CK4','a');

The 3rd case is an "insert of the form when ONLY primary key values are
specified".

If you are sure that case 3 will never occur from your application, you can
check the length of "next" (as in the code snippet):
next.length() will be greater than zero in cases 1 and 4
next.length() will be equal to zero in cases 2 and 3

Thus, in spite of 3 being an insert, in the code snippet it might appear to
be a delete.


Rephrasing:
"If you are sure that your application will NOT do an insert of the form
when ONLY primary key values are specified, you can check the length of
next, to indicate whether it is an insert/update (where at least one non
primary key column value is inserted) or a delete if length is zero."
In other words, if case 3 never occurs, checking next.length() lets you
decide whether it is an insert/update (length > 0) or a delete (length == 0).

I would urge you to try the snippet once on your own, to see what kind of
data it produces in *next*. You could dump the output of next into a column
of an audit table to see that output.
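
The length check described above could be sketched roughly as follows. This is only an illustration under the same assumption (the application never inserts with ONLY primary key values). The class name, the MutationKind enum and the classify helper are hypothetical names that are not part of Cassandra, and the sample content of next is made up.

```java
public class NextLengthCheck {
    /** Hypothetical labels for the two outcomes the length check can tell apart. */
    enum MutationKind { INSERT_OR_UPDATE, DELETE }

    /**
     * Classifies a mutation from the string accumulated in "next".
     * Assumes case 3 (insert with only primary key values) never happens,
     * because such an insert would also leave "next" empty.
     */
    static MutationKind classify(CharSequence next) {
        return next.length() > 0 ? MutationKind.INSERT_OR_UPDATE : MutationKind.DELETE;
    }

    public static void main(String[] args) {
        // Made-up content standing in for what the trigger snippet appends.
        StringBuilder withValues = new StringBuilder("v1=a\001"); // cases 1 and 4
        StringBuilder empty = new StringBuilder();                 // cases 2 and 3
        System.out.println(classify(withValues));
        System.out.println(classify(empty));
    }
}
```

Note that this check only separates insert/update from delete. It does not tell an insert of a new row apart from an update of an existing one.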


Regards
Siddharth Verma

Re: How to write a trigger in Cassandra to only detect updates of an existing row?

Posted by Kant Kodali <ka...@peernova.com>.

Hi Siddharth,
I don't quite follow the assumption "If you are sure that your application will
NOT do an insert of the form when ONLY primary key values are specified, you can
check the length of next, to indicate whether it is an insert/update (where
at least one non primary key column value is inserted) or a delete if length is
zero." Could you please provide an example?
Thanks,
kant

Re: How to write a trigger in Cassandra to only detect updates of an existing row?

Posted by siddharth verma <si...@gmail.com>.

Hi,
I am not sure whether it will help you or not.
Code snippet:
public Collection<Mutation> augment(Partition update)
{
    ...
    StringBuilder next = new StringBuilder();
    SearchIterator<Clustering, Row> searchIterator =
        update.searchIterator(ColumnFilter.all(update.metadata()), false);
    while (searchIterator.hasNext()) {
        next.append(searchIterator.next(Clustering.EMPTY).toString() + "\001");
    }
    ...
    // next carries non primary key column values
}

If you are sure that your application will NOT do an insert of the form
when ONLY primary key values are specified, you can check the length of
next, to indicate whether it is an insert/update (where at least one non
primary key column value is inserted) or a delete if length is zero.

The code snippet is to the best of my knowledge. However, kindly try it
once at your end, as this was part of some legacy code and I am not
completely sure about it.

Here, if the assumption stated above holds true, you could avoid a
Cassandra select for that key.

Thanks
Siddharth Verma


Re: How to write a trigger in Cassandra to only detect updates of an existing row?

Posted by Kant Kodali <ka...@peernova.com>.

Thanks a lot. This helps me decide not to write one, for the performance
reasons you pointed out!

Re: How to write a trigger in Cassandra to only detect updates of an existing row?

Posted by Eric Stevens <mi...@gmail.com>.

You would have to perform a SELECT on the row in the trigger code in order
to determine whether there was underlying data. Cassandra is in essence an
append-only data store: when an INSERT or UPDATE is executed, it has no
idea whether there is already a row underlying it, and for write performance
reasons it also doesn't care.

Note that if you do this, you're going to introduce a giant bottleneck in
your write path and increase the IO cost of writes. You'll also probably
have some race conditions: if two writes to the same row happen in quick
succession, your trigger might not notice that one of them is writing to
the same row as the other. You might need to resort to CAS operations to
overcome that, along with their associated overhead. But all that said, it
should be possible, though you'll have to write it yourself in your
trigger code.
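
The difference between a separate read followed by a write and an atomic check can be illustrated with a small in-memory model. This is not Cassandra code and uses no Cassandra API. It is a sketch that assumes a ConcurrentHashMap standing in for a table, and the class and method names (ExistingRowCheck, racyWrite, atomicWrite) are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ExistingRowCheck {
    // In-memory stand-in for a table, keyed by primary key. Not Cassandra code.
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    /**
     * Read, then write, as two separate steps. Two concurrent writers to the
     * same key can both see "no row" before either one has written.
     */
    boolean racyWrite(String key, String value) {
        boolean existed = rows.containsKey(key); // the extra read (the "SELECT")
        rows.put(key, value);                    // the write itself
        return existed;
    }

    /**
     * Check and write as one atomic step, loosely analogous to a CAS operation:
     * put() returns the previous value, or null if the key was absent.
     */
    boolean atomicWrite(String key, String value) {
        return rows.put(key, value) != null;
    }

    public static void main(String[] args) {
        ExistingRowCheck table = new ExistingRowCheck();
        System.out.println(table.atomicWrite("PK1:CK1", "a")); // first write to this key
        System.out.println(table.atomicWrite("PK1:CK1", "b")); // second write to this key
    }
}
```

In a real cluster the read and the atomic check both cost far more than a local map lookup, which is the overhead described above.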
