Posted to user@cassandra.apache.org by DE VITO Dominique <do...@thalesgroup.com> on 2014/03/10 15:50:08 UTC

about trigger execution ??? // RE: sending notifications through data replication on remote clusters

> You should be able to achieve what you're looking for with a trigger vs. a modification to the core of Cassandra.
>
> http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-0-prototype-triggers-support

Well, good point.

This raises a question: (a) are triggers executed on every coordinator node, local and remote (so that N DCs => N coordinator nodes => N executions of the trigger)?

(b) Or are triggers executed only on the first coordinator node, and not on the coordinator nodes of the remote DCs?

My guess is (b), and in that case triggers won't do the job.
(b) would make sense, because the first coordinator node would augment the original row mutations and propagate them to the other coordinator nodes, so there would be no need to execute triggers on the remote coordinator nodes.

Does anybody know how triggers are executed: is it (a) or (b)?
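
To make the two hypotheses concrete, here is a small toy model in Python. It only lists the DCs in which a trigger would fire for a single client write under (a) and under (b); the function name and its arguments are hypothetical, and it does not reproduce Cassandra's actual code path or settle which hypothesis is correct.

```python
# Toy model only: names and structure are illustrative assumptions,
# not Cassandra APIs.

def trigger_fired_in(datacenters, hypothesis):
    """Return the DCs in which the trigger fires for one client write.

    The first entry of `datacenters` is the DC whose coordinator
    received the client request.
    hypothesis "a": every coordinator (local and remote) runs the trigger.
    hypothesis "b": only the first coordinator runs the trigger.
    """
    if hypothesis not in ("a", "b"):
        raise ValueError("hypothesis must be 'a' or 'b'")
    if not datacenters:
        raise ValueError("need at least one datacenter")

    fired_in = []
    for index, dc in enumerate(datacenters):
        is_first_coordinator = index == 0
        if is_first_coordinator or hypothesis == "a":
            fired_in.append(dc)
    return fired_in


if __name__ == "__main__":
    dcs = ["DC1", "DC2", "DC3"]
    print("a:", trigger_fired_in(dcs, "a"))
    print("b:", trigger_fired_in(dcs, "b"))
```

Under (b) the list never contains a remote DC, which is why a trigger alone could not notify apps connected to DC2 in that case.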

Thanks.

Dominique




On Mon, Mar 10, 2014 at 10:06 AM, DE VITO Dominique <do...@thalesgroup.com> wrote:
> On 03/10/2014 07:49 AM, DE VITO Dominique wrote:
> > If I update some data on DC1, I just want apps "connected-first" to DC2
> > to be informed when this data is available on DC2 after replication.
>
> If I run a SELECT, I'm going to receive the latest data per the read conditions (ONE, TWO, QUORUM), regardless of location of the client connection. If using network aware topology, you'll get the most current data in that DC.
>
> > When using Thrift, one way could be to modify the CassandraServer class
> > to send notifications to apps according to the data coming into the
> > coordinator node of DC2.
> >
> > Is it "common" (~ the way to do it) ?
> >
> > Is there another way to do so ?
> >
> > When using CQL, is there a precise "src code" place to modify for the
> > same purpose ?
>
> Notifying connected clients about random INSERT or UPDATE statements that ran somewhere seems to be far, far outside the scope of storing data. Just configure your client to SELECT in the manner that you need.
>
> I may not fully understand your problem and could be simplifying things in my head, so feel free to expand.
>
> --
> Michael
First of all, thanks for your answer and your attention.

I know about SELECT.
The idea here is to avoid regular POLLING, which could easily become a performance nightmare.
The idea is to replace POLLING with PUSH, as is done in SEDA architectures, CQRS architectures, or continuous querying in some data stores.

So, following this PUSH idea, it would be nice to inform apps connected to a preferred DC that some new data has been replicated and is now "available".
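
As a rough illustration of the difference, here is a minimal in-process sketch in Python. `ReplicaStore` is a hypothetical stand-in for the DC2 replica, not a Cassandra API; it only shows that a poller pays one read per attempt even when nothing has arrived, while a subscriber is called once when the replicated write lands.

```python
# Hypothetical sketch: ReplicaStore and its methods are assumptions made
# for illustration and do not correspond to any Cassandra class.

class ReplicaStore:
    def __init__(self):
        self._data = {}
        self._subscribers = []
        self.read_count = 0  # number of reads issued by pollers

    def subscribe(self, callback):
        """PUSH: register a callback invoked on every replicated write."""
        self._subscribers.append(callback)

    def apply_replicated_write(self, key, value):
        """Simulate a write arriving through replication."""
        self._data[key] = value
        for callback in self._subscribers:
            callback(key, value)

    def read(self, key):
        self.read_count += 1
        return self._data.get(key)


def poll_until_available(store, key, max_attempts):
    """POLLING: issue reads until the key shows up or attempts run out."""
    for _ in range(max_attempts):
        value = store.read(key)
        if value is not None:
            return value
    return None


if __name__ == "__main__":
    store = ReplicaStore()
    received = []
    store.subscribe(lambda k, v: received.append((k, v)))

    poll_until_available(store, "user:1", 5)   # five reads, nothing found yet
    store.apply_replicated_write("user:1", "alice")
    print("reads spent polling:", store.read_count)
    print("pushed notifications:", received)
```

In a real deployment the callback would have to cross the network to the client, which is exactly the part the question is about.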

I hope it's clearer.

Dominique



RE: about trigger execution ??? // RE: sending notifications through data replication on remote clusters

Posted by DE VITO Dominique <do...@thalesgroup.com>.
Thanks a lot.


From: Edward Capriolo [mailto:edlinuxguru@gmail.com]
Sent: Monday, March 10, 2014 16:47
To: user@cassandra.apache.org
Subject: Re: about trigger execution ??? // RE: sending notifications through data replication on remote clusters

Just so you know, you should probably apply the
[jira] [Commented] (CASSANDRA-6790) Triggers are broken in trunk ...
<https://www.mail-archive.com/commits@cassandra.apache.org/msg80563.html>

patch, because triggers are currently only called on batch_mutate and will fail if called on insert.


Re: about trigger execution ??? // RE: sending notifications through data replication on remote clusters

Posted by Edward Capriolo <ed...@gmail.com>.
Just so you know, you should probably apply the
[jira] [Commented] (CASSANDRA-6790) Triggers are broken in trunk ...
<https://www.mail-archive.com/commits@cassandra.apache.org/msg80563.html>

patch, because triggers are currently only called on batch_mutate and will
fail if called on insert.

