Posted to user@cassandra.apache.org by Praveen Baratam <pr...@gmail.com> on 2012/04/22 18:23:37 UTC

Server Side Logic/Script - Triggers / StoreProc

I found that triggers are coming in Cassandra 1.2 (
https://issues.apache.org/jira/browse/CASSANDRA-1311) but found no mention of
any stored-procedure-like pattern.

I know this has been discussed so many times but never met with
any initiative. Even Groovy was staged out of the trunk.

Cassandra is great for logging and as such will be infinitely more useful
if some logic can be pushed into the Cassandra cluster nearer to the
location of Data to generate a materialized view useful for applications.

Server Side Scripts/Routines in Distributed Databases could soon prove to
be the differentiating factor.

Let me reiterate things with a use case.

In our application we store time series data in wide rows with a TTL set on
each point to prevent data from growing beyond acceptable limits. Even so,
the data volume can make it impractical to move all of it from the cluster
node to the querying node and then to the application via Thrift for
processing and presentation.

Ideally we should process the data on the residing node and pass only the
materialized view of the data upstream. This should be trivial if Cassandra
implements some sort of server side scripting and CQL semantics to call it.
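
The saving described above can be pictured with a minimal sketch in Python of a
reduction that a storage node could run before replying. Everything in it is
hypothetical: the `downsample` function, the 60-second bucket, and the
(timestamp, value) layout are illustrative assumptions, not a Cassandra API.

```python
def downsample(points, bucket_seconds):
    """Collapse (timestamp, value) points into per-bucket averages.

    Hypothetical helper: it stands in for logic a storage node could run
    locally so that only the summary travels upstream.
    """
    sums = {}  # bucket start -> [running sum, count]
    for ts, value in sorted(points):
        bucket = ts - (ts % bucket_seconds)
        entry = sums.setdefault(bucket, [0.0, 0])
        entry[0] += value
        entry[1] += 1
    return [(bucket, total / count) for bucket, (total, count) in sums.items()]


# One point per second for an hour, roughly what a wide row might hold.
raw = [(t, float(t % 60)) for t in range(3600)]
summary = downsample(raw, 60)
print(len(raw), "points reduced to", len(summary))
```

With one-minute buckets the caller receives one averaged point per minute
instead of every raw point, which is the kind of reduction the use case asks
for.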

Is anybody else interested in a similar feature? Is it being worked on? Are
there any alternative strategies to this problem?

Praveen

Re: Server Side Logic/Script - Triggers / StoreProc

Posted by Praveen Baratam <pr...@gmail.com>.

The point of NoSQL is flexibility; the point of an RDBMS is structure and guarantees.

Both patterns IMHO do overlap. But they do have different USPs.

On Mon, Apr 30, 2012 at 3:51 AM, Maxim Potekhin <po...@bnl.gov> wrote:

> About a year ago I started getting a strange feeling that
> the noSQL community is busy re-creating RDBMS in minute detail.
>
> Why did we bother in the first place?
>
> Maxim
>
>
>
> On 4/27/2012 6:49 PM, Data Craftsman wrote:
> > Howdy,
> >
> > Some Polyglot Persistence(NoSQL) products started support server side
> > scripting, similar to RDBMS store procedure.
> > E.g. Redis Lua scripting.
> >
> > I wish it is Python when Cassandra has the server side scripting feature.
> >
> > FYI,
> >
> > http://antirez.com/post/250
> >
> >
> http://nosql.mypopescu.com/post/19949274021/alchemydb-an-integrated-graphdb-rdbms-kv-store
> >
> > "server side scripting support is an extremely powerful tool. Having
> > processing close to data (i.e. data locality) is a well known
> > advantage, ..., it can open the doors to completely new features."
> >
> > Thanks,
> >
> > Charlie (@mujiang) 一个 木匠
> > =======
> > Data Architect Developer
> > http://mujiang.blogspot.com
> >
> > On Sun, Apr 22, 2012 at 9:35 AM, Brian O'Neill <bo...@gmail.com>
> wrote:
> >> Praveen,
> >>
> >> We are certainly interested. To get things moving we implemented an
> add-on
> >> for Cassandra to demonstrate the viability (using AOP):
> >> https://github.com/hmsonline/cassandra-triggers
> >>
> >> Right now the implementation executes triggers asynchronously, allowing
> you
> >> to implement a java interface and plugin your own java class that will
> get
> >> called for every insert.
> >>
> >> Per the discussion on 1311, we intend to extend our proof of concept to
> be
> >> able to invoke scripts as well.  (minimally we'll enable javascript, but
> >> we'll probably allow for ruby and groovy as well)
> >>
> >> -brian
> >>
> >> On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
> >>
> >> I found that Triggers are coming in Cassandra 1.2
> >> (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention
> of any
> >> StoreProc like pattern.
> >>
> >> I know this has been discussed so many times but never met with
> >> any initiative. Even Groovy was staged out of the trunk.
> >>
> >> Cassandra is great for logging and as such will be infinitely more
> useful if
> >> some logic can be pushed into the Cassandra cluster nearer to the
> location
> >> of Data to generate a materialized view useful for applications.
> >>
> >> Server Side Scripts/Routines in Distributed Databases could soon prove
> to be
> >> the differentiating factor.
> >>
> >> Let me reiterate things with a use case.
> >>
> >> In our application we store time series data in wide rows with TTL set
> on
> >> each point to prevent data from growing beyond acceptable limits. Still
> the
> >> data size can be a limiting factor to move all of it from the cluster
> node
> >> to the querying node and then to the application via thrift for
> processing
> >> and presentation.
> >>
> >> Ideally we should process the data on the residing node and pass only
> the
> >> materialized view of the data upstream. This should be trivial if
> Cassandra
> >> implements some sort of server side scripting and CQL semantics to call
> it.
> >>
> >> Is anybody else interested in a similar feature? Is it being worked on?
> Are
> >> there any alternative strategies to this problem?
> >>
> >> Praveen
> >>
> >>
> >>
> >> --
> >> Brian ONeill
> >> Lead Architect, Health Market Science (http://healthmarketscience.com)
> >> mobile:215.588.6024
> >> blog: http://weblogs.java.net/blog/boneill42/
> >> blog: http://brianoneill.blogspot.com/
> >>
> >
> >
>
>

Re: Server Side Logic/Script - Triggers / StoreProc

Posted by Maxim Potekhin <po...@bnl.gov>.

About a year ago I started getting a strange feeling that
the noSQL community is busy re-creating RDBMS in minute detail.

Why did we bother in the first place?

Maxim



On 4/27/2012 6:49 PM, Data Craftsman wrote:
> Howdy,
>
> Some Polyglot Persistence(NoSQL) products started support server side
> scripting, similar to RDBMS store procedure.
> E.g. Redis Lua scripting.
>
> I wish it is Python when Cassandra has the server side scripting feature.
>
> FYI,
>
> http://antirez.com/post/250
>
> http://nosql.mypopescu.com/post/19949274021/alchemydb-an-integrated-graphdb-rdbms-kv-store
>
> "server side scripting support is an extremely powerful tool. Having
> processing close to data (i.e. data locality) is a well known
> advantage, ..., it can open the doors to completely new features."
>
> Thanks,
>
> Charlie (@mujiang) 一个 木匠
> =======
> Data Architect Developer
> http://mujiang.blogspot.com
>
> On Sun, Apr 22, 2012 at 9:35 AM, Brian O'Neill <bo...@gmail.com> wrote:
>> Praveen,
>>
>> We are certainly interested. To get things moving we implemented an add-on
>> for Cassandra to demonstrate the viability (using AOP):
>> https://github.com/hmsonline/cassandra-triggers
>>
>> Right now the implementation executes triggers asynchronously, allowing you
>> to implement a java interface and plugin your own java class that will get
>> called for every insert.
>>
>> Per the discussion on 1311, we intend to extend our proof of concept to be
>> able to invoke scripts as well.  (minimally we'll enable javascript, but
>> we'll probably allow for ruby and groovy as well)
>>
>> -brian
>>
>> On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
>>
>> I found that Triggers are coming in Cassandra 1.2
>> (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of any
>> StoreProc like pattern.
>>
>> I know this has been discussed so many times but never met with
>> any initiative. Even Groovy was staged out of the trunk.
>>
>> Cassandra is great for logging and as such will be infinitely more useful if
>> some logic can be pushed into the Cassandra cluster nearer to the location
>> of Data to generate a materialized view useful for applications.
>>
>> Server Side Scripts/Routines in Distributed Databases could soon prove to be
>> the differentiating factor.
>>
>> Let me reiterate things with a use case.
>>
>> In our application we store time series data in wide rows with TTL set on
>> each point to prevent data from growing beyond acceptable limits. Still the
>> data size can be a limiting factor to move all of it from the cluster node
>> to the querying node and then to the application via thrift for processing
>> and presentation.
>>
>> Ideally we should process the data on the residing node and pass only the
>> materialized view of the data upstream. This should be trivial if Cassandra
>> implements some sort of server side scripting and CQL semantics to call it.
>>
>> Is anybody else interested in a similar feature? Is it being worked on? Are
>> there any alternative strategies to this problem?
>>
>> Praveen
>>
>>
>>
>> --
>> Brian ONeill
>> Lead Architect, Health Market Science (http://healthmarketscience.com)
>> mobile:215.588.6024
>> blog: http://weblogs.java.net/blog/boneill42/
>> blog: http://brianoneill.blogspot.com/
>>
>
>

Re: Server Side Logic/Script - Triggers / StoreProc

Posted by Jeffrey Kesselman <je...@gmail.com>.

It should be noted that, in a distributed storage environment, scripting
*at the node of storage* is much more powerful than higher up at some
broker.  It's easy to do this wrong.


2012/4/27 Data Craftsman <da...@gmail.com>

> Howdy,
>
> Some Polyglot Persistence(NoSQL) products started support server side
> scripting, similar to RDBMS store procedure.
> E.g. Redis Lua scripting.
>
> I wish it is Python when Cassandra has the server side scripting feature.
>
> FYI,
>
> http://antirez.com/post/250
>
>
> http://nosql.mypopescu.com/post/19949274021/alchemydb-an-integrated-graphdb-rdbms-kv-store
>
> "server side scripting support is an extremely powerful tool. Having
> processing close to data (i.e. data locality) is a well known
> advantage, ..., it can open the doors to completely new features."
>
> Thanks,
>
> Charlie (@mujiang) 一个 木匠
> =======
> Data Architect Developer
> http://mujiang.blogspot.com
>
> On Sun, Apr 22, 2012 at 9:35 AM, Brian O'Neill <bo...@gmail.com>
> wrote:
> > Praveen,
> >
> > We are certainly interested. To get things moving we implemented an
> add-on
> > for Cassandra to demonstrate the viability (using AOP):
> > https://github.com/hmsonline/cassandra-triggers
> >
> > Right now the implementation executes triggers asynchronously, allowing
> you
> > to implement a java interface and plugin your own java class that will
> get
> > called for every insert.
> >
> > Per the discussion on 1311, we intend to extend our proof of concept to
> be
> > able to invoke scripts as well.  (minimally we'll enable javascript, but
> > we'll probably allow for ruby and groovy as well)
> >
> > -brian
> >
> > On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
> >
> > I found that Triggers are coming in Cassandra 1.2
> > (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention
> of any
> > StoreProc like pattern.
> >
> > I know this has been discussed so many times but never met with
> > any initiative. Even Groovy was staged out of the trunk.
> >
> > Cassandra is great for logging and as such will be infinitely more
> useful if
> > some logic can be pushed into the Cassandra cluster nearer to the
> location
> > of Data to generate a materialized view useful for applications.
> >
> > Server Side Scripts/Routines in Distributed Databases could soon prove
> to be
> > the differentiating factor.
> >
> > Let me reiterate things with a use case.
> >
> > In our application we store time series data in wide rows with TTL set on
> > each point to prevent data from growing beyond acceptable limits. Still
> the
> > data size can be a limiting factor to move all of it from the cluster
> node
> > to the querying node and then to the application via thrift for
> processing
> > and presentation.
> >
> > Ideally we should process the data on the residing node and pass only the
> > materialized view of the data upstream. This should be trivial if
> Cassandra
> > implements some sort of server side scripting and CQL semantics to call
> it.
> >
> > Is anybody else interested in a similar feature? Is it being worked on?
> Are
> > there any alternative strategies to this problem?
> >
> > Praveen
> >
> >
> >
> > --
> > Brian ONeill
> > Lead Architect, Health Market Science (http://healthmarketscience.com)
> > mobile:215.588.6024
> > blog: http://weblogs.java.net/blog/boneill42/
> > blog: http://brianoneill.blogspot.com/
> >
>
>
>
> --
> --
> Thanks,
>
> Charlie (@mujiang) 一个 木匠
> =======
> Data Architect Developer
> http://mujiang.blogspot.com
>



-- 
It's always darkest just before you are eaten by a grue.

Re: Server Side Logic/Script - Triggers / StoreProc

Posted by Data Craftsman <da...@gmail.com>.

Howdy,

Some polyglot persistence (NoSQL) products have started to support server-side
scripting, similar to RDBMS stored procedures.
E.g. Redis Lua scripting.

I hope it will be Python when Cassandra gets a server-side scripting feature.

FYI,

http://antirez.com/post/250

http://nosql.mypopescu.com/post/19949274021/alchemydb-an-integrated-graphdb-rdbms-kv-store

"server side scripting support is an extremely powerful tool. Having
processing close to data (i.e. data locality) is a well known
advantage, ..., it can open the doors to completely new features."

Thanks,

Charlie (@mujiang) 一个 木匠
=======
Data Architect Developer
http://mujiang.blogspot.com

On Sun, Apr 22, 2012 at 9:35 AM, Brian O'Neill <bo...@gmail.com> wrote:
> Praveen,
>
> We are certainly interested. To get things moving we implemented an add-on
> for Cassandra to demonstrate the viability (using AOP):
> https://github.com/hmsonline/cassandra-triggers
>
> Right now the implementation executes triggers asynchronously, allowing you
> to implement a java interface and plugin your own java class that will get
> called for every insert.
>
> Per the discussion on 1311, we intend to extend our proof of concept to be
> able to invoke scripts as well.  (minimally we'll enable javascript, but
> we'll probably allow for ruby and groovy as well)
>
> -brian
>
> On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
>
> I found that Triggers are coming in Cassandra 1.2
> (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of any
> StoreProc like pattern.
>
> I know this has been discussed so many times but never met with
> any initiative. Even Groovy was staged out of the trunk.
>
> Cassandra is great for logging and as such will be infinitely more useful if
> some logic can be pushed into the Cassandra cluster nearer to the location
> of Data to generate a materialized view useful for applications.
>
> Server Side Scripts/Routines in Distributed Databases could soon prove to be
> the differentiating factor.
>
> Let me reiterate things with a use case.
>
> In our application we store time series data in wide rows with TTL set on
> each point to prevent data from growing beyond acceptable limits. Still the
> data size can be a limiting factor to move all of it from the cluster node
> to the querying node and then to the application via thrift for processing
> and presentation.
>
> Ideally we should process the data on the residing node and pass only the
> materialized view of the data upstream. This should be trivial if Cassandra
> implements some sort of server side scripting and CQL semantics to call it.
>
> Is anybody else interested in a similar feature? Is it being worked on? Are
> there any alternative strategies to this problem?
>
> Praveen
>
>
>
> --
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/
>



-- 
--
Thanks,

Charlie (@mujiang) 一个 木匠
=======
Data Architect Developer
http://mujiang.blogspot.com

Re: question about updates internal work in case of cache

Posted by Sylvain Lebresne <sy...@datastax.com>.

On Mon, Apr 23, 2012 at 10:19 AM, DE VITO Dominique
<do...@thalesgroup.com> wrote:
> Hi,
>
>
>
> Let's suppose a column (name+value) is cached in memory, with timestamp T.
>
>
>
> 1) An update, for this column, arrives with exactly the *same* timestamp,
> and the *same* value.
>
> Is the commitlog updated ?
>
>
>
> 2) An update, for this column, arrives with a timestamp < T.
>
> Is the commitlog updated ?
>

Yes to both: the commit log is always updated. In fact, the commit log
insertion is done in parallel with, and independently of, in-memory updates
(which include cache updates).
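
As a toy illustration of that answer (not Cassandra's actual code), the sketch
below appends every mutation to a log without looking at in-memory state, while
the in-memory view keeps a cell only if it is newer. The `ToyNode` class and
its fields are hypothetical names, the two steps run sequentially here rather
than in parallel, and tie-breaking between different values with the same
timestamp is deliberately left out.

```python
class ToyNode:
    """Toy model of a write path; all names here are hypothetical."""

    def __init__(self):
        self.commit_log = []  # append-only record of every mutation received
        self.memtable = {}    # column name -> (timestamp, value)

    def write(self, name, value, timestamp):
        # The log append never consults what is already held in memory.
        self.commit_log.append((name, value, timestamp))
        # The in-memory view keeps the existing cell unless the new one is newer.
        current = self.memtable.get(name)
        if current is None or timestamp > current[0]:
            self.memtable[name] = (timestamp, value)


node = ToyNode()
node.write("col", "v", 10)  # initial write with timestamp T = 10
node.write("col", "v", 10)  # case 1: same timestamp, same value
node.write("col", "v", 5)   # case 2: timestamp < T
print(len(node.commit_log), node.memtable["col"])
```

In this model both updates from the question reach the log even though neither
changes what a reader would see.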

--
Sylvain

>
>
> Thanks for your help.
>
>
>
> Regards,
>
> Dominique
>
>

question about updates internal work in case of cache

Posted by DE VITO Dominique <do...@thalesgroup.com>.

Hi,

Let's suppose a column (name+value) is cached in memory, with timestamp T.

1) An update, for this column, arrives with exactly the *same* timestamp, and the *same* value.
Is the commitlog updated ?

2) An update, for this column, arrives with a timestamp < T.
Is the commitlog updated ?

Thanks for your help.

Regards,
Dominique

Re: Server Side Logic/Script - Triggers / StoreProc

Posted by aaron morton <aa...@thelastpickle.com>.

Out of interest some questions…

When writing through triggers, how do you handle the consistency level (CL) guarantee? Is the CL checked once at the start, or for each embedded code invocation?

Do you still guarantee that (non-counter) writes are idempotent, i.e. do the triggers need to be deterministic? Can clients retry operations that timed out?
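
To make the determinism question concrete, here is a small sketch in Python. It
does not describe how any existing trigger implementation behaves; the store,
the `apply` rule, and both trigger functions are hypothetical. It only shows
why a client retry is harmless when the trigger's output depends solely on the
triggering write, and why it is not when the trigger reads hidden state.

```python
import itertools


def apply(store, key, value, timestamp):
    """Last-write-wins by (timestamp, value); replaying a write changes nothing."""
    current = store.get(key)
    if current is None or (timestamp, value) > current:
        store[key] = (timestamp, value)


def deterministic_trigger(store, key, value, timestamp):
    # The derived write depends only on the triggering write.
    apply(store, "index:" + value, key, timestamp)


_sequence = itertools.count()  # hidden state outside the write itself


def nondeterministic_trigger(store, key, value, timestamp):
    # The derived key changes on every invocation, so a replay adds a new cell.
    apply(store, "audit:%d" % next(_sequence), key, timestamp)


def write_with_trigger(store, trigger, key, value, timestamp):
    apply(store, key, value, timestamp)
    trigger(store, key, value, timestamp)


safe, unsafe = {}, {}
for _ in range(2):  # the second pass simulates a client retry after a timeout
    write_with_trigger(safe, deterministic_trigger, "k", "v", 1)
    write_with_trigger(unsafe, nondeterministic_trigger, "k", "v", 1)
print(len(safe), len(unsafe))
```

After the simulated retry the first store is unchanged, while the second has
gained an extra derived cell.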

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/04/2012, at 5:13 AM, Colin Clark wrote:

> In my opinion, triggers/stored procedures are an absolute requirement for any distributed database.
> 
> We've been using stored procedures in Cassandra now for a while, we've made modifications such that we don't really write directly anymore but pass everything through either a default stored procedures (which is just what was there before) or a dynamically loaded piece of java.
> 
> These stored procedures can call other dynamically loaded pieces of java as well - we don't have any plans to implement any scripting capabilities.  We can also 'select' from procedures.
> 
> The idea of downloading data from a distributed data base for processing flies in the face of what nosql and bigdata is all about - you've got to do it in the db.
> 
> On Apr 22, 2012, at 11:35 AM, Brian O'Neill wrote:
> 
>> Praveen,
>> 
>> We are certainly interested. To get things moving we implemented an add-on for Cassandra to demonstrate the viability (using AOP):
>> https://github.com/hmsonline/cassandra-triggers
>> 
>> Right now the implementation executes triggers asynchronously, allowing you to implement a java interface and plugin your own java class that will get called for every insert.
>> 
>> Per the discussion on 1311, we intend to extend our proof of concept to be able to invoke scripts as well.  (minimally we'll enable javascript, but we'll probably allow for ruby and groovy as well)
>> 
>> -brian
>> 
>> On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
>> 
>>> I found that Triggers are coming in Cassandra 1.2 (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of any StoreProc like pattern.
>>> 
>>> I know this has been discussed so many times but never met with any initiative. Even Groovy was staged out of the trunk.
>>> 
>>> Cassandra is great for logging and as such will be infinitely more useful if some logic can be pushed into the Cassandra cluster nearer to the location of Data to generate a materialized view useful for applications.
>>> 
>>> Server Side Scripts/Routines in Distributed Databases could soon prove to be the differentiating factor.
>>> 
>>> Let me reiterate things with a use case.
>>> 
>>> In our application we store time series data in wide rows with TTL set on each point to prevent data from growing beyond acceptable limits. Still the data size can be a limiting factor to move all of it from the cluster node to the querying node and then to the application via thrift for processing and presentation.
>>> 
>>> Ideally we should process the data on the residing node and pass only the materialized view of the data upstream. This should be trivial if Cassandra implements some sort of server side scripting and CQL semantics to call it.
>>> 
>>> Is anybody else interested in a similar feature? Is it being worked on? Are there any alternative strategies to this problem?
>>> 
>>> Praveen
>>> 
>>> 
>> 
>> -- 
>> Brian ONeill
>> Lead Architect, Health Market Science (http://healthmarketscience.com)
>> mobile:215.588.6024
>> blog: http://weblogs.java.net/blog/boneill42/
>> blog: http://brianoneill.blogspot.com/
>> 
>

Re: Server Side Logic/Script - Triggers / StoreProc

Posted by Praveen Baratam <pr...@gmail.com>.

Hello Colin,

Do you mean that you have made unpublished modifications to Cassandra so
that you can execute logic in the database on both the read and write paths?

Do you have any plans to open your code base and document it?

Praveen

On Sun, Apr 22, 2012 at 10:43 PM, Colin Clark <co...@gmail.com> wrote:

> In my opinion, triggers/stored procedures are an absolute requirement for
> any distributed database.
>
> We've been using stored procedures in Cassandra now for a while, we've
> made modifications such that we don't really write directly anymore but
> pass everything through either a default stored procedures (which is just
> what was there before) or a dynamically loaded piece of java.
>
> These stored procedures can call other dynamically loaded pieces of java
> as well - we don't have any plans to implement any scripting capabilities.
>  We can also 'select' from procedures.
>
> The idea of downloading data from a distributed data base for processing
> flies in the face of what nosql and bigdata is all about - you've got to do
> it in the db.
>
> On Apr 22, 2012, at 11:35 AM, Brian O'Neill wrote:
>
> > Praveen,
> >
> > We are certainly interested. To get things moving we implemented an
> add-on for Cassandra to demonstrate the viability (using AOP):
> > https://github.com/hmsonline/cassandra-triggers
> >
> > Right now the implementation executes triggers asynchronously, allowing
> you to implement a java interface and plugin your own java class that will
> get called for every insert.
> >
> > Per the discussion on 1311, we intend to extend our proof of concept to
> be able to invoke scripts as well.  (minimally we'll enable javascript, but
> we'll probably allow for ruby and groovy as well)
> >
> > -brian
> >
> > On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
> >
> >> I found that Triggers are coming in Cassandra 1.2 (
> https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of
> any StoreProc like pattern.
> >>
> >> I know this has been discussed so many times but never met with any
> initiative. Even Groovy was staged out of the trunk.
> >>
> >> Cassandra is great for logging and as such will be infinitely more
> useful if some logic can be pushed into the Cassandra cluster nearer to the
> location of Data to generate a materialized view useful for applications.
> >>
> >> Server Side Scripts/Routines in Distributed Databases could soon prove
> to be the differentiating factor.
> >>
> >> Let me reiterate things with a use case.
> >>
> >> In our application we store time series data in wide rows with TTL set
> on each point to prevent data from growing beyond acceptable limits. Still
> the data size can be a limiting factor to move all of it from the cluster
> node to the querying node and then to the application via thrift for
> processing and presentation.
> >>
> >> Ideally we should process the data on the residing node and pass only
> the materialized view of the data upstream. This should be trivial if
> Cassandra implements some sort of server side scripting and CQL semantics
> to call it.
> >>
> >> Is anybody else interested in a similar feature? Is it being worked on?
> Are there any alternative strategies to this problem?
> >>
> >> Praveen
> >>
> >>
> >
> > --
> > Brian ONeill
> > Lead Architect, Health Market Science (http://healthmarketscience.com)
> > mobile:215.588.6024
> > blog: http://weblogs.java.net/blog/boneill42/
> > blog: http://brianoneill.blogspot.com/
> >
>
>

Re: Server Side Logic/Script - Triggers / StoreProc

Posted by Colin Clark <co...@gmail.com>.

In my opinion, triggers/stored procedures are an absolute requirement for any distributed database.

We've been using stored procedures in Cassandra for a while now. We've made modifications such that we no longer write directly but pass everything through either a default stored procedure (which is just what was there before) or a dynamically loaded piece of Java.

These stored procedures can call other dynamically loaded pieces of java as well - we don't have any plans to implement any scripting capabilities.  We can also 'select' from procedures.
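
The modifications described here are not published, so the following is only a
guess at the general shape, written in Python rather than Java. The
`ProcedureRegistry` class, the `default_write` procedure, and the
`write_with_total` procedure are all hypothetical, and dynamic class loading is
left out entirely.

```python
class ProcedureRegistry:
    """Routes every write through a named procedure (hypothetical design)."""

    def __init__(self):
        self.store = {}       # row key -> {column: value}
        self.procedures = {}  # procedure name -> callable
        self.register("default", default_write)

    def register(self, name, func):
        self.procedures[name] = func

    def call(self, name, *args):
        # Each procedure receives the registry so it can call other procedures.
        return self.procedures[name](self, *args)


def default_write(registry, key, columns):
    # Pass-through: the plain write that would have happened anyway.
    registry.store.setdefault(key, {}).update(columns)


def write_with_total(registry, key, columns):
    # A custom procedure that adds a derived column, then delegates.
    enriched = dict(columns, total=sum(columns.values()))
    registry.call("default", key, enriched)


registry = ProcedureRegistry()
registry.register("with_total", write_with_total)
registry.call("with_total", "row1", {"a": 1, "b": 2})
print(registry.store["row1"])
```

The point of the sketch is only the routing: callers never touch the store
directly, and one procedure can build on another.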

The idea of downloading data from a distributed database for processing flies in the face of what NoSQL and big data are all about - you've got to do it in the db.

On Apr 22, 2012, at 11:35 AM, Brian O'Neill wrote:

> Praveen,
> 
> We are certainly interested. To get things moving we implemented an add-on for Cassandra to demonstrate the viability (using AOP):
> https://github.com/hmsonline/cassandra-triggers
> 
> Right now the implementation executes triggers asynchronously, allowing you to implement a java interface and plugin your own java class that will get called for every insert.
> 
> Per the discussion on 1311, we intend to extend our proof of concept to be able to invoke scripts as well.  (minimally we'll enable javascript, but we'll probably allow for ruby and groovy as well)
> 
> -brian
> 
> On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
> 
>> I found that Triggers are coming in Cassandra 1.2 (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of any StoreProc like pattern.
>> 
>> I know this has been discussed so many times but never met with any initiative. Even Groovy was staged out of the trunk.
>> 
>> Cassandra is great for logging and as such will be infinitely more useful if some logic can be pushed into the Cassandra cluster nearer to the location of Data to generate a materialized view useful for applications.
>> 
>> Server Side Scripts/Routines in Distributed Databases could soon prove to be the differentiating factor.
>> 
>> Let me reiterate things with a use case.
>> 
>> In our application we store time series data in wide rows with TTL set on each point to prevent data from growing beyond acceptable limits. Still the data size can be a limiting factor to move all of it from the cluster node to the querying node and then to the application via thrift for processing and presentation.
>> 
>> Ideally we should process the data on the residing node and pass only the materialized view of the data upstream. This should be trivial if Cassandra implements some sort of server side scripting and CQL semantics to call it.
>> 
>> Is anybody else interested in a similar feature? Is it being worked on? Are there any alternative strategies to this problem?
>> 
>> Praveen
>> 
>> 
> 
> -- 
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/
>

Re: Server Side Logic/Script - Triggers / StoreProc

Posted by Brian O'Neill <bo...@gmail.com>.

Praveen,

We are certainly interested. To get things moving we implemented an add-on for Cassandra to demonstrate the viability (using AOP):
https://github.com/hmsonline/cassandra-triggers

Right now the implementation executes triggers asynchronously, allowing you to implement a Java interface and plug in your own Java class that will get called for every insert.
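
The pattern described here (an interface to implement, invoked asynchronously
on each insert) can be sketched as follows. This is not the cassandra-triggers
API: the sketch is in Python rather than Java, and the `Trigger`,
`CollectingTrigger`, and `Store` names are hypothetical.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class Trigger(ABC):
    """Interface a user implements (hypothetical)."""

    @abstractmethod
    def process(self, key, columns):
        """Called once for every insert."""


class CollectingTrigger(Trigger):
    """Example plug-in that simply records what it was given."""

    def __init__(self):
        self.seen = []

    def process(self, key, columns):
        self.seen.append((key, columns))


class Store:
    def __init__(self):
        self.rows = {}
        self.triggers = []
        self._pool = ThreadPoolExecutor(max_workers=2)

    def insert(self, key, columns):
        self.rows.setdefault(key, {}).update(columns)
        # Hand each trigger a copy and return at once; the write does not wait.
        return [self._pool.submit(t.process, key, dict(columns))
                for t in self.triggers]

    def close(self):
        self._pool.shutdown(wait=True)


store = Store()
collector = CollectingTrigger()
store.triggers.append(collector)
for future in store.insert("row1", {"a": 1}):
    future.result()  # wait here only so the demonstration is repeatable
store.close()
print(collector.seen)
```

Because the write returns before the trigger finishes, a caller that needs the
trigger's effect has to wait on the returned futures, which is the trade-off of
asynchronous execution.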

Per the discussion on 1311, we intend to extend our proof of concept to be able to invoke scripts as well.  (minimally we'll enable javascript, but we'll probably allow for ruby and groovy as well)

-brian

On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:

> I found that Triggers are coming in Cassandra 1.2 (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of any StoreProc like pattern.
> 
> I know this has been discussed so many times but never met with any initiative. Even Groovy was staged out of the trunk.
> 
> Cassandra is great for logging and as such will be infinitely more useful if some logic can be pushed into the Cassandra cluster nearer to the location of Data to generate a materialized view useful for applications.
> 
> Server Side Scripts/Routines in Distributed Databases could soon prove to be the differentiating factor.
> 
> Let me reiterate things with a use case.
> 
> In our application we store time series data in wide rows with TTL set on each point to prevent data from growing beyond acceptable limits. Still the data size can be a limiting factor to move all of it from the cluster node to the querying node and then to the application via thrift for processing and presentation.
> 
> Ideally we should process the data on the residing node and pass only the materialized view of the data upstream. This should be trivial if Cassandra implements some sort of server side scripting and CQL semantics to call it.
> 
> Is anybody else interested in a similar feature? Is it being worked on? Are there any alternative strategies to this problem?
> 
> Praveen
> 
> 

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/
