Posted to user@cassandra.apache.org by Nulik Nol <nu...@gmail.com> on 2013/07/26 09:32:37 UTC

Custom commands in cassandra

Hi,
I am at a startup in the development stage, and I want to embed my app's
functionality into Cassandra's server. How might it be done? Some
databases allow you to load server-side extensions or commands that
are executed at the client's request, as a sort of stored procedure.
Other databases let you embed the engine into your app. What would
be the way to do this with Cassandra? My reason for merging the app
into Cassandra's code is to gain speed, so that I run only one
Java VM process across multiple cores on each node.

I would appreciate any comments.

Thanks
Nulik

Re: Custom commands in cassandra

Posted by Jon Haddad <jo...@jonhaddad.com>.

Aside from the problems mentioned below, it's a rare case that tightly coupling your application code directly into your database makes it easier to maintain your codebase, especially as you scale.

If you roll out your custom Cassandra application and then decide you need search, will you also embed Elasticsearch? What if you want to use something that's not written in Java?

Communication protocols were written for a reason.

Jon

On Aug 14, 2013, at 7:51 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

>> They also stuck themselves on Cassandra 0.7 forever.
> To reinforce that point, look at the DataStax site or the last conference for some of the performance metrics comparing 1.2 to 1.0 and before.
> 
> While you are worrying about the transport to Cassandra, the project is making things go faster. IMHO you will get more value for money by ensuring you have timely access to the work other people do.
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 13/08/2013, at 12:12 PM, Robert Coli <rc...@eventbrite.com> wrote:
> 
>> On Mon, Jul 29, 2013 at 12:42 PM, Nulik Nol <nu...@gmail.com> wrote:
>> > Embedding the server will add *a lot* of complexity.
>> 
>> That's a conclusion one might come to at first sight, but if you analyze
>> it, it is the opposite. Complexity increases with code, and
>> communication between processes (like via socket or memory buffer)
>> implies more code, so if you embed and call server objects directly,
>> your code will be simpler.
>> 
>> Leaving aside the somewhat nonsensical result of applying this thinking to computing in general...
>> 
>> ... have you ever spoken with someone who has maintained such a large patchset to a database?
>> 
>> The people I spoke with at the Cassandra Summit who had forked Cassandra 0.7 so that they could remove thrift and get maximum performance from their patched version did in fact get improved performance. They also stuck themselves on Cassandra 0.7 forever.
>> 
>> Complexity is not just measured in lines of code, it derives in part from the maintainability of any given solution. In the open source world, this means being able to rebase your forked project to track/merge with upstream.
>> 
>> =Rob
>> 
>

Re: Custom commands in cassandra

Posted by Aaron Morton <aa...@thelastpickle.com>.

> They also stuck themselves on Cassandra 0.7 forever.
To reinforce that point, look at the DataStax site or the last conference for some of the performance metrics comparing 1.2 to 1.0 and before.

While you are worrying about the transport to Cassandra, the project is making things go faster. IMHO you will get more value for money by ensuring you have timely access to the work other people do.

Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/08/2013, at 12:12 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Mon, Jul 29, 2013 at 12:42 PM, Nulik Nol <nu...@gmail.com> wrote:
> > Embedding the server will add *a lot* of complexity.
> 
> That's a conclusion one might come to at first sight, but if you analyze
> it, it is the opposite. Complexity increases with code, and
> communication between processes (like via socket or memory buffer)
> implies more code, so if you embed and call server objects directly,
> your code will be simpler.
> 
> Leaving aside the somewhat nonsensical result of applying this thinking to computing in general...
> 
> ... have you ever spoken with someone who has maintained such a large patchset to a database?
> 
> The people I spoke with at the Cassandra Summit who had forked Cassandra 0.7 so that they could remove thrift and get maximum performance from their patched version did in fact get improved performance. They also stuck themselves on Cassandra 0.7 forever.
> 
> Complexity is not just measured in lines of code, it derives in part from the maintainability of any given solution. In the open source world, this means being able to rebase your forked project to track/merge with upstream.
> 
> =Rob
>

Re: Custom commands in cassandra

Posted by Robert Coli <rc...@eventbrite.com>.

On Mon, Jul 29, 2013 at 12:42 PM, Nulik Nol <nu...@gmail.com> wrote:

> > Embedding the server will add *a lot* of complexity.
>
> That's a conclusion one might come to at first sight, but if you analyze
> it, it is the opposite. Complexity increases with code, and
> communication between processes (like via socket or memory buffer)
> implies more code, so if you embed and call server objects directly,
> your code will be simpler.

Leaving aside the somewhat nonsensical result of applying this thinking to
computing in general...

... have you ever spoken with someone who has maintained such a large
patchset to a database?

The people I spoke with at the Cassandra Summit who had forked Cassandra
0.7 so that they could remove thrift and get maximum performance from their
patched version did in fact get improved performance. They also stuck
themselves on Cassandra 0.7 forever.

Complexity is not just measured in lines of code; it derives in part from
the maintainability of any given solution. In the open source world, this
means being able to rebase your forked project to track/merge with upstream.

=Rob

Re: Custom commands in cassandra

Posted by Nulik Nol <nu...@gmail.com>.

hi

> Have you identified issues where throughput or latency is an issue ?
No, I am at the design stage of my app, and I want to build it the fastest
way possible from the beginning.
> Most performance gains are going to be made by getting the data model right.
I hope to get it right, and with embedding I will get even more performance.
>
>
> Embedding the server will add *a lot* of complexity.

That's a conclusion one might come to at first sight, but if you analyze
it, it is the opposite. Complexity increases with code, and
communication between processes (such as via a socket or memory buffer)
implies more code, so if you embed and call server objects directly,
your code will be simpler. The JVM already implements a security model
between objects, so no isolation of app code from server code is
necessary. On the contrary, the performance gains will be much bigger,
especially for lots of queries for small objects, because each query
includes communication overhead.
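
To make the per-query overhead argument concrete, here is a minimal
sketch, not Cassandra code. It compares a direct in-process lookup with
the same lookup sent over a loopback socket. Everything in it is a
hypothetical stand-in: the class name CallOverheadSketch, the in-memory
STORE map, and the one-line-per-request protocol are assumptions made
for illustration only. It is also not a rigorous benchmark (no warm-up,
loopback only), so treat the numbers it prints as a rough indication.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CallOverheadSketch {
    // Hypothetical stand-in for the data store; this is NOT Cassandra.
    static final Map<String, String> STORE = new ConcurrentHashMap<>();

    // "Embedded" path: the caller invokes the store directly, in-process.
    static String directGet(String key) {
        return STORE.get(key);
    }

    // "Client/server" path: one request line out, one response line back.
    static String socketGet(BufferedReader in, PrintWriter out, String key) throws IOException {
        out.println(key);      // autoflush is on, so this sends the request
        return in.readLine();  // blocks until the server answers
    }

    // Serves a single client: answers each key line with the stored value.
    static void serve(ServerSocket server) {
        Thread t = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                for (String key = in.readLine(); key != null; key = in.readLine()) {
                    out.println(directGet(key));
                }
            } catch (IOException ignored) {
                // The client went away; nothing to clean up in this sketch.
            }
        });
        t.setDaemon(true);
        t.start();
    }

    // Returns {directNanos, socketNanos, mismatches} for n lookups on each path.
    static long[] run(int n) {
        STORE.put("k", "v");
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            serve(server);
            try (Socket c = new Socket(server.getInetAddress(), server.getLocalPort());
                 BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
                 PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
                c.setTcpNoDelay(true);
                long mismatches = 0;

                long t0 = System.nanoTime();
                for (int i = 0; i < n; i++) {
                    if (!"v".equals(directGet("k"))) mismatches++;
                }
                long direct = System.nanoTime() - t0;

                t0 = System.nanoTime();
                for (int i = 0; i < n; i++) {
                    if (!"v".equals(socketGet(in, out, "k"))) mismatches++;
                }
                long viaSocket = System.nanoTime() - t0;

                return new long[] {direct, viaSocket, mismatches};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int n = 20_000;
        long[] r = run(n);
        System.out.printf("direct: %d ns/call, loopback socket: %d ns/call, mismatches: %d%n",
                r[0] / n, r[1] / n, r[2]);
    }
}
```

Running main prints the average cost per lookup for each path. Whether
that difference matters next to the cost of the storage work itself
depends on the workload.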

By the way, I already found a good example of embedding; it is implemented
in a graph database called Titan. Here are the files to look at:
https://github.com/thinkaurelius/titan/tree/master/titan-cassandra/src/main/java/com/thinkaurelius/titan/diskstorage/cassandra/embedded

Regards

Re: Custom commands in cassandra

Posted by aaron morton <aa...@thelastpickle.com>.

>  My reasons to merge the app
> into cassandra's code is to gain speed so I will be running only one
> java vm process along multiple cores inside a node.
Have you identified cases where throughput or latency is actually a problem?
Most performance gains are going to be made by getting the data model right. 

Embedding the server will add *a lot* of complexity. 
 
Cheers

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 27/07/2013, at 9:35 AM, Radim Kolar <hs...@filez.com> wrote:

> Basic OSGi integration is easy.
> 
> You need to get an OSGi-compatible container and hook it up to the Cassandra daemon. It's very easy to do: about 5 lines.
> The OSGi container can be accessed from the network. You need to deploy your application into the container on each node and start it up. Then use some RPC mechanism like Thrift or JMS for communication with the server part of your application.

Re: Custom commands in cassandra

Posted by Radim Kolar <hs...@filez.com>.

Basic OSGi integration is easy.

You need to get an OSGi-compatible container and hook it up to the
Cassandra daemon. It's very easy to do: about 5 lines.
The OSGi container can be accessed from the network. You need to deploy
your application into the container on each node and start it up. Then
use some RPC mechanism like Thrift or JMS for communication with the
server part of your application.

Re: Custom commands in cassandra

Posted by Nulik Nol <nu...@gmail.com>.

On Fri, Jul 26, 2013 at 4:31 AM, Radim Kolar <hs...@filez.com> wrote:
>
>> What would be the way to do this with cassandra?
>
> embed app into server, use OSGi.
Thanks, but a quick search of Cassandra's source didn't return any
word like "osgi". Are you sure I can embed my code into Cassandra?
Could you tell me the class name, or give me a link to an example?

Regards
Nulik

Re: Custom commands in cassandra

Posted by Radim Kolar <hs...@filez.com>.

> What would be the way to do this with cassandra?
Embed the app into the server; use OSGi.