Posted to user@cassandra.apache.org by Edward Capriolo <ed...@gmail.com> on 2013/12/24 21:31:49 UTC

Cassandra unit testing becoming nearly impossible: suggesting alternative.

I am not sure how many people have been around developing Cassandra
for as long as I have, but the state of all the client libraries and the
Cassandra server is WORD_I_DONT_WANT_TO_SAY.

Here is an example of something I am seeing:
ERROR 14:59:45,845 Exception in thread Thread[Thrift:5,5,main]
java.lang.AbstractMethodError: org.apache.thrift.ProcessFunction.isOneway()Z
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:51)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
DEBUG 14:59:51,654 retryPolicy for schema_triggers is 0.99

In short: If you are new to cassandra and only using the newest client I am
sure everything is peachy for you.

For people who have been using Cassandra for a while it is harder to "jump
ship" when something better comes along. Sometimes you need to support both
Hector and Astyanax; it happens.

For a while I have been using Hector, not even so much as an API: the one
nice thing I got from Hector was a simple EmbeddedServer that would clean
up after itself. Hector seems badly broken at the moment. I have no idea
how the current versions track with anything out there in the Cassandra
world.

For a while I played with https://github.com/Netflix/astyanax, which has
its own version and schemes and dependent libraries. (Astyanax has some
packaging error that forces me into Maven 3.)

Enter Cassandra 2.0, which forces you into Java 7. Besides that, it has
its own kit of things it seems to want.

Since Hector's embedded server does not work, I am guessing I should go to
https://github.com/jsevellec/cassandra-unit, but I am not sure... really...
how anyone does this anymore. I am sure I could dive into the source code
and figure this out, but I would rather have a stable piece of code that
brings up the embedded server, one that "just works" and "continues
working".

I cannot seem to get cassandra-unit working right either (I see from the
pom that it includes Hector).

Between Thrift, Cassandra, and client X, it is almost impossible to build a
sane classpath, and that is not even counting the fact that people have
their own classpath issues (Guava mismatches, etc.).
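
An AbstractMethodError like the one above is consistent with a class being
compiled against one version of a library and run against another. When
chasing that kind of mismatch, it can help to print which jar each suspect
class was actually loaded from. The following is only a minimal sketch using
plain JDK calls; the class name WhichJar and the default class names probed
in main are illustrative choices, not part of any client library.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper: reports the jar (or directory) a class was loaded from.
final class WhichJar {

    /** Returns the code source location of the class, or a placeholder if none is recorded. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null) {
            return "(bootstrap or unknown)"; // JDK core classes usually have no code source
        }
        URL location = source.getLocation();
        return location == null ? "(bootstrap or unknown)" : location.toString();
    }

    /** Looks the class up by name without initializing it. */
    static String locationOf(String className) {
        try {
            Class<?> type = Class.forName(className, false, WhichJar.class.getClassLoader());
            return locationOf(type);
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath: " + className + ")";
        }
    }

    public static void main(String[] args) {
        // Default suspects are assumptions based on the stack trace and the Guava remark.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.thrift.ProcessFunction",
            "com.google.common.collect.ImmutableList"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

Run it with the same classpath as the failing test; if the Thrift class
resolves to a jar other than the one your Cassandra version expects, that is
a likely place to start looking.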

I think the only sane thing to do is start shipping cassandra-embedded like
this:

https://github.com/kstyrc/embedded-redis

In other words, package embedded-cassandra as a binary. Don't force the
client/application developer to bring Cassandra onto the classpath and fight
with mismatches in Thrift, Guava, etc. That, or provide a completely shaded
Cassandra server for embedded testing. As it stands now, trying to support a
setup that uses more than one client or works with multiple versions of
Cassandra is a major PITA (e.g. library X compiled against 1.2.0, library Y
compiled against 2.0.3).

Does anyone have any thoughts on this, or tried something similar?

Edward

Re: Cassandra unit testing becoming nearly impossible: suggesting alternative.

Posted by Joe Stein <cr...@gmail.com>.
I updated my repo with Vagrant and bash scripts to install Cassandra 2.0.3
https://github.com/stealthly/scala-cassandra/

0) git clone https://github.com/stealthly/scala-cassandra
1) cd scala-cassandra
2) vagrant up

Cassandra will be running in the virtual machine on 172.16.7.2 and is
accessible from your host machine (cqlsh, your app, whatever).

To verify, step 3 would be ./sbt test, just to make sure everything is
running right.
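
If the tests start right after the VM boots, it may be worth waiting until
Cassandra is actually listening before running them. A minimal sketch using
only JDK sockets follows; the class name PortProbe, the timeouts, and the use
of port 9160 (the default Thrift rpc_port) are assumptions, so adjust them to
whatever the VM is configured with.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: polls a TCP port until something accepts connections.
final class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, or timed out
        }
    }

    /** Retries isOpen until it succeeds or maxWaitMillis has elapsed. */
    static boolean waitUntilOpen(String host, int port, long maxWaitMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        do {
            if (isOpen(host, port, 1000)) {
                return true;
            }
            Thread.sleep(250); // short pause between attempts
        } while (System.currentTimeMillis() < deadline);
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // 172.16.7.2 is the VM address mentioned above; 9160 is an assumed default.
        boolean up = waitUntilOpen("172.16.7.2", 9160, 60_000);
        System.out.println(up ? "Cassandra port is reachable" : "Gave up waiting");
    }
}
```

An open port only shows that the listener is up, not that the schema your
tests need exists, so this is a first gate rather than a full health check.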

Every time you rebuild the VM (takes a minute or two) it is a whole new
instance. If you fork a foreground process you have to worry about the
data, about it not being isolated, and other stuff.

On Fri, Dec 27, 2013 at 10:48 PM, Edward Capriolo <ed...@gmail.com>wrote:

> I think I will invest the time in launching Cassandra in a forked
> foreground process, maybe building the yaml dynamically.

Re: Cassandra unit testing becoming nearly impossible: suggesting alternative.

Posted by Edward Capriolo <ed...@gmail.com>.
I think I will invest the time in launching Cassandra in a forked
foreground process, maybe building the yaml dynamically.
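
A rough sketch of what that could look like, using only the JDK. Everything
named here is an assumption for illustration: the class ForkedCassandra, the
${...} placeholders (which you would put into your own copy of the stock
cassandra.yaml), and the idea that the start script passes JVM_OPTS from the
environment through to the JVM, which should be checked against the
bin/cassandra of the version in use.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical launcher: renders a yaml from a template and forks "cassandra -f".
final class ForkedCassandra {

    /** Replaces each ${key} placeholder in the template with its value. */
    static String render(String template, Map<String, String> values) {
        String out = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            out = out.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return out;
    }

    /** Builds the command that runs Cassandra in the foreground with the given yaml. */
    static ProcessBuilder command(Path cassandraHome, Path yaml) {
        String script = cassandraHome.resolve("bin").resolve("cassandra").toString();
        ProcessBuilder builder = new ProcessBuilder(script, "-f");
        // Assumption: the start script appends to, rather than replaces, JVM_OPTS.
        builder.environment().put("JVM_OPTS", "-Dcassandra.config=" + yaml.toUri());
        builder.redirectErrorStream(true);
        return builder;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 2) {
            System.err.println("usage: ForkedCassandra <cassandra-home> <yaml-template>");
            return;
        }
        // Fresh directories per run, so each test run starts with no data.
        Path workDir = Files.createTempDirectory("cassandra-test");
        Map<String, String> values = new LinkedHashMap<>();
        values.put("data_dir", workDir.resolve("data").toString());
        values.put("commitlog_dir", workDir.resolve("commitlog").toString());
        values.put("saved_caches_dir", workDir.resolve("saved_caches").toString());

        String template = new String(
                Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
        Path yaml = workDir.resolve("cassandra.yaml");
        Files.write(yaml, render(template, values).getBytes(StandardCharsets.UTF_8));

        Process cassandra = command(Paths.get(args[0]), yaml).inheritIO().start();
        // Stop the child when this JVM exits so nothing is left running.
        Runtime.getRuntime().addShutdownHook(new Thread(cassandra::destroy));
        cassandra.waitFor();
    }
}
```

Because the server runs in its own JVM, its Thrift and Guava versions never
touch the test classpath, which is the main point of forking.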

On Friday, December 27, 2013, Nate McCall <na...@thelastpickle.com> wrote:
> I've also moved on to a container-based setup (using Vagrant+Docker) for
> doing automated integration stuff. This is more difficult to configure for
> build systems like Jenkins, but it can be done, and once completed the
> benefits are substantial - as Joe notes, the most immediate is the removal
> of variance between different environments.
>
> However, for in-process testing with Maven or similar, the Usergrid
> project [0] probably has the most functionally advanced test architecture
> [1]. Do understand that it took us a very long time to get there and that
> it involves some fairly tight integration with JUnit and (to a lesser
> degree) Maven.
>
> The UG plumbing is purpose-built for a specific data model, so it's not
> something that can just be dropped in, but it can be pulled apart in a
> straightforward way (provided you understand JUnit - which is not really
> trivial) and generalized pretty easily. It's all ASF-licensed, so take what
> you need if you find it useful.
>
> [0] https://usergrid.incubator.apache.org/
> [1] https://github.com/usergrid/usergrid/blob/master/stack/test-utils/src/main/java/org/usergrid/cassandra/CassandraResource.java

-- 
Sorry this was sent from mobile. Will do less grammar and spell check than
usual.

Re: Cassandra unit testing becoming nearly impossible: suggesting alternative.

Posted by Nate McCall <na...@thelastpickle.com>.
I've also moved on to a container-based setup (using Vagrant+Docker) for
doing automated integration stuff. This is more difficult to configure for
build systems like Jenkins, but it can be done, and once completed the
benefits are substantial - as Joe notes, the most immediate is the removal
of variance between different environments.

However, for in-process testing with Maven or similar, the Usergrid project
[0] probably has the most functionally advanced test architecture [1]. Do
understand that it took us a very long time to get there and that it
involves some fairly tight integration with JUnit and (to a lesser degree)
Maven.

The UG plumbing is purpose-built for a specific data model, so it's not
something that can just be dropped in, but it can be pulled apart in a
straightforward way (provided you understand JUnit - which is not really
trivial) and generalized pretty easily. It's all ASF-licensed, so take what
you need if you find it useful.

[0] https://usergrid.incubator.apache.org/
[1] https://github.com/usergrid/usergrid/blob/master/stack/test-utils/src/main/java/org/usergrid/cassandra/CassandraResource.java


On Wed, Dec 25, 2013 at 2:42 PM, Joe Stein <cr...@gmail.com> wrote:

> I have been using Vagrant (e.g.
> https://github.com/stealthly/scala-cassandra/ ), which is 100%
> reproducible across devs and test systems (prod in some cases). I also
> have a Docker setup: https://github.com/pegasussolutions/docker-cassandra .
> I have been doing this more and more with clients to better mimic
> production before production and to smooth the release process from
> development. I also use Packer (scripts to be released soon) to build
> images (http://packer.io).
>
> Love vagrant, packer and docker!!!  Apache Mesos too :)


-- 
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: Cassandra unit testing becoming nearly impossible: suggesting alternative.

Posted by Joe Stein <cr...@gmail.com>.
I have been using Vagrant (e.g. https://github.com/stealthly/scala-cassandra/ ), which is 100% reproducible across devs and test systems (prod in some cases). I also have a Docker setup: https://github.com/pegasussolutions/docker-cassandra . I have been doing this more and more with clients to better mimic production before production and to smooth the release process from development. I also use Packer (scripts to be released soon) to build images (http://packer.io).

Love vagrant, packer and docker!!!  Apache Mesos too :)


/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop
********************************************/


On Dec 25, 2013, at 3:28 PM, horschi <ho...@gmail.com> wrote:

> Hi Ed,
> 
> my opinion on unit testing with C* is: Use the real database, not any embedded crap :-)
> 
> All you need are fast truncates, by which I mean: 
> JVM_OPTS="$JVM_OPTS -Dcassandra.unsafesystem=true" 
> and
> auto_snapshot: false
> 
> This setup works really nicely for me (C* 1.1 and 1.2, have not tested 2.0 yet).
> 
> Imho this setup is better for multiple reasons:
> - No extra classpath issues
> - Faster: Running JUnits and C* in one JVM would require a really large heap (for me at least).
> - Faster: No Cassandra startup every time I run my tests.
> 
> The only downside is that developers must change the properties in their configs.
> 
> cheers,
> Christian

Re: Cassandra unit testing becoming nearly impossible: suggesting alternative.

Posted by horschi <ho...@gmail.com>.
Hi Ed,

my opinion on unit testing with C* is: Use the real database, not any
embedded crap :-)

All you need are fast truncates, by which I mean:
JVM_OPTS="$JVM_OPTS -Dcassandra.unsafesystem=true"
and
auto_snapshot: false

This setup works really nicely for me (C* 1.1 and 1.2, have not tested 2.0
yet).

Imho this setup is better for multiple reasons:
- No extra classpath issues
- Faster: Running JUnits and C* in one JVM would require a really large
heap (for me at least).
- Faster: No Cassandra startup every time I run my tests.

The only downside is that developers must change the properties in their
configs.
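
With fast truncates in place, the per-test reset can be as small as one
TRUNCATE per table. The sketch below is an illustration only: the class
TruncateAll is hypothetical, and the Consumer stands in for whatever "execute
this CQL string" call your client provides, since no particular driver is
assumed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: empties a fixed set of tables before each test.
final class TruncateAll {

    /** Builds one TRUNCATE statement per table, in the order given. */
    static List<String> statements(String keyspace, List<String> tables) {
        List<String> cql = new ArrayList<>();
        for (String table : tables) {
            cql.add("TRUNCATE " + keyspace + "." + table);
        }
        return cql;
    }

    /** Runs the statements through the caller-supplied executor. */
    static void run(String keyspace, List<String> tables, Consumer<String> execute) {
        for (String statement : statements(keyspace, tables)) {
            execute.accept(statement);
        }
    }

    public static void main(String[] args) {
        // Keyspace and table names are made up; printing stands in for a real client call.
        run("test_ks", Arrays.asList("users", "events"), System.out::println);
    }
}
```

In a JUnit setup this would typically be called from the method that runs
before each test, with the client's execute call passed in as the Consumer.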

cheers,
Christian


