Posted to dev@cassandra.apache.org by Łukasz Dywicki <lu...@code-house.org> on 2013/07/09 15:02:15 UTC

Configuration of network connectors

Hello,
First of all, I would like to say hello to the Cassandra user and developer community. :)

I am writing because we use Cassandra in our unit tests and have trouble with network connectivity. We cannot run multiple Cassandra instances during tests because we would need to randomize the port configuration and so on. For now, if we try to fork our tests we get "address already in use" on one of two ports, native or Thrift. In other Apache projects (ActiveMQ, Camel, Mina) we can use "VM" connectors based on an in-memory queue. I took some time to see how CassandraDaemon starts its servers, and it is more or less hardcoded. I thought about changing the configuration to be more like:

servers:
  - class org.apache.cassandra.thrift.ThriftServer
  - class org.apache.cassandra.transport.Server

We could then disable these network servers in unit tests and configure an in-VM one instead:
servers:
  - class org.apache.cassandra.vm.VmServer

This requires some small changes in the daemon code and client libraries. I am not deeply involved in Cassandra, so I don't know the internal architecture or the implications; I look forward to discussing this topic with you.
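
To make the proposal concrete, here is a minimal sketch of how a daemon could instantiate servers from a list of configured class names. Everything in it is an assumption made for illustration: the Server interface, the VmServer stand-in, and the load helper are hypothetical and do not exist in Cassandra, and the real CassandraDaemon may need a richer lifecycle than start and stop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are part of Cassandra.
final class PluggableServers {

    /** Minimal lifecycle contract a daemon could require from each configured server. */
    interface Server {
        void start();
        void stop();
        boolean isRunning();
    }

    /** Stand-in for the proposed in-VM connector: it opens no sockets at all. */
    static final class VmServer implements Server {
        private volatile boolean running;

        public VmServer() {}

        @Override public void start() { running = true; }
        @Override public void stop() { running = false; }
        @Override public boolean isRunning() { return running; }
    }

    /** Instantiates each configured class name through its public no-arg constructor. */
    static List<Server> load(List<String> classNames) throws ReflectiveOperationException {
        List<Server> servers = new ArrayList<>();
        for (String name : classNames) {
            // asSubclass fails fast if a configured class does not implement Server.
            Class<? extends Server> type = Class.forName(name).asSubclass(Server.class);
            servers.add(type.getDeclaredConstructor().newInstance());
        }
        return servers;
    }
}
```

With something like this, the daemon would only iterate over the loaded list and call start() on each entry, so a test configuration that lists only an in-VM server would never bind a port.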

Cheers,
Łukasz Dywicki
--
luke@code-house.org
Twitter: ldywicki
Blog: http://dywicki.pl
Code-House - http://code-house.org


Re: Configuration of network connectors

Posted by Gary Dusbabek <gd...@gmail.com>.
You'd still end up needing to fork tests because of the singleton problem.
(Google that one.)

I say this without trying, but it shouldn't be terribly hard for you to
code up some fixture classes that allow you to test the storage layer, so
long as you can tolerate each test running in a forked VM. Same goes for
the ports, with the exception of JMX iirc.

Gary.



Re: Configuration of network connectors

Posted by Łukasz Dywicki <lu...@code-house.org>.
Jeremy,
Sadly, it does not cover our case. We have unit tests and we want to test really basic things, like the mapping of data stored in Cassandra to our model. For that we don't need a cluster at all, because in unit tests we don't want to test data distribution. We would also like to run everything in the JVM, so CCM, which is written in Python, is not really what we need.
What we are looking for is a minimal Cassandra setup that can be embedded and used concurrently multiple times. For example, we currently use CassandraUnit:

@Rule
public CassandraUnit unit = new CassandraUnit(new EmptyDataSet(), "embedded-cassandra.yaml");

@Test
public void firstTest() {
    // do something with data
}

@Test
public void secondTest() {
    // do something else
}

In this setup JUnit launches a new CassandraDaemon for every test. If we set FORK_MODE to per test, two Cassandra instances may run at the same time. The first test to launch CassandraDaemon will pass; the second may fail due to a port conflict. That is why we thought about testing without the network layer. It could also save some time, which would be great because on some of the older hardware our developers use, a build with all unit tests takes up to 9 minutes. Some of that time is spent starting up and shutting down Cassandra.
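
For comparison, the port randomization mentioned earlier in the thread could look roughly like the sketch below, which asks the operating system for free ports by binding to port 0. The FreePorts class is hypothetical; rpc_port and native_transport_port are the cassandra.yaml keys for the Thrift and native ports. Writing the values into a per-fork copy of the YAML file is left out, and another process could still grab a port between the moment it is released here and the moment Cassandra binds it.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra or CassandraUnit.
final class FreePorts {

    /** Asks the OS for {@code count} distinct free TCP ports by binding to port 0. */
    static List<Integer> allocate(int count) throws IOException {
        List<ServerSocket> sockets = new ArrayList<>();
        List<Integer> ports = new ArrayList<>();
        try {
            // Keep every socket open until all ports are chosen,
            // so the OS cannot hand out the same port twice.
            for (int i = 0; i < count; i++) {
                ServerSocket socket = new ServerSocket(0);
                sockets.add(socket);
                ports.add(socket.getLocalPort());
            }
        } finally {
            for (ServerSocket socket : sockets) {
                socket.close();
            }
        }
        return ports;
    }

    public static void main(String[] args) throws IOException {
        List<Integer> ports = allocate(2);
        // Print the values in the form they would take in a per-fork cassandra.yaml.
        System.out.println("rpc_port: " + ports.get(0));
        System.out.println("native_transport_port: " + ports.get(1));
    }
}
```

This only addresses the two client-facing ports named in the thread; any other port the embedded daemon opens would need the same treatment.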

Cheers,
Łukasz Dywicki
--
luke@code-house.org
Twitter: ldywicki
Blog: http://dywicki.pl
Code-House - http://code-house.org

Message written by Jeremy Hanna <je...@gmail.com> on 9 Jul 2013, at 15:22:

> Have you seen https://github.com/pcmanus/ccm as described in http://www.datastax.com/dev/blog/ccm-a-development-tool-for-creating-local-cassandra-clusters or does that not fit your use case?


Re: Configuration of network connectors

Posted by Jeremy Hanna <je...@gmail.com>.
Have you seen https://github.com/pcmanus/ccm as described in http://www.datastax.com/dev/blog/ccm-a-development-tool-for-creating-local-cassandra-clusters or does that not fit your use case?
