Posted to user@cassandra.apache.org by kannan chandrasekaran <ck...@yahoo.com> on 2010/09/08 00:42:32 UTC

Few questions regarding cassandra deployment on windows

Hi All,

We are currently considering Cassandra for our application. 

Platform:
* a single-node cluster
* Windows '08
* 64-bit JVM

For brevity, let "Cassandra service" mean a single-node Cassandra server
running as an embedded service inside a JVM.


My use cases:
1) Start with a schema (a keyspace and a set of column families under it) in a
Cassandra service.
2) Replicate the same schema structure, i.e. add new keyspaces/column families
with different names, of course.
3) Because of some existing limitations in my application, I need to write to
the keyspace/column families from one Cassandra service and read the written
changes from a different Cassandra service. Both the writing and the reading
Cassandra services share the same data directory. I understand that the
application has to take care of any naming collisions.



A couple of questions related to the use cases above:
1) I want to spawn a new JVM and launch Cassandra as an embedded service
programmatically instead of using the startup.bat. Is that possible? Any
pointers in that direction would be really helpful. (use case 1)
2) I understand that there are provisions for live schema changes in 0.7 (thank
you guys!!!), but since I can't use a beta version in production, I am
restricted to 0.6 for now. Is it possible to support use case 2 in 0.6.5? More
specifically, I am planning to make runtime changes to the storage-conf.xml
file, followed by a Cassandra service restart.
3) Can I switch the data directory at run-time? (use case 3) In order not to
disrupt reads while writes are in progress, I am thinking of something like
this: copy the existing data directory to a new location; write to the new data
directory; once the writes are complete, switch pointers and restart the
Cassandra service so that it reads from the new directory and picks up the
changes.

Any help is greatly appreciated.

Thanks 
Kannan

Re: Few questions regarding cassandra deployment on windows

Posted by kannan chandrasekaran <ck...@yahoo.com>.

Thank you. That was helpful. But the comments section of
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/
says that the embedded server cannot be shut down unless the JVM is shut down,
due to a limitation in Cassandra's design. Is there a specific reason for this
limitation? If so, can someone please help me understand it?

Thanks
Kannan
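
One way to sidestep the shutdown limitation, offered only as a sketch and not
something anyone in this thread confirmed, is to run the daemon in a child JVM
started with ProcessBuilder, so that stopping Cassandra means ending that
process. CassandraLauncher and its methods are hypothetical names. The main
class and the -Dstorage-config property are assumptions meant to mirror what
the 0.6 startup script passes; copy the real values from the script you already
use.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: starts Cassandra in a separate JVM so it can be
// stopped without shutting down the calling application.
final class CassandraLauncher {

    // Builds the child JVM command line; nothing is started here.
    // mainClass and the storage-config property are assumptions to be
    // checked against the startup script shipped with your version.
    static List<String> buildCommand(String classpath, String confDir, String mainClass) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add("-Dstorage-config=" + confDir);
        cmd.add(mainClass);
        return cmd;
    }

    // Starts the child JVM and forwards its output to this process.
    static Process start(List<String> cmd) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.inheritIO();
        return pb.start();
    }

    // Asks the child to terminate and waits until it has exited.
    static void stop(Process p) throws InterruptedException {
        p.destroy();
        p.waitFor();
    }
}
```

A caller would build the command once, keep the Process returned by start(),
and call stop() on it whenever the service has to go away. Only the child JVM
ends; the parent application keeps running.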

________________________________
From: Courtney <sa...@live.co.uk>
To: user@cassandra.apache.org
Sent: Tue, September 7, 2010 5:31:46 PM
Subject: Re: Few questions regarding cassandra deployment on windows

I haven't looked at your previous e-mail(s) or the responses to them, but have a
look at
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/
The post was written by one of the guys who maintains the Hector Cassandra
client.

In any case, the simple and short answer is yes, he did it, so ...

From: kannan chandrasekaran 
Sent: Wednesday, September 08, 2010 1:20 AM
To: user@cassandra.apache.org 
Subject: Re: Few questions regarding cassandra deployment on windows

Can you please elaborate on why you think Cassandra would not be suitable for
this?

The main reasons we are considering Cassandra:
1) We are focusing on moving to a distributed architecture very soon, and
using Cassandra as a backend lends itself naturally to this.
2) Our schema is relatively simple and we wanted quick read and write access.
Cassandra response times were faster than MySQL and we expect it to satisfy our
requirements (without the need for a cache layer).
3) I believe that with 0.7's live schema updates, the need for changing the XML
files and restarting the service would go away, so I believe use case 2 is only
difficult in the 0.6 versions...

I am more interested in knowing whether we can start/run/stop Cassandra as an
embedded service within a JVM.

Thanks
Kannan

________________________________
From: Benjamin Black <b...@b3k.us>
To: user@cassandra.apache.org
Sent: Tue, September 7, 2010 4:38:41 PM
Subject: Re: Few questions regarding cassandra deployment on windows

This does not sound like a good application for Cassandra at all.  Why
are you using it?


Re: Few questions regarding cassandra deployment on windows

Posted by Courtney <sa...@live.co.uk>.

I haven't looked at your previous e-mail(s) or the responses to them, but have a look at http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/
The post was written by one of the guys who maintains the Hector Cassandra client.

In any case the simple and short answer is yes, he did it, so ...



Re: Few questions regarding cassandra deployment on windows

Posted by kannan chandrasekaran <ck...@yahoo.com>.

Can you please elaborate on why you think Cassandra would not be suitable for
this?

The main reasons we are considering Cassandra:
1) We are focusing on moving to a distributed architecture very soon, and
using Cassandra as a backend lends itself naturally to this.
2) Our schema is relatively simple and we wanted quick read and write access.
Cassandra response times were faster than MySQL and we expect it to satisfy our
requirements (without the need for a cache layer).
3) I believe that with 0.7's live schema updates, the need for changing the XML
files and restarting the service would go away, so I believe use case 2 is only
difficult in the 0.6 versions...

I am more interested in knowing whether we can start/run/stop Cassandra as an
embedded service within a JVM.

Thanks
Kannan

Re: Few questions regarding cassandra deployment on windows

Posted by Benjamin Black <b...@b3k.us>.

This does not sound like a good application for Cassandra at all.  Why
are you using it?


Re: Few questions regarding cassandra deployment on windows

Posted by Gary Dusbabek <gd...@gmail.com>.

On Thu, Sep 9, 2010 at 22:23, kannan chandrasekaran <ck...@yahoo.com> wrote:
>
> Thanks for the replies.... My comments in Bold...
> Kannan
>
>
> From: Gary Dusbabek <gd...@gmail.com>
> To: user@cassandra.apache.org
> Sent: Thu, September 9, 2010 5:43:31 AM
> Subject: Re: Few questions regarding cassandra deployment on windows
>
> On Tue, Sep 7, 2010 at 17:42, kannan chandrasekaran <ck...@yahoo.com> wrote:
> > Hi All,
> >
> > We are currently considering Cassandra for our application.
> >
> > Platform:
> > * a single-node cluster.
> > * windows '08
> > * 64-bit jvm
> >
> > For the sake of brevity let,
> > Cassandra service =  a single node cassandra server running as an embedded
> > service inside a JVM
> >
> >
> > My use cases:
> > 1) Start with a schema ( keyspace and set of column families under it) in a
> > cassandra service
> > 2) Need to be able to replicate the same schema structure (add new
> > keyspace/columnfamilies with different names ofcourse).
> > 3) Because of some existing limitations in my application, I need to be able
> > to write to the keyspace/column-families from a cassandra service and read
> > the written changes from a different cassandra service. Both the write and
> > the read "cassandra-services" are sharing the same Data directory. I
> > understand that the application has to take care of any naming collisions.
> >
> >
> > Couple Questions related to the above mentioned usecases:
> > 1) I want to spawn a new JVM and launch Cassandra as an embedded service
> > programatically instead of using the startup.bat. I would like to know if
> > that is possible and any pointers in that direction would be really helpful.
> > ( use-case1)
>
> There are a couple ways to do this.  I've used two of them (tanuki and
> objectweb).  Tanuki is better imo.  I've never used it, but Apache
> procrun is probably worth checking out.
>
>
> Someone pointed out http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/ as a way to start it inside a JVM. Do you still think tanuki/procrun is needed?

I misread your first email as saying that you wanted to run Cassandra as a
Windows service.  You don't need tanuki/procrun if you simply wish to
run Cassandra.

>
> > 2) I understand that there are provisions for live schema changes in 0.7 (
> > thank you guys !!!), but since I cant use a beta version in production, I am
> > restricted to 0.6 for now. Is it possible to to support use-case 2 in 0.6.5
> > ? More specifically, I am planning to make runtime changes to the
> > storage.conf xml file followed by a cassandra service restart
>
> Yes, but this is a manual process and will not scale well.
>
> I agree, but I believe 0.7 solves this problem without a restart. Correct me if I am wrong. Any idea when 0.7 is set to be released?

When it's ready.  :)

>
> > 3) Can I switch the data directory at run-time ?  (use-case 3). In order to
> > not disrupt read while the writes are in progress, I am thinking something
> > like, copy the existing data-dir into a new location; write to a new data
> > directory; once the write is complete; switch pointers and restart the
> > cassandra service to read from the new directory to pick up the updated
> > changes
> >
>
>
> Pointing two cassandra instances at the same data directory?  This is
> a bad idea.  I've never tried it, so I don't know exactly what will
> happen, but I imagine you would corrupt your system tables pretty
> quickly and your commit log wouldn't be too happy either.  This is a
> completely unsupported way of using cassandra.
>
> Noted, and I will avoid it. But can you please explain what happens if I write to a different location and then copy the keyspace and system directories into an existing location? Any idea whether that will work?
>

As long as you take care to ensure that the replacement node uses the
same token as the replaced node, nothing adverse will happen.  Hints
living throughout the cluster destined for the replaced node will
become invalid if the IP is changed on the replacement node.

Gary.
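
The copy step discussed above could be sketched as follows. This is a minimal
illustration, assuming the Cassandra service that owns the source directory is
stopped while the copy runs. DataDirCopier is a hypothetical name, and the
sketch does nothing about tokens, which still have to match as described
above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper: copies a data directory tree to a new location.
final class DataDirCopier {

    // Recursively copies source into target. Files.walk visits a directory
    // before its contents, so each parent exists before its files are copied.
    // Files.copy refuses to overwrite, so an existing file aborts the copy.
    static void copyTree(Path source, Path target) {
        try (Stream<Path> paths = Files.walk(source)) {
            paths.forEach(p -> {
                Path dest = target.resolve(source.relativize(p).toString());
                try {
                    if (Files.isDirectory(p)) {
                        Files.createDirectories(dest);
                    } else {
                        Files.copy(p, dest);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Failing on an existing file is a deliberate choice here: it avoids silently
mixing old and new files in the destination.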

Re: Few questions regarding cassandra deployment on windows

Posted by kannan chandrasekaran <ck...@yahoo.com>.

Thanks for the replies.... My comments in Bold...
Kannan

From: Gary Dusbabek <gd...@gmail.com>
To: user@cassandra.apache.org
Sent: Thu, September 9, 2010 5:43:31 AM
Subject: Re: Few questions regarding cassandra deployment on windows

On Tue, Sep 7, 2010 at 17:42, kannan chandrasekaran <ck...@yahoo.com> wrote:
> Hi All,
>
> We are currently considering Cassandra for our application.
>
> Platform:
> * a single-node cluster.
> * windows '08
> * 64-bit jvm
>
> For the sake of brevity let,
> Cassandra service =  a single node cassandra server running as an embedded
> service inside a JVM
>
>
> My use cases:
> 1) Start with a schema ( keyspace and set of column families under it) in a
> cassandra service
> 2) Need to be able to replicate the same schema structure (add new
> keyspace/columnfamilies with different names ofcourse).
> 3) Because of some existing limitations in my application, I need to be able
> to write to the keyspace/column-families from a cassandra service and read
> the written changes from a different cassandra service. Both the write and
> the read "cassandra-services" are sharing the same Data directory. I
> understand that the application has to take care of any naming collisions.
>
>
> Couple Questions related to the above mentioned usecases:
> 1) I want to spawn a new JVM and launch Cassandra as an embedded service
> programatically instead of using the startup.bat. I would like to know if
> that is possible and any pointers in that direction would be really helpful.
> ( use-case1)

There are a couple ways to do this.  I've used two of them (tanuki and
objectweb).  Tanuki is better imo.  I've never used it, but Apache
procrun is probably worth checking out.

Someone pointed out
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/ as
a way to start it inside a JVM. Do you still think tanuki/procrun is needed?

> 2) I understand that there are provisions for live schema changes in 0.7 (
> thank you guys !!!), but since I cant use a beta version in production, I am
> restricted to 0.6 for now. Is it possible to to support use-case 2 in 0.6.5
> ? More specifically, I am planning to make runtime changes to the
> storage.conf xml file followed by a cassandra service restart

Yes, but this is a manual process and will not scale well.

I agree, but I believe 0.7 solves this problem without a restart. Correct me if
I am wrong. Any idea when 0.7 is set to be released?

> 3) Can I switch the data directory at run-time ?  (use-case 3). In order to
> not disrupt read while the writes are in progress, I am thinking something
> like, copy the existing data-dir into a new location; write to a new data
> directory; once the write is complete; switch pointers and restart the
> cassandra service to read from the new directory to pick up the updated
> changes
>

Pointing two cassandra instances at the same data directory?  This is
a bad idea.  I've never tried it, so I don't know exactly what will
happen, but I imagine you would corrupt your system tables pretty
quickly and your commit log wouldn't be too happy either.  This is a
completely unsupported way of using cassandra.

Noted, and I will avoid it. But can you please explain what happens if I write
to a different location and then copy the keyspace and system directories into
an existing location? Any idea whether that will work?

Thank you once again !!!
Gary

Re: Few questions regarding cassandra deployment on windows

Posted by Gary Dusbabek <gd...@gmail.com>.

On Tue, Sep 7, 2010 at 17:42, kannan chandrasekaran <ck...@yahoo.com> wrote:
> Hi All,
>
> We are currently considering Cassandra for our application.
>
> Platform:
> * a single-node cluster.
> * windows '08
> * 64-bit jvm
>
> For the sake of brevity let,
> Cassandra service =  a single node cassandra server running as an embedded
> service inside a JVM
>
>
> My use cases:
> 1) Start with a schema ( keyspace and set of column families under it) in a
> cassandra service
> 2) Need to be able to replicate the same schema structure (add new
> keyspace/columnfamilies with different names ofcourse).
> 3) Because of some existing limitations in my application, I need to be able
> to write to the keyspace/column-families from a cassandra service and read
> the written changes from a different cassandra service. Both the write and
> the read "cassandra-services" are sharing the same Data directory. I
> understand that the application has to take care of any naming collisions.
>
>
> Couple Questions related to the above mentioned usecases:
> 1) I want to spawn a new JVM and launch Cassandra as an embedded service
> programatically instead of using the startup.bat. I would like to know if
> that is possible and any pointers in that direction would be really helpful.
> ( use-case1)

There are a couple of ways to do this.  I've used two of them (tanuki and
objectweb).  Tanuki is better imo.  I've never used it, but Apache
procrun is probably worth checking out.

> 2) I understand that there are provisions for live schema changes in 0.7 (
> thank you guys !!!), but since I cant use a beta version in production, I am
> restricted to 0.6 for now. Is it possible to to support use-case 2 in 0.6.5
> ? More specifically, I am planning to make runtime changes to the
> storage.conf xml file followed by a cassandra service restart

Yes, but this is a manual process and will not scale well.
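
The manual edit could in principle be scripted. The sketch below is only an
illustration under stated assumptions: it assumes the 0.6 configuration file
declares each keyspace as a Keyspace element with a Name attribute, and
KeyspaceCloner is a hypothetical name. It copies one keyspace definition under
a new name; the service would still need a restart to pick up the change.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: duplicates a keyspace definition under a new name.
final class KeyspaceCloner {

    // Returns the configuration XML with keyspace `from` copied as `to`.
    // The element and attribute names are assumptions about the file layout.
    static String cloneKeyspace(String xml, String from, String to) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList keyspaces = doc.getElementsByTagName("Keyspace");
            Element source = null;
            for (int i = 0; i < keyspaces.getLength(); i++) {
                Element ks = (Element) keyspaces.item(i);
                String name = ks.getAttribute("Name");
                if (name.equals(to)) {
                    throw new IllegalArgumentException("keyspace already exists: " + to);
                }
                if (name.equals(from)) {
                    source = ks;
                }
            }
            if (source == null) {
                throw new IllegalArgumentException("no such keyspace: " + from);
            }
            // Deep-copy the definition, rename it, and add it next to the original.
            Element copy = (Element) source.cloneNode(true);
            copy.setAttribute("Name", to);
            source.getParentNode().appendChild(copy);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException
                | IOException | TransformerException e) {
            throw new IllegalStateException("could not rewrite configuration", e);
        }
    }
}
```

Rejecting a name that already exists keeps the naming collisions mentioned in
the original question from slipping into the file unnoticed.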

> 3) Can I switch the data directory at run-time ?  (use-case 3). In order to
> not disrupt read while the writes are in progress, I am thinking something
> like, copy the existing data-dir into a new location; write to a new data
> directory; once the write is complete; switch pointers and restart the
> cassandra service to read from the new directory to pick up the updated
> changes
>

Pointing two cassandra instances at the same data directory?  This is
a bad idea.  I've never tried it, so I don't know exactly what will
happen, but I imagine you would corrupt your system tables pretty
quickly and your commit log wouldn't be too happy either.  This is a
completely unsupported way of using cassandra.

Gary