Posted to user@cassandra.apache.org by kannan chandrasekaran <ck...@yahoo.com> on 2010/09/08 00:42:32 UTC
Few questions regarding cassandra deployment on windows
Hi All,
We are currently considering Cassandra for our application.
Platform:
* a single-node cluster.
* windows '08
* 64-bit jvm
For the sake of brevity, let
Cassandra service = a single-node Cassandra server running as an embedded
service inside a JVM
My use cases:
1) Start with a schema (a keyspace and a set of column families under it) in a
Cassandra service.
2) Need to be able to replicate the same schema structure (add new
keyspaces/column families, with different names of course).
3) Because of some existing limitations in my application, I need to be able to
write to the keyspace/column-families from a cassandra service and read the
written changes from a different cassandra service. Both the write and the read
"cassandra-services" are sharing the same Data directory. I understand that the
application has to take care of any naming collisions.
A couple of questions related to the use cases above:
1) I want to spawn a new JVM and launch Cassandra as an embedded service
programmatically instead of using startup.bat. Is that possible? Any pointers
in that direction would be really helpful. (use case 1)
2) I understand that there are provisions for live schema changes in 0.7 (thank
you guys!!!), but since I can't use a beta version in production, I am
restricted to 0.6 for now. Is it possible to support use case 2 in 0.6.5?
More specifically, I am planning to make runtime changes to the
storage-conf.xml file, followed by a Cassandra service restart.
3) Can I switch the data directory at run time? (use case 3) In order not to
disrupt reads while writes are in progress, I am thinking of something like
this: copy the existing data directory to a new location; write to the new data
directory; once the write is complete, switch pointers and restart the
Cassandra service so that it reads from the new directory and picks up the
changes.
Any help is greatly appreciated.
Thanks
Kannan
Re: Few questions regarding cassandra deployment on windows
Posted by kannan chandrasekaran <ck...@yahoo.com>.
Thank you. That was helpful. However, as mentioned in the comments section of
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/
the embedded server cannot be shut down unless the JVM is shut down, due to a
design limitation in Cassandra. Is there a specific reason for this limitation?
If so, can someone please help me understand it?
Thanks
Kannan
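
One possible workaround for the shutdown limitation raised above is to run Cassandra in a child JVM, so that stopping Cassandra means ending that process rather than the application's own JVM. The following is a minimal Java sketch under stated assumptions, not a confirmed recipe: the class `ChildJvmLauncher` and the paths are hypothetical, and the use of `org.apache.cassandra.thrift.CassandraDaemon` as the main class and `-Dstorage-config` as the configuration property are assumptions that should be checked against the startup script shipped with your Cassandra version.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Sketch: run Cassandra in a child JVM so it can be stopped without exiting the parent. */
final class ChildJvmLauncher {

    /** Builds the command line for the child JVM; nothing is started here. */
    static List<String> buildCommand(String classpath, String configDir, String mainClass) {
        // Reuse the java binary of the JVM that is currently running.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(classpath);
        // Assumption: the daemon locates its configuration directory through this property.
        cmd.add("-Dstorage-config=" + configDir);
        cmd.add(mainClass);
        return cmd;
    }

    /** Starts the child JVM and forwards its output to the parent's console. */
    static Process start(List<String> command) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.inheritIO();
        return pb.start();
    }

    /** Asks the OS to end the child process and waits for its exit code. */
    static int stop(Process child) throws InterruptedException {
        child.destroy();
        return child.waitFor();
    }

    public static void main(String[] args) {
        // Hypothetical paths and an assumed main class; this only prints the command.
        List<String> cmd = buildCommand(
                "C:\\cassandra\\lib\\*",
                "C:\\cassandra\\conf",
                "org.apache.cassandra.thrift.CassandraDaemon");
        System.out.println(String.join(" ", cmd));
    }
}
```

Calling `start(buildCommand(...))` would launch the child, and `stop(child)` would end it. `Process.destroy()` may terminate the child abruptly, so anything that relies on a clean shutdown should be verified separately.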
Re: Few questions regarding cassandra deployment on windows
Posted by Courtney <sa...@live.co.uk>.
I haven't looked at your previous e-mail(s) or the responses to them, but have a look at http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/
The post was written by one of the guys who maintain the Hector Cassandra client.
In any case, the short and simple answer is yes: he did it, so ...
Re: Few questions regarding cassandra deployment on windows
Posted by kannan chandrasekaran <ck...@yahoo.com>.
Can you please elaborate on why you think Cassandra would not be suitable for
this?
The main reasons we are considering Cassandra:
1) We are focusing on moving to a distributed architecture very soon, and
using Cassandra as a backend lends itself naturally to this.
2) Our schema is relatively simple and we want quick read and write access.
Cassandra response times were faster than MySQL's, and we expect it to satisfy
our requirements (without the need for a cache layer).
3) I believe that with 0.7's live schema updates, the need to change the XML
files and restart the service would go away, so I believe use case 2 is only
difficult in the 0.6 versions...
I am more interested in knowing whether we can start/run/stop Cassandra as an
embedded service within a JVM.
Thanks
Kannan
Re: Few questions regarding cassandra deployment on windows
Posted by Benjamin Black <b...@b3k.us>.
This does not sound like a good application for Cassandra at all. Why
are you using it?
Re: Few questions regarding cassandra deployment on windows
Posted by Gary Dusbabek <gd...@gmail.com>.
On Thu, Sep 9, 2010 at 22:23, kannan chandrasekaran <ck...@yahoo.com> wrote:
>
> Thanks for the replies.... My comments in Bold...
> Kannan
>
>
> From: Gary Dusbabek <gd...@gmail.com>
> To: user@cassandra.apache.org
> Sent: Thu, September 9, 2010 5:43:31 AM
> Subject: Re: Few questions regarding cassandra deployment on windows
>
> On Tue, Sep 7, 2010 at 17:42, kannan chandrasekaran <ck...@yahoo.com> wrote:
> > Hi All,
> >
> > We are currently considering Cassandra for our application.
> >
> > Platform:
> > * a single-node cluster.
> > * windows '08
> > * 64-bit jvm
> >
> > For the sake of brevity let,
> > Cassandra service = a single node cassandra server running as an embedded
> > service inside a JVM
> >
> >
> > My use cases:
> > 1) Start with a schema ( keyspace and set of column families under it) in a
> > cassandra service
> > 2) Need to be able to replicate the same schema structure (add new
> > keyspace/columnfamilies with different names ofcourse).
> > 3) Because of some existing limitations in my application, I need to be able
> > to write to the keyspace/column-families from a cassandra service and read
> > the written changes from a different cassandra service. Both the write and
> > the read "cassandra-services" are sharing the same Data directory. I
> > understand that the application has to take care of any naming collisions.
> >
> >
> > Couple Questions related to the above mentioned usecases:
> > 1) I want to spawn a new JVM and launch Cassandra as an embedded service
> > programatically instead of using the startup.bat. I would like to know if
> > that is possible and any pointers in that direction would be really helpful.
> > ( use-case1)
>
> There are a couple ways to do this. I've used two of them (tanuki and
> objectweb). Tanuki is better imo. I've never used it, but Apache
> procrun is probably worth checking out.
>
>
> Someone pointed out http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/ as way to start it inside a jvm.. Do you still think tanuki/procrun is needed ?
I misinterpreted your first email to mean that you wanted to run Cassandra as
a Windows service. You don't need tanuki/procrun if you simply wish to run
Cassandra.
>
> > 2) I understand that there are provisions for live schema changes in 0.7 (
> > thank you guys !!!), but since I cant use a beta version in production, I am
> > restricted to 0.6 for now. Is it possible to to support use-case 2 in 0.6.5
> > ? More specifically, I am planning to make runtime changes to the
> > storage.conf xml file followed by a cassandra service restart
>
> Yes, but this is a manual process and will not scale well.
>
> I agree, but I believe 0.7 solves this problem without a restart . Correct me if I am wrong. Any ideas when the 0.7 is set to release ?
When it's ready. :)
>
> > 3) Can I switch the data directory at run-time ? (use-case 3). In order to
> > not disrupt read while the writes are in progress, I am thinking something
> > like, copy the existing data-dir into a new location; write to a new data
> > directory; once the write is complete; switch pointers and restart the
> > cassandra service to read from the new directory to pick up the updated
> > changes
> >
>
>
> Pointing two cassandra instances at the same data directory? This is
> a bad idea. I've never tried it, so I don't know exactly what will
> happen, but I imagine you would corrupt your system tables pretty
> quickly and your commit log wouldn't be too happy either. This is a
> completely unsupported way of using cassandra.
>
> Noted and will avoid it. But Can you please explain what happens if I write to a different location and then copy the keyspace & the system directories into an existing location ? Any ideas if that will work ?
>
As long as you take care to ensure that the replacement node uses the
same token as the replaced node, nothing adverse will happen. Hints
living throughout the cluster destined for the replaced node will
become invalid if the IP is changed on the replacement node.
Gary.
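
The copy step asked about above (write to a different location, then copy the keyspace and system directories into the existing location) could be sketched in Java roughly as follows. This is only an illustration under assumptions: the class `DataDirCopier` and any paths passed to it are hypothetical, the Cassandra service is assumed to be stopped while the copy runs, and the sketch does nothing about the token or hint concerns mentioned in the reply.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

/** Sketch: recursively copy a data directory while the service is stopped. */
final class DataDirCopier {

    /** Copies every directory and file under source into target, replacing existing files. */
    static void copyTree(final Path source, final Path target) throws IOException {
        Files.walkFileTree(source, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                // Recreate the directory at the same relative position under target.
                Files.createDirectories(target.resolve(source.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.copy(file, target.resolve(source.relativize(file)),
                        StandardCopyOption.REPLACE_EXISTING);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DataDirCopier <source-dir> <target-dir>
        copyTree(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Copying into a fresh directory and only then pointing the configuration at it keeps the directory currently in use untouched until the copy has finished.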
Re: Few questions regarding cassandra deployment on windows
Posted by kannan chandrasekaran <ck...@yahoo.com>.
Thanks for the replies.... My comments in Bold...
Kannan
From: Gary Dusbabek <gd...@gmail.com>
To: user@cassandra.apache.org
Sent: Thu, September 9, 2010 5:43:31 AM
Subject: Re: Few questions regarding cassandra deployment on windows
On Tue, Sep 7, 2010 at 17:42, kannan chandrasekaran <ck...@yahoo.com> wrote:
> Hi All,
>
> We are currently considering Cassandra for our application.
>
> Platform:
> * a single-node cluster.
> * windows '08
> * 64-bit jvm
>
> For the sake of brevity let,
> Cassandra service = a single node cassandra server running as an embedded
> service inside a JVM
>
>
> My use cases:
> 1) Start with a schema ( keyspace and set of column families under it) in a
> cassandra service
> 2) Need to be able to replicate the same schema structure (add new
> keyspace/columnfamilies with different names ofcourse).
> 3) Because of some existing limitations in my application, I need to be able
> to write to the keyspace/column-families from a cassandra service and read
> the written changes from a different cassandra service. Both the write and
> the read "cassandra-services" are sharing the same Data directory. I
> understand that the application has to take care of any naming collisions.
>
>
> Couple Questions related to the above mentioned usecases:
> 1) I want to spawn a new JVM and launch Cassandra as an embedded service
> programatically instead of using the startup.bat. I would like to know if
> that is possible and any pointers in that direction would be really helpful.
> ( use-case1)
There are a couple ways to do this. I've used two of them (tanuki and
objectweb). Tanuki is better imo. I've never used it, but Apache
procrun is probably worth checking out.
Someone pointed out
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/ as
a way to start it inside a JVM. Do you still think tanuki/procrun is needed?
> 2) I understand that there are provisions for live schema changes in 0.7 (
> thank you guys !!!), but since I cant use a beta version in production, I am
> restricted to 0.6 for now. Is it possible to to support use-case 2 in 0.6.5
> ? More specifically, I am planning to make runtime changes to the
> storage.conf xml file followed by a cassandra service restart
Yes, but this is a manual process and will not scale well.
I agree, but I believe 0.7 solves this problem without a restart. Correct me if
I am wrong. Any idea when 0.7 is set to be released?
> 3) Can I switch the data directory at run-time ? (use-case 3). In order to
> not disrupt read while the writes are in progress, I am thinking something
> like, copy the existing data-dir into a new location; write to a new data
> directory; once the write is complete; switch pointers and restart the
> cassandra service to read from the new directory to pick up the updated
> changes
>
Pointing two cassandra instances at the same data directory? This is
a bad idea. I've never tried it, so I don't know exactly what will
happen, but I imagine you would corrupt your system tables pretty
quickly and your commit log wouldn't be too happy either. This is a
completely unsupported way of using cassandra.
Noted, and I will avoid it. But can you please explain what happens if I write
to a different location and then copy the keyspace and system directories into
an existing location? Any idea whether that will work?
Thank you once again !!!
Gary
Re: Few questions regarding cassandra deployment on windows
Posted by Gary Dusbabek <gd...@gmail.com>.
On Tue, Sep 7, 2010 at 17:42, kannan chandrasekaran <ck...@yahoo.com> wrote:
> Hi All,
>
> We are currently considering Cassandra for our application.
>
> Platform:
> * a single-node cluster.
> * windows '08
> * 64-bit jvm
>
> For the sake of brevity let,
> Cassandra service = a single node cassandra server running as an embedded
> service inside a JVM
>
>
> My use cases:
> 1) Start with a schema ( keyspace and set of column families under it) in a
> cassandra service
> 2) Need to be able to replicate the same schema structure (add new
> keyspace/columnfamilies with different names ofcourse).
> 3) Because of some existing limitations in my application, I need to be able
> to write to the keyspace/column-families from a cassandra service and read
> the written changes from a different cassandra service. Both the write and
> the read "cassandra-services" are sharing the same Data directory. I
> understand that the application has to take care of any naming collisions.
>
>
> Couple Questions related to the above mentioned usecases:
> 1) I want to spawn a new JVM and launch Cassandra as an embedded service
> programatically instead of using the startup.bat. I would like to know if
> that is possible and any pointers in that direction would be really helpful.
> ( use-case1)
There are a couple ways to do this. I've used two of them (tanuki and
objectweb). Tanuki is better imo. I've never used it, but Apache
procrun is probably worth checking out.
> 2) I understand that there are provisions for live schema changes in 0.7 (
> thank you guys !!!), but since I cant use a beta version in production, I am
> restricted to 0.6 for now. Is it possible to to support use-case 2 in 0.6.5
> ? More specifically, I am planning to make runtime changes to the
> storage.conf xml file followed by a cassandra service restart
Yes, but this is a manual process and will not scale well.
> 3) Can I switch the data directory at run-time ? (use-case 3). In order to
> not disrupt read while the writes are in progress, I am thinking something
> like, copy the existing data-dir into a new location; write to a new data
> directory; once the write is complete; switch pointers and restart the
> cassandra service to read from the new directory to pick up the updated
> changes
>
Pointing two cassandra instances at the same data directory? This is
a bad idea. I've never tried it, so I don't know exactly what will
happen, but I imagine you would corrupt your system tables pretty
quickly and your commit log wouldn't be too happy either. This is a
completely unsupported way of using cassandra.
Gary
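
The manual configuration edit discussed under question 2 could be scripted along these lines before the restart. This is a hedged Java sketch, not a supported tool: the class `KeyspaceCloner` is hypothetical, and the element layout it expects (`Keyspace` elements carrying a `Name` attribute, as in the 0.6 storage-conf.xml) is an assumption to verify against your own file. It copies the column family definitions unchanged and only renames the keyspace.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch: duplicate a keyspace definition under a new name in a 0.6-style XML config. */
final class KeyspaceCloner {

    /** Returns the XML with a deep copy of sourceName appended as newName. */
    static String cloneKeyspace(String xml, String sourceName, String newName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Assumed layout: <Keyspace Name="..."> elements somewhere in the document.
        NodeList keyspaces = doc.getElementsByTagName("Keyspace");
        Element source = null;
        for (int i = 0; i < keyspaces.getLength(); i++) {
            Element ks = (Element) keyspaces.item(i);
            String name = ks.getAttribute("Name");
            if (name.equals(newName)) {
                throw new IllegalArgumentException("keyspace already exists: " + newName);
            }
            if (name.equals(sourceName)) {
                source = ks;
            }
        }
        if (source == null) {
            throw new IllegalArgumentException("no such keyspace: " + sourceName);
        }

        // Deep-copy the keyspace with its column families, then rename the copy.
        Element copy = (Element) source.cloneNode(true);
        copy.setAttribute("Name", newName);
        source.getParentNode().appendChild(copy);

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Tiny inline stand-in for a real configuration file.
        String xml = "<Storage><Keyspaces><Keyspace Name=\"Keyspace1\">"
                + "<ColumnFamily Name=\"Standard1\" CompareWith=\"BytesType\"/>"
                + "</Keyspace></Keyspaces></Storage>";
        System.out.println(cloneKeyspace(xml, "Keyspace1", "Keyspace2"));
    }
}
```

As noted in the reply, this remains a manual, restart-based process in 0.6; the sketch only automates the file edit.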