Posted to user@cassandra.apache.org by Krzysztof Księżyk <kk...@gmail.com> on 2016/03/07 11:57:42 UTC

moving keyspaces to another disk while Cassandra is running

Hi,

I have a small Cassandra cluster running on boxes with a 256GB SSD and a 2TB HDD.
Originally the SSD was for the system and commit log and the HDD for data. But
unfortunately, because of the nature of the queries, performance was not
satisfactory, and to improve it the data was moved to the SSD as well. Now the
problem is that the SSD is too small to hold all the data. As one keyspace is
created per month, my idea was to move historical keyspaces to the HDD and make
a symlink. These historical keyspaces are not queried often, so it shouldn't
affect performance much. I've written a simple script that rsyncs the data,
stops Cassandra, makes the symlink and starts the Cassandra node again, but my
question is whether there will be a problem if I do a hot swap, without
stopping the Cassandra daemon. So: rsync, rename the current keyspace folder in
the data dir, make a symlink to the new location on the HDD. One good thing is
that once a keyspace is fully loaded with data, it no longer changes. I see in
the lsof output that even if a keyspace is not queried, Cassandra keeps its
files open, so I guess it's not safe to hot swap, but I'd like to make sure.
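
The lsof check described above can be approximated in a few lines of Python. This is only a sketch under assumptions: it reads /proc/<pid>/fd, so it is Linux-only, and the directory names in the usage note are hypothetical, not taken from this cluster.

```python
import os


def paths_under(open_paths, data_dir):
    """Return the entries of open_paths that live inside data_dir."""
    # Normalise and force a trailing separator so "/data" does not match "/data_old".
    root = os.path.join(os.path.normpath(data_dir), "")
    return [
        p for p in open_paths
        if os.path.isabs(p) and os.path.normpath(p).startswith(root)
    ]


def open_paths_of(pid):
    """List the targets of a process's open descriptors (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    paths = []
    for name in os.listdir(fd_dir):
        try:
            paths.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            continue  # descriptor was closed while we were listing
    return paths
```

For example, paths_under(open_paths_of(cassandra_pid), "/var/lib/cassandra/data/some_keyspace") would list the files the daemon still holds open in that (hypothetical) keyspace directory. A non-empty result is the situation described above.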

Kind regards -
Krzysztof Ksiezyk


Re: moving keyspaces to another disk while Cassandra is running

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
FWIW:
http://thelastpickle.com/blog/2016/02/25/removing-a-disk-mapping-from-cassandra.html

I know it is not exactly what you want, but I believe it might be useful.

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-03-09 14:20 GMT+01:00 Krzysztof Księżyk <kk...@gmail.com>:

> Robert Coli <rcoli <at> eventbrite.com> writes:
>
> > On Mon, Mar 7, 2016 at 2:57 AM, Krzysztof Księżyk <kksiezyk <at> gmail.com>
> > wrote:
> >
> > > I see on lsof output that even if keyspace
> > > is not queried, Cassandra keeps files opened, so I guess it's not safe to
> > > hotswap, but I'd like to make sure.
> >
> > It is not safe for exactly this reason. Just restart your nodes.
> >
> > Were I doing this process, I would :
> >
> > 1) do initial rsync
> > 2) stop node
> > 3) do rsync again, with --delete for files which are no longer in the
> > source. This is very important, or you risk resurrecting SSTables which
> > have already been compacted away, which can be PERMANENTLY FATAL TO THE
> > CONSISTENCY OF ALL INVOLVED DATA.
> > 4) start node
> >
> > =Rob
>
> Hi,
>
> Thanks for answer. I do it exactly this way.
>
> Kind regards -
> Krzysztof Ksiezyk
>

Re: moving keyspaces to another disk while Cassandra is running

Posted by Krzysztof Księżyk <kk...@gmail.com>.
Robert Coli <rcoli <at> eventbrite.com> writes:

> On Mon, Mar 7, 2016 at 2:57 AM, Krzysztof Księżyk <kksiezyk <at> gmail.com>
> wrote:
>
> > I see on lsof output that even if keyspace
> > is not queried, Cassandra keeps files opened, so I guess it's not safe to
> > hotswap, but I'd like to make sure.
>
> It is not safe for exactly this reason. Just restart your nodes.
>
> Were I doing this process, I would :
>
> 1) do initial rsync
> 2) stop node
> 3) do rsync again, with --delete for files which are no longer in the
> source. This is very important, or you risk resurrecting SSTables which
> have already been compacted away, which can be PERMANENTLY FATAL TO THE
> CONSISTENCY OF ALL INVOLVED DATA.
> 4) start node
>
> =Rob

Hi,

Thanks for the answer. I do it exactly this way.

Kind regards -
Krzysztof Ksiezyk

Re: moving keyspaces to another disk while Cassandra is running

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Mar 7, 2016 at 2:57 AM, Krzysztof Księżyk <kk...@gmail.com>
wrote:

> I see on lsof output that even if keyspace
> is not queried, Cassandra keeps files opened, so I guess it's not safe to
> hotswap, but I'd like to make sure.
>

It is not safe for exactly this reason. Just restart your nodes.
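
One way to see the problem: on POSIX filesystems an open descriptor follows the file itself, not its path. The sketch below is not Cassandra code. It uses throwaway temp directories with made-up names to stand in for the SSD and HDD copies, and shows that a handle opened before the rename-and-symlink swap keeps reading the old file.

```python
import os
import tempfile


def read_after_swap():
    """Open a file, swap its directory for a symlink, then read via the old handle."""
    with tempfile.TemporaryDirectory() as base:
        live = os.path.join(base, "keyspace")        # stands in for the SSD copy
        moved = os.path.join(base, "keyspace.old")   # the renamed original
        target = os.path.join(base, "hdd_keyspace")  # stands in for the HDD copy
        os.mkdir(live)
        os.mkdir(target)
        with open(os.path.join(live, "data.db"), "w") as f:
            f.write("ssd copy")
        with open(os.path.join(target, "data.db"), "w") as f:
            f.write("hdd copy")

        handle = open(os.path.join(live, "data.db"))  # long-lived descriptor
        try:
            os.rename(live, moved)    # rename the current keyspace folder
            os.symlink(target, live)  # point the old path at the new location
            via_handle = handle.read()
        finally:
            handle.close()
        with open(os.path.join(live, "data.db")) as f:
            via_path = f.read()       # a fresh open follows the symlink
        return via_handle, via_path
```

The old handle still returns the "ssd copy" contents while a fresh open of the same path returns "hdd copy". A process that keeps its files open therefore stays on the renamed directory until it reopens them, which here means a restart.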

Were I doing this process, I would:

1) do initial rsync
2) stop node
3) do rsync again, with --delete for files which are no longer in the
source. This is very important, or you risk resurrecting SSTables which
have already been compacted away, which can be PERMANENTLY FATAL TO THE
CONSISTENCY OF ALL INVOLVED DATA.
4) start node
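
The steps above, plus the rename and symlink from the original question, could be scripted roughly as follows. This is a sketch under assumptions: the stop and start commands are passed in by the caller because they differ per installation, the ".moved" suffix is an arbitrary choice, and every path shown in the usage note is hypothetical. The only rsync options used are -a and --delete.

```python
import subprocess


def build_move_plan(src, dst, stop_cmd, start_cmd):
    """Return the ordered commands for moving one keyspace directory."""
    src = src.rstrip("/")
    dst = dst.rstrip("/")
    return [
        ["rsync", "-a", src + "/", dst + "/"],              # 1) initial rsync, node still up
        list(stop_cmd),                                     # 2) stop node
        ["rsync", "-a", "--delete", src + "/", dst + "/"],  # 3) resync, dropping stale files
        ["mv", src, src + ".moved"],                        # keep the original until verified
        ["ln", "-s", dst, src],                             # symlink old path to new location
        list(start_cmd),                                    # 4) start node
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # abort on the first failure
```

A dry run such as run_plan(build_move_plan("/ssd/data/ks_old", "/hdd/data/ks_old", ["service", "cassandra", "stop"], ["service", "cassandra", "start"])) only prints the commands. The trailing slashes on the rsync arguments make it copy the directory contents rather than nest the directory inside the destination.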

=Rob

Re: moving keyspaces to another disk while Cassandra is running

Posted by Krzysztof Księżyk <kk...@gmail.com>.
Jack Krupansky <jack.krupansky <at> gmail.com> writes:

> If your data is replicated properly (RF=3) and you do QUORUM reads and
> writes, you should be able to shut down one node, adjust the configuration,
> and restart that node and all should be fine. Do it quickly enough (less
> than an hour) and the node should quickly catch up with any changes.
>
> How small is small? Six nodes?
>
> -- Jack Krupansky


Hi,

Thank you for the answer. It is a small cluster: 4 boxes. The data has RF=2 and
the worker uses read-consistency-level = ONE. The good thing is that the data
in a keyspace is immutable after the initial load, but the problem is that the
worker querying Cassandra (using Titan) dies when the node it uses is down,
even for a short period. It seems to be bad design in the worker, but I can't
fix it easily now, so the initial idea was to keep Cassandra online the whole
time.
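
With RF=2 and reads at ONE, the second replica can still answer while one node restarts, so the worker would only need to try another contact point instead of dying. A minimal, driver-agnostic sketch of that idea follows. The function name and the connect callable are hypothetical and have nothing to do with Titan's actual API.

```python
def first_reachable(hosts, connect):
    """Return (host, connection) for the first host that connect() accepts."""
    errors = {}
    for host in hosts:
        try:
            return host, connect(host)
        except OSError as exc:  # refused, timed out, unreachable, ...
            errors[host] = exc  # remember why this host was skipped
    raise ConnectionError(f"no contact point reachable: {errors}")
```

Here connect is whatever opens a session in the real client. It is only expected to raise an OSError subclass (for example ConnectionRefusedError) when a node is down.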

Kind regards -
Krzysztof Ksiezyk

Re: moving keyspaces to another disk while Cassandra is running

Posted by Jack Krupansky <ja...@gmail.com>.
If your data is replicated properly (RF=3) and you do QUORUM reads and
writes, you should be able to shut down one node, adjust the configuration,
and restart that node and all should be fine. Do it quickly enough (less
than an hour) and the node should quickly catch up with any changes.
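
The arithmetic behind that advice can be written out as a small sketch. The helper names are made up for illustration. QUORUM is taken as floor(RF/2) + 1 replicas, which is how Cassandra defines it.

```python
def quorum(rf):
    """Replicas that must respond for QUORUM at replication factor rf."""
    return rf // 2 + 1


def replicas_required(rf, level):
    """Replicas that must respond for the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(rf)
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def tolerated_down(rf, level):
    """How many replicas of a row can be down while requests still succeed."""
    return rf - replicas_required(rf, level)
```

With RF=3 and QUORUM, tolerated_down gives 1, so a single node can be restarted safely. With RF=2 the same function gives 0 for QUORUM and 1 for ONE.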

How small is small? Six nodes?

-- Jack Krupansky
