Posted to user@cassandra.apache.org by David McNelis <dm...@gmail.com> on 2013/12/30 01:28:35 UTC

Cleanup and old files

I am currently running a Cassandra 1.2.8 cluster.  On one of my nodes, one
of my larger column families has a data file,
keyspace-tablename-ic-####-Data.db, with a modify date in August.

Since August we have added several nodes (with vnodes), each with the same
number of vnodes as the existing nodes.

We have since gone from 15 to 21 nodes, so ~32% of the data on the
original 15 nodes should have been essentially balanced out to the 6
new nodes (1/15 + 1/16 + ... + 1/21).
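
As a rough check on the estimate above, here is a minimal sketch of the arithmetic. It assumes vnodes spread ownership perfectly evenly, so that each joining node takes an equal share from every node already in the ring; real clusters will only approximate this. The function name fraction_moved is illustrative. Under that assumption the shares compound rather than add, so the exact figure comes out a little below the simple sum of the shares taken by nodes 16 through 21.

```python
from fractions import Fraction


def fraction_moved(old_nodes: int, new_nodes: int) -> Fraction:
    """Share of an original node's data handed off after growing the ring.

    Assumption: ownership is perfectly even, so when the n-th node joins
    it takes 1/n of the data from every node already present.
    """
    retained = Fraction(1)
    for n in range(old_nodes + 1, new_nodes + 1):
        # Each existing node keeps (n - 1)/n of what it held before.
        retained *= Fraction(n - 1, n)
    return 1 - retained


# Compounded estimate for growing from 15 to 21 nodes.
moved = fraction_moved(15, 21)

# Simple sum of the shares taken by nodes 16..21, for comparison.
naive_sum = sum(Fraction(1, n) for n in range(16, 22))

print(f"compounded: {moved} ({float(moved):.1%})")
print(f"simple sum: {float(naive_sum):.1%}")
```

Either way, roughly 29% to 33% of the data on each original node would be expected to move, so a Data.db file left untouched since August is worth a closer look.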

When I run a cleanup, however, the old data files never get updated, and I
can't believe that they all should have remained the same.

The only recently updated files in that data directory are secondary index
sstable files.  Am I doing something wrong here?  Am I thinking about this
wrong?

David

Re: Cleanup and old files

Posted by David McNelis <dm...@gmail.com>.
I see the SSTable in a "Stream context metadata" log statement (along
with a bunch of other files), but I do not see it among the files listed as
"Opening" (which I see quite a lot of, as expected).

Is it safe to try moving that file off the server (to a backup location)?
If I tried this, would I want to shut down the node first and then monitor
startup to see whether it suddenly reports something 'missing' or throws an
error?


On Mon, Dec 30, 2013 at 9:26 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> Check that the SSTable is actually in use by Cassandra. If it’s missing a
> component or is otherwise corrupt, it will not be opened at run time and so
> will not be included in all the fun games the other SSTables get to play.
>
> If you have the last startup in the logs, check for an “Opening…” message
> or an ERROR about the file.
>
> Cheers
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com

Re: Cleanup and old files

Posted by Aaron Morton <aa...@thelastpickle.com>.
Check that the SSTable is actually in use by Cassandra. If it’s missing a component or is otherwise corrupt, it will not be opened at run time and so will not be included in all the fun games the other SSTables get to play.
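
One way to look for a missing component is to group the files in the data directory by SSTable and see which group has fewer files than its neighbours. The sketch below is only an illustration: it assumes the keyspace-tablename-ic-####-Component naming seen in this thread, treats the most common set of components as the expected one rather than hard-coding a list, and the helper names group_components and incomplete_sstables are made up for the example.

```python
from collections import Counter, defaultdict


def group_components(filenames):
    """Map each SSTable base name to the set of component suffixes found."""
    groups = defaultdict(set)
    for name in filenames:
        # Split "ks-table-ic-12-Data.db" into "ks-table-ic-12" and "Data.db".
        base, sep, component = name.rpartition("-")
        if sep:
            groups[base].add(component)
    return dict(groups)


def incomplete_sstables(groups):
    """Return {base: missing components}, relative to the most common set."""
    if not groups:
        return {}
    counts = Counter(frozenset(parts) for parts in groups.values())
    expected = counts.most_common(1)[0][0]
    return {
        base: sorted(expected - parts)
        for base, parts in groups.items()
        if expected - parts
    }


# Sample file names for illustration only; in practice the names could come
# from something like (p.name for p in pathlib.Path(data_dir).iterdir()).
sample = [
    "ks-table-ic-1-Data.db", "ks-table-ic-1-Index.db",
    "ks-table-ic-2-Data.db", "ks-table-ic-2-Index.db",
    "ks-table-ic-3-Data.db",
]
suspects = incomplete_sstables(group_components(sample))
print(suspects)
```

A file flagged this way is only a candidate; the startup log remains the authoritative record of whether the SSTable was actually opened.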

If you have the last startup in the logs, check for an “Opening…” message or an ERROR about the file.
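
The log check can be sketched as a small filter. This is a hedged example, not a description of Cassandra's log format: it only assumes that relevant lines mention the SSTable's file name together with the word "Opening" or "ERROR", and the function name, the sample lines, and the way the log is read are all placeholders.

```python
def lines_about_sstable(log_lines, sstable_name):
    """Split log lines that mention sstable_name into (opened, errors)."""
    opened, errors = [], []
    for line in log_lines:
        if sstable_name not in line:
            continue
        text = line.rstrip("\n")
        if "ERROR" in text:
            errors.append(text)
        elif "Opening" in text:
            opened.append(text)
    return opened, errors


# Placeholder lines for illustration; these are NOT real Cassandra output.
sample_log = [
    "INFO Opening /data/ks/table/ks-table-ic-1 (123 bytes)\n",
    "INFO Opening /data/ks/table/ks-table-ic-2 (456 bytes)\n",
    "ERROR problem reading /data/ks/table/ks-table-ic-3-Data.db\n",
]

opened, errors = lines_about_sstable(sample_log, "ks-table-ic-3")
print("opened:", opened)
print("errors:", errors)
```

To run it against a real log, pass an open file object as log_lines. An SSTable that yields neither an "Opening" line nor an ERROR since the last restart is the case described above: present on disk but not in use.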

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
