Posted to user@cassandra.apache.org by Jürgen Albersdorfer <Ju...@zweiradteile.net> on 2018/04/04 11:32:41 UTC

Urgent Problem - Disk full

Hi,

I have an urgent problem: I will run out of disk space in the near future.
The largest table is a time-series table with TimeWindowCompactionStrategy (TWCS) and default_time_to_live = 0.
The keyspace replication factor is RF=3, and I run Cassandra (C*) version 3.11.2.
We have grown the cluster over time, so the SSTable files have different dates on different nodes.

From the application's standpoint it would be safe to lose some of the oldest data.

Is it safe to delete some of the oldest SSTable files, the ones TWCS compaction will no longer touch, while a node is cleanly shut down, doing so for one node after another?

Or is there a different way to free some disk space? Any suggestions?

best regards
Jürgen Albersdorfer

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org

Re: Urgent Problem - Disk full

Posted by Jürgen Albersdorfer <ja...@gmail.com>.
Thank you all for your hints on this.
I added another data folder on the commit-log disk to relieve the immediate pressure.

The next step will be to reorganize and deduplicate the data into a second table.
Then I will drop the original one, clear the snapshot, move all the data files back off the commit-log disk, and set up monitoring ;)

Thank you, regards
Jürgen

RE: Urgent Problem - Disk full

Posted by Kenneth Brotman <ke...@yahoo.com.INVALID>.
Agreed - you tend to add capacity to nodes, or add nodes, once you know there is no unneeded data left in the cluster.

 

From: Alain RODRIGUEZ [mailto:arodrime@gmail.com]
Sent: Wednesday, April 04, 2018 9:10 AM
To: user@cassandra.apache.org
Subject: Re: Urgent Problem - Disk full

Re: Urgent Problem - Disk full

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi,

When the disks are full, here are the options I can think of, depending on the situation and how 'full' the disks really are:

- Add capacity: add a disk, use JBOD by adding a second data directory for the sstables, move some of them around, and then restart Cassandra. Or add a new node.
- Reduce the disk space used. Some options that come to mind:

1 - Clean tombstones *if any* (use sstablemetadata, for example, to check the number of droppable tombstones). If some are not being purged, my first guess would be to set 'unchecked_tombstone_compaction' to 'true' at the node level. Be aware, though, that the compactions this triggers temporarily take up even more space before freeing any!

If the remaining space is really low on one node, after making the change above you can choose to compact only the sstables with the highest tombstone ratios that still fit in the disk space you have left. This can even be scripted. It worked for me in the past with a disk 100% full. If you do so, you might also have to disable and re-enable automatic compactions at key moments.
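A sketch of option 1 (paths, keyspace/table names, and the exact sstablemetadata output format are assumptions; verify against your Cassandra version):

```shell
# Rank sstables by their estimated droppable tombstone ratio.
parse_droppable() {
  # Pull the ratio out of sstablemetadata output.
  awk -F': ' '/Estimated droppable tombstones/ {print $2}'
}

for f in /var/lib/cassandra/data/my_ks/my_table-*/*-Data.db; do
  [ -e "$f" ] || continue
  echo "$(sstablemetadata "$f" 2>/dev/null | parse_droppable)  $f"
done | sort -rn | head

# Note: ALTER TABLE applies cluster-wide (node-level needs JMX); this statement
# replaces the table's compaction options, so include your existing ones too.
command -v cqlsh >/dev/null && cqlsh -e "ALTER TABLE my_ks.my_table
  WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                     'unchecked_tombstone_compaction': 'true'};" || true
```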

2 - If you added nodes to the data center recently, you can consider running 'nodetool cleanup'. But here again, it will start by using more space for temporary sstables, and it might have no positive impact if the node only owns data for its own token ranges.

3 - Another common way to easily reclaim space is to clear snapshots that are no longer needed and might have been forgotten, or were taken automatically by Cassandra: 'nodetool clearsnapshot'. The only risk here is removing a snapshot that is still a useful backup.
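A minimal sketch for option 3 (the data directory path is an assumption for your layout):

```shell
# Measure what snapshots are holding on this node before clearing anything.
snapshot_usage() {
  # $1 = Cassandra data directory; prints "KB<TAB>path" per snapshots dir.
  find "$1" -type d -name snapshots -exec du -sk {} + 2>/dev/null
}

snapshot_usage /var/lib/cassandra/data

# Then, per node:
#   nodetool listsnapshots              # names and "true size" of each snapshot
#   nodetool clearsnapshot -t <name>    # with no -t it clears ALL snapshots
```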

4 - Delete data from this table or another table, effectively, by removing the sstables directly (which works since you use TWCS), if you don't need the data anyway.

5 - Truncate one of those other tables we all tend to have that are written 'just in case' and have never actually been used or read for months. That has been a powerful way out of this situation for me in the past too :). In short: make sure the disk space is being used for a good reason.


> There is zero reason to believe a full repair would make this better and a
> lot of reason to believe it’ll make it worse

I second that, just in case. Really, do not run a repair. The only thing it could do is bring more data to a node that really doesn't need it right now.

Finally, once this is behind you, disk usage is something you should consider monitoring, as it is far easier to fix preemptively, before the disk is completely full. Usually keeping 20 to 50% of the disk free is recommended, depending on your use case.
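A hedged sketch of such a check (the threshold and data path are assumptions; wire it into cron or your alerting):

```shell
# Warn when the Cassandra data disk crosses a usage threshold.
DATA_DIR=/var/lib/cassandra/data   # assumed location
THRESHOLD=70                       # % used; leaves headroom for compactions

disk_status() {
  # $1 = used percent, $2 = threshold; prints WARN or OK.
  if [ "$1" -ge "$2" ]; then echo WARN; else echo OK; fi
}

USED=$(df --output=pcent "$DATA_DIR" 2>/dev/null | tail -1 | tr -dc '0-9')
disk_status "${USED:-0}" "$THRESHOLD"
```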

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


RE: Urgent Problem - Disk full

Posted by Kenneth Brotman <ke...@yahoo.com.INVALID>.
There are also old snapshots to remove, which could account for a significant amount of disk space.



RE: Urgent Problem - Disk full

Posted by Kenneth Brotman <ke...@yahoo.com.INVALID>.
Jeff,

Just wondering: why wouldn't the answer be to:
	1. move anything you want to archive to colder storage off the cluster,
	2. run nodetool cleanup,
	3. take a snapshot,
	4. use delete commands to remove the archived data?
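A sketch of those four steps for one table (keyspace, table, column names, and the cutoff are all placeholders; note that step 4's deletes create tombstones, which is why dropping whole old TWCS sstables or using TTLs is often cheaper for time series):

```shell
KS=my_ks; TABLE=events
CUTOFF='2017-01-01'   # assumed retention boundary

# 1. dump old data to cold storage (COPY exports the whole table; filter offline)
command -v cqlsh >/dev/null && cqlsh -e "COPY $KS.$TABLE TO '/archive/$TABLE.csv';" || true
# 2. drop data this node no longer owns
command -v nodetool >/dev/null && nodetool cleanup "$KS" || true
# 3. keep a point-in-time safety copy before deleting
command -v nodetool >/dev/null && nodetool snapshot -t pre_archive "$KS" || true
# 4. delete the archived rows (per partition; range deletes need the partition key)
command -v cqlsh >/dev/null && cqlsh -e "DELETE FROM $KS.$TABLE
  WHERE id = 'some-partition' AND ts < '$CUTOFF';" || true
```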

Kenneth Brotman



Re: Urgent Problem - Disk full

Posted by Jeff Jirsa <jj...@gmail.com>.
Yes, this works with TWCS.

Note, though, that if you have tombstone compaction subproperties set, there may be sstables with newer filesystem timestamps that actually hold older Cassandra data. In that case sstablemetadata can help find the sstables with truly old data timestamps.

Also, if you’ve expanded the cluster over time and you see an imbalance of disk usage on the oldest hosts, “nodetool cleanup” will likely free up some of that data.
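A sketch of using sstablemetadata that way (paths and the output format are assumptions; verify against your Cassandra version):

```shell
# List sstables by their true maximum data timestamp (microseconds since epoch),
# so the genuinely old TWCS windows show up even when filesystem mtimes lie.
micros_to_date() {
  # Convert a microsecond epoch timestamp to a readable UTC date (GNU date).
  date -u -d "@$(( $1 / 1000000 ))" '+%Y-%m-%d %H:%M'
}

for f in /var/lib/cassandra/data/my_ks/my_table-*/*-Data.db; do
  [ -e "$f" ] || continue
  ts=$(sstablemetadata "$f" 2>/dev/null | awk '/Maximum timestamp/ {print $3}')
  [ -n "$ts" ] && echo "$(micros_to_date "$ts")  $f"
done | sort
```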



-- 
Jeff Jirsa


> On Apr 4, 2018, at 4:32 AM, Jürgen Albersdorfer <Ju...@zweiradteile.net> wrote:
> 
> Hi,
> 
> I have an urgent Problem. - I will run out of disk space in near future.
> Largest Table is a Time-Series Table with TimeWindowCompactionStrategy (TWCS) and default_time_to_live = 0
> Keyspace Replication Factor RF=3. I run C* Version 3.11.2
> We have grown the Cluster over time, so SSTable files have different Dates on different Nodes.
> 
> From Application Standpoint it would be safe to loose some of the oldest Data.
> 
> Is it safe to delete some of the oldest SSTable Files, which will no longer get touched by TWCS Compaction any more, while Node is clean Shutdown? - And doing so for one Node after another?
> 
> Or maybe there is a different way to free some disk space? - Any suggestions?
> 
> best regards
> Jürgen Albersdorfer
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: user-help@cassandra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org


Re: Urgent Problem - Disk full

Posted by Jeff Jirsa <jj...@gmail.com>.
There is zero reason to believe a full repair would make this better, and a lot of reason to believe it’ll make it worse.

For casual observers following along at home, this is probably not the answer you’re looking for.

-- 
Jeff Jirsa


> On Apr 4, 2018, at 4:37 AM, Rahul Singh <ra...@gmail.com> wrote:
> 
> Nothing a full repair won’t be able to fix. 
> 

RE: Urgent Problem - Disk full

Posted by Kenneth Brotman <ke...@yahoo.com.INVALID>.
Assuming the data model is good and there haven’t been any sudden jumps in data volume, the normal thing to do seems to be to archive some of the old time-series data that you don’t care about.

 

Kenneth Brotman

From: Rahul Singh [mailto:rahul.xavier.singh@gmail.com]
Sent: Wednesday, April 04, 2018 4:38 AM
To: user@cassandra.apache.org
Subject: Re: Urgent Problem - Disk full

Nothing a full repair won’t be able to fix.


Re: Urgent Problem - Disk full

Posted by Rahul Singh <ra...@gmail.com>.
Nothing a full repair won’t be able to fix.
