You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Terje Marthinussen (JIRA)" <ji...@apache.org> on 2011/06/16 04:54:47 UTC

[jira] [Created] (CASSANDRA-2779) files not cleaned up by GC?

files not cleaned up by GC?
---------------------------

                 Key: CASSANDRA-2779
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2779
             Project: Cassandra
          Issue Type: Bug
            Reporter: Terje Marthinussen
            Priority: Critical
             Fix For: 0.8.0


This is 0.8.0 + a few 0.8.1 patches on repair.

We tested repair on 2 nodes in the cluster last night. 
Interestingly enough, I don't believe the node described here is in any way neighbour of the nodes we tested repair on so I am not sure why it is streaming data both in and out, but in any case, it has joined the streaming party.

We now see:
ERROR [CompactionExecutor:5] 2011-06-16 09:12:23,928 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
 INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space

And we see a lot of them:
 INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
 INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space

Available disk is 105GB and it is trying to compact a set of the largest sstables. There is probably easily enough disk to do so, but the estimation is not sufficient (lots of dupes here after streaming I think, probably heavily affected by CASSANDRA-2698). 

It is trying to compact 2 sstables of 58 and 41GB.

If I look at the data dir, I see 46 *Compacted files which makes up an additional 137GB of space.
The oldest of these Compacted files dates back to Jun 16th 01:26, so 10 hours old.

It does however succeed  at cleaning up some files. There are definitely files which do get deleted. Just that there is a lot which is not.

Either the GC cleanup tactic is seriously flawed or we have a potential bug keeping references to sstable objects?

At least one of the sstables not cleaned up dates back before the repair was started, but most of them is from afterwards.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (CASSANDRA-2779) files not cleaned up by GC?

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-2779.
---------------------------------------

    Resolution: Duplicate

CASSANDRA-2521 is the right fix here.

> files not cleaned up by GC?
> ---------------------------
>
>                 Key: CASSANDRA-2779
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2779
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Terje Marthinussen
>
> This is 0.8.0 + a few 0.8.1 patches on repair.
> We tested repair on 2 nodes in the cluster last night. 
> Interestingly enough, I don't believe the node described here is in any way neighbour of the nodes we tested repair on so I am not sure why it is streaming data both in and out, but in any case, it has joined the streaming party.
> We now see:
> ERROR [CompactionExecutor:5] 2011-06-16 09:12:23,928 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
> And we see a lot of them:
>  INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space
> Available disk is 105GB and it is trying to compact a set of the largest sstables. There is probably easily enough disk to do so, but the estimation is not sufficient (lots of dupes here after streaming I think, probably heavily affected by CASSANDRA-2698). 
> It is trying to compact 2 sstables of 58 and 41GB.
> If I look at the data dir, I see 46 *Compacted files which makes up an additional 137GB of space.
> The oldest of these Compacted files dates back to Jun 16th 01:26, so 10 hours old.
> It does however succeed  at cleaning up some files. There are definitely files which do get deleted. Just that there is a lot which is not.
> Either the GC cleanup tactic is seriously flawed or we have a potential bug keeping references to sstable objects?
> At least one of the sstables not cleaned up dates back before the repair was started, but most of them is from afterwards.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-2779) files not cleaned up by GC?

Posted by "Terje Marthinussen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050314#comment-13050314 ] 

Terje Marthinussen commented on CASSANDRA-2779:
-----------------------------------------------

If we get CASSANDRA-2521 in for 0.8 then first thing would be to check if this magically fixes itself.

If not, I think I can volunteer to tune things further. 
Starting to get familiar with the code and I already have a setup to test and reproduce.

> files not cleaned up by GC?
> ---------------------------
>
>                 Key: CASSANDRA-2779
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2779
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Terje Marthinussen
>
> This is 0.8.0 + a few 0.8.1 patches on repair.
> We tested repair on 2 nodes in the cluster last night. 
> Interestingly enough, I don't believe the node described here is in any way neighbour of the nodes we tested repair on so I am not sure why it is streaming data both in and out, but in any case, it has joined the streaming party.
> We now see:
> ERROR [CompactionExecutor:5] 2011-06-16 09:12:23,928 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
> And we see a lot of them:
>  INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space
> Available disk is 105GB and it is trying to compact a set of the largest sstables. There is probably easily enough disk to do so, but the estimation is not sufficient (lots of dupes here after streaming I think, probably heavily affected by CASSANDRA-2698). 
> It is trying to compact 2 sstables of 58 and 41GB.
> If I look at the data dir, I see 46 *Compacted files which makes up an additional 137GB of space.
> The oldest of these Compacted files dates back to Jun 16th 01:26, so 10 hours old.
> It does however succeed  at cleaning up some files. There are definitely files which do get deleted. Just that there is a lot which is not.
> Either the GC cleanup tactic is seriously flawed or we have a potential bug keeping references to sstable objects?
> At least one of the sstables not cleaned up dates back before the repair was started, but most of them is from afterwards.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-2779) files not cleaned up by GC?

Posted by "Terje Marthinussen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050213#comment-13050213 ] 

Terje Marthinussen commented on CASSANDRA-2779:
-----------------------------------------------

The problem is the disk use.

I have no problem with the disk use if all recommendations for Cassandra is changed from typically "2x disk space required to hold the dataset" to "4x disk space vs. what is needed to hold the dataset".

Either we have to fix this issue to keep disk space down or we have to change the recommendations.

> files not cleaned up by GC?
> ---------------------------
>
>                 Key: CASSANDRA-2779
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2779
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Terje Marthinussen
>
> This is 0.8.0 + a few 0.8.1 patches on repair.
> We tested repair on 2 nodes in the cluster last night. 
> Interestingly enough, I don't believe the node described here is in any way neighbour of the nodes we tested repair on so I am not sure why it is streaming data both in and out, but in any case, it has joined the streaming party.
> We now see:
> ERROR [CompactionExecutor:5] 2011-06-16 09:12:23,928 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
> And we see a lot of them:
>  INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space
> Available disk is 105GB and it is trying to compact a set of the largest sstables. There is probably easily enough disk to do so, but the estimation is not sufficient (lots of dupes here after streaming I think, probably heavily affected by CASSANDRA-2698). 
> It is trying to compact 2 sstables of 58 and 41GB.
> If I look at the data dir, I see 46 *Compacted files which makes up an additional 137GB of space.
> The oldest of these Compacted files dates back to Jun 16th 01:26, so 10 hours old.
> It does however succeed  at cleaning up some files. There are definitely files which do get deleted. Just that there is a lot which is not.
> Either the GC cleanup tactic is seriously flawed or we have a potential bug keeping references to sstable objects?
> At least one of the sstables not cleaned up dates back before the repair was started, but most of them is from afterwards.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Reopened] (CASSANDRA-2779) files not cleaned up by GC?

Posted by "Terje Marthinussen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Terje Marthinussen reopened CASSANDRA-2779:
-------------------------------------------


Reopening until a decision is made to fix either documentation or disk usage,

Things like:
http://wiki.apache.org/cassandra/CassandraHardware
As covered in MemtableSSTable, compactions can require up to 100% of your in-use space temporarily in the worst case

should not exist, and in addition to an up to 100% increase of data from streaming, you need 100% for compaction and headroom to avoid full GC from running every few minutes like occured in this Jira (basically an unusable service)

> files not cleaned up by GC?
> ---------------------------
>
>                 Key: CASSANDRA-2779
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2779
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Terje Marthinussen
>
> This is 0.8.0 + a few 0.8.1 patches on repair.
> We tested repair on 2 nodes in the cluster last night. 
> Interestingly enough, I don't believe the node described here is in any way neighbour of the nodes we tested repair on so I am not sure why it is streaming data both in and out, but in any case, it has joined the streaming party.
> We now see:
> ERROR [CompactionExecutor:5] 2011-06-16 09:12:23,928 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
> And we see a lot of them:
>  INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space
> Available disk is 105GB and it is trying to compact a set of the largest sstables. There is probably easily enough disk to do so, but the estimation is not sufficient (lots of dupes here after streaming I think, probably heavily affected by CASSANDRA-2698). 
> It is trying to compact 2 sstables of 58 and 41GB.
> If I look at the data dir, I see 46 *Compacted files which makes up an additional 137GB of space.
> The oldest of these Compacted files dates back to Jun 16th 01:26, so 10 hours old.
> It does however succeed  at cleaning up some files. There are definitely files which do get deleted. Just that there is a lot which is not.
> Either the GC cleanup tactic is seriously flawed or we have a potential bug keeping references to sstable objects?
> At least one of the sstables not cleaned up dates back before the repair was started, but most of them is from afterwards.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (CASSANDRA-2779) files not cleaned up by GC?

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-2779.
---------------------------------------

       Resolution: Not A Problem
    Fix Version/s:     (was: 0.8.0)

I don't see a problem here (I think repair holding references to files being streamed is totally reasonable)

> files not cleaned up by GC?
> ---------------------------
>
>                 Key: CASSANDRA-2779
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2779
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Terje Marthinussen
>
> This is 0.8.0 + a few 0.8.1 patches on repair.
> We tested repair on 2 nodes in the cluster last night. 
> Interestingly enough, I don't believe the node described here is in any way neighbour of the nodes we tested repair on so I am not sure why it is streaming data both in and out, but in any case, it has joined the streaming party.
> We now see:
> ERROR [CompactionExecutor:5] 2011-06-16 09:12:23,928 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
> And we see a lot of them:
>  INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space
> Available disk is 105GB and it is trying to compact a set of the largest sstables. There is probably easily enough disk to do so, but the estimation is not sufficient (lots of dupes here after streaming I think, probably heavily affected by CASSANDRA-2698). 
> It is trying to compact 2 sstables of 58 and 41GB.
> If I look at the data dir, I see 46 *Compacted files which makes up an additional 137GB of space.
> The oldest of these Compacted files dates back to Jun 16th 01:26, so 10 hours old.
> It does however succeed  at cleaning up some files. There are definitely files which do get deleted. Just that there is a lot which is not.
> Either the GC cleanup tactic is seriously flawed or we have a potential bug keeping references to sstable objects?
> At least one of the sstables not cleaned up dates back before the repair was started, but most of them is from afterwards.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-2779) files not cleaned up by GC?

Posted by "Terje Marthinussen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Terje Marthinussen updated CASSANDRA-2779:
------------------------------------------

    Priority: Major  (was: Critical)

Dropping to major again. Things do get cleaned up, eventually...

> files not cleaned up by GC?
> ---------------------------
>
>                 Key: CASSANDRA-2779
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2779
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Terje Marthinussen
>             Fix For: 0.8.0
>
>
> This is 0.8.0 + a few 0.8.1 patches on repair.
> We tested repair on 2 nodes in the cluster last night. 
> Interestingly enough, I don't believe the node described here is in any way neighbour of the nodes we tested repair on so I am not sure why it is streaming data both in and out, but in any case, it has joined the streaming party.
> We now see:
> ERROR [CompactionExecutor:5] 2011-06-16 09:12:23,928 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
> And we see a lot of them:
>  INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space
> Available disk is 105GB and it is trying to compact a set of the largest sstables. There is probably easily enough disk to do so, but the estimation is not sufficient (lots of dupes here after streaming I think, probably heavily affected by CASSANDRA-2698). 
> It is trying to compact 2 sstables of 58 and 41GB.
> If I look at the data dir, I see 46 *Compacted files which makes up an additional 137GB of space.
> The oldest of these Compacted files dates back to Jun 16th 01:26, so 10 hours old.
> It does however succeed  at cleaning up some files. There are definitely files which do get deleted. Just that there is a lot which is not.
> Either the GC cleanup tactic is seriously flawed or we have a potential bug keeping references to sstable objects?
> At least one of the sstables not cleaned up dates back before the repair was started, but most of them is from afterwards.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-2779) files not cleaned up by GC?

Posted by "Terje Marthinussen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050191#comment-13050191 ] 

Terje Marthinussen commented on CASSANDRA-2779:
-----------------------------------------------

Interestingly enough... Just as I posted this it started removing the files and the case study disappeared and started working fine.

A few minutes before doing so, if completed "manual repair", so yes, I guess repair is a likely candidate for keeping file references.

If I remember right, it builds a list of files to stream at the start. They may not be dereferenced fully until the streaming finishes?


> files not cleaned up by GC?
> ---------------------------
>
>                 Key: CASSANDRA-2779
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2779
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Terje Marthinussen
>            Priority: Critical
>             Fix For: 0.8.0
>
>
> This is 0.8.0 + a few 0.8.1 patches on repair.
> We tested repair on 2 nodes in the cluster last night. 
> Interestingly enough, I don't believe the node described here is in any way neighbour of the nodes we tested repair on so I am not sure why it is streaming data both in and out, but in any case, it has joined the streaming party.
> We now see:
> ERROR [CompactionExecutor:5] 2011-06-16 09:12:23,928 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
> And we see a lot of them:
>  INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space
> Available disk is 105GB and it is trying to compact a set of the largest sstables. There is probably easily enough disk to do so, but the estimation is not sufficient (lots of dupes here after streaming I think, probably heavily affected by CASSANDRA-2698). 
> It is trying to compact 2 sstables of 58 and 41GB.
> If I look at the data dir, I see 46 *Compacted files which makes up an additional 137GB of space.
> The oldest of these Compacted files dates back to Jun 16th 01:26, so 10 hours old.
> It does however succeed  at cleaning up some files. There are definitely files which do get deleted. Just that there is a lot which is not.
> Either the GC cleanup tactic is seriously flawed or we have a potential bug keeping references to sstable objects?
> At least one of the sstables not cleaned up dates back before the repair was started, but most of them is from afterwards.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira