Posted to user@cassandra.apache.org by Shotaro Kamio <ka...@gmail.com> on 2011/06/15 16:50:42 UTC

Re: Forcing Cassandra to free up some space

We've run into a situation where compacted sstable files aren't
deleted after a node repair. Even when GC is triggered via JMX, it
sometimes leaves compacted files behind. In one case, a lot of files
were left, and some have already been sitting there for more than
10 hours. There is no guarantee that GC will clean up all compacted
sstable files.

We are very interested in the following ticket:
https://issues.apache.org/jira/browse/CASSANDRA-2521


Regards,
Shotaro
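
For readers unsure what "triggering gc via jmx" amounts to, here is a
minimal sketch. It assumes the operation in question is the standard
gc() operation on the java.lang:type=Memory MXBean (the message above
does not say which JMX operation was actually used). The class and
method names are hypothetical. The demo talks to the local platform
MBean server; a remote node would need a JMX connector instead.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import javax.management.MBeanServerConnection;

public class JmxGcSketch {
    /** Sums the collection counts of all collectors in this (local) JVM. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Invokes the Memory MXBean's gc() operation over the given connection. */
    static void requestGc(MBeanServerConnection connection) {
        try {
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            memory.gc(); // documented as equivalent to calling System.gc()
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long before = totalCollections();
        // Local demo only: the platform MBean server is itself an MBeanServerConnection.
        requestGc(ManagementFactory.getPlatformMBeanServer());
        long after = totalCollections();
        System.out.println("collections before=" + before + ", after=" + after);
    }
}
```

Whether the count actually goes up depends on the collector and on
flags such as DisableExplicitGC, and even a completed collection says
nothing about whether the files tied to collected objects have been
deleted yet.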


On Fri, May 27, 2011 at 11:27 AM, Jeffrey Kesselman <je...@gmail.com> wrote:
> I'm also not sure that will guarantee all space is cleaned up.  It
> really depends on what you are doing inside Cassandra.  If you have
> your own garbage collection that is just in some way tied to the gc run,
> then it will run when it runs.
>
> If, otoh, you are associating records in your storage with specific
> objects in memory and using one of the post-mortem hooks (finalize or
> PhantomReference) to tell you to clean up that particular record, then
> it's quite possible they won't all get cleaned up.  In general HotSpot
> does not find and clean every candidate object on every GC run.  It
> starts with the easiest/fastest to find and then sees what more it
> thinks it needs to do to create enough memory for anticipated near
> future needs.
>
> On Thu, May 26, 2011 at 10:16 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>> In summary, System.gc works fine unless you've deliberately done
>> something like setting the -XX:+DisableExplicitGC flag.
>>
>> On Thu, May 26, 2011 at 5:58 PM, Konstantin  Naryshkin
>> <ko...@a-bb.net> wrote:
>>> So, in summary, there is no way to predictably and efficiently tell Cassandra to get rid of all of the extra space it is using on disk?
>>>
>>> ----- Original Message -----
>>> From: "Jeffrey Kesselman" <je...@gmail.com>
>>> To: user@cassandra.apache.org
>>> Sent: Thursday, May 26, 2011 8:57:49 PM
>>> Subject: Re: Forcing Cassandra to free up some space
>>>
>>> Which JVM?  Which collector?  There have been and continue to be many.
>>>
>>> Hotspot itself supports a number of different collectors with
>>> different behaviors.   Many of them do not collect every candidate on
>>> every gc, but merely the easiest ones to find.  This is why depending
>>> on finalizers is a *bad* idea in Java code.  They may well never get
>>> run.  (Finalizers are one of a few features the Sun Java team always
>>> regretted putting in Java to start with.  They have caused quite a few
>>> application problems over the years.)
>>>
>>> The really important thing is that NONE of these behaviors of the
>>> collectors are guaranteed by specification not to change from version
>>> to version.  Basing your code on non-specified behaviors is a good way
>>> to hit mysterious failures on updates.
>>>
>>> For instance, in the mid 90s, IBM had a mode of their VM called
>>> "infinite heap."  It *never* garbage collected, even if you called
>>> System.gc.  Instead it just threw away address space and counted on
>>> the total memory needs for the life of the program being less than the
>>> total addressable space of the processor.
>>>
>>> It was *very* fast for certain kinds of applications.
>>>
>>> Far from being pedantic, not depending on undocumented behavior is
>>> simply good engineering.
>>>
>>>
>>> On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>>> I've read the relevant source. While you're pedantically correct re
>>>> the spec, you're wrong as to what the JVM actually does.
>>>>
>>>> On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman <je...@gmail.com> wrote:
>>>>> Some references...
>>>>>
>>>>> "An object enters an unreachable state when no more strong references
>>>>> to it exist. When an object is unreachable, it is a candidate for
>>>>> collection. Note the wording: Just because an object is a candidate
>>>>> for collection doesn't mean it will be immediately collected. The JVM
>>>>> is free to delay collection until there is an immediate need for the
>>>>> memory being consumed by the object."
>>>>>
>>>>> http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394
>>>>>
>>>>> and "Calling the gc method suggests that the Java Virtual Machine
>>>>> expend effort toward recycling unused objects"
>>>>>
>>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()
>>>>>
>>>>> It goes on to say that the VM will make a "best effort", but "best
>>>>> effort" is *deliberately* left up to the definition of the gc
>>>>> implementor.
>>>>>
>>>>> I guess you missed the many lectures I have given on this subject over
>>>>> the years at Java One Conferences....
>>>>>
>>>>> On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>>>>> It's a common misunderstanding that System.gc is only a suggestion; on
>>>>>> any VM you're likely to run Cassandra on, System.gc will actually
>>>>>> invoke a full collection.
>>>>>>
>>>>>> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman <je...@gmail.com> wrote:
>>>>>>> Actually this is no guarantee.   It's a common misunderstanding that
>>>>>>> System.gc "forces" gc.  It does not. It is a suggestion only. The VM always
>>>>>>> has the option as to when and how much it gcs.
>>>>>>>
>>>>>>> On May 26, 2011 2:51 PM, "Jonathan Ellis" <jb...@gmail.com> wrote:
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jonathan Ellis
>>>>>> Project Chair, Apache Cassandra
>>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>>> http://www.datastax.com
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> It's always darkest just before you are eaten by a grue.
>>>>>



-- 
Shotaro Kamio

Re: Forcing Cassandra to free up some space

Posted by Ryan King <ry...@twitter.com>.
There's a ticket open to address this:

https://issues.apache.org/jira/browse/CASSANDRA-1974

-ryan

On Wed, Jun 15, 2011 at 8:49 AM, Terje Marthinussen
<tm...@gmail.com> wrote:
>
>
> On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen
> <tm...@gmail.com> wrote:
>>
>> Even if the gc call cleaned all files, it is not really acceptable on a
>> decent sized cluster due to the impact full gc has on performance.
>> Especially non-needed ones.
>>
>
> Not acceptable as running GC on every node in the cluster will further
> increase the time period when you have degraded performance.
>
> Terje
>

Re: Forcing Cassandra to free up some space

Posted by Terje Marthinussen <tm...@gmail.com>.
On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen <
tmarthinussen@gmail.com> wrote:

> Even if the gc call cleaned all files, it is not really acceptable on a
> decent sized cluster due to the impact full gc has on performance.
> Especially non-needed ones.
>
>
Not acceptable as running GC on every node in the cluster will further
increase the time period when you have degraded performance.

Terje

Re: Forcing Cassandra to free up some space

Posted by Peter Schuller <pe...@infidyne.com>.
> Even if the gc call cleaned all files, it is not really acceptable on a
> decent sized cluster due to the impact full gc has on performance.
> Especially non-needed ones.

You can run with -XX:+ExplicitGCInvokesConcurrent to "safely" trigger
CMS cycles. However, that also means the System.gc() semantics change,
so I'm not sure offhand what will happen to the automatic System.gc()
code in Cassandra that attempts to free space.

CASSANDRA-2521 is IMO the real solution.

-- 
/ Peter Schuller
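
As a point of comparison with GC-driven cleanup, here is a minimal
sketch of deleting a file deterministically through explicit reference
counting. It is a generic technique, not a description of
CASSANDRA-2521 (the thread does not spell out that ticket's design),
and every class and method name below is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedFileSketch {
    private final Path path;
    // Starts at 1: the "owner" reference, held until the file is marked obsolete.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCountedFileSketch(Path path) {
        this.path = path;
    }

    /** A reader takes a reference; returns false once the file has been released for good. */
    boolean acquire() {
        while (true) {
            int current = refs.get();
            if (current <= 0) {
                return false;
            }
            if (refs.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Drops one reference; whoever drops the last one deletes the file right away. */
    void release() {
        if (refs.decrementAndGet() == 0) {
            try {
                Files.deleteIfExists(path);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Called once when the file is no longer needed (e.g. after compaction). */
    void markObsolete() {
        release(); // gives up the owner reference
    }

    /** Helper for the demo: creates an empty temporary file. */
    static Path newTempFile() {
        try {
            return Files.createTempFile("sstable-sketch", ".db");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path p = newTempFile();
        RefCountedFileSketch file = new RefCountedFileSketch(p);
        file.acquire();        // a read is in progress
        file.markObsolete();   // compaction finished, file is no longer needed
        System.out.println("exists during read: " + Files.exists(p));
        file.release();        // read finished, last reference dropped
        System.out.println("exists after last release: " + Files.exists(p));
    }
}
```

The point of the design is that deletion happens at the moment the last
user lets go, with no dependence on when or whether the collector
notices an unreachable object.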

Re: Forcing Cassandra to free up some space

Posted by AJ <aj...@dude.podzone.net>.
With regard to cleaning up old sstable files, I posed this question
before, as I noticed after taking a snapshot that the older
(pre-compaction) files shared no links with the snapshots.  Therefore,
if the Cass snapshot functionality is working correctly, those older
files can be manually deleted.  The reasoning is simply that if you
were to do a backup based on the snapshots that Cass created, then
those older (pre-compaction) files would be left out of the backup.
Therefore, they are no longer needed.

But, I never got a definitive answer to this.  If the Cass snapshot 
functionality can be relied upon with 100% confidence, then all you have 
to do is take a snapshot, then delete all the files with hard links <= 1 
and with mod times prior to the snapshotted files.  But, again, this is 
only considered safe if the Cass snapshot function is 100% reliable.  I 
have no reason to believe it's not... just saying.
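
The link-count check described above can be sketched as follows. This
only lists candidates and deletes nothing. It assumes a Unix-like
platform where the JDK exposes the "unix:nlink" file attribute, it
leaves out the modification-time comparison, and the class name, method
names and demo file names are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class UnlinkedFilesSketch {
    /** Returns regular files directly under dir whose hard-link count is <= 1. */
    static List<Path> filesWithoutExtraLinks(Path dir) {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path p : entries) {
                if (!Files.isRegularFile(p)) {
                    continue; // skip sub-directories such as a snapshots folder
                }
                // "unix:nlink" is only available on Unix-like file systems.
                int links = (Integer) Files.getAttribute(p, "unix:nlink");
                if (links <= 1) {
                    result.add(p);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** Builds a throwaway directory: one unlinked file, one file hard-linked into snapshots/. */
    static Path demoDir() {
        try {
            Path dir = Files.createTempDirectory("sstables-sketch");
            Path snapshots = Files.createDirectory(dir.resolve("snapshots"));
            Files.createFile(dir.resolve("old-Data.db"));
            Path live = Files.createFile(dir.resolve("live-Data.db"));
            Files.createLink(snapshots.resolve("live-Data.db"), live);
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (Path p : filesWithoutExtraLinks(demoDir())) {
            System.out.println("candidate: " + p.getFileName());
        }
    }
}
```

As the message says, acting on such a list is only safe if the snapshot
really does link every file that is still live.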

On 6/15/2011 9:48 AM, Terje Marthinussen wrote:
> Even if the gc call cleaned all files, it is not really acceptable on 
> a decent sized cluster due to the impact full gc has on performance. 
> Especially non-needed ones.
>
> The delay in file deletion can also at times make it hard to see how 
> much spare disk you actually have.
>
> We easily see a 100% increase in disk use which extends for long periods
> of time before anything gets cleaned up. This can be quite misleading,
> and I believe on a couple of occasions we have seen short-term full-disk
> scenarios during testing as a result of cleanup not happening entirely
> when it should...
>
> Terje


Re: Forcing Cassandra to free up some space

Posted by Terje Marthinussen <tm...@gmail.com>.
Even if the gc call cleaned all files, it is not really acceptable on a
decent sized cluster due to the impact full gc has on performance.
Especially non-needed ones.

The delay in file deletion can also at times make it hard to see how much
spare disk you actually have.

We easily see a 100% increase in disk use which extends for long periods of
time before anything gets cleaned up. This can be quite misleading, and I
believe on a couple of occasions we have seen short-term full-disk scenarios
during testing as a result of cleanup not happening entirely when it
should...

Terje


Re: Forcing Cassandra to free up some space

Posted by Ryan King <ry...@twitter.com>.
There's a ticket open for this:
https://issues.apache.org/jira/browse/CASSANDRA-2521. Vote on it if
you think it's important.

-ryan

On Wed, Jun 15, 2011 at 7:34 PM, Jeffrey Kesselman <je...@gmail.com> wrote:
> The GC cleanup approach, if depending on specific objects being GCd,
> is fundamentally flawed.
>
> I brought this up earlier, won't restart that thread.  It should be in
> the archives.
>
>
> On Wed, Jun 15, 2011 at 10:17 PM, Terje Marthinussen
> <tm...@gmail.com> wrote:
>> Watching this on a node here right now and it sort of shows how bad this can
>> get.
>> This node still has 109GB free disk by the way...
>> INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [ScheduledTasks:1] 2011-06-16 09:29:29,494 GCInspector.java (line 128)
>> GC for ParNew: 392 ms, 398997776 reclaimed leaving 2334786808 used; max is
>> 10844635136
>>  INFO [ScheduledTasks:1] 2011-06-16 09:29:32,831 GCInspector.java (line 128)
>> GC for ParNew: 737 ms, 332336832 reclaimed leaving 2473311448 used; max is
>> 10844635136
>>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java
>> (line 2071) requesting GC to free disk space
>>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java
>> (line 2071) requesting GC to free disk space
>> If I look at the data dir, I see 46 *Compacted files, which make up an
>> additional 137GB of space.
>> The oldest of these Compacted files dates back to Jun 16th 01:26.
>> If these got deleted, there should actually be enough disk for the node to
>> run a full compaction if needed.
>> Either the GC cleanup tactic is seriously flawed, or we have a potential bug
>> keeping references far longer than needed?
>> Terje
>>
>>
>> On Wed, Jun 15, 2011 at 11:50 PM, Shotaro Kamio <ka...@gmail.com> wrote:
>>>
>>> We've encountered the situation that compacted sstable files aren't
>>> deleted after node repair. Even when gc is triggered via jmx, it
>>> sometimes leaves compacted files. In a case, a lot of files are left.
>>> Some files stay more than 10 hours already. There is no guarantee that
>>> gc will cleanup all compacted sstable files.
>>>
>>> We have a great interest on the following ticket.
>>> https://issues.apache.org/jira/browse/CASSANDRA-2521
>>>
>>>
>>> Regards,
>>> Shotaro
>>>
>>> --
>>> Shotaro Kamio
>>
>>
>
>
>
> --
> It's always darkest just before you are eaten by a grue.
>

Re: Forcing Cassandra to free up some space

Posted by Jeffrey Kesselman <je...@gmail.com>.
The GC cleanup approach, if it depends on specific objects being GC'd,
is fundamentally flawed.

I brought this up earlier and won't restart that thread.  It should be in
the archives.
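
For readers who want to see the alternative in concrete terms, here is a
minimal sketch of deterministic cleanup via explicit reference counting,
as opposed to waiting for the collector to notice an unreachable object.
The class name RefCountedFile and its methods are hypothetical and are not
taken from Cassandra's source; this only illustrates the general technique.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper: the file is deleted as soon as the last user lets go. */
final class RefCountedFile {
    private final Path path;
    // Starts at 1: the "owner" reference held by whoever tracks live files.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCountedFile(Path path) {
        this.path = path;
    }

    /** Readers call this before touching the file; false means it is already gone. */
    boolean acquire() {
        while (true) {
            int n = refs.get();
            if (n <= 0) {
                return false;
            }
            if (refs.compareAndSet(n, n + 1)) {
                return true;
            }
        }
    }

    /** Called by each reader when done, and once by the owner when the file is obsolete. */
    void release() {
        if (refs.decrementAndGet() == 0) {
            try {
                Files.deleteIfExists(path); // happens now, not at some future GC
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

With this shape, the owner calls release() once when compaction makes the
file obsolete, and the delete happens when the last in-flight reader
releases it, regardless of what the collector does.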


On Wed, Jun 15, 2011 at 10:17 PM, Terje Marthinussen
<tm...@gmail.com> wrote:
> Watching this on a node here right now and it sort of shows how bad this can
> get.
> This node still has 109GB free disk by the way...
> INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [ScheduledTasks:1] 2011-06-16 09:29:29,494 GCInspector.java (line 128)
> GC for ParNew: 392 ms, 398997776 reclaimed leaving 2334786808 used; max is
> 10844635136
>  INFO [ScheduledTasks:1] 2011-06-16 09:29:32,831 GCInspector.java (line 128)
> GC for ParNew: 737 ms, 332336832 reclaimed leaving 2473311448 used; max is
> 10844635136
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java
> (line 2071) requesting GC to free disk space
>  INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java
> (line 2071) requesting GC to free disk space
> If I look at the data dir, I see 46 *Compacted files which makes up an
> additional 137GB of space.
> The oldest of these Compacted files dates back to Jun 16th 01:26.
> If these got deleted, there should actually be enough disk for the node to
> run a full compaction run if needed.
> Either the GC cleanup tactic is seriously flawed or  we have a potential bug
> keeping references far longer than needed?
> Terje



-- 
It's always darkest just before you are eaten by a grue.

Re: Forcing Cassandra to free up some space

Posted by Terje Marthinussen <tm...@gmail.com>.
I'm watching this on a node here right now, and it shows how bad this can
get.
This node still has 109GB of free disk, by the way...

INFO [CompactionExecutor:5] 2011-06-16 09:11:59,164 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:5] 2011-06-16 09:12:23,929 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:5] 2011-06-16 09:12:46,489 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:17:53,299 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:18:17,782 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:18:42,078 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:19:06,984 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:19:32,079 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:19:57,265 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:20:22,706 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:20:47,331 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:21:13,062 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:21:38,288 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:22:03,500 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:22:29,407 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:22:55,577 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:23:20,951 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:23:46,448 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:3] 2011-06-16 09:24:12,030 StorageService.java (line 2071) requesting GC to free disk space
INFO [ScheduledTasks:1] 2011-06-16 09:29:29,494 GCInspector.java (line 128) GC for ParNew: 392 ms, 398997776 reclaimed leaving 2334786808 used; max is 10844635136
INFO [ScheduledTasks:1] 2011-06-16 09:29:32,831 GCInspector.java (line 128) GC for ParNew: 737 ms, 332336832 reclaimed leaving 2473311448 used; max is 10844635136
INFO [CompactionExecutor:6] 2011-06-16 09:48:00,633 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:6] 2011-06-16 09:48:26,119 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:6] 2011-06-16 09:48:49,002 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:6] 2011-06-16 10:10:20,196 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:6] 2011-06-16 10:10:45,322 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:6] 2011-06-16 10:11:07,619 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:7] 2011-06-16 11:01:45,562 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:7] 2011-06-16 11:02:10,236 StorageService.java (line 2071) requesting GC to free disk space
INFO [CompactionExecutor:7] 2011-06-16 11:05:31,297 StorageService.java (line 2071) requesting GC to free disk space

If I look at the data dir, I see 46 *Compacted files, which account for an
additional 137GB of space.
The oldest of these Compacted files dates back to Jun 16th 01:26.
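
A rough way to reproduce that kind of measurement is sketched below. It
assumes (and this is an assumption about the on-disk layout, so check it
against your version) that each obsolete sstable has a marker file whose
name ends in "-Compacted" and that its other component files share the
same name prefix. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class CompactedSpace {
    private static final String MARKER_SUFFIX = "Compacted";

    /** Sums the sizes of all files in dir that share a prefix with a "*-Compacted" marker. */
    static long bytesHeldByCompacted(Path dir) {
        long total = 0;
        try (DirectoryStream<Path> markers = Files.newDirectoryStream(
                dir, p -> p.getFileName().toString().endsWith("-" + MARKER_SUFFIX))) {
            for (Path marker : markers) {
                String name = marker.getFileName().toString();
                // Keep the trailing '-' so "x-1-" does not also match "x-12-..."
                String prefix = name.substring(0, name.length() - MARKER_SUFFIX.length());
                try (DirectoryStream<Path> parts = Files.newDirectoryStream(
                        dir, p -> p.getFileName().toString().startsWith(prefix))) {
                    for (Path part : parts) {
                        total += Files.size(part);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

Calling CompactedSpace.bytesHeldByCompacted(dataDir) on a column family's
data directory would give the number of bytes that could be reclaimed if
the marked files were actually deleted.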

If these got deleted, there should actually be enough disk for the node to
run a full compaction run if needed.

Either the GC cleanup tactic is seriously flawed, or we have a potential bug that
keeps references around far longer than needed.
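
The second possibility is easy to demonstrate in isolation. The sketch
below is not Cassandra code; the names are invented, and a plain Object
stands in for whatever holds the sstable open. It shows that when cleanup
is driven by a PhantomReference, requesting a GC does nothing as long as
any strong reference to the object survives somewhere.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.List;

final class LingeringReferenceDemo {
    // Simulates some long-lived structure that forgot to drop its reference.
    private static final List<Object> STILL_REFERENCED = new ArrayList<>();

    /** Returns true only if cleanup was signalled while the object was still strongly held. */
    static boolean cleanupSignalledWhileHeld() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object reader = new Object();          // stand-in for an open sstable reader
        STILL_REFERENCED.add(reader);          // the lingering strong reference
        PhantomReference<Object> ref = new PhantomReference<>(reader, queue);

        System.gc();                           // "requesting GC to free disk space"

        // A deleting task would poll the queue; nothing arrives while the
        // object is strongly reachable, so the file would stay on disk.
        return queue.poll() == ref;
    }
}
```

The method returns false no matter how many times the GC is requested.
The opposite case (the reference being enqueued promptly once the last
strong reference is dropped) is deliberately not shown, since when that
happens is up to the collector.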

Terje



On Wed, Jun 15, 2011 at 11:50 PM, Shotaro Kamio <ka...@gmail.com> wrote:

> We've run into a situation where compacted sstable files aren't
> deleted after node repair. Even when GC is triggered via JMX, it
> sometimes leaves compacted files behind. In one case, a lot of files were
> left, and some have already been there for more than 10 hours. There is no
> guarantee that GC will clean up all compacted sstable files.
>
> We are very interested in the following ticket.
> https://issues.apache.org/jira/browse/CASSANDRA-2521
>
>
> Regards,
> Shotaro
>
> --
> Shotaro Kamio
>