Posted to user@cassandra.apache.org by Dominic Letz <do...@exosite.com> on 2014/10/01 08:23:57 UTC

Re: disk space issue

This is a shot in the dark, but you could check whether you have too many
snapshots lying around that you don't actually need. You can get rid of
those with a quick "nodetool clearsnapshot".
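
For example, something like this shows how much space the snapshots are
taking and then drops them (this assumes the default /var/lib/cassandra/data
layout - adjust the path to whatever your data_file_directories points at):

    # How much disk do snapshots occupy on this node?
    du -sch /var/lib/cassandra/data/*/*/snapshots 2>/dev/null

    # Drop all snapshots on this node
    nodetool clearsnapshot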

On Wed, Oct 1, 2014 at 5:49 AM, cem <ca...@gmail.com> wrote:

> Hi All,
>
> I have a 7 node cluster. One node ran out of disk space and others are
> around 80% disk utilization.
> The data has 10 days TTL but I think compaction wasn't fast enough to
> clean up the expired data.  gc_grace value is set default. I have a
> replication factor of 3. Do you think that it may help if I delete all data
> for that node and run repair. Does node repair check the ttl value before
> retrieving data from other nodes? Do you have any other suggestions?
>
> Best Regards,
> Cem.
>



-- 
Dominic Letz
Director of R&D
Exosite <http://exosite.com>

Re: disk space issue

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Oct 1, 2014 at 6:17 AM, Ken Hancock <ke...@schange.com> wrote:

> Major compaction is bad if you're using size-tiered, especially if you're
> already having capacity issues.  Once you have one huge table, with default
> settings, you'll need 4x that huge table worth of storage in order for it
> to compact again to ever reclaim your TTL'd data.
>

I agree that the OP likely cannot major compact. But if he could, he could do
so and then run sstablesplit (most safely, with the node down) on the One
Big SSTable to avoid the problem you describe.
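
For completeness, that would look roughly like this (the file name is made
up, sstablesplit ships under tools/bin in some distributions, and its flags
vary a bit by version - and it must only be run with the node stopped):

    # Split the one big SSTable back into ~50 MB pieces (node must be down)
    sstablesplit --size 50 /var/lib/cassandra/data/mykeyspace/mycf/mykeyspace-mycf-jb-1234-Data.db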

If the OP actually has SSTables full of TTLed garbage, he should use JMX to
run UserDefinedCompaction on them, one at a time.
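
With jmxterm that is roughly the following (keyspace and file names here are
made up, and the exact parameters of forceUserDefinedCompaction differ
between Cassandra versions, so run "info" on the bean first and adjust):

    $ java -jar jmxterm-1.0-alpha-4-uber.jar -l localhost:7199
    $> bean org.apache.cassandra.db:type=CompactionManager
    $> info
    $> run forceUserDefinedCompaction mykeyspace mykeyspace-mycf-jb-1234-Data.db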

He should also probably reconsider whether a database with immutable
data files is the ideal solution for data he only wants to keep for 10 days. ;D

=Rob

Re: disk space issue

Posted by Ken Hancock <ke...@schange.com>.
Major compaction is bad if you're using size-tiered compaction, especially if
you're already having capacity issues.  Once you have one huge SSTable, with
default settings you'll need 4x that SSTable's worth of storage before it will
compact again and ever reclaim your TTL'd data.

If you're running into space issues that are ultimately going to get your
system wedged and you're using columns with a TTL, I'd recommend using the
JMX operation to compact individual SSTables.  This will free the TTL'd data,
assuming that you've exceeded your gc_grace_seconds.  This can probably be
scripted up fairly easily with a nice, Shellshock-vulnerable bash script and
jmxterm.
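
A rough sketch of what such a script could look like (keyspace, column
family, data path and jar name are all placeholders, and the
forceUserDefinedCompaction signature varies by version, so verify it with
jmxterm's "info" before trusting this):

    #!/bin/bash
    # Walk the SSTables of one column family and ask Cassandra to compact
    # each of them individually via the CompactionManager MBean.
    KEYSPACE=mykeyspace
    CF=mycf
    DATA_DIR=/var/lib/cassandra/data

    for sstable in "$DATA_DIR/$KEYSPACE/$CF"/*-Data.db; do
      name=$(basename "$sstable")
      echo "run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction $KEYSPACE $name" |
        java -jar jmxterm-1.0-alpha-4-uber.jar -l localhost:7199 -n
    done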


On Wed, Oct 1, 2014 at 2:43 AM, Nikolay Mihaylov <nm...@nmmm.nu> wrote:

> my 2 cents:
>
> try major compaction on the column family with TTL's - for sure will be
> faster than full rebuild.
>
> also try not cassandra related things, such check and remove old log
> files, backups etc.
>
> On Wed, Oct 1, 2014 at 9:34 AM, Sumod Pawgi <sp...@gmail.com> wrote:
>
>> In the past in such scenarios it has helped us to check the partition
>> where cassandra is installed and allocate more space for the partition.
>> Maybe it is a disk space issue but it is good to check if it is related to
>> the space allocation for the partition issue. My 2 cents.
>>
>> Sent from my iPhone
>>
>> On 01-Oct-2014, at 11:53 am, Dominic Letz <do...@exosite.com>
>> wrote:
>>
>> This is a shot into the dark but you could check whether you have too
>> many snapshots laying around that you actually don't need. You can get rid
>> of those with a quick "nodetool clearsnapshot".
>>
>> On Wed, Oct 1, 2014 at 5:49 AM, cem <ca...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I have a 7 node cluster. One node ran out of disk space and others are
>>> around 80% disk utilization.
>>> The data has 10 days TTL but I think compaction wasn't fast enough to
>>> clean up the expired data.  gc_grace value is set default. I have a
>>> replication factor of 3. Do you think that it may help if I delete all data
>>> for that node and run repair. Does node repair check the ttl value before
>>> retrieving data from other nodes? Do you have any other suggestions?
>>>
>>> Best Regards,
>>> Cem.
>>>
>>
>>
>>
>> --
>> Dominic Letz
>> Director of R&D
>> Exosite <http://exosite.com>
>>
>>
>


-- 
*Ken Hancock *| System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
<http://www.schange.com/en-US/Company/InvestorRelations.aspx>
Office: +1 (978) 889-3329 | Google Talk: ken.hancock@schange.com |
Skype: hancockks | Yahoo IM: hancockks | LinkedIn:
<http://www.linkedin.com/in/kenhancock>

SeaChange International <http://www.schange.com/>
This e-mail and any attachments may contain information which is SeaChange
International confidential. The information enclosed is intended only for the
addressees herein and may not be copied or forwarded without permission from
SeaChange International.

Re: disk space issue

Posted by Nikolay Mihaylov <nm...@nmmm.nu>.
My 2 cents:

Try a major compaction on the column family with TTLs - it will surely be
faster than a full rebuild.

Also try non-Cassandra things, such as checking for and removing old log
files, backups, etc.
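
For reference, the major compaction is just this (keyspace and column family
names are placeholders; with size-tiered compaction make sure you have enough
free disk for the merged SSTable first):

    # Major-compact one column family
    nodetool compact mykeyspace mycolumnfamily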

On Wed, Oct 1, 2014 at 9:34 AM, Sumod Pawgi <sp...@gmail.com> wrote:

> In the past in such scenarios it has helped us to check the partition
> where cassandra is installed and allocate more space for the partition.
> Maybe it is a disk space issue but it is good to check if it is related to
> the space allocation for the partition issue. My 2 cents.
>
> Sent from my iPhone
>
> On 01-Oct-2014, at 11:53 am, Dominic Letz <do...@exosite.com> wrote:
>
> This is a shot into the dark but you could check whether you have too many
> snapshots laying around that you actually don't need. You can get rid of
> those with a quick "nodetool clearsnapshot".
>
> On Wed, Oct 1, 2014 at 5:49 AM, cem <ca...@gmail.com> wrote:
>
>> Hi All,
>>
>> I have a 7 node cluster. One node ran out of disk space and others are
>> around 80% disk utilization.
>> The data has 10 days TTL but I think compaction wasn't fast enough to
>> clean up the expired data.  gc_grace value is set default. I have a
>> replication factor of 3. Do you think that it may help if I delete all data
>> for that node and run repair. Does node repair check the ttl value before
>> retrieving data from other nodes? Do you have any other suggestions?
>>
>> Best Regards,
>> Cem.
>>
>
>
>
> --
> Dominic Letz
> Director of R&D
> Exosite <http://exosite.com>
>
>

Re: disk space issue

Posted by Sumod Pawgi <sp...@gmail.com>.
In the past, in such scenarios it has helped us to check the partition where Cassandra is installed and allocate more space to that partition. It may well be a plain disk space issue, but it is worth checking whether it really comes down to how much space was allocated to that partition. My 2 cents.
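
For example (assuming the default data location - adjust the paths to
wherever your data_file_directories points):

    # How full is the filesystem holding the Cassandra data directory?
    df -h /var/lib/cassandra

    # And what is actually taking up the space?
    du -sh /var/lib/cassandra/data/* | sort -h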

Sent from my iPhone

> On 01-Oct-2014, at 11:53 am, Dominic Letz <do...@exosite.com> wrote:
> 
> This is a shot into the dark but you could check whether you have too many snapshots laying around that you actually don't need. You can get rid of those with a quick "nodetool clearsnapshot".
> 
>> On Wed, Oct 1, 2014 at 5:49 AM, cem <ca...@gmail.com> wrote:
>> Hi All,
>> 
>> I have a 7 node cluster. One node ran out of disk space and others are around 80% disk utilization. 
>> The data has 10 days TTL but I think compaction wasn't fast enough to clean up the expired data.  gc_grace value is set default. I have a replication factor of 3. Do you think that it may help if I delete all data for that node and run repair. Does node repair check the ttl value before retrieving data from other nodes? Do you have any other suggestions?
>> 
>> Best Regards,
>> Cem.
> 
> 
> 
> -- 
> Dominic Letz
> Director of R&D
> Exosite
>