Posted to user@cassandra.apache.org by Brian Tarbox <ta...@cabotresearch.com> on 2014/06/18 18:01:45 UTC

running out of diskspace during maintenance tasks

I'm running on AWS m2.2xlarge instances using the ~800 gig
ephemeral/attached disk for my data directory.  My data size per node is
nearing 400 gig.

Sometimes during maintenance operations (repairs mostly I think) I run out
of disk space as my understanding is that some of these operations require
double the space of one's data.

Since I can't change the size of attached storage for my instance type my
question is can I somehow get these maintenance operations to use other
volumes?

Failing that, what are my options?  Thanks.

Brian Tarbox
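
A rough sketch of the arithmetic behind this question, in Python. It uses only the two figures from the post (a ~800 GB volume and ~400 GB of data per node) and the poster's stated understanding that some operations may temporarily need double the data size. The function names and the factor of 2.0 are illustrative assumptions, not measured Cassandra behavior.

```python
def free_space_after_operation(disk_gb: float, data_gb: float, factor: float = 2.0) -> float:
    """GB left on the volume if an operation temporarily needs `factor` x the data size.

    `factor` = 2.0 encodes the poster's assumption that some maintenance
    operations can need double the space of the data (hypothetical worst case).
    """
    return disk_gb - data_gb * factor


def max_safe_data_gb(disk_gb: float, factor: float = 2.0) -> float:
    """Largest per-node data size that still fits the assumed worst case."""
    return disk_gb / factor


# Figures from the post: ~800 GB volume, ~400 GB of data per node.
print(free_space_after_operation(800, 400))  # no headroom left under the doubling assumption
print(max_safe_data_gb(800))                 # the node is already at this limit
```

Under that assumption the node has no headroom at all, and the usable capacity of a nominal 800 GB volume is somewhat lower once the filesystem is accounted for.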

Re: running out of diskspace during maintenance tasks

Posted by Russell Bradberry <rb...@gmail.com>.
Repair only creates snapshots if you use the “-snapshot” option.



On June 18, 2014 at 12:28:58 PM, Marcelo Elias Del Valle (marcelo@s1mbi0se.com.br) wrote:

AFAIK, when you run a repair a snapshot is created.
After the repair, I run "nodetool clearsnapshot" to save disk space.
Not sure if that's your case or not.
[]s



Re: running out of diskspace during maintenance tasks

Posted by Marcelo Elias Del Valle <ma...@s1mbi0se.com.br>.
AFAIK, when you run a repair a snapshot is created.
After the repair, I run "nodetool clearsnapshot" to save disk space.
Not sure if that's your case or not.
[]s
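
Before clearing snapshots it can help to see how much space they account for. The following is a minimal Python sketch, assuming the usual on-disk layout in which snapshots live in directories named "snapshots" beneath the data directory; the function name and the example path are assumptions for illustration. Snapshots are hard links to SSTable files, so this total only becomes extra disk usage once the live copies have been compacted away.

```python
import os


def snapshot_bytes(data_dir: str) -> int:
    """Sum the sizes of all files under any 'snapshots' directory below data_dir."""
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        # Look only at the path below data_dir, so the name of data_dir
        # itself cannot produce a false match.
        parts = os.path.relpath(root, data_dir).split(os.sep)
        if "snapshots" not in parts:
            continue
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # The file vanished while walking (e.g. a snapshot being cleared).
                pass
    return total


# Example call (the path is an assumed default; adjust to your data directory):
# print(snapshot_bytes("/var/lib/cassandra/data") / 1024**3, "GiB in snapshots")
```

If the total is large, "nodetool clearsnapshot" as described above releases it.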


2014-06-18 13:10 GMT-03:00 Brian Tarbox <ta...@cabotresearch.com>:

> We do a repair -pr on each node once a week on a rolling basis.
> Should we be running cleanup as well?  My understanding was that it is only
> used after adding/removing nodes.
>
> We'd like to avoid adding nodes if possible (which might not be).   Still
> curious if we can get C* to do the maintenance task on a separate volume.
>
> Thanks.

Re: running out of diskspace during maintenance tasks

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Jun 18, 2014 at 9:10 AM, Brian Tarbox <ta...@cabotresearch.com>
wrote:

> We do a repair -pr on each node once a week on a rolling basis.
>

https://issues.apache.org/jira/browse/CASSANDRA-5850?focusedCommentId=14036057&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036057


> Should we be running cleanup as well?
>

No.

=Rob

Re: running out of diskspace during maintenance tasks

Posted by Brian Tarbox <ta...@cabotresearch.com>.
We do a repair -pr on each node once a week on a rolling basis.
Should we be running cleanup as well?  My understanding was that it is only
used after adding/removing nodes.

We'd like to avoid adding nodes if possible (which might not be).   Still
curious if we can get C* to do the maintenance task on a separate volume.

Thanks.


On Wed, Jun 18, 2014 at 12:03 PM, Jeremy Jongsma <je...@barchart.com>
wrote:

> One option is to add new nodes, and do a node repair/cleanup on
> everything. That will at least reduce your per-node data size.

Re: running out of diskspace during maintenance tasks

Posted by Jeremy Jongsma <je...@barchart.com>.
One option is to add new nodes, and do a node repair/cleanup on everything.
That will at least reduce your per-node data size.


On Wed, Jun 18, 2014 at 11:01 AM, Brian Tarbox <ta...@cabotresearch.com>
wrote:

> I'm running on AWS m2.2xlarge instances using the ~800 gig
> ephemeral/attached disk for my data directory.  My data size per node is
> nearing 400 gig.
>
> Sometimes during maintenance operations (repairs mostly I think) I run out
> of disk space as my understanding is that some of these operations require
> double the space of one's data.
>
> Since I can't change the size of attached storage for my instance type my
> question is can I somehow get these maintenance operations to use other
> volumes?
>
> Failing that, what are my options?  Thanks.
>
> Brian Tarbox
>
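
To put a number on the suggestion of adding nodes, here is a small Python sketch. The thread does not give the cluster size, so the node counts below are purely hypothetical, and the calculation assumes data is spread evenly across nodes and the replication factor stays the same. Existing nodes only give back the space after cleanup has removed the ranges they no longer own.

```python
def per_node_gb_after_adding(current_per_node_gb: float,
                             current_nodes: int,
                             added_nodes: int) -> float:
    """Estimated per-node data size after growing the cluster.

    Assumes an even distribution of data and an unchanged replication factor,
    so the cluster-wide total is simply spread over more nodes.
    """
    cluster_total_gb = current_per_node_gb * current_nodes
    return cluster_total_gb / (current_nodes + added_nodes)


# Hypothetical example: 6 nodes at ~400 GB each, adding 2 nodes.
print(per_node_gb_after_adding(400, 6, 2))  # 300.0
```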

Re: running out of diskspace during maintenance tasks

Posted by Jens Rantil <je...@tink.se>.
Hi Brian,


What compaction strategy are you running? Have you tried using leveled compaction? AFAIK it should generally require less disk space during compaction.




Cheers,

Jens

On Wed, Jun 18, 2014 at 6:02 PM, Brian Tarbox <ta...@cabotresearch.com>
wrote:

> I'm running on AWS m2.2xlarge instances using the ~800 gig
> ephemeral/attached disk for my data directory.  My data size per node is
> nearing 400 gig.
> Sometimes during maintenance operations (repairs mostly I think) I run out
> of disk space as my understanding is that some of these operations require
> double the space of one's data.
> Since I can't change the size of attached storage for my instance type my
> question is can I somehow get these maintenance operations to use other
> volumes?
> Failing that, what are my options?  Thanks.
> Brian Tarbox
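
The point about leveled compaction can be illustrated with a rough Python comparison of temporary space needs. This is a rule-of-thumb sketch, not a statement of exact Cassandra behavior: it assumes a size-tiered major compaction can temporarily need about as much free space as the data it rewrites, and that leveled compaction needs on the order of ten times the target SSTable size. The 160 MB SSTable size and the factor of 10 are assumptions that depend on version and table settings, and the estimate does not cover space used by repair streaming or snapshots.

```python
def size_tiered_worst_case_temp_gb(data_gb: float) -> float:
    """Assumed worst case: a major compaction rewrites all data before
    the old SSTables can be deleted, so it needs about data_gb free."""
    return data_gb


def leveled_temp_gb(sstable_size_mb: float = 160, sstables_per_compaction: int = 10) -> float:
    """Rule-of-thumb estimate: roughly ten target-size SSTables are rewritten at once.

    Both defaults are assumptions for illustration; check your table's settings.
    """
    return sstable_size_mb * sstables_per_compaction / 1024


# Figures from the thread: ~400 GB of data per node.
print(size_tiered_worst_case_temp_gb(400))  # 400 (GB)
print(leveled_temp_gb())                    # 1.5625 (GB)
```

Leveled compaction trades this lower space overhead for more compaction I/O, so it is worth testing on one table before switching everything.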